On September 28th, the Global Citizen Festival will gather thousands of people who took action to end extreme poverty. Join Post Malone, Doja Cat, Lisa, Jelly Roll, and Rauw Alejandro as they take the stage with world leaders and activists to defeat poverty, defend the planet, and demand equity. Download the Global Citizen app today and earn your spot at the festival. Learn more at globalcitizen.org.
It seems like each news cycle is filled with stories of people testing the boundaries of our laws. To help illuminate the complex legal issues shaping our country, CAFE has assembled a team of legal experts for a new podcast called...
Counsel. You'll hear from former U.S. attorneys Joyce Vance and Barbara McQuade, legal scholar Rachel Barkow, former FBI Special Agent Asha Rangappa, and of course me, Elie Honig, a former prosecutor and CNN senior legal analyst. Listen to commentary from Counsel twice a week by subscribing on your favorite podcast app. That's Counsel, C-O-U-N-S-E-L.
This is Unexplainable. I'm Noam Hassenfeld, here with our senior science reporter, Brian Resnick. And Brian, as always, we're here to talk about an unanswered scientific question. Yeah, I actually have this really huge, kind of uncomfortable, unanswered question. Okay. And it's, how much of psychology is wrong? Do we have reason to assume that tons of psychology is wrong?
Yeah. So 10 years ago, researchers started to notice and to really get worried about some really rotten things going on at the core of a lot of psychological research. It led them to really question and rewrite a lot of assumptions. And since then, there's just been this kind of huge fight. It's really a crisis. Scary. Okay. Where do we start here? Let's start with the psychologist who's been working on this problem for years now.
I'm Simine Vazire, and I study how we can do psychology better. Simine said this crisis really got going in 2011, when this one paper came out that had some really weird claims in it. Yeah, so it was a paper claiming that people have ESP, like, predicting the future. ♪
Wait, wait, like ESP, like sci-fi, knowing what's in people's heads? Yeah. Okay. Yeah, this paper was from a psychologist named Daryl Bem, an interesting guy. And in it, he wrote up a lot of experiments. ♪
This is an experiment that tests for ESP. It takes about 20 minutes and is run completely by computer. This is the language participants read when they sat in front of a screen. On each trial of the experiment, pictures of two curtains will appear on the screen side by side.
The computer would put a picture behind one of the curtains, and behind the other one would just be a blank wall. Your task is to click on the curtain that you feel has the picture behind it. If participants guessed correctly, Bem took that as evidence that they predicted the future, that ESP is real.
Yeah, but it's not real, right? Yeah, not according to laws of physics. Okay, according to those things, sure. Yeah, no. And for most of us, we thought that the claims were preposterous. Okay, preposterous. I mean, is that just a nice way of saying this is all nonsense? Oh, it was nonsense. Okay. The results here were absurd. But this is the thing. The methods Bem used...
Those weren't absurd. It was a nine-study paper, very methodologically rigorous by the standards of the day. If this paper had more plausible results, like if you sub in ESP for something more normal, it would have been viewed as a great paper. Okay, but if you're using all the right methods that psychologists everywhere agree are fine and legit, how do you get a result that's as absurd as people can predict the future? No, this is the key question.
So shortly after this whole ESP paper comes out, another paper comes out. It's a 2011 paper by Simmons, Nelson, and Simonsohn called "False Positive Psychology." And it really answers your question. It shows really clearly how absurd conclusions can come from seemingly well-built studies. The basic idea here is experiments generate a lot of data, and a lot of it just seems like noise.
But if you only publish the patterns within that noise that are interesting to you, you can get these results. Let's say you tell me that you accurately predicted the winner of the Australian Open. You know, two days before the Open, before any of the players get on any of the courts, you wrote down your prediction on a piece of paper. And you told me that Naomi Osaka, she's definitely going to win this.
And then lo and behold... Osaka is champion in Australia for the second time.
If I know that's the only prediction you made, I'm going to be much more impressed than if you had 50 other pieces of paper dated the same date, predicting 50 different winners of the Australian Open. Yeah, like one scenario is a real prediction and the other one is just, you have a bunch of guesses and obviously one of them is going to be right. So it's relevant to the reader to know what else you tried, to know how many times you had to try in order to get this beautiful result. So this paper is saying that some psychologists, like,
are doing the equivalent of writing down 50 predictions on 50 pieces of paper and just not publishing the ones that turned out wrong. But then critically, this paper goes on for a while, listing a bunch of accepted methodological choices that researchers can make and still end up with flawed results. But to be clear here, it's not usually as blatant as this Australian Open example.
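To make that 50-predictions idea concrete, here is a minimal sketch in Python. It was written for this transcript, not taken from any real study: it runs 50 comparisons on pure noise and then reports only the most flattering one. Every name and number in it is illustrative.

```python
# A minimal, illustrative sketch of the "50 predictions" problem:
# run many comparisons on pure noise, then report only the one that
# looks "significant." Nothing here comes from a real study.
import random
import statistics


def fake_study(n=20):
    """Two groups drawn from the SAME distribution: any 'effect' is noise."""
    group_a = [random.gauss(0, 1) for _ in range(n)]
    group_b = [random.gauss(0, 1) for _ in range(n)]
    return group_a, group_b


def t_statistic(a, b):
    """Rough two-sample t statistic, enough to rank 'interesting' results."""
    mean_diff = statistics.mean(a) - statistics.mean(b)
    pooled_se = ((statistics.variance(a) + statistics.variance(b)) / len(a)) ** 0.5
    return mean_diff / pooled_se


random.seed(0)
# 50 "predictions" made on noise alone.
results = [abs(t_statistic(*fake_study())) for _ in range(50)]

print(f"best of 50 noise-only comparisons: t = {max(results):.2f}")
print(f"comparisons with |t| > 2 (would look 'significant'): "
      f"{sum(t > 2 for t in results)} of 50")
```

The point of the sketch is only that with enough unreported tries, noise alone will hand you something that looks like a finding.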
What it looks like in a real scientist's lab is not nearly that malicious or intentional or extreme. And you have a rationale for it. I think Simine is right. A lot of this was done in good faith. Some of these methods save money. Some of them are used to just...
find something to publish. And publishing is like the currency of an academic career. Like, there's a lot of pressure to do it. Simine, for example, remembers going through her dissertation results with a mentor. And we picked out specific results and we're like, these results together would make a good story. So she built that story, put it in a paper, and then published it in a really prestigious journal right before she got tenure. Yeah.
And then when this stuff hit, I was like, "This is exactly what I was doing." Pulling out a pattern from noisy data, highlighting just that pattern and ignoring everything that didn't fit, and then telling a story around it. And it's not just Simine coming to this quite unsettling revelation.
a lot of people are starting to say, wait, something's not right here. We need to check up on our work. Okay, and how do you do that? How do you check years of past research? It's called replication.
All replication is, is saying, oh, you think you made a discovery? Let's make sure that it's a real discovery. And to do that, let's repeat it as closely as possible to what the original researcher did. Let's make sure that we can consistently get that result. Maybe not 100% of the time, but at least most of the time. If something replicates, it doesn't tell you anything about how something works. It just tells you something works, like there's a there there.
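To picture what that check does, here is a toy sketch, again my own rather than from the episode: an "original" effect fished out of several small noisy runs, followed by one pre-specified replication with a much larger sample. Since the true effect in this toy is zero, the replication should come back near zero.

```python
# A toy illustration of a replication check: re-run the same protocol on a
# fresh, larger sample and see whether the original "effect" shows up again.
# The true effect here is zero, so a finding fished out of noise should
# shrink on replication. Purely illustrative numbers.
import random
import statistics

random.seed(1)


def measure_effect(n):
    """One run of a hypothetical experiment where the true effect is 0."""
    treatment = [random.gauss(0, 1) for _ in range(n)]
    control = [random.gauss(0, 1) for _ in range(n)]
    return statistics.mean(treatment) - statistics.mean(control)


# "Original" result: the most flattering of several small, noisy runs.
original = max(measure_effect(n=20) for _ in range(10))

# Replication: one pre-specified run with a much larger sample.
replication = measure_effect(n=2000)

print(f"original (cherry-picked, n=20):   effect = {original:+.2f}")
print(f"replication (single run, n=2000): effect = {replication:+.2f}")
```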
It's really foundational. These are the bricks that science is built on. You know, the 2011 events made us think maybe it's all a house of cards. So the next step was to take each brick in our house, each of the findings that we think we're building on, and make sure they're real findings. Psychologists basically crack open Psych 101 textbooks, and they also think about a lot of the most popular findings in their field. And they just start asking, are these experiments replicable?
Things like the 10,000 hour rule. It looks like we need to spend 10,000 hours practicing before we get good. Power posing. Stances researchers said would make people feel more powerful.
The marshmallow test. An elegant, simple way to test a person's ability to delay gratification. Psychologists start testing these famous studies, in a lot of cases with a lot bigger sample sizes and much more rigorous methodology. And a lot of them just don't really hold up, or their results just seem much less impressive.
Some studies have started to cast doubt on this whole ego depletion thing. One of three scientists who conducted the research said, quote, I do not have any faith in the embodied effects of power poses. So you found that of these hundred studies, in most of them, they did not confirm the original finding. Right. That sounds like a crisis. Yes. This whole episode has been called the replication crisis. So all of this is coming out. The studies are failing to replicate. Right.
How does the whole institution of psychology respond? Not well. Okay. Members of some of the old guard in psychology start to get really heated. They're feeling attacked. Like, this is their work that's coming under scrutiny, the work of their colleagues. I found one symposium, a series of talks from 2016 that has some of this tense flavor in it. Our symposium will touch on no less weighty topics.
than the foundations, methods, and implications of science. At this symposium, this famous psychologist gets up. Dan Gilbert. His name is Dan Gilbert, and he says, this symposium here, it's all been friendly, but this friendliness is an illusion. It can almost make you forget that there's a war going on. Almost can make you forget that psychology is in crisis.
This war, he says, it's leading to some of his colleagues getting really stressed out and retiring early. Middle-aged people are screaming at each other at conferences and hiring attorneys. And he directs some of the blame to just young people and their attitudes. Today, it seems to me, young psychologists are remarkably unkind, self-righteous, intolerant, jealous, petty, and snarky.
This summarizes a lot of the pushback I've heard to this replication crisis, that it's really about tone, that people are just being really rude about it in public. But it can't just be about tone, right? It seems kind of weird to say psychology is falling apart, but you all could be nicer about it. I guess I'd say some people think it's overblown, but...
But actually, in the years since this symposium, more and more researchers are agreeing this is a real problem. Simine says as they do this big roundup, as they check the bricks of the foundation that their science is built on...
They find that maybe 50% of psychology papers replicate. That's not a house. You can't build a house. You can't extend on a foundation where less than half of it is real. This isn't an authoritative number. Very genuinely, we just don't know how many studies meet this basic test. There's just been a ton published. That'd be a lot to check up on. But this 50% number is realistic enough to be scary. So does that mean that 50% of psychology is wrong? This is a really tough question. For one, yes, there's good reason to assume that a lot of psychology might be wrong. Mm-hmm.
This is the unexplainable. We really don't know how much of psychology is wrong. If we don't want to have all these questions remaining about the validity of psychological science, so much work needs to be done just to rebuild this foundation. How do you do that? How do you rebuild an entire institution? I'll introduce you to someone who's trying after this break. Support for Unexplainable comes from Greenlight.
People with kids tell me time moves a lot faster. Before you know it, your kid is all grown up, they've got their own credit card, and they have no idea how to use it. But you can help. If you want your kids to get some financial literacy early on,
You might want to try Greenlight. Greenlight is a debit card and money app that's made for families. Parents can send money to their kids, they can keep an eye on kids' spending and saving, and kids and teens can build money confidence and lifelong financial literacy skills.
Oda Sham is my colleague here at Vox, and she got a chance to try out Greenlight. There are videos you can watch on how to invest money. So we took a portion of his savings to put into investing, where I told him, watch the videos so that he can start learning how to invest money as well.
Millions of parents and kids are learning about money on Greenlight. You can sign up for Greenlight today and get your first month free trial when you go to greenlight.com slash unexplainable. That's greenlight.com slash unexplainable to try Greenlight for free. greenlight.com slash unexplainable.
Unexplainable. We're back. Brian, in the first half of the show, you were talking about how maybe 50% of psychological research has been called into question over the last decade. It's all part of something called the replication crisis. So where does psychology go from here? Yeah, there are a few directions. And one really interesting one is from this guy, Chris Chartier. I'm an associate professor of psychology at Ashland University and...
And I'm a social psychologist by training. Chris's story also starts in 2011. He read the Bem paper, and he read the false positive psychology paper, and like Simine, it made him wonder about his own work. And I even asked him: when you looked, did you think any of it was bogus? Yes. Um...
Maybe that's the wrong word, but, like, did some of it not hold up, or was it reported incompletely or selectively? In spite of my best intentions, I would say yes. Chris did see these problems in his own work, even though it hurt a little to look.
And he also saw these problems in the field of psychology. He started attending these conferences where a lot of people were talking about replication. And then in 2017, he had this epiphany. Do you remember the great American solar eclipse? The first total solar eclipse of the century all across America. An unforgettable moment. Sure, I remember it well. I had my paper glasses. Eclipse glasses are very good, except for driving.
So a few days after the eclipse, Chris wasn't wearing his eclipse glasses, but he was driving. He was driving to go mountain biking, and he was thinking about the eclipse and just feeling jealous. Yeah, physics envy. That's the Freudian phrase that you could use.
toss in there, I suppose. He started to think about what led to this moment. How much physicists and astronomers had to know about the solar system and how right they had to be and how precise they had to be to predict an eclipse. They allowed us to have this grand moment together. And then he just thought about his own field, psychology. They do not have this precision.
Then I took this three-hour mountain bike ride out in the woods, spotty cell service. It's a great place, time, setting for me to think. And the contours of this idea that emerged as the Psychological Science Accelerator started to take shape.
Sorry, the what? Yeah, the psychological science accelerator. This is Chris's idea to make psychology precise. Eclipse-level precise. On the drive home from that mountain bike ride, I recorded an audio memo to myself. Oh, it would be a real trip to listen back to my audio memo. Please tell me you got this voice memo, Brian. Oh, I have it. Um, blog post, building a CERN for psychology...
Chris is starting to get a little technical here, but the gist is that he wants to imitate CERN, the organization that runs this huge particle collider in Europe.
He wants to fix psychology's crisis by bringing together lots and lots of researchers from all around the world and form this one big testing squad. So he gets home from his bike trip and he just starts on this. So I'm still in my bike shorts.
because I was like too frantic to shower. He writes his blog post, like this kind of mini manifesto of like this dream he has. And he puts up a sign-up sheet. And, you know, as a young researcher does, he puts it on Twitter. Really, it was a classic academic viral experience. I always put the academic qualifier on there, like,
I think the tweet was retweeted like 30 times. And I was like, oh my God, Kim, I'm famous. We made it. And so like the yada, yada, yada here is that now there are more than a thousand researchers in this big network and they come from more than 70 countries all around the world. It switched from like, this is a cool idea. Maybe 10 people will do it with me very quickly to like, we really are going to do this thing.
Okay, that seems like a great idea. You know, lots of people all over the place working together. How does this actually solve the problems at the core of psychology, though? Yes, there are a few things here. One is that they're going to do huge experiments, bring together lots of people, have huge sample sizes, internationally, you know, test things around the world. But the kind of main, like the key ingredient here is called pre-registration.
Yeah. Remember that this whole big problem starts with researchers just tweaking things as they go along? So pre-registration cancels that. It's basically before a researcher does anything, they write something down. It's like a recipe for a study.
They list all their ingredients. They list all their steps. They list, like, how they'll analyze the results. And, you know, they submit this. They put this out in public so other people can, like, look and, you know, keep them accountable.
It prevents researchers from tweaking things along the way. They can't do a, oops, I forgot to publish these 49 failed predictions. You know, that just can't happen. This is my plan, and I'm sticking to it. Couldn't you just still tweak the recipe and just kind of keep it quiet? I asked Chris the same question. Yes, you could still do that, but it clearly makes that fraudulent. Okay, fair, fair.
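As a rough illustration of that recipe idea, here is what a frozen analysis plan could look like in code. It is purely illustrative: real pre-registrations are filed with journals or registries like the Open Science Framework, and the fields and the hashing step below are my own invention, not the Accelerator's actual process.

```python
# An illustrative sketch of pre-registration as a frozen "recipe":
# the plan is written down and fixed before any data exist, so it
# can't quietly change afterward. All fields here are hypothetical.
import hashlib
import json

preregistration = {
    "hypothesis": "Condition A scores higher than condition B",  # made-up example
    "sample_size": 400,              # decided in advance, not once results look good
    "exclusion_rules": ["failed attention check"],
    "analysis": "two-sample t-test on mean scores, alpha = 0.05",
}

# Freeze the plan: publish this fingerprint before collecting data.
plan_text = json.dumps(preregistration, sort_keys=True)
fingerprint = hashlib.sha256(plan_text.encode()).hexdigest()
print("registered plan fingerprint:", fingerprint[:16], "...")

# Later, anyone can re-hash the published plan and confirm it wasn't tweaked.
assert hashlib.sha256(plan_text.encode()).hexdigest() == fingerprint
```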
So I guess you have all these labs all over the world making these promises, you know, to stick to these recipes that they lay out ahead of time for important studies. That's pre-registration. That makes sense in theory. I get it. What does it look like when you actually try to make it happen? Well, you know what happens when dreams meet the real world. Oh, dear God. My kitchen is such a mess.
Chris asked people to email him: what psychological studies should we look at, you know, check up on? So he wanted people to submit their own research and ask others to prove them wrong? No, it wasn't that. People were submitting other researchers' studies and saying, let's check this. Okay, that's awkward. It is awkward. Right?
Yeah, so they get these potential papers, these potential experiments to check. And the accelerator group had to decide...
which one should we test? And they did pick one. The first pick was from a researcher at Princeton University, Alex Todorov. He's kind of a big deal. He has this really influential idea about how we make snap judgments from people's faces. It really helps explain a piece of where stereotypes come from. And so Chris reaches out and tells Alex, hey, we want to test your idea.
But here's the beauty in the accelerator, or maybe the beauty in Alex, or both. His reaction was positive. Well, I remember, of course. I remember it very clearly. That's Alex Todorov? Yeah, I called him up. I really just wanted to know how he felt when Chris reached out to him. Everybody has been talking about replication crisis, and often when people go to replicate something,
It's because they don't believe in it. And, you know, that can make you feel somewhat uncomfortable. But Alex was pretty confident. And he was great in those initial interactions, just enthusiastic, supportive, I think excited. This was his chance to test this idea all around the world. And I've always been interested, but it's impossible for a single lab to do this at such a large scope. So Alex gives advice, and then Chris and his team do the super critical step, pre-registration. They contact a journal and they send in their whole recipe for the study. Ah,
Our methodology. Three eggs. Our data collection plan. Three tablespoons of butter. Our analytical strategy. Six cups of grated cheese. And then it starts. They start testing this idea around the world, and data starts coming in, and they start analyzing it. Remember, if the top isn't brown and crispy, then it is not a souffle. And then there's a snag.
Uh-oh. Yeah, yeah. So remember here, Chris and his team have their recipe down. They've committed to it. But then Alex Todorov, he starts to do some reading and he starts to have some doubts about one of the final steps in the recipe. Wait a second. If the top gets overly brown, then the souffle might be burned. He asked them, can we tweak this?
Isn't that like the whole point of this whole pre-registration thing? Like no tweaking the recipe? Yes, absolutely. That is the whole point. But Alex makes an interesting argument here. He says you don't always know everything you're going to need when you start a project. Imagine you're a brain surgeon and you pre-register all of the steps of your brain surgery. And then you started poking in the brain of your patient and say, oops, if I don't do this, I'm actually going to kill him or her.
Would you change the procedures? Okay, what does Chris say to that? Oh, he didn't agree. He and his team didn't think there was a problem with this recipe, and they didn't want to let the person who had the most to lose here come in and fiddle with it, so they refused. I would say that the initial enthusiasm ran out during this process. It sounded really tense. There was a lot of back and forth. Eventually, this paper got published. The replication results came out. And honestly, they were kind of messy and split.
It sounds almost like Chris had this big, beautiful idea. You know, everyone will just do pre-registration, promise not to tweak their recipe, and everything would just be fine. But I guess in the end, if you actually try to make a CERN for psychology in the real world, it's just a lot harder. Could there be a better demonstration that it's needed? A pattern of human response where initial enthusiasm gives way...
to debate and argument only once the data are in hand is kind of a really elegant demonstration of some of the problems. And I guess Alex is an example of one of the problems here?
I don't want to demonize Alex. There are rude scientists out there. I have talked to them. He is not one of them. He was very generous with his time, and he was very thoughtful. But this is just really hard. It's really hard to invite people, to invite others, to look for flaws in your own work.
People always talk, I mean, the stereotype of scientists, especially from an outside perspective, is like: okay, we're objective, we have our personal lives, and the science is completely separate. And that's just not true. I mean, half of your life is your work, and your reputation is built into it. Science is done by people, and people are going to have emotions about their work.
It sounds like the method to fix psychology is really, really complicated and difficult. And it's going to take a long time and tons of people are going to burn out. And in the meantime, we still have psychology, right? I mean, can we trust psychology right now? I actually think psychology is leading the way on a lot of this. These problems we've been talking about with cherry-picking data and the need to publish at all costs—
This exists in other fields of science. What, like a ton of biological research and chemical research isn't true either? Yeah, go Google replication crisis in biomedicine. There's one there. It's just less well-documented. Psychology is really paving a path forward here. And right now, we don't have to throw out everything in psychology. We have learned things. There are studies that replicate.
But good science is a gift we give to the future. So we have to get it right. If you can just marginally improve the way we collect and analyze our data and draw conclusions from them, there are untold future human beings that can benefit from that tiny advance. You don't have to answer all the questions today. It's good enough to make incremental but lasting progress on how we ask those questions.
And if psychology can find a way to make those incremental changes, maybe other fields can replicate their success. Brian Resnick is a senior science reporter for Unexplainable and Vox.com.
This episode was produced by Bird Pinkerton. We had editing from Meredith Hodnot and me, Noam Hassenfeld. Liliana Michelina did the fact-checking. Hannes Brown did the mixing and sound design. I wrote the music. And Liz Kelly Nelson is the editorial director of Vox Podcasts. For more about The Accelerator and their plans for the future, check out our newsletter. You can sign up at vox.com slash unexplainable. And as always, you can find a link to subscribe in our show notes.
And while you're over at vox.com slash unexplainable, we've got something new for you over there. In case you want to share any of our past episodes with kids or students, episodes that have curse words in them, check out our site for a link to clean versions of past episodes. And also email us. We love reading emails. We're at unexplainable at vox.com. Unexplainable is part of the Vox Media Podcast Network. And we'll be back in your feed next Wednesday.