How Southwest Airlines Is Changing Modern Science

The history of science is largely the history of individual genius. From Galileo to Einstein, Isaac Newton to Charles Darwin, we tend to celebrate the breakthroughs achieved by a mind working by itself, seeing more reality than anyone has ever seen before.

It’s a romantic narrative. It’s also obsolete. As documented in a pair of Science papers by Stefan Wuchty, Benjamin Jones and Brian Uzzi, modern science is increasingly a team sport: more than 80 percent of science papers are now co-authored. These teams are also producing the most influential research, as papers with multiple authors are 6.3 times more likely to get at least 1000 citations. The era of the lone genius is over.

What’s causing the dramatic increase in scientific collaboration? One possibility is that the rise of teams is a response to the increasing complexity of modern science. To advance knowledge in the 21st century, one has to master an astonishing amount of information and experimental know-how; because we have discovered so much, it’s harder to discover something new. (In other words, the mysteries that remain often exceed the capabilities of the individual mind.) This means that the most important contributions now require collaboration, as people from different specialties work together to solve extremely difficult problems.

But this might not be the only reason scientists are working together more frequently. Another possibility is that the rise of teams is less about shifts in knowledge and more about the increasing ease of interacting with other researchers. It’s not about science getting hard. It’s about collaboration getting easy.

While it seems likely that both of these explanations are true—the trend is probably being driven by multiple factors—a new paper emphasizes the changes that have reduced the costs of academic collaboration. To do this, the economists Christian Catalini, Christian Fons-Rosen and Patrick Gaule looked at what happens to scientific teams after Southwest Airlines enters a metropolitan market. (On average, the entrance of Southwest leads to a roughly 20 percent reduction in fares and a 44 percent increase in passengers.) If these research partnerships are held back by practical obstacles—money, time, distance, etc.—then the arrival of Southwest should lead to a spike in teamwork.

That’s exactly what they found. According to the researchers, after Southwest begins a new route, collaborations among scientists increase across every scientific discipline. (Physicists increase their collaborations by 26 percent, while biologists seem to really love cheap airfare: their collaborations increase by 85 percent.) To better understand these trends, and to rule out some possible confounds, Catalini et al. zoomed in on collaborations among chemists. They tracked the research produced by 819 pairs of chemists between 1993 and 2012. Once again, they found that the entry of Southwest into a new market leads to an approximately 30 percent spike in collaboration among chemists living near the new routes. What’s more, this trend towards teamwork showed no signs of existing before the arrival of the low-cost airline.

At first glance, it seems likely that these new collaborations triggered by Southwest will produce research of lower quality. After all, the fact that the scientists waited to work together until airfares were slightly cheaper suggests that they didn’t think their new partnership would create a lot of value. (A really enticing collaboration should have been worth a more expensive flight, especially since the arrival of Southwest didn’t significantly increase the number of direct routes.) But that isn’t what Catalini et al. found. Instead, they discovered that Southwest’s entry into a market led to an increase in higher quality publications, at least as measured by the number of citations. Taken together, these results suggest that cheaper air travel is not only redrawing the map of scientific collaboration, but fundamentally improving the quality of research.

There is one last fascinating implication of this dataset. The spread of Southwest paralleled the rise of the Internet, as it became far easier to communicate and collaborate using digital tools, such as email and Skype. In theory, these virtual interactions should make face-to-face conversations unnecessary. Why put up with the hassle of air travel when there’s Facetime? Why meet in person when there’s Google Docs? The Death of Distance and all that.

But this new paper is a reminder that face-to-face interactions are still uniquely valuable. I’ve written before about the research of Isaac Kohane, a professor at Harvard Medical School. A few years ago, he published a study that looked at the influence of physical proximity on the quality of research. He analyzed more than thirty-five thousand peer-reviewed papers, mapping the precise location of co-authors. Geography turned out to be a crucial variable: when co-authors were closer together, their papers tended to be of significantly higher quality. The best research was consistently produced when scientists were located within ten meters of each other, while the least cited papers tended to emerge from collaborators who were a kilometer or more apart.

Even in the 21st century, the best way to work together is to be together. The digital world is full of collaborative tools, but these tools are still not a substitute for meetings that take place in person.* That’s why we get on a plane.

Never change, Southwest.

Catalini, Christian, Christian Fons-Rosen, and Patrick Gaulé. "Did cheaper flights change the geography of scientific collaboration?" SSRN Working Paper (2016). 

* Consider a study that looked at the spread of Bitnet, a precursor to the internet. As one might expect, the computer network significantly increased collaboration among electrical engineers at connected universities. However, the boost in collaboration was far larger among engineers who were within driving distance of each other. Yet more evidence for the power of in-person interactions comes from a 2015 paper by Catalini, which looked at the relocation of scientists following the removal of asbestos from Paris Jussieu, the largest science university in France. He found that science labs that had been randomly relocated in the same area were 3.4 to 5 times more likely to collaborate. Meatspace matters.

Do Social Scientists Know What They're Talking About?

The world is lousy with experts. They are everywhere: opining in op-eds, prognosticating on television, tweeting out their predictions. These experts have currency because their opinions are, at least in theory, grounded in their expertise. Unlike the rest of us, they know what they’re talking about.

But do they really? The most famous study of political experts, led by Philip Tetlock at the University of Pennsylvania, concluded that the vast majority of pundits barely beat random chance when it came to predicting future events, such as the winner of the next presidential election. They spun out confident predictions but were never held accountable when their predictions proved wrong. The end result was a public sphere that rewarded overconfident blowhards. Cable news, Q.E.D.

While the thinking sins identified by Tetlock are universal - we’re all vulnerable to overconfidence and confirmation bias - it’s not clear that the flaws of political experts can be generalized to other forms of expertise. For one thing, predicting geopolitics is famously fraught: there are countless variables to consider, interacting in unknowable ways. It’s possible, then, that experts might perform better in a narrower setting, attempting to predict the outcomes of experiments in their own field.

A new study, by Stefano DellaVigna at UC Berkeley and Devin Pope at the University of Chicago, aims to put academic experts to this more stringent test. They assembled 208 experts from the fields of economics, behavioral economics and psychology and asked them to forecast the impact of different motivators on the performance of subjects performing an extremely tedious task. (They had to press the “a” and “b” buttons on their keyboard as quickly as possible for ten minutes.) The experimental conditions ranged from the obvious - paying for better performance - to the subtle, as DellaVigna and Pope also looked at the influence of peer comparisons, charity and loss aversion.  What makes these questions interesting is that DellaVigna and Pope already knew the answers: they’d run these motivational studies on nearly 10,000 subjects. The mystery was whether or not the experts could predict the actual results.

To make the forecasting easier, the experts were given three benchmark conditions and told the average number of presses, or “points,” in each condition. For instance, when subjects were told that their performance would not affect their payment, they only averaged 1521 points. However, when they were paid 10 cents for every 100 points, they averaged 2175 total points. The experts were asked to predict the number of points in fifteen additional experimental conditions.

The good news for experts is that these academics did far better than Tetlock’s pundits. When asked to predict the average points in each condition, they demonstrated the wisdom of crowds: their predictions were off by only 5 percent. If you’re a policy maker, trying to anticipate the impact of a motivational nudge, you’d be well served by asking a bunch of academics for their opinions. 

The bad news is that, on an individual level, these academics still weren’t very good. They might have looked prescient when their answers were pooled together, but the results were far less impressive if you looked at the accuracy of experts in isolation. Perhaps most distressing, at least for the egos of experts, is that non-scientists were much better at ranking the treatments against each other, forecasting which conditions would be most and least effective. (As DellaVigna pointed out in an email, this is less a consequence of expert failure and more a tribute to the fact that non-experts did “amazingly well” at the task.) The takeaway is straightforward: there might be predictive value in a diverse group of academics, but you’d be foolish to trust the forecast of a single one.
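Why can a crowd look prescient when its members do not? Averaging cancels out errors that point in opposite directions. Here is a minimal Python sketch of that arithmetic. The six forecasts and the "actual" value are invented for illustration; they are not data from DellaVigna and Pope, and the function names are hypothetical.

```python
from statistics import mean


def absolute_error(forecast: float, actual: float) -> float:
    """Distance between a single forecast and the true outcome."""
    return abs(forecast - actual)


def crowd_vs_individual(forecasts: list[float], actual: float) -> tuple[float, float]:
    """Return (error of the pooled forecast, average error of the individuals)."""
    pooled_error = absolute_error(mean(forecasts), actual)
    average_individual_error = mean(absolute_error(f, actual) for f in forecasts)
    return pooled_error, average_individual_error


# Hypothetical forecasts of average "points" in one condition (made-up numbers).
forecasts = [1700.0, 2300.0, 1900.0, 2100.0, 2250.0, 1810.0]
actual = 2000.0  # hypothetical true result

pooled, individual = crowd_vs_individual(forecasts, actual)
print(f"pooled forecast error:    {pooled:.1f} points")
print(f"average individual error: {individual:.1f} points")
```

With these made-up numbers, the pooled forecast misses by 10 points, while the typical individual misses by about 207. The general point holds for any set of forecasts: the absolute error of the average can never exceed the average absolute error.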

Furthermore, there was shockingly little relationship between the credentials of academia and overall performance. Full professors tended to underperform assistant professors, while having more Google Scholar citations was correlated with lower levels of accuracy. (PhD students were “at least as good” as their bosses.) Academic experience clearly has virtues. But making better predictions about experiments does not seem to be one of them.

Since Tetlock published his damning critique of political pundits, he has gone on to study so-called “superforecasters,” those amateurs whose predictions of world events are consistently more accurate than those of intelligence analysts with access to classified information. (In general, these superforecasters share a particular temperament: they’re willing to learn from their mistakes, quick to update their beliefs and tend to think in shades of gray.) After mining the data, DellaVigna and Pope were able to identify their own superforecasters. As a group, these non-experts significantly outperformed the academics, improving on the average error rate of the professors by more than 20 percent. These people had no background in behavioral research. They were paid $1.50 for 10 minutes of their time. And yet, they were better than the experts at predicting research outcomes.

The limitations of expertise are best revealed by the failure of the experts to foresee their own shortcomings. When the academics were surveyed by DellaVigna and Pope, they predicted that high-citation experts would be significantly more accurate. (The opposite turned out to be true.) They also expected PhD students to underperform the professors – that didn’t happen, either – and that academics with training in psychology would perform the best. (The data points in the opposite direction.)

It’s a poignant lapse. These experts have been trained in human behavior. They have studied our biases and flaws. And yet, when it comes to their own performance, they are blind to their own blindspots. The hardest thing to know is what we don’t.

DellaVigna, Stefano, and Devin Pope. Predicting Experimental Results: Who Knows What? NBER Working Paper, 2016.      

The Power of Family Memory

In a famous series of studies conducted in the 1980s, the psychologists Betty Hart and Todd Risley gave parents a new variable to worry about: the number of words they speak to their children. According to Hart and Risley, the quantity of spoken language in a household is predictive of IQ scores, vocabulary size and overall academic success. The language gap even begins to explain socio-economic disparities in educational outcomes, as upper-class parents speak, on average, about 3.5 times more to their kids than their poorer peers. Hart and Risley referred to the lack of spoken words in poor households as "the early catastrophe."

In recent years, however, it’s become clear that it’s not just the amount of language that counts. Rather, researchers have found that some kinds of conversations are far more effective at promoting mental and emotional development than others. While all parents engage in roughly similar amounts of so-called “business talk” – these are interactions in which the parent is offering instructions, such as “Hold out your hands,” or “Stop whining!” – there is far more variation when it comes to what Hart and Risley called “language dancing,” or conversations in which the parent and child are engaged in a genuine dialogue. According to a 2009 study by researchers at the UCLA School of Public Health, parent-child dialogues were six times as effective in promoting the development of language skills as those in which the adult did all the talking.

So conversation is better than instruction; dialogues over monologues. But this only leads to the next practical question: What’s the best kind of conversation to have with children? If we only have a limited amount of “language dancing” time every day - my kids usually start negotiating for dessert roughly five minutes into dinner - then what should we choose to chat about? And this isn’t just a concern for precious helicopter parents. Rather, it’s a relevant topic for researchers trying to design interventions for at-risk children, as they attempt to give caregivers the tools to ensure successful development.

A new answer is emerging. According to a recent paper by the psychologists Karen Salmon and Elaine Reese, one of the best subjects of parent-child conversation is the past, or what they refer to as “elaborative reminiscing.” As evidence, Salmon and Reese cite a wide variety of studies, drawn from more than three decades of research on children between the ages of 18 months and 5 years, all of which converge on a similar theme: discussing our memories is an extremely effective way to promote cognitive and emotional growth. Maybe it’s a scene from our last family vacation, or an accounting of what happened at school that day, or that time I locked my keys in the car - the details of the memory don’t seem to matter that much. What does is that we remember together.

Here’s an example of the everyday reminiscing the scientists recommend:

Mother: “What was the first thing he [the barber] did?”

Child: “Bzzzz.” (running his hand over his head)

Mother: “He used the clippers, and I think you liked the clippers. And you know how I know? Because you were smiling.”

Child: “Because they were tickling.”

Mother: “They were tickling, is that how they felt? Did they feel scratchy?”

Child: “No.”

Mother: “And after the clippers, what did he use then?”

Child: “The spray.”

Mother: “Yes. Why did he use the spray?”

Child: (silent)

Mother: “He used the spray to tidy your hair. And I noticed that you closed your eyes, and I thought ‘Jesse’s feeling a little bit scared,’ but you didn’t move or cry and I thought you were being very brave.”

It’s such an ordinary conversation, but Salmon and Reese point out its many virtues. For one thing, the questions are leading the child through his recent haircut experience. He is learning how to remember, what it takes to unpack a scene, the mechanics of turning the past into a story. Over time, these skills play a huge role in language development, which is why children who engage in more elaborative reminiscing with their parents tend to have more advanced vocabularies, better early literacy scores and improved narrative skills. In fact, one study found that teaching low-income mothers to “reminisce in more elaborative ways” led to bigger improvements in narrative skills and story comprehension than an interactive book-reading program.

But talking about the past isn’t just about turning our kids into better storytellers. It’s also about boosting their emotional intelligence, teaching them how to handle those feelings they’d rather forget. In A Book About Love, I wrote about research showing that children raised in households that engage in the most shared recollection report higher levels of emotional well-being and a stronger sense of personal identity. The family unit also becomes stronger, as those children and parents who know more about the past also scored higher on a widely used measure of “reported family functioning.” Salmon and Reese expand on these findings, citing research showing that emotional reminiscing is linked to long-term improvements in the ability of children to regulate their negative emotions, handle difficult situations and identify the feelings of themselves and others.

Consider the haircut conversation above. Notice how the mother identifies the feelings felt by the child: enjoyment, tickling, fear. She suggests triggers for these emotions - the clippers, the water spray - and helps her son understand their fleeting nature. (Because the feelings are no longer present, they can be discussed calmly. That’s why talking about remembered emotions is often more useful than talking about emotions in the heat of the moment.) The virtue of such dialogues is that they teach children how to cope with their feelings, even when what they feel is fury and fear. As Salmon and Reese note, these are particularly important skills for mothers who have been exposed to adverse or traumatic experiences, such as drug abuse or domestic violence. Studies show that these at-risk parents are much less likely to incorporate “emotion words” when talking with their children. And when they do discuss their memories, Salmon and Reese write, they often “remain stuck in anger.” Their past isn’t past yet.

Perhaps this is another benefit of elaborative reminiscing. When we talk about our memories with loved ones, we translate the event into language, giving that swirl of emotion a narrative arc. (As the psychologist James Pennebaker has written, "Once it [a painful memory] is language based, people can better understand the experience and ultimately put it behind them.") And so the conversation becomes a moment of therapy, allowing us to make sense of what happened and move on. 

It was just a haircut, but you were so brave.   

Salmon, Karen, and Elaine Reese. "The Benefits of Reminiscing With Young Children." Current Directions in Psychological Science 25.4 (2016): 233-238.       


The Overview Effect

After six weeks in orbit, circling the earth in a claustrophobic space station, the three-person crew of Skylab 4 decided to go on strike. For 24 hours, the astronauts refused to work and even turned off the communications radio linking them to Earth. While NASA was confused by the space revolt (mission control was concerned the astronauts were depressed), the men up in space insisted they just wanted more time to admire their view of the earth. As the NASA flight director later put it, the astronauts were asserting “their needs to reflect, to observe, to find their place amid these baffling, fascinating, unprecedented experiences.”

The Skylab 4 crew was experiencing a phenomenon known as the overview effect, which refers to the intense emotional reaction that can be triggered by the sight of the earth from beyond its atmosphere. Sam Durrance, who flew on two shuttle missions, described the feeling like this: “You’ve seen pictures and you’ve heard people talk about it. But nothing can prepare you for what it actually looks like. The Earth is dramatically beautiful when you see it from orbit, more beautiful than any picture you’ve ever seen. It’s an emotional experience because you’re removed from the Earth but at the same time you feel this incredible connection to the Earth like nothing I’d ever felt before.”

The Caribbean Sea, as seen from ISS Expedition 40

What’s most remarkable about the overview effect is that the effect lasts: the experience of awe often leaves a permanent mark on the lives of astronauts. A new paper by a team of scientists (the lead author is David Yaden at the University of Pennsylvania) investigates the overview effect in detail, with a particular focus on how this vision of earth can “settle into long-term changes in personal outlook and attitude involving the individual’s relationship to Earth and its inhabitants.” For many astronauts, this is the view they never get over.

How does this happen? How does a short-lived perception alter one’s identity? There is no easy answer. In this paper, the scientists focus on how the sight of the distant earth is so contrary to our usual perspective that it forces our “self-schema” to accommodate an entirely new point of view. We might conceptually understand that the earth is a lonely speck floating in space, a dot of blue amid so much black. But it’s an entirely different thing to bear witness to this reality, to see our fragile planet from hundreds of miles away. The end result is that the self itself is changed; this new perspective of earth alters one’s perspective on life, with the typical astronaut reporting “a greater affiliation with humanity as a whole.” Here’s Ed Gibson, the science pilot on Skylab 4: “You see how diminutive your life and concerns are compared to other things in the universe. Your life and concerns are important to you, of course. But you can see that a lot of the things you worry about do not make much difference in an overall sense.”

There are two interesting takeaways. The first one, emphasized in the paper, is that the overview effect might serve as a crucial coping mechanism for the challenges of space travel. Astronauts live a grueling existence: they are stressed, isolated and exhausted. They live in cramped quarters, eat terrible food and never stop working. If we are going to get people to Mars, then we need to give astronauts tools to endure their time on a spaceship. As the crew of Skylab 4 understood, one of the best ways to withstand space travel is to appreciate its strange beauty.

The second takeaway has to do with the power of awe and wonder. When you read old treatises on human nature, these lofty emotions are often celebrated. Aristotle argued that all inquiry began with the feeling of awe, that “it is owing to their wonder that men both now begin and at first began to philosophize.” Rene Descartes, meanwhile, referred to wonder as the first of the passions, “a sudden surprise of the soul that brings it to focus on things that strike it as unusual and extraordinary.” In short, these thinkers saw the experience of awe as a fundamental human state, a feeling so strong it could shape our lives.

But now? We have little time for awe in the 21st century; wonder is for the young and unsophisticated. To the extent we consider these feelings it’s for a few brief moments on a hike in a National Park, or to marvel at a child’s face when they first enter Disneyland. (And then we get out our phones and take a picture.) Instead of cultivating awe, we treat it as just another fleeting feeling; wonder is for those who don’t know any better.

The overview effect, however, is a reminder that these emotions can have a lasting impact. Like the Skylab 4 astronauts, we can push back against our hectic schedules, insisting that we find some time to stare out the window.  

Who knows? The view just might change your life.

Yaden, David B., et al. "The overview effect: Awe and self-transcendent experience in space flight." Psychology of Consciousness: Theory, Research, and Practice 3.1 (2016): 1.


How Magicians Make You Stupid

The egg bag magic trick is simple enough. A magician produces an egg and places it in a cloth bag. Then, the magician uses some poor sleight of hand, pretending to hide the egg in his armpit. When the bag is revealed as empty, the audience assumes it knows where the egg really is.

But the egg isn’t there. The armpit was a false solution, distracting the crowd from the real trick: the bag contains a secret compartment. When the magician finally lifts his arm, the audience is impressed by the vanishing. How did he remove the egg from his armpit? It never occurs to them that the egg never left the bag.

Magicians are intuitive psychologists, reverse-engineering the mind and preying on all its weak spots. They build illusions out of our frailties, hiding rabbits in our attentional blind spots and distracting the eyes with hand waves and wands. And while people in the audience might be aware of their perceptual shortcomings – those fingers move so fast! – they are often blind to a crucial cognitive limitation, which allows magicians to keep us from deciphering the trick. In short, magicians know that people tend to fixate on particular answers (the egg is in the armpit), and thus ignore alternative ones (it’s a trick bag), even when the alternatives are easier to execute.

When it comes to problem-solving, this phenomenon is known as the einstellung effect. (Einstellung is German for “setting” or “attitude.”) First identified by the psychologist Abraham Luchins in the early 1940s, the effect has since been replicated in numerous domains. Consider a study that gave chess experts a series of difficult chess problems, each of which contained two solutions. The players were asked to find the shortest possible way to win. The first solution was obvious and took five moves to execute. The second solution was less familiar, but could be achieved in only three moves. As expected, these expert players found the first solution right away. Unfortunately, most of them then failed to identify the second one, even though it was more efficient. The good answer blinded them to the better one.  

Back to magic tricks. A new paper in Cognition, by Cyril Thomas and Andre Didierjean, extends the reach of the einstellung effect by showing that it limits our problem-solving abilities even when the false solution is unfamiliar and unlikely. Put another way, preposterous explanations can also become mental blocks, preventing us from finding answers that should be obvious. To demonstrate this, the scientists showed 90 students one of three versions of a card trick. The first version went like this: a performer showed the subject a brown-backed card surrounded by six red-backed cards. After randomly touching the back of the red cards, he asked the subject to choose one of the six, which was turned face up. It was a jack of hearts. The magician then flipped over the brown-backed card at the center, which was also a jack of hearts. The experiment concluded with the magician asking the subject to guess the secret of the trick. In this version, 83 percent of subjects quickly figured it out: all of the cards were the same.

The second version featured the same trick, except that the magician slyly introduced a false solution. Before a card was picked, he explained that he was able to influence other people’s choices through physical suggestions. He then touched the back of the red cards, acting as if these touches could sway the subject’s mind. After the trick was complete, these subjects were also asked to identify the secret. However, most of these subjects couldn’t figure it out: only 17 percent of people realized that every card was the jack of hearts. Their confusion persisted even after the magician encouraged them to keep thinking of alternative explanations.

This is a remarkable mental failure. It’s a reminder that our beliefs are not a mirror to the world, but rather bound up with the limits of the human mind. In this particular case, our inability to see the obvious trick seems to be a side-effect of our feeble working memory, which can only focus on a few bits of information at any given moment. (In an email, Thomas notes that it is more “economical to focus on one solution, and to not lose time…searching for a hypothetical alternative one.”) And so we fixate on the most salient answer, even when it makes no sense. As Thomas points out, a similar lapse explains the success of most mind-reading performances: we are so seduced by the false explanation (parapsychology!) that we neglect the obvious trick, which is that the magician gathered personal information about us from Facebook. The performance works because we lack the bandwidth to think of a far more reasonable explanation.

Thomas and Didierjean end their paper with a disturbing thought. “If a complete stranger (the magician) can fix spectators’ minds by convincing them that he/she can control their individual choice with his own gesture,” they write, “to what extent can an authority figure (e.g., policeman) or someone that we trust (e.g., doctors, politicians) fix our mind with unsuitable ideas?” They don’t answer the question, but they don’t need to. Just turn on the news.

Thomas, Cyril, and André Didierjean. "Magicians fix your mind: How unlikely solutions block obvious ones." Cognition 154 (2016): 169-173.

What Can Toilet Paper Teach Us About Poverty?

“Costco is where you go broke saving money.”

-My Uncle

The fundamental paradox of big box stores is that the only way to save money is to spend lots of it. Want to get a discount on that shampoo? Here's a liter. That’s a great price for chapstick – now you have 32 of them. The same logic applies to most staples of modern life, from diapers to Pellegrino, Uni-ball pens to laundry detergent.

For consumers, this buy-in-bulk strategy can lead to real savings, especially if the alternative is a bodega or Whole Foods. (Brand name diapers, for instance, cost nearly twice as much at my local grocery store compared to Costco.) However, not every American is equally likely to seek out these discounts. In particular, some studies have found that lower-income households  – the ones who could benefit the most from that huge bottle of Kirkland shampoo – pay higher prices because they don’t make bulk purchases.

A new paper, “Frugality is Hard to Afford,” by A. Yesim Orhun and Mike Palazzolo investigates why this phenomenon exists. Their data set featured the toilet paper purchases of more than 100,000 American families over seven years. Orhun and Palazzolo focused on toilet paper for several reasons. First, consumption of toilet paper is relatively constant. Second, toilet paper is easy to store – it doesn’t spoil – making it an ideal product to purchase in bulk, at least if you’re trying to get a discount. Third, the range of differences between brands of toilet paper is rather small, at least when compared to other consumer products such as detergent and toothpaste. 

So what did Orhun and Palazzolo find? As expected, lower-income households were far less likely to take advantage of the lower unit prices that come with bulk purchases. Over time, these shopping habits add up, as the poorest families end up paying, on average, 5.9 percent more per sheet of toilet paper.
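The per-sheet premium is simple arithmetic: divide the package price by the number of sheets, then compare. Here is a rough Python sketch. The prices and package sizes are invented for illustration and do not come from the Orhun and Palazzolo data set; the function names are hypothetical.

```python
def price_per_sheet(package_price: float, rolls: int, sheets_per_roll: int) -> float:
    """Unit price: what one sheet costs in a given package."""
    return package_price / (rolls * sheets_per_roll)


def percent_premium(small_unit_price: float, bulk_unit_price: float) -> float:
    """How much more, in percent, the small package costs per sheet."""
    return (small_unit_price / bulk_unit_price - 1) * 100


# Hypothetical shelf prices (made-up numbers, not from the paper).
small_pack = price_per_sheet(package_price=4.00, rolls=4, sheets_per_roll=250)
bulk_pack = price_per_sheet(package_price=22.50, rolls=24, sheets_per_roll=250)

premium = percent_premium(small_pack, bulk_pack)
print(f"small pack costs {premium:.1f}% more per sheet")
```

In this invented example, the shopper who can only afford the four-pack pays about 6.7 percent more per sheet, even though the total at the register is far lower.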

The question, of course, is why this behavior exists. Shouldn’t poor households be the most determined to shop around for cheap rolls? The most obvious explanation is what Orhun and Palazzolo refer to as a liquidity constraint: the poor simply lack the cash to “invest” in a big package of toilet paper. As a result, they are forced to buy basic household supplies on an as-needed basis, which makes it much harder to find the best possible price.

But this is not the only constraint imposed by poverty. In a 2013 Science paper, the behavioral scientists Anandi Mani, Sendhil Mullainathan, Eldar Shafir and Jiaying Zhao argued that not having money also imposes a mental burden, as our budgetary worries consume scarce attentional resources. This makes it harder for low-income households to plan for the future, whether it’s buying toilet paper in bulk or saving for retirement. “The poor, in this view, are less capable not because of inherent traits,” write the scientists, “but because the very context of poverty imposes load and impedes cognitive capacity.”

Consider a clever experiment conducted by Mani et al. at a New Jersey mall. They asked shoppers about various hypothetical scenarios involving a financial problem. For instance, they might be told that their “car is having some trouble and requires $[X] to be fixed.” Some subjects were told that their repair was extremely expensive ($1500), while others were told it was relatively cheap ($150). Then, all participants were given a series of challenging cognitive tasks, including some questions from an intelligence test and a measure of impulse control.

The results were startling. Among rich subjects, it didn’t really matter how much the car cost to fix – they performed equally well when the repair estimate was $150 or $1500. Poor subjects, however, showed a troubling difference. When the repair estimate was low, they performed roughly as well as rich subjects. But when the repair estimate was high they suddenly showed a steep drop-off in performance on both tests, comparable in magnitude to the mental deficit associated with losing a full night of sleep or becoming an alcoholic.

This new toilet paper study provides some additional evidence that poverty takes a toll on our choices. In one analysis, Orhun and Palazzolo looked at how purchase behavior was altered at the start of the month, when low income households are more likely to receive paychecks and food stamps. As the researchers note, this influx of money should temporarily ease the stress of being poor, thus making it easier to buy in bulk.  

That’s exactly what they found. When the poorest households were freed from their most pressing liquidity constraints, they made much more cost-effective toilet paper decisions. (This also suggests that poorer households are not simply buying smaller bundles due to a lack of storage space or transportation, as these factors are not likely to exhibit week-by-week fluctuation.) Of course, the money didn't last long; the following week, these households reverted to their old habits, overpaying for household products. And so those with the least end up with even less.

Orhun, A. Yesim, and Mike Palazzolo. "Frugality is hard to afford." University of Michigan Working Paper (2016).

Mani, Anandi, Sendhil Mullainathan, Eldar Shafir, and Jiaying Zhao. "Poverty impedes cognitive function." Science 341 (2013): 976-980.

The Nordic Paradox

By virtually every measure, the Nordic countries – Denmark, Finland, Iceland, Norway and Sweden – are a paragon of gender equality. It doesn’t matter if you’re looking at the wage gap or political participation or educational attainment: the Nordic region is the most gender equal place in the world.

But this equality comes with a disturbing exception: Nordic women also suffer from intimate partner violence (IPV) at extremely high rates. (IPV is defined by the CDC as the experience of “physical violence, sexual violence, stalking and psychological aggression by a current or former intimate partner.”) While the average lifetime prevalence for intimate partner violence for women living in Europe is 22 percent – a horrifyingly high number by itself – Nordic countries perform even worse. In fact, Denmark has the highest rate of IPV in the EU at 32 percent, closely followed by Finland (30 percent) and Sweden (28 percent). And it’s not just violence from partners: other surveys have looked at violence against women in general. Once again, the Nordic countries had some of the highest rates of violence in the EU, as measured by reports of sexual assault, physical abuse or emotional abuse.

A new paper in Social Science & Medicine by Enrique Gracia and Juan Merlo refers to the existence of these two realities – gender equality and high rates of violence against women – as the Nordic paradox. It’s a paradox because a high risk of IPV for women is generally associated with lower levels of gender equality, particularly in poorer countries. (For example, 71 percent of Ethiopian women have suffered from IPV.) This makes intuitive sense: a country that disregards the rights of women, or fails to treat them as equals, also seems more likely to tolerate their abuse.

And yet, the same logic doesn’t seem to apply at the other extreme of gender equality. As Gracia and Merlo note, European countries with lower levels of gender equality, such as Italy and Greece, also report much lower levels of IPV (roughly 30 percent lower) than Nordic nations.

What explains this paradox? Why hasn’t the gender equality of Nordic countries reduced violence against women? That’s the tragic mystery investigated by Gracia and Merlo.

One possibility is that the paradox is caused by differences in reporting, as women in Nordic countries might feel more free to disclose the abuse. This also makes intuitive sense: if you live in a country with higher levels of gender equality, then you might be less likely to fear retribution when accusing a partner, or telling the police about a sex crime. (In Saudi Arabia, only 3.3 percent of women who suffered from IPV told the police or a judge.) However, Gracia and Merlo cast doubt on this explanation, noting that the available evidence suggests lower levels of disclosure of IPV among women in the Nordic countries. For instance, while 20 percent of women in Europe said that the most serious incident of IPV they’d experienced was brought to the attention of the police, only 10 percent of women in Denmark and Finland could say the same thing. The same trend is supported by other data, including rape statistics and “victim blaming” surveys. Finally, even if part of the Nordic paradox was a reporting issue, this would only reinforce the real mystery, which is that gender equal societies still suffer from epidemic levels of violence against women.

The main hypothesis advanced by Gracia and Merlo – and it’s only a hypothesis – is that high gender equality might create a backlash effect among men, triggering high levels of violence against women.  Because gender equality disrupts traditional gender norms, it might also reinforce “victim-blaming attitudes,” in which the violence is excused or justified. Gracia and Merlo cite related studies showing that women with “higher economic status relative to their partners can be at greater IPV risk depending on whether their partners hold more traditional gender beliefs.” For these backwards men, the success of women is perceived as a threat, an undermining of their identity. This backlash is further exacerbated by women becoming more independent and competitive in gender equal societies, thus increasing the potential for conflict with partners who insist on control and subservience. Progress leaves some people behind, and those people tend to get angry.

At best, the backlash effect is only a partial explanation for the Nordic Paradox. Gracia and Merlo argue that a real understanding of the prevalence of IPV – why is it still so common, even in developed countries? – will require looking beyond national differences and instead investigating the risk factors that affect the individual. How much does he drink? What is her employment status? Do they live together? What is the neighborhood like? Even brutish behaviors have complicated roots; we need a thick description of life to understand them.  

On the one hand, the Nordic paradox is a testament to liberal values, a reminder that thousands of years of gender inequality can be reversed in a few short decades. The progress is real. But it’s also a reminder that progress is difficult, full of strange backlashes and reversals. Two steps forward, one step back. Or is it the other way around? We can see the moral universe bending, but goddamn is it slow.

Gracia, Enrique, and Juan Merlo. "Intimate partner violence against women and the Nordic paradox." Social Science & Medicine 157 (2016): 27-30.

via MR

Did "Clean" Water Increase the Murder Rate?

The construction of public waterworks across the United States in the late 19th and early 20th centuries was one of the great infrastructure investments in American history. As David Cutler and Grant Miller have demonstrated, these waterworks accounted for “nearly half of the total mortality reduction in major cities, three-quarters of the infant mortality reduction, and nearly two-thirds of the child mortality reduction.” Within a generation, the scourge of waterborne infectious diseases – from cholera to typhoid fever – was largely eliminated. Moving to a city no longer took years off your life, a sociological trend that unleashed untold amounts of human innovation.

However, not all urban waterworks were created equal. Some systems were built with metal pipes containing large amounts of lead. (At the time, lead pipes were considered superior to iron pipes, as they were more durable and easier to bend.) Unfortunately, these pipes leached lead particulates into the water, exposing city dwellers to water that tasted clean but was actually a poison.

Over the last few decades, researchers have amassed an impressive body of evidence linking lead exposure in childhood to a tragic list of symptoms, including higher rates of violent crime and lower scores on the IQ test. (One study found that lead levels are four times higher among convicted juvenile offenders than among non-delinquent high school students.) In 2014, I wrote about a paper by Jessica Wolpaw Reyes that documented the association between leaded gasoline and violent crime:

Reyes concluded that “the phase-out of lead from gasoline was responsible for approximately a 56 percent decline in violent crime” in the 1990s. What’s more, Reyes predicted that the Clean Air Act would continue to generate massive societal benefits in the future, “up to a 70 percent drop in violent crime by the year 2020.” And so a law designed to get rid of smog ended up getting rid of crime. It’s not the prison-industrial complex that keeps us safe. It’s the EPA.

But these studies have their limitations. For one thing, the pace at which states reduced their use of leaded gas might be related to other social or political variables that influence the crime rate. It’s also possible that those neighborhoods with the highest risk of lead poisoning might suffer from additional maladies linked to crime, such as poverty and poor schools. To convincingly demonstrate that lead causes crime, researchers need to find a credible source of variation in lead exposure that is completely independent of (aka exogenous to) the factors that might shape criminal behavior.

That is the goal of a new paper by James Feigenbaum and Christopher Muller. Their study mines the historical record, drawing from homicide data between 1921 and 1936 (when the first generation of children exposed to lead pipes were adults) and the materials used to construct each urban water system. If lead was responsible for higher crime rates, then those cities with higher lead content in their pipes (and also more acidic water, which leaches out the lead) should also experience larger spikes in crime decades later.

What makes this research strategy especially useful is that the decision to use lead pipes in a city’s water system was based in part on its proximity to a lead refinery. (Cities that were closer to a refinery were more likely to invest in lead pipes, as the lower transportation costs made the “superior” option more affordable.) In addition, Feigenbaum and Muller were able to look at how the lead content of pipes interacted with the acidity of a city’s water supply, thus allowing them to further isolate the causal role of lead.

The results were clear: cities that used lead pipes had homicide rates that were between 14 and 36 percent higher than cities that opted for cheaper iron pipes.

These violent crime increases are especially striking given that those cities using lead pipes tended to be wealthier, better educated and more “health conscious” than those that did not. All things being equal, one might expect these places to have lower rates of violent crime. But because of a little-noticed engineering decision, the water of these cities contained a neurotoxin, which interfered with brain development and made it harder for their residents to rein in their emotions.

The brain is a plastic machine, molded by its environment. When we introduce a new technology – and it doesn’t matter if it’s an urban water system or the smartphone – it’s often impossible to predict the long-term consequences. Who would have guessed that the more expensive lead pipes would lead to spikes in crime decades later? Or that the heavy use of road salt in the winter would lead to a 21st century water crisis in Flint, as the chloride ions pull lead out of the old pipes?

One day, the scientists of the future will study our own blind spots, as we invest in technologies that mess with the mind in all sorts of subtle ways. History reminds us that these tradeoffs are often unexpected. After all, it took decades before we realized that, for some unlucky cities, even clean water came with a terrible cost.

James Feigenbaum and Christopher Muller. "Lead Exposure and Violent Crime in the Early Twentieth Century." Explorations in Economic History (2016)

The Importance of Learning How to Fail

“An expert is a person who has found out by his own painful experience all the mistakes that one can make in a very narrow field.” -Niels Bohr

Carol Dweck has devoted her career to studying how our beliefs about intelligence influence the way we learn. In general, she finds that people subscribe to one of two different theories of mental ability. The first theory is known as the fixed mindset – it holds that intelligence is a fixed quantity, and that each of us is allotted a certain amount of smarts we cannot change. The second theory is known as the growth mindset. It’s more optimistic, holding that our intelligence and talents can be developed through hard work and practice. "Do people with this mindset believe that anyone can be anything?" Dweck asks. "No, but they believe that a person's true potential is unknown (and unknowable); that it's impossible to foresee what can be accomplished with years of passion, toil, and training."

You can probably guess which mindset is more useful for learning. As Dweck and colleagues have repeatedly demonstrated, children with a fixed mindset tend to wilt in the face of challenges. For them, struggle and failure are a clear sign they aren’t smart enough for the task; they should quit before things get embarrassing. Those with a growth mindset, in contrast, respond to difficulty by working harder. Their faith in growth becomes a self-fulfilling prophecy; they get smarter because they believe they can.

The question, of course, is how to instill a growth mindset in our children. While Dweck is perhaps best known for her research on praise – it’s better to compliment a child for her effort than for her intelligence, as telling a kid she’s smart can lead to a fixed mindset – it remains unclear how children develop their own theories about intelligence. What makes this mystery even more puzzling is that, according to multiple studies, the mindsets of parents are surprisingly disconnected from the mindsets of their children. In other words, believing in the plasticity of intelligence is no guarantee that our kids will feel the same way.

What explains this disconnect? One possibility is that parents are accidental hypocrites. We might subscribe to the growth mindset for ourselves, but routinely praise our kids for being smart. Or perhaps we tell them to practice, practice, practice, but then get frustrated when they can’t master fractions, or free throws, or riding without training wheels. (I’m guilty of both these sins.) The end result is a muddled message about the mind’s potential.

However, in an important new paper, Kyla Haimovitz and Carol Dweck reveal the real influence behind the mindsets of our children. It turns out that the crucial variable is not what we think about intelligence – it’s how we react to failure.

Consider the following scenario: a child comes home with a bad grade on a math quiz. How do you respond? Do you try to comfort the child and tell him that it’s okay if he isn’t the most talented? Will you worry that he isn’t good at math? Or would you encourage him to describe what he learned from doing poorly on the test?

Parents with a failure-is-debilitating attitude tend to focus on the importance of performance: doing well on the quiz, succeeding at school, getting praise from other people. When confronted with the specter of failure, these parents get anxious and worried. Over time, their children internalize these negative reactions, concluding that failure is a dead-end, to be avoided at all costs. If at first you don’t succeed, then don’t try again.

In contrast, those parents who see failure as part of the learning process are more likely to see the bad grade as an impetus for extra effort, whether it’s asking the teacher for help or trying a new studying strategy. They realize that success is a marathon, requiring some pain along the way. You only learn how to get it right by getting it wrong.

According to the scientists, this is how our failure mindsets get inherited – our children either learn to focus on the appearance of success or on the long-term rewards of learning. Over time, these attitudes towards failure shape their other mindsets, influencing how they feel about their own potential. If they worked harder, could they get good at math? Or is algebra simply beyond their reach?

Although the scientists found that children were bad at guessing the intelligence mindsets of their parents – they don’t know if we’re in the growth or fixed category – the kids were surprisingly good at predicting their parents’ relationship to failure. This suggests that our failure mindsets are much more “visible” than our beliefs about intelligence. Our children might forget what happened after the home-run, but they damn sure remember what we said after the strike-out.

This study helps clarify the forces that shape our children. What matters most is not what we say after a triumph or achievement – it’s how we deal with their disappointments. Do we pity our kids when they struggle? (Sympathy is a natural reaction; it also sends the wrong message.) Do we steer them away from potential defeats? Or do we remind them that failure is an inescapable part of life, a state that cannot be avoided, only endured? Most worthy things are hard.

Haimovitz, K., and C. S. Dweck. "What Predicts Children's Fixed and Growth Intelligence Mind-Sets? Not Their Parents' Views of Intelligence but Their Parents' Views of Failure." Psychological Science (2016).


Is Tanking An Effective Strategy in the NBA?

In his farewell manifesto, former Philadelphia 76ers General Manager Sam Hinkie spends 13 pages explaining away the dismal performance of his team, which has gone 47-199 over the last three seasons. Hinkie’s main justification for all the losses involves the consolation of draft picks, which are the NBA’s way of rewarding the worst teams in the league. Here’s Hinkie:

"In the first 26 months on the job we added more than one draft pick (or pick swap) per month to our coffers. That’s more than 26 new picks or options to swap picks over and above the two per year the NBA allots each club. That’s not any official record, because no one keeps track of such records. But it is the most ever. And it’s not close. And we kick ourselves for not adding another handful."

This is the tanking strategy. While the 76ers have been widely criticized for their consistent disinterest in winning games, Hinkie argues that it was a necessary by-product of their competitive position in 2013, when he took over as GM. (According to a 2013 ESPN ranking of each NBA team’s three-year winning potential, the 76ers ranked 24th out of 30.) And so Hinkie, with his self-described “reverence for disruption” and “contrarian mindset,” set out to take a “long view” of basketball success. The best way for the 76ers to win in the future was to keep on losing in the present.  

Hinkie is a smart guy. At the very least, he was taking advantage of the NBA’s warped incentive structure, which can lead to a “treadmill of mediocrity” among teams too good for the lottery but too bad to succeed in the playoffs. However, Hinkie’s devotion to tanking – and his inability to improve the team’s performance – does raise an interesting set of empirical questions. Simply put: is tanking in the NBA an effective strategy? (I’m a Lakers fan, so it would be nice to know.) And if tanking doesn't work, why doesn't it?

A new study in the Journal of Sports Economics, published six days before Hinkie’s resignation, provides some tentative answers. In the paper, Akira Motomura, Kelsey Roberts, Daniel Leeds and Michael Leeds set out to determine whether or not it “pays to build through the draft in the National Basketball Association.” (The alternative, of course, is to build through free agency and trades.) Motomura et al. rely on two statistical tests to make this determination. The first test is whether teams with more high draft picks (presumably because they tanked) improve at a faster rate than teams with fewer such picks. The second test is whether teams that rely more on players they have drafted for themselves win more games than teams that acquire players in other ways. The researchers analyzed data from the 1995 to 2013 NBA seasons.

What did they find? The punchline is clear: building through the draft is not a good idea. Based on the data, Motomura et al. conclude that “recent high draft picks do not help and often reduce improvement,” as teams with one additional draft pick between 4 and 10 can be expected to lose an additional 6 to 9 games three years later. Meanwhile, those teams lucky enough to have one of the first three picks should limit their expectations, as those picks tend to have “little or no impact” on team performance. The researchers are blunt: “Overall, having more picks in the Top 17 slots of the draft does not help and tends to be associated with less improvement.”

There are a few possible explanations for why the draft doesn’t rescue bad teams. The most likely source of failure is the sheer difficulty of selecting college players, even when you’re selecting first. (One study found that draft order predicts only about 5 percent of a player’s performance in the NBA.) For every Durant there are countless Greg Odens; Hinkie’s own draft record is a testament to the intrinsic uncertainty of picking professional athletes.

That said, some general managers appear to be far better at evaluating players. “While more and higher picks do not generally help teams, having better pickers does,” write the scientists. They find, for instance, that R.C. Buford, the GM of the Spurs, is worth an additional 23 to 29 wins per season. Compare that to the “Wins Over Replacement” generated by Stephen Curry, who has just finished one of the best regular season performances in NBA history. According to Basketball Reference, Curry was worth an additional 26.4 wins during the 2015-2016 regular season. If you believe these numbers, R.C. Buford is one of the most valuable (and underpaid) men in the NBA.

So it’s important to hire the best GM. But this new study also finds franchise effects that exist independently of the general manager, as certain organizations are simply more likely to squeeze wins from their draft picks. The researchers credit these franchise differences largely to player development, especially when it comes to “developing players who might not have been highly regarded entering the NBA.” This is proof that “winning cultures” are a real thing, and that a select few NBA teams are able to consistently instill the habits required to maximize the talent of their players. Draft picks are nice. Organizations win championships. And tanking is no way to build an organization.

In his manifesto, Hinkie writes at length about the importance of bringing the rigors of science to the uncertainties of sport: “If you’re not sure, test it,” Hinkie writes. “Measure it. Do it again. See if it repeats.” Although previous research by the sports economist Dave Berri has cast doubt on the effectiveness of tanking, this new paper should remind every basketball GM that the best way to win over the long-term is to develop a culture that doesn’t try to lose.

Motomura, Akira, et al. “Does It Pay to Build Through the Draft in the National Basketball Association?” Journal of Sports Economics, March 2016.

Does Stress Cause Early Puberty?

The arrival of puberty is a bodily event influenced by psychological forces. The most potent of these forces is stress: decades of research have demonstrated that a stressful childhood accelerates reproductive development, at least as measured by menarche, or the first menstrual cycle. For instance, girls growing up with fathers who have a history of socially deviant behavior tend to undergo puberty a year earlier than those with more stable fathers, while girls who have been maltreated (primarily because of physical or sexual abuse) begin menarche before those who have not. One study even found that Finnish girls evacuated from their homeland during WWII – they had to endure the trauma of separation from their parents – reached puberty at a younger age and had more children than those who stayed behind. 

There’s a cold logic behind these correlations. When times are stressful, living things tend to devote more resources to reproductive development, as they want to increase the probability of passing on their genes before death. This leads to earlier puberty and reduced investment in developmental processes less directly related to sex and mating. If nothing else, the data is yet another reminder that early childhood stress has lasting effects, establishing developmental trajectories that are hard to undo.

But these unsettling findings leave many questions unanswered. For starters, what kind of stress is the most likely to speed up reproductive development? Scientists often divide early life stressors into two broad categories: harshness and unpredictability. Harshness is strongly related to a lack of money, and is typically measured by looking at how a family’s income relates to the federal poverty line.  Unpredictability, in contrast, is linked to factors such as the consistency of father figures inside the house and the number of changes in residence. Are both of these forms of stress equally important at triggering the onset of reproductive maturation? Or do they have different impacts on human development?

Another key question is how this stress can be buffered. If a child is going to endure a difficult beginning, then what is the best way to minimize the damage?

These questions get compelling answers in a new study by a team of researchers from four different universities. (The lead author is Sooyeon Sung at the University of Minnesota.) Their subjects were 492 females born in 1991 at ten different hospitals across the United States. Because these girls were part of a larger study led by the National Institute of Child Health and Human Development, Sung et al. were able to draw on a vast amount of relevant data, from a child’s attachment to her mother at 15 months to the fluctuating income of her family. These factors were then tested against the age of menarche, as the scientists attempted to figure out the psychological variables that determine the onset of puberty.

The first thing they found is that environmental harshness (but not unpredictability) predicts the timing of the first menstrual cycle. While this correlation is limited by the relatively small number of impoverished families in the sample, it does suggest that not all stress is created equal, at least when it comes to the acceleration of reproductive development. It’s also evidence that poverty itself is stressful, and that children raised in the poorest households are marked by their scarcities.

But the news isn’t all terrible. The most significant result to emerge from this new paper is that the effects of childhood stress on reproductive development can be minimized by a secure mother-daughter relationship. When the subjects were 15 months old, they were classified using the Strange Situation procedure, a task pioneered by Mary Ainsworth in the mid-1960s. The experiment is a carefully scripted melodrama, as a child is repeatedly separated from and reunited with his or her mother. The key variable is how the child responds to these reunions. Securely attached infants get upset when their mothers leave, but are excited by their return; they greet their mothers with affectionate hugs and are quickly soothed. Insecure infants, on the other hand, are difficult to calm down, either because they feign indifference to their parent or because they react with anger when she comes back.

Countless studies have confirmed the power of these attachment categories: Securely attached infants get better grades in high school, have more satisfying marriages and are more likely to be sensitive parents to their own children, to cite just a few consistent findings. However, this new study shows that having a secure attachment can also dramatically minimize the developmental effects of stress and poverty, at least when measured by the onset of puberty. 

Love is easy to dismiss as a scientific variable. It’s an intangible feeling, a fiction invented by randy poets and medieval troubadours. How could love matter when life is sex and death and selfish genes?

And yet, even within the unsparing framework of evolution we can still measure the sweeping influence of love. For these children growing up in the harshest environments, the security of attachment is not just a source of pleasure. It's their shield.

Sung, Sooyeon, Jeffry A. Simpson, Vladas Griskevicius, Sally I-Chun Kuo, Gabriel L. Schlomer, and Jay Belsky. "Secure infant-mother attachment buffers the effect of early-life stress on age of menarche." Psychological Science, 2016.

The Curious Robot

Curiosity is the strangest mental state. The mind usually craves certainty; being right feels nice; mystery is frustrating. But curiosity pushes back against these lesser wants, compelling us to seek out the unknown and unclear. To be curious is to feel the pleasure of learning, even when what we learn is that we’re wrong.

One of my favorite theories of curiosity is the so-called “information gap” model, first developed by George Loewenstein of Carnegie Mellon in the early 90s. According to Loewenstein, curiosity is what happens when we experience a gap “between what we know and what we want to know…It is the feeling of deprivation that results from an awareness of the gap.” As such, curiosity is a mostly aversive state, an intellectual itch begging to be scratched. It occurs when we know just enough to know how little we understand.

The abstract nature of curiosity – it’s a motivational state unlinked to any specific stimulus or reward – has made it difficult to study, especially in the lab. There is no test to measure curiosity, nor is there a way to assess its benefits in the real world. Curiosity seems important – “Curiosity is, in great and generous minds, the first passion and the last,” wrote Samuel Johnson – but at times this importance verges on the intangible.

Enter a new paper by the scientists Pierre-Yves Oudeyer and Linda Smith that explores curiosity in robots and its implications for human nature. The paper is based on a series of experiments led by Oudeyer, Frederic Kaplan and colleagues in which a pair of adorable quadruped machines – they look like dogs from the 22nd century – were set loose on an infant play mat. One of these robots is the “learner,” while the other is the “teacher.” Here’s a picture of the setup:

The learner robot begins with a set of “primitives,” or simple pre-programmed instincts. It can, for instance, turn its head, kick its legs and make sounds of various pitches. These primitives begin as just that: crude scripts of being, patterns of actions that are not very impressive. The robot looks about as useful as a newborn.

But these primitives have a magic trick: they are bootstrapped to a curious creature, as the robot has been programmed to seek out those experiences that are the most educational. Consider a simple leg movement. The robot begins by predicting what will happen after the movement. Will the toy move to the left? Will the teacher respond with a sound? Then, after the leg kick, the robot measures the gap between its predictions and reality. This feedback leads to a new set of predictions, which leads to another leg kick and another measurement of the gap. A shrinking gap is evidence of its learning.

Here’s where curiosity proves essential. As the scientists note, the robot is tuned to explore “activities where the estimated reward from learning progress is high,” where the gap between what it predicts and what actually happens decreases most quickly. Let’s say, for instance, that the robot has four possible activities to pursue, represented in the chart below:

A robot driven by curiosity will avoid activity 4 – too easy, no improvement – and also activity 1, which is too hard. Instead, it will first focus on activity 3, as investing in that experience leads to a sharp drop in prediction errors. Once that curve starts to flatten – the robot has begun learning at a slower rate – it will shift to activity 2, as that activity now generates the biggest educational reward.
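To make that selection rule concrete, here is a minimal sketch in Python. It is not the code from Oudeyer's experiments: the four error curves, their numbers and the function names are all invented for illustration, and "learning progress" is simply taken to be the most recent drop in prediction error for an activity.

```python
import math

# Hypothetical prediction-error curves (an assumption, not data from the paper):
# each function returns the robot's error on its k-th trial of that activity.
ERROR_CURVES = {
    1: lambda k: 1.0,                            # too hard: the error never drops
    2: lambda k: 0.1 + 0.9 * math.exp(-k / 40),  # learnable, but slowly
    3: lambda k: 0.1 + 0.9 * math.exp(-k / 10),  # learnable quickly
    4: lambda k: 0.05,                           # too easy: nothing left to learn
}

def learning_progress(errors):
    """Most recent drop in prediction error for one activity."""
    return errors[-2] - errors[-1]

def practice(history, activity):
    """Run one trial of an activity and record the prediction error."""
    k = len(history[activity])
    history[activity].append(ERROR_CURVES[activity](k))

def run_curious_robot(steps):
    """Greedily pick whichever activity is currently shrinking the gap fastest."""
    history = {activity: [] for activity in ERROR_CURVES}
    # Try every activity twice so that a first progress estimate exists.
    for activity in ERROR_CURVES:
        practice(history, activity)
        practice(history, activity)
    choices = []
    for _ in range(steps):
        best = max(history, key=lambda a: learning_progress(history[a]))
        practice(history, best)
        choices.append(best)
    return choices

choices = run_curious_robot(30)
print(choices)
```

With these made-up curves, the robot spends its first stretch of free choices on activity 3, moves to activity 2 once activity 3 starts to flatten, and never bothers with activities 1 or 4, since neither offers any measurable progress. A real robot would also need some random exploration, because its error measurements are noisy and the world can change.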

This simple model of curiosity – it leads us to the biggest knowledge gaps that can be closed in the least amount of time – generates consistent patterns of development, at least among these robots. In Oudeyer's experiments, the curious machines typically followed the same sequence of pursuits. The first phase involved “unorganized body babbling,” which led to the exploration of each “motor primitive.” These primitives were then applied to the external environment, often with poor results: the robot might vocalize towards the elephant toy (which can’t talk back), or try to hit the teacher. The fourth phase featured more effective interactions, such as talking to the teacher robot (rather than hitting it), or grasping the elephant. “None of these specific objectives were pre-programmed,” write the scientists. “Instead, they self-organized through the dynamic interaction between curiosity-driven exploration, statistical inference, the properties of the body, and the properties of the environment.”

It’s an impressive achievement for a mindless machine. It’s also a clear demonstration of the power of curiosity, at least when unleashed in the right situation. As Oudeyer and Smith note, many species are locked in a brutal struggle to survive; they have to prioritize risk avoidance over unbridled interest. (Curiosity killed the cat and all that.) Humans, however, are “highly protected for a long period” in childhood, a condition of safety that allows us, at least in theory, to engage in reckless exploration of the world. Because we have such a secure beginning, our minds are free to enjoy learning with little consideration of its downside. Curiosity is the faith that education is all upside.*

The implication, of course, is that curiosity is a defining feature of human development, allowing us to develop “domain-specific” talents – speech, tool use, literacy, chess, etc. – that require huge investments of time and attention.* When it comes to complex skills, failure is often a prerequisite for success; we only learn how to get it right by getting it wrong again and again. Curiosity is what draws us to these useful errors. It’s the mental quirk that lets us enjoy the steepest learning curves, those moments when we become all too aware of the endless gaps in our knowledge. The point of curiosity is not to make those gaps disappear – it’s to help us realize they never will.

*A new paper by Christopher Hsee and Bowen Ruan in Psychological Science demonstrates that even curiosity can have negative consequences. Across a series of studies, they show that our "inherent desire" to resolve uncertainty can lead people to endure aversive stimuli, such as electric shocks, even when the curiosity comes with no apparent benefit. They refer to this as the Pandora Effect. I'd argue, however, that the occasional perversities of curiosity are far outweighed by the curse of being incurious, as that can lead to confirmation bias, overconfidence, filter bubbles and all sorts of errors with massive consequences, both at the individual and societal level.

Oudeyer, Pierre-Yves, and L. Smith. "How evolution may work through curiosity-driven developmental process." Topics in Cognitive Science.

Money, Pain, Death

Last December, the economists Anne Case and Angus Deaton published a paper in PNAS highlighting a disturbing trend: more middle-aged white Americans are dying. In particular, whites between the ages of 45 and 54 with a high school degree or less have seen their mortality rate increase by 134 people per 100,000 between 1999 and 2013. This increase exists in stark contrast to every other age and ethnic demographic group, both in America and other developed countries. In the 21st century, people are supposed to be living longer, not dying in the middle of life.

What’s going on? A subsequent statistical analysis by Andrew Gelman suggested that a significant part of the effect was due to the aging population, as there are now more people in the older part of the 45-54 cohort. (And older people are more likely to die.) However, this correction still doesn’t explain much of the recent change in the mortality rate, nor does it explain why the trend only exists in the United States.

To explain these rising death rates, Case and Deaton cite a number of potential causes, from a spike in suicides to the prevalence of obesity. However, their data reveal that the single biggest contributor was drug poisonings, which rose more than fourfold between 1999 and 2013. This tragic surge has an equally tragic explanation: in the late 1990s, powerful opioid painkillers became widely available, leading to a surge in prescriptions. In 1991, there were roughly 76 million prescriptions written for opioids in America. By 2013, there were nearly 207 million.

Here’s where the causal story gets murky, at least in the Case and Deaton paper. Nobody really knows why painkillers have become so much more popular. Are they simply a highly addictive scourge unleashed by Big Pharma? Or is the rise in opioid prescriptions triggered, at least in part, by a parallel rise in chronic physical pain? Case and Deaton suggest that it’s largely the latter, as their paper highlights the increase in reports of pain among middle-aged whites. “One in three white non-Hispanics aged 45–54 reported chronic joint pain,” write the economists, “one in five reported neck pain; and one in seven reported sciatica.” America is in the midst of a pain epidemic.

To review the proposed causal chain: more white people are dying because more white people are taking painkillers because more white people are experiencing severe pain. But this bleak narrative leads to the obvious question: what is causing all this pain?

That question, which has no easy answer, is the subject of a new paper in Psychological Science by Eileen Chou, Bidhan Parmar and Adam Galinsky. Their hypothesis is that our epidemic of pain is caused, at least in part, by rising levels of economic insecurity.

The paper begins with a revealing survey result. After getting data on 33,720 households spread across the United States, the scientists found that when both adults were unemployed, households spent 20 percent more on over-the-counter painkillers, such as Tylenol and Midol. A follow-up survey revealed that employment status was indeed correlated with reports of pain, and that inducing a feeling of economic hardship – the scientists asked people to recall a time when they felt financially insecure – nearly doubled the amount of pain people reported. In other words, the mere memory of money problems set their nerves on fire.

Why does economic insecurity increase our perception of physical pain? In a lab experiment, the scientists asked more than 100 undergraduates at the University of Virginia to plunge their hand into a bucket of 34-degree ice water for as long as it felt comfortable. Then, the students were randomly divided into two groups. The first group was the high-insecurity condition. They read a short text that highlighted their bleak economic prospects:

"Research conducted by Bureau of Labor Statistics reveals that more than 300,000 recent college grads are working minimum wage jobs, a figure that is twice as high as it was merely 10 years ago. Certain college grads bear more of the burden than others. In particular, students who do not graduate from top 10 national universities (e.g., Princeton and Harvard) fare significantly worse than those who do".

The students were then reminded that the University of Virginia was the 23rd best college in the United States, at least according to US News & World Report.

In contrast, those students assigned to the low insecurity condition were given good news:

"Certain college grads are shield [sic] from the economic turmoil more than others. In particular, students who graduate from top 10 public universities (e.g., UC Berkeley and UVA) fare significantly better on the job market than those who do not. These college grads have a much easier time finding jobs."

These students were reminded that the University of Virginia was the second highest ranked public university.

After this intervention, all of the students were taken back to the ice bucket station. Once again, they were asked to keep their hand in the cold water for as long as it felt comfortable. As predicted, those primed to feel economically insecure showed much lower levels of pain tolerance:

The scientists speculate that the mediating variable between economic insecurity and physical pain is a lack of control. When people feel stressed about money, they feel less in control of their lives, and that lack of control exacerbates their perception of pain. The ice water feels colder, their nerves more sensitive to the sting.

In Case and Deaton's paper on the rising death rates of white Americans, the economists note that less educated whites have been hit hard by recent economic trends. “With widening income inequality, many of the baby-boom generation are the first to find, in midlife, that they will not be better off than were their parents,” they write. Job prospects are bleak; debt levels are high; median income has fallen by 4 percent for the middle class over the last 15 years.

The power of this new paper by Chou et al. is that it conveys the human impact of these facts. When we feel buffeted by forces beyond our control – by global shifts involving the rise of automation and the growth of Chinese manufacturing and the decline of the American middle class – we are more likely to experience aches we can’t escape. As the scientists point out, the end result is a downward spiral, as economic insecurity causes physical pain, which makes it harder for people to work, which leads to even more pain.

It shouldn’t be a surprise, then, that dangerous painkillers become such a tempting way out. Side effects include death.

Chou, E. Y., B. L. Parmar, and A. D. Galinsky. "Economic Insecurity Increases Physical Pain." Psychological Science (2016).


The Fastest Way To Learn

Practice makes perfect: One of those clichés that gets endlessly trotted out, told to children at the piano and point guards shooting from behind the arc. It applies to multiplication tables and stickshifts, sex and writing. And the line is true, even if it overpromises. Perfection might be impossible, but practice is the only way to get close.

Unfortunately, the cliché is limited by its imprecision. What kind of practice makes perfect? And what aspects of practice are most valuable? Is it the repetition? The time? The focus? Given the burden of practice – it’s rarely much fun – knowing what works is useful, since it comes with the promise of learning faster. To invoke another cliché: Less pain, more gain.

These practical questions are the subject of a new paper by Nicholas Wymbs, Amy Bastian and Pablo Celnik in Current Biology that investigates the best ways to practice a motor skill. In the experiment, the scientists had subjects play a simple computer game featuring an isometric pinch task. Basically, subjects had to squeeze a small device that translated the amount of force they applied into cursor movements. The goal of the game was to move the cursor to specific windows on the screen.

The scientists divided their subjects into three main groups. The first group practiced the isometric task and then, six hours later, repeated the exact same lesson. The second group practiced the task but then, when called back six hours later, completed a slightly different version of the training, as the scientists required varying amounts of force to move the cursor. (The variations were so minor that subjects didn’t even notice them.) The last group only performed a single practice session. There was no follow-up six hours later.

The next day, all three groups returned to the lab for another training session. Their performance on the task was also measured. How accurate were their squeezes? How effectively were they able to control the cursor?

At first glance, the extra variability might seem counterproductive. Motor learning, after all, is supposed to be about the rote memorization of muscles, as the brain learns how to execute the exact same plan again and again. (As the scientists write, “motor learning is commonly described as a reduction of variability.”) It doesn’t matter if we’re talking about free throws or a Bach fugue – it’s all about mindless consistency, reinforcing the skill until it’s a robotic script.

However, the scientists found that making practice less predictable came with big benefits. When subjects were given a second training session requiring variable amounts of force, they showed gains in performance nearly twice as large as those who practiced for the same amount of time but always did the same thing. (Not surprisingly, the group given less practice time performed significantly worse.) In other words, a little inconsistency in practice led people to perform much more effectively when they returned to the original task.

This same technique – forcing people to make small alterations during practice – can be easily extended to all sorts of other motor activities. Perhaps it means shooting a basketball of a slightly different size, or doctoring the weight of a baseball bat, or adjusting the tension of tennis racquet strings. According to the scientists, these seemingly insignificant changes should accelerate your education, wringing more learning from every minute of training.

Why does variability enhance practice? The scientists credit a phenomenon known as memory reconsolidation. Ever since the pioneering work of Karim Nader et al., it’s become clear that the act of recall is not a passive process. Rather, remembering changes the memory itself, as the original source file is revised every time it’s recalled. Such a mechanism has its curses – for one thing, it makes our memories highly unreliable, as they never stay the same – but it also ensures that all those synaptic files get updated in light of the latest events. The brain isn’t interested in useless precision; it wants the most useful version of the world, even if that utility comes at the expense of verisimilitude. It’s pragmatism all the way down.

While reconsolidation theory is already being used to help treat patients with PTSD and traumatic memories – the terrible past can always be rewritten – this current study extends the promise of reconsolidation to complex motor skills. In short, the scientists show that training people on a physical task, and then giving them subtle variations on that task after it has been recalled, can strengthen the original memory trace. Because subjects were forced to rapidly adjust their “motor control policy” to achieve the same goals, their brains seamlessly incorporated these new lessons into the old motor skill. The practice felt the same, but what they’d learned had changed: they were now that much closer to perfect.

Wymbs, Nicholas F., Amy J. Bastian, and Pablo A. Celnik. "Motor Skills Are Strengthened through Reconsolidation." Current Biology 26.3 (2016): 338-343.


The Psychology of 'Making A Murderer'

Roughly ten hours into Making a Murderer, a Netflix documentary about the murder trial of Steven Avery, his defense lawyer Dean Strang delivers the basic thesis of the show:

“The forces that caused that [the conviction of Brendan Dassey and Steven Avery]…I don’t think they are driven by malice, they’re just expressions of ordinary human failing. But the consequences are what are so sad and awful.”

Strang then goes on to elaborate on these “ordinary human failing[s]”:

“Most of what ails our criminal justice system lies in unwarranted certitude among police officers and prosecutors and defense lawyers and judges and jurors that they’re getting it right, that they simply are right. Just a tragic lack of humility of everyone who participates in our criminal justice system.”

Strang is making a psychological diagnosis. He is arguing that at the root of injustice is a cognitive error, an “unwarranted certitude” that our version is the truth, the whole truth, and nothing but the truth. In the Avery case, this certitude is most relevant when it comes to forensic evidence, as his lawyers (Dean Strang and Jerry Buting) argue that the police planted keys and blood to ensure a conviction. And then, after the evidence was discovered, Strang and Buting insist that forensic scientists working for the state distorted their analysis to fit the beliefs of the prosecution. Because they needed to find the victim’s DNA on a bullet in Avery’s garage – that was the best way to connect him to the crime – the scientists bent protocol and procedure to make a positive match.

Regardless of how you feel about the details of the Avery case, or even about the narrative techniques of Making A Murderer, the documentary raises important questions about the limitations of forensics. As such, it’s a useful antidote to all those omniscient detectives in the CSI pantheon, solving crimes with threads of hair and fragments of fingerprints. In real life, the evidence is usually imperfect and incomplete. In real life, our judgments are marred by emotions, mental short-cuts and the desire to be right. What we see is through a glass darkly.

One of the scientists who has done the most to illuminate the potential flaws of forensic science is Itiel Dror, a cognitive psychologist at University College London. Consider an experiment conducted by Dror that featured five fingerprint experts with more than ten years of experience working in the field. Dror asked these experts to examine a set of prints from Brandon Mayfield, an American attorney who’d been falsely accused of being involved with the Madrid terror attacks. The experts were instructed to assess the accuracy of the FBI’s final analysis, which concluded that Mayfield's prints were not a match. (The failures of forensic science in the Mayfield case led to a searing 2009 report from the National Academy of Sciences. I wrote about Mayfield and forensics here.)

Dror was playing a trick. In reality, each set of prints was from one of the experts’ past cases, and had been successfully matched to a suspect. Nevertheless, Dror found that the new context – telling the forensic analysts that the prints came from the exonerated Mayfield – strongly influenced their judgment, as four out of five now concluded that there was insufficient evidence to link the prints. While Dror was careful to note that his data did “not necessarily indicate basic flaws” in the science of fingerprint identification – those ridges of skin remain a valid way to link suspects to a crime scene – he did question the reliability of forensic analysis, especially when the evidence gathered from the field is ambiguous. Here's an example of an ambiguous set of prints, which might give you a sense of just how difficult forensic analysis can be:

Similar results have emerged from other experiments. When Dror gave forensic analysts more typical stories about fingerprints they’d already reviewed, such as informing them that a suspect had already confessed, the new stories were able to get two-thirds of analysts to reverse their previous conclusions at least once. In an email, Dror noted that the FBI has replicated this basic finding, showing that in roughly 10 percent of cases examiners reverse their findings even when given the exact same prints. They are consistently inconsistent.

If the flaws of forensics were limited to fingerprints and other forms of evidence requiring visual interpretation, such as bite marks and hair samples, that would still be extremely worrying. (Fingerprints have been a crucial police tool since the French detective Alphonse Bertillon used a bloody print left behind on a pane of glass to secure a murder conviction in 1902.) But Dror and colleagues have shown that these same basic failings can even afflict the gold standard of forensic evidence: DNA.

The experiment went like this: Dror and Greg Hampikian presented DNA evidence from a 2002 Georgia gang rape case to 17 professional DNA examiners working in an accredited government lab. Although the suspect in question (Kerry Robinson) had pleaded not guilty, the forensic analysts in the original case concluded that he could not be excluded based on the genetic data. This testimony, write the scientists, was “critical to the prosecution.”

But was it the best interpretation of the evidence? After all, the DNA gathered from the rape victim was part of a genetic mixture, containing samples from multiple individuals. In such instances, the genetics become increasingly complicated and unclear, making forensic analysts more likely to be swayed by their presumptions and prejudices. And because crime labs are typically part of a police department, these biases are almost always tilted in the direction of the prosecution. For instance, in the Avery case, the analyst who identified the victim’s DNA on a bullet fragment had been explicitly instructed by a detective to find evidence that the victim had been “in his house or his garage.”

To explore the impact of this potentially biasing information, Dror and Hampikian sent the DNA evidence from the Georgia gang rape case to additional examiners. The only difference was that these forensic scientists did the analysis blind – they weren’t told about the grisly crime, or the corroborating testimony, or the prior criminal history of the defendants. Of these 17 additional experts, only one concurred with the original conclusion. Twelve directly contradicted the finding presented during the trial – they said Robinson could be excluded – and four said the sample itself was insufficient.

These inconsistencies are not an indictment of DNA evidence. Genetic data remains, by far, the most reliable form of forensic proof. And yet, when the sample contains biological material from multiple individuals, or when it’s so degraded that it cannot be easily sequenced, or when low numbers of template molecules are amplified, the visual readout provided by the DNA processing software must be actively interpreted by the forensic scientists. They are no longer passive observers – they have become the instrument of analysis, forced to fill in the blanks and make sense of what they see. And that’s when things can go astray.

The errors of forensic analysts can have tragic consequences. In Convicting the Innocent, Brandon Garrett’s investigation of more than 150 wrongful convictions, he found that “in 61 percent of the trials where a forensic analyst testified for the prosecution, the analyst gave invalid testimony.” While these mistakes occurred most frequently with less reliable forms of forensic evidence, such as hair samples, 17 percent of cases involving DNA testing also featured misleading or incorrect evidence. “All of this invalid testimony had something in common,” Garrett writes. “All of it made the forensic evidence seem like stronger evidence of guilt than it really was.”

So what can be done? In a recent article in the Journal of Applied Research in Memory and Cognition, Saul Kassin, Itiel Dror and Jeff Kukucka propose several simple ways to improve the reliability of forensic evidence. While their suggestions might seem obvious, they would represent a radical overhaul of typical forensic procedure. Here are the psychologists’ top five recommendations:

1) Forensic examiners should work in a linear fashion, analyzing the evidence (and documenting their analysis) before they compare it to the evidence taken from the target/suspect. If their initial analysis is later revised, the revisions should be documented and justified.

2) Whenever possible, forensic analysts should be shielded from potentially biasing contextual information from the police and prosecution. Here are the psychologists: “We recommend, as much as possible, that forensic examiners be isolated from undue influences such as direct contact with the investigating officer, the victims and their families, and other irrelevant information—such as whether the suspect had confessed.”

3) When attempting to match evidence from the field to that taken from a target/suspect, forensic analysts should be given multiple samples to test, and not just a single sample taken from the suspect. This recommendation is analogous to the eyewitness lineup, in which eyewitnesses are asked to identify a suspect among a pool of six other individuals. Previous research looking at the use of an "evidence lineup" with hair samples found that introducing additional samples reduced the false positive error rate from 30.4 percent to 3.8 percent.  

4) When a second forensic examiner is asked to verify a judgment, the verification should be done blindly. The “verifier” should not be told about the initial conclusion or given the identity of the first examiner.

5) Forensic training should also include lessons in basic psychology relevant to forensic work. Examiners should be introduced to the principles of perception (the mind is not a camera), judgment and decision-making (we are vulnerable to a long list of biases and foibles) and social influence (it’s potent).

The good news is that change is occurring, albeit at a slow pace. Many major police forces – including the NYPD, SFPD, and FBI – have started introducing these psychological concepts to their forensic examiners. In addition, leading forensic organizations, such as the US National Commission on Forensic Science, have endorsed Dror’s work and recommendations.

But fixing the practice of forensics isn’t enough: Kassin, Dror and Kukucka also recommend changes to the way scientific evidence is treated in the courtroom. “We believe it is important that legal decision makers be educated with regard to the procedures by which forensic examiners reached their conclusions and the information that was available to them at that time,” they write. The psychologists also call for a reconsideration of the “harmless error doctrine,” which holds that trial errors can be tolerated provided they aren’t sufficient to reverse the guilty verdict. Kassin, Dror and Kukucka point out that this doctrine assumes that all evidence is analyzed independently. Unfortunately, such independence is often compromised, as a false confession or other erroneous “facts” can easily influence the forensic analysis. (This is a possible issue in the Avery case, as Brendan Dassey’s confession – which contains clearly false elements and was elicited using very troubling police techniques – might have tainted conclusions about the other evidence. I've written about the science of false confessions here.) And so error begets error; our beliefs become a kind of blindness.

It’s important to stress that, in most instances, these failures of forensics don't require intentionality. When Strang observes that injustice is not necessarily driven by malice, he's pointing out all the sly and subtle ways that the mind can trick itself, slouching towards deceit while convinced it's pursuing the truth. These failures are part of life, a basic feature of human nature, but when they occur in the courtroom the stakes are too great to ignore. One man’s slip can take away another man’s freedom.

Dror, Itiel E., David Charlton, and Ailsa E. Péron. "Contextual information renders experts vulnerable to making erroneous identifications." Forensic Science International 156.1 (2006): 74-78.

Ulery, Bradford T., et al. "Repeatability and reproducibility of decisions by latent fingerprint examiners." PloS one 7.3 (2012): e32800.

Dror, Itiel E., and Greg Hampikian. "Subjectivity and bias in forensic DNA mixture interpretation." Science & Justice 51.4 (2011): 204-208.

Kassin, Saul M., Itiel E. Dror, and Jeff Kukucka. "The forensic confirmation bias: Problems, perspectives, and proposed solutions." Journal of Applied Research in Memory and Cognition 2.1 (2013): 42-52.

The Danger of Safety Equipment

My car is a safety braggart. When I glance at the dashboard, there’s a cluster of glowing orange lights, reminding me of all the smart technology designed to save me from my stupid mistakes. Airbags, check. Anti-lock brakes, check. Traction control, check. Collision Alert system, check.

It’s a comforting sight. It might also be a dangerous one. In fact, if you follow the science, all of these safety reminders could turn me into a more dangerous driver. This is known as the risk compensation effect, and it refers to the fact that people tend to take increased risks when using protective equipment. It’s been found among bicycle riders (people go faster when wearing helmets), taxi drivers and children running an obstacle course (safety gear leads kids to run more “recklessly”). It’s why football players probably hit harder when playing with helmets and the fatality rate for skydivers has remained constant, despite significant improvements in safety equipment. (When given better parachute technology, people tend to open their parachutes closer to the ground, leading to a sharp increase in landing deaths.) It’s why improved treatments for HIV can lead to riskier sexual behaviors, why childproof aspirin caps don’t reduce poisoning rates (parents are more likely to leave the caps off bottles) and why countries with mandatory seat belt laws shift the risk from drivers to pedestrians and cyclists. As John Adams, professor of geography at University College London, notes, “Protecting car occupants from the consequences of bad driving encourages bad driving.”

However, despite this surfeit of field data, the precise psychological mechanisms of risk compensation remain unclear. One of the lingering mysteries involves the narrowness of the effect. For instance, when people drive a car loaded with safety equipment, it’s clear that they often drive faster. But are they also more likely to not follow parking regulations? Similarly, a football player wearing an advanced helmet is probably more likely to deliver a dangerous hit with their head. But are they also more willing to commit a penalty? Safety equipment makes us take risks, but what kind of risks?

To explore this mystery, the psychologists Tim Gamble and Ian Walker at the University of Bath came up with a very clever experimental design. They recruited 80 subjects to play a computer game in which they had to inflate an animated balloon until it burst. The bigger the balloon, the bigger the payout, but every additional pump came with a risk: the balloon could pop, and then the player would get nothing.
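To make the trade-off in the balloon game concrete, here is a minimal sketch of the arithmetic. It is not the software Gamble and Walker used. The burst rule (a hidden burst point drawn uniformly from 1 to `max_pumps`), the value of 64 and the one-unit reward per pump are all assumptions chosen for illustration.

```python
from fractions import Fraction


def expected_payout(pumps, max_pumps=64, reward_per_pump=1):
    """Expected reward for a player who plans to stop after `pumps` pumps.

    Assumption (not from the study): the balloon bursts on a hidden pump
    number drawn uniformly from 1..max_pumps, and a burst pays nothing.
    """
    if not 0 <= pumps <= max_pumps:
        raise ValueError("pumps must be between 0 and max_pumps")
    # The player keeps the money only if the burst point lies beyond the plan.
    chance_of_surviving = Fraction(max_pumps - pumps, max_pumps)
    return pumps * reward_per_pump * chance_of_surviving


# The plan with the highest expected payout under these assumptions.
best_plan = max(range(0, 65), key=expected_payout)

for plan in (8, 16, 32, 48, 56):
    print(plan, float(expected_payout(plan)))
```

Under these assumed rules, timid plans and reckless plans both leave money on the table, which is why the number of pumps works as a rough score of how much risk a player will tolerate.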

Here’s the twist: Before the subjects played the game, they were given one of two pieces of headgear to wear. Some were given a baseball hat, while others were given a bicycle helmet. They were told that the gear was a necessary part of the study, since the scientists had to track their eye movements.

In reality, the headgear was a test of risk compensation. Gamble and Walker wanted to know how wearing a bike helmet, as opposed to a baseball hat, influenced risk-taking behavior on a totally unrelated task. (Obviously, a bike helmet won’t protect you from an exploding balloon on a computer screen.) Sure enough, those subjects randomly assigned to wear the helmet inflated the balloon to a much greater extent, receiving risk-taking scores that were roughly 30 percent higher. They also were more likely to admit to various forms of “sensation-seeking,” such as saying they “wish they could be a mountain climber,” or that they “enjoy the company of real ‘swingers.’” In short, the mere act of wearing a helmet that provided no actual protection still led people to act as if they were protected from all sorts of risks.

This lab research has practical implications. If using safety gear induces a general increase in risky behavior - and not just behavior directly linked to the equipment - then it might also lead to unanticipated dangers for which we are ill prepared. “This is not to suggest that the safety equipment will necessarily have its specific utility nullified,” write Gamble and Walker, “but rather that there could be changes in behavior wider than previously envisaged.” If anti-lock brakes lead us to drive faster in the rain, that’s too bad, but at least it’s a danger the technology is designed to mitigate. However, if the presence of the safety equipment also makes us more likely to text on the phone, then it might be responsible for a net reduction in overall safety, at least in some cases. Anti-lock brakes are no match for a distracted driver.

This doesn’t mean we're better off without air bags or behind the wheel of a Ford Pinto. But perhaps we should think of ways to lessen the salience of our safety gear. (At the very least, we should get rid of all those indicators on the dashboard.) Given the risk compensation effect, the safest car just might be the one that never tells you how safe it really is.

Gamble, Tim and Walker, Ian. “Wearing a Bicycle Helmet Can Increase Risk Taking and Sensation Seeking in Adults,” Psychological Science, 2016.

Do Genes Predict Intelligence? In America, It Depends on Your Class

There’s a longstanding academic debate about the genetics of intelligence. On the one side is the “hereditarian” camp, which cites a vast amount of research showing a strong link between genes and intelligence. This group can point to persuasive twin studies showing that, by the time children are 17 years old, their genetics explain approximately 66 percent of the variation in intelligence. To the extent we can measure smarts, what we measure is a factor largely dictated by the double helices in our cells.

On the other side is the “sociological” camp. These scientists tend to view differences in intelligence as primarily rooted in environmental factors, whether it’s the number of books in the home or the quality of the classroom. They cite research showing that many children suffering from severe IQ deficits can recover when placed in more enriching environments. Their genes haven’t changed, but their cognitive scores have soared.

These seem like contradictory positions, irreconcilable descriptions of the mind. However, when science provides evidence for two opposing theories, it’s usually a sign that something more subtle is going on. And this leads us to the Scarr-Rowe hypothesis, an idea developed by Sandra Scarr in the early 1970s and replicated by David Rowe in 1999. It’s a simple conjecture, at least in outline: according to the Scarr-Rowe hypothesis, the influence of genetics on intelligence depends on the socioeconomic status of the child. In particular, the genetic influence is suppressed in conditions of privation – say, a stressed home without a lot of books – and enhanced in conditions of enrichment. These differences have a tragic cause: when children grow up in poor environments, they are unable to reach their full genetic potential. The lack of nurture holds back their nature.

You can see this relationship in the chart below. As socioeconomic status increases on the x-axis, the amount of variance in cognitive-test performance explained by genes nearly triples. Meanwhile, nurture generates diminishing returns. Although upper class parents tend to fret over the details of their parenting — Is it better to play the piano or the violin? Should I be a Tiger Mom or imitate those chill Parisian parents?— these details of enrichment become increasingly insignificant. Their children are ultimately held back by their genetics.

It’s a compelling theory, with significant empirical support. However, a number of studies have failed to replicate the Scarr-Rowe hypothesis, including a 2012 paper that looked at 8716 pairs of twins in the United Kingdom. This inconsistency has two possible explanations. The first is that the Scarr-Rowe hypothesis is false, a by-product of underpowered studies and publication bias. The second possibility, however, is that different societies might vary in how socioeconomic status interacts with genetics. In particular, places with a more generous social welfare system – and an educational system less stratified by income - might show less support for the Scarr-Rowe hypothesis, since their poor children are less likely to be cognitively limited by their environment.

These cross-country differences are the subject of a new meta-analysis in Psychological Science by Elliot Tucker-Drob and Timothy Bates. In total, the scientists looked at 14 studies drawn from nearly 25,000 pairs of twins and siblings, split rather evenly between the United States and other developed countries in Western Europe and Australia. The goal of their study was threefold: 1) measure the power of the Scarr-Rowe hypothesis in the United States; 2) measure the power of the Scarr-Rowe hypothesis outside of the United States, in countries with stronger social-welfare systems; and 3) compare these measurements.

The results should depress every American: we are the great bastion of socioeconomic inequality, the only rich country where many poor children grow up in conditions so stifling they fail to reach their full genetic potential. The economic numbers echo this inequality, showing how these differences in opportunity persist over time. Although America likes to celebrate its upward mobility, the income numbers suggest that such mobility is mostly a myth, with only 4 percent of people born into the bottom quintile moving into the top quintile as adults. As Michael Harrington wrote in 1962, “The real explanation of why the poor are where they are is that they made the mistake of being born to the wrong parents.” 

Life isn’t fair. Some children will be born into poor households. Some children will inherit genes that make it harder for them to succeed. Nevertheless, we have a duty to ensure that every child has a chance to learn what his or her brain is capable of. We should be ashamed that, in 21st century America, the effects of inequality are so pervasive that people on different ends of the socioeconomic spectrum have minds shaped by fundamentally different forces. Rich kids are shaped by the genes they have. Poor kids are shaped by the support they lack.

Tucker-Drob, Elliot and Bates, Timothy. “Large cross-national differences in gene x socioeconomic status interaction on intelligence,” Psychological Science. 2015.        

The Louis-Schmeling Paradox


Why do we go to sporting events?

The reasons to stay home are obvious. Here’s my list, in mostly random order: a beer costs $12, the view is better from my couch, die-hard fans can be scary, the price of parking, post-game traffic.

That’s a pretty persuasive list. And yet, as I stare into my high-resolution television, I still find myself hankering for the live event, jealous of all those people eating bad nachos in the bleachers, or struggling to see the basketball from the last row. It’s an irrational desire - I realize I should stay home, save money, avoid the hassle – but I still want to be there, at the game, complaining about the cost of beer.

In a classic 1964 paper, “The Peculiar Economics of Professional Sports,” Walter Neale came up with an elegant explanation for the allure of live sporting events. He began his discussion with what he called the Louis-Schmeling Paradox, after the epic duo of fights between heavyweights Joe Louis and Max Schmeling. (Louis lost the first fight, but won the second.) According to Neale, the boxers perfectly illustrate the “peculiar economics” of sports. Although normal business firms seek out monopolies – they want to minimize competition and maximize profits – such a situation would be disastrous for a heavyweight fighter. If Joe Louis had a boxing monopoly, then he’d have “no one to fight and therefore no income,” for “doubt about the competition is what arouses interest.” Louis needed a Schmeling, the Lakers needed the Celtics and the Patriots benefit from a healthy Peyton Manning. It’s the uncertainty that’s entertaining.

Professional sports leagues closely follow Neale’s advice. They construct elaborate structures to smooth out the differences between teams, instituting salary caps, revenue sharing and lottery-style drafts. The goal is to make every game a roughly equal match, just like a Louis-Schmeling fight. Because sports monopolies are bad for business, Neale writes that the secret prayer of every team owner should be: “Oh Lord, make us good, but not that good.”

It’s an alluring theory. It’s also just that: a theory, devoid of proof. Apart from a few scattered anecdotes – when the San Diego Chargers ran roughshod over the AFL in 1961 “fans stayed away”– Neale’s paper is all conjecture.

Enter a new study by the economists Brad Humphreys and Li Zhou, which puts the Louis-Schmeling paradox to the empirical test. Humphreys and Zhou decided to delve into the actual numbers, looking at the relationship between league competition, team performance and game attendance. Their data was drawn from the home games of every Major League Baseball team between 2006 and 2010, as they sought to identify the variables that actually made people want to buy expensive tickets and overpay for crappy food.

What did they find? In “Peculiar Economics,” Neale made a clear prediction: “The closer the standings, and within any range of standings, the more frequently the standings change, the larger will be the gate receipts.” (Neale called this the “League Standing Effect,” arguing that the flux of brute competition was a “kind of advertising.”) However, Humphreys and Zhou reject this hypothesis, as they find that changes in the standings, and the overall closeness of team win percentages, have absolutely no impact on game attendance. Uncertainty is overrated.

But the study isn’t all null results. After looking at more than 12,000 baseball games, Humphreys and Zhou found that two variables were particularly important in determining attendance. The first variable was win preference, which isn’t exactly shocking: fans are more likely to attend games in which the home team is more likely to win. If we’re going to invest time and money in a live performance, then we want the investment to pay off; we don’t want to be stuck in post-game traffic after a defeat, thinking ruefully of all the better ways we could have spent our cash.

The second variable driving ticket sales is loss aversion, an emotional quirk of the mind in which losses hurt more than gains feel good. According to Humphreys and Zhou, loss aversion compounds the pain of a team’s defeat, especially when we expected a win. This suggests that the impact of an upset is asymmetric, with surprising losses packing a far greater emotional punch than surprising wins. The end result is that the pursuit of competitive balance – a league in which upsets are common – is ultimately a losing proposition for teams trying to sell more tickets. Instead of seeking out parity, greedy owners should focus on avoiding home losses, as that tends to discourage attendance at games.*
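One way to see why loss aversion undercuts the case for parity is a toy model of a fan’s expected satisfaction. This is a simplified sketch and not the specification Humphreys and Zhou estimate. The function name, the weights of 1.0 and 2.0, and the idea of scoring surprise as the gap between the outcome and the expected win probability are all assumptions made for illustration.

```python
def expected_utility(p_home_win, gain_weight=1.0, loss_weight=2.0):
    """Toy expected satisfaction from attending a home game.

    Assumptions (illustrative only): a win is worth 1 plus a bonus for how
    surprising it was; a loss is worth 0 minus a penalty for how surprising
    it was. Loss aversion means loss_weight is larger than gain_weight.
    """
    if not 0.0 <= p_home_win <= 1.0:
        raise ValueError("p_home_win must be a probability")
    # A win is more surprising (and sweeter) the less likely it looked.
    win_utility = 1.0 + gain_weight * (1.0 - p_home_win)
    # A loss is more surprising (and more painful) the likelier a win looked.
    loss_utility = loss_weight * (0.0 - p_home_win)
    return p_home_win * win_utility + (1.0 - p_home_win) * loss_utility


def uncertainty_penalty(p_home_win):
    """How far satisfaction falls below the plain chance of a win."""
    return p_home_win - expected_utility(p_home_win)


for p in (0.2, 0.5, 0.8):
    print(p, round(expected_utility(p), 2), round(uncertainty_penalty(p), 2))
```

In this sketch the penalty is largest for a coin-flip game and shrinks as the outcome becomes more predictable. If the weights were reversed, so that surprising wins counted for more than surprising losses, the toss-up game would get a bonus instead, which is closer to what Neale assumed.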

And so a familiar tension is revealed in the world of sports. On the one hand, there are the collective benefits of equality, which is why sports leagues aggressively redistribute wealth and draft picks. (The NFL is a bastion of socialism.) However, the individual team owners have a much narrower set of interests – they just want to win, especially at home, because that's what sells tickets.

The fans are stuck somewhere in between. While Neale might have been mistaken about the short-term motives of attendance – we want Louis to knock the shit out of Schmeling, not witness a close boxing match – he was almost certainly correct about the long-term impact of a league with a severe competitive imbalance. (It’s exciting when the Warriors are 23-0; it’s a travesty if they go undefeated for an entire season.) Sports fans might not be drawn to uncertainty, but they sure as hell need hope. Just ask those poor folks packed into Wrigley Field.

*Baseball owners should also invest in pitching: teams that give up more runs at home also exhibit lower attendance.

Neale, Walter C. "The peculiar economics of professional sports: A contribution to the theory of the firm in sporting competition and in market competition." The Quarterly Journal of Economics (1964): 1-14.

Humphreys, Brad, and Li Zhou. "The Louis-Schmelling Paradox and the League Standing Effect Reconsidered." The Journal of Sports Economics. (2015) 16: 835-852

When Should Children Start Kindergarten?

One of the fundamental challenges of parenting is that the practice is also the performance; childcare is all about learning on the job. The baby is born, a lump of need, and we’re expected to keep her warm, nourished and free of diaper rash. (Happy, too.) A few random instincts kick in, but mostly we just muddle our way through, stumbling from nap to nap, meal to meal. Or at least that’s how it feels to me.

Given the steep learning curve of parenting, it’s not surprising that many of us yearn for the reassurance of science. I want my sleep training to have an empirical basis; I’m a sucker for fatty acids and probiotics and the latest overhyped ingredient; my bookshelf groans with tomes on the emotionally intelligent toddler.

Occasionally, I find a little clarity in the research. The science has taught me about the power of emotional control and the importance of secure attachments. But mostly I find that the studies complicate and unsettle, leaving me questioning choices that, only a generation or two ago, were barely even a consideration. I’m searching for answers. I end up with anxiety.

Consider the kindergartner. Once upon a time, a child started kindergarten whenever they were old enough to make the age cutoff, which was usually after they turned five. (Different states had slightly different requirements.) However, over the last decade roughly 20 percent of children have been held back from formal schooling until the age of six, a process known as “redshirting.” The numbers are even higher for children in “socioeconomically advantaged families.”

What’s behind the redshirting trend? There are many causes, but one of the main factors has been research suggesting that delaying a child’s entry into a competitive process offers lasting benefits. Most famously, researchers have demonstrated that Canadian hockey players and European soccer stars are far more likely to have birthdays at the beginning of the year. The explanation is straightforward: because these redshirted children are slightly older than their peers, they get more playing time and better coaching. Over time, this creates a feedback loop of success.

However, the data has been much more muddled when it comes to the classroom. Athletes might benefit from a later start date, but the case for kindergartners isn’t nearly as clear. One study of Norwegian students concluded that the academic benefits were a statistical illusion: older children score slightly higher on various tests because they’re older, not because they entered kindergarten at a later date. Other studies have found associations between delayed kindergarten and educational attainment – starting school later makes us stay in school longer – but no correlation with lifetime earnings. To make matters even more complicated, starting school late seems to have adverse consequences for boys from poorer households, who are more likely to drop out of high school once they reach the legal age of school exit.

Are you confused? Me too, and I’ve got a kid on the cusp of kindergarten. To help settle this debate, Thomas Dee of Stanford University and Hans Henrik Sievertsen at the Danish National Centre for Social Research decided to study Danish schoolchildren. This is for two reasons: 1) the country had high-quality longitudinal data on the mental health of its students and 2) children in Denmark are supposed to begin formal schooling in the calendar year in which they turn six. This rule allowed Dee and Sievertsen to compare children born at the start of January with children born just a few days before in December. Although these kids are essentially the same age, they ended up in different grades, making them an ideal population to study the impact of a delayed start to school.
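The logic of the birthday cutoff is easy to check with a few lines of code. This is only a sketch of the enrollment rule described above and not the researchers’ analysis. The August 1 start date and the two example birthdays are assumptions for illustration.

```python
from datetime import date


def school_start_year(birth):
    """Calendar year in which the child turns six, per the rule in the text."""
    return birth.year + 6


def age_at_school_start(birth, start_month=8, start_day=1):
    """Age in years on the first day of school.

    Assumption (not from the study): the school year begins on August 1.
    """
    first_day = date(school_start_year(birth), start_month, start_day)
    return (first_day - birth).days / 365.25


# Two hypothetical children born one day apart, on either side of the cutoff.
december_child = date(2009, 12, 31)
january_child = date(2010, 1, 1)

for birthday in (december_child, january_child):
    print(birthday, school_start_year(birthday),
          round(age_at_school_start(birthday), 2))
```

A single day separates the two birthdays, yet the January child starts school a full year later and is roughly a year older on the first day of class. That gap, among otherwise similar children, is what the comparison relies on.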

After comparing these two groups of students, Dee and Sievertsen found a surprisingly large difference in their mental health. According to the scientists, children who were older when they started kindergarten – they fell on the January side of the calendar – displayed significant improvements in mental health, both at the age of 7 and 11. In particular, the late starters showed much lower levels of inattention and hyperactivity, with a one-year delay leading to a 73 percent decrease in reported problems.

As the scientists note, these results jibe with a large body of research in developmental psychology suggesting that children benefit from an extended period of play and unstructured learning. When a child is busy pretending – when they turn a banana into a phone or a rock into a spaceship – they are practicing crucial mental skills. They are learning how to lose themselves in an activity and sustain their own interest. They are discovering the power of emotion and the tricks of emotional control. “To become mature,” Nietzsche once said, “is to recover that sense of seriousness which one had as a child at play.” But it takes time to develop that seriousness; the imagination cannot be rushed.

Does this mean I should hold back my daughter? Is it always better to start kindergarten at a later date? Probably not. Dee and Sievertsen are careful to note that the benefits of a later start to school were distributed unevenly among the Danish children. As a result, the scientists emphasize the importance of taking the individual child into account when making decisions about when to start school. Where is he on the developmental spectrum? Has she had a chance to develop her play skills? What is the alternative to kindergarten? As Dee noted in The Guardian, “the benefits of delays are unlikely to exist for children in preschools that lack the resources to provide well-trained staff and a developmentally rich environment.”

And so we’re left with the usual uncertainty. The data is compelling in aggregate – more years of play leads to better attention skills – but every child is an n of 1, a potential exception to the rule. (Parents are also forced to juggle more mundane concerns, like money; not every family can afford the luxury of redshirting.) The public policy implications are equally complicated. Starting kindergarten at the age of six might reduce attention problems, but only if we can replace the academic year with high-quality alternatives. (And that’s really hard to do.)

The takeaway, then, is that there really isn’t one. We keep looking to science for easy answers to the dilemmas of parenting, but mostly what we learn is that such answers don’t exist. Childcare is a humbling art. Practice, performance, repeat.

Dee, Thomas and Hans Henrik Sievertsen. “The Gift of Time? School Starting Age and Mental Health,” NBER Working Paper No. 21610, October 2015.

The Root of Wisdom: Why Old People Learn Better

In Plato’s Apology, Socrates defines the essence of wisdom. He makes his case by comparison, arguing that wisdom is ultimately an awareness of ignorance. The wise man is not the one who always gets it right. He’s the one who notices when he gets it wrong: 

I am wiser than this man, for neither of us appears to know anything great and good; but he fancies he knows something, although he knows nothing; whereas I, as I do not know anything, so I do not fancy I do. In this trifling particular, then, I appear to be wiser than he, because I do not fancy I know what I do not know. 

I was thinking of Socrates while reading a new paper in Psychological Science by Janet Metcalfe, Lindsey Casal-Roscum, Arielle Radin and David Friedman. The paper addresses a deeply practical question, which is how the mind changes as it gets older. It’s easy to complain about the lapses of age: the lost keys, the vanished names, the forgotten numbers. But perhaps these shortcomings come with a consolation. 

The study focused on how well people learn from their factual errors. The scientists gave 44 young adults (mean age = 24.2 years) and 45 older adults (mean age = 73.7 years) more than 400 general information questions. Subjects were asked, for instance, to name the ancient city with the hanging gardens, or to remember the name of the woman who founded the American Red Cross. After answering each question, they were asked to rate, on a 7-point scale, their “confidence in the correctness of their response.” Then they were shown the correct answer. (Babylon, Clara Barton.) This phase of the experiment was done while subjects were fitted with an EEG cap, a device able to measure the waves of electrical activity generated by the brain.

The second part of the experiment consisted of a short retest. The subjects were asked, once again, to answer 20 of their high-confidence errors – questions they thought they got right but actually got wrong – and 20 low-confidence errors, or those questions they always suspected they didn’t know.

The first thing to note is that older adults did a lot better on the test overall. While the young group only got 26 percent of questions correct, the aged subjects got 41 percent. This is to be expected: the mind accumulates facts over time, slowly filling up with stray bits of knowledge.

What’s more surprising, however, is how the older adults performed on the retest, after they were given the answers to the questions they got wrong. Although current theory assumes that older adults have a harder time learning new material – their semantic memory has become rigid, or “crystallized” – the scientists found that the older subjects performed much better than younger ones during the second round of questioning. In short, they were far more likely to correct their errors, especially when it came to low-confidence questions.

Why did older adults score so much higher on the retest? The answer is straightforward: they paid more attention to what they got wrong. They were more interested in their ignorance, more likely to notice what they didn’t know. While younger subjects were most focused on their high-confidence errors – those mistakes that catch us by surprise – older subjects were more likely to consider every error, which allowed them to remember more of the corrections. Socrates would be proud.

You can see these age-related differences in the EEG data. When older subjects were shown the correct answer in red, they exhibited a much larger P3a amplitude, a signature of brain activity associated with the engagement of attention and the encoding of memory.

Towards the end of their paper, the scientists try to make sense of these results in light of research documenting the shortcomings of the older brain. For instance, previous studies have shown that older adults have a harder time learning new (and incorrect) answers to math problems, remembering arbitrary word pairs, and learning “deviant variations” to well-known fairy tales. Although these results are often used as evidence of our inevitable mental decline – the hippocampus falls apart, etc. – Metcalfe and colleagues speculate that something else is going on, and that older adults are simply “unwilling or unable to recruit their efforts to learn irrelevant mumbo jumbo.” In short, they have less patience for silly lab tasks. However, when senior citizens are given factually correct information to remember – when they are asked to learn the truth – they can rally their attention and memory. The old dog can still learn new tricks. The tricks just have to be worth learning.

Metcalfe, Janet, Lindsey Casal-Roscum, Arielle Radin, and David Friedman. "On Teaching Old Dogs New Tricks." Psychological Science (2015)