MIT researchers surprised mice to learn how humans judge risky behavior

The researchers got a surprise of their own.
Grant Currin
Neurons in the striatum (red) encode information about the potential outcomes of a particular action. Credit: MIT

Decisions are extraordinarily important, whether you're a human or a mouse.

Surprisingly, neuroscientists are still trying to figure out how our brains make them. Now, they're a little bit closer to understanding that complex process.

In a study published Tuesday in the academic journal Nature Communications, researchers used neural imaging to observe the neurons of mice as they were presented with a choice that sometimes resulted in a small reward and sometimes led to a small punishment.

The researchers expected to see those neurons light up when the mice experienced either something good or something bad, but that's not what happened. Instead, these neurons were activated by the unexpected.

“A lot of this brain activity deals with surprising outcomes, because if an outcome is expected, there’s really nothing to be learned,” says neuroscientist Bernard Bloem, a co-author on the paper. “What we see is that there’s a strong encoding of both unexpected rewards and unexpected negative outcomes,” he says.

The researchers started by confusing some thirsty mice

Our brains — and the brains of nearly every other species of animal — are constantly making decisions. Do I eat this berry? Which path do I take? Is the risk worth the potential reward? We typically make these decisions based on what we think is going to happen as a result. Those predictions, in turn, are usually based on past experiences. Plenty of research has shown that a lot of this processing takes place deep in the brain, in a region called the striatum. But knowing where something happens is a far cry from understanding how it works.

The researchers behind this new study wanted to figure out how neurons in the striatum learn from past experiences. They did it by presenting mice with a Lego wheel they could turn clockwise or counterclockwise. The experimental setup also included a couple of tubes. One of them gave the mice some water (they had been given only a milliliter of water each day for roughly a week leading up to the experiment) and the other delivered a small, unpleasant puff of air.

A short beep told the mice when the experiment began. They had a few seconds to turn the wheel in one direction or the other. There wasn't a right answer. Sometimes, turning the wheel to the left would have a higher chance (80 percent) of reward and a lower chance of punishment. At other times, the opposite was true. Each mouse went through this procedure hundreds of times, with the probabilities changing periodically.
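For readers who think in code, the task described above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the researchers' actual protocol: the 80 percent figure comes from the description above, while the complementary 20 percent, the fixed block length, the one-outcome-per-trial rule, and the random "mouse" are all hypothetical choices made to keep the sketch short.

```python
import random

def run_session(n_trials=300, block_len=50, p_high=0.8, seed=0):
    """Simulate a two-choice wheel task with periodically swapped odds.

    Assumptions (not from the study): a fixed block length, exactly one
    outcome per trial, and a simulated mouse that chooses at random.
    """
    rng = random.Random(seed)
    good_side = "left"  # the side that currently has the higher chance of water
    trials = []
    for t in range(n_trials):
        # Swap which side is favorable at the start of each new block.
        if t > 0 and t % block_len == 0:
            good_side = "right" if good_side == "left" else "left"
        choice = rng.choice(["left", "right"])  # placeholder policy
        p_reward = p_high if choice == good_side else 1 - p_high
        outcome = "water" if rng.random() < p_reward else "air_puff"
        trials.append((t, good_side, choice, outcome))
    return trials

session = run_session()
```

Each entry in `session` records the trial number, the currently favorable side, the simulated choice, and whether the result was water or an air puff. A real mouse would not choose at random; it would gradually favor whichever side has been paying off, which is what lets expectations form.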

The point of the experiment wasn't really to teach the mice anything. The researchers wanted the mice to develop expectations so they could watch how the rodent brains reacted to the punishments and rewards. 

Unexpected outcomes caused the neurons to fire

The researchers thought the neurons they were looking at would show certain patterns when the mice got water and other patterns when they had to endure a puff of air. 

But that's not what they saw. The neurons actually fired when the rules of the wheel-turning game changed, like when turning it to the left suddenly resulted in punishment rather than reward. That's why the researchers termed these patterns of neuron activity "error signals." They seem to be the brain's way of noting that its model of the world — a set of expectations for the relationship between an action and its outcome — is wrong and needs to be revised. 
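One common textbook way to put a number on this kind of "error signal" is a simple delta rule, in which the error is the gap between the outcome and the current expectation. The sketch below uses that generic rule purely as an illustration; it is not the analysis used in the study, and the learning rate, the +1/-1 outcome coding, and the function name are all assumptions.

```python
def track_errors(outcomes, alpha=0.2, v0=0.0):
    """Return the prediction error on each trial under a simple delta rule.

    outcomes: numbers such as +1.0 (water) or -1.0 (air puff); this coding
              is an assumption made for illustration.
    alpha:    hypothetical learning rate controlling how fast expectations update.
    """
    v = v0  # current expectation for the chosen action
    errors = []
    for r in outcomes:
        delta = r - v       # prediction error: what happened minus what was expected
        errors.append(delta)
        v += alpha * delta  # nudge the expectation toward the outcome
    return errors

# Twenty rewarded trials, then the rule flips to punishment.
outcomes = [1.0] * 20 + [-1.0] * 20
errors = track_errors(outcomes)
```

In this toy run the error shrinks toward zero while the outcome stays predictable, jumps to nearly -2 on the first trial after the rule flips, and then fades again as the expectation catches up. That is the same qualitative pattern described here: big responses when the world stops matching the model, little response when it does.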

The researchers think the activity in these neurons helps register the fact that the prediction was incorrect, sparking learning. That activity also seems to inform other areas of the brain about what's going on, which probably helps the mouse make better decisions in the future.

“The decision whether to do an action or not, which essentially requires integrating multiple outcomes, probably happens somewhere downstream in the brain,” Bloem says.

The researchers say their work could inform behavioral therapies for people who live with the many neurological and psychological conditions that impact decision-making, like anxiety, depression, OCD, and PTSD. 


Abstract: Learning about positive and negative outcomes of actions is crucial for survival and underpinned by conserved circuits including the striatum. How associations between actions and outcomes are formed is not fully understood, particularly when the outcomes have mixed positive and negative features. We developed a novel foraging (‘bandit’) task requiring mice to maximize rewards while minimizing punishments. By 2-photon Ca++ imaging, we monitored activity of visually identified anterodorsal striatal striosomal and matrix neurons. We found that action-outcome associations for reward and punishment were encoded in parallel in partially overlapping populations. Single neurons could, for one action, encode outcomes of opposing valence. Striosome compartments consistently exhibited stronger representations of reinforcement outcomes than matrix, especially for high reward or punishment prediction errors. These findings demonstrate multiplexing of action-outcome contingencies by single identified striatal neurons and suggest that striosomal neurons are particularly important in action-outcome learning.