Are students learning? At a time when teachers’ unions are being bashed, budgets are shrinking, and the big funders are demanding that every teacher be measured by their students’ test scores, it seems only fair to ask: What do the numbers tell us about student learning?

Many of the projects we are hired to complete at Eskolta come from requests to help analyze and interpret complex data on student learning. Often, the starting assumption is that the “data” that matter are quite specific: test scores in English and math. So in this brief space, let’s focus on those.

In a recent project, Eskolta reviewed “value-added” changes in literacy assessment scores for a group of 58 students over the course of a year. This kind of analysis is similar to what is being demanded to assess teacher effectiveness at the city, state, and federal levels: comparing test scores on two different dates to see change over time. When we looked at the change from September 2009 to June 2010, some students made dramatic progress, growing several grade levels; others made almost none; still others went backward. When we extended the analysis to September 2010, the broad numbers again showed a mix of results. But the mix was different from what it had been just a few months earlier. Some of the students who had made notable progress from September to June later backslid by months or even years. Others who had shown almost no progress had accelerated dramatically over the summer.

What was going on? Had the school neglected its high achievers over the summer while focusing on the strugglers? Had students gotten sporadic help over the summer, leading to uneven jumps in achievement? Actually, the answer is likely simpler. By way of analogy, suppose we had done the same thing with the weather. We know the weather generally gets warmer from September to June. But if you happened to take the temperature at Times Square on September 24, 2009, and then again on June 9, 2010, you’d have seen that it dropped from 75 degrees to 62. The weather as a whole was warmer in June than in September, but the temperature at any one moment on any one day is far harder to predict.

Even if students are generally improving from September to June, if we take their temperature in one test in September and one in June, it is hard to know what the results will tell us. This is partly because one test, like one weather report, is subject more to the volatility of the moment than the trends of the year. The relentless focus on test-score data tends to overlook the fact that this data comes in the form of highly limited snapshots.
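To make the snapshot problem concrete, here is a small, purely hypothetical simulation; the numbers are invented for illustration and are not drawn from our project. It imagines 58 students who all genuinely grow by a full grade level over the year, but whose score on any single test day bounces around with the “weather of the moment.”

```python
import random

random.seed(0)

# Hypothetical illustration (invented numbers, not Eskolta's actual data):
# 58 students whose underlying reading level grows steadily by one grade
# level per year, but whose score on any single test day swings with the
# "weather of the moment": sleep, mood, the particular passages on that test.
N_STUDENTS = 58
TRUE_ANNUAL_GROWTH = 1.0      # true growth, in grade levels per school year
TEST_DAY_NOISE = 0.75         # one-day volatility, in grade levels

def observed_score(true_level):
    """A single test result is the true level plus test-day noise."""
    return true_level + random.gauss(0, TEST_DAY_NOISE)

went_backward = 0
for _ in range(N_STUDENTS):
    start_level = random.uniform(3.0, 8.0)           # true level in September
    end_level = start_level + TRUE_ANNUAL_GROWTH     # true level in June

    september_score = observed_score(start_level)
    june_score = observed_score(end_level)

    if june_score < september_score:
        went_backward += 1

print(f"Every student truly grew {TRUE_ANNUAL_GROWTH} grade level this year,")
print(f"yet {went_backward} of {N_STUDENTS} appear to have gone backward "
      "when we compare just two snapshots.")
```

In a toy world like this one, where every single student is improving, a comparison of two snapshots still shows a sizable handful of students apparently losing ground, simply because of the noise in each test day.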

But there is a second, greater challenge: students, unlike the weather, are human. Learning is a complex matter, and reducing it to a few snapshots of numbers is naïve. Measuring the many aspects of learning by comparing a few test scores, though done with the best of intentions, may not only produce bad data but may in fact unintentionally encourage bad teaching: the teachers who are rewarded are those who obsessively focus on the numbers, when what we need are teachers who obsessively focus on students. Of course, the two (focusing on students and focusing on numbers) are not mutually exclusive at all; a good teacher relies on the numbers, along with his skills as a professional, his understanding of his students, and his knowledge of the content, to figure out what to do next. But, to shift the analogy from the weather to the kitchen, if you gave me the choice between a chef who obsessively measured out each ingredient to check that the numbers matched the recipe and one who kept adding and tasting and adjusting as she cooked, I’d prefer the taster to the tester.