I agree that perfection should not get in the way of progress. But the LA Times (and a number of think tank analysts) continue to justify the use of VAM (value-added estimates of individual teachers) as the sole determinant of who is effective in raising student achievement and who is not.
The pundits may give a slight tip of the hat toward multiple measures of effectiveness, but their conventional wisdom (as surfaced again in an August 28th LA Times article) is that “the benefits of singling out those who consistently succeed” far exceed the bad “publicity” for those with “low rankings” (and the lack of fully reliable data be damned).
While Secretary of Education Arne Duncan has said that “no one thinks test scores should be the only factor in teacher evaluations,” he and many others may be encouraging school districts to make them the overwhelming factor when they give their blessing for newspapers to identify effective teachers solely on the basis of VAM mathematical formulas, as the Secretary did with some gusto in his August 29, 2010, article for the New York Daily News.
Meanwhile the hard research is piling up, showing how the use of standardized student test scores and value-added modeling in teacher evaluation can do lots more harm than good — and harm not just to teachers, but to schools in every kind of American community, by distorting the reasons for student success.
A major new study from a top national team of education measurement experts* offers up mounds of evidence to confirm how unstable the results of VAM evaluations are across time, across classes and across tests. These respected scholars, assembled by the Economic Policy Institute, found that applying value-added methodologies to student test scores does not fully control for the varied factors that affect student learning, including the effects of summer learning loss for some students.
They also reveal how VAM does not account for the fact that some students receive instruction in particular subjects from more than one teacher or that some are offered afterschool programs, while others are not. And they make an even more critical point:
Creating a system in which teachers are, in effect, competing with each other can reduce the incentive to collaborate within schools – and studies have shown that better schools are marked by teaching staffs that work together.
The Secretary suggests that the issue of publicly releasing the names of effective and ineffective teachers in the newspaper, based solely on VAM results, is an “emotional” one. But I would argue that this issue is far more about science than emotion.
The science is clear: VAM stratagems, no matter how sophisticated their statistical procedures may be, are not capable on their own of fairly and accurately identifying who is an effective teacher. There’s some irony in the fact that this major new study supports an analysis recently commissioned by the Secretary himself. The Secretary’s report concluded that more than 1 in 4 teachers who are judged by VAM tools will be falsely labeled as effective or ineffective, even when three years’ worth of data are available.
In its most recent story on teacher performance, the LA Times identified and painted a powerful portrait of a number of effective teachers like Zenaida Tan and Hollie Bloch, two very experienced educators who teach in LAUSD. But the further irony is that while VAM helped the Times find these two teachers, there is at least one chance in four that Ms. Tan and Ms. Bloch might be labeled “ineffective” by the newspaper in some future year, using exactly the same methods.
Returning to the EPI report, I could not agree more with America’s educational measurement experts that “although standardized test scores of students are one piece of information for school leaders to use to make judgments about teacher effectiveness, such scores should be only a part of an overall comprehensive evaluation.”
I agree with those who say that perfection should not get in the way of progress. A comprehensive system of evaluating teacher performance would include VAM measures as one component, but also an array of other metrics — e.g., teachers’ analyses of student work or how teachers spread expertise from one classroom to the next. What’s more, a really good system would reveal why some teachers were more effective than others and what we can do to increase their numbers and their spread.
Such a system would also rely on the leadership of classroom experts like Tan and Bloch, and not on school principals who have neither the time nor the training to make the granular pedagogical judgments that reveal who is effective and who is not.
I agree with the Secretary that a public spotlight needs to shine on our nation’s most effective teachers. I’ve been advocating for just such practices for more than a decade. But let’s make sure we understand all the dimensions of teaching that make them outstanding. And then let's advocate — together — for a system that guarantees and rewards those expert practices.
* The distinguished authors of EPI's report include four former presidents of the American Educational Research Association; two former presidents of the National Council on Measurement in Education; the current and two former chairs of the Board on Testing and Assessment of the National Research Council of the National Academy of Sciences; the president-elect of the Association for Public Policy Analysis and Management; the former director of the Educational Testing Service's Policy Information Center and a former associate director of the National Assessment of Educational Progress; a former assistant U.S. Secretary of Education; a former and a current member of the National Assessment Governing Board; and the current vice-president, a former president, and three other members of the National Academy of Education.