There's no question that if teaching is ever to become fully professionalized, then teachers need to be evaluated on results. Across the nation, we're making progress, but we rarely seem to be discussing the most important issues about using (or not) test score data in teacher assessments.
Granted, over the last several days, teacher evaluation news from New York City has revealed the gross lack of trust that exists between district and union leaders. And, from most media accounts, it seems that district administrators believe that teachers don't want to be accountable, while union leaders think that school managers just seek to fire more expensive and vocal classroom practitioners more easily. However, I want to suggest three issues that deserve more attention if we want our public school system to create the kind of results-oriented teacher evaluation system that students deserve:
- Standardized tests can provide important information to policymakers as well as parents and educators, but researchers have shown that most items on multiple-choice exams are instructionally insensitive, i.e., student performance does not differ much between items on topics their teachers have taught and items on topics they have not.
- Value-added methods, while offering a much-needed tool to isolate individual teacher effects, depend on random assignment of students, but this is virtually impossible to achieve.
- Statistical tools can control for extraneous variables such as poverty, school resources, and class size, but can't account for other important factors—such as whether students get after-school tutoring, whether teachers collaborate with colleagues, and what working conditions teachers experience.
These are serious matters that can confound researchers’ efforts to accurately identify who teaches effectively and who does not. Teaching is complex work and, as the recent MET study confirmed, cannot be reduced to a single number. More than 20 years ago Brian Rowan, using data from the U.S. Department of Labor, analyzed the nature of teachers' work in comparison to other occupations and concluded that “teaching is a complex form of work that requires high levels of formal knowledge for successful performance.” Reformers often ignore the scholarship of yesterday.
But one education leader, Jerry Weast, did not, and his record in forging teacher evaluation reform in Montgomery County, Maryland, is a model for how administrators and union leaders should proceed. I was able to hear Jerry tell his story in just a few minutes at a recent meeting of the Asia Society in Seattle, which had assembled small teams of education leaders from nine cities, including cities in Asia, Australia, and North America.
In his brief remarks, Jerry told of how he built trust with teachers over twelve years. He explained several specifics of their Peer Assistance and Review (PAR) program, which has been documented extensively by the Harvard Graduate School of Education and is part of a larger set of “deep changes.” Montgomery County is a large, diverse school district of over 140,000 students and 9,300 teachers. Jerry made it clear that “we never could have developed a results-oriented system of teacher evaluation without our union—they became the best judge of pedagogical performance.”
In fact, while Jerry noted that over 500 teachers were dismissed over a decade as a result of PAR (about ten times more than when only administrators conducted evaluations), the real benefit has been “all the support” now provided to those who teach. And behind PAR is the district’s Professional Growth System, which includes incentives to help teachers pursue the rigorous process of obtaining National Board certification. “Our approach to teacher evaluation,” Jerry reminded us, “was built on stability and a long-term commitment to a coherent strategy.”
All this matters for student achievement. In 2010, when Jerry left the district, Montgomery County had the highest high school graduation rate (83 percent) of any large school system in the nation. The gap between white and Hispanic students reaching proficiency in third-grade reading narrowed from 43 percentage points in 2003 to just 12 in 2009; the black-white gap narrowed from 35 percentage points to 15.
Trust matters. And so do teacher evaluation systems that respect the complexity of teaching.