Check out this Twitter exchange from Matt Yglesias.
Let’s answer snark with substance.
First question: who?
This is a common theme from Yglesias, that there are lots of critics of school reform who believe that there is no such thing as teacher quality or that it cannot possibly be measured. You can see this most clearly in his original post on “edunihilism”: “If it’s true that we don’t need to shake up the K-12 school system because what happens inside K-12 schools doesn’t alter socioeconomically determined achievement gaps, then the policy remedy is random across-the-board cuts in K-12 school spending.” The problem here is that I know of literally no one who believes that what happens inside K-12 schools does nothing to alter socioeconomically determined achievement gaps. This is both a personal interest of mine and an academic interest of mine, and I have not encountered anyone, and certainly no prominent critic of school reform, who endorses an edunihilist philosophy. This lone post from Kevin Drum is the best evidence I can find that anybody in the liberal blogosphere believes it, and it’s unclear to me either from that individual post or from his subsequent commentary that Drum believes in edunihilism as Yglesias defines it. Yet Yglesias flogs the idea again and again and again, without ever being specific about who he is referring to. Don’t just take my word for it. Erik Kain has been asking the same question for months.
One of the undeniable advantages of the web is that it makes citation and evidence a snap. Even on Twitter, it would be very easy for Yglesias to cite specific examples. It’s really easy to critique people when you are actually critiquing a broad caricature that they would never sign off on. The problem is that this is no way to get at the truth. Edunihilism is a strawman, or at best, a very rarely encountered weak man. And, incidentally, I’m not even sure his conclusions proceed logically from his premise. Even if some say that there is a threshold beyond which students cannot achieve because of socioeconomic factors, that doesn’t imply that there would be no negative impact on their performance if you slashed funding. It’s perfectly consistent to argue that there is a plateau beyond which student success is impossible while nevertheless saying that scores would fall if funding were slashed.
(Predictably, Yglesias earned plaudits from the National Review for this post; the school reform movement is intrinsically conservative, as it is primarily oriented towards smashing unions and defunding government programs.)
What the majority of critics of school reform that I am aware of say is not that there is no such thing as teacher quality or that educational outcomes are purely determined by socioeconomic factors. What many say, and what I say, is that talk of an American crisis of education is out of tune with the broader reality, as our deepest problems are largely confined to terribly performing outliers among urban poor black and Hispanic students. This in and of itself represents a major challenge to public policy, but talk of widespread crisis and a need for total reorganization of our public education system is folly. We recognize that the rare disadvantages that hurt American educational outputs include a) a mammoth child poverty rate, which negatively affects them in international comparisons and makes those comparisons of dubious value, and b) more controversially and distressingly, a racial achievement gap that is not explained by socioeconomic status and that affects black and Hispanic students across the income distribution. In this context, what becomes particularly difficult is assessing the meaning of teacher effect signal against student variable noise. Epistemology is a big problem, and one of my continuing frustrations is Yglesias’s refusal to consider the difficulty of effective teacher assessment.
I’m not quite sure how seriously Yglesias wants to take an analogy between student educational outputs and the tensile strength of concrete. But let’s roll with it. Simply put, it is vastly easier to assess whether concrete has been mixed correctly than it is to measure a teacher’s input based on student performance. Questions about the efficacy of standardized testing or similar metrics to determine how well a teacher is performing are nothing new. Anyone in an educational or pedagogical field who has tried to come up with dissertation research can tell you that it’s very difficult to test which teacher inputs produce differences in student outputs. There are simply so many confounding variables, the most vexing of which is that students have final agency over what they produce.
The quality of concrete doesn’t change because its parents are getting a divorce; it doesn’t change because it had a stomach ache the day it got tested; it doesn’t change because it just went through puberty and there’s a cute girl in its classroom; it doesn’t change because it has a behavior problem; it doesn’t change because it doesn’t get along with its teacher; it doesn’t change because it just decided to dick around that day. Anyone who has been a student understands the wisdom of this. There are few people out there who didn’t have that one semester in college where their grades were markedly worse than in other semesters, because they broke up with their girlfriend or started doing coke or stayed up all night every night playing Wizards of Warlock. Nobody blames their professors for that. (Indeed, we get on professors if the grades they give are too high.)
If you mix and pour concrete correctly, you get quality concrete. There’s no doubt about it. A teacher can perform his or her function perfectly and students can still fail. Indeed, some number of his or her students are statistically certain to fail regardless of his or her performance. That doesn’t mean that the quality of instruction is immaterial to all the students of that class, or even to the failing students. It does mean that there is simply a limit to the degree of teacher control on that output. And when dealing with the large disadvantages that socioeconomic factors and the racial achievement gap inflict, it becomes even harder to separate signal from noise in evaluating teacher performance. Under such conditions, where teachers have great constraints in how much they can affect student outcomes, and the consequences for failure to produce those outcomes are dire, fraud is inevitable. We are seeing just such fraud now, as was predicted perfectly by critics of school reform.
You could see some of the nuances of the epistemological problems with standardized testing in the now notorious case of Washington DC private school vouchers (PDF), where students showed no statistically significant gains in standardized testing but graduated at far higher rates. There are two possibilities, neither of which is pleasant for the school reform movement: either the students did make marked improvement from participation in the program, which directly undercuts the efficacy of the standardized tests so beloved of school reformers to measure student educational gains; or the students did not improve, but were graduated anyway because their grades were teacher-dependent and teachers felt great pressure to graduate underperforming students.
When I argue this subject online I often find commenters who say things like “surely you can’t think that there are no effective ways to measure teacher output.” I’m not such a defeatist, no, but I do think that a large tradition of evidence gives us considerable reason to doubt our current metrics for teacher performance against the considerable statistical noise of student environment and makeup. Yes, I wish we had more consistently effective metrics, but wishing still doesn’t make it so.
How then to use standardized testing? The only way to reap any benefit from standardized tests is to remove the inevitable pressure for fraud, and the only way to assess their efficacy fairly is as part of a broad mixed methods approach of assessment that also includes direct observation and coding of student behavior, tracking of students across wide age ranges, student and parent satisfaction surveys, evaluations by peers and independent arbiters, and a broad commitment to correcting for student inputs. It’s a regime worth trying and testing empirically. Unfortunately, this does not serve the primary interests of school reformers, which have long since devolved into “smash unions first.”
When your movement gets to the point where it is concerned chiefly with talking tough and hurting teachers, that is of course the outcome you get: tough talk, worse employment conditions for teachers, and not much else.
The question for school reformers like Yglesias is how much longer we have to wait for these reforms to show their long-promised, little-demonstrated positive effects. The school reform movement is decades old, as are voucher efforts and charter schools. The extant empirical evidence (based on the very metrics school reformers champion!) has failed to consistently show repeatable gains, despite the loud assurances of the movement. When do their failures receive the fanfare that public schools’ do, and at what point do school reformers drop their failing solutions? After all, accountability begins at home.