
Thursday, September 30, 2010

Why the school grading system, and Joel Klein, still deserve a big "F"

Amidst all the hype and furor over the release of today’s NYC school “progress reports,” everyone should remember that the grades are not to be trusted. By their inherent design, the grades are statistically invalid, and the DOE must be fully aware of this fact. Why?

See the Daily News op-ed I wrote in 2007, “Why parents and teachers should reject the new grades”; all of its criticisms still hold true:
In part, this is because 85% of each school’s grade depends on one year’s test scores alone – which, according to experts, is highly unreliable. Researchers have found that 32 to 80% of the annual fluctuations in a typical school’s scores are random or due to one-time factors alone, unrelated to the amount of learning taking place. Thus, given the formula used by the Department of Education, a school’s grade may be based more on chance than anything else.
(Source: Thomas J. Kane and Douglas O. Staiger, “The Promise and Pitfalls of Using Imprecise School Accountability Measures,” The Journal of Economic Perspectives, Autumn 2002.)
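To see why a single year of scores is such a noisy signal, here is a minimal simulation sketch (all the parameters are hypothetical, chosen for illustration; this is not the DOE’s actual formula): schools whose true quality never changes at all still show apparent “progress” and “decline” from one year to the next, purely from the chance composition of each year’s tested students.

```python
import numpy as np

rng = np.random.default_rng(0)

n_schools = 1_000   # hypothetical schools
n_students = 60     # tested students per school; small cohorts are noisy
school_effect = rng.normal(0.0, 0.25, n_schools)  # fixed "true" quality

def observed_mean():
    # One year's mean score = true school effect + sampling noise from
    # that particular cohort of students (individual-score SD = 1).
    scores = rng.normal(school_effect[:, None], 1.0, (n_schools, n_students))
    return scores.mean(axis=1)

year1, year2 = observed_mean(), observed_mean()
change = year2 - year1  # the year-to-year "progress" a grade rewards

# School quality never changed, yet measured scores still move:
print(f"SD of annual change with zero real change: {change.std():.3f}")
```

In this toy setup, 100% of the year-to-year fluctuation is chance; Kane and Staiger’s finding that 32 to 80% of real-world fluctuations are transient sits between this extreme and a noise-free system.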

Now Jim Liebman admitted this fact, that one year’s test score data is inherently unreliable, in testimony to the City Council and to numerous parent groups, including CEC D2, as recounted on p. 121 of Beth Fertig’s book, Why can’t U teach me 2 read? In responding to Michael Markowitz’s observation that the grading system was designed to provide essentially random results, he admitted:

“There’s a lot I actually agree with,” he said in a concession to his opponent…He then proceeded to explain how the system would eventually include three years’ worth of data on every school so the risk of big fluctuations from one year to the next wouldn’t be such a problem.
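Liebman was right that averaging helps: three years of independent results cut the noise in a school’s measured performance by a factor of roughly the square root of three. A hedged sketch, continuing the hypothetical parameters above:

```python
import numpy as np

rng = np.random.default_rng(1)
n_schools, n_students, n_years = 1_000, 60, 3

school_effect = rng.normal(0.0, 0.25, n_schools)
sem = 1.0 / np.sqrt(n_students)  # standard error of one year's mean score

# Three independent years of noisy mean scores for every school
yearly = rng.normal(school_effect[:, None], sem, (n_schools, n_years))

one_year = (yearly[:, 0] - school_effect).std()
three_year = (yearly.mean(axis=1) - school_effect).std()
print(f"measurement error, single year:    {one_year:.3f}")
print(f"measurement error, 3-year average: {three_year:.3f}")
# The 3-year average is about 1/sqrt(3) as noisy, so grades built on it
# would swing far less from chance alone.
```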

Nevertheless, the DOE and Liebman have refused to keep this promise, which reveals a basic intellectual dishonesty. This is what Suransky emailed me a couple of weeks ago, when I asked him about the issue before our NY Law School “debate”:

“We use one year of data because it is critical to focus schools’ attention on making progress with their students every year. While we have made gains as a system over the last 9 years, we still have a long way to reach our goal of ensuring that all students who come out of a New York City school are prepared for post-secondary opportunities. Measuring multiple years’ results on the Progress Report could allow some schools to “ride the coattails” of prior years’ success or unduly punish schools that rebound quickly from a difficult year.”

Of course, this is nonsense. No educators would “coast” on a prior year’s “success,” but they would be far more confident in a system that didn’t give them an inherently inaccurate rating.

Given that school grades bounce up and down each year, most teachers, administrators, and even parents have long since figured out that they should be discounted, and they justifiably believe that any administration that would punish or reward a school based on such invalid measures is not to be trusted.

That the DOE has changed the school grading formula in other ways every year for the last three years doesn’t inspire confidence either, even as they refuse to fix this most fundamental flaw. Yet another major problem: while the teacher data reports take class size into account as a significant factor limiting how much schools can raise student test scores, the progress reports do not.

There are lots more problems with the school grading system, including the fact that the grades are primarily based upon state exams that we know are themselves completely unreliable. As behavioral economist Dan Ariely recently wrote about the damaging nature of value-added teacher pay, which rests on these same highly unreliable measurements:

“…What if, after you finished kicking [a ball] somebody comes and moves the ball either 20 feet right or 20 feet left? How good would you be under those conditions? It turns out you would be terrible. Because human beings can learn very well in deterministic systems, but in a probabilistic system—what we call a stochastic system, with some random error—people very quickly become very bad at it.

So now imagine a schoolteacher. A schoolteacher is doing what [he or she] thinks is best for the class, who then gets feedback. Feedback, for example, from a standardized test. How much random error is in the feedback of the teacher? How much is somebody moving the ball right and left? A ton. Teachers actually control a very small part of the variance. Parents control some of it. Neighborhoods control some of it. What people decide to put on the test controls some of it. And the weather, and whether a kid is sick, and lots of other things determine the final score.

So when we create these score-based systems, we not only tend to focus teachers on a very small subset of [what we want schools to accomplish], but we also reward them largely on things that are outside of their control. And that's a very, very bad system.”
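Ariely’s variance point can be made concrete with a toy decomposition (the shares below are invented for illustration, not empirical estimates): when the teacher’s own choices account for only a small slice of the variance in a test score, the score is mostly feedback about things the teacher does not control.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000  # hypothetical student test scores

# Made-up variance components echoing Ariely's list of influences
teacher  = rng.normal(0, 0.3, n)  # what the teacher actually controls
family   = rng.normal(0, 0.6, n)  # parents, neighborhood, prior learning
test_day = rng.normal(0, 0.7, n)  # item selection, illness, weather, luck

score = teacher + family + test_day
share = teacher.var() / score.var()
print(f"share of score variance the teacher controls: {share:.0%}")
# Roughly 10% under these assumptions: the "feedback" the score gives a
# teacher is mostly someone moving the ball 20 feet left or right.
```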

Indeed. The invalid nature of the school grades is just one more indication of the fundamentally dishonest nature of the Bloomberg/Klein administration, and yet another reason for the cynicism, frustration, and justifiable anger of teachers and parents.

Also be sure to check out this Aaron Pallas classic: Could a Monkey Do a Better Job of Predicting Which Schools Show Student Progress in English Skills than the New York City Department of Education?

Thursday, July 9, 2009

A Failing Grade for Mr. Liebman

Several articles appeared today about James Liebman’s resignation after three years as head of Tweed’s Office of Accountability; he is finally returning to Columbia University Law School full time: Chief Accountability Officer for City Schools Resigns (NY Times) and New accountability chief says he’ll carry on Liebman’s legacy (Gotham Schools).

Let us remember that this man had no qualifications for the job and proved this repeatedly over the years. In fact, the only person who probably knew less about education and how to nurture the conditions for learning was the man who hired him: Chancellor Klein. Columbia University finally woke up to the fact that he had been double-dipping: while holding the office of Chief Accountability Officer at Tweed, he was also supposedly on the full-time law faculty for the last year.

The progress reports he designed were widely derided as unreliable and statistically untenable; the quality reviews were an expensive waste of time and paperwork, and ignored when DOE was deciding which schools to close and which schools to commend; the $80 million supercomputer called ARIS was a super-expensive super-mugging by IBM, according to techies who found it laughable how much DOE was taken for a ride.

The surveys were badly designed and counted for only a small percentage of school grades. Yet because principals were terrified of bad results, parents were pressured into giving favorable reviews for fear their schools would otherwise be punished. And the top priority of parents on these surveys — class size reduction — was ignored; worse, it was repeatedly derided by Liebman et al. as a goal not worth pursuing.

Under his leadership, or lack thereof, the Accountability office continued to mushroom with more and more high-priced educrats, “Knowledge Managers” and the like, few of whom, like him, had any experience or qualifications for the job, much less an understanding of statistics or the limitations of data.

One would think that a man who had focused professionally on the large error rate in capital punishment cases would have a little humility about the fallibility of human judgment -- but no such luck. When confronted with the question of why schools should be given single grades, rather than a more nuanced system that might recognize their variety of attributes, he opined that a single grade, from A to F, was useful “to concentrate the mind.”

The point of the test score data from the periodic assessments and standardized tests, collected and spewed out by ARIS and analyzed by each school’s “data inquiry teams” and “Senior Achievement Facilitators,” was ostensibly to encourage “differentiated instruction,” although this goal was severely hampered by the fact that, under Klein’s leadership or lack thereof, overcrowding and excessive class sizes have continued.

No matter how much data is available — even assuming it is statistically reliable— the best way to allow differentiated instruction to occur is to lower class size.

And let us not forget Liebman’s cowardly run out the back door of City Hall in order to escape parents and hundreds of petitions collected by Time out from Testing — even though City Council Education Chair Robert Jackson had specifically requested that he leave through the front door of the chambers after he testified so that he could receive the petitions with the respect that they deserved. A perfect emblem of his three years at DOE.