Sunday, April 11, 2010

Statistician's analysis could determine teachers' fates

The opening sentence says a lot: "I can't explain how you calculate it. I can explain the concept."

So the concept is understood in theory, but how it plays out in practice is not. These are the very disconnects in policymaking that ultimately hurt many of our children and youth.

-Patricia


Ericka Mellon | Houston Chronicle
Feb. 21, 2010

Statisticians don't often make the news. But William “Bill” Sanders became known among local teachers when the Houston Independent School District decided this month to evaluate — and possibly fire — them based on his statistical analysis of student test scores. HISD has paid Sanders' employer, the SAS Institute in Cary, N.C., more than $1.3 million since 2008 for his analysis, which is the basis of the district's performance bonus system.

Q: Teachers have criticized your formula as being too complicated. In simple terms, can you explain how you calculate a teacher's effectiveness?

A: I can't explain how you calculate it. I can explain the concept. The concept is very simple. You look at the progress rates for the individual student. The way to visualize it is, every student has his own trajectory. If the kid is having a really good year, there will be a positive bubble. If the kid is not having a good year, there will be a negative dent. By mathematically aggregating the dimples and bubbles across all students, that gives you a basis for measuring the effectiveness of the schooling entity — the district, the building or the classroom.
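To make the concept concrete, here is a deliberately simplified sketch. It is not the computation Sanders or SAS performs, which the interview does not spell out. The student records, the choice of each student's average past gain as the "trajectory," and the function names are all assumptions invented for illustration.

```python
from statistics import mean

# Hypothetical records: each student's scores from earlier years, this
# year's score, and this year's classroom. All values are invented.
students = [
    {"classroom": "A", "prior": [600, 620, 640], "current": 670},
    {"classroom": "A", "prior": [500, 530, 560], "current": 585},
    {"classroom": "B", "prior": [610, 630, 650], "current": 660},
    {"classroom": "B", "prior": [520, 540, 560], "current": 590},
]

def deviation(student):
    """This year's gain minus the student's own average past gain.

    Positive means a better year than the student's trajectory suggests,
    negative means a worse one (an assumed stand-in for the trajectory idea).
    """
    prior = student["prior"]
    past_gains = [later - earlier for earlier, later in zip(prior, prior[1:])]
    expected_gain = mean(past_gains)
    actual_gain = student["current"] - prior[-1]
    return actual_gain - expected_gain

def classroom_effects(records):
    """Average the per-student deviations within each classroom."""
    by_room = {}
    for student in records:
        by_room.setdefault(student["classroom"], []).append(deviation(student))
    return {room: mean(devs) for room, devs in by_room.items()}

effects = classroom_effects(students)
print(effects)
```

In this toy data, classroom A averages 2.5 points above its students' own trajectories and classroom B averages exactly on trajectory, even though individual students in both rooms land above and below.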

Q: I've heard your formula is a secret or proprietary.

A: That's not true. (Sanders has given the Chronicle a paper about the Tennessee Value-Added Assessment System, which he developed. Find it at blogs.chron.com/schoolzone.)

Q: Some teachers say they would prefer being judged on a test given to students at the beginning and the end of the school year. What's wrong with that?

A: When any one kid takes any one test on any one day any one year, there's this huge error of measurement. If you're doing a simple gain, you don't have enough data to dampen those errors.
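The point about measurement error can be sketched with a little arithmetic. The assumptions here are mine, not Sanders': every test score carries an independent error with the same standard deviation, and a baseline built from several earlier scores is treated as a simple average on a common scale. The value of `sigma` is invented.

```python
import math

def gain_error_sd(sigma, n_baseline_scores=1):
    """Standard deviation of the error in (current score - baseline).

    The baseline is the mean of n_baseline_scores independent measurements,
    each with error SD sigma; the current score has error SD sigma as well.
    Variances add: sigma^2 + sigma^2 / n_baseline_scores.
    """
    return sigma * math.sqrt(1 + 1 / n_baseline_scores)

sigma = 30.0  # hypothetical error SD of a single test, in score points

simple_gain = gain_error_sd(sigma, 1)   # one pre-test and one post-test
with_history = gain_error_sd(sigma, 4)  # baseline averaged over four scores

print(round(simple_gain, 2), round(with_history, 2))
```

Under these assumptions the error in a simple pre/post gain is larger than the error in a single score, and a baseline drawn from more measurements shrinks it. That is the sense in which more data "dampens" the errors.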

Q: Much of your analysis is based on TAKS test scores. Isn't the TAKS imperfect?

A: (He laughs.) There is no such thing as a perfect test. What you're looking for are three conditions. The scales have got to be highly correlated with curricular objectives. They must have appropriate stretch to enable you to measure progress of high- and low-achieving students and have appropriate reliabilities. We feel very comfortable using the TAKS test.

Q: Is a school or a teacher with a lot of high-achieving students at a disadvantage?

A: Absolutely not. We have determined there's not a major ceiling effect problem with the TAKS test, unlike the (former state test) TAAS. In fact, with that we had several districts asking us to do these analyses and we said, ‘Thanks but no thanks. The ceiling effects are too great.’

Q: How can teachers have, say, 90 percent of their students pass TAKS yet have negative growth?

A: Those are totally different things. The proficiency level that's set for passing is relatively low. So you could have students that are scoring above proficiency but yet not making appropriate growth, particularly for the above-average kids.
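A small made-up example shows how the two measures can diverge. The passing cutoff, the 0-100 scale, and the ten score pairs are all hypothetical, and the raw year-over-year change is used as a crude stand-in for growth; the actual analysis measures progress differently.

```python
from statistics import mean

PASSING = 70  # hypothetical passing cutoff on an invented 0-100 scale

# (last year's score, this year's score) for ten invented students
scores = [
    (95, 90), (92, 88), (90, 85), (88, 86), (85, 80),
    (84, 82), (80, 78), (78, 75), (76, 72), (65, 66),
]

# Share of students at or above the cutoff this year
pass_rate = sum(current >= PASSING for _, current in scores) / len(scores)

# Average change from last year to this year
mean_growth = mean(current - previous for previous, current in scores)

print(pass_rate, mean_growth)
```

Nine of the ten students clear the cutoff, yet the class lost ground on average, because most of the above-cutoff students scored lower than they did the year before.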

Q: If I'm a seventh-grade English teacher and I have a student who in sixth grade got every question right on the English TAKS, how do I show growth?

A: For that kid, you can't. But you're evaluating it on all the kids. When you look at the composite of the kids, those things start averaging out.

Q: Is it harder for schools with more disadvantaged students to show growth?

A: There is no relationship between the percentage of free- and reduced-priced lunch students in the classroom and the measure of classroom effectiveness.

Q: Some social studies teachers complain that the seventh-grade curriculum doesn't align with the Stanford test, which is used in your formula.

A: There's probably some very legitimate question about that. Remember, one of the three criteria: it's got to be highly correlated with curricular objectives. The question is, ‘How high is high enough?’ I would submit to you that there is considerable relationship between the Stanford test and the social studies curriculum, but it certainly would not be as highly correlated as math, say. We still are very comfortable that these are unbiased estimates.

Q: HISD has placed high stakes on your data, awarding bonuses to teachers who do well and threatening to fire those who don't. Do you support HISD's plans?

A: What I've said consistently is, ‘I'm the numbers guy. I'm not the policy guy.’ I have maintained for over 20 years that an appropriate value-added measurement should be a major component of formal evaluation. I have also said, ‘It should not be the sole component.’

Q: But if HISD's decisions weren't statistically smart, would you speak up?

A: Of course.

ericka.mellon@chron.com
