Wednesday, May 24, 2006

Analysis suggests cheating on TAKS

Here we go again. This is reminiscent of the 1999 TAAS erasure-marks-cheating scandal. Clearly, this stuff is systemic in terms of the (perverse) incentives that motivate this behavior in adults. -Angela

Analysis suggests cheating on TAKS
TEA consultant cites suspicious scores in 1 in 12 Texas schools in '05

12:15 AM CDT on Tuesday, May 23, 2006

By JOSHUA BENTON / The Dallas Morning News

About one in 12 Texas schools had unusual TAKS results that suggest cheating occurred last year, according to a consultant hired by the Texas Education Agency.

The consultant, a Utah test security firm named Caveon, was hired after a Dallas Morning News series found suspicious scores in nearly 400 schools statewide, based on 2003 and 2004 testing results.

Caveon's analysis, using 2005 TAKS results, found even more: 609 schools, or 8.6 percent of the state's campuses.

But state officials say even those numbers are not a sign of cheating in Texas schools.

"Given the size of this program and the size of this state, yes, we had 600 campuses identified," said Gloria Zyskowski, TEA's director of test administration. "But we have over 5,000 campuses where the test was administered.

"While we take very seriously any allegations of cheating – we don't take any of that lightly – I believe that for the most part these tests are being administered according to the guidelines provided by the state."

The report, obtained using the Texas open records act, reopens a debate about the validity of results on the state's top test, the Texas Assessment of Knowledge and Skills. TEA has traditionally left investigations into allegations of cheating to the districts, and few teachers or students are ever disciplined for wrongdoing.

Caveon's report, like The News' analysis, is based on an extended statistical analysis of student answer sheets. For example, it would flag a classroom where every student answered all the test's questions in exactly the same way, or a classroom where very weak students made seemingly impossible gains in one year.

It would also catch classrooms in which an adult erased a large number of student answers after the test was completed.

The analysis found "statistical inconsistencies" in 609 of the 7,112 Texas public schools where testing was conducted last year. In many of those schools, only one classroom was found to have suspicious activity; in all, 702 classrooms statewide were identified.

Caveon's report emphasizes that the statistical measures are not, by themselves, proof of cheating. In some cases, there may be another explanation for the unusual data patterns.

But the report says Caveon used "a very conservative statistical approach" that means "reasonable explanations of these inconsistencies by referring to normal circumstances become improbable."

TEA does not plan to investigate each of the 609 campuses identified, and Dr. Zyskowski said the agency may not even release their names to school districts. "You want to be pretty cautious about releasing something like that," she said. "As soon as something like that is posted, you have to be very cautious that it is as accurate as it can be."

Instead, agency officials will compare the list with incident reports from 2005. Those reports are generated whenever an educator witnesses something improper during testing at his or her school. If no such report exists for a school on the Caveon list, Dr. Zyskowski said, it's unlikely there would be any further investigation.

Self-investigation
If further investigation is warranted, TEA typically asks districts to investigate themselves. Dr. Zyskowski said the agency does not have the resources to look into many allegations of cheating.

"That's sort of why we tend to be a little judicious, because we are limited in our resources," she said. "So we can only look at a certain number of issues, and we try to look at those that appear to be most serious."

The Caveon report also recommended increasing the number of staffers who monitor the testing process in suspicious schools. But Dr. Zyskowski said TEA does not have the staff to do that; additional personnel would have to come from school districts.

She defended the state testing system as fundamentally sound. State and federal government school accountability systems are based on test scores, which are a major driver of nearly everything in Texas public schools. "I really think that overall that it's not as big of an issue as it sometimes is portrayed to be," Dr. Zyskowski said.

The Caveon report did not name any of the schools it found, but it did provide examples without identifying them.

In one elementary school, 45 of the 262 answer sheets were exact duplicates of one another. An additional 29 students had perfect scores. In all, 141 answer sheets were flagged by the analysis, and Caveon says the chances of such a pattern happening naturally would be less than 1 in 1 trillion trillion trillion trillion trillion trillion – a 1 followed by 72 zeros.

The results also indicate the prevalence of cheating on the TAKS test with the highest stakes of all: the 11th-grade test, which students must pass to graduate.

The Caveon report does not break out suspicious incidents by grade level. But while it examined math and reading scores in grades three through 11, it looked at science and social studies scores only in 11th grade.

The study found suspicious scores in 4.8 percent of all 11th-grade science classrooms and 4.2 percent of 11th-grade social studies classrooms. Those figures are much higher than the 0.7 percent of math classrooms and 0.3 percent of reading classrooms flagged.

If those 11th-graders cheated on the TAKS test last year, they are probably graduating this month.

The News' series on cheating was prompted by unusual scores in Wilmer-Hutchins ISD, the much-troubled district on Dallas' southeast side. A News analysis found strong evidence of cheating in the district's elementary schools.

For example, it found that Wilmer Elementary had Texas' highest raw scores on the third-grade reading test in 2003 – despite the school's abysmal academic track record and having one of the state's most disadvantaged student bodies. Nearly every student at Wilmer had a perfect score on the exam.

The News' findings prompted a state investigation into Wilmer-Hutchins that found evidence that two-thirds of the district's elementary school teachers were helping students improperly on the exams, in some cases creating and distributing answer keys on test day.

As a result of those findings, the Wilmer-Hutchins school board was removed from office and the district is being dissolved. Later stories led to investigations that resulted in educators being disciplined in Houston and Dallas.

The state's reaction
In response to the News stories, state Education Commissioner Shirley Neeley said she did not think cheating was a significant problem. "If we have cheating on one campus, or in one classroom, that's unacceptable," she said in February 2005. "But I just don't think it's quite the widespread problem that it's been reported to be."

Still, her agency hired a test security firm as part of the renewal of its overall testing contract last year. That company is Caveon, which is led by former state and national testing officials.

Proving a cheating allegation after the fact is very difficult. Typically, discipline is not pursued against a cheating teacher unless there is eyewitness evidence of wrongdoing – something that can be hard to obtain. As of 2005, only two teachers had lost their teaching licenses because of cheating allegations in the previous decade.

That problem is compounded by the long lag time of the Caveon report, which covers alleged irregularities more than a year old. Dr. Zyskowski said she hopes the company's analysis of 2006 data will arrive more quickly. Having two years of data will also make it easier to see patterns, she said.

E-mail jbenton@dallasnews.com

RAISING SUSPICION
An example of one unidentified high school whose scores the Caveon report found suspicious:

• 91 students took the 11th-grade math TAKS test.

• 55 percent of test takers got an unusual number of hard questions right but an unusual number of easy questions wrong. (Statistically expected number: 4 percent)

• 98 percent of answer sheets were identical or nearly identical to another answer sheet in the group. (Statistically expected: 6 percent)

• 49 percent of students showed unusually high gains from the previous year's test. (Statistically expected: 5 percent)

• The report: "The probability value that these identical answer sheets occurred by chance is so small as to approach the realm of impossibility." Caveon says that chance is less than 1 in 1,000,000,000,000,000,000,000.

SOURCES: Caveon report, Texas Education Agency

SPOTTING THE PROBLEMS
Caveon's analysis of Texas test scores looked for four types of irregularities:

• Answer sheets with unusual numbers of wrong responses that have been erased and replaced with correct ones

• Inexplicably large jumps in students' test scores from the previous year

• Students who answer the harder questions on a test correctly but miss the easy ones

• Answer sheets that are unusually similar to those of other students in the same classroom
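The last check in the list, unusually similar answer sheets, can be illustrated with a rough sketch. The article does not describe Caveon's actual statistics, so everything here is an assumption for illustration: the function names, the made-up classroom data, and the 90 percent agreement threshold are hypothetical, and raw agreement is a much cruder measure than a real analysis would use (which would likely weigh shared wrong answers and estimate how probable a match is by chance).

```python
from itertools import combinations


def agreement(a: str, b: str) -> float:
    """Fraction of questions on which two answer sheets give the same response."""
    if len(a) != len(b):
        raise ValueError("answer sheets must cover the same questions")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def flag_similar_sheets(sheets: dict[str, str], threshold: float = 0.9):
    """Return (student, student, agreement) for every pair at or above threshold.

    The threshold is an illustrative assumption, not a value from the report.
    """
    flagged = []
    for (s1, a1), (s2, a2) in combinations(sheets.items(), 2):
        score = agreement(a1, a2)
        if score >= threshold:
            flagged.append((s1, s2, score))
    return flagged


# Made-up classroom: each string is one student's responses to 10 questions.
classroom = {
    "student_1": "ABCDABCDAB",
    "student_2": "ABCDABCDAB",  # identical to student_1
    "student_3": "ABCDABCDAC",  # differs from student_1 on one question
    "student_4": "DCBADCBADC",
}

flagged = flag_similar_sheets(classroom)
for s1, s2, score in flagged:
    print(s1, s2, f"{score:.0%}")
```

In this toy classroom, the first three students are flagged against one another while the fourth is not. A high share of flagged pairs in one room is the kind of pattern the report describes, though by itself it is not proof of cheating.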

Online at: http://www.dallasnews.com/sharedcontent/dws/dn/education/stories/052306dnmetcheating.125e559b.html

1 comment:

  1. I am not sure that 702 instances of cheating could be considered systemic.

    Let's see--there are about 2,934,067 students in grades 3
    through 11. There are 15 students per teacher. But let's say
    that during TAKS testing, there are 20 students in each
    classroom, so 20 students per teacher. That means there were
    about 146,703 classrooms of kids taking the TAKS. Actually,
    this number is higher since there are multiple tests, but
    let's just stick with this number for argument's sake. So,
    Caveon finds that 702 of these classrooms have unusual
    scores that indicate possible cheating. This would be about 0.5% of
    all the classrooms. I don't think 0.5% constitutes systemic.
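The arithmetic in this comment can be checked with a short sketch. The student count and the 20-per-classroom figure are the commenter's own estimates, not numbers from the Caveon report; the school-level figures are the ones quoted in the article. The variable names are illustrative.

```python
# Commenter's estimates (not from the Caveon report)
students = 2_934_067          # students in grades 3 through 11
students_per_classroom = 20   # assumed classroom size on testing day

# Figures quoted in the article
flagged_classrooms = 702
flagged_schools = 609
schools_tested = 7_112

# Classroom-level rate, using the commenter's estimated denominator
classrooms = students / students_per_classroom
share_flagged = flagged_classrooms / classrooms

# School-level rate, using the article's counts
school_share = flagged_schools / schools_tested

print(f"{classrooms:,.0f} estimated classrooms")
print(f"{share_flagged:.2%} of classrooms flagged")
print(f"{school_share:.1%} of schools flagged")
```

Under these assumptions the classroom-level rate comes out just under 0.5 percent, while the school-level rate matches the article's 8.6 percent. The two figures differ because the unit of analysis differs: a school is counted as flagged even if only one of its classrooms is.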

    This is not to say the test is not over-emphasized, but you should think carefully about the data before making generalizations or conclusions.
