
Thursday, November 29, 2012

Standardized Testing Costs States $1.7 Billion a Year, Study Says


This is an exorbitant cost, and it doesn't even take into account the indirect costs of testing.
See also "What New Piece of the Universe Did Your Students Touch Today?" by Rice University professor Linda McNeil.

http://tsta.org/sites/default/files/12fallAdvocate-web.pdf

How to Use Value-Added Measures Right




November 2012 | Volume 70 | Number 3
Teacher Evaluation: What's Fair? What's Effective? Pages 38-42
Matthew Di Carlo
How can districts that are required to use value-added measures ensure that they do so responsibly?
The debate is polarized. Both sides are entrenched in their views. And schools are caught in the middle, having to implement evaluations using value-added measures whose practical value is unclear.
Value-added models are a specific type of growth model, a diverse group of statistical techniques to isolate a teacher's impact on his or her students' testing progress while controlling for other measurable factors, such as student and school characteristics, that are outside that teacher's control. Opponents, including many teachers, argue that value-added models are unreliable and invalid and have absolutely no business at all in teacher evaluations, especially high-stakes evaluations that guide employment and compensation decisions. Supporters, in stark contrast, assert that teacher evaluations are only meaningful if these measures are a heavily weighted component.
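Before weighing those claims, it helps to see what such a model actually computes. Below is a minimal, purely illustrative sketch in Python using synthetic data; it is not any state's actual model, and the variable names (prior_score, poverty) are hypothetical stand-ins. Students' current scores are regressed on a prior score and a background covariate, and a teacher's crude value-added estimate is simply the average residual of his or her students. Operational models are far more elaborate, but the covariate-adjustment idea is the same.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 20 teachers x 25 students each (all values synthetic).
n_teachers, n_students = 20, 25
teacher_effect = rng.normal(0, 3, n_teachers)             # "true" impact, unknown in practice
teacher_id = np.repeat(np.arange(n_teachers), n_students)
prior_score = rng.normal(500, 50, n_teachers * n_students)
poverty = rng.binomial(1, 0.4, n_teachers * n_students)   # stand-in background covariate
current_score = (0.8 * prior_score - 5 * poverty
                 + teacher_effect[teacher_id]
                 + rng.normal(0, 20, n_teachers * n_students))

# Step 1: regress current scores on prior scores and the covariate (ordinary least squares).
X = np.column_stack([np.ones_like(prior_score), prior_score, poverty])
beta, *_ = np.linalg.lstsq(X, current_score, rcond=None)
residuals = current_score - X @ beta

# Step 2: a teacher's crude value-added estimate is the mean residual of his or her students.
value_added = np.array([residuals[teacher_id == t].mean() for t in range(n_teachers)])
print(np.round(value_added, 1))
```

Everything a real system layers on top of this, such as multiple years of data, measurement-error corrections, and shrinkage, is an attempt to make that simple average more trustworthy.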
Supporters and opponents alike draw on a large and growing body of research that spans three decades (see Lipscomb, Teh, Gill, Chiang, & Owens, 2010, for a policy-oriented review). But despite the confidence on both sides, there is virtually no empirical evidence as to whether using value-added or other growth models—the types of models being used vary from state to state—in high-stakes evaluations can improve teacher performance or student outcomes. The reason is simple: It has never really been tried before.
It will probably be several years before there is solid initial evidence on whether and how the various new evaluation systems work in practice. In the meantime, the existing research can and must inform the design and implementation of these systems.

Reliability and Validity Apply to All Measures

Critics of value-added measures make a powerful case that value-added estimates are unreliable. Depending on how much data are available and where you set the bar, a teacher could be classified as a "top" or "bottom" teacher because of a random statistical error (Schochet & Chiang, 2010). That same teacher could receive a significantly different rating the next year (Goldhaber & Hansen, 2008; McCaffrey, Sass, Lockwood, & Mihaly, 2009). It makes little sense, critics argue, to base hiring, firing, and compensation decisions on such imprecise estimates. There are also strong objections to value-added measures in terms of validity—that is, the degree to which they actually measure teacher performance. (For an accessible discussion of validity and reliability in value-added measures, see Harris, 2011.)
Value-added estimates are based exclusively on scores from standardized tests, which are of notoriously varying quality and are not necessarily suitable for measuring teacher effectiveness (Koretz, 2002). Moreover, different models can produce different results for the same teacher (Harris, Sass, & Semykina, 2010; McCaffrey, Lockwood, Koretz, & Hamilton, 2004), as can different tests plugged into the same model (Papay, 2011).
These are all important points, but the unfortunate truth is that virtually all measures can be subject to such criticism, including the one that value-added opponents tend to support—classroom observations. Observation scores can be similarly imprecise and unstable over time (Measures of Effective Teaching Project, 2012). Different protocols yield different results for the same teacher, as do different observers using the same protocol (Rockoff & Speroni, 2010).
As states put together new observation systems, most are attempting to address these issues. For instance, many are requiring that each teacher be evaluated multiple times every year by different observers.
The same cannot, however, be said about value-added estimates. Too often, states fail to address the potential problems with using these measures.

Four Research-Based Recommendations

It is easy to sympathize with educators who balk at having their fates decided in part by complex, seemingly imprecise statistical models that few understand. But it is not convincing to argue that value-added scores provide absolutely no useful information about teacher performance. There is some evidence that value-added scores can predict the future performance of a teacher's students (Gordon, Kane, & Staiger, 2006; Rockoff & Speroni, 2010) and that high value-added scores are associated with modest improvements in long-term student outcomes, such as earnings (Chetty, Friedman, & Rockoff, 2011). It is, however, equally unconvincing to assert that value-added data must be the dominant component in any meaningful evaluation system or that the value-added estimates are essential no matter how they are used (Baker et al., 2010).
By themselves, value-added data are neither good nor bad. It is how we use them that matters. There are basic steps that states and districts can take to minimize mistakes while still preserving the information the estimates provide. None of these recommendations are sexy or even necessarily controversial. Yet they are not all being sufficiently addressed in new evaluation systems.

Avoid mandating universally high weights for value-added measures.

There is no "correct" weight to give value-added measures within a teacher's overall evaluation score. At least, there isn't one that is supported by research. Yet many states are mandating evaluations that require a specific and relatively high weight (usually 35–50 percent). Some states do not specify a weight but employ a matrix by which different combinations of value-added scores, observations, and other components generate final ratings; in these systems, value-added scores still tend to be a driving component. Because there will be minimal variation between districts, there will be little opportunity to test whether outcomes differ for different designs.
A more logical approach would be to set a lower minimum weight—say, 10–20 percent—and let districts experiment with going higher. Such variation could be useful in assessing whether and why different configurations lead to divergent results, and this information could then be used to make informed decisions about increasing or decreasing weights in the future.

Pay attention to all components of the evaluation.

No matter what the weight of value-added measures may be on paper, their actual importance will depend in no small part on the other components chosen and how they are scored. Consider an extreme hypothetical example: If an evaluation is composed of value-added data and observations, with each counting for 50 percent, and a time-strapped principal gives all teachers the same observation score, then value-added measures will determine 100 percent of the variation in teachers' final scores.
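A few lines of code make the arithmetic of that hypothetical concrete (the numbers are invented for illustration): when the observation component has no spread across teachers, every bit of variation in the final rating comes from the value-added component, whatever the nominal weights say.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100  # hypothetical teachers

value_added = rng.normal(50, 10, n)        # spread-out value-added scores (0-100 scale)
observation = np.full(n, 75.0)             # every teacher gets the same observation score

final = 0.5 * value_added + 0.5 * observation  # nominal 50/50 weighting

# With a constant observation score, all variation in the final rating
# comes from the value-added component.
print(np.var(final), 0.25 * np.var(value_added))   # essentially identical
print(np.corrcoef(final, value_added)[0, 1])       # correlation of 1.0
```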
System designers must pay close attention to how raw value-added scores are converted into evaluation ratings and how those ratings are distributed in relation to other components. This attention is particularly important given that value-added models, unlike many other measures (such as observations), are designed to produce a spread of results—some teachers at the top, some at the bottom, and some in the middle. This imposed variability will increase the impact of value-added scores if other components do not produce much of a spread.
Some states and districts that have already determined scoring formulas do not seem to be paying much attention to this issue. They are instead relying on the easy way out. For example, they are converting scores to simplistic, seemingly arbitrary four- or five-category sorting schemes (perhaps based on percentile ranks) with little flexibility or guidance on how districts might calibrate the scoring to suit the other components they choose.

Don't ignore error—address it.

Although the existence of error in value-added data is discussed continually, there is almost never any discussion, let alone action, about whether and how to address it. There are different types of error, although they are often conflated.
Some of the imprecision associated with value-added measures is systematic. For example, there may be differences between students in different classes that are not measurable, and these differences may cause some teachers to receive lower (or higher) scores for reasons they cannot control (Rothstein, 2009).
In practice, systematic error is arguably no less important than random error—statistical noise due largely to small samples. Even a perfect value-added model would generate estimates with random error.
Think about the political polls cited almost every day on television and in newspapers. A poll might show a politician's approval rating at 60 percent, but there is usually a margin of error accompanying that estimate. In this case, let's say it is plus or minus four percentage points. Given this margin of error, we can be confident that the "true" rating is somewhere between 56 and 64 percent (though more likely closer to 60 than to 56 or 64). This range is called a confidence interval.
In polls, this confidence interval is usually relatively narrow because polling companies use very large samples, which reduces the chance that anomalies will influence the results. Classes, on the other hand, tend to be small—a few dozen students at most. Thus, value-added estimates—especially those based on one year of data, small classes, or both—are often subject to huge margins of error; 20 to 40 percentage points is not unusual (see, for example, Corcoran, 2010).
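A back-of-the-envelope calculation shows why class-sized samples produce such wide intervals. This sketch uses the standard formula for the margin of error of an estimated mean, with hypothetical numbers; real value-added standard errors are computed differently, but the sample-size effect is the same.

```python
import math

def margin_of_error(sd, n, z=1.96):
    """Approximate 95 percent margin of error for an estimated mean, in the units of sd."""
    return z * sd / math.sqrt(n)

# A poll of 1,000 respondents on a yes/no question (sd of a 60 percent proportion ~ 0.49):
print(round(100 * margin_of_error(0.49, 1000), 1))  # roughly 3 percentage points

# A "class-sized" sample of 25, with the same underlying variability:
print(round(100 * margin_of_error(0.49, 25), 1))    # roughly 19 percentage points
```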
If you were told that a politician's approval rating was 60 percent, plus or minus 30 percentage points, you would laugh off the statistic. You would know that it is foolish to draw any strong conclusions from a rating so imprecise. Yet this is exactly what states and districts are doing with value-added estimates. It is at least defensible to argue that these estimates, used in this manner, have no business driving high-stakes decisions.
There are relatively simple ways that states and districts can increase accuracy. One basic step would be to require that at least two or three years of data be accumulated for teachers before counting their value-added scores toward their evaluation (or, alternatively, varying the weight of value-added measures by sample size). Larger samples make for more precise estimates and have also been shown to mitigate some forms of systematic error (Koedel & Betts, 2011). Value-added estimates can also be adjusted ("shrunken") according to sample size, which can reduce the noise from random error (Ballou, Sanders, & Wright, 2004).
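The shrinkage idea can be sketched in a few lines. This is a generic empirical-Bayes-style illustration with made-up variance numbers, not the specific procedure used by Ballou, Sanders, and Wright (2004): the raw estimate is pulled toward the overall mean, and the less data behind it, the harder it is pulled.

```python
def shrink(raw_estimate, n_students, noise_var, teacher_var, grand_mean=0.0):
    """Pull a raw value-added estimate toward the grand mean.

    The weight on the raw estimate grows with the number of students,
    so small-sample estimates are shrunk the most.
    """
    weight = teacher_var / (teacher_var + noise_var / n_students)
    return grand_mean + weight * (raw_estimate - grand_mean)

# Hypothetical variances; the same raw score of +8 points is trusted more
# when it rests on 90 students than when it rests on 12.
print(round(shrink(8.0, n_students=12, noise_var=400, teacher_var=25), 2))  # about 3.4
print(round(shrink(8.0, n_students=90, noise_var=400, teacher_var=25), 2))  # about 6.8
```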
Second, even when sample sizes are larger, states and districts should directly account for the aforementioned confidence intervals. One of the advantages of value-added models is that, unlike with observations, you can actually measure some of the error in practice. Accounting for it does not, of course, ensure that the estimates are valid—that the models are measuring unbiased causal effects—but it at least means you will be interpreting the information you have in the best possible manner. The majority of states and districts are ignoring this basic requirement.

Continually monitor results and evaluate the evaluations.

This final recommendation may sound like a platitude in the era of test-based accountability, but it is too important to omit. States and districts that implement new systems must thoroughly analyze the results every single year. They need to check whether value-added estimates (or evaluation scores in general) vary systematically by student, school, or teacher characteristics; how value-added scores match up with the other components (see Jacob & Lefgren, 2008); and how sensitive final ratings are to changes in the weighting and scoring of the components. States also need to monitor how stakeholders, most notably teachers and administrators, are responding to the new systems.
Another important detail is the accuracy of the large administrative data sets used to calculate value-added scores. These data sets must be continually checked for errors (for example, in the correct linking of students with teachers), and teachers must have an opportunity to review their class rosters every year to ensure they are being evaluated for the progress of students they actually teach.
Finally, each state should arrange for a thorough, long-term, independent research evaluation of new systems, starting right at the outset. There are few prospects more disturbing than the idea of making drastic, sweeping changes in how teachers are evaluated but never knowing how these changes have worked out.
All these exercises should be accompanied by a clear path to making changes based on the results. It is difficult to assess the degree to which states and districts are fulfilling this recommendation. No doubt all of them are performing some of these analyses and would do more if they had the capacity.

If We Do This, Let's Do It Right

Test-based teacher evaluations are probably the most controversial issue in U.S. education policy today. In the public debate, both sides have focused almost exclusively on whether to include value-added measures in new evaluation systems. Supporters of value-added scoring say it should dominate evaluations, whereas opponents say it has no legitimate role at all. It is as much of a mistake to use value-added estimates carelessly as it is to refuse to consider them at all.
Error is inevitable, no matter which measures you use and how you use them. But responsible policymakers will do what they can to mitigate imprecision while preserving the information the measures transmit. It is not surprising that many states and districts have neglected some of these steps. They were already facing budget cuts and strained capacity before having to design and implement new teacher evaluations in a short time frame. This was an extremely difficult task.
Luckily, in many places, there is still time. Let's use that time wisely.

EL Online


For another perspective on the use of value-added data, see the online-only article "Value-Added: The Emperor with No Clothes" by Stephen J. Caldas.

References

Baker, E., Barton, P., Darling-Hammond, L., Haertel, E., Ladd, H., Linn, R., et al. (2010). Problems with the use of student test scores to evaluate teachers (Briefing paper 278). Washington, DC: Economic Policy Institute.
Ballou, D., Sanders, W., & Wright, P. (2004). Controlling for student background in value-added assessment of teachers. Journal of Educational and Behavioral Statistics, 29(1), 37–65.
Chetty, R., Friedman, J., & Rockoff, J. (2011). The long-term impacts of teachers: Teacher value-added and student outcomes in adulthood (NBER Working Paper 17699). Washington, DC: National Bureau of Economic Research.
Corcoran, S. (2010). Can teachers be evaluated by their students' test scores? Should they be? The use of value-added measures of teacher effectiveness in policy and practice. New York: Annenberg Institute.
Goldhaber, D., & Hansen, M. (2008). Is it just a bad class? Assessing the stability of measured teacher performance (Working Paper 2008-5). Denver, CO: Center for Reinventing Public Education.
Gordon, R., Kane, T., & Staiger, D. (2006). Identifying effective teachers using performance on the job. Washington, DC: Brookings Institution.
Harris, D. (2011). Value-added measures in education: What every educator needs to know. Cambridge, MA: Harvard Education Press.
Harris, D., Sass, T., & Semykina, A. (2010). Value-added models and the measurement of teacher productivity (CALDER Working Paper 54). Washington, DC: Center for Analysis of Longitudinal Data in Education Research.
Jacob, B. A., & Lefgren, L. (2008). Can principals identify effective teachers? Evidence on subjective performance evaluation in education. Journal of Labor Economics, 25(1), 101–136.
Koedel, C., & Betts, J. (2011). Does student sorting invalidate value-added models of teacher effectiveness? An extended analysis of the Rothstein critique. Education Finance and Policy, 6(1), 18–42.
Koretz, D. (2002). Limitations in the use of achievement tests as measures of educators' productivity. Journal of Human Resources, 37(4), 752–777.
Lipscomb, S., Teh, B., Gill, B., Chiang, H., & Owens, A. (2010). Teacher and principal value-added: Research findings and implementation practices. Washington, DC: Mathematica Policy Research.
McCaffrey, D., Lockwood, J. R., Koretz, D., & Hamilton, L. (2004). Evaluating value-added models for teacher accountability. Santa Monica, CA: RAND Corporation.
McCaffrey, D., Sass, T., Lockwood, J. R., & Mihaly, K. (2009). The intertemporal stability of teacher effects. Education Finance and Policy, 4(4), 572–606.
Measures of Effective Teaching Project. (2012). Gathering feedback for teaching: Combining high-quality observation with student surveys and achievement gains (MET Project Research Paper). Seattle, WA: Bill and Melinda Gates Foundation.
Papay, J. (2011). Different tests, different answers: The stability of teacher value-added estimates across outcome measures. American Educational Research Journal, 48(1), 163–193.
Rockoff, J., & Speroni, C. (2010). Subjective and objective evaluations of teacher effectiveness. American Economic Review, 100(2), 261–266.
Rothstein, J. (2009). Student sorting and bias in value-added estimation: Selection on observables and unobservables. Education Finance and Policy, 4(4), 537–571.
Schochet, P., & Chiang, H. (2010). Error rates in measuring teacher and school performance based on student test score gains (NCEE 2010-4004). Washington, DC: National Center for Education Evaluation and Regional Assistance, U.S. Department of Education.
Matthew Di Carlo is a senior fellow at the Albert Shanker Institute in Washington, DC.
Copyright © 2012 by ASCD

Use Caution with Value-Added Measures




November 2012 | Volume 70 | Number 3
Teacher Evaluation: What's Fair? What's Effective? Pages 80-81
Bryan Goodwin and Kirsten Miller

When the New York City Department of Education released its Teacher Data Reports in February 2012, Pascale Mauclair found herself in the spotlight—for all the wrong reasons. The New York Post dubbed Ms. Mauclair, a 6th grade teacher at highly rated P.S. 11 in Queens, the "city's worst teacher." There was just one problem. It wasn't true.

First, the data were suspect: Of the seven 6th grade teachers in the same school, three received zero percentile scores, an unlikely scenario for a school rated in the 94th percentile of the city's public schools. Next, although Ms. Mauclair taught both math and English language arts, only six of her students had taken the language arts assessment, a number below the allowable reporting sample of 20 students. Her value-added rating was therefore based solely on the results for the 11 students who took the mathematics exam (for which the minimum reporting sample is 10 students). Such a small sample is prone to distortions. Further, her class consisted of immigrant students who were still learning English and who entered her classroom at different times during the year; some students took the exam when they had been in her class for just a few months (Casey, 2012; Clawson, 2012).

Clearly, the numbers didn't tell the whole story. Yet Mauclair, who was regarded by other teachers and administrators in this high-performing school as an excellent teacher, was held up to public criticism by those unaware of the realities of the situation (Casey, 2012). In light of her experience and the similar experiences of other teachers, we should ask what the research says about the accuracy of value-added measures of teacher performance.

Researcher Misgivings

In many ways, the value-added teacher measurement model is still in its infancy, having emerged only in recent years as sophisticated data warehouses made it possible to measure the average growth of an entire class of students over the course of a school year. However, researchers have warned that what seems so simple and straightforward in theory is incredibly complicated in practice. Here are a few of the pitfalls.

Non-teacher effects may cloud the results. Meta-analytic research conducted by Marzano (2000) found that teachers account for only about 13 percent of the variance in student achievement. Student variables (including home environment, student motivation, and prior knowledge) account for 80 percent of the variance. Value-added models don't necessarily isolate teacher effects from these other influences (Braun, 2005).

Data may be inaccurate. In the aftermath of the Pascale Mauclair incident, multiple factual errors surfaced in New York's data. For example, one teacher had data for a year when she was on maternity leave; another teacher taught 4th grade for five years but had no data (Clawson, 2012). Moreover, small samples—for example, classes with only 10 students—can paint inaccurate pictures of teachers because they are subject to statistical fluctuations (Goe, Bell, & Little, 2008).

Student placement in classrooms is not random. For a variety of reasons, schools seldom place students randomly in classrooms. As a result, some teachers find themselves with accelerated learners, whereas others, like Ms. Mauclair, may find themselves with more challenging students. Existing models do not adequately control for this problem of nonrandom assignment (Rothstein, 2008).

Students' previous teachers can create a halo (or pitchfork) effect. Researchers have discerned that the benefits for students of being placed in the classrooms of highly effective teachers can persist for years. As a result, mediocre teachers may benefit from the afterglow of students' exposure to effective teachers. Conversely, researchers have found "little evidence that subsequent effective teachers can offset the effects of ineffective ones" (Sanders & Horn, 1998, p. 247). As a result, the value-added ratings for effective teachers may be diminished because of previous, ineffective teachers.

Teachers' year-to-year scores vary widely. Perhaps one of the most troubling aspects of value-added measures is that the ratings of individual teachers typically vary significantly from year to year (Baker et al., 2010). For example, in one study, 16 percent of teachers who were rated in the top quartile one year had moved to the bottom two quartiles by the next year, and 8 percent of teachers in the bottom quartile had risen to the top quartile a year later (Aaronson, Barrow, & Sander, 2003).

Still Better Than the Alternatives?

In general, the year-to-year correlation between value-added scores lies in the .30 to .40 range (Goldhaber & Hansen, 2010). Although this correlation is not large, researchers at the Brookings Institution note that it is almost identical to the correlation between SAT scores and college grade point average (.35); yet we continue to use SAT scores in making decisions about college admissions "because even though the prediction of success from SAT/ACT scores is modest, it is among the strongest available predictors" (Glazerman et al., 2010, p. 7).

Similarly, more traditional measures of teacher performance have not been tremendously accurate. For example, until recently, many teacher evaluation systems only provided binary ratings: satisfactory or unsatisfactory, with a full 99 percent of teachers receiving satisfactory (Weisberg, Sexton, Mulhern, & Keeling, 2009). Moreover, researchers have found weak correlations between principals' ratings of teacher performance and actual student achievement; in general, principals appear to be fairly accurate in identifying top and bottom performers, but they struggle to differentiate among teachers in the middle (Jacob & Lefgren, 2008).

When faced with imperfect predictors of college success, colleges have learned to use a variety of measures to make decisions about which students to admit. The challenges posed by value-added measurement would suggest that schools take a similar approach. School leaders should heed researchers' consistent warnings against publicly releasing individual teacher ratings or relying heavily on value-added measures to make high-stakes employment decisions. But value-added measures might reasonably be considered as one component of teacher evaluation—when taken with a healthy dose of caution and considered alongside other measures.

References

Aaronson, D., Barrow, L., & Sander, W. (2003). Teachers and student achievement in the Chicago public high schools. Chicago: Federal Reserve Bank of Chicago.
Baker, E. L., Barton, P. E., Darling-Hammond, L., Haertel, E., Ladd, H. F., Linn, R. L., Ravitch, D., et al. (2010). Problems with the use of student test scores to evaluate teachers. Washington, DC: Economic Policy Institute.
Braun, H. I. (2005). Using student progress to evaluate teachers: A primer on value-added models. Princeton, NJ: Educational Testing Service.
Casey, L. (2012, February 28). The true story of Pascale Mauclair. Edwize. Retrieved from www.edwize.org/the-true-story-of-pascale-mauclair
Clawson, L. (2012, March 4). New York City's flawed data fuel right's war on teachers. Daily Kos. Retrieved from www.dailykos.com/story/2012/30/04/1069927/-New-York-City-s-flawed-data-fuels-the-right-s-war-on-teachers
Glazerman, S., Loeb, S., Goldhaber, D., Staiger, D., Raudenbush, S., & Whitehurst, G. (2010). Evaluating teachers: The important role of value-added. Washington, DC: Brookings Institution. Retrieved from www.brookings.edu/research/reports/2010/11/17-evaluating-teachers
Goldhaber, D., & Hansen, M. (2010). Assessing the potential of using value-added estimates of teacher job performance for making tenure decisions (Working paper 31). Washington, DC: National Center for Analysis of Longitudinal Data in Education Research.
Goe, L., Bell, C., & Little, O. (2008). Approaches to evaluating teacher effectiveness: A research synthesis. Washington, DC: National Comprehensive Center for Teacher Quality.
Jacob, B. A., & Lefgren, L. (2008). Principals as agents: Subjective performance measurement in education (Faculty research working papers series No. RWP05-040). Cambridge, MA: Harvard University John F. Kennedy School of Government.
Marzano, R. J. (2000). A new era of school reform: Going where the research takes us. Aurora, CO: McREL.
Rothstein, J. (2008). Student sorting and bias in value-added estimation: Selection on observables and unobservables. Paper presented at the National Conference on Value-Added Modeling, Madison, WI. Retrieved from www.wcer.wisc.edu/news/events/vam%20conference%20final%20papers/studentsorting&bias_jrothstein.pdf
Sanders, W. L., & Horn, S. P. (1998). Research findings from the Tennessee Value-Added Assessment System (TVAAS) database: Implications for educational evaluation and research. Journal of Personnel Evaluation in Education, 12(3), 247–256. Retrieved from www.sas.com/govedu/edu/ed_eval.pdf
Weisberg, D., Sexton, S., Mulhern, J., & Keeling, D. (2009). The widget effect: Our national failure to acknowledge and act on differences in teacher effectiveness. Brooklyn, NY: New Teacher Project.
Bryan Goodwin is vice president of communications, McREL, Denver, Colorado. He is the author of Simply Better: Doing What Matters Most to Change the Odds for Student Success (ASCD, 2011). Kirsten Miller is a lead consultant at McREL.
Copyright © 2012 by ASCD

Former Texas education commissioner Robert Scott sparked national revolt against high-stakes testing




by JEFFREY WEISS
Staff Writer
Published: 28 November 2012 11:26 PM

Robert Scott
Robert Scott says now that he never intended to inspire a national revolt against high-stakes standardized testing in public schools. But when the then-commissioner of the Texas Education Agency spoke out earlier this year, that’s pretty much what happened.

In January, Scott responded to questions at a meeting of the State Board of Education. He called overemphasis on test results at the local level a “perversion” of the system. And he said that the state’s reliance on those test results for more and more accountability measures was the “heart of the vampire.”

He made similar comments a few days later — to enthusiastic applause — at the annual midwinter conference of the Texas Association of School Administrators, or TASA.

Scott’s pungent comments were an emperor’s new clothes moment for those opposed to test-based accountability.

They also triggered a backlash from defenders of the system who say Scott didn’t object as it was being designed and didn’t work hard enough to help it succeed.

But Scott’s perspective is being voiced at many levels. This week, U.S. Secretary of Education Arne Duncan was in Dallas. During an interview, his position on high-stakes testing closely shadowed Scott’s position.

Federal testing requirements in the No Child Left Behind law need a “reset,” Duncan said. More than 30 states — Texas not among them — have applied to his department for waivers to the law and none have retained a singular focus on one-day standardized testing.

“Test scores can be a part of what you’re looking at, but should not be the fixation,” Duncan said.
His objections

In Texas, Scott’s critique has become a rallying cry in preparation for next year’s legislative session. That’s where any significant changes to state testing requirements would need to be approved.
In a recent interview, Scott listed aspects of the Texas system in need of what he called a midcourse correction:

•The requirement that STAAR test results represent 15 percent of the final grade for high school core classes.
•The use of STAAR almost exclusively to determine ratings for schools and school districts.
•The "four-by-four" high school graduation requirement of four years of math, English, science and social studies — each class tied to its own End of Course/STAAR test — which leaves little flexibility for class selection.
•The total absence of classes such as fine arts or career and technical education — none of which have STAAR tests — in accountability ratings.

These elements are all vital to the system that Scott supervised from 2007, when Gov. Rick Perry appointed him as commissioner, until he stepped down in July. That he was still in charge of the agency when he started to level his criticisms made them all the more influential.

First, his remarks became the core of a letter signed by more than a dozen North Texas school district superintendents. That letter became the framework of a resolution crafted by TASA that attacks high-stakes, one-day, one-test accountability. The resolution has now been approved by more than 85 percent of Texas local school boards.

Next, an organization called the National Center for Fair and Open Testing built on the Texas resolution and created a version it has taken around the country.

“He’s a real hero around here,” Robert Schaeffer, director of public education for the national test policy advocacy group, said about Scott.

And the ripples continue to spread. Just this week, the American Federation of Teachers announced a new PR campaign: “Learning is more than a test score.” The new message piggybacks onto a resolution that opposes what it called “the growing fixation on high-stakes testing” passed earlier this year at the AFT’s annual convention.

If Scott is a hero to one side, he’s something entirely different to the other. Sandy Kress is a former Dallas ISD board president, senior adviser on education to Rick Perry and George W. Bush, and a paid consultant for Pearson, the company that designs the standardized tests used in Texas and many other states.

But Kress was a high-stakes testing advocate long before he got the Pearson job. He’s a vigorous supporter of the current Texas system. Saying he’s speaking only for himself, he dismisses Scott’s comments to the state board and to TASA as nothing more than “retail demagoguing.”

If the system was misunderstood or being misused at the local level, it was Scott’s job to keep those things from happening, he said.

Not ‘some grand plan’

Scott, 43, has kept a relatively low public profile since leaving the TEA, though he's made speeches to groups of educators in and out of Texas. He's now working for an Austin law firm — he's a lawyer — and as a consultant on education and other topics.

Scott seems bemused at being viewed as either a standard-bearer or a lightning rod. He says he hadn’t exactly plotted out his broadsides over accountability. The questions came up, and he answered them.
“It wasn’t some grand plan,” he said.

When he announced in May that he’d be resigning, there were murmurs that he’d been nudged for speaking out against the system. Scott says it was more the other way around.

“I realized last Christmas there wasn’t much more I could do at TEA,” he said. “That was probably more a symptom of me getting ready to go. Here’s what I see on my way out the door.”

Within the relatively closed world of education officials, grousing about testing has gotten louder by the year. Why, then, did Scott's observations make such a difference?

“Because of his position and the state he comes from, he has a lot of credibility,” Schaeffer said.
Texas has been a national leader in pushing test-based accountability since Ross Perot chaired a Select Committee on Education in the 1980s. As governor, George W. Bush threw his full support behind it — and as president he took the idea to Washington as part of No Child Left Behind. Under Perry, the state only increased its emphasis on tests.

And Scott has been anything but a critic of the program since he started working at TEA in 1994.
“I’ve spent the last 20 years of my life developing this system,” he said last week. “I believe in it.”
In fact, he was in charge of the agency when the Legislature imposed some of the most controversial testing measures. Scott now says that some of that is overreach.

Critics like Kress want to know where his voice was as the policies were being crafted.
“The 15 percent rule was deliberated while he was commissioner,” Kress said. “I never heard a peep of objection to it from him.”

Scott says his ideas evolved.

STAAR is born

STAAR is part of an overhaul of the state’s education standards. First, the curriculum requirements, called the Texas Essential Knowledge and Skills (TEKS), were beefed up with the intention of pushing more students into college readiness. Then STAAR was supposed to represent better assessment of those skills than the old TAKS tests.

But as Scott and his agency released information about the changes, he discovered there wasn’t much interest in the new curriculum standards.

“The only way to appease people was to release a whole practice test,” he said.

Teaching to the test rather than the curriculum had become the goal in too many places, he said. Too many schools were filling their calendars with “benchmark” exams, testing whether students were able to succeed on STAAR-like tests.

Both Scott and Kress agree that’s not remotely what was intended.

Scott said he and his agency did what they could to redirect attention by teachers and school officials. Kress said that Scott himself should have been a more visible advocate, barnstorming the state and speaking forcefully about the value of using the system properly.

Kress and Scott sharply diverge about the limits of testing.

For instance, people unfamiliar with the system might assume passing rates for STAAR or TAKS simply represent an objective measure of competency. Not so much, Scott said.

“I’m the guy who set the passing standards for five years,” he said.

Yes, they were supposed to indicate something about competency. But he also tried to set them low enough to not overly discourage students and schools, yet high enough to create an incentive to work harder. And he tried to raise the bar a bit each year.

None of that subjectivity was visible either in the scores sent home to parents or in the accountability standards for schools and districts that depended almost entirely on the passing rates.
And then there are the unavoidable flaws in the tests themselves, Scott said.

“The most fundamental question facing the state right now is ‘Is the test infallible?’” he said.
Scott’s answer: It’s not. And that’s a problem.

“When you use it for so many high-stakes things, and you know it’s fallible,” he said, “you know there are risks.”
-----
WHAT HE SAID: Former TEA chief’s take on testing

Then-Education Commissioner Robert Scott, speaking to the State Board of Education in January:

•“You’ve reached a point now of having this one thing that the entire system is dependent upon. It is the heart of the vampire, so to speak. All you have to do is kill that, and you’ve killed a whole lot of things.”

•“I’ve been a proponent of standardized testing, for some things, and I want to continue to use it, for some things. But we have overemphasized it, and even if we haven’t overemphasized it specifically at the state level, the perception out there is that it is the end-all, be-all….”

• “The assessment and accountability regime has become not only a cottage industry but a military-industrial complex.”

Nov 29 Ethnoecology Blogs - Fall 2012: Ginkgo Biloba

Moderator’s Note: It is my privilege this time of the year to present the work of my undergraduate students at the University of Washington. This year I am presenting a series of outstanding blogs by students enrolled in my course on Environmental Anthropology. The large lecture-format course introduces students to the field of environmental anthropology, which includes the study of ethnoecology – the knowledge of ecology developed by indigenous and other traditional place-based peoples. The course also examines contributions from the field of critical political ecology, which focuses on the role of science in the politics of environmental law, policy, and social movements. Finally, we study aspects of environmental history, which focuses on the role of human societies – both small- and large-scale – in processes of ecological change and includes analysis of the history of ideas about the quality of the human-environment relationship. Like the class, these blogs seek to bridge all these approaches and more by providing entries that address local place-based knowledge and situate ethnoecology within the context of politics and history.

I am very proud of the work done by the students this year because they have demonstrated that the youth of the current ‘Millennial’ generation is as serious-minded and dedicated to creative and critical thought as any that preceded them. The students illustrate the value of collective work and the possibilities that unfold with collaborative group projects as part of a critical pedagogy that challenges the hyper-individualism of our mass society. It is refreshing to see these young minds create an intellectual community and contribute to the diffusion of the people’s knowledge of ecology. I also acknowledge the incredible contributions by my two graduate assistants, Claudia Serrato and Gabe Valle. They supervised the entire process of research, preparation, and editing of these blog entries. The results of their professional guidance and dedicated support of my students are superb. I am blessed to have such high-quality graduate students in my midst. I am also grateful to Erik Jaccard, who serves as our English-writing instructor and was masterful and skilled in preparing these entries for publication.

The second entry in this series is about a tree, the Ginkgo Biloba, an organism that has survived on and adapted to changes in our planet for hundreds of millions of years. The students write about the “deep history” of the relationship between humans and this tree, and about the traditional environmental knowledge (TEK) resulting from this ecological intimacy. Reading this entry, you will learn that Buddhist and Taoist monks venerate the Ginkgo for its long lifespan (approximately 2 to 4 thousand years), representing a strong, holy, and enduring life. The tree is a source of food and medicine and is also appreciated for spiritual and aesthetic values. The entry also explores the contemporary political ecology of the tree and the decline of the Ginkgo’s biodiversity, promoted in part by efforts to exploit it commercially in plantation monoculture tree farms and even by conservation programs.
The students conclude by noting how the “Biloba has affected human civilization for thousands of years, providing medicine and food, as well as symbolism in art, and will continue to do so as we learn more uses for it and come to understand its true value.”

Ginkgo Biloba: A TREE OF LIFE AND THE POLITICS OF ITS SURVIVAL


This is such beautiful, powerful, in fact mind-blowing work conducted by Dr. Devon Peña's students. It is a very well-written and well-conceived, and very touching, piece as well:
"It is estimated that there are over 300 pharmaceutical and clinical studies in Europe have researched or are researching the medicinal uses of Ginkgo leaf extract EGb 761. This extract is already being used, mostly in the form of herbal pills, to treat cardiovascular conditions, lung complications, and cognitive or memory disorders. It is used in these ways due to its anti-inflammatory properties, first discovered in ancient China. In the United States, the University of Maryland Medical Center has been conducting studies on the clinical effects Ginkgo can have on conditions such as depression, ADD and ADHD, chronic migraines, and vertigo as well. While the historical medicinal uses are vast, modern medical science is discovering more uses of this versatile plant every year.
Since the only native Ginkgo trees that survived the extinction were confined to what is now China, when people arrived in Southeast Asia, they grew to appreciate and cultivate the plant.  “[The] Ginkgo Biloba has been cultivated for more than 2000 years in China and for some 1000 years in Japan as a source of food, shade, and beauty” (Royerac).  

In Buddhist and Taoist monasteries, they kept trees alive because they were venerated as old creatures as the “the Ginkgo biloba tree has a life span of 2,000-4000 years” (Z. Pang).  Use became more widespread and written records show that it was used as a food source since at least 206 BCE and the Chinese soon came to honor and be proud of the tree. ‘

Yes, we need to save this tree because it is very valuable, life-extending, and beautiful, and there is still much to discover about its healing properties. Moreover, our early ancestors recognized its sacred, medicinal, and practical properties. Thanks so much for sharing, Devon. I hope this piece gets the attention it deserves.

-Angela

 
Trenton Dos Santos-Tam | Siva Hope | Sienna Landry | Alyssa Morant | Stephen Warner

Our objective is to learn about the native Ginkgo tree from an anthropological perspective and present our findings through the five lenses of classification, ethnobotany, agroecology, environmental history, and political ecology with a multimedia and collaborative approach. We chose the Ginkgo because of its prehistoric roots, spanning nearly 270 million years, as well as its deep history with humans. We will see how the traditional environmental knowledge of indigenous people who came into contact with the plant can create a better understanding of our surroundings.

A deep history
Fossilized leaf of an ancient Ginkgo

In the Jurassic Period, the Ginkgoales thrived with over 20 species. Fossil records show that the trees existed all over the world and were most diverse in North America, East Asia, and Europe, while absent from the equatorial regions. Some scientists think that dinosaurs helped spread the seed and consequently helped the tree flourish during this time. Along the same geological timeline, as the dinosaurs collapsed by the Tertiary Period, the Ginkgo Biloba remained the only Ginkgo left. Other factors likely contributed to the decline of the Biloba over millions of years, such as the great warming period that helped extinguish the dinosaurs and the subsequent ice ages, leaving it a living fossil. The Ginkgo Biloba is now the only living remnant of this family. The populations of Bilobas that survived were largely "in the low coastal and interior mountains straddling the Yangtze River" (Royerac).

Since the only native Ginkgo trees that survived the extinction were confined to what is now China, when people arrived in Southeast Asia, they grew to appreciate and cultivate the plant.  “[The] Ginkgo Biloba has been cultivated for more than 2000 years in China and for some 1000 years in Japan as a source of food, shade, and beauty” (Royerac).  
In Buddhist and Taoist monasteries, monks kept the trees alive and venerated them as old creatures, since "the Ginkgo biloba tree has a life span of 2,000-4,000 years" (Z. Pang). Its use became more widespread, and written records show that it was used as a food source since at least 206 BCE; the Chinese soon came to honor and take pride in the tree.
Human cultivation brought the tree to Japan and Korea around 1192 CE through Buddhist influences. Ginkgo nuts are first mentioned in Japanese texts in 1492, noting their uses for medicine and food as well as the tree's history. The Biloba was cultivated exclusively in Southeast Asia until Engelbert Kaempfer brought it from Japan to the Botanic Garden of Utrecht, Holland, in 1730, and then, in 1785, William Hamilton brought it from England to his estate in North America, the last major continent it reached.

From the time that the Ginkgo reached the European continent, it was used as an exotic ornamental tree for gardens, yards and public areas alike.  As its presence in the Western world increased, so did its popularity among gardeners and landscapers.  
It is a resilient tree that can thrive in nearly any industrialized nation's climate: "In cultivation Ginkgo tolerates a wide variety of seasonal climates, ranging from Mediterranean to cold temperate" (Royerac). Further contributing to its tenacity, "It has shown a resistance to insects, bacteria, viruses and fungi." The great perseverance of the tree can be attributed to these factors, and the veneration given to it by humans has partially stemmed from its strength.
The Ginkgo tree, in both its leaves and its seeds, has been used by humans for medicine, food, and art for thousands of years. Historically, we have seen the resilience of the tree through such events as the ice ages and even the atomic bomb at Hiroshima, making it a reliable resource for people. The first recorded use of the leaves is from China in 1436, externally to treat skin sores, and again in 1505, internally to treat diarrhea. Nearly all of the ethnobotanical uses of this plant originated in China or Japan, but today they are common throughout the world, including most European countries and America.

Linnaeus initially described the tree in 1771 and the specific epithet biloba derived from the Latin bis, meaning “two”, and loba, meaning “lobed”. This is seen in the shape of the leaf which is split in the middle, creating two lobes. Botanist Richard Salisbury is recognized for two names of the tree: pterophyllus salisburienus and the earlier salisburia adiantifolia proposed by James Edward Smith which may have been intended to denote a characteristic resembling adiantum, the genus of Maidenhair ferns.

The Ginkgo Biloba is classified in its own taxonomic group because of its rare seed formation. According to Arthur Cronquist, "the whole seed, except the embryo itself, is formed before fertilization, which occurs after the seeds have fallen from the tree" (130). Therefore, the classification is as follows: Kingdom Plantae, Division Ginkgophyta, Class Ginkgoopsida, Order Ginkgoales, Family Ginkgoaceae, Genus Ginkgo.

Another factor that separates the Ginkgo seeds from common tree seeds is the fact that they are not protected by an ovary wall and can morphologically be considered a gymnosperm. Because of this, the Ginkgo has been placed loosely in the divisions Spermatophyta and Pinophyta but no consensus has been reached. This separation has created a great controversy among taxonomists because “…both the morphologic distinctiveness of the Ginkgo phylad and its long and diversified fossil history contribute to the consensus, although the precise rank may still be debated.” (Cronquist 46)
The ethnoecology and political ecology of a sacred tree
                         
By the late 1800s to early 1900s, it was a popular "street tree" in urban areas on the East Coast. For more than 50 years, horticulturists have made use of the Ginkgo in parks and public places, commercial landscapes, and street tree plantings. In America it is seen as an ornamental tree, and "a number of selections have been released by horticulturists and foresters and a great number of cultivars developed [it] for ornamental purposes have been recorded." Consequently, it is considered a tree for collectors. Currently, the IUCN Red List of Endangered Plants lists the tree as endangered.

It is estimated that over 300 pharmaceutical and clinical studies in Europe have researched or are researching the medicinal uses of Ginkgo leaf extract EGb 761. This extract is already being used, mostly in the form of herbal pills, to treat cardiovascular conditions, lung complications, and cognitive or memory disorders. It is used in these ways due to its anti-inflammatory properties, first discovered in ancient China. In the United States, the University of Maryland Medical Center has been conducting studies on the clinical effects Ginkgo can have on conditions such as depression, ADD and ADHD, chronic migraines, and vertigo as well. While the historical medicinal uses are vast, modern medical science is discovering more uses of this versatile plant every year.
Nutritional supplements are an example of the commercialization of the Ginkgo Biloba
 
The seed of the female Ginkgo tree (the female is planted less often than the male because of the odor its fruit produces) can be used for food products, both ceremonially in indigenous cultures and for proclaimed health benefits, predominantly in Western society. In Japan, the Ginkgo's fleshy seed is roasted and eaten at ceremonial banquets and weddings, or it is ground up as a seasoning for stew or soup. It can also be used in Ginkgo tea, which some people claim brings longevity and generally better health. A 1996 sales report shows that Ginkgo-based health food products had sales upward of 270 million dollars in the United States. Humans have recognized the Ginkgo as a valuable provider for thousands of years, and with continual advancements in our understanding of the tree, humans' relationship with this plant isn't likely to cease.
                         
In addition to medicinal and food-based uses for the Ginkgo tree, the plant has also been used as an art motif and as a medium for art historically. In China and Japan especially, the Ginkgo leaf is an esteemed motif used for family crests, kimonos, jewelry and paintings or drawings.  
The petrified Ginkgo forest in Vantage, Washington, displays the fossilized wood of the tree as a medium for ancient petroglyphs preserved by the visitor center. The park also hosts an indoor exhibit that displays small and medium pieces of petrified Ginkgo wood that resemble images of people, animals, or objects, as well as historical and regional information about the trees. The continued artistic and cultural use of the Ginkgo tree speaks to how appreciated it has been by humans, historically and today.
                         
The ethnobotanical uses of the Ginkgo tree, whether as medicine, food, or art, all contribute to the preservation and protection of the species, which is currently endangered. As people come to acknowledge the interest and significance of this oldest of living trees, they will work to preserve the rare species.
Artwork depicting Ginkgo Biloba
Contemporary ethnobotany & commercialization
Humans today are contributing greatly to the survival and revival of the Ginkgo Biloba simply through appreciating its aesthetic and practical value. In the 1990s, farming of the Ginkgo Biloba began to develop in China through a joint venture program in which thousands of small scale farmers harvest the leaves. There are also Ginkgo Biloba plantations in the United States and France, created by pharmaceutical companies who require large amounts of the leaves for their health food products and medicine. Farming of the Ginkgo Biloba, both small and large scale, has provided one sustainable method by which humans are currently fostering the species. 
In addition to the farming of the tree, it is also being sustained through its increasing popularity in landscaping. Surrounding the Cleveland Indians' baseball stadium are two hundred and sixty male trees, and the project's landscape architect, Darrell Bird, stated, "I do not know what other tree we could have used…to get the effect we wanted."
In 2009, UN Secretary-General Ban Ki-moon planted a Ginkgo tree at a commemorative tree-planting ceremony to celebrate Seattle's contributions to protecting not only the local environment but the global environment. Ban Ki-moon also spoke at the University of Washington at a convention on ecological issues.
The use of Ginkgo trees in landscaping can be noted on and around the University of Washington campus as well. Multiple trees thrive on the campus, lining University Way and scattered throughout the University District and the Ravenna and Montlake neighborhoods. These trees can be found in cities across the United States, and whether in a backyard, lining a city street, or planted in masses, human use of the tree is undoubtedly aiding its recovery.

The recovery of the tree has even been included in policy-making across the country, including in Seattle. The City of Seattle's proposed tree regulations state: "Interim tree regulations implemented to limit tree loss outside of developmental process and to prevent the cleaning of trees prior to submission of a developmental proposal." This regulation recently saved a 32-inch dbh (diameter at breast height) Ginkgo tree from being cut down while the area around it was under construction.
As Ginkgo trees are extremely slow growing, this tree could be hundreds of years old at that size. Saving this tree means that not only will its beauty be recognized for generations, but the resilience of the tree as well. Seattle has been working for decades to become a more eco-friendly city and these tree regulations are one successful step in the process.  

The Ginkgo tree is a broadleaf, deciduous, and dioecious tree. Being dioecious means that male and female reproductive structures occur on separate trees, so a female tree must be pollinated by a male tree. In order for a new tree to grow, the male and female trees need to be close enough for pollination to occur. The female tree produces flowers and fruit, but it can take up to 20 years for them to appear. The 'fruit' that the female tree produces is about the size of a cherry tomato and has a soft, fuzzy skin. When the fruit drops, it gives off a foul odor, described as resembling rotting garbage, spoiled milk, and the like. Due to the malodorous stench, most people are partial to the male tree, which does not produce fruit or flowers. When planting the Ginkgo tree, most opt for the male, and the female tree is becoming less and less common.
                         
Because the male tree is seen by many landscape professionals and arborists as far preferable, there has been a great loss of biodiversity. This loss of biodiversity has led to the Ginkgo tree's classification as an endangered species. Old, native, original specimens of the tree are especially rare. Prehistoric Ginkgo trees have been found in Washington.
The construction of a highway in Central Washington led a local geologist to discover an area with several different types of trees, including the Ginkgo. This area was preserved and named the Ginkgo Petrified Forest State Park. Although there are only a handful of Ginkgo trees in this park, it was so named because of the rarity of the tree.
                  
As we have seen, the Ginkgo tree has been used for medicine and decoration for thousands of years. With this long tradition comes the ability to grow and ultimately cultivate the plant, the domain of agroecology. The Ginkgo is suited to moist, deep, and well-drained soils because its roots spread widely, but it survives in a variety of soil pH levels and climates, in both heat and cold. Planters usually fertilize the tree once or twice a year. The plant can also survive pest, fungal, viral, and bacterial threats, ozone and sulfur pollution, fire, storms, ice storms, and, in one instance, an atomic bomb. Typically, it grows in lowland forest valleys below the 6,000-foot altitude mark.

Native growers would have to wait for the seasons to start growing the plant. This process, called cold stratification, meant moving the seed from cold to warm conditions to signal when it should emerge. Germination would take place in sterile sand, and growers would have to fight off the mold that could attack the embryo before it emerged.

The Japanese have used the Ginkgo in their bonsai gardens. This art form grows miniature, aesthetically pleasing plants in containers. The grower shapes the plant to his or her ideals, more or less adapting it to suit a certain type of beauty. Shape, size, and proportion are considered, and there are now many varieties suited to particular growth heights, colors, leaf shapes, and weather conditions. Most Ginkgo bonsai trees are male, since they do not produce the odorous fruit that the female does. The Japanese also used traditional environmental knowledge to take the chichi (nipples), the outer growths that eventually reach the ground, and plant them upside down so that delicate leaves blossom from these little branches. Using those same techniques, they also grew the plant in some prefectures and at several temples as a national monument.

A mechanical center-pivot sprinkler irrigates a monoculture field of Ginkgo Biloba seedlings
Although the female makes odorous fruit, the seeds are still valuable for the Ginkgo tree and, economically, for the growers. The plant, which fruits only after three to five years, can produce up to 36 pounds of seeds per tree per year. The seeds and leaves fetch a high price per kilogram nowadays: "In the local market, seeds are sold at US$507 per kg and dried leaves at US$1.50 to $2 per kg. Many institutions have researched how to generate even larger yields annually." (Non-Wood News)

The Ginkgo tree has shown us the resiliency of certain life forms. While it was nearly extinct almost one million years ago and confined to China, it has been revived and now thrives on nearly every continent. Few plants have been known to survive from such a long time ago, and even fewer display such longevity in an individual plant. Because it is such an adaptable and long-lasting plant, we have become aware, as a group, that the Ginkgo tree is quite commonplace, including in the greater Seattle area.
We have developed an awareness of both the cultural and political aspects of the plant and how these intertwine with one another. By the same token, humans have greatly influenced the resiliency of the Ginkgo Biloba as well, yet the Ginkgo Biloba has affected human civilization for thousands of years, providing medicine and food, as well as symbolism in art, and will continue to do so as we learn more uses for it and come to understand its true value.
To learn more, use this link: Ginkgo Pages

In recent news:
For a PowerPoint File: Ginkgo PowerPoint Presentation

How Blogging Can Improve Student Writing


Blogs are indeed a higher-stakes kind of writing, but their most important aspect is their meaningfulness. And so yes, writing should improve as a result, since writing for meaning is what makes thoughts and words powerful.
-Angela

How Blogging Can Improve Student Writing


Published Online: November 28, 2012

Command of the written word is a vital 21st-century skill, even if we are using keys, buttons, and tablets instead of pens and pencils. In fact, in our digital world, communication is now more instantaneous than ever.

How do we prepare our students to meet the challenge?

Blogging can offer opportunities for students to develop their communications skills through meaningful writing experiences. Such projects not only motivate students to write, but motivate them to write well. Furthermore, student-blogging projects can be designed to address the Common Core State Standards for writing. For example, see anchor standard six, which calls upon students to use technology to "produce and publish writing and to interact and collaborate with others." Score!

The Process

So how can you get started with student blogging? These steps helped my students have successful writing experiences:

[Continue reading here ]

Jeb Bush Urges Allies to Stand Firm on Ed Policy - State EdWatch - Education Week


On Tuesday, Bush spoke to a packed house at a downtown hotel in the nation's capital, promoting his favored policies in areas such as private school vouchers, test-based accountability, tougher forms of teacher evaluation, and school technology. The receptive audience included current and former officeholders, business and philanthropic officials, state schools chiefs, and others.

Because of the nation's backlash—mostly from parents—against not only test mania but also the way these tests are used to privatize, corporatize, and marketize schools and all things education, Jeb Bush, family, and friends should indeed feel embattled....

-Angela

Teachers' Contract Includes Peer Review


New Jersey seems to be on the right track here with respect to teacher evaluations. I like the professionalism upon which their new and emerging system is premised.

-Angela