Testing to Death

By Watson Scott Swail, President & Senior Research Scholar, Educational Policy Institute

This summer, Education Week’s Scott Cech wrote about test scores in 12 states, based on a study authored by Bruce Fuller of UC-Berkeley and the Policy Analysis for California Education (PACE) at Stanford. The report finds that while academic progress appears to be increasing in mathematics, the growth rates are below those posted before and during the enactment of the No Child Left Behind Act (NCLB). Additionally, the study finds that in reading, there has been no closing of the gap by race/ethnic groups since 2003, although scores were closing before NCLB, and only Latino students have continued to make progress in mathematics.

A number of other studies, including one also reported in Ed Week the same week on students in Chicago Public Schools, show outcomes that are all over the place in terms of educational progress.

It seems to me that part of the problem is how we test students and how significant or “real” the numbers are. There is a danger in breaking out only a few pieces of data from a large dataset.

In 1993, AFT president Albert Shanker said that “we need less frequent but far better testing” (ETS, 1999, p. 4). Almost 15 years later, it is arguable that the opposite is true: we have more frequent and less effective testing.

Testing is clearly driving public education in the United States. As I’ve stated before, this isn’t just because of NCLB. Testing was alive and kicking well before NCLB. In 2001, before enactment of NCLB, I conducted a site visit of a school in Olympia, Washington to look at their use of technology in the classroom. The elementary classroom was using AlphaSmarts, a keyboard-based portable tool that allows students to type in classroom notes and assignments. A very cool little device when integrated prudently into the curriculum. However, during the site visit we noticed that the AlphaSmarts were literally packed away in a closet. I asked the teacher why this was the case, and she said that, for the past two months, the AlphaSmarts had been packed up in order to focus on the “WASL.” The WASL, for those not in the “know,” is the acronym for the Washington Assessment of Student Learning, which has been Washington’s high-stakes test for students since the mid-1990s. Essentially, teachers were teaching to the test.

This isn’t just a Washington State phenomenon. We have heard the same stories for the TAAS (Texas Assessment of Academic Skills) in Texas, the CAT (California Achievement Test) in California, and the SOLs (Standards of Learning) in Virginia. With the passage of NCLB in January 2002, state testing only increased. But is it better?

There is nothing wrong with testing. As an educator, I must say that testing for diagnostic and comparative purposes is essential to (a) understanding the learning deficits of students; (b) weighing a teacher’s ability against that of other teachers, as well as measuring the teacher’s pedagogical competency; and (c) measuring the progress of schools, districts, and states in particular academic areas. But we do each of these things poorly in most cases.

First, testing conducted today in support of state requirements and NCLB is rarely used for diagnostic purposes, which should be the primary reason for testing. In Virginia, we get a statement each year about our children’s progress on the SOLs, but I have no indication that any of this information is used to alter the teaching in the schools to focus on the deficits rather than the strengths.

Second, some states, like North Carolina, do use state testing to measure teachers and schools. This, of course, has not been without its critics, who complain that it is a misuse of student testing. Not necessarily so. It may be a harsh reality, but outcomes are outcomes.

And third, the NCLB effort was conducted, in part, to provide a measuring stick for schools, districts, and states with respect to the academic growth and performance of students. This can be done within states (although the ability to do this is variable across states), but it cannot be done well across state lines because most states use different measures. This was done as a means of political expediency during the drafting of NCLB legislation: states wouldn’t sign on to NCLB (nor would the senators and members of Congress) if the federal government were to legislate the test that would be used to measure student progress. After all, education is a state responsibility. However, this aspect of NCLB is the Achilles heel of the entire testing phenomenon in the US. One of the primary reasons for having the tests can’t be delivered: comparison.

I am a firm believer that curriculum really doesn’t need to be dramatically different, if different at all, from state to state, province to province. For the most part, mathematics is mathematics, and chemistry is chemistry. In social studies, there are regional issues, but most US students should be learning about the same national history, and perhaps a whole lot more world history and politics. The study of English and literature is the same: I can’t be convinced that the needs in Oregon should be different than those in Maine.

I’ve harped on this before, but it is an important point. Each state legislates what is to be learned in public and private school. Over the past 15 years, we have started to focus on national “standards” in science and mathematics and other areas, but still, there is incredible latitude in what states can mandate for high school graduation, which, in turn, has a tremendous impact on what is mandated for college admissions.

Because of the variability, an enormous plethora of “tests” is needed. Some states choose to use nationally-normed tests, while others, like Washington State, create their own, which are not normed outside of the state. Thus, with the exception of other tests that students may use, like the ITBS, or the SAT 9 or 10, which are special tests of ability, there is little to compare. If I’m the Superintendent of schools in Washington, I want to know how my students compare to the world. True, we have the NAEP, but why not use tests that provide a diagnostic service to students?

This isn’t just a K-12 issue, either. Earlier this morning, InsideHigherEd.com reported on the US Department of Education’s focus on developing learning outcome measures via a series of grants to associations. The article also notes the pledge by Miami-Dade College to embrace the institution’s new “learning outcomes” initiative. If we think that coming up with well-regarded, empirically-sound, and universal tests to measure achievement in K-12 is tough, I’m utterly confused how it is expected to be done at the postsecondary level, which is almost completely unencumbered by any standards in any way, across any discipline or any topic, with the exception of the professional fields, such as engineering, accounting, medicine, and law (which, of course, do use UNIVERSAL tests to measure proficiency, such as the LSAT and MCAT for law and medicine).

But back to K-12 to conclude: the President is now pushing an expansion of NCLB in the current reauthorization, which is unlikely to occur before the next president is sleeping at 1600 Penn. And to make this more difficult, the President yesterday announced that he is willing to veto any NCLB bill he doesn’t like, which is understandably meant to corral the Dems but could mire the legislation until 2009.

Is the President right or wrong? That’s arguable, for sure. But perhaps the late Al Shanker had it right: we are mired in a system of education that has far too much testing that simply isn’t good enough. We’d be better off with less and better testing that is more universal and less specialized for curricula that, for some reason, just have to be different across the country.

Have a nice weekend.
