Saturday, October 14, 2006

I (heart) Testing

At the American Federation of Teachers (AFT) blog, questions have been raised about the merits of testing: whether it is necessary, whether we need to test yearly, and whether data sampling should be used rather than actual data. The AFT even held a panel discussion to debate the issue.

This is one of those instances where you get the sense that critiquing the argument gives it attention it doesn't deserve, but what the hell.

Not only do we not need less testing, we may in fact need more testing, and we certainly need better testing.

We need tests that produce disaggregated data, that shift away from using schools and districts as the essential unit of analysis, and that instead turn toward analyzing classrooms in this way. Test data is the educator's report card, and we need to be held accountable to it, if only as a small attempt to rectify the imbalance between how poor teaching affects life outcomes of the teacher, versus how it affects life outcomes of the student. High-stakes tests are fine -- life is high-stakes, so let's prepare kids for being in environments where standing and delivering matter. They are mildly intrusive and difficult, but hell, the workload of Friday quizzes at my school at least equals, and may actually exceed, the two hours of CSTs we administer in May.

I can't help but feel some of the outcry about all this is a desire to avoid a reckoning, avoid a public assessment of effectiveness and quality.

So yeah, as we fight against the reduction of tests, we must also advocate for better testing. In California, our tests are summative, but they are not prescriptive or diagnostic. CST results provide a general sense of where kids are in relation to the standards, but they do not go nearly far enough. What does it mean, in terms of instruction, when a kid scores Below Basic? What do they really need to know to make up the lost ground? Our tests do not tell us this. When kids score Far Below Basic, where are they, truly? Our tests do not tell us this, and they do not allow students to demonstrate (at times multi-year) growth within that performance band. CST and like tests are effective summative assessments. They inform student grouping and placement, but once Day 1 begins, they have lost all instructional value.

California's API suffers in similar ways. It attempts to be a growth-based measure, but time-1 is calculated at the end of school year X, and time-2 is the end of school year Y. This must change, because it compares growth across different populations. This is especially damaging to schools with a small grade range (we used to be 7-8), where it is necessary to grow beyond the achievement levels of the previous year with large percentages of new students, and with no way to build upon the successes of the past year because those kids have already graduated. API, then, should measure from the first day of school to the last.
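
To make the different-populations problem concrete, here is a minimal sketch with invented numbers. It is not the actual API formula; it just uses plain averages of made-up scale scores, and the student labels, scores, and function names are all assumptions for illustration. It compares the end-of-year-X versus end-of-year-Y view with a first-day-to-last-day view of the same kids.

```python
from statistics import mean

# Hypothetical scale scores for a small 7-8 school; every number is invented.
# Each student maps to (score on the first day, score on the last day).
year_x = {"a": (280, 330), "b": (300, 340), "c": (310, 350)}
year_y = {"d": (240, 300), "e": (255, 315), "f": (270, 330)}


def end_to_end_growth(prev_year, this_year):
    """Change in average end-of-year score between two different groups of kids."""
    prev_end = mean(end for _, end in prev_year.values())
    this_end = mean(end for _, end in this_year.values())
    return this_end - prev_end


def within_year_growth(year):
    """Average first-day-to-last-day gain for the same group of kids."""
    return mean(end - start for start, end in year.values())


print(end_to_end_growth(year_x, year_y))  # negative: this year's kids finished lower
print(within_year_growth(year_y))         # positive: those same kids still gained
```

In this made-up case the year-Y students gained a lot over the year, yet the school appears to decline, simply because the incoming group started further behind than the group that left.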

Assessment is part of the cycle of teaching and learning, and while there are probably extreme examples of over-testing and over-preparing, I can think of only a few valid reasons why we should advocate for less testing.

1) The tests lack all validity or reliability.
2) Success on the tests does not translate to mastery of the standards (e.g. High Point unit assessments) because the tests are not aligned to standards, are aligned to below-grade-level standards, or are otherwise fundamentally flawed.
3) The material to be assessed is considered to be of relatively little value (i.e. the standards have been poorly chosen, written, or assembled).
4) Truly, truly, assessment has become so frequent that it impedes instruction. And if you're gonna use this one, you better be really bringing it every minute of the (non-testing) day.


Mrs. B said...

For once I disagree with you (sort of). I think that testing gets in the way of instruction. It really irritates me that I have to take precious time from my units to prepare and administer tests. The kids feel no urgency about these tests, so it just becomes a chore for them (and me). Then the results are never delivered to the teachers in an effective manner, and we are left thinking, "That was a week I will never recover."

On the other hand, if testing is here to stay, then you are right. It has to be more meaningful, and the kids have to KNOW that it MATTERS. Otherwise it is just so many watermelons stacked up in the warehouse.

8:11 PM

Anonymous said...

Part of the point, at least as I took it, was that if testing were as high-stakes for teachers, they'd find a way to motivate their kids a wee bit more. If it mattered to a teacher's pay (or job), you can be sure s/he would make it matter to the kids.

8:41 PM

Anonymous said...

Or it would provide an incentive to teachers to get students transferred or expelled to make their class scores look better. To the extent that this correlates with better learning in the class without the troublesome kids, there may be some point to it, but I'm sure we all feel a bad taste at the idea.

9:05 PM

Anonymous said...

For those teachers in designated program improvement schools, testing certainly IS high stakes. Tests, with all the flaws mentioned in the post, matter so much that we focus really on nothing else. Schools in my area are facing reconstitution or charter conversion. Jobs ARE on the line. I'm not sure this was the point of the post, however. Tests should be meaningful, is that it? Who can argue with that?

9:16 PM

KDeRosa said...

I find it hard to believe that any teacher who is interested in finding out if students are actually learning could have any objection to testing.

Testing is merely feedback from the students indicating whether the student has learned what the teacher has taught.

It provides a check for the students who have learned, allowing the teacher to move on, confident that what is being taught has been learned.

It also provides a check on the students who aren't learning, so that a remedy can be applied, such as reteaching what was taught.

Often, this remedy step isn't applied and the feedback information isn't used. Thus, the testing can be seen as being of little use. But is it the testing or the response to the testing that is at fault?

6:13 AM

Laura said...

I like being able to tell how far my students have advanced, to monitor and assess their progress. And I agree with your support of testing--to an extent.

We are asked in my district to administer a writing test "benchmark" every month. Now what do you suppose that benchmark is going to tell me that I couldn't have figured out from regular interaction with class projects and other forms of assessment not administered with the timed, specified constraints of a writing test?

In the meantime, that's a whole 90-minute class period every month I could have been going over useful strategies, techniques, approaches, and remediation for the struggling writers.

Last year we were supposed to turn those scores in, which, of course, varied according to the topic and the kid's commitment that day. The powers that be drew smiley faces when scores went up and question marks when they went down. And what did anyone gain from that?

6:54 PM

TMAO said...

Laura, I hear you on the frustration around districts not knowing what to do with the data they've required, but...

You were gonna assign and evaluate those essays anyway, right? For your benefit and your own performance tracking?

10:56 PM

Got Thoughts? said...

I am not familiar with what the California tests look like. Are they all multiple choice and essays?

In Washington State we have a statewide test called the WASL (Washington Assessment of Student Learning) that is aligned with our state standards, called EALRs (Essential Academic Learning Requirements). The EALR functions as the main idea, with components that make up each EALR, and then each grade level (K-10) has a GLE (Grade Level Expectation). For example, here is one writing EALR:
EALR 1: The student understands and uses a writing process.
Component 1.1. Prewrites to generate ideas and plan writing.
GLE: (kindergarten): 1.1.1. Uses pictures and talk for thinking about and planning writing.
The same EALR and component for a 10th grader has this GLE:
1.1.1. Analyzes and selects effective strategies for generating ideas and planning writing.

The WASL is this amazing test with a real love-hate element. I love that we are using it just to measure each individual student against themselves rather than comparing them to one another. I also love that the test tries to be a measure of thinking, rather than selecting the correct multiple choice answer. The horrible thing about it is the disservice it does to ELL students who cannot clearly express their thoughts and ideas through written means. The test is ONLY given in English, and teachers and proctors are not allowed to even look at the test booklets. They can’t answer any questions unless it is to clarify the directions. The students are completely on their own.

These tests are rigorously graded by two separate scorers (often by teachers in the summers). If the two individual scores are not the same, a third scorer weighs in. On our state education website, you can see some of the decommissioned test questions along with annotated scoring guides.

3:49 PM

Anonymous said...

I wish we could come up with more authentic assessments. When kids do care they get stressed and stress in little ones is just not pretty and just not fair!

However, I agree with Mrs. B. "The kids feel no urgency about these tests, so it just becomes a chore for them (and me)." It's frustrating for me too! Just why are they filling in all of these little bubbles anyway?

9:29 PM

TMAO said...

But, but, but...

If kids aren't motivated, that's OUR fault. Ours.

I wish we could come up with more authentic assessments for writing. The stuff GT? writes about sounds pretty good to me. But in terms of reading and understanding -- an all around important skill -- I think the tests do okay.

Whether we're doing okay or not is another question entirely.

9:12 PM

Todd said...

I strongly disagree with you on your last point. It's not my job to motivate my students to do well on a test that doesn't matter to them. Students who see these tests as a waste of their time because they understand the test results don't affect them in any way are thinking more critically than those students who exert maximum effort on everything set in front of them. If a student shrugs off the CSTs because he has a big biology project due next Friday and needs to focus on that, that student has his priorities straight.

If students aren't motivated to do well on a standardized test, that may be "our" fault, as in the entire school system, but it's certainly not "our" fault, as in classroom teachers. The fact is that students can completely blow off STAR and CST tests. Nothing will happen to them, just like nothing will happen to them if they perform incredibly well.

Further, if my students and I received test data back before the end of the year, it might actually mean something. But when test data isn't released until August, I only get to see how my students-to-be did on those tests, not my students from the previous year. And those students from the previous year get a meaningless report showing their test data because the test doesn't impact them. No scholarships, no course placement based on those test results, nothing.

If we had test data quickly and in time (test data in May doesn't help at all: it's too late; test data in August doesn't help either: it's far too late) and if there was a school-wide push for explaining why these tests are important and if students had reason to do well on those tests, I'd agree with you and that's ultimately what I want. Sadly, none of those things are universally true right now. If they are true on your campus, consider yourself lucky and send me an application for your school.

12:11 AM  
