New Test Items

In a stack of papers called Testing.

  • Oct 10, 2009

Every time you create a new test item, a new way to assess knowledge, what do you do? How do you step kids into that new item type, one they may not have seen before, in order to make sure you’re assessing exactly what you want to assess (content-area skill) and not something accidental (test-taking prowess)? I spent about two hours on Wednesday figuring out how to handle this situation. What would you have done?


You give biweekly vocabulary tests. Each week, your students get seven new words. You talk with them about the definitions, trying to put dictionary speak into plain English, telling (funny) stories about the words, and asking for situations in which the word would be used. Because you do all this, you decide to replace the vocabulary test format you’ve had for years with something new.

You figure that, since you verbally review words by giving examples, the latest vocabulary test should include a section where you provide a scenario (Stephan doesn’t care) and students fill in the appropriate vocab word from a word bank (apathetic). You go a bit overboard with it: a sixteen-item section with a twenty-word bank to add a wrinkle. You realize that, on a seventy-eight-point test, this new section is worth forty-eight points, but you still think it’s fair.

Additionally, you have a student teacher, and the university supervisor stops in on the day said student teacher is correcting this test with his class. Afterward, the university supervisor looks over the test and remarks that it perhaps tests students’ ability to adapt to new test items more than it tests vocabulary skill. Your student teacher tells you this during your prep. When students bomb that portion of the test, not in droves but in some statistically significant fashion with several possible explanations, you nod. And that critique passed on from the university supervisor tickles the back of your brain all afternoon.

At first, denial. You’ve reviewed the words a lot of different ways and feel that these new items are not tricks at all. Next, anger. So what now? Any time a new test item is included, you make it extra credit? There’s nothing wrong with putting new and better test items in place of old and busted ones. The bargaining settles as you decide to cut the total points possible to the highest score in the class, thinking that will keep the test valid. Depression sinks you and you just stare at the test, wondering what you were thinking making that new section worth so many points. Acceptance happens. Bringing some of your close colleagues in, you soon expand your thinking and decide you’re going to do something about it.

Two hours later, you have it all worked out: copies in hand, plans laid, explanation given to your student teacher. That test will be thrown out; it’s now a study tool. The vocab review the next day will take the form of those new test items. A brand-new test will be given Monday. This new test will have five scenario-fill-in items and five definition-fill-in items, along with a ten-word bank so that process of elimination helps. It’s a set of steps marching toward eventually giving a test like the one this whole mess started with.

Due to all this, you do not get that stack of twenty papers graded today, and all hope of passing those pieces of writing back by Friday flees.


What do you do when you feel like a new test item is in order? Which test items do you think are fair to include from the beginning of the year? Is matching such a universal test item that it doesn’t need scaffolding? How do you handle that idea that dawns on you as you’re making tomorrow’s test? What do you do if you’re making that test and finally realize that those test items you’ve been including don’t do what you want them to?


1. Josh says:

[10/11/2009 - 5:07 am]

It seems far more likely to me that your old tests were the ones testing the skill you don’t want to test (how to memorize the definitions of vocab words and match items on a test without actually understanding said words) and that your new test is actually closer to testing the skill you want to test (whether a word has been learned well enough to understand how to use it). It also seems to me that standard matching tests with process of elimination test test-taking skill more than vocabulary (or whatever it is you are testing). It’s just a skill that the kids have had a long time to build up, so you don’t notice.

What I am trying this year is to design my tests and quizzes to test the harder, more authentic skill as often as possible and then count later tests and quizzes for more than earlier ones. That, along with comprehensive tests, tells the kids that they have to learn the material and that they will be rewarded for doing so. So far it seems to be working, but it is a bit early to tell.

Bottom line: don’t assume that, just because kids do badly on a test, there is something wrong with the test. It is just as likely that the new test is actually measuring the skill you want, and the unpleasant truth is that the kids haven’t actually learned that skill.

2. Todd says:

[10/11/2009 - 2:07 pm]

There is still a matching section on the new test, Josh. I know that’s a basic method, but I do feel it tells you whether or not students know the definition. The definition possibilities are usually worded differently from the dictionary definitions we settle on, so I’m not too worried about students matching based solely on rote memory. I will think about a matching section with more definitions than words. I’m not sold on matching as a great test item, but I do think it serves a certain basic function.

What subject do you teach and how has your quest toward better tests manifested itself? What do your tests look like now as opposed to in previous years?

If kids do badly on a test that has a brand new test item on it, I’m thinking it’s the test that needs to be modified. There are lots of reasons for poor performance. If, after the review and discussion I gave about the new test item and the tests where more and more of these items will appear, students still perform badly, I’ll be more inclined to believe that they don’t have the skill I’m testing. Right now, though, the unpleasant truth learned here is that I didn’t scaffold this enough.

3. Josh says:

[10/16/2009 - 11:40 am]

I teach a lot of ESL students in chemistry and physics. Both subjects require a lot of understanding of English in order to grasp what is really going on in anything more than an imitative, monkey-see-monkey-do manner. So I end up teaching a lot of vocabulary.

I’m not at all sure that I’ve found a good way of doing this, though. I liked your original test item; it’s intriguing and potentially measures vocabulary itself better than rote definition recall does (which I feel matching almost always tests, even when the definitions are not the book definitions).

Teaching a lot of ESL students has taught me that nearly all tests have reading comprehension as their most important skill. If a student lacks reading comprehension, they will do badly on the test regardless of whether they understand the skill being assessed.

Probably, any new test item increases the chance that reading comprehension is the first skill being tested. Because the item is new, a student has a good chance of misunderstanding the problem if they don’t read it correctly.

I find short-answer questions to be the best, like “define the word” and “use the word in a sentence.” But these are nightmares to grade in a timely fashion, and I swing toward and away from them as my appetite for lack of sleep ebbs and flows.

I meant that unfortunate comment in the light of testing in general. I’m not sure many tests actually measure anything close to what we pretend they measure. But that is life, and education, and we will work at improving it…