One piece of the classic item writing guidance is to “avoid trick items,” even as authors of that guidance admit that there’s no definition of trick items. Content review committees sometimes point to items that they do not like as being “trick items,” though they also cannot define the term.
I think I can explain it, and explain why the idea is superfluous.
Let’s begin by considering trick questions, outside of the context of assessment. Trick questions are those designed to trip us up. They somehow catch us in a mistake that we were not looking for. They rely on an inappropriate assumption or some other common foible. For example, they might rely on our assumption that “A or B?” requires us to pick just one answer. Or on our ingrained sexist assumption that surgeons are men. They often rely on a sort of sleight of hand, suggesting to us that they are testing us in one way, when they are actually fooling us in another.
Does this idea apply to assessment items? Is this a useful thing to look out for? I think not.
First, of course we want assessment items to offer opportunities for test takers to demonstrate their mistaken thinking and their misunderstandings. Our goal is to figure out what test takers can do and do know, but also to figure out their limits. We want to know where they might benefit from additional instruction, or where a curriculum falls short. We might want to know whether there are holes in their knowledge that should prevent the awarding of a professional license. Items designed to catch mistakes? Yes, that is a good thing.
Second, high quality test items should be designed to catch particular kinds of mistakes: mistakes with the targeted cognition. Items designed to measure a particular alignment reference or standard should create opportunities for test takers to show their proficiency with that targeted cognition, and to show any lack of it. Other sorts of mistakes should not be captured by the item. There should not be any sleight of hand about the kinds of mistakes or misunderstandings that the item reveals. In this, items should not resemble trick questions.
Third, and on the other hand, selected response items should include the most common mistakes that test takers might make with the targeted cognition. That is, they should try to catch test takers who lack proficiency there. This is not unfair; this is the point. If item reviewers see that an item would trip up many of their students because it features opportunities to make those common mistakes, rather than protecting them with guardrails that make those mistakes less likely, the item is probably a better item for it. In this, items should resemble trick questions.
So, what is a trick item? Well, some poorly written items provide opportunities for other sorts of mistakes and/or misunderstandings to trip up test takers. That is construct-irrelevant variance at the level of the alignment reference or standard. Those are already bad items, and we do not need the term “trick item” to recognize that. But items that intentionally set up test takers to fail because of some common misunderstanding or assumption? Well, provided that it is a flaw in their understanding of the targeted cognition, that is a good item. Calling it a problematic “trick item” presumes that test takers should be protected from tests, and that tests should not look for the shortcomings in their proficiencies. In this case, the term is counter-productive.
So, trick items? No, there’s no need to avoid them, or even to use the term.