Fisking the Haladyna Rules #29: Make distractors plausible

[Each day in October, I analyze one of the 31 item writing rules from Haladyna, Downing and Rodriguez (2002), the super-dominant list of item authoring guidelines.]

Writing the choices: Make all distractors plausible.

This might be the most important principle in all of multiple choice (MC) item development, and that makes it incredibly important to large scale standardized assessment, given the dominance of MC items on standardized tests. But Haladyna et al. fail to explain in their 2002 article what makes a distractor plausible. Note that there is a different rule about basing distractors on common test taker mistakes (i.e., Rule 30), so it cannot be that.

Their 2004 book provides a brief explanation, but still separates test taker errors from plausibility. They write that distractors should “look like a right answer to those who lack this knowledge” (p. 120). My regular co-author and I call that shallow plausibility: those who lack the desired proficiency cannot easily and quickly dismiss the distractor as incorrect. This idea of shallow plausibility undermines (or subsumes) most of Haladyna et al.’s advice on cluing (be it in Rule 28 or any other rule) because it shifts the issue entirely into a different frame. Like their 2002 article, their 2004 book appears to equate “plausible” with “effective” and to judge both by how many test takers select each distractor.

But is that a decent standard for judging items and distractor effectiveness? If you care about validity (content validity, construct validity, or validity evidence based on test content), then it clearly is not.

Items aligned to easier assessment targets should be easier, and fewer test takers should select the distractors for those items. Of course, that raises the question of what “easier” means. For assessment purposes, easier is not an intrinsic quality of the targeted cognition. Rather, it emerges from the interaction among the content, the teaching and learning of that content, and the item itself, all in test takers’ heads. When instruction improves (e.g., through better curriculum, better lesson plans, or better pedagogy), measured content difficulty should drop. If some school, district, or state does a better job of teaching some standard, the distractors don’t get less effective; rather, more test takers are able to produce a successful response. Better teaching does not make items or distractors less effective simply because fewer test takers select an incorrect option.

This is simply a dumb way to think about distractor effectiveness. Truly dumb. The question is not whether distractors are selected by many test takers, but rather whether these distractors (as opposed to other potential distractors) are the ones that will be fairly selected by the most test takers. To understand what that means, you’ll have to read tomorrow’s post.

But, frankly, this idea that distractors should be judged in a sort of popularity contest is what leads to the kinds of deception and minutiae that Haladyna et al. try to warn against in Rule 7 (Avoid trick items). If the best you can do when writing distractors is to try to deceive test takers, you are not trying to measure the targeted cognition at all. Dumb Rule 7 only exists because of this idea that items should be difficult, rather than fair.

[Haladyna et al.’s exercise started with a pair of 1989 articles and continued in a 2004 book and a 2013 book. But the 2002 list is the easiest and cheapest to read (see the linked article, which is freely downloadable), and it is the only version that includes a well-formatted one-page version of the rules. Therefore, it is the central version that I am taking apart, rule by rule, pointing out how horrendously bad this list is and how little it helps actual item development. If we are going to have good standardized tests, the items need to be better, and this list’s place as the dominant item writing advice only makes that far less likely to happen.

Haladyna Lists and Explanations

  • Haladyna, T. M. (2004). Developing and validating multiple-choice test items. Routledge.

  • Haladyna, T. M., & Rodriguez, M. C. (2013). Developing and validating test items. Routledge.

  • Haladyna, T. M., Downing, S. M., & Rodriguez, M. C. (2002). A review of multiple-choice item-writing guidelines for classroom assessment. Applied Measurement in Education, 15(3), 309–334.

  • Haladyna, T. M., & Downing, S. M. (1989). A taxonomy of multiple-choice item-writing rules. Applied Measurement in Education, 2(1), 37–50.

  • Haladyna, T. M., & Downing, S. M. (1989). Validity of a taxonomy of multiple-choice item-writing rules. Applied Measurement in Education, 2(1), 51–78.

]