The Most Insidious Obstacle to Well-Aligned Items

Items are the building blocks of assessments, and if they do not perform their designated function, no purpose of or inference from their test can be valid. That is, they must actually elicit evidence of the targeted cognition for the range of typical test takers.

The most insidious obstacle to the production and inclusion of well-aligned items on operational tests has nothing to do with psychometrics. Truly, even educational measurement's original sin (i.e., the a priori assumption of unidimensionality), my go-to villain for over 15 years, is not as insidious as the idealized student/test taker. It is more insidious because the idealized student/test taker is so often the only test taker that item contributors consciously imagine throughout the entire item development process, despite being appropriate even less frequently than unidimensionality.

The idealized student/test taker is a convenient fiction. It is an easy fiction. This paragon of testing virtue is an incredibly attractive fiction. But it is very much a fiction, and it is an enormous obstacle to developing valid items.

What is the Idealized Student/Test Taker (IS/TT)?

The idealized student/test taker learned all the lessons. They were diligent in class and are diligent on tests. They attempt every item exactly as their teachers would want them to, applying the correct technique and knowledge. Their thinking is linear and clear. They understand the item and what it is trying to communicate. They recognize what they are being asked to do, and they do it as item writers and other developers would want them to.

They do not cheat. They do not do anything, not even unintentionally, to undermine the quality of the inferences we make from their performance. They do not look for shortcuts. They do not apply savvy test-taking skills. They do not look for alternative approaches when they realize that they lack some called-for knowledge or skills. They do not have strategies for guessing. They succeed properly as directed, and accept their own shortcomings when faced with challenges beyond their ability.

The idealized student/test taker remembers everything they learned, but is not advantaged by a particular pedagogical approach or examples they encountered during their studies. They are not tied to how their own teachers asked questions or gave instructions, being equally able to understand other wordings or framings. 

The idealized student/test taker is neither advantaged nor disadvantaged in any way. They always work in good faith. They are beyond frustration or excitement. They are perfectly focused only on what the item gives them.

Therefore, the idealized student/test taker always provides a successful response when they possess proper proficiency with the targeted cognition and always fails to do so when they lack such proficiency.

What is Wrong with the Idealized Student/Test Taker?

The IS/TT is a fiction. They are a fiction not only because no such student or test taker actually exists but, even more dangerously, because they promote the idea that there is one sort of “typical” test taker that item writers, content development professionals and other item reviewers should focus on. They suggest that there is a preferred or primary test taker whose views and understandings are most important in our considerations.

But the fact is that test takers vary enormously. They vary along an enormous number of dimensions. They have different experiences and proclivities. They have had the benefit of a multitude of different instructional approaches and examples. And very few of them really want their test results to accurately reflect their proficiencies. Instead, and quite understandably, they generally want the highest score they can get. Some might be lazier. Some might be more devious. So, there are differences in motivation, but virtually across the board, they would prefer a higher score to a fairer score.

If told what an item is trying to target, they generally would use some other approach if they felt it would give them a better chance to get the points. Real test takers do not see test developers or the items themselves as collegial partners, but rather as obstacles to be overcome.

Moreover, there are many deep contradictions within the idea of the IS/TT. For example, we imagine that they can do every item as we would like them to while also imagining that when they lack the targeted skill, they do not employ strategies to work around its requirement. We also imagine that they lack any particular ethnic or cultural identity, and yet are fully comfortable with the cultural context of the item we are considering. It is not a coherent ideal, but rather a convenient collection of proclivities and behavior.

Why so Insidious?

I do not think that anyone would question anything in the previous section. It is all rather human and very obvious. 

And yet, it is so much easier to write, refine or review items with just the IS/TT in mind. It makes the process more direct and much less uncertain. It allows us to stay in our own perspectives and think about our own intentions. It is cleaner and simpler.

It requires vastly more energy and work to consider the range of typical test takers. It is just hard to do the real work of radical empathy (i.e., applying cognitive empathy to a range of different types of test-taking personas) to envision how different sorts of test takers might respond to an item. Doing that requires us to go beyond our experiences and even beyond the experiences of people we know. Doing that well is a rigorous and demanding professional practice that does not come naturally to many people.

And so, everyone involved in developing items has a tendency to go back to the IS/TT. We all feel the urge. But items developed for the IS/TT will invite too many alternative paths by which real test takers can produce a successful response without using the targeted cognition. And they will trip up too many test takers who have appropriate proficiency with the targeted cognition but are subject to some other construct-irrelevant issue.

Because our tests are intended for a wide range of test takers, we must reject that urge. We must look deeper and think harder. 

The most important thing we can do when writing, refining or reviewing items is to put the idealized student/test taker aside and see the diversity of test takers who will actually attempt our items. This type of rigorous professional practice is necessary to produce valid items, that is, items that elicit evidence of the targeted cognition for the range of typical test takers.