We have discussed the Angoff and Bookmark methods of standard setting, which are two commonly used methods, but there are many more. I would again refer the interested reader to Hambleton and Pitoniak’s chapter in Educational Measurement (4th ed.) for descriptions of other criterion-referenced methods.
Though criterion-referenced assessment is the typical standard-setting scenario, cut scores may also be determined for normative assessments. In these cases, the cut score is often not set to make an inference about the participant, but instead set to help make an operational decision.
A common example of a normative standard is when the pass rate is set based on information that is unrelated to participants’ performance. A company may decide to hire the ten highest-scoring candidates, not because the other candidates are not qualified, but because there are only ten open positions. Of course if the candidate pool is weak overall, even the ten highest performers may still turn out to be lousy employees.
We may also set normative standards based on risk tolerance. You may recall from our post about criterion validity that test developers may use a secondary measure that they expect to correlate with performance on the assessment. An employer may wish to set a cut score to minimize type I errors (false positives) because of the risk involved. For example, ability to fly a plane safely may correlate strongly with aviation test scores, but because of the risk involved if we let an unqualified person fly a plane, we may want to set the cut score high even though we will exclude some qualified pilots.
Normative Standard Setting with Secondary Criterion Measure
The opposite scenario may occur as well. If Type I errors have little risk, an employer may set the cut score low to make sure that all qualified candidates are identified. Unqualified candidates who happen to pass may be identified for additional training through subsequent assessments or workplace observation.
If we decided to use a normative approach to standard setting, we need to be sure that there is justification, and the cut score should not be used to classify individuals. A normative standard by its nature implies that not everyone will pass the assessment, regardless of their individual abilities, which is why it would be inappropriate for most cases in education or certification assessment.
Hambleton and Pitoniak also describe one final class of standard-setting methods called compromise methods. Compromise methods combine the judgment of the standard setters with information about the political realities of different pass rates. One example is the Hofstee Method, where stand setters define the highest acceptable cut score (1), the lowest acceptable cut score (2), highest acceptable fail rate (3), and the lowest acceptable fail rate (4). These are plotted against a curve of participants’ score data, and the intersection is used as a cut score.
Hofstee Method ExampleAdapted from Educational Measurement (Ed. Brennan, 2006)