Posted by Greg Pope
In my last post I offered some general information about assessment reliability. Below are some additional specific things to consider.
- What factors and test characteristics generally influence internal consistency reliability coefficients?
A. Item difficulty: Items that are extremely hard or extremely easy reduce discrimination and therefore reliability. If a large number of participants do not have time to finish the test, this also affects item difficulty
B. Item discrimination: Items with higher discrimination values contribute more to the measurement efficacy of the assessment (more discriminating questions = higher reliability). Part of this comes down to sound question development: well-crafted, unambiguously worded questions are more likely to have acceptable discrimination
C. Construct being measured: If all questions measure the same construct (e.g., come from the same topic), reliability will be higher
D. How many participants took the test: With very small numbers of participants, the reliability coefficient will be less stable
E. Composition of the group that took the test: If the sample of participants taking an assessment is not representative (e.g., no one studied!), reliability will be negatively impacted
F. How many questions are administered: Generally, the more questions administered, the higher the reliability (to a point; we can't have a 10,000-question test!)
G. Environmental administration factors: Conditions in the testing area such as noise, lighting, etc. can distract participants from demonstrating what they know and can do, introducing measurement error
H. Person factors: Test anxiety, fatigue, and other human factors can reduce the accuracy of measurement of what people know and can do
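To make a couple of these factors concrete, here is a small illustrative sketch (not taken from the post) that computes Cronbach's alpha, one common internal consistency coefficient, and then applies the Spearman–Brown prophecy formula to show the effect described in point F: lengthening a test with comparable questions raises the projected reliability. The response data is entirely hypothetical.

```python
# Hedged sketch: Cronbach's alpha and the Spearman-Brown prophecy formula.
# Data and function names are hypothetical, for illustration only.
from statistics import pvariance

def cronbach_alpha(scores):
    """scores: one list per participant, each holding per-item scores."""
    k = len(scores[0])                           # number of items
    items = list(zip(*scores))                   # transpose to per-item columns
    item_var = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_var / total_var)

def spearman_brown(alpha, length_factor):
    """Projected reliability if the test is lengthened by length_factor."""
    return (length_factor * alpha) / (1 + (length_factor - 1) * alpha)

# Hypothetical right/wrong (1/0) responses: 4 participants x 5 items.
responses = [
    [1, 1, 1, 0, 1],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))                    # 0.75
print(round(spearman_brown(alpha, 2), 3)) # ~0.857 if the test were doubled
```

Note how doubling the (hypothetical) test raises the projected coefficient, which mirrors point F: more comparable questions generally mean higher reliability, with diminishing returns as alpha approaches 1.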
For more on this subject, see the Questionmark White Paper, “Defensible Assessments: What You Need to Know”.