Reliability and validity are the keys to trust

How can you trust assessment results? The two keys are reliability and validity.

RELIABILITY EXPLAINED

An assessment is reliable if it measures the same thing consistently and reproducibly. If you were to deliver an assessment with high reliability to the same participant on two occasions, you would be very likely to reach the same conclusions about the participant’s knowledge or skills. A test with poor reliability might produce very different scores across the two occasions. An unreliable assessment does not measure anything consistently and cannot support any trustworthy judgment of competency. It helps to picture a dartboard: in the diagram to the right, darts have landed all over the board and are not reliably in any one place.

For an assessment to be reliable, there needs to be a predictable authoring process, effective beta testing of items, trustworthy delivery to all the devices used to give the assessment, good-quality post-assessment reporting and effective analytics.
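
One common way to put a number on the "two occasions" idea above is a test-retest correlation: give the same assessment twice and correlate the two sets of scores. The sketch below is illustrative only. The scores are made up, and the function name `pearson_r` is a hypothetical helper, not part of any assessment product.

```python
from math import sqrt

def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(first)
    mean_first = sum(first) / n
    mean_second = sum(second) / n
    # Sum of cross-products and sums of squares of the deviations.
    cross = sum((a - mean_first) * (b - mean_second) for a, b in zip(first, second))
    ss_first = sum((a - mean_first) ** 2 for a in first)
    ss_second = sum((b - mean_second) ** 2 for b in second)
    return cross / sqrt(ss_first * ss_second)

# Made-up scores for six participants on the first sitting.
first_sitting = [55, 62, 70, 78, 85, 91]

# Second sitting of a reliable test: everyone scores about the same again.
reliable_retest = [57, 60, 72, 77, 86, 90]

# Second sitting of an unreliable test: scores bear little relation to the first.
unreliable_retest = [80, 55, 91, 60, 72, 66]

r_reliable = pearson_r(first_sitting, reliable_retest)
r_unreliable = pearson_r(first_sitting, unreliable_retest)

print(f"Reliable test:   r = {r_reliable:.2f}")
print(f"Unreliable test: r = {r_unreliable:.2f}")
```

A correlation close to 1 suggests the two sittings rank participants in much the same way; a value near zero is the numerical version of darts scattered all over the board.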

VALIDITY EXPLAINED

Being reliable is not good enough on its own. The darts in the dartboard in the figure to the right are in the same place, but not in the right place. A test can be reliable but not measure what it is meant to measure. For example, you could have a reliable assessment that tested for skill in word processing, but this would not be valid if used to test machine operators, as writing is not one of the key tasks in their jobs.

An assessment is valid if it measures what it is supposed to measure. So if you are measuring competence in a job role, a valid assessment must align with the knowledge, skills and abilities required to perform the tasks expected of a job role. In order to show that an assessment is valid, there must be some formal analysis of the tasks in a job role and the assessment must be structured to match those tasks. A common method of performing such analysis is a job task analysis, which surveys subject matter experts or people in the job role to identify the importance of different tasks.
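
As a rough illustration of how job task analysis results might feed into assessment structure, the sketch below averages importance ratings from subject matter experts and turns them into a share of the assessment for each task. The task names, the ratings, the 1 to 5 scale and the proportional weighting rule are all assumptions made for this example; real job task analyses often also consider factors such as how frequently a task is performed.

```python
from statistics import mean

# Hypothetical importance ratings (1 = low, 5 = critical) from four
# subject matter experts for a machine operator role.
ratings = {
    "Start-up safety checks": [5, 5, 4, 5],
    "Routine operation": [4, 4, 5, 4],
    "Clearing jams": [3, 4, 3, 3],
    "Shift paperwork": [2, 1, 2, 2],
}

def blueprint_weights(ratings):
    """Return each task's share of the assessment, proportional to its mean rating."""
    average = {task: mean(scores) for task, scores in ratings.items()}
    total = sum(average.values())
    return {task: value / total for task, value in average.items()}

weights = blueprint_weights(ratings)

# List tasks from most to least heavily weighted.
for task, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{task}: {weight:.0%} of the assessment")
```

The point of the sketch is only that the assessment's emphasis follows the tasks that experts rate as most important, which is what ties the assessment back to the job role.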

ASSESSMENTS MUST BE RELIABLE AND VALID

Trustworthy assessments must be reliable AND valid.

The darts in the figure to the right are in the same place and in the right place.
When you construct an assessment of competence, you want it to measure, consistently, the competence required for the job.

COMPARISON WITH BLOOD TESTS

It is helpful to consider what happens if you go to the doctor with an illness. The doctor goes through a process of discovery, analysis, diagnosis and prescription. As part of the discovery process, sometimes the doctor will order a blood test to identify if a particular condition is present, which can diagnose the illness or rule out a diagnosis.

It takes time and resources to do a blood test, but it can be an invaluable piece of information. A great deal of effort goes into making sure that blood tests are both reliable (consistent) and valid (measure what they are supposed to measure). For example, just like exam results, blood samples are labelled carefully, as shown in the picture, to ensure that patient identification is retained.

A blood test that was not reliable would be dangerous—a doctor might think that a disease is not present when it is. Furthermore, a reliable blood test used for the wrong purpose is not useful—for example, there is no point in having a test for blood glucose level if the doctor is trying to see if a heart attack is imminent.

The blood test results are a single piece of information that helps the doctor make the diagnosis in conjunction with other data from the doctor’s discovery process.

In exactly the same way, a test of competence is one important piece of information in determining whether someone is competent in their job role.

Using the blood test metaphor, it is easy to understand the personnel and organizational risks that can result from making decisions based on untrustworthy results. If your organization assesses someone’s knowledge, skill or competence for health and safety or regulatory compliance purposes, you need to ensure that the assessment is designed correctly and runs consistently, which means that it must be reliable and valid.

For assessments to be reliable and valid, you need to follow structured processes at each step, from planning through authoring to delivery and reporting. These processes are explained in our new white paper, “Assessment Results You Can Trust”, and I’ll be sharing some of the content in future articles in this blog.

For fuller information, you can download the white paper, “Assessment Results You Can Trust”.
