How to measure construct validity

When analyzing assessments, validity shows how well an assessment measures what it is designed to measure. There are three main types of validity: content validity, criterion validity, and construct validity (see our post on validity and reliability to learn more).

This guide outlines how to measure construct validity. It briefly explains this form of validity and its subtypes, then shows how the construct validity of an assessment can be quantified through basic statistical analysis.

What is construct validity?

Construct validity is used to determine how well an assessment measures the construct that it is intended to measure. For example, a math test would have high construct validity if it was good at showing which candidates possessed strong mathematical aptitude (and which did not).

It is important to understand this fundamental concept fully before attempting to measure it. For more information, take a look at our guide to construct validity.

The subtypes of construct validity

Construct validity is made up of two subtypes: convergent validity and discriminant validity. Both of these subtypes are used to establish how well a test evaluates what it is meant to evaluate, but there are key differences between them.

With convergent validity, assessment results are compared with the results of other tests that are designed to measure the same thing (e.g. a mathematical aptitude test vs other similar tests). A strong positive correlation between the results would indicate high convergent validity, suggesting that the tests are good at measuring what they are meant to measure.

With discriminant validity, assessment results are compared with the results of tests that are designed to measure different constructs (e.g. a mathematical aptitude test vs a linguistic aptitude test). There shouldn’t be much of a relationship between the results, as the tests measure different things – no (or very low) correlation indicates high discriminant validity.

See our blog on understanding convergent and discriminant validity for further explanation of these subtypes of construct validity.

Testing construct validity

To test the construct validity of an assessment, its convergent and discriminant validity must be calculated. Here, we are trying to show that:

  1. The results of the assessment have a strong positive correlation with those of other assessments that measure the same construct (i.e. it has high convergent validity).
  2. The results of the assessment do not correlate with those of assessments that measure different things (i.e. it has high discriminant validity).

Calculating convergent validity

To quantify convergent validity, the correlation coefficients between the results of the assessment and those of other similar assessments should be calculated. To achieve this, either use the CORREL function in a spreadsheet or follow these steps for a manual calculation:

  1. Calculate the covariance of both sets of assessment results using the formula below.

Cov (X,Y) = Σ [(each result from X – μ) * (each result from Y – ν)] / (n – 1)

Where:

X refers to the first set of assessment results

Y refers to the second set of assessment results

μ = the mean of results from assessment X

ν = the mean of results from assessment Y

n = the number of candidates who took the assessments

Σ = the sum of the bracketed values


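  2. Calculate the standard deviation of each set of assessment results using the formula below. The divisor is n – 1, the same as in the covariance formula in step 1, so that the two quantities are consistent when they are combined in step 3.

Standard deviation = σ = √( ∑ [(xᵢ − μ)²] / (n – 1) )

Where: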

xᵢ = each result from the assessment

μ = the mean result from the assessment

n = the number of candidates who took the assessment

∑ = the sum of the bracketed values


  3. Calculate the correlation coefficient of the two sets of assessment results using the formula below.

Correlation coefficient = r = Cov(X,Y) / (σₓ * σᵧ)

Where:

Cov(X,Y) = covariance from step 1

σₓ = standard deviation of results from assessment X

σᵧ = standard deviation of results from assessment Y
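The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original method: the function names and the candidate scores are hypothetical, and it assumes both lists hold results for the same candidates in the same order. It uses the n – 1 divisor for both the covariance and the standard deviations so that the two are consistent.

```python
import math

def covariance(x, y):
    """Step 1: sample covariance of two paired lists of results."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired lists with at least two results each")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    total = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return total / (len(x) - 1)

def std_dev(values):
    """Step 2: sample standard deviation, using the same n - 1 divisor."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))

def correlation(x, y):
    """Step 3: correlation coefficient = covariance / (sd of x * sd of y)."""
    return covariance(x, y) / (std_dev(x) * std_dev(y))

# Hypothetical scores for five candidates on two similar assessments.
new_test = [62, 75, 81, 58, 90]
old_test = [60, 72, 85, 55, 88]
print(round(correlation(new_test, old_test), 3))
```

The result should match what the CORREL function returns for the same two columns of results.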


To build an accurate picture of convergent validity, repeat this process using the results of as many other similar assessments as possible. Each correlation coefficient should be a number between -1 and +1.

Finally, take the average of the correlation coefficients. A high average correlation coefficient (i.e. close to +1) indicates high convergent validity. The cut-off point for an acceptable convergent validity is generally considered to be +0.70.
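As a rough sketch of this averaging step, the Python below takes a list of correlation coefficients that have already been calculated, checks that each lies between -1 and +1, and compares their average with the +0.70 cut-off. The function name and the coefficients are hypothetical and are only there to show the arithmetic.

```python
from statistics import mean

CONVERGENT_CUTOFF = 0.70  # the cut-off point quoted in this guide

def average_convergent(coefficients):
    """Average the correlation coefficients against similar assessments."""
    for r in coefficients:
        if not -1.0 <= r <= 1.0:
            raise ValueError(f"{r} is not a valid correlation coefficient")
    return mean(coefficients)

# Hypothetical coefficients against three other similar assessments.
coefficients = [0.82, 0.76, 0.91]
average = average_convergent(coefficients)
print(round(average, 2), average >= CONVERGENT_CUTOFF)
```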

Calculating discriminant validity

To quantify discriminant validity, calculate the correlation coefficients between the results of the assessment and those of other assessments designed to measure completely different constructs (e.g. the results from a mathematical aptitude test vs the results of linguistics tests).

Gather as many sets of other assessment results as possible, making sure they measure something completely independent and unrelated. It’s important to note that the other tests should have been taken by the same candidates as the main assessment, because a correlation coefficient can only be calculated from paired results (one result per candidate on each test).

Use the process outlined above to calculate the correlation coefficients between the results of the main assessment and each of the other assessments. Each correlation coefficient should be a number between -1 and +1.

Finally, take the average of these correlation coefficients. An average correlation coefficient close to 0 indicates high discriminant validity. If the number is close to -1 or +1, then the assessment has very low discriminant validity.
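The averaging step for discriminant validity might be sketched as follows. The function name and the coefficients are hypothetical. The `use_absolute` option is an assumption added here, not part of the method described above: a plain average lets positive and negative coefficients cancel each other out, so averaging the absolute values gives a more cautious figure to compare against.

```python
from statistics import mean

def average_discriminant(coefficients, use_absolute=False):
    """Average the correlation coefficients against unrelated assessments.

    With use_absolute=True (an assumption of this sketch), the sign of each
    coefficient is dropped first so that opposite signs cannot cancel out.
    """
    if use_absolute:
        coefficients = [abs(r) for r in coefficients]
    return mean(coefficients)

# Hypothetical coefficients against three unrelated assessments.
coefficients = [0.05, -0.08, 0.03]
print(round(average_discriminant(coefficients), 3))
print(round(average_discriminant(coefficients, use_absolute=True), 3))
```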

Combining convergent and discriminant validity

The steps outlined above should produce an average correlation coefficient for both convergent and discriminant validity. These numbers should be between -1 and +1.

To combine the two and quantify overall construct validity, subtract the discriminant coefficient from the convergent coefficient. If the discriminant coefficient is negative (e.g. -0.15), turn this into a positive number (e.g. 0.15) before subtracting it.

Construct validity = Convergent coefficient – discriminant coefficient

A number close to 1 indicates very high construct validity. Any number less than 0.5 indicates limited construct validity. Negative values show that an assessment has very low construct validity – this would suggest that the assessment is not a good way to measure the intended construct at all.
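The subtraction and the interpretation bands above can be sketched in Python as shown below. The function names are hypothetical, and the label used for values of 0.5 and above is an assumption of this sketch, since the guide only states that values close to 1 are very high. The first pair of figures is taken from the worked example later in this guide.

```python
def construct_validity(convergent, discriminant):
    """Convergent coefficient minus the (positive) discriminant coefficient."""
    return convergent - abs(discriminant)

def interpret(score):
    """Map a construct validity score onto the bands described above."""
    if score < 0:
        return "very low"
    if score < 0.5:
        return "limited"
    # Assumed label: the closer the score is to 1, the higher the validity.
    return "moderate to very high"

# Figures from the worked example: convergent 0.95, discriminant 0.01.
score = construct_validity(0.95, 0.01)
print(round(score, 2), interpret(score))

# A negative discriminant coefficient is made positive before subtracting.
print(round(construct_validity(0.80, -0.15), 2))
```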

Academic studies use more complex means of assessing convergent and discriminant validity (see, for example, the multitrait-multimethod matrix). However, for the purposes of measuring the construct validity of an assessment, simply subtracting the average discriminant coefficient from the average convergent coefficient should produce an accurate enough picture.

A worked example of calculating construct validity

An organization wants to work out the construct validity of its latest compliance regulations assessment. By calculating the assessment’s construct validity, the test makers will be able to show that it is actually a good measure of how well employees understand the relevant regulations.

The test makers follow these steps to quantify the assessment’s construct validity:

  1. First, they take the results of the latest compliance regulations assessment and put them into a spreadsheet with the results of 10 previous tests on compliance regulations.
  2. Using the CORREL function, the test makers work out the correlation coefficients between the results of the most recent test and those of the previous similar assessments.
  3. They then take the average of the 10 correlation coefficients, which in this case is 0.95 (indicating high convergent validity).
  4. Next, they gather the results of 10 unrelated mathematical aptitude assessments that the same employees have taken and put them into the spreadsheet.
  5. The test makers find the correlation coefficients between the results of the latest compliance regulation assessment and those of the 10 unrelated math tests using the CORREL function.
  6. Once more, they take the average of the 10 correlation coefficients, which turns out to be 0.01 (indicating high discriminant validity).
  7. Finally, they subtract the average discriminant coefficient (0.01) from the average convergent coefficient (0.95) to get a construct validity coefficient of 0.94.

The test makers have worked out that the construct validity coefficient of the new compliance regulations assessment is 0.94. This value is close to 1, so they can conclude that the new test is a good way to measure whether employees understand the relevant compliance regulations.

When measuring the construct validity of similar compliance regulation assessments in the future, the organization will be able to use the results of this test to calculate convergent validity.

Summing up

This guide has explained how to measure construct validity. It has provided a basic statistical method for quantifying the construct validity of an assessment based on convergent and discriminant validity.

Questionmark provides an online assessment platform that enables organizations to carry out tests and analyze their construct validity based on the results. Book a demo today.

Further reading

Although this simplified approach to measuring construct validity is fine for employee assessments, there are more detailed and accurate methods available. You may want to take a look at the following resources to learn more: