Validity Background

Does your test really measure what you think it is measuring? If it does, then your test has high validity. Testing instruments used in educational research attempt to measure constructs such as achievement, aptitude, attitudes, and motivation. You can only test these non-observable constructs indirectly, and this indirectness opens up many threats to the validity of a test.

Content Validity

Your test can't ask all possible questions related to a particular subject. Instead, you have to construct a test out of a sample of possible questions. But does your sample represent the content domain? It takes an expert to answer this question. An expert gathers evidence by comparing the content of the test with the entire content domain. Note that the expert must carry out this procedure from the point of view of a specific use for the test results. For example, after training a technician to handle unique and unexpected problems with a computer network, a test that focuses on naming the parts of a network would lack content validity. While the test might have wonderfully constructed questions, it would not measure what we truly want to know: does our technician know how to handle unique and unexpected problems on the computer network?
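One way an expert might organize this comparison is to tag each test item with the topic it covers and set the resulting shares against the intended weighting of the content domain. The sketch below is only an illustration of that bookkeeping: the function name coverage_gaps, the topic labels, and all of the weights and item counts are hypothetical, and the judgment about which topic an item belongs to still has to come from the expert.

```python
from collections import Counter

def coverage_gaps(blueprint, item_topics):
    """Per topic: the test's share of items minus the intended share.

    blueprint   -- dict mapping topic -> intended share of the test (0..1)
    item_topics -- list with one expert-assigned topic label per test item
    """
    total = len(item_topics)
    if total == 0:
        raise ValueError("the test has no items")
    counts = Counter(item_topics)
    topics = sorted(set(blueprint) | set(counts))
    return {t: counts.get(t, 0) / total - blueprint.get(t, 0.0) for t in topics}

# Hypothetical weights and item tags, invented for illustration only.
blueprint = {"handling unexpected problems": 0.8, "naming network parts": 0.2}
item_topics = ["naming network parts"] * 9 + ["handling unexpected problems"] * 1

gaps = coverage_gaps(blueprint, item_topics)
for topic, gap in gaps.items():
    # A large negative gap marks a topic the test under-samples.
    print(f"{topic}: {gap:+.0%}")
```

A large negative gap on the topic that matters most for the intended use of the results is the kind of evidence that would lead an expert to question the test's content validity.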

Predictive Validity

How well did your GRE score predict your success in graduate school? It should have at least been in the ballpark. The developers of the GRE have put a great deal of effort into establishing a relationship between scores on the test and indicators of graduate school success (GPA, for example). However, should someone use the GRE to predict leadership skills? Maybe. But until you establish a relationship between GRE scores and some accepted measure of leadership success, you cannot do so with confidence.
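A relationship of this kind is commonly summarized with a correlation coefficient between test scores and the later outcome. The following is a minimal sketch using the Pearson correlation, which is one common choice and not necessarily the method used by any particular test developer. The function name pearson_r and every number in the data are hypothetical, invented only to show the calculation.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sample has no variance")
    return cov / sqrt(var_x * var_y)

# Hypothetical data, invented for illustration only:
# admission test scores and the same students' later graduate GPAs.
test_scores = [150, 155, 158, 160, 163, 166, 168]
later_gpa = [3.0, 3.1, 3.4, 3.3, 3.6, 3.7, 3.9]

r = pearson_r(test_scores, later_gpa)
print(f"correlation between test score and later GPA: r = {r:.2f}")
```

A coefficient near 1 would support using the test to predict that particular outcome. It would say nothing about a different outcome, such as leadership success, until scores are correlated with an accepted measure of that outcome as well.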

Construct Validity

Suppose you wanted to test the stick-to-it-ness of your trainees as a function of their frustration tolerance level. While you might measure stick-to-it-ness in units of time, how could you measure frustration level? The construct of frustration may be inferred from observable traits like blood pressure, pulse rate, sweat output, and tears. The threat to construct validity rests in the difficulty of confidently establishing a relationship between observable evidence, like test scores or physiological responses, and the construct we hope to infer.

Concurrent Validity

I have a new, fast, cheap, and easy way of testing IQ. Simply indicate your favorite color of the rainbow! The blue end of the rainbow indicates high IQ, while the red end indicates low IQ. If I could establish a relationship between each color of the rainbow and IQ as measured by an accepted test (in other words, establish my test's concurrent validity), my inexpensive new test would be a hit. Tests with high concurrent validity must offer the same assessment of the person being measured as tests that are already accepted as valid measures of some construct. If they do say the same thing, then use whichever test is cheaper and easier to administer.
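Checking the rainbow test against an accepted IQ test could be sketched as a rank correlation, since a color's position along the rainbow is an ordering and not a true measurement. The example below uses the Spearman rank correlation as one reasonable choice. The names RAINBOW_POSITION, average_ranks, and spearman_rho are hypothetical helpers, and the favorite colors and IQ scores are invented for illustration; they are not results of any study.

```python
from math import sqrt

# Position of each color along the rainbow, from red (1) to violet (7).
RAINBOW_POSITION = {
    "red": 1, "orange": 2, "yellow": 3, "green": 4,
    "blue": 5, "indigo": 6, "violet": 7,
}

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("undefined when every value in a sample is the same")
    return cov / sqrt(var_x * var_y)

# Hypothetical data, invented for illustration only.
favorite_colors = ["red", "blue", "green", "violet", "orange", "blue", "yellow", "indigo"]
accepted_iq_scores = [112, 98, 105, 101, 95, 120, 108, 99]

color_positions = [RAINBOW_POSITION[c] for c in favorite_colors]
rho = spearman_rho(color_positions, accepted_iq_scores)
print(f"Spearman rho between color test and accepted IQ test: {rho:.2f}")
```

Only a rho close to +1 would mean the color test ranks people the way the accepted test does. A value near 0 would mean the cheap test says something different and cannot stand in for the accepted one.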
