Does your test really measure what you think it is measuring? If it does, your test has high validity. Testing instruments used in educational research attempt to measure constructs such as achievement, aptitude, attitudes, and motivation. You can measure these non-observable constructs only indirectly, and that opens up many threats to the validity of a test.
Content Validity
Your test can't ask all possible questions
related to a particular subject. Instead, you have to construct a
test out of a sample of possible questions. But does your sample
represent the content domain? It takes an expert to answer this
question. An expert gathers evidence by examining the content of the
test as compared to the entire content domain. Note that the expert
must carry out this procedure from the point of view of a specific
use for the test results. For example, after training a technician
to handle unique and unexpected problems with a computer network,
a test that focuses on naming the parts of a network would lack
content validity. While the test might have wonderfully constructed
questions, it would not measure what we truly want to know: does
our technician know how to handle unique and unexpected problems on
the computer network?
Predictive Validity
How well did your GRE score predict your
success in graduate school? It should have been at least in the
ballpark. The developers of the GRE have put a great deal of
effort into establishing a relationship between scores on the test
and indicators of graduate school success (GPA, for example). However,
should someone use the GRE to predict leadership skills? Maybe. But
until you establish a relationship between GRE scores and some
accepted measure of leadership success, you cannot do so with
confidence.
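One common way to put a number on the relationship described above is a correlation coefficient between test scores and the later outcome. The sketch below is a minimal illustration, assuming Pearson's r is an acceptable measure for your data. The names pearson_r, gre_scores, and first_year_gpa are hypothetical, and the numbers are made up for illustration; they are not real GRE or GPA results.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sample has no variation")
    return cov / sqrt(var_x * var_y)


# Made-up illustration data, NOT real GRE or GPA results.
gre_scores = [300, 310, 315, 320, 325, 330]
first_year_gpa = [3.0, 3.2, 3.1, 3.5, 3.6, 3.8]

validity_coefficient = pearson_r(gre_scores, first_year_gpa)
print(f"predictive validity coefficient: {validity_coefficient:.2f}")
```

A coefficient near 1 would suggest that higher scores tend to go with higher GPA in this sample, while a value near 0 would give you no grounds for prediction. Establishing predictive validity in practice takes a far larger sample than this toy one.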
Construct Validity
Suppose you wanted to test the
stick-to-it-ness of your trainees as a function of their frustration
tolerance level. While you might measure stick-to-it-ness in units of
time, how could you measure frustration level? The construct
frustration may be inferred from observable traits like blood
pressure, pulse rate, sweat output, and tears. The threat to
construct validity lies in the difficulty of confidently
establishing a relationship between observable evidence, like test
scores or physiological responses, and the construct we hope to
infer.
Concurrent Validity
I have a new, fast, cheap, and easy way of
testing IQ. Simply indicate your favorite color of the rainbow! The
blue end of the rainbow indicates high IQ, while the red end
indicates low IQ. If I could establish a relationship between each
color of the rainbow and IQ as measured by an accepted test (in other
words, establish my test's concurrent validity), my inexpensive new
test would be a hit. Tests with high concurrent validity must offer
the same assessment of the person being measured as tests that are
already accepted as valid measures of some construct. If they do say
the same thing, then use whichever test is cheaper and easier to
administer.
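Checking whether a new test "says the same thing" as an accepted one can be sketched as a rank correlation between the two sets of results. The example below is only an illustration, assuming Spearman's rho is a suitable measure because rainbow colors form an ordered scale rather than a true numeric one. The names COLOR_POSITION, favorite_colors, accepted_iq, average_ranks, and spearman_rho are hypothetical, and the data are invented for this tongue-in-cheek example.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    mean_rank = (len(xs) + 1) / 2  # mean of ranks 1..n, also holds with ties
    cov = sum((a - mean_rank) * (b - mean_rank) for a, b in zip(rank_x, rank_y))
    var_x = sum((a - mean_rank) ** 2 for a in rank_x)
    var_y = sum((b - mean_rank) ** 2 for b in rank_y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sample has no variation")
    return cov / (var_x * var_y) ** 0.5


# Position of each color from the red end (1) to the violet end (7).
COLOR_POSITION = {"red": 1, "orange": 2, "yellow": 3, "green": 4,
                  "blue": 5, "indigo": 6, "violet": 7}

# Invented data for illustration only; no real people or IQ results.
favorite_colors = ["red", "green", "blue", "orange", "violet", "yellow", "green", "indigo"]
accepted_iq = [95, 102, 118, 99, 110, 104, 97, 121]

color_scores = [COLOR_POSITION[color] for color in favorite_colors]
rho = spearman_rho(color_scores, accepted_iq)
print(f"agreement between color test and accepted IQ test (rho): {rho:.2f}")
```

Whatever value the invented data produce says nothing about real people. The point is only the shape of the check: both tests are given to the same group, and the new test earns trust to the extent that its results track the accepted one.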