Day 5


Review

When conducting educational research, we often want to generalize what we conclude from our sample to a larger (target) population. How valid that generalization is depends, in part, on how well our sample represents the population.

Target population:

To whom we want to generalize

Accessible population:

The group we can select from to conduct the research

Sample:

The group we actually select to represent the population

Sampling:

The process of selecting the sample. Common methods include random, stratified, cluster, and systematic sampling.

Sampling bias:

Bias occurs when the sample doesn't represent the population. Common biases include convenience, volunteers, poor judgment sampling, and administrative convenience.
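The four sampling methods named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the student roster, the field names (id, grade, school), and the function names are all hypothetical and do not come from the course materials.

```python
import random

# Hypothetical accessible population: 120 students, 4 grades, 6 schools.
students = [
    {"id": i, "grade": 9 + i % 4, "school": "ABCDEF"[i // 20]}
    for i in range(120)
]

def simple_random_sample(population, n, seed=0):
    """Random: every member has an equal chance of selection."""
    return random.Random(seed).sample(population, n)

def systematic_sample(population, n, seed=0):
    """Systematic: every k-th member after a random starting point."""
    k = len(population) // n
    if k < 1:
        raise ValueError("sample size exceeds population size")
    start = random.Random(seed).randrange(k)
    return [population[start + i * k] for i in range(n)]

def stratified_sample(population, stratum_of, fraction, seed=0):
    """Stratified: sample the same fraction from each subgroup (stratum)."""
    rng = random.Random(seed)
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample

def cluster_sample(population, cluster_of, n_clusters, seed=0):
    """Cluster: pick whole groups at random and keep everyone in them."""
    rng = random.Random(seed)
    clusters = {}
    for member in population:
        clusters.setdefault(cluster_of(member), []).append(member)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

if __name__ == "__main__":
    print(len(simple_random_sample(students, 12)))
    print([s["id"] for s in systematic_sample(students, 12)])
    print(len(stratified_sample(students, lambda s: s["grade"], 0.1)))
    print(len(cluster_sample(students, lambda s: s["school"], 2)))
```

Note that the stratified version guarantees each grade is represented in proportion to its size, while the cluster version may miss some schools entirely. That trade-off is one reason the choice of method affects how well the sample represents the population.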


Educational knowledge

How do we know what works in a training or educational setting? Usually our knowledge comes from

Testing is the most exact, but the process presents many problems. The biggest problem is flawed tests, which lead to flawed results and conclusions. Flaws originate when tests are

To help with the administration, many tests are standardized. This means:

Standardized tests allow scores to be compared across time or distance, as long as the administration and scoring procedures are followed and the tests are valid and reliable.


Test Validity

Validity is the degree to which a test measures what it is supposed to measure for a specific group. There are four types of validity: how well the test matches the content it is meant to measure, its ability to predict attributes of a variable, the similarity of its results to those of another test, and attitudes of people.

Additional information on test validity

Practice your knowledge of test validity.


Test Reliability

If you take the GRE twice, will you obtain the same score?

Reliability tells us how consistently a test measures what it is intended to measure. Obtained scores are estimates of true scores; for a perfectly reliable test, obtained = true.

High reliability gives us confidence that an individual would receive the same score if given the test again later or given an equivalent form of the test. The strength of reliability is expressed as a coefficient of reliability on a scale from .00 to 1.00.
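The relationship between obtained and true scores can be written out using the standard classical test theory model. The symbols below (X, T, E, and the variance notation) are introduced here for illustration and are not part of the original notes.

```latex
% Obtained score = true score + measurement error
X = T + E

% Reliability coefficient: the share of obtained-score variance
% that is due to true-score variance
r_{XX'} = \frac{\sigma_T^2}{\sigma_X^2} = 1 - \frac{\sigma_E^2}{\sigma_X^2}
```

When the error variance is zero the coefficient is 1.00 and obtained scores equal true scores. As error grows, the coefficient falls toward .00.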

A variety of methods can be used to measure reliability, including:

  • Test-retest:

the same test is given twice, usually 7-10 days apart

  • Equivalent forms:

two tests that measure the same material, often given as a pretest and a posttest

  • Split-half:

compares each student's score on one half of the test with the score on the other half

  • Rational Equivalence:

average of all split halves

  • Scorer/rater:

consistency between judges (interjudge) and within a single judge (intrajudge) when tests are scored subjectively

If reliability is low, obtained scores may not reflect true scores.
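Three of the methods listed above can be sketched as small computations. This is a minimal sketch under stated assumptions, not the procedure used by any particular test publisher. The score data and function names are hypothetical. The split uses odd versus even items, the Spearman-Brown correction is applied to the half-test correlation, and KR-20 (with population variance) stands in as one common rational-equivalence formula, since the notes do not name a specific one.

```python
from statistics import mean, pvariance

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def retest_reliability(first_scores, second_scores):
    """Test-retest: correlate scores from two administrations."""
    return pearson_r(first_scores, second_scores)

def split_half_reliability(item_scores):
    """Split-half: correlate odd-item and even-item half scores, then
    apply the Spearman-Brown correction to estimate full-test reliability."""
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(odd_half, even_half)
    return 2 * r_half / (1 + r_half)

def kr20(item_scores):
    """KR-20 for items scored 0 (wrong) or 1 (right)."""
    k = len(item_scores[0])
    totals = [sum(row) for row in item_scores]
    total_variance = pvariance(totals)
    sum_pq = 0.0
    for j in range(k):
        p = mean([row[j] for row in item_scores])  # proportion correct
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / total_variance)

# Hypothetical data: 6 students (rows) by 6 right/wrong items (columns).
items = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
# Hypothetical total scores for the same students on two occasions.
first_attempt = [6, 5, 4, 2, 1, 0]
second_attempt = [5, 6, 4, 2, 1, 0]

if __name__ == "__main__":
    print(round(retest_reliability(first_attempt, second_attempt), 2))
    print(round(split_half_reliability(items), 2))
    print(round(kr20(items), 2))
```

Each function returns a coefficient that is read on the .00 to 1.00 scale described earlier. With so few students and items the numbers are only illustrative.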


Choosing a test

Most educational testing involves examining one of the following:

Selecting a test is easier than creating one. Two reference books to check are the Mental Measurements Yearbooks (MMY) and Tests in Print. The ERIC Clearinghouse on Assessment and Evaluation can help you find tests and test reviews on-line.

Before selecting a test, ask yourself:

Creating your own test?

Activity: Finding standardized tests and reviews on-line


Common Mistakes

Many researchers make a variety of mistakes when choosing or creating a test instrument. These include choosing a test:

Other mistakes include:


Closure: Review and Assignments

Review questions:

Before next week:


For comments about this page, please contact Julie Gallant