Day 5
When conducting educational research, we often want to generalize what we conclude from our sample to a larger (target) population. How valid that generalization is depends, in part, on how well our sample represents the population.
Target population: To whom we want to generalize.
Accessible population: The group we can select from to conduct the research.
Sample: The group we actually select to represent the population.
Sampling: The process of selecting the sample. Common methods include random, stratified, cluster, and systematic sampling.
Sampling bias: Bias occurs when the sample doesn't represent the population. Common sources include convenience sampling, volunteers, poor judgment sampling, and administrative convenience.
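The four sampling methods named above can be illustrated with a short Python sketch. It is only a minimal illustration, not a prescribed procedure: the function names, the 100-student accessible population, the grade levels, and the classroom groupings are all hypothetical.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Random sampling: every member has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=0):
    """Systematic sampling: random start, then every k-th member of the list."""
    k = len(population) // n                    # sampling interval
    start = random.Random(seed).randrange(k)    # random starting point
    return [population[start + i * k] for i in range(n)]

def stratified_sample(population, stratum_of, n, seed=0):
    """Stratified sampling: draw from each subgroup in proportion to its size.

    Rounding means the total can differ slightly from n for awkward proportions.
    """
    rng = random.Random(seed)
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        share = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, share))
    return sample

def cluster_sample(clusters, n_clusters, seed=0):
    """Cluster sampling: select whole intact groups and keep all their members."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

# Hypothetical accessible population: 100 students as (id, grade) pairs,
# 60 in grade 9 and 40 in grade 10, grouped into 5 classrooms of 20.
students = [(i, 9 if i < 60 else 10) for i in range(100)]
classrooms = {f"room{r}": students[r * 20:(r + 1) * 20] for r in range(5)}

random_10 = simple_random_sample(students, 10)
systematic_10 = systematic_sample(students, 10)
stratified_10 = stratified_sample(students, lambda s: s[1], 10)
two_rooms = cluster_sample(classrooms, 2)
```

In this made-up population the stratified sample keeps the 60/40 grade split (6 and 4 students), while the cluster sample returns two complete classrooms rather than individually chosen students.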
How do we know what works in a training or educational setting? Usually our knowledge comes from
Testing is the most exact, but the process presents many problems. The biggest problem is flawed tests, which lead to flawed results and conclusions. Flaws originate when tests are
To help with the administration, many tests are standardized. This means:
Standardized tests allow for comparison of scores across time or distance, as long as the administration and scoring procedures are followed and the tests are valid and reliable.
Validity is the degree to which a test measures what it is supposed to measure for a specific group. There are four types of validity: how well the test matches the content it is meant to measure, its ability to predict attributes of a variable, the similarity of its results to those of another test, and the attitudes of people.
If you take the GRE twice, will you obtain the same score?
Reliability tells us how consistently a test measures what it is intended to measure. Obtained scores are estimates of true scores (i.e., obtained score = true score + error).
High reliability gives us confidence that the scores an individual receives would be about the same if the individual took the test again later or took an equivalent form of the test. The strength of reliability is expressed as a coefficient of reliability, on a scale from .00 to 1.00.
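To see how a reliability coefficient behaves, here is a small simulation sketch in Python. It assumes the classical view that each obtained score is a true score plus random error, and it estimates reliability as the correlation between two administrations of the same test. The score scale (mean 100, SD 15), the error sizes, and the function names are illustrative assumptions only.

```python
import random
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def simulate_retest(error_sd, n=2000, seed=1):
    """Give the 'same test' twice to n simulated students and correlate the scores."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(100, 15) for _ in range(n)]          # assumed score scale
    first = [t + rng.gauss(0, error_sd) for t in true_scores]     # obtained = true + error
    second = [t + rng.gauss(0, error_sd) for t in true_scores]    # new error on the retest
    return pearson_r(first, second)

small_error = simulate_retest(error_sd=5)    # little measurement error
large_error = simulate_retest(error_sd=15)   # error as large as the true-score spread

print(f"small error: r = {small_error:.2f}")
print(f"large error: r = {large_error:.2f}")
```

With little measurement error the coefficient comes out close to 1.00, and it drops as the error grows. That is the sense in which a low coefficient warns that obtained scores may be poor estimates of true scores.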
A variety of methods can be used to measure reliability, including:
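Test-retest: the same test given twice, usually 7-10 days between attempts.
Equivalent forms: two tests that measure the same material, often given as pre- and post-test.
Split-half: compares the student's score on one half of the test to the score on the other half.
Rationale equivalence (Kuder-Richardson or Cronbach's alpha): in effect, the average of all split halves.
Scorer/rater: for interjudge and intrajudge consistency of subjective test scores.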
If reliability is low, obtained scores may not reflect true scores.
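The split-half comparison and the "average of all split halves" mentioned above can be computed from item-level scores. The Python sketch below makes two assumptions that go beyond these notes: the split-half correlation is adjusted to full test length with the Spearman-Brown formula, and Cronbach's alpha stands in for the "average of all split halves" statistic. The six students and four right/wrong items are made-up data.

```python
from statistics import mean, pvariance

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def split_half_reliability(item_scores):
    """Correlate odd-item and even-item half scores, then apply Spearman-Brown."""
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(odd_half, even_half)
    return 2 * r_half / (1 + r_half)    # estimate for the full-length test

def cronbach_alpha(item_scores):
    """Internal consistency from item variances and total-score variance."""
    k = len(item_scores[0])             # number of items
    item_vars = [pvariance([row[j] for row in item_scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Made-up data: one row per student, one column per item (1 = correct, 0 = wrong).
item_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

split_half = split_half_reliability(item_scores)
alpha = cronbach_alpha(item_scores)
print(f"split-half (Spearman-Brown): {split_half:.2f}")
print(f"Cronbach's alpha: {alpha:.2f}")
```

Both functions return a coefficient on the same .00 to 1.00 scale described earlier. A real analysis would need many more students and items than this toy data set.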
Most educational testing involves examining one of the following:
Selecting a test is easier than creating a test. Two reference books to check are the Mental Measurements Yearbook (MMY) and Tests in Print. The ERIC Clearinghouse on Assessment and Evaluation can help you find tests and test reviews on-line.
Before selecting a test, ask yourself:
Creating your own test?
Activity: Finding standardized tests and reviews on-line
Many researchers make a variety of mistakes when choosing or creating a test instrument. These include choosing a test:
Other mistakes include: