Implementation and Evaluation



You probably noticed that while we spent many modules agonizing over the details of the first two phases of the ADDIE process, Analysis and Design, we devoted only a single module to Development and now are zipping through Implementation and Evaluation together in but one lonely module!

It's not because time is running out and we're trying to get caught up! It's just that, as instructional designers, most of what we do is analysis and design. We may oversee development, but production specialists often carry out much of the work. In addition, the analysis and design phases are where most of the real decisions are made. Sure, it takes skill, particularly writing skill, and hard work to develop materials, but we're assuming you're capable of those if you got this far!

But why combine implementation and evaluation? Because, for our purposes, they go hand-in-hand. We're really interested in implementing our instructional programs and products in order to find out whether they work as expected. Finding out if they worked or not is our evaluation.

Chapter preview

  • The final two steps of the ADDIE process are Implementation and Evaluation.
  • Instructional designers are sometimes involved in implementation, but it normally falls outside the realm of "instructional design."
  • There are two types of evaluation: formative evaluation, which goes on throughout the ADDIE process to improve the process, and summative evaluation, which takes place afterwards to judge the worth of the project, process, or product.
  • Donald Kirkpatrick established four levels for evaluating training programs, each of which looks at a different aspect of the training's success.
  • Evaluation plans are often generated to identify the specific tasks, dates, and people responsible for the evaluation.


In this module

Four levels of training evaluation
Product and program evaluation tools
The evaluation plan


Let's deal with implementation first. Implementation just means delivering the instruction to the target audience. You conduct the course or workshop, send out the manuals, videotapes, and CD-ROMs, post the website, and otherwise get learners using the instruction. There are project management issues involved, such as scheduling courses and duplicating and mailing products.

Remember that solution system you created in your performance analysis? This is the time to revisit it and ensure that all of its components are addressed. This is a critical part of the implementation effort, since all identified causes/drivers must be addressed if your training component is to make a difference.

Educational Technology courses, including EDTEC 572 - Technology for Course Delivery and EDTEC 684 - Project Management, address implementation-related topics.


Evaluation is a very important phase for us as instructional designers. This is where we get feedback on our skill and effort. It teaches us lessons about how to do things better. It lets us know where we stand with respect to our instructional design decisions.

There are two types of evaluation: formative and summative. Formative evaluation takes place while there is still time to change the program or revise the project. It consists of some type of prototype evaluation, which we discussed in a previous module.

The instructional design team or their agents usually conduct formative evaluation "in-house." Summative evaluation may be conducted by a third party, an objective outsider. Formative evaluation involves small numbers of users, just enough to let us know what works and what doesn't. Summative evaluation usually relies on larger samples, in order to provide data to justify the expenditure of time and effort on the product. Formative evaluation results in recommendations for making the product better. Summative evaluation results in documentation of the effectiveness of the finished product.

Four levels of training evaluation

Educational technologists often use a four-level model of evaluation. Each successive level represents a more rigorous test of the instruction than the previous.

Level 1: Reactions

At Level 1, reactions, you want to know how well the learners liked your instruction. What are their attitudes towards the instructor, the materials, the environment in which the course or workshop was given, and so forth? You have no doubt filled out an attitudinal survey at the end of a course or workshop that attempted to measure Level 1. By gathering data from everyone in the program you can get a more realistic picture of reactions than you might get by just listening to a few vocal people.

In spite of their somewhat superficial nature, Level 1 evaluations can be very important. Sometimes learners' reactions are all sponsors want to hear about when they decide whether or not to continue a program.

Level 2: Learning

Level 2, learning, is a somewhat more difficult test. People may have liked the instruction just fine, but not really learned anything! At Level 2 you want to know whether or not learners mastered the instructional objectives. You can measure this by giving them a test during or at the end of the instruction. These are the test items you wrote in Module 9, featuring both remember and apply types of knowledge. In a course or workshop this might be in the form of a paper-and-pencil or a performance test. In a multimedia instructional product, the test items might be delivered as a computer-based test. Unless the multimedia is part of an online system, you may need to make special provisions to get the data from the test back to you.

Level 3: Transfer

Transfer is the focus of Level 3. Even when learners master the instructional objectives, it's not always clear that they take their new skills and knowledge back to the workplace with them. To really find out whether the instruction "took," you must figure out some way to measure whether people are using what they learned on the job or otherwise in real life. Did the number of complaints taper off? Did their sales go up? Did their errors go down? Did their productivity increase? In business organizations, you can usually find some measurable indicators of individual performance improvement.

K-12 situations are a little different. We don't always ask students to actually use a lot of what they learn, except perhaps in other classes. Transfer across subject areas, such as math to science, or language arts to social studies, is notoriously difficult to achieve without special effort and collaboration on the part of groups of teachers. Jerry may know how to calculate averages in math class, but don't expect Jerry to be able to calculate averages when doing a science experiment unless someone directs attention to the process.

Level 4: Results

The most difficult level to evaluate, let alone accomplish with an instructional program, is Level 4, in which you attempt to find out whether you provided the appropriate intervention to accomplish your organizational goals. Did you get results where it counts? Are individual salespeople's higher sales resulting in a better bottom line? Are individual students' improvements in using hand-held calculators resulting in better SAT scores? Level 4 is often expressed in terms of return on investment (ROI).

You can use the results of each of these levels of evaluation in different ways. When you find problems at Level 1 (reactions), you know you need to pay attention to issues of motivation and perhaps how the materials are delivered or who is delivering them. Level 2 (learning) problems are usually analysis and design issues. Did you get the objectives right? Did you provide appropriate learning strategies? Did you provide enough of them?

Level 3 (transfer) weaknesses are more difficult to address. Sometimes more examples or more practice will improve transfer, particularly if you provide more variety of either or both. Sometimes Level 3 problems point to the need to consider alternative instructional strategies such as apprenticeships or on-the-job training. And don't forget the power of job aids to support performance on the job. Level 4 (results) problems point you back to your initial performance analysis. Did you identify the correct performance drivers for this problem or opportunity? Did you--or someone--address all those you did identify?

We once conducted an innovative and highly successful (Levels 1, 2, & 3) faculty development workshop to help university instructors use innovative teaching methods and emerging technologies. It didn't make much of a dent on the university system as a whole, however, because many other performance factors were not taken into account. Faculty didn't have access to adequate tools when they went back to their own campus offices.

Level 4 probably wouldn't have looked so good, had it been assessed, because not all the performance issues had been addressed. To address Level 4 issues, you must act on the recommendations from your performance analysis for more than just the skills and knowledge driver. Level 4 doesn't get measured very often.

Product and program evaluation tools

Educational technologists evaluate both instructional products and programs at all four levels. Computer-based instructional products often support built-in assessment tools. For instance, many web sites count the number of "hits" on different pages to see what learners are looking at. More sophisticated tracking systems record the order in which learners interact with a program, the amount of time they spend on a given screen or with particular content, when and how long they access help, examples, remediation, enrichment materials, and so forth.
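As a rough illustration of the kind of tracking described above, the following sketch tallies hits and total time per page from a small log. The record layout, the page names, and the sample values are all invented for this example; a real tracking system would define its own format.

```python
from collections import defaultdict

# Hypothetical tracking records: (learner, page, seconds spent on the page).
# The layout and the sample values are assumptions made for illustration.
LOG = [
    ("ana", "lesson1", 120),
    ("ana", "help", 30),
    ("ben", "lesson1", 90),
    ("ben", "examples", 60),
    ("ana", "examples", 45),
]


def summarize(log):
    """Return {page: (hits, total_seconds)} for a list of tracking records."""
    hits = defaultdict(int)
    seconds = defaultdict(int)
    for _learner, page, spent in log:
        hits[page] += 1
        seconds[page] += spent
    return {page: (hits[page], seconds[page]) for page in hits}


summary = summarize(LOG)
for page, (count, total) in sorted(summary.items()):
    print(f"{page}: {count} hits, {total} seconds in total")
```

A summary like this shows what learners looked at and for how long, but not why, which is where surveys and interviews come in.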

Of course, unobtrusive or other automatic data gathering strategies don't usually tell you the whole story. You may need to collect some additional information from learners using surveys, interviews, focus groups, and so forth.

The same is true of instructional programs like courses and workshops. The Level 1 participant survey by itself doesn't tell you much you can use to revise a program. You can get a better picture of a course by looking at Level 2 with individual performance tests, at Level 3 with on-the-job performance data, and at Level 4 with organizational data.

All these tools come together in an evaluation plan.

The evaluation plan

There are four steps to creating an effective evaluation plan for either a product or program: (1) identify the question you wish to answer or the aspect of the project you wish to evaluate; (2) choose or create an appropriate instrument with which to gather data to answer the questions; (3) establish a timeline for gathering data on each question, analyzing it, and writing the report; and (4) if appropriate, assign a person to take responsibility for each data-gathering, evaluation, and report-writing task.
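If you keep your plan in electronic form, the four steps above map naturally onto a simple record with four fields. The sketch below is only one possible way to organize it; the class name, the field names, and the sample entries are hypothetical and not part of any standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlanEntry:
    """One row of an evaluation plan (field names are illustrative)."""
    question: str                 # step 1: what you want to find out
    instrument: str               # step 2: how you will gather the data
    timeline: str                 # step 3: when data is gathered and reported
    person: Optional[str] = None  # step 4: who is responsible, if assigned


# Hypothetical entries, invented for illustration only.
plan = [
    PlanEntry("Did learners master objectives 1-3?", "Pre/post test",
              "first and last class session", "Instructor"),
    PlanEntry("How did learners react to the course?", "Attitude survey",
              "last class session"),
]


def unassigned(entries):
    """Return the questions that do not yet have a responsible person."""
    return [entry.question for entry in entries if entry.person is None]


print(unassigned(plan))
```

Listing the unassigned questions is a quick check that step 4 has not been overlooked for any part of the plan.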

Figure 13a shows a sample evaluation plan using a variety of tools to gather data on a variety of questions. Notice that the objectives listed in the first column are not necessarily the instructional objectives for the project. Rather, they are the questions or aspects of the product or program you wish to evaluate. Objectives 1-3, for instance, might actually correspond to instructional objectives, while objective 4 might be to find out what effect the new program has on students' GRE scores. This goes back to the question of organizational goals.




Objective         Instrument       Timeline    Person involved
Objectives 1-3    Pre/post test
Objective 4
Objectives 5-6

Figure 13a. A sample evaluation plan.

The important point here is that you use a variety of tools, or instruments, to evaluate a product or program at different levels.

An evaluation report introduces the evaluation, states the objectives, describes the instruments and methods used, reports the data, and synthesizes the results in the form of conclusions or recommendations. The report may be for funding sources, parents, administrators, your own instructional design team, or for academicians. Write it in a style that is appropriate to the audience you intend to reach.


SDSU has installed an expensive multimedia, simulation-based training program for its football players. It focuses on field goals and three point conversions. Sue Designer has been hired to evaluate the training program. Read the following scenarios and decide which of Donald Kirkpatrick's four levels of training evaluation Sue used in each.


Level 1

Level 2

Level 3

Level 4

A few weeks after the football players view the training software, Sue visits the coaches to see if the players' performance on the field has improved.


Sue develops a survey to measure how the football players felt about the simulation-based training program. Did they like the sounds, colors, video, and so on?


At the end of the football season, Sue compares last year's records to this year's records to see if there was an increase in the number of three point conversions and field goals.


In order to determine if the football players actually learned anything from the training program, Sue gives them a short test on three point conversions and field goals.



This week, you will be working on your job aid assignment.  Your reflection task is simply to think about how evaluation might benefit your development efforts.  In what ways could you use formative and summative evaluation to inform the design and development of your job aid? Remember that pilot testing is a formative evaluation approach. There is no need to post a response, but note that the required pilot testing that you will accomplish is indeed evaluation.


Overview of this section

People in action

During the development of his workshop, Roberto conducted a series of formative evaluations with his supervisor and local managers to get feedback on the relevancy of scenarios, the design of the overheads, and the page format for the print-based materials. These rapid prototype evaluations complemented the earlier formative evaluations he had conducted in the analysis and design phases. Although Roberto was fairly confident that the formative evaluations kept him on track, the vice president tasked him with finding out if the materials he created were accomplishing the desired results.

Because the training workshop was not scheduled for a few weeks, and the performance evaluations not due until February, Roberto decided to establish an evaluation plan to gather data for Levels 1, 2, and 3 summative evaluations. His plan is shown in the following table.




Evaluation level      Instrument                      Date          Person involved
Level 1--Attitudes    Store instrument
Level 2--Objective    Pre/post test                   12/1, 12/2
Level 3--Transfer     Performance evaluation forms                  Personnel managers and Roberto

The first step to accomplish the evaluation was to generate an instrument to gather data on the managers' knowledge of using the performance evaluation rubric. Roberto had identified earlier that motivation, environment, and lack of incentives contributed to managers not completing the forms, but the evaluation rubrics were new to the managers and they needed training on their use. Roberto generated a criterion-based instrument to use at the beginning of the three-day conference the 50 managers would be attending in December. A text-based scenario that described three employees was generated, and during the first day of the workshop managers were asked to complete a performance evaluation on the employees' behaviors, knowledge, and skills.

During the second day of the conference, it was Roberto's turn to present. Using the instructional plan and job aids generated earlier, Roberto conducted his training in his allotted 90 minutes. At the conclusion of his presentation Roberto administered two evaluation components. First, he presented a new text-based scenario created to parallel the one he had used for the pretest. The managers read through the scenario and completed the rubrics. Roberto would compare the data gathered with this instrument against the pretest data to determine whether the managers had indeed mastered the objective.
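To make the pretest/posttest comparison concrete, here is a minimal sketch of how such data might be summarized. The scores, the 0-10 scale, and the passing criterion are all invented for illustration; they are not Roberto's data, and a real analysis would use whatever scoring the rubric defines.

```python
from statistics import mean

# Hypothetical rubric scores (0-10) for five managers; the values, the scale,
# and the passing criterion are assumptions made for this example.
pretest = {"m1": 4, "m2": 6, "m3": 5, "m4": 7, "m5": 3}
posttest = {"m1": 8, "m2": 9, "m3": 7, "m4": 9, "m5": 6}
CRITERION = 7  # assumed minimum score that counts as mastery


def gains(pre, post):
    """Return each person's posttest score minus their pretest score."""
    return {name: post[name] - pre[name] for name in pre}


def mastery_count(scores, criterion):
    """Count how many scores meet or exceed the criterion."""
    return sum(1 for score in scores.values() if score >= criterion)


gain_by_manager = gains(pretest, posttest)
print("Mean gain:", mean(gain_by_manager.values()))
print("Met criterion before:", mastery_count(pretest, CRITERION))
print("Met criterion after:", mastery_count(posttest, CRITERION))
```

Because the instrument is criterion-based, the count of people meeting the criterion is usually more informative than the average gain alone.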

Next he used the company's standard form to gather responses from the attendees on their feelings towards his presentation (a Level 1 evaluation). This instrument asked about the style and method of presentation, and about the managers' feelings toward Roberto's instruction and instructional materials.

These two techniques would tell Roberto if the managers liked his materials, and whether or not they had learned how to do the performance evaluations, but there was an additional component that needed to be assessed. The managers were to complete the performance evaluations of their employees by February 1--but would they do any better than during the last evaluation period? To conduct the Level 3 evaluation, Roberto decided to send a stamped, pre-addressed form to the personnel managers the week before the February 1 due date, asking which department managers had submitted evaluations and on what dates. The vice president had provided Roberto with extant data from the previous period detailing who had completed the performance evaluations, and both Roberto and the vice president were eager to see whether the training had transferred from the workshop to the workplace.

Main points 

  • For instructional designers, most of their work comes during the analysis, design, and development phases, all of which are tied into evaluation.
  • There are two main types of evaluation--formative evaluation and summative evaluation.
  • Formative evaluation usually takes place in-house, and examines material while there is still time to alter it to make it better.
  • Summative evaluation is often done by an objective outsider and is used to document the effectiveness of the finished product.
  • Four levels of evaluation are often carried out. Each level provides more in-depth information, but each level is also harder to conduct.
  • Level 1 evaluations examine learners' attitudes and reactions: how well they liked the instructional material and the instructor.
  • Level 2 evaluations focus on the learning that takes place, and are based on the objectives.
  • Level 3 evaluations examine how well the content is being transferred from the instructional materials to the workplace. Do changes really happen?
  • Level 4 evaluations try to determine whether results actually occur--are the organization's goals actually being reached?
  • Each level of evaluation provides different insights into how well the intervention is working.
  • A variety of tools and techniques are often used to collect evaluation data, and are organized in the evaluation plan.
  • In creating an evaluation plan, identify the question(s) you wish to answer, choose the instrument(s) to collect the data, establish the timeline for data collection, and assign people and resources to the different evaluative task(s).
  • The final report will be written for a specific audience, and should be tailored to their needs.

Next step

The next step is for you to see how well you can take the concepts we have shared with you in this course and apply them as you begin your new profession--or add to the skills and knowledge you use in your current profession. Be sure to stay involved, for educational technology is a dynamic field that is quickly evolving. You can deepen your involvement by reading current literature and joining a local organization that supports educational technology. Staying abreast of current developments and trends will help you focus on the directions in which the field of educational technology is headed.

For more information

Kirkpatrick, D. (1998). Evaluating training programs: The four levels. Berrett-Koehler Publishers.


Page authors: Donn Ritchie, Bob Hoffman, & James Marshall. Last updated: Marshall, Spring 2006.

All rights reserved.