Program Evaluation


Program Evaluation Planning and Design

A Guide for Teacher Centers

Seth Aldrich, Ph.D.

This overview of program evaluation design is intended to assist Teacher Centers in planning and conducting evaluation of their professional development activities. The overview will contain references to materials available on the programevaluation.org website.   

Focusing the evaluation

Careful planning is essential to any program evaluation.  While evaluations often do not unfold exactly as planned, it is essential to identify the purpose of the program, key elements that help the program to be successful, what success would look/sound like, and what it is that key audiences need and want to know. Understanding these issues provides focus to the evaluation. Focusing the evaluation helps the evaluator identify the most crucial questions and how those questions can be realistically answered given the context of the program and resources available.  The Evaluation Planner can help evaluators to develop an evaluation plan.  There are many benefits to carefully planned evaluation:

  • Thoughtful questions yield useful results.
  • Assessments can be embedded in programs allowing for more depth of information.
  • Responsibilities can be delegated to participants, trainers, and other stakeholders.
  • Coordination of information collection procedures makes evaluation much more efficient.
  • Planning prevents duplicated efforts.  Existing information or data sources can be identified and used.  For example, a school district may be already collecting information relevant to program outcomes.

Purpose of the evaluation

The first step in evaluation is to have a clear understanding of why the evaluation is being conducted in the first place. Is it to satisfy a grant requirement? To provide information to continually improve Center functioning? To decide which programs to continue and which to cut? Certainly, many evaluations have multiple purposes and audiences, but these have to be clearly identified and prioritized. Once the purpose of the evaluation is understood, the aspects of the program to be investigated and those who might see, use, and/or be affected by the evaluation results (audiences) become clearer.

As the purpose of the evaluation becomes clear, evaluators list ‘objects’ or ‘processes’ to be evaluated. For example, an object to be evaluated might be inservices/courses provided by the Teacher Center, while a process might be how inservices/courses are selected and advertised. Since it is unlikely that all activities and processes would be evaluated, it is important to prioritize. At first, however, brainstorming all potential objects and processes to be evaluated may be helpful. Keeping the Teacher Center mission statement, the purpose of the evaluation, the stakes of outcomes, and the potential evaluation audience in mind will help to prioritize throughout the planning process.


Identifying audiences and stakeholders

Now that you have a list of Center programs and/or activities that might be evaluated, consider who is affected by the program (stakeholders) and who might receive and/or use information resulting from the evaluation (audiences). Below is a list of potential stakeholders and audiences for a Teacher Center Evaluation.

Stakeholders (who is affected by the program)    Audience (sees and/or uses evaluation information)
---------------------------------------------    --------------------------------------------------
Students                                         Center Director/Policy Board
Parents                                          State Education Department
Teachers                                         Funders
Administrators                                   Administrators
Community members                                Potential Advocates
                                                 Inservice/course Instructors

It is important to identify audiences and stakeholders early on because they will shape what questions are asked, the rigor needed to support results, and the way in which results will be communicated.

Level of impact

Professional development can have an impact on participants in a number of ways including: building awareness, increasing knowledge and skills of participants, and promoting changes that result in positive student outcomes.  Impact may be seen as a hierarchy beginning with the goal of increasing participant awareness and culminating at the top of the hierarchy with the objective of promoting positive student outcomes:

  • Increased participant knowledge/awareness of issues related to the training
  • Participant understanding of how relevant the staff development is to their professional practice
  • Impact on participants' behavior, methods, and materials used in the classroom
  • Positive impact on student/classroom outcomes

Training objectives should be clear from the outset of any substantial staff development effort. While it may be appropriate for some staff development efforts to raise awareness, increase knowledge, or inspire educators, a comprehensive staff development plan will include trainings that result in long-term positive impact for a wide range of educators and students. A program whose sole intent is building participant awareness would yield weak outcome data if student impact were assessed. Likewise, an evaluation consisting only of participant satisfaction ratings would sell a program short if the program were successful at producing measurable student outcomes. Understanding intended impact is important in prioritizing and designing an evaluation plan. The Professional Development Outcomes Planner and Survey are designed to assist evaluators in determining intended impact and evaluating outcomes at a variety of levels.


Determining the ‘stakes’ or importance of programs and their outcomes

Several considerations are weighed when determining stakes of programs and their outcomes including:

  • Program cost - Programs that are expensive need to be proven effective and, if they are not, improved or abandoned.
  • Importance of outcomes (e.g., implications of program failure) - Certain programs have serious implications for failure.  Participants of CPR courses are tested for proficiency because outcomes may mean the difference between life and death.  A program intended to inservice teachers in assessing statewide testing can also have serious implications if participants are poorly trained.
  • Perceived importance of program/outcomes by stakeholders and audiences – In some cases the reason a program is being evaluated has to do with a request by an audience (e.g., a funding source).  It is important to know the evaluation information these important audiences are looking for. 

When outcomes are very important, professional developers and evaluators need to make sure that the program is effective in achieving the intended results.  Therefore, high quality, defensible measures are selected.  For example, you may want to observe someone scoring ELA assessments rather than ask them if they know how to score accurately.

Formative versus Summative Evaluations

Whether the evaluation is being conducted in order to determine success or failure (summative evaluation), or to make improvements through adjustments based on ongoing feedback (formative evaluation), has a significant impact on the measures used and who receives the information.  Below are some examples of formative and summative questions that might be included in a Center Evaluation:

Summative: Should we continue a particular inservice/course based on attendance and satisfactory participant ratings?
Formative: Based on participant feedback, what might a presenter do to improve her inservice course?

Summative: How many people are using the Resource Library?
Formative: What might increase teachers’ use of the Resource Library?

Summative: Did participants in the inservice/course implement key objectives as taught?
Formative: What were obstacles to teacher implementation that should be addressed to make the program more successful?

Summative: Did students make significant gains as a direct result of the program?
Formative: How could the program be improved to optimize student outcomes?


Understanding the evaluation ‘context’

Contextual factors influence how the evaluation is conducted and how it may be interpreted.  These factors must be weighed in the planning stage:

  • What time and resource constraints do we have to conduct the evaluation?
  • Are there hidden (or not so hidden) political agendas associated with the program?
  • Has the program had the opportunity to be effective?  (Don’t kill something before it has had the chance to show itself as effective.)
  • Will the evaluation results be challenged?  (If so, the supporting data should be very strong and clearly communicated.)
  • What is the historical context of the program?
  • How could the context affect information collection?

Answering these questions helps the evaluator to decide whether or not to evaluate a particular program. He or she may also choose to use an external evaluator (to avoid a conflict of interest or accusation of bias). In some contexts (e.g., a program that is still in development) the evaluator may choose to conduct a formative as opposed to a summative evaluation.

Prioritizing what to evaluate

Now that evaluation targets, audiences, stakeholders, and the context are understood, the evaluator can begin to prioritize. These priorities may change as the evaluator takes a realistic look at the resources available to collect and analyze information.

Generating questions

Meaningful questions are the heart of the evaluation.  Too few and the evaluation is not comprehensive enough.  Too many questions and the quality of the information may be compromised or resources stretched in a way that could hurt the program.

Qualities of good questions include:

  • Relevant to the purpose of the evaluation and program goals so that they are useful for important decisions;
  • Important to the identified audience(s);
  • Comprehensive enough to provide adequate information about what is being evaluated;
  • Constructed in ways that keep the information balanced and not biased;
  • Answerable with realistic means and at a reasonable cost.

Questions should be framed in observable, unambiguous terms.  It is also important to frame questions with a feasible assessment in mind.

Once major questions have been identified, sub-questions are generated that provide other important information with relatively low expenditure of resources.  It is important to remember that once a question is asked, the information has to be collated, analyzed accurately and communicated (avoid collecting data and then not using it).  For example, while direct observation and open-ended questions may yield important information, five hundred observations/responses will take some time to collect and analyze in a meaningful way.

Here are a few tips for asking questions in ways that resulting information can be organized, distilled and communicated:

  • Collect only information that you can use
  • Data is only useable if you can make sense of it and communicate it to others
  • Beware of opportunities to hear only what you want to hear
  • Remember that you may be the one that has to crunch a mountain of data; when possible, use multiple choice/selection responses.
  • Use tried and true measures when they can provide meaningful information.
  • When developing your own questions, run them by a few people first to make sure that they are clear and understood in the way you intend them to be. 
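
As an illustration of why multiple choice/selection responses are easier to "crunch" than open-ended ones, the following is a minimal sketch in Python that tallies the answers to a single survey item. The response labels and data are hypothetical and are not taken from any tool on this site.

```python
from collections import Counter

# Hypothetical responses to one multiple-choice survey item
# (labels and data are illustrative only, not from a real survey).
responses = [
    "Agree", "Strongly agree", "Agree", "Disagree",
    "Agree", "Strongly agree", "Neutral", "Agree",
]

def summarize(responses):
    """Return (choice, count, percent) tuples, most frequent choice first."""
    counts = Counter(responses)
    total = len(responses)
    return [
        (choice, n, round(100 * n / total, 1))
        for choice, n in counts.most_common()
    ]

for choice, n, pct in summarize(responses):
    print(f"{choice}: {n} ({pct}%)")
```

A summary like this can be produced for five hundred responses as easily as for eight, which is not true of open-ended comments that must be read and coded by hand.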


Existing data or sources of information

Identifying existing information or information sources is one way of using resources efficiently.  Some examples of this are:

  • Use of assessment data being collected by another organization.  For example, student achievement data collected by a school district may be related to the program being evaluated.  In this case it is important to check that the information being used is relevant and sensitive enough to be a fair assessment of program outcomes.
  • Archival data such as attendance, school suspensions, CSE referral rates, documented use of materials, and website hit rates take little effort to collect and may be related to program outcomes.
  • Intensive evaluation efforts conducted during a previous year may not have to be replicated every year.  Instead, focus on another program.
  • Programs that have a very strong base in research may require less evaluation of efficacy than unproven programs.  However, it is important to have stringent criteria for what is ‘evidence based’.  Many programs purport to be research based, but do not have strong backing.  It may also be important to evaluate how the program is conducted locally.

Very organized evaluators may collaborate with organizations to share evaluation efforts.

Determine whether or not aspects of the evaluation are feasible and/or appropriate

Considering available resources (e.g., money, time, personnel) and the context of the evaluation, is it possible to answer these questions meaningfully and in a way that the results will be used? This is a reality check before beginning to seriously plot out the procedures, resources, measures, and calendar for the evaluation. If the answer is yes, proceed to design the evaluation. If the answer is no, the evaluator may need to reconsider the question or, in some cases, opt not to conduct an evaluation at all.

Design the Evaluation

Once questions are generated, begin to identify measures for answering them.  As measures are identified or created, the questions may be altered to ‘fit’ the assessment.  Below are different evaluation processes.  Consider what type of design fits your needs.

Fixed versus emerging - Some evaluations are fixed.  That is, all procedures and measures are identified up front and the evaluation goes according to its plan.  Other evaluations are emerging.  Emerging evaluations are more flexible.  As information is collected, new questions may be identified and incorporated into the evaluation.  Many evaluations use a combination of both.  You need to have a plan, but it is good to address emerging information.

Formative versus summative – As discussed previously, some evaluation questions are designed to make a final conclusion (summative), while other evaluations obtain ongoing feedback in an effort to make program adjustments (formative).  The goal of summative evaluation is to prove or disprove programs while the goal of formative evaluation is to improve programs. 

Experimental versus natural inquiry – Evaluations involving experimental design collect information on those receiving a program and those not receiving a program (or on the same person before, during, and/or after a program) in order to show that it has a significant intended impact on participants.  Natural inquiry simply investigates what happens when a program occurs.
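
To make the experimental comparison concrete, here is a minimal sketch in Python that compares the average pre/post gain of program participants with that of a comparison group. All scores are hypothetical, and the function name mean_gain is an assumption made for this illustration.

```python
from statistics import mean

# Hypothetical (pre, post) assessment scores; illustrative only.
participants = [(60, 72), (55, 70), (70, 78), (65, 75)]  # received the program
comparison = [(62, 65), (58, 60), (68, 71), (64, 66)]    # did not receive it

def mean_gain(pairs):
    """Average post-minus-pre change for a group of (pre, post) scores."""
    return mean(post - pre for pre, post in pairs)

# How much more the participants gained than the comparison group.
difference = mean_gain(participants) - mean_gain(comparison)

print(f"Participant gain: {mean_gain(participants)}")
print(f"Comparison gain:  {mean_gain(comparison)}")
print(f"Difference:       {difference}")
```

A difference in average gains is only a starting point. Whether it counts as a significant impact depends on group size, how the groups were formed, and an appropriate statistical test, none of which this sketch addresses.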

Focus on program outcomes versus the process involved in the program – You can evaluate the quality of the widgets (outcome), efficiency and cost effectiveness of the process in which they are made, or both. 


Develop an Evaluation Calendar

Once the above planning and design information is collected, begin to organize it on a timeline.  Use a calendar to determine how the evaluation will unfold.  Don’t forget to include organizational activities such as any assessment training that might be needed, organizational meetings/contacts, data analysis and sharing procedures.  When possible, plan to contact audiences and get them involved.  This way, they are more likely to take ownership of the evaluation results and utilize them.  Identify ways that evaluation tasks can be delegated to appropriate parties (e.g., inservice/course instructors).  The evaluation calendar should be another reality check to make sure that the evaluation plan is a feasible one.
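
The calendar can be kept in any format. As one possible sketch, the Python below places a handful of evaluation tasks on a timeline and flags any that fall after a reporting deadline, which is one simple form of reality check. The tasks, owners, dates, and deadline are all hypothetical.

```python
from datetime import date

# Hypothetical tasks as (task, responsible party, planned date).
tasks = [
    ("Analyze survey data", "Evaluator", date(2025, 5, 15)),
    ("Train instructors on the observation form", "Center Director", date(2025, 1, 20)),
    ("Collect participant surveys", "Course instructors", date(2025, 3, 10)),
    ("Share results with the Policy Board", "Evaluator", date(2025, 6, 30)),
]

# Hypothetical date by which results must be reported.
report_due = date(2025, 6, 15)

def build_calendar(tasks):
    """Return the tasks ordered by planned date, earliest first."""
    return sorted(tasks, key=lambda task: task[2])

def late_tasks(tasks, deadline):
    """Return names of tasks planned after the deadline."""
    return [name for name, _, planned in tasks if planned > deadline]

for name, owner, planned in build_calendar(tasks):
    print(f"{planned.isoformat()}  {name} ({owner})")

print("Planned after the deadline:", late_tasks(tasks, report_due))
```

Listing a responsible party next to each task also makes it easier to see where work can be delegated.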

 Tools


Evaluation Planner

Evaluation Planner Spreadsheet (xls)

Evaluation Focus Spreadsheet