Program Evaluation


Assessing Impact of Professional Development Activities on Teaching and Students

Seth Aldrich, Ph.D.

Introduction

Increasingly, staff developers are asked to assess and document outcomes of their trainings. Below are several examples of how the impact of staff development on classroom outcomes can be assessed. The first methods assess teacher implementation of staff development objectives. The last methods focus on student outcomes subsequent to implementation of staff development objectives. For each example, strengths and drawbacks are discussed.  

> TOPIC QUICK LINKS

Impact on Classrooms

  • Follow-Up Questionnaire

  • Self-Rating Checklist

  • Interview Techniques

  • Direct Observation

  • Analysis of Evidence

Impact on Students

  • Analysis of existing evidence (records)

  • Analysis of existing student work

  • Pre-Post Test Scores

  • Group Comparison Test Scores

  • Formative Evaluation

 

Impact on Classroom Instruction/Procedures
Much of the professional development provided by Teacher Centers is intended to have an impact on educators.  For student outcomes to be realized, changes must occur in classroom or school variables such as curriculum and instruction, assessment, materials used, management strategies, and/or school procedures.  To understand how a program worked, we need to understand the degree to which key program objectives were implemented.  This section provides some methods for assessing the impact of professional development on educators and the instructional process.

1a. Follow-Up Questionnaires

Follow-up questionnaires come in a variety of sizes and flavors:

  • Mail in surveys sent to participants’ homes

  • E-mail surveys

  • Generic follow up surveys completed at later workshop/course sessions

  • Specific follow up surveys completed after participants have had a chance to implement course objectives

Resource Links
Follow-Up Questionnaire

Outcomes Planner

Outcomes Survey

The first example is a short, generic follow-up survey that could be mailed or e-mailed to participants or completed at a follow up session.  The generic survey illustrated below does not directly correspond to course objectives.  It is a very quick glimpse into participants’ reported follow through.  The evaluator has to rely on the respondent giving an honest response in order to get valid information concerning implementation.  Little information is gained about what factors prevented or enhanced implementation.


Example 1: Generic Questions

1) I have been able to implement major objectives taught at the workshop in a regular, sustained fashion.

1 = Strongly Disagree   2 = Disagree   3 = Agree   4 = Strongly Agree

2) I consider the changes in my teaching and/or student outcomes as a result of implementing objectives of this professional development activity important and valuable.

1 = Strongly Disagree   2 = Disagree   3 = Agree   4 = Strongly Agree

Advantages

  • Generic, 'one size fits all' method makes this follow up assessment of impact very feasible.

  • Those in charge of collecting follow up information do not have to coordinate with those teaching the inservice course to define specific observable course outcome objectives.

  • Since it is so brief, respondents may be more likely to fill it out.

Disadvantages

  • The respondent is left with a vague question to answer. Which objectives? If they were able to implement anything taught, regardless of its significance, does that deserve a rating of 'somewhat'?

  • It is unknown whether or not the staff development resulted in high implementation ratings since the respondent may have already had the skills prior to the training.

  • The evaluator has to rely on the respondent giving an honest response in order to get valid information concerning implementation. 

1b. More Specific Follow-Up Questionnaires

Follow up questionnaires can be developed that incorporate items concerning specific inservice/course objectives.  Wording might include: “As a result of this training I am able to (insert specific objective)” so that implementation as a result of the specific inservice/course is assessed.   While this type of questionnaire may take more planning, the results are much more informative.   Follow-up surveys may also include open-ended questions to get formative information from participants in order to increase effective implementation.  Hint:  The more these measures are incorporated into the process of professional development, the more they may serve to enhance effective implementation.   

Resource Links

Follow-Up Survey (Excel)

Report Template

Directions for Use of Survey and Report Template

Outcomes Planner

Outcomes Survey


Example 2: Specific Follow-up questionnaire

 

Directions:  Please rate the degree to which you were able to implement the course objectives using the following key:

1 = Not at all   3 = Inconsistently/Partly   5 = Completely/Consistently   (ratings of 2 and 4 fall in between)

 

As the result of this professional development activity I was able to:

Rating

A. Give directions to Curriculum Based Assessment according to script.   ______
B. Begin and end timing of reading according to directions. ______
C.  Score errors correctly.   ______
D. Graph results according to directions.   ______

3. Please attach any evidence of implementation or impact (e.g., procedural checklists, student data).

4. Please describe any impediments (e.g., lack of materials, support, resources, training) that need to be addressed for consistent, successful implementation to be achieved.

5. Please describe strategies that you used to make implementation easier and/or more successful.

Advantages

  • Participants understand that they are expected to implement course objectives.

  • Key objectives are made clear to participants.

  • Gaps in knowledge or follow through can be identified and addressed.

  • This survey asks participants to provide evidence of implementation that may be used to document student outcomes.

Disadvantages

  • This method requires some planning so that the survey can be developed in accordance with course objectives.

  • The evaluator has to rely on the respondent giving an honest response in order to get valid information concerning implementation.

  • The open-ended questions will take some time to collate and analyze.
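The rating portion of a specific follow-up survey can be summarized quickly so that gaps in implementation stand out. The sketch below is only an illustration: the responses, the `summarize` helper and the 3.5 cut-off for flagging an objective are all assumptions, not part of the survey itself.

```python
from statistics import mean

# Hypothetical responses: one dict per participant, keyed by objective (A-D),
# rated on the survey's 1-5 scale (1 = not at all, 5 = completely/consistently).
responses = [
    {"A": 5, "B": 4, "C": 2, "D": 3},
    {"A": 4, "B": 5, "C": 3, "D": 2},
    {"A": 5, "B": 5, "C": 2, "D": 4},
]

def summarize(responses, flag_below=3.5):
    """Return {objective: (mean rating, needs_follow_up)}.

    flag_below is an assumed cut-off; choose one that suits the course.
    """
    summary = {}
    for objective in sorted(responses[0]):
        average = mean(r[objective] for r in responses)
        summary[objective] = (round(average, 2), average < flag_below)
    return summary

summary = summarize(responses)
for objective, (average, flagged) in summary.items():
    note = "  <- possible gap in implementation" if flagged else ""
    print(f"Objective {objective}: mean rating {average}{note}")
```

Objectives with low average ratings are candidates for review in a later session. The open-ended answers still need to be read and collated by hand.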

 


 

2. Self-Rating Checklist (With Specific, Prioritized Objectives)

Self-rating checklists can be developed so that inservice course participants evaluate how proficiently they implement course objectives.  This technique is particularly useful when trying to establish new instructional behaviors/habits, and/or when new procedures have several steps that have to be remembered for successful implementation.

Example:                                                                                      

Directions:  Please rate (by circling the number) how well you are able to complete the following course objectives once you get back to your classroom using the following scale:

1 = Not at all   3 = Somewhat   5 = Completely   (ratings of 2 and 4 fall in between)

 

1. Turn the computer on. 1 2 3 4 5
2. Find Microsoft Word in the ‘Programs’ menu. 1 2 3 4 5
3. Open a new Microsoft Word file. 1 2 3 4 5
4. Use the following word processing skills:
  Underline 1 2 3 4 5
  Bold 1 2 3 4 5
  Copy Text 1 2 3 4 5
  Cut text 1 2 3 4 5
  Paste text  1 2 3 4 5
5. Print the file. 1 2 3 4 5
6. Save the file in a folder you created and named. 1 2 3 4 5

 


Advantages

  • The instructor can see exactly what the course participants have learned. This can be used to provide additional instruction in the next class if needed, and/or to improve the course curriculum so that items frequently rated as unlearned can be taught more clearly.

  • The outline can prove to be instructional for participants when implementation involves multi-step tasks that are difficult to remember.

  • Course objectives are prioritized by the instructor so that high ratings of implementation are more meaningful.

  • Participants are more likely to follow through when procedures are provided and they are asked to conduct self-assessment.

Disadvantages

  • Those in charge of collecting follow up information may have to coordinate with those teaching the inservice course to define specific observable course outcome objectives.

  • Participants may have had many of the skills on the checklist prior to the training.  (Course instructors could give the checklist as a needs assessment prior to the training.)

  • The task analysis may be somewhat cumbersome for instructors when course objectives are extremely complex (on the other hand, breaking it down may be helpful to the participants).

  • If the checklist is too long, some participants may not complete it.

  • The evaluator has to rely on the respondent giving an honest response in order to get valid information concerning implementation.


 

3. Interview Techniques

Interview techniques used to evaluate professional development range from impromptu conversations in which participants voice their satisfaction, concerns and impediments to implementation, to focus groups that entail intense data collection, analysis and reporting.  Interview techniques can be conversational (open format), a set of carefully prepared questions (structured format) or prepared questions with spontaneous follow up (semi-structured).  Interviews may be conducted with individuals or with groups.  The structure and formality of how interviews are conducted largely depend on the stakes of getting accurate, balanced feedback, evaluation resources and the evaluation audience.  While spontaneous feedback can provide useful insights, higher stakes evaluations require more formal data collection.  (For an illustration of this check out the Evaluation Funnel.)

Resource Links
Focus Groups

Evaluation Funnel

Advantages

  • A major advantage of interview techniques is the dynamic process between the interviewer and person being interviewed.  Different points can be explored and clarified through follow up questions.

  • Important issues that would not have been included on a questionnaire may be revealed during an interview.

  • Questionnaire items can be misinterpreted or completed hastily.  Interviews may provide a comfortable forum during which topics are clearly and thoroughly explored.

  • When the interview is conducted by the instructor, clarified points can prove to be instructional for both parties.

Disadvantages

  • Interview quality may be largely dependent on the skill of the interviewer in eliciting and recording honest, clear, balanced information.  Group interviews may be dominated by one or two people, and therefore, important views may be unheard.

  • In some situations, those being interviewed may give socially desirable responses (e.g., indicating that they will implement training objectives when they have their doubts).

  • Interview information can be very time-consuming to collect, organize, analyze and report concisely.



4. Direct Observation of Teacher Implementation

Staff developers or others may observe whether teachers are implementing course objectives as intended subsequent to training. Observations can be structured or informal in nature. Structured observations yield more reliable information: different observers would report seeing similar things, and what is observed would be more consistent across observations.  Direct observations can be conducted by inservice/course instructors, evaluators (e.g., Teacher Center staff) and colleagues serving as coaches.  Below is an example of a more structured observation form:

Example

Spelling intervention: Cover Copy Compare

(The observer observes the intervention being conducted and checks the box corresponding with quality of implementation.)

 

Teacher ________                    Date of observation ________

1. Teacher underlines misspelled words.
   [ ] Rarely/never   [ ] Inconsistent   [ ] Consistent

2. Teacher prompts/helps student to find the correct spelling for misspelled words.
   [ ] Rarely/never   [ ] Inconsistent   [ ] Consistent

3. Correct spelling is written next to the misspelled word.
   [ ] Rarely/never   [ ] Inconsistent   [ ] Consistent

4. Teacher prompts student to look at and remember the correct spelling.
   [ ] Rarely/never   [ ] Inconsistent   [ ] Consistent

5. Correctly spelled word is covered up with hand or card.
   [ ] Rarely/never   [ ] Inconsistent   [ ] Consistent

6. Student ‘copies’ correct spelling from memory.
   [ ] Rarely/never   [ ] Inconsistent   [ ] Consistent

7. Correctly spelled word is uncovered and the student compares what he/she wrote with the correct spelling.
   [ ] Rarely/never   [ ] Inconsistent   [ ] Consistent

8. Repeat steps as needed.
   [ ] Rarely/never   [ ] Inconsistent   [ ] Consistent


Advantages

  • Direct observations provide direct access to what is actually happening and avoid pitfalls of questionnaire information such as inaccurate reporting.

  • Specific descriptors help observers to make more reliable observations.

  • Key objectives are prioritized for evaluation of important course objectives.

  • Observers who are familiar with program objectives can witness the program being executed and can identify where problems are occurring and provide needed consultation.

  • Observation can be conducted as part of a collegial coaching support.

Disadvantages

  • Direct observation can be time consuming for staff developers to conduct when there are several course participants involved. For many staff development situations direct observation is not feasible. One way of getting around this problem is to have course participants schedule a time to observe each other and report back to the instructor.

  • People being observed may feel threatened and may feel that the method is too obtrusive.

  • People being observed have been known to put on a show for observers or they could be having a bad day. Thus, the observer may not see what actually happens on a day to day basis.

  • Staff development that is intended to address classroom situations that happen only occasionally (e.g., physical restraint) does not lend itself to direct observation. The observer will most likely not be present when the course objectives are being implemented.
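Whether a structured form is producing reliable observations can be checked by having two observers rate the same lesson and counting how often they agree. The following is a minimal sketch of simple percent agreement; the ratings and the `percent_agreement` helper are hypothetical, and percent agreement is only one of several possible reliability checks.

```python
# Hypothetical ratings by two observers watching the same lesson, one rating
# per checklist step: "R" = rarely/never, "I" = inconsistent, "C" = consistent.
observer_1 = ["C", "C", "I", "C", "R", "C", "C", "I"]
observer_2 = ["C", "C", "I", "I", "R", "C", "C", "C"]

def percent_agreement(first, second):
    """Percentage of steps on which the two observers gave the same rating."""
    if len(first) != len(second):
        raise ValueError("Both observers must rate the same steps")
    matches = sum(a == b for a, b in zip(first, second))
    return 100 * matches / len(first)

agreement = percent_agreement(observer_1, observer_2)
print(f"Observers agreed on {agreement:.0f}% of the steps")
```

Low agreement suggests that the step descriptors need to be made more specific, or that observers need more practice with the form.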


 

5. Analysis of evidence

Sometimes there is a rich trail of evidence such as school records, grades, website visits or attendance patterns that reflect implementation of professional development.  Sometimes implementation of professional development requires a sort of paper trail as in the first example.

Resource Link
Classroom Behavior Report Cards (CBRCs)

Example 1: Workshop on Classroom Behavior Report Cards

After a workshop on Classroom Behavior Report Cards, participants were asked to bring in examples of CBRCs they had developed for students.  The instructor developed a rubric to rate aspects of the behavioral reports and participants were able to share strengths and weaknesses of their work.

Example 2: Using website ‘hits’

Teachers in a district are taught how to incorporate a Blackboard website into their Global History curriculum.  A counter on the website is able to document the number of students from each school in the district who have accessed the website, sections visited and the amount of time spent in each section.

Advantages

  • Analysis of records capitalizes on what already exists.  It may require little or no time to develop a measure and have participants complete it.

  • Sometimes analysis of records gets at what is really happening as a result of professional development as opposed to someone’s reporting of it.

  • Sometimes evidence points to gaps in knowledge, use or follow through that can be addressed in later training sessions.

Disadvantages

  • In some cases records are simply a reflection of what is happening (e.g., changes in the number of referrals or suspensions).  Further information is needed to understand what is really going on.

  • If people have the opportunity to bring in only their best work, opportunities for additional instruction that may be beneficial are lost.  

Impact on Student Outcomes

Ultimately, the charge of professional development in education is student improvement.  In some cases professional development would not be expected to result in measurable student outcomes (e.g., professional development to increase awareness).  Increasingly, Teacher Centers are expected to have a comprehensive array of professional development activities, including those that would result in observable student improvement.  Assessing these outcomes is typically seen as a challenge for Teacher Centers.  The following strategies are presented to make assessment of student outcomes feasible and at the same time useful for improving programs.   Proper planning and integration of evaluation into professional development is very important in order to accurately and efficiently evaluate impact of professional development on students.

1. Analysis of existing evidence (e.g., records)

As reported in the section above, analysis of records varies in how closely it reflects true student outcomes.  Existing information should not be overlooked however as a rich data source.  Examples of existing information that may reflect professional development outcomes are:

  • Student grades

  • Disciplinary referrals/suspensions

  • Attendance patterns

  • Student visit patterns on websites


Example: Decline in student discipline referral rates for fighting

A school district identified fighting among its middle school students as a major problem.   All staff were trained in de-escalation techniques and mediation strategies.  In addition staff in the school were allocated to facilitate mediation meetings among students.  Like most schools, this middle school had kept close records on the number of fights on school grounds.  The graph below shows a weekly tally of fights before and during implementation of the program.
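A weekly tally of this kind can also be summarized numerically. The sketch below compares the average number of fights per week before and during the program. The counts are invented for illustration and are not the district's data.

```python
from statistics import mean

# Hypothetical weekly counts of fights on school grounds.
before = [9, 11, 8, 10, 12]      # weeks before the program began
during = [7, 6, 5, 4, 4, 3]      # weeks during implementation

mean_before = mean(before)
mean_during = mean(during)

# Percent change in the weekly average relative to the pre-program weeks.
pct_change = 100 * (mean_during - mean_before) / mean_before

print(f"Average fights per week before: {mean_before:.1f}")
print(f"Average fights per week during: {mean_during:.1f}")
print(f"Change: {pct_change:.1f}%")
```

A drop in the weekly average is consistent with the program working, but as noted above, records alone do not explain why the change occurred.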

2. Analysis of existing student work

A challenge for evaluators is finding student work that is easily organized, analyzed and communicated in a reliable fashion.  For example, while student journals or videotaped performances are rich in information, it is often difficult to capture these products in a succinct or quantitative manner.  Clearly, for some audiences and purposes (e.g., use of videotape at a Board of Education presentation to demonstrate student outcomes) a variety of media are relevant and effective demonstrations of outcome.

Example: Journals

An inservice course was conducted to improve writing skills of fourth grade students.  Teachers were asked to bring in sample journals that would reflect improvements in students’ written expression.  The instructor provided a rubric so that journals could be evaluated according to a relatively objective standard. 

The analysis of journals proved to be a helpful part of the course.  Participants commented that it improved their ability to analyze writing pieces in order to identify instructional needs.   While the presenter used the student work as a reflection of participant understanding, she concluded that the journal ratings would not serve adequately to qualify the course as a success.  First, teachers rated their own students’ writing pieces.  This created a potential bias.  Secondly, there was no way of knowing whether students would have made the observed improvement without the course, since no baseline was collected and no comparison group was assessed.  Finally, because of time constraints, teachers brought in only selected work.  There was no indication of how representative the submitted writings were of other students’ work, how much time was taken to make journal entries, or how much assistance and prompting was given.

Advantages

  • Student work is directly related to the curriculum and may be sensitive to changes in instruction.

  • Student work may foster productive conversation about instructional needs and strategies used.

Disadvantages

  • Student work may be difficult and time consuming to analyze in an objective, reliable manner.

  • Student work is difficult to quantify.

  • Results of student work may be difficult to communicate succinctly to audiences.


3. Pre-Post Test Scores

Many educators use group or individually administered achievement tests (e.g., statewide tests, individual norm-referenced tests of achievement) to determine student gains in response to a particular program.  Differences between scores on the initial and later assessments are used to judge the success of a program.  Standardized testing typically takes place once per year to investigate issues such as school or program accountability.  Myriad factors contribute to rises and drops in test scores, and it is difficult to attribute changes in scores to one particular school variable, program or initiative.

Example: Pre-Post testing

The example below shows test scores before and after program implementation. 

Advantages

  • Standardized tests are prepared by experts in evaluation.  They typically have good measurement qualities (reliability, validity) and assess a broad range of skills.  It is the evaluator’s responsibility to be aware of test quality. 

  • Standardized tests are easy to score and report results.

  • Standardized test scores generally have high credibility among a variety of audiences.

Disadvantages

  • Standardized tests may be only vaguely related to professional development objectives. 

  • Standardized tests may not be sensitive to change over relatively short periods of time. In fact, information such as changes in grade equivalents can be very misleading, since a change of just one raw score point can suggest a change of one half of a grade level.

  • Because of the lengthy period between pre and post testing, changes in scores may be due to factors unrelated to the program (e.g., other instructional factors, measurement error).

  • One reason for below average scores to rise and above average scores to fall is a statistical phenomenon called 'regression to the mean'.  Rises and drops in scores using a pre-post method may be easily misinterpreted.
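Regression to the mean can be demonstrated with a small simulation. In the sketch below, every simulated student has a stable true score and each test adds independent measurement error; no program is applied between the two tests. All of the numbers (score scale, error size, group size) are assumptions chosen for illustration.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Stable "true" scores plus independent measurement error on each occasion.
# Nothing changes between the pretest and the posttest.
true_scores = [rng.gauss(100, 10) for _ in range(2000)]
pre = [t + rng.gauss(0, 10) for t in true_scores]
post = [t + rng.gauss(0, 10) for t in true_scores]

# Order students by pretest score and take the lowest and highest quarters.
pairs = sorted(zip(pre, post))
quarter = len(pairs) // 4
lowest, highest = pairs[:quarter], pairs[-quarter:]

def mean_gain(group):
    """Average posttest-minus-pretest difference for a group of (pre, post)."""
    return mean(post_score - pre_score for pre_score, post_score in group)

low_gain = mean_gain(lowest)
high_gain = mean_gain(highest)
print(f"Lowest-scoring quarter, average change:  {low_gain:+.1f}")
print(f"Highest-scoring quarter, average change: {high_gain:+.1f}")
```

The lowest-scoring group appears to improve and the highest-scoring group appears to decline, even though no intervention took place. A program offered only to students with low pretest scores would show this apparent gain whether or not it had any effect.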

4. Group Comparison Test Scores

Another way of gauging student improvement in response to a program is to compare a group receiving the program with a group not receiving it.  The evaluator has to either ensure that groups are ‘matched’ on important variables (e.g., same age, curriculum, intelligence, SES), or randomly assign members to groups.  This can be challenging in the real world.

Example: Group comparison of test scores  

Advantage

  • Experimental design helps to support results as being related to program.  It is a frequently used approach for high quality research and evaluation.

Disadvantages

  • Selecting groups that receive and do not receive a given program may be problematic. 

  • It is usually difficult if not impossible to say that different groups are equal in every respect except for participating in the program.

  • Outcomes measured by commercially available tests often have a poor overlap with outcomes that are targeted by a given program. This lack of 'curriculum match' between the test and what is taught can result in the measures being insensitive to real change that has occurred.

  • Commercially available tests are time consuming and often provide limited information about student needs.
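When two groups have been tested, the difference between them is often expressed as a standardized effect size. The sketch below computes Cohen's d, a widely used effect size that this guide does not itself prescribe; the scores are hypothetical and the `cohens_d` helper is written here only for illustration.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical post-test scores for a group that received the program and a
# matched comparison group that did not.
program = [78, 85, 82, 90, 74, 88]
comparison = [72, 80, 75, 83, 70, 79]

def cohens_d(group_a, group_b):
    """Difference in group means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)

d = cohens_d(program, comparison)
print(f"Mean difference: {mean(program) - mean(comparison):.1f} points")
print(f"Effect size (Cohen's d): {d:.2f}")
```

An effect size does nothing to address the disadvantages listed above: if the groups were not comparable to begin with, the difference cannot be attributed to the program.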



5. Formative Evaluation: Frequent, Ongoing Assessment of Student Skills Before and During Implementation

There are many advantages to collecting ongoing data that is closely tied to professional development objectives.  As with other assessment methods, the quality of ongoing data collection ranges from very unstructured and/or qualitative to structured and/or quantitative (see Evaluation Funnel).  Some ongoing assessment examples include: informal teacher observation, Running Records, Mad Minutes (math), Curriculum Based Assessment (CBA) and Classroom Behavior Report Cards (CBRC).  The latter two will be illustrated here because they are feasible, have the measurement qualities for high stakes evaluation, and serve a number of purposes.
> Resource Links
CBA Manual

CBRC Manual

Curriculum Based Assessment (CBA)

Curriculum Based Assessment (CBA) is a method of systematically assessing students’ basic academic skills in reading, mathematics, spelling and written expression.   The instructor gives the student brief, timed samples, or “probes”, made up of academic material usually taken from the student’s curriculum.  

CBA in Reading may consist of letter/letter sound reading, word lists and or passage reading, depending on the student’s developmental level or instructional goals.  Students are asked to read from letter lists, word lists or reading passages called “probes” for one minute.  Students who are beyond an emergent level typically read three passages of text per grade level, and the median, or middle score is recorded.  Multiple passages within a single book level are prepared so that ongoing assessment can take place without practice effects.

CBA in Writing consists of a three-minute writing sample with a story starter.  There are many scoring options, including counting the number and percentage of correctly spelled words written in three minutes.  Qualitative scoring options are provided in the CBA Manual.

CBA Math uses two-minute calculation probes.   Probes, which may be comprised of a single skill or “mixed skills”, are selected to assess key skills from the student’s current or imminent instructional program.
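Two of the scoring rules described above, recording the median of three reading passages and reporting the number and percentage of correctly spelled words, can be sketched as follows. The function names and sample scores are hypothetical, and the sketch does not replace the scoring directions in the CBA Manual.

```python
from statistics import median

def reading_score(words_correct_per_passage):
    """Median (middle) words-correct score across the passages read."""
    return median(words_correct_per_passage)

def spelling_summary(total_words, correctly_spelled):
    """Return (number correct, percent correct) for a writing sample."""
    percent = 100 * correctly_spelled / total_words if total_words else 0.0
    return correctly_spelled, percent

# Hypothetical student: three one-minute passages, one three-minute sample.
score = reading_score([52, 61, 47])
count, percent = spelling_summary(total_words=40, correctly_spelled=34)

print(f"Reading: {score} words correct per minute (median of three passages)")
print(f"Writing: {count} correctly spelled words ({percent:.0f}%)")
```

Using the median rather than the average keeps one unusually easy or difficult passage from distorting the recorded score.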

Example: Reading improvement after peer tutoring program

The graph below illustrates a weekly timed (one minute) assessment of words correctly read in a second grade textbook before and during implementation of a program designed to improve reading fluency.

Advantages

  • Procedures such as Curriculum Based Assessment (CBA) have multiple uses including identifying skills that need teaching and assessing the effectiveness of specific interventions. Curriculum Based Assessment has sound psychometric qualities and has been used for high stakes decisions (e.g., program evaluation and student eligibility for special education services).

  • Multiple assessments before the program/intervention begins provide clearer assessment of baseline performance, and therefore clearer assessment of outcomes due to the effects of the program.

  • Many of these measures take very little time to conduct compared to other assessments.

Disadvantages

  • Standardized procedures such as CBA require training for proper administration and interpretation.

  • Although each CBA administration is brief, it must be administered and scored every week.   A longer term commitment is necessary.

  • CBA may not be familiar to some evaluation audiences. This may affect the credibility of the results.
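Because CBA yields a score every week, progress can be summarized as a rate of growth and compared across the baseline and program phases. The sketch below fits a simple least-squares trend line to each phase; the weekly scores are hypothetical and the `weekly_growth` helper is an illustration rather than a prescribed CBA procedure.

```python
def weekly_growth(scores):
    """Least-squares slope of scores against week number (0, 1, 2, ...)."""
    n = len(scores)
    if n < 2:
        raise ValueError("At least two weekly scores are needed")
    mean_week = (n - 1) / 2
    mean_score = sum(scores) / n
    numerator = sum(
        (week - mean_week) * (score - mean_score)
        for week, score in enumerate(scores)
    )
    denominator = sum((week - mean_week) ** 2 for week in range(n))
    return numerator / denominator

# Hypothetical words read correctly per minute, one score per week.
baseline = [30, 31, 30, 32]               # before the program
intervention = [33, 36, 38, 41, 44, 46]   # during the program

baseline_growth = weekly_growth(baseline)
intervention_growth = weekly_growth(intervention)
print(f"Baseline growth:     {baseline_growth:.2f} words per week")
print(f"Intervention growth: {intervention_growth:.2f} words per week")
```

A clearly steeper trend during the program than during baseline is the kind of evidence that multiple baseline assessments make possible.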


Behavioral Monitoring - Classroom Behavior Report Cards

Some professional development is used to improve student behavior.  Classroom Behavior Report Cards involve daily ratings of specific student behaviors that are of prioritized concern. Ratings are typically in the form of a Likert-type scale and/or a frequency count of behaviors. They can be completed once or several times per day.  They can be part of a behavior plan, can be used to facilitate regular communication with parents and parental involvement, and have been used successfully for high stakes decisions such as medication evaluations.

Data from one or several students combined can demonstrate program effectiveness.  The ongoing nature of CBRC also allows for intervention/program adjustments until desired outcomes are achieved.

Example: Classroom Behavior Report Cards (CBRC)

Below is a Behavior Report Card rating three behaviors once per day.

Example of data graphed from a Classroom Behavior Report Card

The graph below illustrates how data from a Classroom Behavior Report Card can be graphed.  It shows that John’s ratings improved significantly after a behavior plan was put into place and that the gains were sustained when he began to rate his own behaviors (with teacher monitoring of course!).

Advantages of using CBRC

  • CBRCs take very little time to complete and are therefore quite feasible to use.

  • They can assess student behavior every day, throughout the day or can be used to assess behavior during specific times of the day when most problematic.

  • CBRCs are able to assess less frequently occurring behaviors that may not be witnessed by an outside observer (e.g., serious behaviors that may occur once or twice per week).

  • CBRCs may be incorporated into a behavior contract, or a student self-monitoring intervention.

  • CBRCs allow for frequent, ongoing assessment of intervention outcomes.

  • CBRC can be generic or tailored to specific difficulties.

Disadvantage

  • CBRCs depend on the respondent's memory of what happened and they can be subjective.  What is expected of a student to earn a rating of 8 may vary from teacher to teacher (rubrics can help to reduce this variation).
