The EDifier

December 10, 2015

The future of using student achievement measures to evaluate teachers

With the president signing the new Every Student Succeeds Act (ESSA) into law today, the days of No Child Left Behind (NCLB) waivers and Race to the Top (RTTT) grants have officially come to an end. The elimination of these programs also ends the ‘incentive’ for states to evaluate teachers based, at least in part, on measures of student achievement. Keep in mind, however, that over 40 states currently evaluate teachers partially on their students’ achievement.

Less than a decade ago this was far from the case. Prior to NCLB waivers and RTTT only a small number of districts included student achievement measures when they evaluated their teachers.  In fact, a number of states prohibited using these measures in evaluation as a matter of law.

More recently, NCLB waivers and RTTT grants provided ‘incentives’ to include student achievement measures as a significant portion of how teachers are evaluated. In response, the vast majority of states have made significant changes to their teacher evaluation systems. However, developing these new evaluation systems was no easy task. In fact, most states have just recently fully implemented such systems and some are still in the process of doing so.  So it is far too early to tell what impact these new evaluation systems have had on teachers and student learning.

Now that federal ‘incentives’ have been lifted, the question is: will the states stay the course when it comes to evaluating teachers, or will they shift gears? Of course, only time will tell, but with the pushback against testing in a number of states, I’m guessing at least some states will change their evaluation systems, especially as they pertain to including student achievement measures.

Yet, even if states pull back on linking teachers’ ratings to student performance, these systems are likely to be significantly better than what states had in place prior to NCLB waivers and RTTT. As discussed in our Trends in Teacher Evaluation report, for decades most evaluation systems were little more than a bureaucratic exercise that failed to recognize either excellence or mediocrity in teaching. This is no longer the case. States have vastly improved their teacher evaluation systems in recent years, and not just by including measures of student achievement. Nearly every state has also overhauled the way classroom observations are conducted. Now it’s the norm for teachers to be observed every year (in many cases, multiple times a year) and then provided immediate feedback to inform and improve their instruction. Moreover, nearly every state now evaluates teachers on multiple measures even when test scores are not used. Such indicators include the quality of their lesson plans, feedback from their students, and the quality of their classroom assignments, among others. These measures are then combined to provide a more accurate measure of a teacher’s true effectiveness as well as valuable information to help teachers improve their instruction.

States and districts have worked extremely hard over the past several years to design and implement these new teacher evaluation systems, so it is unlikely they will be going back to the old days when teachers were evaluated every couple of years and rarely provided useful information. While including objective measures of student achievement like test scores can be a valuable part of an effective teacher evaluation system, the new evaluation systems, even without the student link, are much more likely to accurately identify effective teachers as well as provide useful information to improve instruction. And that is good news for all teachers and students. – Jim Hull

March 18, 2015

Put teacher data in the hands of those who know how to use it

While every parent wants as much information as possible to do what is best for their child, it doesn’t mean that parents have the right to their child’s teacher’s evaluation data. That information should only be used by administrators to support the continuous improvement of their teachers and make more informed decisions on which teachers are best suited to teach which students. As I argue in Trends in Teacher Evaluation, this is the best way teacher evaluation systems can improve the effectiveness of all teachers. On the other hand, providing individual teacher evaluation data to parents, as one parent in Virginia is going to court over, will likely lead to a pitchfork mentality where parents will demand their child be placed in the highest-rated teacher’s class and that low-performing teachers be fired without any context on what the evaluation results actually mean.

Such rushes to judgment in evaluating talent happen all too often among those looking only at short-term gains. Sports provides the most vivid examples of this phenomenon. One of the best is when the Boston Red Sox brought up Dustin Pedroia to play second base in 2007. Pedroia’s numbers were downright awful the first month of the season and fans wanted him replaced. However, the manager kept playing him despite the bad numbers because his experience showed him that Pedroia would someday become a very good player. And the manager was right; Pedroia went on to win the Rookie of the Year award in 2007 and the American League’s MVP award the next year. This illustrates how data is most effective in the hands of those who not only know how to use the data correctly but will use the data for the best possible outcomes in the long term.

This isn’t to say that parents shouldn’t have any information about the quality of those teaching their children. They certainly should. The question is what information should be provided to parents. This is a question states and districts are still struggling with. Some states provide aggregate teacher effectiveness data by school while others notify parents that their child is being taught by a teacher rated as ineffective for multiple years in a row. There is no right answer to what information parents should have but it is clear just handing parents a teacher’s evaluation data would do more harm than good.

A far more effective strategy would be for parents, teachers and policymakers to come together to find the best solution for all involved. Together they can come to an agreement on what is best not only for individual students in the short term but for all students in the long term. – Jim Hull

November 18, 2014

High performing students ≠ Highly effective teacher

Sheri Lederman may, in fact, be an excellent teacher. But the fact that her fourth-grade class performed well above the New York state average on both the math and English tests is not evidence of her effectiveness. That’s because overall scores such as these have long been known to be more a measure of the quality of the students a teacher is assigned than the actual effectiveness of that teacher. As such, these scores should never be used to evaluate teachers.

Why would this be the case? Let me give you an example. Say Teacher A is assigned a remedial class where just 15 percent of students pass the state test, while Teacher B is assigned to a gifted class where 70 percent of the students pass the state test. However, 70 percent of Teacher A’s students made more than a year’s worth of growth, while just 15 percent of Teacher B’s students did so. Which teacher do you think is more effective?

Based on these test scores alone, Teacher A would be considered the more effective teacher since the amount of growth individual students make from one year to the next is more indicative of a teacher’s true effectiveness than scores for individual students at one point in time. So, the fact that Lederman’s students scored above the state average is basically meaningless when it comes to evaluating her effectiveness.
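To make the arithmetic concrete, here is a minimal sketch of the Teacher A and Teacher B comparison. It uses only the hypothetical percentages from the example above; the dictionary layout and the `top_by` helper are illustrative names, not part of any state's evaluation system.

```python
# Hypothetical figures taken from the Teacher A / Teacher B example above.
teachers = {
    "Teacher A": {"pass_rate": 0.15, "growth_rate": 0.70},
    "Teacher B": {"pass_rate": 0.70, "growth_rate": 0.15},
}


def top_by(metric):
    """Return the name of the teacher with the highest value for a metric."""
    return max(teachers, key=lambda name: teachers[name][metric])


# Share of students passing the state test (a point-in-time measure).
best_by_status = top_by("pass_rate")
# Share of students making more than a year's worth of growth.
best_by_growth = top_by("growth_rate")

print("Highest pass rate:", best_by_status)
print("Highest growth rate:", best_by_growth)
```

The two measures point to different teachers, which is the whole point of the example: the ranking depends entirely on whether you look at where students end up or at how far they moved.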

What needs to be known is how much growth her students made over the past year and how that compares to other teachers. But simply comparing the growth Lederman’s students made to the state average isn’t a very accurate measure of a teacher’s effectiveness either since a student’s prior achievement and background can influence how much growth they make in a given year. Such differences need to be accounted for before growth measures can be used to evaluate teachers.

In fact, New York state does account for such student differences when evaluating teachers, including Lederman, based on student test scores. The statistical term for this approach is the Value-Added Model, or VAM, which simply determines whether a teacher’s students made more, less, or similar academic gains than they would have made had those same students been taught by an average teacher.
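The idea behind a value-added estimate can be sketched in a few lines. This is a deliberately simplified illustration, not New York's actual model: it assumes prior-year score is the only student difference being controlled for, uses made-up records, and treats the line fitted across all students as the "average teacher." The function names `fit_line` and `value_added` are hypothetical.

```python
from statistics import mean

# Hypothetical (prior_score, current_score, teacher) records; not real data.
records = [
    (40, 48, "A"), (45, 54, "A"), (50, 58, "A"),
    (70, 72, "B"), (75, 76, "B"), (80, 82, "B"),
    (55, 60, "C"), (60, 65, "C"), (65, 70, "C"),
]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return y_bar - slope * x_bar, slope


def value_added(records):
    """Average gap between actual and predicted scores for each teacher."""
    priors = [r[0] for r in records]
    currents = [r[1] for r in records]
    # The fitted line stands in for what students would score with an
    # "average" teacher, given only their prior score (an assumption).
    intercept, slope = fit_line(priors, currents)
    gaps = {}
    for prior, current, teacher in records:
        predicted = intercept + slope * prior
        gaps.setdefault(teacher, []).append(current - predicted)
    return {teacher: mean(values) for teacher, values in gaps.items()}


estimates = value_added(records)
for teacher, estimate in sorted(estimates.items()):
    print(teacher, round(estimate, 2))
```

A positive estimate means a teacher's students scored higher than predicted from their prior scores, and a negative one means they scored lower. Real models control for far more than prior achievement and report a margin of error around each estimate.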

In 2014, the statistics show that Lederman’s students, for the most part, would have made greater academic gains if they had had an average teacher. This was not the case for her previous class. So one must ask: why the difference? How can a teacher’s performance change so much from year to year?

The answer is that VAMs are not perfect measures of a teacher’s effectiveness, so results may vary from year to year even if the teacher’s actual performance hasn’t changed. Keep in mind that while VAMs get criticized for their inaccuracy, other measures, including classroom observations, are in many cases less accurate measures of a teacher’s effectiveness. However, research shows that combining student growth measures with other measures of teacher effectiveness, like classroom observations, provides a pretty accurate measure of a teacher’s true effectiveness.

Are teacher evaluations perfect? No, which is why decisions about individual teachers should be made based on multiple evaluations, not a single one. Even then, results should be used to inform personnel decisions, not mandate specific actions. Those decisions should be left in the hands of school and district leaders who know what is best for their students. For Lederman, those who know her best consider her a good teacher despite her last evaluation. The question is: would their opinion change if she continues to receive poor evaluations? Only time will tell. – Jim Hull

September 18, 2014

PDK/Gallup poll Part 2 shows teachers matter

The folks at PDK and Gallup apparently had so much to report in this year’s annual poll of public attitudes toward public schools, they had to release it in two parts. Part 1, which we summarized here, addressed the Common Core state standards and perceptions about public schools more generally. Part 2, released this week, focuses primarily on public attitudes about the teaching profession. What they have to say should provide comfort to beleaguered teachers.

First, nearly two-thirds of the public expresses “trust and confidence in the men and women teaching children in the public schools.” The study’s authors note that this represents a decline from previous years. Nonetheless, it must be a refreshing show of support for teachers who have many reasons to feel beat up by the punditry.

The public also recognizes the key role teachers play in student learning and by large margins would welcome policies to bolster their preparation and training. A full 81% believe prospective teachers should be required to “pass board certification” similar to that for other professionals like doctors and lawyers on top of their college degree in order to be licensed to teach. Likewise, 60% thought that there should be higher entrance requirements into teacher prep programs at the front end. The majority also support the idea of requiring a longer period of supervised practice before teachers take charge of their own classrooms. A plurality of 44% thought such a bridge period should last one year with 27% saying new teachers need two (see chart).




Political affiliation had almost no effect on opinions about teacher preparation. Republicans, Democrats and Independents alike called for increasing rigor. Not so when asked about teacher evaluation, however, possibly in response to growing political controversy over new evaluation policies that are based in part on teachers’ impact on student learning. Nationally, 61% of the public opposes evaluations that include “how well a teacher’s students perform on standardized tests.” Yet only 50% of Republicans were opposed compared to 68% of Democrats. Interestingly, overall opposition to using evaluation in this way is much higher now than it was just two years ago when slightly less than half of the overall public thought it was a bad idea.

Similar party-affiliation gaps were evident when pollsters asked about the purposes of evaluation: 86% of Democrats said using evaluation to help “teachers improve” was “very important” compared to 71% of Republicans. In contrast, Republicans were much more favorable to linking evaluation to salaries or bonuses: 51% of Republicans said this was “very important” compared to only 41% of Democrats.

Other questions explored whether the public thinks their schools need to change “to meet today’s needs” and if so, how. More than half (58%) said that schools need to change compared to 47% who thought so in 2006. The biggest needed change the public would like to see is a greater emphasis on career-technical education: 60% “strongly agreed” that “high school students should receive more education about possible career choices” while 32% said the same about placing “more emphasis” on college preparation for all.

This doesn’t necessarily mean that the public thinks career education is more important than college readiness. It could indicate that they think high schools are doing ok with college prep but they need to do more to get students ready for work. But either way, the message is clear that the public is looking toward high schools to make sure all graduates are able to thrive in the new workplace.

Along those lines, CPE’s Jim Hull has been analyzing work and other outcomes for the group of high school graduates who do not go to a two- or four-year college. His findings should produce some valuable insights into what career-readiness should look like. His first report will be released in the next two weeks so stay tuned. — Patte Barth

September 9, 2014

Myths About Teacher Evaluations

While teacher evaluations haven’t garnered as much media attention as the Common Core, in the education world they have been nearly as controversial. And just like the Common Core, there are a number of myths about teacher evaluations that impede important discussions on how evaluations can best be used to improve student performance. Even this insightful EdWeek essay by a Philadelphia high school math teacher included some popular myths such as:

Value-added systems provide precise percentile rankings of teachers

While value-added models certainly can provide percentile rankings of teachers, this is typically not the case. The objective of most value-added measures is not to create rankings, which wouldn’t be very precise, but to determine if a teacher is more or less effective than an average teacher. Value-added measures cannot be, should not be, and typically are not used to rank teachers from best to worst in any teacher evaluation system.

The impact of a given teacher on student performance is too small to accurately quantify

The fact that a number of factors besides teachers affect students’ test scores is exactly why value-added measures should be used. They are the only quantifiable measure that even attempts to isolate the impact of the teacher from other factors that influence student achievement. As this video shows, teachers have a tremendous impact on the academic success of their students.

The differences between schools are too great to accurately quantify

It is true that large differences between schools have an impact on teacher effectiveness which is why high quality value-added models are designed to minimize the impact of such differences. A good value-added model will compare teachers within the same school or similar schools to control for the differences between schools. These controls are not perfect but they provide a more accurate assessment of how a teacher would perform in a typical school.

Teachers are blindly fired due to flawed data that doesn’t provide context

While the other three myths had some nuggets of truth, this one is totally untrue. As I found in my Trends in Teacher Evaluation report, no state relies solely on value-added (or any one measure of student achievement) for more than half of a teacher’s overall evaluation. Even in states where half of a teacher’s evaluation is based on measures of student achievement, most require that multiple measures of student achievement be used, such as student learning objectives, formative assessments and teacher-developed exams.

Furthermore, in just about every state evaluation system, the lowest-performing teachers are provided additional professional development, mentoring, or other assistance to help improve their performance. Only if the teacher fails to improve after multiple years of low performance do they become eligible to be fired. And in most states the district still has the final say on whether a teacher is fired or not. So while teacher evaluation systems are used to identify low-performing teachers, it is still up to district leaders in most states to determine what to do with that information. – Jim Hull

