

The EDifier

October 31, 2011

Merit pay revisited: Is Denver’s pay for performance a model plan?

Although it remains a controversial issue, merit pay has long since evolved from the days when test scores were the single factor in determining whether a teacher would get paid for performance.  Nowadays a number of school districts across the country have developed multi-pronged plans aimed at equitably rewarding teachers for their accomplishments.  Nonetheless, the question still remains: Is there actually a way to fairly reward a professional who deals with the advancement of human capital?  No plan is perfect, but one district might have come close.

In 2009, the Center took a look at merit pay and mentioned Denver’s ProComp Pay for Performance plan. Now a three-year study of ProComp, conducted by Dan Goldhaber and Joe Walch of the Center for Education Data and Research, has come out. The study covers the period between the fall of 2006 and the spring of 2010. Denver Public Schools (DPS) requires all teachers hired in 2006 or later to take part in ProComp and gives veteran teachers the choice of whether to opt in. ProComp offers teachers four opportunities to receive bonuses:

  1. Knowledge and Skills: Teachers may earn pay for completing one professional development unit (PDU) per year (extra PDUs can be banked) and for earning advanced degrees and licenses, and they can even receive tuition and student loan reimbursement (50 to 65 percent received this pay)
  2. Comprehensive Professional Evaluation: Based on principal evaluations, which take place every 1 to 3 years (5 to 14 percent received this pay)
  3. Market Incentives: Aimed at teachers who work in hard-to-serve schools and/or in hard-to-staff subject areas, as reviewed by school demographics and market supply (35 to 65 percent received this pay)
  4. Student Growth: Teachers set up student growth objectives, based on what they expect students to learn, which are approved by the principal (example: I expect x number of students to exceed expectations in Reading on the Colorado Student Assessment Program (CSAP).) (70 to 80 percent received this pay)

The study suggests that the ProComp plan made teachers feel more supported and, in turn, allowed them to meet their goals more consistently (Robles 2011). In fact, between 2006 and 2010, 15 percent of the non-ProComp teachers switched over to join the plan after seeing the positive results ProComp had on their schools and colleagues. Not only has ProComp made the teaching profession more attractive, but Goldhaber and Walch also conclude that:

  • There were significant learning gains across grades and subjects;
  • The benefits of tracking data and evaluating educators spread from ProComp teachers to the entire district;
  • The program was expected to create a negative atmosphere among team members, but the opposite occurred and role models emerged;
  • ProComp teachers’ students had larger than expected gains on the state assessment.

Skeptics argue that these rewards focus more on classroom instruction than on student test achievement and that ProComp is inconsistent with the value-added approach. Goldhaber and Walch point out that “whether this is good or bad is clearly a normative question” but that “overall, ProComp has had a positive effect.” They also suggest that states might want to consider investing in similar programs, especially for their Race to the Top objectives.

Yesenia Robles of the Denver Post notes that ProComp has helped propel infrastructure reforms to change recruitment practices and enhance methods of data gathering. She goes on to point out that the difference between non-ProComp and ProComp teachers’ student growth objectives is comparable to the difference between a first-year and a second-year teacher’s. Her article, DPS Teacher-Pay System Likely Boosting Student Achievement, Study Finds, also points out that Denver Public Schools has retained 160 more teachers per year since 2006 and that 80 percent of all DPS teachers currently participate in the program. Robles notes that “The ProComp system is already in the process of changing with the implementation of the district’s evaluation-and-support system, known as LEAP, now being tested in 94 percent of DPS schools.” It is still too early to tell whether ProComp can survive these alterations.

ProComp is an even-handed, well-formed pay for performance plan that other districts can use as a model and will hopefully emulate. The research shows not only that ProComp was well received by DPS teachers but, most significantly, that student success consistently progressed. –M. Newport

(To see whether similar pay for performance plans have been successful, check out this ECS report.)






October 13, 2011

Using growth in NCLB’s reauthorization

On Wednesday, Senator Harkin released his bill to reauthorize the Elementary and Secondary Education Act (ESEA), better known as No Child Left Behind (NCLB). I haven’t read the bill yet, but I have read that Senator Harkin is proposing to drop the current Adequate Yearly Progress (AYP) requirement and instead evaluate schools based on “continuous improvement.” This would mean that all students are no longer expected to be proficient by 2014; instead, they are expected to make a certain amount of academic gains from year to year.

Sounds simple enough. Critics and proponents of NCLB alike have been pushing for the inclusion of a measurement of student growth since NCLB was enacted nearly a decade ago. However, back in 2002 fewer than a handful of states had the assessments and the data systems in place to measure how much academic gain individual students made from year to year. Now, thanks to NCLB, all states have the capacity to make such calculations. Almost all would agree that including such measures would greatly improve the fairness of any accountability system.

Yet incorporating student growth into a federal accountability system is not as straightforward as it seems. First of all, as my report Measuring Student Growth illustrates, there is no single method for measuring student growth. Choosing which method is best depends on the data available and how the data is going to be used. For example, a growth model that identifies students who are not gaining as much as similar students will look a lot different from a model used to identify students who gained enough in the past year to be on track to reach a certain benchmark, such as being college and career ready when they graduate high school.
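To make that contrast concrete, here is a minimal sketch of the two kinds of models just described. Everything in it is hypothetical: the two students, their scale scores, the peer-group average gain, the benchmark, and the straight-line projection are illustrative assumptions, not any state's actual method. It is only meant to show how the same student can look different depending on which question the model is built to answer.

```python
from dataclasses import dataclass


@dataclass
class Student:
    name: str
    last_year: float   # hypothetical scale score, prior year
    this_year: float   # hypothetical scale score, current year
    years_left: int    # years remaining before the benchmark applies


def gain(s: Student) -> float:
    """One-year gain in scale-score points."""
    return s.this_year - s.last_year


def below_peer_gain(s: Student, peer_avg_gain: float) -> bool:
    """Model A: flag students who gained less than similar students did."""
    return gain(s) < peer_avg_gain


def on_track(s: Student, benchmark: float) -> bool:
    """Model B: assume the student repeats this year's gain every remaining
    year (a simplifying assumption) and check the projected score."""
    projected = s.this_year + gain(s) * s.years_left
    return projected >= benchmark


students = [
    Student("A", last_year=300, this_year=330, years_left=3),  # low score, large gain
    Student("B", last_year=480, this_year=490, years_left=3),  # high score, small gain
]

PEER_AVG_GAIN = 20  # assumed average gain among similar students
BENCHMARK = 500     # assumed "college and career ready" score

results = {
    s.name: (below_peer_gain(s, PEER_AVG_GAIN), on_track(s, BENCHMARK))
    for s in students
}
for name, (flagged, track) in results.items():
    print(f"Student {name}: below peer gain={flagged}, on track={track}")
```

In this made-up case, Student A outgains similar students yet is not projected to reach the benchmark, while Student B is flagged for a below-average gain but is on track. Neither model is wrong; they answer different questions.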

So before a growth model is used for accountability, policymakers need to state a clear purpose for what the growth data is to evaluate. For example, is the purpose to ensure schools are closing achievement gaps? Is the purpose to ensure all students are college or career ready by the end of high school? Or is it to identify schools where students are making fewer gains than students in schools with similar student populations?  For each of these questions, an adequate answer would require a different growth model.

Second of all, simply moving from a proficiency-based accountability system such as NCLB to a continuous improvement based system as proposed by Senator Harkin overlooks the fact that most state assessments are not designed to effectively measure student growth from grade to grade. Most states have developed their assessments to evaluate if a student is proficient or not proficient each year. As such, many state assessments are unable to reliably determine how much a student has learned from year to year, especially a student who scored at the very high or very low end of the test’s achievement scale. Yes, states can calculate a growth measure using the assessments they now have in place, but in many cases the result will not be as accurate as if the assessments were designed specifically to measure student growth.

These are just two major issues when it comes to including a growth measure for federal accountability. Yes, evaluating schools based on student growth is much fairer than how schools are currently evaluated under NCLB. But simply including a growth measure does not automatically make it a better accountability system. Policymakers need to set a clear purpose for accountability systems and then incorporate a growth model that would best evaluate whether schools are meeting their goals. From what I know now about the Harkin bill, it is not clear whether the purpose is to ensure all students are college or career ready or to ensure all students are making a year’s worth of gains. Without a clear purpose for what to hold schools accountable for, adding a growth model will not be any fairer than NCLB. – Jim Hull






March 31, 2011

The need for better teacher evaluation systems

Filed under: Growth Models, Teacher evaluation, teachers; posted by Jim Hull @ 2:00 pm

A friend of mine was complaining to me the other day about her son’s teacher. She spoke about how far behind her son’s class was compared with the other classes in the same grade, and wondered what impact that would have on her son.

Unfortunately, she has no idea whether her son’s teacher is actually ineffective or whether that is just her perception. More importantly, the principal and the teacher have no idea either.

That’s because in my friend’s district, as with most districts nationwide, teachers are not evaluated on how they impact their students’ achievement. As the report the Center released today, Building a Better Evaluation System, found, it is more than likely this teacher received a “satisfactory” rating on his or her most recent evaluation. That’s the rating the vast majority of our teachers receive. As a matter of fact, less than 1 percent of teachers nationwide are rated “unsatisfactory.”

I don’t mean to bash this or any other teacher. As with any profession, there are people who are very good at their jobs, and those who are not. It’s just that most teacher evaluations currently aren’t very good at identifying which are which.

Because current teacher evaluation systems are lacking, the Obama Administration and some national foundations have pushed to revamp the way teachers are evaluated, measuring teacher effectiveness by how much a teacher contributed to their students’ achievement over the course of the school year. Sounds simple enough. Most of us are evaluated in much the same way, so why shouldn’t teachers be? Well, evaluating teachers isn’t so easy, but it can be done, as our new report points out.

To learn more about how to include measures of student achievement in evaluating teachers, check out the Center’s latest report, Building a Better Evaluation System. –Jim Hull






March 9, 2011

The need to fix teacher evaluation systems

Filed under: Teacher evaluation, teachers; posted by Jim Hull @ 1:22 pm

Yesterday Becky wrote about how New Jersey’s governor is pushing to reform how the state’s teachers are evaluated. However, such a push is not just coming from political leaders. As a matter of fact, Randi Weingarten, president of the AFT, the country’s second-largest teachers’ union, has also pushed to improve how teachers are evaluated. She even highlighted the weakness of most current evaluation systems in a statement last week where she said:

“As important as evaluation is to assessing teacher performance, what passes for teacher evaluation in many districts frankly isn’t up to this important task. Way too often, teacher evaluations are superficial. They’re subjective. They miss a prime opportunity to improve teacher practice and, thereby, increase student learning. And that’s what it’s all about, isn’t it?”

I couldn’t agree more. Most current evaluation systems are worthless, to put it bluntly. Of course, agreeing that current evaluation systems are lacking is the easy part. What a quality and effective evaluation system should look like, and what decisions should be based on the results, is where the real tension lies.

That is why the Center has been putting together a new report on creating better teacher evaluation systems. The report cuts through the rhetoric about the strengths and weaknesses of certain tools to evaluate teachers, particularly value-added models, to give policymakers, teachers, and the public a true picture of what an effective teacher evaluation should include. By providing a clear explanation of the strengths and weaknesses of various ways to evaluate teachers, we hope to narrow the debate to how to use the tools currently available to make the best decisions possible about the actual effectiveness of teachers.

As you will see in the report that will be released in the coming weeks, improving how our teachers are evaluated can have a dramatic effect not only on our students but on our overall economy as well. So stay tuned. – Jim Hull






January 7, 2011

What does baseball have to do with evaluating teachers?

Filed under: Growth Models, Pay for Performance, teachers; posted by Jim Hull @ 12:50 pm

“Look on the back of their baseball card.”

That’s the common reply from baseball managers and general managers around the major leagues when asked about what kind of production they expect from a player in an upcoming season.

It is a reference to the career stats of baseball players, which appear on the back of their baseball cards. Most often, at the end of the season a player’s stats will be more or less the same as their career stats.

This basic assumption is so accepted in Major League Baseball that teams offer tens of millions of dollars to those players with the strongest career stats, while players with the weakest stats tend to disappear after a few years.  The stakes don’t get much higher than that.

But in education, it’s still rare for similar high-stakes decisions about school personnel (teachers, principals, and other administrators) to be made using data such as student test scores. Critics argue that evaluating educators primarily based on student test scores would be unfair. They contend that even the most complex statistical models designed to isolate educators’ impact, such as value-added growth models, are not reliable enough to serve as the basis for high-stakes decisions such as salaries or tenure.

Now, these critics argue correctly that there is only a moderate correlation between a teacher’s past performance, as measured by value-added models, and their future effectiveness. For example, a Brookings report on value-added models stated that the correlation of test-based measures of teacher effectiveness from one school year to the next is between .30 and .40 (1.0 would mean a perfect correlation and 0.0 would be no correlation). This is a low to moderate correlation in statistical terms. Sounds like the critics have a strong point.

However, Brookings also points out that when that correlation is compared with statistics used in other professions, such as Major League Baseball, it doesn’t look so weak.

Take batting average, one of the most widely used statistics to evaluate a player. The between-season correlation for batting averages for Major League Baseball players is just .36. Yet a team will pay a player millions for hitting .300 and cut a player for hitting .250.
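For readers who want to see what a between-season correlation actually is, here is a minimal sketch. The six players and their batting averages are made up purely for illustration, and the `pearson` helper is my own, not anything from the Brookings report. The second half squares the correlations quoted above, which is a standard way to express how much of next season's variation a correlation of that size accounts for.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical batting averages for the same six players in two seasons.
season_1 = [0.310, 0.285, 0.262, 0.301, 0.248, 0.275]
season_2 = [0.280, 0.295, 0.270, 0.265, 0.255, 0.290]

r = pearson(season_1, season_2)
print(f"between-season correlation (made-up data): {r:.2f}")

# Correlations quoted in the post, squared to give the share of
# next-season variation each one accounts for.
quoted = {
    "teacher value-added, low end": 0.30,
    "batting average": 0.36,
    "teacher value-added, high end": 0.40,
}
shares = {label: round(value ** 2, 2) for label, value in quoted.items()}
for label, share in shares.items():
    print(f"{label}: r squared = {share}")
```

Squaring the quoted figures gives roughly 0.09 to 0.16 for value-added scores and about 0.13 for batting average, so by this yardstick the two statistics predict the next year about equally well.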

Of course, batting average is just one statistic teams use to evaluate a player, and value-added scores are just one statistic that could be used to evaluate educators. But used correctly, both can be key measures of how effective a player or an educator will be in the future.

Do teams sometimes make the wrong decision by paying millions for the player who hit .300 last season while cutting the player who hit .250? Yes, it does happen. But overall, teams are better off signing the players who have the strongest stats on the back of their baseball cards. Our schools could be more effective if they kept the educators with the strongest stats as well. – Jim Hull





