

The EDifier

April 4, 2013

Blame the adults not the tests

I have no problem with a debate about the proper role of standardized testing in our public schools. Standardized tests are not perfect but play a vital role in improving student achievement. However, exactly what role is not clear and is certainly open for debate.

While Eugene Robinson of the Washington Post admits that standardized tests are a vital tool in improving student achievement, he essentially blames them for the cheating crisis in Atlanta and elsewhere. He argues that the reliance on standardized test scores to make decisions about bonuses drove otherwise honorable adults in Atlanta’s classrooms to cheat.

I wholeheartedly disagree with Mr. Robinson’s thesis. The tests didn’t make the adults cheat; the adults made that decision. Whether it was the teachers themselves or administrators who encouraged such actions, no one is more to blame for the cheating than the adults who took part in it directly or ordered that wrong answers be changed. Those are the ones to blame, not the tests.

It should be noted that Atlanta also made significant gains on NAEP, where there is no evidence of cheating, during this time. This shows that most Atlanta teachers and administrators responded to the pressure to raise student achievement by actually teaching, not cheating. Unfortunately, their hard work is being overshadowed by a number of teachers and administrators who decided to cheat instead of facing the reality that they were not meeting the needs of their students.

It is true that Atlanta may have put too much emphasis on test scores alone. As I showed in Building a Better Evaluation System, and as CPE encourages with data-first.org, no one indicator can accurately evaluate the true effectiveness of a teacher, administrator or school. If Atlanta wanted to reward its most effective teachers and administrators with bonuses, their effectiveness should have been measured using multiple measures such as observations, student learning objectives, or portfolios along with standardized test scores. Using multiple measures not only provides a more accurate measure of true effectiveness but also lessens the likelihood that those being evaluated can manipulate the results by cheating.

Another reason using multiple measures would help is that, while teachers should be evaluated on how much they improve their students’ achievement (as measured by their students’ standardized test scores), such measures do not provide any feedback on how they can improve their students’ performance in the future. As such, a teacher who received a low rating based solely on their students’ test scores may resort to cheating to save their job or get a bonus, while a teacher who received a low rating but was given feedback on how to improve is more likely to modify their instructional techniques instead.

Using multiple measures to evaluate teachers and administrators lessens the chances of cheating, but that does not mean that Atlanta’s reliance on test scores caused those involved to cheat. If only those teachers and administrators who cheated had had more faith in their students and in their own ability to teach, more Atlanta students would be better off today. – Jim Hull






School-wide data should not be used in teacher evaluations without collaboration

A baseball player doesn’t make as much if his team doesn’t do well, right? Should we pay a band’s drummer the big bucks if the band isn’t popular? Absolutely not. Even our representatives in Congress have their job stability linked to one another; if the party doesn’t do well, neither do individual members. For many people, there really is no “I” in team when it comes to pay and job security.

At first blush, tying individual performance to team performance makes perfect sense. After all, in organizations such as a baseball team, band, or political party, the members work together towards a common goal; therefore, whether or not the team achieved that goal should be considered.

Similarly, doesn’t it make sense to tie a teacher’s evaluation to the overall performance of the school? After all, teachers in one building should all be working towards a common goal: the learning of students. Why not encourage teamwork by tying teachers’ job security and pay to the performance of the school as a whole? In fact, that’s just what several states and big cities have done in their new teacher evaluation systems. Arizona, Colorado, Connecticut, Indiana, Maryland, Pennsylvania, South Carolina, Tennessee, West Virginia, and Wisconsin, as well as the cities of Chicago and Pittsburgh, have tied some portion of a teacher’s evaluation score to how well the school as a whole does on standardized tests or other indicators (such as graduation rates or promotion rates). While these states and cities have good intentions in doing so, tying school-wide data to individual teachers’ evaluations is a bit like giving your child a cough drop for a fever. It’s the wrong medicine for the ailment.

Let me present another series of hypotheticals. Should a local eye doctor’s pay and job security be linked to the town’s ear and sinus health? Should one therapist be held accountable for the mental health of patients in his or her city who are not his or her own? Most of us would argue that in these scenarios the answer is no. What accounts for the difference?

The difference is that doctors and therapists are private practitioners. While their work contributes to a greater goal that many might work towards—the physical and mental health of a community—their job isn’t structured in a way that demands teamwork. For the most part, your eye doctor does not collaborate with, practice with, or learn with your ear doctor, foot doctor, or cardiologist. Likewise, while therapists may provide similar services with similar goals, for the most part, they do not collaborate together on a patient, practice with each other, or have mandated time together where they share practices to improve. These fields are by their nature more individualistic. We seem to demand consideration of team performance when the job itself is structured in such a way as to require collaboration. Baseball players practice together. Band members write music and perform together. Congressmen and women write bills together, discuss issues together, and compromise on laws together. Doctors and therapists in private practice do not have a job which requires collaboration, so they aren’t judged based on it.

It may surprise many Americans to find out that teaching mirrors the private practitioner structure more than the team structure. When I first started teaching, I was certainly shocked by this as well. As a new teacher, I had envisioned meeting frequently with other teachers to share what was happening in our classes or to discuss the best teaching strategies for a particular concept. The reality was quite different. Almost all of my day was taken up with teaching my classes, and in the short preparation period I had, I graded papers and planned my lessons alone. Even if I had wanted to collaborate, it wouldn’t have been possible. Few teachers had the same periods off as I did, and those who did were busy planning their own lessons and grading their own papers. I quickly realized that teaching was an exceptionally isolated job.

In his 1975 landmark study of the lives of American teachers, Dan Lortie deemed schools “egg crates,” where teachers worked in close proximity to one another, but their work never touched, much like eggs in a crate. Today, thirty-eight years later, not much has changed. A recent study found that teachers spend just three percent of their day collaborating with colleagues. It’s not that teachers don’t want to collaborate; it’s that the structure of the school itself impedes it. Work is done individually, and there’s no time carved into the school day for collaboration.

However, calls for evaluating teachers based on school-wide indicators of achievement seem to ignore the widely recognized truth that the teaching profession is structured as a field of private practitioners. States and districts plod on, arguing teachers lack incentives to work together, something remedied by inclusion of school-wide data in individual evaluations. This is a flawed premise supported by no research. What research does show is that teachers want time for collaboration with their colleagues and engage willingly and happily when given this time. The recent MetLife survey of American teachers found that collaboration contributes to teacher satisfaction.  These findings certainly don’t suggest that teachers are resistant to working with one another and are in need of incentives to do so. They do suggest that the school structure itself presents major impediments to a teamwork approach to learning in a school building, something that’s not remedied with a carrot and stick approach to evaluations, but which requires a structural revamp of a teacher’s day and work.

Furthermore, weighting school-wide data so heavily in teacher evaluations works to dilute the primary purpose of the Race to the Top brand of teacher evaluations—identifying the individual contribution of a teacher to student learning. In Tennessee, school-wide achievement data accounts for 35 percent of evaluations of teachers of non-tested subjects. But a report on the first year of the evaluation system suggested decreasing that percentage because teachers expressed that the data didn’t reflect their individual performance.

Before we jump to making teacher evaluations the cure-all for everything wrong with public school education, we’d be wise to think deeply about what’s causing the problem we’re aiming to fix. That’s a more complicated question for some aspects of schooling. However, it’s not that complex when we ask ourselves why teachers aren’t collaborating towards greater student learning. Instead of firing teachers based on scores which aren’t their own, and which they have no time within their school day to work towards improving, let’s think about breaking the egg crate to create a more effective school structure that’s less “I” and more “team.” Maybe then we can think about evaluating teachers based on school-wide scores. Until then, let’s stop giving a cough drop for a fever. – Allison Gulamhussein

Filed under: CPE, Teacher evaluation, teachers – Allison @ 9:15 am





January 31, 2013

Beyond the pink slip: Teacher evaluation isn’t just for firings

It’s not earth-shattering to say that the conventional teacher evaluation system is broken. This isn’t just an argument made by education reformers and parents; many teachers agree with this point as well. They cite the haphazard, subjective nature of evaluations, which research suggests do little to improve instruction or lead to the removal of subpar teachers.

In response to calls for better systems to evaluate teachers, the Bill & Melinda Gates Foundation has funded the Measures of Effective Teaching project (MET). This group has undertaken a three-year study in seven school districts to analyze the effectiveness of certain measures of excellent teaching, specifically student standardized test scores, student surveys, and observations. However, beyond the much-debated question about whether these systems can really measure effective teaching, I’d like to ask a different, perhaps more important question: what should be the primary purpose of these systems?

National conversations about teacher evaluation have mainly focused on getting bad teachers out of the classroom, and there is no doubt that there are some teachers teaching today who shouldn’t be. However, the notion that America’s schools are brimming with teachers who show movies and check emails all class period is, simply put, a myth. This is certainly confirmed by the MET study. The study trained observers and had those observers watch 7,491 videos of instruction by 1,333 teachers from six socio-economically and geographically diverse districts. The observers were tested on their knowledge of the observation rubrics, and they were retested when observing to ensure the scores they were giving were calibrated. After all those observations, the study found that “overall observed practice is overwhelmingly in the mid-range of performance as defined by the instrument.” In other words, there were few outstanding teachers (as defined by the observation instrument) but also few really weak teachers (as defined by the observation instrument).

This finding should give us pause to really think about our goals in rolling out revamped teacher evaluation systems.  The writers of the MET study suggest that one can accurately use a combination of observations, student surveys, and standardized test data to identify exceptionally ineffective teachers, but then what?  Even if we fire the teachers who are exceptionally ineffective, that may only be a very small percentage of teachers in the classroom.  Will that alone drastically improve student achievement?

I think that looking at teacher evaluations as only a method to weed out the weak is exceptionally shortsighted and may represent a huge investment of money that gets America little bang for its buck.  Instead, we need to think about how teacher evaluations can be used to improve the teaching of teachers.  Bill Gates, in a recent op-ed piece written for CNN, acknowledges this fact, arguing that “the vast majority of teachers get zero feedback on how to improve” while they work “in isolation and have been asked to improve with little or no feedback.”  As a former public school teacher, my own experiences absolutely confirm this fact.

However, as districts rush to revamp their evaluation programs to align with the demands of Race to the Top and state policies, it’s questionable whether or not feedback to teachers is really a priority.  The Center for American Progress recently released a study exploring teacher perceptions of an urban district’s new teacher evaluation system.  The district rolled out the evaluation system in part to compete for Race to the Top funds.  At the beginning of the year, teachers set two student learning objective goals with their administrator.  Throughout the school year, teachers were observed by an administrator who ranked them on a scale of 1 (needs improvement) to 5 (exemplary).  Low scores and high scores had to be confirmed by an outside evaluator, a move to ensure fairness and objectivity on the part of the observer, and evidence of student achievement which aligned with the teacher’s initial goals was evaluated at the end of the year to see if students really grew academically.

This evaluation system did result in an increase in firings, but it didn’t result in much feedback to the teachers who weren’t fired about how to improve their practice. When researchers interviewed a large sample of teachers in the district, most said that the new system had no impact on their pedagogy. Only half of the teachers said any of the feedback was helpful; some even said they got no feedback. At the end of the 2010-2012 school year, the district had spent untold sums of money developing an evaluation system, hiring outside evaluators, and implementing the system, but fired only 34 teachers in a district employing 1,600 teachers (2% of its teaching force). What about the other 98% of teachers in the district? According to those teachers, all of this time and money resulted in little to no change in their teaching. That’s a problem.

In the public dialogue about how to improve America’s classrooms, there’s often a simplistic notion that firing a teacher and replacing him or her is a no-brainer solution to our educational dilemma. While new evaluation systems should identify exceptionally ineffective teachers, that alone is not enough. They have to provide feedback for improvement to teachers who aren’t fired. Such a focus certainly brings to light new questions about who is giving the feedback, the nature of the feedback, the qualities of good instruction, and how to coach teachers toward good instruction; however, those are the questions we really need to be asking.

Filed under: CPE, Teacher evaluation, teachers – Allison @ 2:57 pm





January 11, 2013

Putting it all together

As Naomi wrote yesterday, the results from the final report from the Gates Foundation’s MET study are not groundbreaking. A number of researchers (including yours truly), policymakers, and advocates have been saying for years that the most accurate way to evaluate teachers is by employing multiple measures of teacher performance that also include measures of student achievement.

Even so, a number of highly respected education policy and research experts such as Jack Jennings and Linda Darling-Hammond have argued that measures of student performance such as value-added measures are too unreliable to accurately evaluate a teacher’s true performance. However, such critiques assume that value-added measures are the sole measure of a teacher’s performance, which magnifies their limitations. These and other critics of value-added typically claim that other measures of teacher performance such as teacher observations are more accurate and should be used to evaluate teachers in lieu of value-added.

However, the claim that observations are a more accurate tool in evaluating teachers turns out not to be true at all. This is where the MET study really gets interesting. MET researchers specifically examined teacher observations and found that there are more limitations to using only observations to evaluate teachers than to using only value-added measures. Specifically, they found that a teacher’s observation score differed significantly depending on who did the observing and which lessons were observed. As Jennings and Darling-Hammond point out, researchers have long known that value-added scores also fluctuate significantly from year to year and even from assessment to assessment.

What gets lost in the rhetoric is the fact that both tools can be made more accurate. For example, value-added scores are more accurate when they are averaged over multiple years, a point critics often leave out. Observation scores, on the other hand, are more accurate when teachers are observed multiple times by multiple people. This goes to show that no measure is perfect but there are ways to make each more accurate. Most importantly, the MET study found that when these and other measures were used together they were an accurate predictor of how teachers would perform in the future. So students who currently have a teacher who previously obtained high value-added and observation scores are more likely to make greater achievement gains than similar students who currently have a teacher who earned lower scores.

It is important to point out that even when using multiple measures some very good teachers will be identified as ineffective and vice versa. However, an evaluation system that is based on a combination of value-added, observations and other measures is much better than using any one of these measures alone. By using multiple measures to more accurately evaluate teachers, administrators and policymakers can make personnel decisions based on how they will likely impact student achievement. This is a great improvement over the current system that simply evaluates teachers based on their years of experience and the highest degree they have earned, which the MET study found is the least accurate way to evaluate teachers. – Jim Hull

Filed under: Growth Models, Teacher evaluation, teachers – Jim Hull @ 3:57 pm





January 10, 2013

Superfluous grades; StudentsFirst ranking considers performance last

January so far is looking like Michelle Rhee month. Last night the self-described education reformer was the hour-long focus of PBS’s Frontline series. The day before, her organization StudentsFirst released its report card on the state of education policy in which Rhee and her colleagues “flunked” most states. The headlines wrote themselves.

But before we collectively freak out about our own states’ GPA, let’s take a critical look at what StudentsFirst is grading. First — and I can’t emphasize this enough — there are no points awarded for education performance. None. Zero. So if you’re concerned about that ‘F’, Vermont, relax.  You are still a high-achieving state.

What they did look for were state education policies that aligned with the StudentsFirst agenda. These include teacher and principal staffing decisions based on student achievement measures among others, and “empowering parents” through charter schools and vouchers.  Limiting their rankings to policies, however, leads to some strange juxtapositions.

In the following table, I list the top ten performing states in education as identified in KidsCount, the annual report card published by the Annie E. Casey Foundation. The KidsCount education index includes pre-k participation, NAEP scores in reading and math, and high school graduation rates. I then compare these to each state’s StudentsFirst grade:

Not much relationship here between achievement and StudentsFirst policy preferences. When looking only at the StudentsFirst grades in school choice, the relationship is even sketchier: 6 of the top ten states earned an “F” while the highest grade was a “D.”

The Brookings Institution released a much less publicized report card before Christmas that graded urban districts on school “choice and competition,” but like StudentsFirst, placed little value on actual performance. Likewise, the Brookings rankings look a little wacky when compared to district performance. For example, New York City was ranked second with a letter grade of B+. Yet its eighth-graders performed significantly below the overall national average on NAEP in math. Number three-ranked D.C. (a “B”) was 19th out of 21 urban districts on the same test. In contrast, middle-schoolers in urban Austin exceeded the national average of all students. Brookings gave that Texas school district an F. (Comparable data was not available for number one-ranked New Orleans.)

We can probably go overboard drawing conclusions from these inconsistencies. To begin with, we can’t say for sure that the Brookings/StudentsFirst agendas work against achievement. But we can say one thing: there are many high-profile organizations that are promoting education reform policies that do not have a proven track record to support them.





