
The EDifier

March 7, 2013

John Stossel, funky charts and Simpson’s paradox

John Stossel was on Fox and Friends this morning to promote an upcoming show about public schools. Remember, this is the guy who gave us Stupid in America – his ABC documentary from a few years back about our allegedly failing schools. During his segment, he claimed that “America has tripled spending, but test scores haven’t improved.”  The culprits? Teachers unions, school boards and other unnamed bureaucrats. Viewers were then shown a graph that indeed featured a flat line representing test scores over 40 years (improvement 1 point) with a second line escalating to $149,000 over the same period. The source was given as NCES. This got my fact-checking synapses sparking.

While I could not find the exact graph they showed on TV, Stossel did post this rather snazzy display on his blog with the same data:

Go ahead and take a moment to admire the work of the Fox News graphics department. OK, now let’s talk data. This chart shows scores for three subjects (math, reading and science) and dollar figures (the “cost of education”) from 1970 to 2010. While not noted, I’m assuming the data source is still NCES.

This may get a little wonky, but stay with me.  NCES reports trend data over four decades for only two tests:  the National Assessment of Educational Progress (NAEP) Long-Term Trends (LTT) and the SAT. NCES also has international test scores, but that data only goes back to the 1990s so that couldn’t be what Stossel used.  The SAT does not assess science, which leaves NAEP LTT as the only possibility. It’s not a perfect match. The last NAEP LTT administration was in 2008 although Stossel’s chart shows data to 2010. But I’m going to assume that he fudged a little on the timeframe because nothing else qualifies.

NAEP LTT is given to a representative sample of students ages 9, 13 and 17. I’m also going to assume that his analysis is based on 17-year-olds because the data matches his in reading and comes closest in math (more on this later). Between 1971 and 2008, LTT reading scores for 17-year-olds were relatively flat, posting an increase of just 1 point (not 1% as shown on Stossel’s chart, but we’ll blame the designer for that common mistake). Here’s what it looks like:

Now let’s have some fun. Let’s look at the same test scores disaggregated by race and ethnicity:

Note that every group improved more than the overall score did: White 17-year-olds gained 2 points, while their Black and Hispanic classmates gained a whopping 25 and 17 points, respectively. This gives me a chance to talk about Simpson’s paradox. The paradox occurs when “a trend that appears in different groups of data disappears when these groups are combined, and the reverse trend appears for the aggregate data.” In this case, the overall trend for 17-year-olds is flat while each group gained, some by a lot. The reason is that the distribution of racial/ethnic groups changed significantly between 1975 and 2008. Here is the distribution of the NAEP samples for the two years:

The proportion of Black and Hispanic 17-year-olds is larger, while the proportion of White students in 2008 is 25 percentage points lower than it was in 1975. Even though Black and Hispanic scores also rose by a lot, those groups were still lower-performing than their White peers in 2008. Thus, every group gained, but when their performance is combined the overall trend is flat: the average is increasingly weighted toward groups that score lower, which offsets the gains within each group.
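The mechanism is nothing more exotic than a weighted average whose weights shift over time. Here is a minimal Python sketch; the group scores and population shares are illustrative numbers chosen to be consistent with the gains cited above (every group up, White share down 25 percentage points), not the actual NAEP tables:

```python
# Illustrative (not actual NAEP) numbers showing Simpson's paradox:
# every group's score rises, yet the weighted overall score does not,
# because the weights shift toward lower-scoring groups.

def overall_score(shares, scores):
    """Population average = share-weighted mean of group averages."""
    assert abs(sum(shares.values()) - 1.0) < 1e-9  # shares must sum to 1
    return sum(shares[g] * scores[g] for g in shares)

# Group averages: every group gains (+2, +25, +17).
scores_1975 = {"White": 293, "Black": 241, "Hispanic": 252}
scores_2008 = {"White": 295, "Black": 266, "Hispanic": 269}

# Sample composition: White share falls 25 percentage points.
shares_1975 = {"White": 0.84, "Black": 0.12, "Hispanic": 0.04}
shares_2008 = {"White": 0.59, "Black": 0.16, "Hispanic": 0.25}

then = overall_score(shares_1975, scores_1975)  # 285.1
now = overall_score(shares_2008, scores_2008)   # 283.9
print(f"1975 overall: {then:.1f}")
print(f"2008 overall: {now:.1f}")
print(f"Change: {now - then:+.1f} (despite every group gaining)")
```

With these made-up shares, the overall average actually dips slightly even though all three groups gained, which is exactly the reversal Simpson’s paradox describes.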

Clearly, no one would argue that an achievement gap, even a narrowing one, is acceptable or that we can move on to other things. But it’s just as absurd to look at these gains and find evidence of failing schools, as Stossel does. And the absurdity doesn’t end there. Stossel, it turns out, is a master cherry picker of data. Let’s look at the rest of NAEP Long Term Trends:

  • Reading, 13-year-olds, 1971-2008: Overall scores +12; Black students +23; Hispanic +24.
  • Reading, 9-year-olds, 1971-2008: Overall +5; Black +21, Hispanic +10.
  • Mathematics, 17-year-olds, 1978 (first year tested)-2008: Overall +6, Black +19, Hispanic +17.
  • Mathematics, 13-year-olds, 1978-2008: Overall +17, Black +32, Hispanic +17.
  • Mathematics, 9-year-olds,  1978-2008: Overall +24, Black +32, Hispanic +30.

Notice a pattern? If one were to apply Stossel’s grossly oversimplified cost-to-scores analysis — and I’m not saying you should — you would have to say our public schools are producing a return on our investment. Then again, how he got those cost figures is another topic for another day.

Filed under: Achievement Gaps, Data, Demographics, Public education — Patte Barth @ 2:46 pm

February 26, 2013

The changing face of America and its schools

Our report on how demographic shifts are changing the cultural landscape of the United States and its education system remains one of our most popular. So, I think you’ll enjoy this recent graphic representation of 2010 U.S. Census data, courtesy of Education Week.

Filed under: CPE, Data — NDillon @ 3:21 pm

January 22, 2013

The flu season in graphics

We’re officially at the midway point of the flu season, and while we won’t know for months what kind of havoc influenza wreaked on the U.S., early CDC reports indicate the flu has struck decisively in most parts of the country. What does that mean in real dollars and cents? Check out the graphic below to get an idea of the toll the flu can take on your bottom line.



Filed under: Data — NDillon @ 2:43 pm

January 10, 2013

Gates Foundation report mirrors CPE’s findings

On Tuesday, the Gates Foundation released its third and final report on how (and if) teacher effectiveness can be quantitatively evaluated. Appropriately titled Measures of Effective Teaching, or MET, the study produced findings that were hardly earth-shattering but noteworthy nonetheless. Why?

The sheer size of the project— it spanned three years, cost $45 million, studied 3,000 teachers from eight districts across seven states and involved numerous universities and the Educational Testing Service— made it hard to ignore.

Despite all of the resources dumped into this effort, however, the findings were remarkably similar to what the Center for Public Education discovered in its 2011 report, “Building a Better Evaluation System.”

Among the most important takeaways from that report was the importance of using multiple measures to develop an accurate picture of whether and how much a teacher was contributing to student learning.

Surprise, surprise, the Gates Foundation discovered the same thing and determined that a combination of classroom observations, test scores, and student surveys taken as a whole was a solid indicator of teacher effectiveness.

Certainly, there are still some critics who disagree with the MET study’s whole premise — that data collection and disaggregation can be an effective means for identifying effective (and ineffective) teachers. To them, too many outside factors, from a child’s socioeconomic background to the level of parental involvement, impact student growth and make it impossible to truly ascertain individual teacher quality.

So-called value-added or growth models that attempt to isolate these external variables are not any more reliable, opponents say, because of the huge fluctuations that can occur from year to year.

While value-added models aren’t perfect, CPE’s report found they are far better than current methods of measuring teacher effectiveness. With time and more data, CPE further noted, those wide swings diminish, providing greater clarity to educators about what is and isn’t working. But determining what’s effective and what’s not is nearly impossible without real data and metrics. This fact is yet another reason why the MET report has commanded, and deserves, attention — though CPE arrived at the same conclusion for about $45 million less. – Naomi Dillon


Superfluous grades; StudentsFirst ranking considers performance last

January so far is looking like Michelle Rhee month. Last night the self-described education reformer was the hour-long focus of PBS’s Frontline series. The day before, her organization StudentsFirst released its report card on the state of education policy in which Rhee and her colleagues “flunked” most states. The headlines wrote themselves (see here and here).

But before we collectively freak out about our own states’ GPA, let’s take a critical look at what StudentsFirst is grading. First — and I can’t emphasize this enough — there are no points awarded for education performance. None. Zero. So if you’re concerned about that ‘F’, Vermont, relax.  You are still a high-achieving state.

What they did look for were state education policies that aligned with the StudentsFirst agenda. These include, among others, basing teacher and principal staffing decisions on student achievement measures, and “empowering parents” through charter schools and vouchers. Limiting their rankings to policies, however, leads to some strange juxtapositions.

In the following table, I list the ten top-performing states in education as identified in KidsCount, the annual report card published by the Annie E. Casey Foundation. The KidsCount education index includes pre-k participation, NAEP scores in reading and math, and high school graduation rates. I then compare these to each state’s StudentsFirst grade:

Not much relationship here between achievement and StudentsFirst policy preferences. When looking only at the StudentsFirst grades in school choice, the relationship is even sketchier: six of the top ten states earned an “F,” while the highest grade was a “D.”

The Brookings Institution released a much less publicized report card before Christmas that graded urban districts on school “choice and competition,” but, like StudentsFirst, placed little value on actual performance. Likewise, the Brookings rankings look a little wacky when compared to district performance. For example, New York City was ranked second with a letter grade of B+. Yet its eighth-graders performed significantly below the overall national average on NAEP in math. Number three-ranked D.C. (a “B”) was 19th out of 21 urban districts on the same test. In contrast, middle-schoolers in urban Austin exceeded the national average for all students. Brookings gave that Texas school district an F. (Comparable data were not available for number one-ranked New Orleans.)

It would be easy to go overboard drawing conclusions from these inconsistencies. To begin with, we can’t say for sure that the Brookings/StudentsFirst agendas work against achievement. But we can say one thing: many high-profile organizations are promoting education reform policies that do not have a proven track record to support them.
