Archives for category: Data

Gene V. Glass is one of our nation’s most distinguished education researchers.

This post is an important analysis of the failure of the U.S. Department of Education’s Institute of Education Sciences, which was reorganized during the George W. Bush administration.

As a new administration moves into the US Department of Education, the opportunity arises to review and assess the Department’s past practices. A recent publication goes to the heart of how US DOE has been attempting to influence public education. Unfortunately, in an effort to justify millions of dollars spent on research and development, bureaucrats pushed a favorite instructional program that teachers flatly rejected.

The Gold Standard

There is a widespread belief that the best way to improve education is to get practitioners to adopt practices that “scientific” methods have proven to be effective. Top research journals require these increasingly sophisticated methods, as do federal improvement initiatives such as the Investing in Innovation (i3) program, as a condition for funding further research or dissemination efforts. The US DOE established the What Works Clearinghouse (WWC) to identify these scientific gold standards and apply them to certify for practitioners which programs “work.” The federal “gold standard” is the Randomized Comparative Trial (RCT). In addition, US DOE policies have periodically required practitioners to use government funds only for practices that the US DOE has certified to be effective.

However, an important new article published in Education Policy Analysis Archives concludes that these gold-standard methods misrepresent the actual effectiveness of interventions, so that policies advocating or requiring their use mislead practitioners. The article is entitled “The Failure of the U.S. Education Research Establishment to Identify Effective Practices: Beware Effective Practices Policies.”

The Fool’s Gold

Earlier published work by the author, Professor Stanley Pogrow of San Francisco State University, found that the most research-validated program, Success for All, was not actually effective. Quite the contrary! Pogrow goes further and analyzes why these gold-standard methods cannot be relied on to guide educators to more effective practice.

Researchers have told us that we need “randomized comparative trials” to reach “research-based conclusions.”

In fact, say Glass and the article he cites, this is not what happens. And the results of these trials turn out to be easily manipulated and falsified.

He writes:

Key problems with the Randomized Comparative Trial include: (1) the RCT almost never tells you how the experimental students actually performed; (2) the difference between groups that researchers use to declare a program effective is typically so small that it is “difficult to detect” in the real world; and (3) the data are statistically manipulated to the point that the numbers being compared are mathematical abstractions with no real-world meaning, which researchers then try to make intelligible with hypothetical extrapolations, such as claiming that the difference favoring the experimental students is equivalent to increasing results from the 50th to the 58th percentile, or to an additional month of learning. The problem is that we do not know whether the experimental students actually scored at the 58th or the 28th percentile. So in the end, we end up not knowing how students in the intervention actually performed, and any benefits that are found are highly exaggerated.
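
To make that extrapolation concrete, here is a minimal sketch of the arithmetic behind the “50th to the 58th percentile” conversion, assuming (as such extrapolations do) normally distributed scores; the effect size of 0.2 is a hypothetical value chosen only to reproduce that figure:

```python
from statistics import NormalDist

def percentile_shift(effect_size: float) -> float:
    """Percentile rank of the average experimental student, given an
    effect size (difference in group means, in standard deviations),
    when the average control student sits at the 50th percentile."""
    return NormalDist().cdf(effect_size) * 100

print(round(percentile_shift(0.2)))  # prints 58
```

Note that the conversion is purely relative: an effect size of 0.2 yields “50th to 58th” whether both groups scored near the top of the test or near the bottom, which is precisely the missing information the critique points to.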

The sad part of the story is that we now have a new administration that is both ignorant of research and indifferent to it. DeVos has seen the failure of school choice in her own state, whose schools have plummeted in the national rankings since 2003, and it has had no impact on her ideology. Ideology is not subject to testing or research. It is a deep-seated belief system that cannot be dislodged by evidence.

This morning, I posted an evaluation by Mathematica Policy Research, which concluded that the federal School Improvement Grants had no effect on test scores. A reader named Sara explains here why the SIG program failed, after it spent $3.5 billion:

The SIG required certain interventions and did not give any autonomy or decision-making power to the people who already worked in the school.

So, for example, in the school where I work, SIG required that an outside organization provide social-emotional support to students, rather than supplementing the counseling and social work staff, who are highly qualified and already know the students. Whenever new people come into a situation, there is a long learning curve. Also, people from an outside organization do not have a long-term commitment to the school.

Another example: staff came in under the grant who merely measured and “coached,” when what the school really needed was smaller class sizes, for example another math teacher instead of a “coach.” Experienced teachers for the most part know what to do; they are just overwhelmed by the large number of students who have special issues, and they do not have support.

Hundreds of thousands of dollars were spent on technology – but the librarian and IT person were let go.

The presumption on the part of the grant’s administrators (who were not in the school) was that the problems in the school lay with the teachers, not with poverty, an insufficient number of qualified staff, and an unstable district.

Stephen Henderson is the editorial page editor of the Detroit Free Press. He is not anti-charter; his own children attend a Detroit charter school. He is opposed to lies and propaganda. He has written that the charter movement has done nothing to lift the children of Detroit, and that there are as many bad charter schools as public schools. He has written critically of DeVos’s successful efforts to torpedo accountability and oversight of charter schools.

When it comes to data and research, he says, DeVos is not to be trusted.

He writes:

A true advocate for children would look at the statistics for charter versus traditional public schools in Michigan and suggest taking a pause, to see what’s working, what’s not, and how we might alter the course.

Instead, DeVos and her family have spent millions advocating for the state’s cap on charter schools to be lifted, so more operators can open and, if they choose, profit from more charters.

Someone focused on outcomes for Detroit students might have looked at the data and suggested better oversight and accountability.

But just this year, DeVos and her family heavily pressured lawmakers to dump a bipartisan-supported oversight commission for all schools in the city, and then showered the GOP majority who complied with more than $1 million in campaign contributions.

The Department of Education needs a secretary who values data and research, and respects the relationship between outcomes and policy imperatives.

Nothing in Betsy DeVos’ history of lobbying to shield the charter industry from greater accountability suggests she understands that.

If she’s confirmed, it will be a dark day for the value of data and truth in education policy.

Bruce Baker of Rutgers University is frustrated. He and colleagues have published study after study about the uses and misuses of standardized test scores to measure teachers and schools. The evidence is clear, he writes. Yet states remain devoted to failed, erroneous methods that lack any supporting evidence!

“It blows my mind, however, that states and local school districts continue to use the most absurdly inappropriate measures to determine which schools stay open, or close, and as a result which school employees are targeted for dismissal/replacement or at the very least disruption and displacement. Policymakers continue to use measures, indicators, matrices, and other total bu!!$#!+ distortions of measures they don’t comprehend, to disproportionately disrupt the schools and lives of low income and minority children, and the disproportionately minority teachers who serve those children. THIS HAS TO STOP!”

Pasi Sahlberg is the great Finnish educator whose book Finnish Lessons gave us a vision of a nation that succeeds without high-stakes testing, without standardized testing, and without charter schools or vouchers. He wrote of highly educated teachers who have wide scope and autonomy in their classrooms and who collaborate with their colleagues to do what is best for their children. He wrote of a national school system that values the arts, physical activity, and play. And, lo and behold, the OECD calls it the best school system in the world!

So entranced was I by what I read about Finland that I visited there a few years ago and had Pasi as my guide. The schools and classes were everything he claimed and more.

Pasi, like many other education experts, is aghast at the GERM (Global Education Reform Movement) that has swept the world. The agency that has spread GERM far and wide is international testing, the great horse race that only a few can win. Since most are losers, the frenzy for more testing becomes even stronger.

Pasi suggests a different approach. Instead of Big Data, produced by mass standardized testing, why not search for small data? 

Here is Pasi’s thumbnail sketch of the contrast between Big Data and small data:

Big data is a commonly used term in daily discourse that often comes with a label that big data will transform the way we think, work, and live. For many of us, this is an optimistic promise, while for others it creates anxiety and concern regarding control and privacy. In general terms, big data means data of very large size to the extent that its manipulation and management present significant practical challenges.
The main difference between big and small data in education is, of course, the size of data and how these data are collected and used. Big data in education always requires dedicated devices for collecting massive amounts of noisy data, such as specific hardware and software to capture students’ facial expressions, movements in class, eye movements while on task, body postures, classroom talk, and interaction with others. Small data relies primarily on observations and recordings made by human beings. In education, these include students’ self-assessments, teachers’ participatory notes on learning process, external school surveys, and observations made of teaching and learning situations.

To watch and listen to Pasi, introduced by Howard Gardner, here is the lecture he gave at Wellesley College in October. 

Mercedes Schneider has assembled data on a scandal in D.C. In recent years, Michelle Rhee and Kaya Henderson claimed that it was the fastest-improving district in the nation, and the national media repeated their claim.

Retired D.C. teacher Erich Martel alerted me to what appeared to be the cooking of the books, and I connected him to Schneider.  She confirms that D.C. cooked NAEP data to overstate gains for the district.

She reviews the data and concludes:

“Rhee took the DCPS helm in June 2007; when she left in 2010, her deputy, Kaya Henderson, took over.

“According to NAEP, they both failed. So has the mayoral control of schools responsible for both Rhee and Henderson. And what is particularly striking is that these “reformers” would rather lie to the public about their success by concealing information than confront their failure and change their corporate-reform-fed course.

“I challenge DC Mayor Muriel Bowser to offer a public response to Martel’s NAEP story as publicized in this post, and I challenge DCPS to post the full spectrum of DC’s NAEP results, beginning with the 1998/2000 results; to make such posting easily accessible on the DCPS website, and to use accurate numbers.”

I discovered this wonderful article on Twitter. It was published in 2013, but it remains timely.

Albert Einstein’s great breakthrough came when he put known measures to one side. The notion that time and space were regular and linear was entrenched in science, and had led to an impasse that prevented science from making sense of the universe. Seeing that time and space might flex led to the Theory of Relativity, and led Einstein to the realisation that philosophical steps must be taken if breakthroughs were to be made. This philosophical context for his science led him to see that “not everything that counts can be counted, and not everything that can be counted counts”.

My wife and I recently took our children out of their local primary school to travel in the former Transkei for seven weeks. We thought that this would be a wonderful experience for them, experiencing life on the road in a completely different culture. Their school was programmed to see it differently. Its Ofsted rating could be adversely affected by the absence, and by the prospect of a six-year-old and four-year-old performing slightly less well in their assessments. The scientific culture of measurement risks so narrowing the concept of education that the system becomes unable to see any benefits (which cannot be directly measured) of such a trip.

Social or ‘impact’ investors, such as Panahpur, with which I work, try to achieve their purpose by blending the art of achieving their social goals with the science of managing their funds. They make financial investments for social as well as financial returns.

The key challenge of doing this is understanding if, and how, the art of achieving social ‘returns’ can be measured in any scientifically robust way.

Successive governments have all but admitted defeat when it comes to the state’s ability to solve certain intractable social problems. There is a general recognition that charities, faith groups and other civil society organisations have an important role in reaching the parts that the statutory social services cannot reach. But to deliver their potential, they require access to capital. Contracting them to provide services has often led to a repeat of the problems of state provision, with a focus on inputs rather than outcomes. All this has led to the emerging world of social impact bonds and payment-by-results (PBR).

We invested directly in the first social impact bond, at HMP Peterborough, and have since invested indirectly in others. PBR is now a popular idea, one that might direct capital to those who can actually solve these problems. If charities and civil society organisations can do what the state cannot and help people to transform their lives, so the argument goes, then let’s pay them when they deliver.

But most charities lack the balance-sheet strength to provide services at risk, which means that PBR leaves private sector providers as the only realistic option. These private sector providers have a fiduciary duty to extract all the financial value they can from these contracts and return it to shareholders. So PBR contracts can become a proxy for privatisation. The extent of the privatisation of social services resulting from this and other trends is well documented in Social Enterprise UK’s report, ‘The Shadow State’, and can easily be seen in the experience of the Work Programme.

All this is complex enough before one brings Einstein into it. But he would recognise that the intractable social problems that increasingly take the lion’s share of social service budgets can only be solved through the complex, time-consuming and uncertain process of human transformation. Dysfunctional, deeply disadvantaged and distressed individuals need to turn their lives around, one by one. Graham Allen MP, through his work on early intervention, has demonstrated that the most cost-efficient interventions occur during the first three years of an at-risk person’s life.

The truth is that, in the context of gnarly social problems, building the link between inputs and outcomes is often impossible. These social problems require long-term interventions, in the context of an uncertain rehabilitation journey and a chaotic client group. The work needs persistence and, ultimately, love. Can one really measure the outcomes of particular interventions? Can a time-bound, tightly contracted and assessed financial confection achieve this deep change?

There is no question that the social impact bonds – for example, with offenders at HMP Peterborough or with children at risk of being taken into care in Essex – offer an exciting new opportunity to create significant positive social change by aligning the state, investors and the taxpayer.

But perhaps we need a deeper discussion, where we have the humility to accept that the relationship between inputs and outcomes of many things that society needs cannot be directly measured. And where we allow ourselves to make the philosophical leap that delivering and measuring social outcomes is not necessarily linear and regular.

Should we do this, we’d be led inexorably to a need to rediscover the notion of common values. Inevitably, in the context of the available evidence and budgets, we need to agree that some things should be done because they are the right thing to do, trusting that better long-term outcomes for the most distressed people in society will in turn produce taxpayer savings.

If we can do this, we might be able to direct capital with appropriate rigour to the best placed organisations to deliver long term change. If we cannot, we risk PBR just being the latest in a line of contracting methodologies that fail to address the root causes of our problems. Or, as Einstein might say, we may be doing what we can count but we may well not be doing what actually counts.

James Perry is chief executive at the social investment foundation Panahpur. James is speaking at the Oxford Jam session ‘If It Can’t Be Measured, It Doesn’t Exist’ on Thursday (today) at 4pm.

Blogger Jersey Jazzman is an experienced teacher and graduate student at Rutgers, where he has learned how reformers play games with data. He is better than they are and can be counted on to expose their tricks.

In this post, he blows away the myth of the “success” of Boston charter schools.

The public schools and the charter schools in Boston do not enroll the same kinds of students, due to high attrition rates in the charters (called Commonwealth charter schools).

He writes:

“As I pointed out before, the Commonwealth charter schools are a tiny fraction of the total Boston high school population. What happens if the cap is lifted and they instead enroll 25 percent of Boston’s students? What about 50 percent?

“Let’s suppose we ignore the evidence above and concede a large part of the cohort shrinkage in charters is due to retention. Will the city be able to afford to have retention rates that high for so many students? In other words: what happens to the schools budget if even more students take five or six or more years to get through high school?

“In a way, it doesn’t really matter if the high schools get their modest performance increases through attrition or retention: neither is an especially innovative way to boost student achievement, and neither requires charter school expansion. If Boston wants to invest in drawing out the high school careers of its students, why not do that within the framework of the existing schools? Especially since we know redundant school systems can have adverse effects on public school finances?”
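
As a rough illustration of that budget question, here is a back-of-the-envelope sketch; the per-pupil figure and cohort numbers are invented for the example, not drawn from Boston’s actual budget:

```python
# Invented figures: rough cost to a district when students take extra
# years to finish high school. Assume $15,000 per pupil per year.
PER_PUPIL = 15_000

def extra_cost(students: int, extra_years: float) -> int:
    """Added spending when `students` each take `extra_years` longer."""
    return int(students * extra_years * PER_PUPIL)

# Hypothetical: 1,000 students each take one extra year to graduate.
print(f"${extra_cost(students=1000, extra_years=1):,}")  # $15,000,000
```

Even modest retention rates, applied to a large share of a city’s students, compound into tens of millions of dollars a year.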

Conclusion: Jersey Jazzman opposes Question 2, which would lead to unsustainable growth in charter schools, free to push out the students they don’t want.

Cathy O’Neil has written a new book called “Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy.” I haven’t read it yet, but I will.

In this article, she explains that VAM is a failure and a fraud. The VAM fanatics in the federal Department of Education and state officials could not admit they were wrong, could not admit that Bill Gates had suckered the nation’s education leaders into buying his goofy data-based evaluation mania, and could not abandon the stupidity they inflicted on the nation’s teachers and schools. So they say now that VAM will be one of many measures. But why include an invalid measure at all?

As she travels on her book tour, people ask questions; the most common claim she hears is that VAM is only one of multiple measures.

She writes:

“Here’s an example of an argument I’ve seen consistently when it comes to the defense of the teacher value-added model (VAM) scores, and sometimes the recidivism risk scores as well. Namely, that the teacher’s VAM scores were “one of many considerations” taken to establish an overall teacher’s score. The use of something that is unfair is less unfair, in other words, if you also use other things which balance it out and are fair.

“If you don’t know what a VAM is, or what my critique about it is, take a look at this post, or read my book. The very short version is that it’s little better than a random number generator.

“The obvious irony of the “one of many” argument is, besides the mathematical one I will make below, that the VAM was supposed to actually have a real effect on teachers’ assessments, and that effect was meant to be valuable and objective. So any argument about it which basically implies that it’s okay to use it because it has very little power seems odd and self-defeating.

“Sometimes it’s true that a single inconsistent or badly conceived ingredient in an overall score is diluted by the other stronger and fairer assessment constituents. But I’d argue that this is not the case for how teachers’ VAM scores work in their overall teacher evaluations.

“Here’s what I learned by researching and talking to people who build teacher scores: most of the other things they use – primarily scores derived from categorical evaluations by principals, teachers, and outside observers – have very little variance. Almost all teachers are considered “acceptable” or “excellent” by those measurements, so they all turn into the same number or numbers when scored. That’s not a lot to work with, if the bottom 60% of teachers have essentially the same score, and you’re trying to locate the worst 2% of teachers.

“The VAM was brought in precisely to introduce variance to the overall mix. You introduce numeric VAM scores so that there’s more “spread” between teachers, so you can rank them and you’ll be sure to get teachers at the bottom.

“But if those VAM scores are actually meaningless, or at least extremely noisy, then what you have is “spread” without accuracy. And it doesn’t help to mix in the other scores.”
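
Her variance argument is easy to demonstrate with a small simulation. This is a hypothetical sketch, not her code: the ratings, weights, and cohort size are all invented, and the VAM is modeled, per her “random number generator” description, as pure noise:

```python
import random

random.seed(0)

# 1,000 hypothetical teachers indexed by true quality: index 0 is the
# weakest, index 999 the strongest.
N = 1000

# Observational ratings with almost no variance: the bottom 60% all
# receive the same "acceptable" rating, everyone else "excellent".
observation = [2.0 if rank < 0.6 * N else 3.0 for rank in range(N)]

# Model the VAM score as pure noise, uncorrelated with true quality.
vam = [random.gauss(0, 1) for _ in range(N)]

# VAM as "one of many measures": a 50/50 composite with observation.
composite = [0.5 * o + 0.5 * v for o, v in zip(observation, vam)]

# Flag the bottom 2% by composite; how many are truly the weakest 2%?
flagged = sorted(range(N), key=lambda r: composite[r])[: N // 50]
hits = sum(1 for r in flagged if r < N // 50)
print(f"{len(flagged)} teachers flagged; {hits} truly among the weakest 2%.")
```

Because the observational score is flat across the bottom 60%, the flagged list is effectively a random draw from that group: the composite has “spread,” but the spread is supplied entirely by noise.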

This is a book I want to read. Bill Gates should read it too. Send it to him and John King too. Would they read it? Not likely.

We have had quite a lot of back and forth on this blog about Boston charter schools, in anticipation of the vote this November in Massachusetts about lifting the charter cap and adding another 12 charter schools every year forever. Pro-charter advocates argue that the Boston charters are not only outstanding in test scores but that their attrition rate is no different from that of the public schools, or possibly even less than the public schools.

Jersey Jazzman (aka Mark Weber) is a teacher and is studying for his doctorate at Rutgers, where he specializes in data analysis.

In this post, he demolishes the claim that Boston charters have a low attrition rate. As he shows, using state data:

In the last decade, Boston’s charter sector has had substantially greater cohort attrition than the Boston Public Schools. In fact, even though the data is noisy, you could make a pretty good case the difference in cohort attrition rates has grown over the last five years.

Is this proof that the independent charters are doing a bad job? I wouldn’t say so; I’m sure these schools are full of dedicated staff, working hard to serve their students. But there is little doubt that the public schools are doing a job that charters are not: they are educating the kids who don’t stay in the charters, or who arrive too late to feel like enrolling in them is a good choice.

This is a serious issue, and the voters of Massachusetts should be made aware of it before they cast their votes. We know that charter schools have had detrimental effects on the finances of their host school systems in other states. Massachusetts’ charter law has one of the more generous reimbursement policies for host schools, but these laws do little more than delay the inevitable: charter expansion, by definition, is inefficient because administrative functions are replicated. And that means less money in the classroom.

Is it really worth expanding charters and risking further injury to BPS when the charter sector appears, at least at the high school level, to rely so heavily on cohort attrition?
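
For readers who want to check claims like this themselves, here is a minimal sketch of the cohort-attrition arithmetic; the enrollment numbers are invented for illustration, whereas the real analysis draws on Massachusetts state enrollment files:

```python
def cohort_attrition(grade9: int, grade12: int) -> float:
    """Share of a 9th-grade cohort no longer enrolled three years
    later, when that cohort reaches 12th grade."""
    return (grade9 - grade12) / grade9

# Hypothetical cohorts: 9th-grade enrollment vs. 12th-grade enrollment
# three years later, for a district and a charter sector.
district = cohort_attrition(grade9=4500, grade12=3900)
charter = cohort_attrition(grade9=500, grade12=350)
print(f"district: {district:.1%}  charter: {charter:.1%}")
# district: 13.3%  charter: 30.0%
```

One caveat the post itself raises: simple cohort shrinkage cannot by itself distinguish students who leave from students who are held back an extra year, which is why the retention question above matters.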