Fred Smith was the testing expert at the New York City Board of Education for many years. After he retired, he became a relentless truth-teller about the flaws of standardized testing and the clever means of distorting the stats to produce the desired results. He currently acts as an unpaid advisor to opt-out parents.

Smith sent this article from 2007 that shows how Mayor Bloomberg and Chancellor Klein played games with the data, in this case blaming “immigrant kids” for a drop in test scores.

Mayor Bloomberg and his schools chancellor, Joel Klein, have reaffirmed that old Mark Twain saying about the three kinds of lies: lies, damn lies and statistics.

Using a PowerPoint presentation filled with glitzy graphs and color charts, Klein reached a new low yesterday by attempting to blame a sharp drop in this year’s third-, fourth- and fifth-grade reading scores on thousands of immigrant pupils.

According to the chancellor, the drop in the lower grade scores was solely because of the federal government’s new requirement that all children classified as English-language learners, or ELLs, must take the regular state tests after being in the country just one year.

Because of that requirement, some 30,000 more ELLs took the state test this year than in 2006, Klein said, and their lower scores dragged down overall city results.

Fred Smith was outraged when he heard Klein’s explanation. Smith, you see, spent three decades analyzing tests for our city’s school system, so he knows a thing or two about how chancellors paint the prettiest picture for the public.

“They never told you that back in 2005, during the mayoral race, the school district quietly increased the number of exemptions for ELL kids and then claimed a record boost in scores,” Smith said.

In 2009, with Bloomberg’s fellow billionaire Merryl Tisch in charge of the New York Board of Regents, test scores in the city went through the roof. After the mayoral primary election was safely past, the Regents commissioned a report by professors Daniel Koretz and Jennifer Jennings showing that the test questions had become familiar, leading to score inflation, and that the dramatic rise was not real.

Also, in an amusing turn of events, New York City won the Broad award in 2007 as the most improved urban district, right before the NAEP results were released, showing that the city had made no gains on NAEP.

In 2010, Jennifer Medina of the New York Times wrote about the perils of overreliance on standardized tests and how it affected New York City in particular.

She wrote:

When New York State made its standardized English and math tests tougher to pass this year, causing proficiency rates to plummet, it said it was relying on a new analysis showing that the tests had become too easy and that score inflation was rampant.

But evidence had been mounting for some time that the state’s tests, which have formed the basis of almost every school reform effort of the past decade, had serious flaws.

The fast rise and even faster fall of New York’s passing rates resulted from the effect of policies, decisions and missed red flags that stretched back more than 10 years and were laid out in correspondence and in interviews with city and state education officials, administrators and testing experts.

The process involved direct warnings from experts that went unheeded by the state, and a city administration that trumpeted gains in student performance despite its own reservations about how reliably the test gauged future student success.

It involved the state’s decision to create short, predictable exams and to release them publicly soon after they were given, making coaching easy and depriving test creators of a key tool: the ability to insert in each test questions for future exams. Next year, for the first time, the tests will not be released publicly.

It involved a national push for numbers-based accountability, begun under President George W. Bush and reinforced by President Obama. And it involved a mayor’s full embrace of testing as he sought to make his mark on the city, and then to get re-elected.

“They just kept upping the stakes with the scores, putting more pressure on the schools but not really looking at what it all means,” said Pedro Noguera, an education professor at New York University who has worked with the city’s Department of Education to help improve struggling schools.

New York has been a national model for how to carry out education reform, so its sudden decline in passing rates may be seen as a cautionary tale. The turnaround has also been a blow to Mayor Michael R. Bloomberg and his chancellor, Joel I. Klein, who despite warnings that a laserlike focus on raising scores could make them less and less reliable, lashed almost every aspect of its school system to them. Schools were graded on how much their scores rose and threatened with being closed if they did not. The scores dictated which students were promoted or left back, and which teachers and principals would receive bonuses.

Even now, the city believes that the way it uses the tests is valid. The mayor and the chancellor have forcefully defended their students’ performance, noting that even after the changes this year, student scores are still better than they were in 2002. They have argued that their students’ progress is more important than the change in the passing rate, and that years of gains cannot be washed away because of a decision in Albany to require more correct answers from every student this year.

The test scores were even used for a new purpose this year: to help determine which teachers should receive tenure.

“This mayor uses data and metrics to determine whether policies are failing or succeeding,” said Howard Wolfson, the deputy mayor for government affairs and communications. He also helped run Mr. Bloomberg’s re-election campaign in 2009, using the city’s historic rise in test scores to make the case for a third term. “We believe that testing is a key factor for determining the success of schools and teachers.”

“Under any standard you look at,” he added, “we have improved the schools.”

But given all the flaws of the test, said Prof. Howard T. Everson of the City University of New York’s Center for Advanced Study in Education, it is hard to tell what those rising scores really meant.

“Teachers began to know what was going to be on the tests,” said Professor Everson, who was a member of a state testing advisory panel and who warned the state in 2008 that it might have a problem with score inflation. “Then you have to wonder, and folks like me wonder, is that real learning or not?”

New Generation of Tests

The problems that plagued New York’s standardized tests can be traced to the origin of the exams.

In 1996, New York set about creating tests for fourth and eighth graders as a way to measure whether schools were doing their jobs. A precursor to the widespread testing brought about by Mr. Bush’s No Child Left Behind law, the tests replaced more basic exams that had been given in the same grades, which simply determined whether students needed remedial instruction. (The city had also given its own tests for many years.)

Teachers pushed back, saying they could gauge their students’ performance better than any mass-produced tests could. “There was a lot of resistance from throughout the education community to having the tests,” said Alan Ray, who was the chief spokesman for the State Education Department in the 1990s and in 2000, and retired this year after overseeing data for the office.

But education officials in New York, and many other states, were coming to the conclusion that some measurement system, no matter how limited, was necessary.

The officials sought advice from dozens of educators across New York to figure out what the tests should encompass, Mr. Ray said. Teachers and principals asked that the standards be specific, to make it clear what they were expected to teach at each grade level, and superintendents pleaded to keep the tests relatively short so that students would not spend days filling in bubbles. The state obliged both requests.

The decision to keep the tests narrow and short — the fifth-grade math test, for example, had 34 questions this year — would have a lasting impact, said Daniel Koretz, a professor at Harvard’s Graduate School of Education who specializes in assessment systems. The same types of questions would be trotted out every year, he said.

“In many cases you could not write an unpredictable question no matter how hard you tried,” Professor Koretz said. He oversaw the study of New York’s tests that led to the state’s conclusion that they had become too easy to pass.

The state also continued making tests public after they were administered. Coupled with the questions’ predictability, the public release of the tests, which started long before the nationwide accountability movement, provided teachers with ready-made practice exams….

A Mayor Chases Results

The state tests’ flaws would not become evident for years. But by 2001, the tests had a champion.

During his first campaign, Mr. Bloomberg said that education was his top priority. He pledged to take control of the city’s public schools, then under the supervision of the Board of Education, which had been ridiculed for budget troubles and stagnant academic performance.

Projecting the image of a bottom-line-oriented, pragmatic businessman, Mr. Bloomberg latched on to test scores as a clear way of seeing just how well students were doing.

“If four years from now reading scores and math scores aren’t significantly better,” Mr. Bloomberg said in a radio interview in 2001, “then I will look in the mirror and say that I have been a failure. I’ve never failed at anything yet, and I don’t plan to fail at that.”

After Mr. Bloomberg persuaded the Legislature to give him control of the schools, he appointed Mr. Klein, a former Justice Department lawyer and media executive, as his chancellor. Mr. Klein was seen as a technocrat who was eager and able to produce tangible results, the kind that could be measured.

Scores in the city and state were on their way up. In 2004, for example, the proportion of fourth graders in the city meeting math standards increased to 68 percent, up 16 percentage points since 2001. Only 42 percent of eighth graders met that mark, but that was still a significant improvement from just a few years earlier. By 2009, that rate would jump nearly 30 points.

“What is encouraging is that for two or three years in a row now, the tests have gone in the same direction — up,” the mayor said on a radio show in October 2004. “So there’s reason to believe we’re headed to the correct place.”

In 2003, Mr. Bloomberg ended the practice of “social promotion” in certain grades, requiring students performing at the lowest levels on the tests be held back unless they attended summer school and showed progress on a retest. That year, Mr. Klein released a list of 200 successful schools, the only places where teachers would not have to follow the citywide math and English curriculums. The list was primarily based on test scores.

More and more of the mayor’s educational initiatives were linked to the scores. They were used to help decide which schools should be closed and replaced with new, smaller schools. The new A-through-F grading system for schools was based primarily on how their students improved on the tests. Teachers and principals earned bonuses of up to $25,000 if their schools’ scores rose. Teachers’ annual evaluations and tenure decisions are partially dependent on test results.

Each new policy was met with denunciations from the teachers’ union or from education experts like Diane Ravitch. Ms. Ravitch, a supporter of standardized testing when she was an adviser to the Clinton and Bush administrations, became one of the biggest critics, arguing that schools were devoting too much time to the pursuit of high scores.

“If they are not learning social studies but their reading scores are going up, they are not getting an education,” Ms. Ravitch said in 2005, as the mayor coasted to re-election.

The mayor and chancellor dismissed these criticisms as the hidebound defenses of an old, failed system devoid of meaningful standards. But some questions were also being raised by people close to the administration.

In the Education Department headquarters on Chambers Street, some officials argued that the A-through-F system of grading schools should incorporate not only the English and math tests, but also the science and social studies exams given by the state. “We wanted to draw this as broadly as possible,” said a former school official who spoke on the condition of anonymity to avoid publicly disagreeing with Mr. Klein.

But after months of running models and tweaking formulas, Mr. Klein decided to stick with the two core subjects. After all, he often argued, if students could not master essential math and English skills, it would be impossible for them to grasp other concepts.

Dr. Noguera, the N.Y.U. education professor and adviser to the city, applauded Mr. Klein for creating a grading system that rewarded improvement from year to year so that schools in poor neighborhoods had the same chance of achieving a good grade as those in wealthier areas.

But it also was risky, Dr. Noguera said. “That got schools fixated on how to raise scores, not looking for more authentic learning,” he said.

Dr. Noguera expressed his views publicly and to some of Mr. Klein’s deputies, but never directly told the chancellor, he said.

Mr. Klein said in recent interviews that while the tests were imperfect, they were still the best measurements available for a school system that previously had no yardsticks. They also were not the only signs proving the city had been making progress, he said: On more difficult federal tests given to a sample of fourth and eighth graders, the city had steadily improved.

And the city’s main goal, he said, was not simply giving out laurels for students’ scoring 3s (“proficient”) and 4s (“advanced”) on the state tests.

Instead, its system of school grades and teacher incentives gave considerable weight to scores that showed improvement from year to year at all levels.

“Nobody else was doing this,” Mr. Klein said. “We never said it was good enough to get to passing and just stay there.”

In 2006, the state added tests for the third, fifth, sixth and seventh grades, in order to align with the requirements of No Child Left Behind. Scores jumped in 2007.

There were improvements at every grade level across the state and in New York City, where 65 percent of all students met state standards in math, an improvement of eight percentage points in one year.

“I’m happy, thrilled — ecstatic, I think, is a better word,” Mr. Bloomberg said at the time. “The hard work going on in our schools is really paying off.”

After Mr. Bloomberg’s first full term as mayor, the new scores seemed to ratify his claims of success. They also raised more alarms.

As a superintendent in the Brownsville section of Brooklyn, Kathleen Cashin had seen several schools improve throughout the early part of the decade. But when she saw the sudden jump, she said, she was shocked.

“I said to my intimate circle of staff, this cannot be possible,” Ms. Cashin recalled. “I knew how much effort and how much planning any little improvement would take, and not all of these schools had done any of it.”

But Ms. Cashin, who retired in February, held her tongue at the time. Asked why she did not take up her concerns with Mr. Klein or his deputies, she said, “I didn’t have their ear.”

A Proposal for a Fix

The following winter, Professor Koretz, of Harvard, and Professor Everson, of CUNY, who was a member of a state testing advisory group, sent a memo to state education officials.

“Research has shown that when educators are pressured to raise scores on conventional achievement tests, some improve instruction, while others turn to inappropriate methods of test preparation that inflate scores,” they wrote in the Feb. 5, 2008, memo. “In some cases, the inflation of scores has been extreme.”

The researchers proposed to devise a kind of audit. While tests tended to be similar from year to year, they would add to each exam some questions that did not resemble those from previous years. If a class performed well on the main section of the test but poorly on the added questions, that would be evidence that scores were inflated by test preparation. If a class performed well on both, the researchers wrote, that teacher might have methods worth emulating.

In addition, they wrote, such a system would give teachers “less incentive to engage in inappropriate test preparation and more incentive to undertake the much harder task of improving instruction.”

State education officials, the professors said, did not give them a hearing.

The 2008 results showed even more large gains — 74 percent of city students were deemed proficient in math, an increase of nine points in one year; and the city’s passing rate in reading was now 58 percent, up from 51 percent two years earlier. Statewide, the passing rates jumped to 81 percent in math and 69 percent in reading.

Professor Koretz and Professor Everson wrote another memo in September 2008, again proposing to create a way to make test results more reliable. But the idea went nowhere….

The city’s Department of Education constantly mines test score data for patterns to show where improvement is happening and where it is needed. In 2008, it noticed an incongruity: Eighth graders who scored at least a 3 on the state math exam had only a 50 percent chance of graduating from high school four years later with a Regents diploma, which requires a student to pass a certain number of tests in various subjects and is considered the minimum qualification for college readiness.

The city realized that the test results were not as reliable as the state was leading people to believe.

Mr. Klein and several of his deputies spoke by phone with Merryl H. Tisch, the vice chancellor of the Board of Regents, and Mr. Mills, trying to persuade them to create a statewide accountability system similar to the city’s, one that gave improvement at least as much weight as the score itself.

The state said it would consider moving to such a system, but would need more time.

Neither the city nor state publicly disclosed the concerns about the scores. By then, students across the state were preparing for the 2009 tests, filling in bubbles on mock answer sheets, using at least three years of previous state tests as guides.

The scores arrived in May, and with them, the bluntest warning yet.

Just before the results were released, a member of the Regents named Betty Rosa called Ms. Tisch, who had recently become chancellor.

Ms. Rosa, who had been a teacher, principal and superintendent in the Bronx for nearly three decades, said the unprecedented high scores simply seemed too good to be true. She suggested the unthinkable: the scores were so unbelievable, she said, that the state should not publicly release them.

“The question was really are we telling the public the truth,” Ms. Rosa said in a recent interview. Ms. Tisch, she said, relayed that she, too, found the scores suspicious, but that it would be impossible to withhold them. “It was like a train that was already in motion and no way to stop it,” Ms. Rosa said.

The English test scores showed 69 percent of city students passing. Mr. Bloomberg called the results “nothing short of amazing and exactly what this country needs.”

“We have improved the test scores in English,” he continued, “and we expect the same results in math in a couple of weeks, every single year for seven years.” Four weeks later, it was announced that 82 percent of city students had passed the math tests.

Because of the widespread improvement in the scores, 84 percent of all public schools received an A in the city’s grading system, something Mr. Klein said he later regretted. This year, the city limited the number of A’s to 25 percent of schools.

The 2009 numbers came out as the mayor was trying to accomplish two goals: to persuade the Legislature to give the mayor control of the schools for another seven years; and to convince city voters that he deserved a third term.

Mr. Bloomberg’s opponent, Comptroller William C. Thompson, had once been president of the Education Board.

“Mike Bloomberg changed that system,” said one of the mayor’s campaign advertisements. “Now, record graduation rates. Test scores up, violence down. So when you compare apples to apples, Thompson offers politics as usual. Mike Bloomberg offers progress.”

In his debates, Mr. Bloomberg hammered home the theme. “If anybody thinks that the schools were better when Bill ran them, they should vote for him,” he said in one face-off. “And if anybody thinks they’re better now, I’d be honored to have their vote.”

Indeed, according to exit polls, 57 percent of those who said education was their primary concern voted for Mr. Bloomberg, who won the election by a five-point margin.

Mr. Wolfson, the deputy mayor and 2009 campaign strategist, said the mayor had no regrets about focusing on the exams as a matter of policy, and during the election.

“What’s the converse?” he said. “The converse is that we don’t test and we have no way of judging success or failure. Either you believe in standards or tests, or you don’t — and life is not like that. There are tests all the time.”

Ms. Tisch, in releasing the 2009 test results, had not heeded Ms. Rosa’s radical request. But the very day she put out the English test results, she began openly acknowledging doubts about the scores, irking the mayor and chancellor, who privately seethed that she was seeking to undermine their success. “As a board, we will ask whether the test is getting harder or easier,” she said.

Although the Regents did not immediately opt to create an entirely new test, Ms. Tisch and David Steiner, the new education commissioner, asked Professor Koretz, who had been rebuffed in previous requests, to analyze the ones that were in use. His conclusion — and that of another researcher, Jennifer L. Jennings — was that the tests had become too easy, and hence the scores were inflated. That led the State Education Department to raise the number of correct answers required to pass each test.

The state intends to rewrite future tests to encompass a broader range of material, and will stop publicly releasing them.

“We came in here saying we have to stop lying to our kids,” Ms. Tisch said in a recent interview. “We have to be able to know what they do and do not know.”

Bloomberg was first elected to the mayoralty in 2001. There was a two-term limit. He ran again in 2005, for what should have been his second and last term, and won easily. In 2009, he used his vast resources to persuade the City Council to vote to give him and themselves a third term. And that is how he qualified to run for a third term and used his education record as a reason to be re-elected.
Now, after all this investment in testing, test prep, interim assessments, etc., what were the results?
New York City has shown no gains in reading on NAEP from 2003 to 2019, in either fourth or eighth grade.
Make of it what you will.
If Bloomberg is the Democratic candidate against Trump, I will vote for him.
But please don’t believe the boasting about the New York City education miracle.
It never happened.
An update on some of the individuals mentioned in the New York Times’ 2010 article:
Betty Rosa is now Chancellor of the State Board of Regents. Kathleen Cashin is a member of the Board of Regents. Merryl Tisch is now on the board of the State University of New York (which has the power to authorize new charter schools, including those of Eva Moskowitz’s Success Academy chain).
David Steiner, now a professor at Johns Hopkins University, served for two years as State Commissioner, during which time he approved Mayor Bloomberg’s choice to succeed Joel Klein as NYC Chancellor, a retired magazine publisher named Cathie Black, who lasted three months. Steiner was also in charge of the State Education Department when it won a Race to the Top grant and committed the state to using student test scores to evaluate teachers, increasing the number of charter schools, and adopting the Common Core standards. These changes, in turn, gave rise to the parent-led Opt Out movement, in which parents refused to let their children take the state tests; it grew to represent 20% of the eligible students. John King succeeded David Steiner and eventually replaced Arne Duncan as U.S. Secretary of Education in the last year of President Obama’s second term.
When Joel Klein stepped down, he hired a Department of Education vendor named Wireless Generation and created a technology company called Amplify. Rupert Murdoch bought Amplify and invested a reputed $1 billion; newspaper stories predicted that Amplify would usher in a new age of hardware and software. However, the biggest sale of Amplify tablets and software was made to Guilford, North Carolina, which paid for it with Race to the Top funding; the deal turned into a disaster when chargers melted and other problems emerged. Guilford canceled the contract. Murdoch, having lost about $500 million, put the company up for sale. Laurene Powell Jobs bought it, and Amplify is now part of her Emerson Collective, selling “personalized learning.” Klein works for an online healthcare company called OSCAR, co-founded by Joshua Kushner, brother of Jared Kushner.