“We have to go from what is essentially an industrial model of education, a manufacturing model, which is based on linearity and conformity and batching people. We have to move to a model that is based more on principles of agriculture. We have to recognize that human flourishing is not a mechanical process; it's an organic process. And you cannot predict the outcome of human development. All you can do, like a farmer, is create the conditions under which they will begin to flourish.”
― Sir Ken Robinson
Education matters to New Yorkers. The top keywords driving traffic to NYC.gov were “schools” and “Department of Education.” Parents want to know the quality of their child’s school, businesses want to ensure their future employees have the right skills, and citizens of all stripes have an interest in education as a foundation for our democracy.
So how does New York City track education performance? What variables are considered important? How do those differ by area and by student, teacher, parent, and other community roles? What mechanisms are used to synthesize that data into performance metrics? This paper maps New York’s formal government pathways for measuring school performance and discusses opportunities for improvement.
NYC’s Department of Education performance dashboard includes metrics on average daily attendance or “ADA,” the percentage of students meeting or exceeding standards, average expenditure per student, school building quality, and students in schools that exceed capacity. Beyond these high-level metrics, the Division of Performance and Accountability has several data monitoring tools that are synthesized into “Progress Reports” that grade each school A-F. Somewhat uniquely in public education, these reports integrate parent, teacher, and student satisfaction surveys into the ultimate grades. In addition, schools are assessed through qualitative reviews based on superintendent site visits, legal compliance evaluations, and New York State academic tests.
Limitations of standardized testing data
The City’s formula that judges elementary and middle schools rests primarily on reading and math test scores. These results are quite volatile, and individual schools can jump from the very bottom to the very top, as PS 25 did when it stopped teaching eighth grade in 2009.
Further, these results track coarse averages rather than granular individual data and miss huge swathes of learning progress. An elementary school field trip program that inspired students by visiting the Center for Urban Science and Progress’s Urban Observatory might lead students to learn computational and image processing skills that do not translate into grade-level reading or mathematics standardized test questions. More broadly, many recent achievements in understanding humans – in areas as diverse as genetics and neuroscience – rest on looking beyond average population characteristics to investigating the measured variation among individuals.
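To make the point about averages concrete, here is a minimal sketch using invented numbers. The two schools, their score gains, and the `summarize` helper are all hypothetical and are not drawn from NYC data; the sketch only illustrates how two schools can report the same average gain while individual students fare very differently.

```python
from statistics import mean, pstdev

# Hypothetical year-over-year score gains for six students in each of two
# made-up schools. These numbers are illustrative, not real NYC data.
school_a_growth = [5, 5, 5, 5, 5, 5]
school_b_growth = [20, 15, 10, 0, -5, -10]


def summarize(growth):
    """Return the average gain, the spread of gains, and the number of
    students whose scores declined."""
    return {
        "mean_gain": mean(growth),
        "spread": round(pstdev(growth), 2),
        "students_declining": sum(1 for g in growth if g < 0),
    }


summary_a = summarize(school_a_growth)
summary_b = summarize(school_b_growth)

# Both schools report the same average gain, but only the per-student view
# shows that some students in school B lost ground.
print(summary_a)
print(summary_b)
```

A report built only on `mean_gain` would treat these two schools as identical; the spread and the count of declining students are visible only when individual records are kept.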
In addition, these data collection regimes gather designed sample data, which carries certain statistical limitations. Standardized tests occur once a year and do not capture student learning progress between such benchmarks. Further, as the American Statistical Association has noted, the testing data collection scheme and the statistical methods used to aggregate student achievement through “value-added” models involve important limitations when used to draw performance inferences. Such methods can be highly sensitive to model selection and support only correlational rather than causal interpretations. John Ewing, former Executive Director of the American Mathematical Society, puts the issue directly:
“People recognize that tests are an imperfect measure of educational success, but when sophisticated mathematics is applied, they believe the imperfections go away by some mathematical magic. But this is not magic. What really happens is that the mathematics is used to disguise the problems and intimidate people into ignoring them—a modern, mathematical version of the Emperor’s New Clothes.”
So how might we use a broader systems approach to improve this data collection regime?
New Measurement Strategies
First, we should note that New York City is engaged in several promising programs that offer potential for better use of data. The iZone program has worked to pioneer blended learning, that is to say instruction strategies that use both digital and more standard physical classroom approaches, in select schools. As more instruction and student work uses digital tools, there is increased opportunity for more comprehensive, organically generated data to supplement the current regime of designed sample data. NYC is also the home of several ed tech startups, notably Knewton, a leader in the adaptive learning space, which positions the city with ample human resources to make sense of the “digital exhaust” of learning tools.
New York's progress report cards could benefit, however, from the more nuanced and robust approaches advocated by the Gates Foundation-supported “Measures of Effective Teaching” project. This research group has used randomized controlled experiments to effectively predict student achievement by integrating classroom visitation and student satisfaction data into their statistical “value-added” models. They suggest filming classrooms as an effective way to monitor teacher performance at scale. This study was limited to looking at student performance through standardized testing but nonetheless represents an improvement over current performance models that look only at standardized testing.
What about measuring alternative pathways?
Looking at the broader system beyond standardized testing, it’s important to remember that education has always been how one generation passed on knowledge to the next. Students learn from their family, their friends, what’s on the TV, what games they play after school, what they do on their summer break, what their neighbors can teach them when they wander over to their garage, and whatever else young humans do beyond the school bell. Humans also learn throughout their lives.
How might we have a data collection process that reflects that reality? How might we measure the effectiveness of individual non-governmental pre-K organizations? Or how might we capture the impact of after-school programs and other enrichment activities – such as our hypothetical CUSP Urban Observatory field trip or the Harlem Children’s Zone – in our performance evaluation? In a City like New York with a unique city college system, how might we integrate data across K-12, community college, four-year CUNY programs, and ultimately career success metrics?
From a systems perspective, we care about standardized testing because it serves as a proxy for our broader goals of education such as preparation for career success and involvement in democratic society.
Potential of new informatics strategies
To achieve that bigger goal, we might imagine linking the “digital exhaust” of students in blended learning environments (say, their performance on Khan Academy progress quizzes) with long-term skill acquisition (captured through, say, LinkedIn data). This individualized longitudinal data would allow for a rich analytical canvas upon which we might map NYC’s educational ecosystem.
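As a rough sketch of what such linking could look like, the example below joins two invented record sets on an anonymized student identifier. Every field name, value, and the `link_longitudinal` helper is a hypothetical assumption made for illustration; neither Khan Academy nor LinkedIn is known to expose data in this form, and a real system would also need consent, privacy protections, and far more careful record matching.

```python
# Hypothetical early-learning records, keyed by an anonymized student ID.
quiz_records = [
    {"student_id": "s1", "year": 2014, "quiz_mastery": 0.82},
    {"student_id": "s2", "year": 2014, "quiz_mastery": 0.55},
    {"student_id": "s3", "year": 2014, "quiz_mastery": 0.91},
]

# Hypothetical later-life records for some of the same students.
later_outcomes = [
    {"student_id": "s1", "year": 2024, "skills_listed": 12},
    {"student_id": "s3", "year": 2024, "skills_listed": 7},
]


def link_longitudinal(early, later):
    """Join early and later records that share a student ID.

    Students with no later record are left out, which is itself a source
    of bias that any real analysis would have to account for.
    """
    later_by_id = {record["student_id"]: record for record in later}
    linked = []
    for record in early:
        match = later_by_id.get(record["student_id"])
        if match is not None:
            linked.append({
                "student_id": record["student_id"],
                "quiz_mastery": record["quiz_mastery"],
                "skills_listed": match["skills_listed"],
                "years_between": match["year"] - record["year"],
            })
    return linked


linked_records = link_longitudinal(quiz_records, later_outcomes)
for row in linked_records:
    print(row)
```

Only students who appear in both sources survive the join, so the linked set describes a self-selected group rather than the full cohort.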
Another, less conventional approach would be to learn from the digital badges that many developers and web companies use to measure technology skill proficiency. Increasingly, these tools are also used for broader areas of human learning – everything from cooking to cartography. Badges provide a practical way to measure skills that students learn outside of formal academic contexts. In addition, by treating badge-seekers the same irrespective of their age or academic standing, they allow students to achieve mastery at their own pace. Such approaches are new and have not been fully tested across a large urban education system, but we would do well to listen to mathematician John Ewing’s advice on educational data:
“Shouldn’t we try to measure long-term student achievement, not merely short-term gains? Shouldn’t we focus on how well students are prepared to learn in the future, not merely what they learned in the past year? Shouldn’t we try to distinguish teachers who inspire their students, not merely the ones who are competent? When we accept value-added as an “imperfect” substitute for all these things because it is conveniently at hand, we are not raising our expectations of teachers, we are lowering them.”
For more information on how we might pioneer a human rather than industrial education ecosystem, please see this series of essays colleagues and I wrote a while back.
 Road Map for a Digital City: http://www.nyc.gov/html/media/media/PDF/90dayreport.pdf
 NYC performance dashboard accessed from: http://www.nyc.gov/html/ops/cpr/html/performance/performance.shtml
 NYC school satisfaction survey analysis accessed from: http://schools.nyc.gov/NR/rdonlyres/7D257715-FEB1-47D3-99C2-098CBB06A1FD/0/survey_2014_publicdeckforwebsite.pdf
 New York State test statistics: https://reportcards.nysed.gov/
 “Managing by the Numbers” from the Center for New York City Affairs.
 “The End of Averages” – talk at the Harvard School of Education retrieved from: http://www.gse.harvard.edu/news/14/09/8x8-hgse-faculty-share-their-bold-ideas-improve-education
 “The Science of Individuals,” Todd Rose, Parisa Rouhani, and Kurt W. Fischer accessed from: http://static.squarespace.com/static/511a348ce4b0f4197c0e2fd8/t/535130dbe4b0757a373bf0ab/1397829851621/RoseRouhaniFischer2013.pdf
 “ASA Statement on Using Value-Added Models for Educational Assessment” from the American Statistical Association accessed from: https://www.amstat.org/policy/pdfs/ASA_VAM_Statement.pdf
 “Mathematical Intimidation: Driven by Data,” John Ewing in the Notices of the American Mathematical Society accessed from: http://www.ams.org/notices/201105/rtx110500667p.pdf
 EdSurge, the leading ed tech news source, on the iZone program: https://www.edsurge.com/n/2014-10-01-with-steven-hodas-gone-what-happens-to-new-york-s-izone
 Measures of Effective Teaching Project sponsored by the Gates Foundation: http://www.metproject.org/faq.php
 See, for instance, the Mozilla Foundation’s OpenBadges.org effort.
 “Managing by the Numbers” from the Center for New York City Affairs.