Aggregate Data Definition

Aggregate data refers to numerical or non-numerical information that is (1) collected from multiple sources and/or on multiple measures, variables, or individuals and (2) compiled into data summaries or summary reports, typically for the purposes of public reporting or statistical analysis—i.e., examining trends, making comparisons, or revealing information and insights that would not be observable when data elements are viewed in isolation. For example, information about whether individual students graduated from high school can be aggregated—that is, compiled and summarized—into a single graduation rate for a graduating class or school, and annual school graduation rates can then be aggregated into graduation rates for districts, states, and countries.
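The graduation-rate example above can be sketched in a few lines of Python. This is a minimal illustration only: the records, the school and district names, and the graduation_rates helper are all hypothetical and are not drawn from any real reporting system.

```python
from collections import defaultdict

# Hypothetical student-level records: (district, school, graduated).
records = [
    ("District A", "North High", True),
    ("District A", "North High", True),
    ("District A", "North High", False),
    ("District A", "North High", True),
    ("District A", "South High", True),
    ("District A", "South High", False),
]

def graduation_rates(records, key_index):
    """Return {group: fraction graduated}, grouping by the field at key_index."""
    totals = defaultdict(int)
    graduates = defaultdict(int)
    for record in records:
        group = record[key_index]
        totals[group] += 1
        if record[2]:  # third field is the graduated flag
            graduates[group] += 1
    return {group: graduates[group] / totals[group] for group in totals}

# Aggregate individual outcomes into one rate per school ...
school_rates = graduation_rates(records, key_index=1)
# ... and into one rate per district, computed from the same student records.
district_rates = graduation_rates(records, key_index=0)
```

In this sketch the district rate is computed from student counts rather than by averaging the two school rates. When schools differ in size, a simple average of school rates would weight each school equally and could give a different figure than the rate for all students in the district.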

While most aggregate education data is numerical—e.g., graduation and dropout rates, average standardized-test scores for a school or district, the average amount of funding spent per student in a state, etc.—it’s both possible and common to aggregate non-numeric information. For example, educators, students, and parents in a school district may be surveyed on a topic, and the information and comments from those surveys could then be “aggregated” into a report that shows what the surveyed individuals generally think and feel about the issue. Information collected during polls, interviews, and focus groups can be aggregated in a similar fashion.

To further illustrate the concept of aggregate data and how it may be used in public education, consider a school with an enrollment of 500 students. The school maintains 500 student records, each of which contains a wide variety of information about an enrolled student: for example, first and last name, home address, date of birth, gender identification, race or ethnicity, date and period of enrollment, courses taken and completed, course grades earned, and test scores. (The information collected and maintained on individual students is often called student-level data, among other terms.) Once or twice a year, the school district may be required to submit student-enrollment reports to its state department of education. Each school in the district will then compile a report that documents the number of students currently enrolled in the school and in each grade level, which requires administrators to summarize data from all of the school's individual student records. The district now has aggregate enrollment information about the students attending its schools. Over the next five years, the school district could use these annual reports to analyze increases or declines in district-wide enrollment, enrollment at each school, or enrollment at each grade level. The district could not, however, determine whether there have been increases or declines in the enrollment of white and non-white students based on the aggregate data it received from its schools, because those reports contain only totals by school and grade level. To produce a report showing distinct enrollment trends for different races and ethnicities, the district's schools would need to disaggregate the enrollment information by racial and ethnic subgroups.
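As a rough sketch of the enrollment example, the Python below counts hypothetical student-level records by school and by grade level, and then produces the same kind of count with race or ethnicity retained. The school names, field names, and values are assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical student-level records; field names are illustrative only.
students = [
    {"school": "Elm", "grade": 9, "race_ethnicity": "White"},
    {"school": "Elm", "grade": 9, "race_ethnicity": "Hispanic"},
    {"school": "Elm", "grade": 10, "race_ethnicity": "Black"},
    {"school": "Oak", "grade": 9, "race_ethnicity": "White"},
    {"school": "Oak", "grade": 10, "race_ethnicity": "Asian"},
]

# Aggregate enrollment: totals per school and per (school, grade level).
by_school = Counter(s["school"] for s in students)
by_school_grade = Counter((s["school"], s["grade"]) for s in students)

# Disaggregated enrollment: the same count, broken out by subgroup.
# This needs the student-level records; it cannot be derived from
# by_school or by_school_grade, which no longer carry race or ethnicity.
by_school_race = Counter((s["school"], s["race_ethnicity"]) for s in students)
```

Once the records have been reduced to totals by school and grade level, the subgroup information is gone, which is why the district in the example would have to ask its schools for a new, disaggregated report.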

Aggregated vs. Disaggregated Data

To aggregate data is to compile and summarize data; to disaggregate data is to break down aggregated data into component parts or smaller units of data. While this distinction between aggregated and disaggregated data may appear straightforward, there is a nuance worth discussing here: a lot of “disaggregated” data in education is actually data that has been technically aggregated, at some level, from records maintained on individual students. For example, graduation rates are widely considered to be “aggregate data,” while graduation rates reported for different subgroups of students (say, for students of different races and ethnicities) are typically considered to be “disaggregated data.” Yet to produce reports that disaggregate graduation rates by race and ethnicity, data on individual students actually has to be “aggregated” to produce summary graduation rates for different racial and ethnic subgroups. Most likely, this distinction between aggregated and disaggregated data arose because, historically, only aggregated data on school-wide, district-wide, or statewide educational performance was readily or publicly available. When investigating or reporting on topics such as aggregate data or disaggregated data, it is important to determine precisely how the terms are being used in a particular context.

Reform

Before the early 2000s, most state education agencies and districts only collected aggregate data on students enrolled in public schools. Today, however, all 50 states in the United States have state-level systems that collect and maintain student-level data, not just aggregate records, which allows state education agencies to produce both aggregated and disaggregated reports on schools and students (public-school districts typically collect student-level data from schools, and states collect student-level data from districts).

While aggregate data such as high school graduation rates or average test scores can yield a variety of important insights, a significant number of school leaders, researchers, education reformers, and policy makers have advocated in recent years for the importance of disaggregating data to expose underlying trends and issues such as achievement gaps, opportunity gaps, learning gaps, and other inequities in the public-education system. If, for example, the only graduation data available are annual rates for individual schools, this aggregate data may hide significant disparities in graduation rates for students from low-income households, students of color, students with disabilities, or students who are not proficient in the English language. It’s possible for a school’s aggregate graduation rate to appear strong overall (say, 90 percent), but when the data are disaggregated for different groups of students, the disaggregation may reveal, for example, that more than 50 percent of the African American and Hispanic students in the school fail to graduate.
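The arithmetic behind the 90 percent example can be checked with a short sketch. The cohort sizes and graduate counts below are invented for illustration and do not describe any real school; they simply show one way a strong overall rate can coexist with a subgroup in which most students do not graduate.

```python
# Hypothetical graduating class of 500 students: (graduates, cohort size).
cohort = {
    "Subgroup A": (432, 460),
    "Subgroup B": (18, 40),
}

# Aggregate rate for the whole school: 450 graduates out of 500 students.
total_graduates = sum(grads for grads, size in cohort.values())
total_students = sum(size for grads, size in cohort.values())
overall_rate = total_graduates / total_students

# Disaggregated rates, one per subgroup.
subgroup_rates = {name: grads / size for name, (grads, size) in cohort.items()}
```

With these assumed numbers the overall rate is 90 percent, while the rate for the smaller subgroup is 45 percent, meaning 55 percent of its students did not graduate. Because the smaller subgroup makes up only 40 of 500 students, its outcomes barely move the school-wide figure.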

Generally speaking, the main purpose of collecting and reporting aggregate data is to provide useful information about the performance of public schools and public-school students to those who are monitoring schools or working to improve them. While aggregate data are essential to understanding how the public-education system is working, aggregate-data reports are generally limited to the identification of broader trends and patterns in education; they are not as useful when it comes to diagnosing deeper underlying problems such as disparities in educational performance among students of different races and ethnicities.

Debate

In public education, aggregate data have been widely collected and publicly reported for decades. For the most part, the use of aggregate data has not been a controversial topic in public education, primarily because aggregate data present far fewer concerns about student safety and privacy than the collection, sharing, and use of data and personal information about specific students. That said, a variety of debates related to the use of aggregate data in education have emerged in recent years, typically in response to (1) the use of public reports, often called “school report cards,” intended to provide families and the public with summarized assessments of individual school performance, and (2) the use of average student test scores and other aggregate measures in the job-performance evaluations of educators.

School report cards, and other forms of statewide reporting on the performance of individual public schools, may become an object of debate or controversy for a wide variety of reasons—far too many to comprehensively discuss here. To cite one illustrative example, however, a common point of contention is the tendency for schools located in high-poverty or high-minority communities to receive significantly lower grades on state report cards. These schools tend to serve a higher-need student population with larger learning deficits, to be underfunded (compared to wealthier districts), to have less-experienced or less-skilled teachers, and to face an array of additional obstacles that contribute to lower performance—yet the aggregate data presented in state report cards may not provide this contextual information. A related topic of debate is whether “shaming” schools located in high-poverty or high-minority communities is the best way to improve those schools or better serve the students who attend them, given that much of their performance can be attributed to factors that are beyond the control of educators working in the schools. Those who advocate for the use of state report cards may argue that—regardless of the challenges schools face—parents, families, and the general public have a right to be informed about the performance of the public schools in their state and community, and that increasing transparency when it comes to school performance will lead to policies and reforms that will ultimately improve educational quality for students.

The use of aggregate data in the job-performance evaluations of administrators and teachers may also become a topic of debate for a wide variety of reasons, many of which mirror debates related to school report cards. For example, many educators and teachers' unions argue that teachers should not have their job security or salaries based on student performance because many factors influencing academic achievement are beyond their control. Low parental education levels, unsupportive or dysfunctional home environments, nutritional deficits, and stress (not to mention starting a school year significantly behind academically) can all adversely affect educational achievement. Those who oppose the use of aggregate data in job-performance evaluations generally argue that using aggregate data to evaluate individual educators is often misrepresentative and unfair. For a related discussion, see value-added measures.

The Glossary of Education Reform by Great Schools Partnership is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
