Oct. 2, 2017
FOR IMMEDIATE RELEASE
Report Offers Guidance to Federal Government on Creating a New Statistics Entity to Combine Data From Multiple Sources While Protecting Privacy
WASHINGTON -- A new report from the National Academies of Sciences, Engineering, and Medicine offers detailed recommendations to guide federal statistical agencies in creating a new entity that would enable them to combine data from multiple sources in order to provide more relevant, timely, and detailed statistics -- for example, on the unemployment rate or the rate of violent crime. The report reviews options for structuring the new entity, identifies approaches for protecting individuals’ privacy while linking multiple sources of information, and identifies areas where staff training is needed.
The study committee’s previous report, released in January 2017, recommended the establishment of an entity to facilitate secure access to data from multiple sources. The new report builds on that recommendation, noting that a new entity to combine data may enable more detailed, timely updates to inform decision-makers on important economic, societal, and health indicators. It offers more detail on the process for developing such an entity and provides recommendations for implementation.
“A great deal of recent public discussion -- especially that prompted by the recent report of the Commission on Evidence-Based Policymaking -- has focused on the value of combining data and of creating a new entity to do so,” said Robert M. Groves, provost of Georgetown University in Washington, D.C., and chair of the committee that wrote the report. “We hope our report will complement and inform the commission’s efforts and help initiate a more detailed discussion among stakeholders about the best path forward for the federal statistical system.”
Some federal agencies are already using multiple data sources to craft more useful datasets, the report notes. For example, the National Center for Health Statistics (NCHS) routinely links information gathered from the National Health Interview Survey with administrative records from the Centers for Medicare and Medicaid Services, which allows researchers to analyze the relationship between health and the uses and costs of medical care. While agencies are pursuing efforts to link multiple data sources individually and are currently implementing changes to their systems, a decentralized effort incurs large opportunity costs and limits potential benefits, the report says.
Emphasizing that privacy protection should be at the forefront of the new entity’s design, the report urges statistical agencies to train their technical staff in modern computer science technology -- including secure multiparty computing, cryptography, and privacy-preserving and privacy-enhancing technologies -- so that they can better ensure security and enhance privacy protections. The report also identifies technological approaches that can minimize privacy risks; for example, secure multiparty computing could in some situations permit a statistical agency to compute a desired aggregate result without ever actually learning all the detailed data from the different data sources.
The report also recommends instituting an advisory committee on privacy to inform and advise the new entity on policies and current best practices. The entity could also serve as a valuable center for coordinating research across the federal statistical system and the academic community on the application and evaluation of privacy-preserving and privacy-enhancing techniques for federal statistics.
To fulfill these roles, the entity needs strong legal authority to protect the confidentiality of data accessed through the entity and to ensure that the data are used only for statistical purposes. In addition, the entity’s legal foundation should foster independence from political and other undue external influence in providing access to, linking, and analyzing data, and in producing and disseminating statistical information. The new entity should also maximize the transparency of its statistical activities by posting a summary of the data sources accessed through the entity on a public website. The summary should include the purpose and public benefit of the study, the data sources used, a brief description of the methodology, and links to resulting statistical products.
The report discusses the advantages and disadvantages of various options for locating the new entity, such as in a federal statistical agency, a federally funded R&D center, or a university-based public-private research center. Regardless of where the entity is established, federal statistical agencies should create partnerships with academia and external research organizations to develop the new methods needed for design and analysis using multiple data sources, the report says.
The report also offers recommendations about governance of the proposed new entity, noting that the governance structure will need to obtain input from all of the statistical agencies and address their needs. The entity will also serve and have responsibilities to data providers and data users. Its director should report to a board of directors that includes representatives of the federal statistical agencies, experts on privacy, holders of data used by the entity, and users of statistical data.
Recognizing that much research is needed before many federal statistical programs can incorporate multiple data sources, the committee recommended that the transition be gradual, taking place in phases to accommodate changes in system architectures, data access, and staffing. The report suggests that the first phase take place over the course of five years, after which a comprehensive review would assess the demonstrated benefits to federal statistics.
The study was sponsored by the Laura and John Arnold Foundation with additional support from the National Academy of Sciences Kellogg Fund. The National Academies of Sciences, Engineering, and Medicine are private, nonprofit institutions that provide independent, objective analysis and advice to the nation to solve complex problems and inform public policy decisions related to science, technology, and medicine. The National Academies operate under an 1863 congressional charter to the National Academy of Sciences, signed by President Lincoln. For more information, visit http://national-academies.org.
Kacey Templin, Media Relations Officer
Andrew Robinson, Media Relations Assistant
Office of News and Public Information
202-334-2138; e-mail email@example.com