Office of News and Public Information
News from the National Academies

Date:  Jan. 15, 2015



New Report Says No Technological Replacement Exists for Bulk Data Collection;
Software Can Enhance Targeted Collection and Automate Control of Data Usage to Protect Privacy


WASHINGTON – No software-based technique can fully replace the bulk collection of signals intelligence, but methods can be developed to more effectively conduct targeted collection and to control the usage of collected data, says a new report from the National Research Council.  Automated systems for isolating collected data, restricting queries that can be made against those data, and auditing usage of the data can help to enforce privacy protections and allay some civil liberty concerns, the unclassified report says.


The study was a result of an activity called for in Presidential Policy Directive 28, issued by President Obama in January 2014, to evaluate U.S. signals intelligence practices.  The directive instructed the Office of the Director of National Intelligence to produce a report within one year "assessing the feasibility of creating software that would allow the intelligence community more easily to conduct targeted information acquisition rather than bulk collection."  ODNI asked the Research Council -- the operating arm of the National Academy of Sciences and National Academy of Engineering -- to conduct a study, which began in June 2014, to assist in preparing a response to the President.  Over the ensuing months, a committee of experts appointed by the Research Council produced the report.


“From a technological standpoint, curtailing bulk data collection means analysts will be deprived of some information,” said committee chairman Robert F. Sproull, former director of Oracle’s Sun Labs. “It does not necessarily mean that current bulk collection must continue.  A reduction in bulk collection can be partially mitigated by improving targeted collection, and technologies can improve oversight and transparency and help reduce the conflict between collection and privacy.”


The report defines “collection” as the process of extracting data from a source, filtering it according to some criteria, and storing the results.  If a significant portion of the collected data is not associated with current targets or subjects of interest in an investigation, it is considered bulk; otherwise, it is targeted.
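The definition above can be illustrated with a minimal sketch. It is not taken from the report: the function names, the (selector, payload) record layout, and especially the numeric threshold are assumptions made for illustration, since the report speaks only of a "significant portion" and gives no number.

```python
def collect(source, keep):
    """Extract items from a source, filter them by a criterion, and store the results.

    `source` is any iterable of (selector, payload) pairs; `keep` is the
    filtering criterion. Both are hypothetical stand-ins.
    """
    return [item for item in source if keep(item)]


def classify_collection(records, targets, threshold=0.5):
    """Label stored records as "bulk" or "targeted".

    ASSUMPTION: `threshold` stands in for the report's "significant portion".
    The report does not define a numeric cutoff.
    """
    if not records:
        return "targeted"  # nothing stored, so nothing unrelated to a target
    untargeted = sum(1 for selector, _ in records if selector not in targets)
    return "bulk" if untargeted / len(records) >= threshold else "targeted"
```

Under these assumptions, a filter that keeps everything yields a bulk collection whenever enough of the stored records are unrelated to current targets, while a filter keyed to the target list yields a targeted one.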


A key value of bulk collection is its record of past signals intelligence that may be relevant to subsequent investigations, the report notes.  The committee was not asked to consider, and did not consider, whether the loss of effectiveness from reducing bulk collection would be too great, or whether the potential gain in privacy from adopting an alternative collection method is worth the potential loss of intelligence information.  It did observe that other sources of information -- for example, data held by third parties such as communications providers -- might provide a partial substitute for bulk collection in some circumstances.


Improving the relevance of collected information to future investigations could also be achieved with new approaches to targeting, the report says.  Rapidly updating filtering criteria to include new targets as they are discovered will help collect data that would otherwise be lost, and if done quickly enough and well enough, bulk information about past events may not be needed.  However, targeted collection cannot substitute for bulk collection if past events were unique or if the delay in collecting the new information is too long.


As an alternative to controlling the collection of data, automated controls on the use of collected data can help to protect the privacy of people who are not subjects of investigation, the committee found.  The report describes three key technical elements required to control and automate usage: isolating bulk data so that it can be accessed only in specific ways; restricting the types of queries that can be made against stored data; and auditing the queries that have been done.  The way these controls work can be made public without revealing sensitive data, so that outside inspectors can verify that the intelligence community has and abides by adequate procedures to protect privacy.
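The three elements named above (isolation, query restriction, and auditing) can be sketched in a few lines. This is an illustrative sketch only, not a system described in the report: the class name, the selector-based approval rule, and the audit record layout are all assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditEntry:
    analyst: str
    selector: str
    allowed: bool
    timestamp: datetime


@dataclass
class ControlledStore:
    """Hypothetical store whose records are meant to be read only via query()."""
    approved_selectors: set                          # restriction: queries permitted by policy
    _records: list = field(default_factory=list)     # isolation: private by convention
    audit_log: list = field(default_factory=list)    # auditing: one entry per query attempt

    def ingest(self, selector, payload):
        self._records.append((selector, payload))

    def query(self, analyst, selector):
        allowed = selector in self.approved_selectors
        # Log every attempt, including refused ones, before returning anything.
        self.audit_log.append(
            AuditEntry(analyst, selector, allowed, datetime.now(timezone.utc))
        )
        if not allowed:
            raise PermissionError(f"selector {selector!r} is not approved")
        return [payload for sel, payload in self._records if sel == selector]
```

In this sketch the rule itself (only approved selectors may be queried, and every attempt is logged) could be published without exposing any stored record, which mirrors the point about outside verification.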


Some of the technologies needed to enhance targeted collection or improve automated usage controls require further research and development.  Others are already in use in the intelligence community or in private companies, some have been demonstrated in research laboratories, and many are feasible to deploy within the next five years, the report says, although it does not recommend adoption of any specific technology.  Automating usage controls will be easier if the rules governing collection and use are technology-neutral and based on a consistent set of definitions.


Ultimately, the decision to deploy any given technology is a policy question that requires determining whether increased effectiveness and apparent transparency are worth the cost in equipment, labor, and potential interference with the intelligence mission.  Such discussions were beyond the scope of this report.


The study was sponsored by the Office of the Director of National Intelligence.  The National Academy of Sciences, National Academy of Engineering, Institute of Medicine, and National Research Council make up the National Academies.  They are private, independent nonprofit institutions that provide science, technology, and health policy advice under a congressional charter granted to the National Academy of Sciences in 1863.  The National Research Council is the principal operating arm of the National Academy of Sciences and the National Academy of Engineering.  A committee roster follows.



Lauren Rugani, Media Relations Officer

Chelsea Dickson, Media Associate

Office of News and Public Information

202-334-2138

Twitter: @NAS_news and @NASciences




#       #       #


Division on Engineering and Physical Sciences

Computer Science and Telecommunications Board


Committee on Responding to Section 5(d) of Presidential Policy Directive 28:

The Feasibility of Software to Provide Alternatives to Bulk Signals Intelligence Collection


Robert F. Sproull1 (chair)

Former Vice President and Director

Oracle Labs (retired)

Leeds, Mass.


Frederick R. Chang

Director, Darwin Deason Institute for Cyber Security, and

Bobby B. Lyle Centennial Distinguished Chair in Cyber Security

Department of Computer Science and Engineering

Lyle School of Engineering

Southern Methodist University



William H. DuMouchel

Chief Statistical Scientist

Oracle Health Sciences

Oracle Data Sciences

Tucson, Ariz.


Michael Kearns

Professor and National Center Chair

Department of Computer and Information Science

University of Pennsylvania



Butler W. Lampson1, 2

Technical Fellow

Microsoft Corp.

Cambridge, Mass.


Susan Landau

Professor of Cybersecurity Policy

Department of Social Science and Policy Studies

Worcester Polytechnic Institute

Worcester, Mass.


Michael E. Leiter

Executive Vice President for Business Development, Strategy, and Mergers and Acquisitions


Washington, D.C.

Elizabeth Rindskopf Parker


McGeorge School of Law

University of the Pacific

Napa, Calif.


Peter J. Weinberger

Software Engineer

Google Inc.

New York City





Alan Shaw

Study Director


Jon Eisenberg

Board Director




1 Member, National Academy of Engineering

2 Member, National Academy of Sciences