National Academy of Sciences
National Academy of Engineering
Institute of Medicine
National Research Council
Office of News and Public Information
News from the National Academies
Date: Sept. 3, 1998
Contacts: Dan Quinn, Media Relations Officer
Dumi Ndlovu, Media Relations Assistant
(202) 334-2138; e-mail <news@nas.edu>

EMBARGOED: NOT FOR PUBLIC RELEASE BEFORE 5 P.M. EDT THURSDAY, SEPT. 3

Caution Urged in Developing and Using Educational Tests

WASHINGTON -- As education officials rely more heavily on test results to guide school reform efforts, caution is needed to ensure that tests are properly designed and used in ways that will spur improvements in education and not harm students, according to three new, congressionally mandated reports from the National Research Council. The reports released today outline steps needed to refine the development and use of large-scale tests in education, including "high-stakes" tests used by schools for tracking, promotion, or graduation, and national tests proposed by the U.S. Department of Education to assess fourth-grade reading and eighth-grade mathematics.

High Stakes: Testing for Tracking, Promotion, and Graduation says that test results should not be the only basis for deciding which classes a student takes or what curriculum to teach, whether a student will advance to the next grade, or whether the student will be able to graduate. Other factors -- including grades and teacher recommendations -- also should be considered. Moreover, the report says, schools should eliminate "low-track" classes that typically do not provide challenging instruction and often are led by the least-experienced teachers.

Evaluation of the Voluntary National Tests: Phase 1 says that the National Assessment Governing Board (NAGB) is on the right track in developing questions for the Department of Education's voluntary national tests, but recommends that NAGB move quickly to reach decisions about how to score the tests, and in what form scores will be given to students, parents, and other users. NAGB also should move quickly to address the inclusion of students with disabilities and students who are English-language learners, both of which have important implications for test design and accuracy.

Uncommon Measures: Equivalence and Linkage Among Educational Tests, a follow-up to an interim report by the Research Council released in June, concludes that one proposed alternative to national testing -- linking the results of existing commercial and state tests and providing comparable information about achievement of students taking different tests in different parts of the country -- is generally not feasible.

A summary of each report follows.

Educational Improvement: A Shared Responsibility

At the local level, schools have turned to large-scale standardized tests to help them place students in curriculum "tracks" -- in which students are assigned to specialized schools, programs, or classes. Tests also are used in decisions about whether a student will advance to a higher grade or be retained in the same grade, and about whether a student will be permitted to graduate from high school. When used appropriately, such high-stakes tests can help promote student learning and equal opportunity in the classroom by defining standards of student achievement and by helping school officials identify areas in which students need additional or different instruction. When used inappropriately, however, these tests can undermine the quality of education and reduce opportunities for some students, especially if test results are misinterpreted or misused, or students are relegated to a low-quality educational experience as a result of their scores, according to High Stakes: Testing for Tracking, Promotion, and Graduation.

High standards cannot be established and maintained simply by imposing them on students, the report says. Students, parents, educators, public officials, school districts, and states share responsibility for improving the quality of education. If test results are going to be used to make high-stakes decisions about individual students, school districts seeking to improve student performance should first improve the content and methods of classroom instruction, and they should test students only for knowledge and skills that reflect closely what has been taught in the classroom. Schools also must guard against teaching that is narrowly tailored to performing well on a particular test, rather than focused on the broader set of skills and knowledge a test is intended to measure. School officials must ensure that what is taught extends beyond any particular test, and that students are not given help that undermines the integrity of the test as a reliable gauge of student learning.

The report also says that:

> Students who are placed in typical "low-track" classes are worse off than if they had been placed elsewhere. Such low-track classes should be eliminated. Neither test scores nor any other form of evaluation should be used to place students in these settings.

> Efforts must be made to ensure the participation of students who are not yet proficient in English and students with disabilities in high-stakes testing programs, while maintaining the comparability of their test scores with those of other students.

> Schools must ensure that tests used to determine eligibility for graduation are adequately focused on material that actually has been taught in the school. Students at risk of not graduating should be advised of their situation well in advance, and should be provided with appropriate instruction to cover the material on which they will be tested.

> Large-scale tests should not be used to make high-stakes decisions about students who are younger than 8 years or below third grade.

> All high-stakes testing programs should include a well-designed evaluation component, so policy-makers can monitor both their intended and unintended consequences, such as their effect on students' graduation rates or future employment prospects.

> The proposed voluntary national tests are being designed to help students and parents gauge academic progress against national standards, and they should not be used for decisions on student tracking, promotion, or graduation.

Progress in National Test Development

The Clinton administration's plan for national tests of fourth-grade reading and eighth-grade mathematics stipulates that the federal government would develop the tests for states and local schools, but not require them to be taken. The new tests are intended to tell individual students, parents, and teachers where students stand relative to high national standards, and in mathematics, how students compare to those in other countries. The proposed tests are not intended or designed for use in high-stakes decisions about tracking, promotion, or graduation of students.

Evaluation of the Voluntary National Tests: Phase 1 concludes that the National Assessment Governing Board (NAGB) has established reasonable specifications for the new tests and has made good progress in developing an adequate number of high-quality test questions. The board's plans for completing the development and evaluation of the questions are sound, and the plans to conduct pilot and field tests also appear to be adequate.

NAGB now needs to make important decisions about how the tests will be scored and how the results will be reported, and, based on these decisions, complete its development and evaluation plans. The board plans to use achievement levels developed for the National Assessment of Educational Progress (NAEP) in reporting results, so it must ensure that the questions included in the new tests are closely linked to the descriptions that have been established for these achievement levels. It is important for the test developers to try to achieve broad consensus on these tests and their uses, the report says.

NAGB has made little progress so far in developing procedures to ensure that students with disabilities or with limited English proficiency are included in comparable examinations and that their scores can be compared with those of other students. Plans for including and accommodating these students -- a major goal of the national testing program -- are still sketchy and do not break new ground. The report recommends that the board speed up its work to increase these students' participation and improve the ability of the tests to compare results among all U.S. students.

Test Diversity Limits Feasibility of Linkage

As an alternative to new national tests, Congress asked the National Research Council whether it is feasible to link the results of existing state and commercial tests and compare an individual student's achievement with national and international benchmarks and with that of students taking different tests in other school districts or states.

Uncommon Measures: Equivalence and Linkage Among Educational Tests says that tests administered currently at the state and local level are too diverse -- in terms of their content, format, difficulty, and intended uses -- to allow the results to be compared meaningfully to one another or to national or international standards. Linkages can be computed in some limited cases -- where the tests and their uses are very similar -- but it is generally not possible to link even small subsets of tests to make valid comparisons of student performance. Unless tests are closely aligned in content and format with the National Assessment of Educational Progress, attempts to link test results with NAEP and to report results in terms of the NAEP performance levels are likely to be unreliable and potentially misleading.

The studies were funded by the U.S. Department of Education. Rosters of the authoring committee members and investigators follow. The National Research Council is the principal operating arm of the National Academy of Sciences and the National Academy of Engineering. It is a private, non-profit organization that provides advice on science and technology under a congressional charter granted to the National Academy of Sciences.

Read the full text of High Stakes: Testing for Tracking, Promotion, and Graduation; Evaluation of the Voluntary National Tests: Phase 1; and Uncommon Measures: Equivalence and Linkage Among Educational Tests for free on the Web, as well as more than 1,800 other publications from the National Academies. Printed copies are available for purchase from the National Academy Press Web site or at the mailing address in the letterhead; tel. (202) 334-3313 or 1-800-624-6242. Reporters may obtain a pre-publication copy from the Office of News and Public Information at the letterhead address (contacts listed above).


NATIONAL RESEARCH COUNCIL
Commission on Behavioral and Social Sciences and Education
Board on Testing and Assessment

Committee on Appropriate Test Use

Robert M. Hauser* (chair)
Vilas Research and Samuel A. Stouffer Professor of Sociology
Center for Demography
University of Wisconsin
Madison

Lizanne DeStefano
Associate Professor of Educational Psychology, and
Director, Bureau of Educational Research
College of Education
University of Illinois
Urbana-Champaign

Pasquale J. DeVito
Director
Office of Assessment
Rhode Island Department of Education
Providence

Richard P. Duran
Professor
Graduate School of Education
University of California
Santa Barbara

Jennifer L. Hochschild
Professor of Politics and Public Affairs
Woodrow Wilson School of Public and International Affairs
Princeton University
Princeton, N.J.

Stephen P. Klein
Senior Research Scientist
RAND Corp.
Santa Monica, Calif.

Sharon Lewis
Director of Research
Council of the Great City Schools
Washington, D.C.

Robert L. Linn (ex-officio)
Distinguished Professor
School of Education
University of Colorado
Boulder

Lorraine M. McDonnell
Professor of Political Science and Education
Department of Political Science
University of California
Santa Barbara

Samuel Messick
Distinguished Research Scientist
Educational Testing Service
Princeton, N.J.

Ulric Neisser*
Professor of Psychology
Department of Psychology
Cornell University
Ithaca, N.Y.

Andrew C. Porter
Director, Wisconsin Center for Education Research;
Co-Director, National Institute for Science Education; and
Professor of Educational Psychology
Department of Psychology
University of Wisconsin
Madison

Audrey L. Qualls
Associate Professor of Educational Measurement and Statistics
Iowa Testing Program
University of Iowa
Iowa City

Paul R. Sackett
Professor of Psychology
Department of Psychology
University of Minnesota
Minneapolis

Catherine E. Snow
Henry Lee Shattuck Professor of Education
Graduate School of Education
Harvard University
Cambridge, Mass.

William T. Trent
Associate Chancellor, and
Professor of Educational Policy Studies and Sociology
College of Education
University of Illinois
Urbana-Champaign


RESEARCH COUNCIL STAFF

Jay P. Heubert
Study Director

_________________________________________
(*)Member, National Academy of Sciences


NATIONAL RESEARCH COUNCIL
Commission on Behavioral and Social Sciences and Education
Board on Testing and Assessment

Project on Evaluation of the Voluntary National Tests: Phase 1 Report

Robert M. Hauser* (co-principal investigator)
Vilas Research and Samuel A. Stouffer Professor of Sociology
Center for Demography
University of Wisconsin
Madison

Lauress L. Wise (co-principal investigator)
President
Human Resources Research Organization
Alexandria, Va.

RESEARCH COUNCIL STAFF

Michael J. Feuer, Director, Board on Testing and Assessment


Committee on Equivalency and Linkage of Educational Tests

Paul W. Holland (chair)
Professor, Department of Statistics and Graduate School of Education
University of California, Berkeley

Robert C. Calfee
Professor and Dean
School of Education
University of California, Riverside

John T. Guthrie
Professor, Department of Human Development
University of Maryland, College Park

Richard M. Jaeger
NationsBank Professor of Educational Research Methodology, and
Director, Center for Educational Research and Evaluation
University of North Carolina, Greensboro

Patricia Ann Kenney
Research Associate
Learning Research and Development Center
University of Pittsburgh

Vonda L. Kiplinger
Assessment Specialist
Student Assessment Program
Colorado Department of Education
Denver

Daniel M. Koretz
Senior Social Scientist
RAND Institute on Education and Training
Washington, D.C., and
Professor of Educational Research, Measurement, and Evaluation
Boston College

Frederick C. Mosteller*
Professor Emeritus of Statistics
Harvard University, and
Director, Technology Assessment Program
Harvard School of Public Health
Cambridge, Mass.

Peter J. Pashley
Director of Psychometrics
Law School Admission Council
Newtown, Pa.

Doris Redfield
Educational Consultant
Appalachia Educational Laboratory
Richmond, Va.

William F. Tate
Associate Professor of Mathematics Education
Department of Curriculum and Instruction
University of Wisconsin, Madison

David Thissen
Professor of Psychology, and
Director, Graduate Program in Quantitative Psychology
University of North Carolina, Chapel Hill

Ewart A.C. Thomas
Professor of Psychology
Department of Psychology
Stanford University
Stanford, Calif.

Lauress L. Wise
President
Human Resources Research Organization
Alexandria, Va.

RESEARCH COUNCIL STAFF

Michael J. Feuer, Study Director
_________________________________________
(*)Member, National Academy of Sciences