Reviewers of NIH grants cannot distinguish the good from the great, study suggests

Dr. Molly Carnes

The National Institutes of Health (NIH) invested more than $27 billion in biomedical research through competitive grants during its 2017 fiscal year. Those grants were awarded based on scores assigned by, and discussions among, expert peer reviewers.

This peer review process is designed to identify the best proposals to fund, and it is meant to ensure that dollars for scientific projects are awarded with careful deliberation.

But new findings by a team of University of Wisconsin–Madison researchers suggest that reviewers are unable to differentiate the great proposals from the merely good ones. Elizabeth Pier, a postdoctoral fellow in educational psychology, led the analysis of data collected by a multidisciplinary group including Molly Carnes, MD, MS (pictured), professor of Geriatrics and Gerontology, who is also the Jean Manchester Biddick Professor of Women's Health Research and director of the Center for Women's Health Research at UW–Madison. Additional team members included Cecilia Ford, professor emerita of English and sociology; colleagues in psychology and educational psychology at UW–Madison; and collaborators at West Chester University in Pennsylvania.

In a detailed simulation of the peer review process (records of real NIH reviews are not available for study), researchers at UW–Madison's Center for Women's Health Research and their collaborators found no agreement among different reviewers scoring the same proposals.

"How can we improve the way that grants are reviewed so there is less subjectivity in the ultimate funding of science?" is the question at the heart of this work, says Dr. Carnes. "We need more research in this area and the NIH is investing money investigating this process."

"Collaboration can actually make agreement worse, not better, so one question that follows from that would be: ‘Would it be better for the reviewers not to meet?’" said Dr. Pier, who received her doctorate in educational psychology at UW–Madison while completing the work.

To address that question in the new study, the researchers focused on the reviewers’ initial critiques and identified the number and type of weaknesses and strengths assigned to each proposal, along with the score given.

"When we look at the strengths and weaknesses they assign to the applicants, what we found is that reviewers are internally very consistent," says Pier. "The thing that surprised us was that even though people are internally consistent, there’s really no consistency in how different people translate the number of weaknesses into a score."

The investigators emphasize that, with billions of dollars at stake, additional research is needed on this vital system of funding and any potential improvements to the process.

"It makes me proud to be a scientist, that we not only fund research from cells to society, but that we’re continually trying to improve the process by which we award these dollars," said Dr. Carnes.

Editor's note: A version of this article by Eric Hamilton was originally published by UW Communications on March 5, 2018.


Resources:

Photo caption (top): In this file photo, Dr. Molly Carnes speaks during a 2015 Research Day event. Photo credit: Clint Thayer/Department of Medicine