Study examines equity in blinded versus unblinded peer review

A new large-scale field study has examined whether keeping author identities hidden during peer review makes the process more equitable. The work, led by Indiana University College of Arts and Sciences professor Tim Pleskac with collaborators Ellie Kyung (Babson College), Gretchen Chapman (Carnegie Mellon University), and Oleg Urminsky (University of Chicago), evaluated the effects of single-blind (reviewers know the author’s identity) versus double-blind (neither authors nor reviewers know each other’s identities) reviews.

The research focused on submissions to the 39th Annual Conference of the Society for Judgment and Decision Making, a global organization of more than 1,800 scholars. This marks the first systematic field study assessing fairness, reliability, and validity across review systems in a real-world review setting.

The findings were nuanced. Anonymous review appeared to reduce disparities for Asian first authors, yet women and early-career scholars fared slightly worse under this system. Differences in reliability and validity between anonymous and non-anonymous reviews were limited, and the most significant source of variation came from differences between reviewers themselves rather than from the review system.

Based on the results, the research team recommended anonymous review as the preferred approach, while noting the need for refinements to improve fairness and rigor. They also suggested viewing the inherent “noise” in peer review not as a flaw but as an opportunity, such as integrating an informed lottery system to select among the top-rated submissions. This approach could help broaden the range of ideas and perspectives represented in scientific discourse.

The study, titled “Blinded versus unblinded review: A field study on the equity of peer-review processes”, was published August 6, 2025, in Management Science. Following the findings, the Society for Judgment and Decision Making adopted anonymous review in its own processes.

The project was rooted in years of service on review committees within the Society for Judgment and Decision Making. Members had noticed potential links between anonymity and outcomes such as gender representation among award recipients, as well as concerns about whether institutional prestige influenced acceptance decisions. These experiences motivated the design of a structured, NSF-funded experiment to assess the issue empirically.

To compare the two review processes, 112 faculty reviewers assessed 530 conference submissions. Reviewers were randomly assigned to either anonymous or non-anonymous review, typically evaluating about 30 submissions each. To test reliability, 10% of submissions were assigned to three additional reviewers in each condition. Predictive validity was measured by later assessments of conference presentations by faculty and graduate students, as well as by tracking which submissions were eventually published.

The analysis found that Asian first authors received higher ratings under anonymous review, suggesting a reduction in racial or ethnic bias. Too few submissions came from Black, Hispanic, or Native American authors to draw meaningful conclusions, highlighting broader issues of diversity in the field.

Gender patterns differed. Male first authors received higher ratings than female first authors in both systems, with the gap slightly larger under anonymous review. This may reflect broader gender disparities in mathematically intensive subfields, or differences in how topics are valued. Some evidence suggested that non-anonymous reviewers may have consciously worked to offset these gaps.

Career stage also influenced outcomes. Submissions with senior coauthors were rated more highly in the non-anonymous process, while early-career first authors, such as doctoral students and research scientists, scored worse under anonymous review. This suggests that when identities are visible, reviewers may give additional weight to senior faculty while also extending more opportunities to early-career researchers whose status is known.

Both systems showed agreement on which submissions were weakest, but consensus about top-rated work was much lower. Panels of reviewers aligned on less than half of their leading choices, underscoring the variability within peer review. Neither gender, race, nor seniority predicted future success of the work, as measured by presentation ratings or later publication outcomes. Review scores had only limited predictive value for these markers of success.

Given the inherent variability, the researchers proposed embracing the idea that peer review contains elements of chance. One option would be to rank submissions broadly into higher- and lower-quality groups, then select randomly from the top group. This approach could ensure that strong but unconventional work is not overlooked and could help diversify the ideas represented at conferences.
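The informed-lottery idea described above can be sketched in a few lines of code. This is a hypothetical illustration, not the authors' procedure: the function name, the size of the top-rated group, and the example scores are all assumptions made for the sketch.

```python
import random

def informed_lottery(scores, top_fraction=0.5, n_accept=3, seed=None):
    """Rank submissions by review score, then accept a random sample
    from the top-rated group (all parameters are illustrative)."""
    rng = random.Random(seed)
    # Sort submission IDs from highest to lowest score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Form the top-rated group; ensure it is at least n_accept large.
    cutoff = max(n_accept, int(len(ranked) * top_fraction))
    top_group = ranked[:cutoff]
    # Draw randomly within that group, so strong but unconventional
    # work has the same chance as the very highest-scored submissions.
    return rng.sample(top_group, n_accept)

# Hypothetical mean review scores for ten submissions.
scores = {"p1": 4.8, "p2": 4.5, "p3": 4.4, "p4": 4.2, "p5": 4.1,
          "p6": 3.9, "p7": 3.7, "p8": 3.2, "p9": 2.8, "p10": 2.1}
accepted = informed_lottery(scores, top_fraction=0.5, n_accept=3, seed=42)
```

Here the lottery first coarsely separates higher- from lower-quality submissions (the top half, in this sketch) and only then randomizes, so chance operates among work the reviewers already judged strong.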
