The Art of Science Oversight

Is IRB variability really a bad thing?

In June 2016, the National Institutes of Health (NIH) announced a new policy requiring use of a single institutional review board (IRB) for NIH-funded, non-exempt multi-institutional research to take effect on September 25, 2017. While proponents see the potential for streamlining the research approval process, others point to the lack of data supporting such possible benefits. [1]

Central vs. Local: An ongoing debate

IRBs have played an important role in reviewing studies involving human research participants. To date, federal regulations have required research protocols conducted at multiple sites to be reviewed by an IRB at each site. As large, multi-institutional trials became more commonplace, variability in the IRB reviews of the same protocol at different institutions led to concern that IRB review, rather than reliably applying a single standard, was arbitrary and therefore possibly dangerous. [2] Empirical evidence has documented this variability. A study published in 2001 found that, when reviewing the same proposals for human research, different IRBs often assigned different levels of risk and/or benefit to study participants. [3] A more recent review of the literature echoes this study’s findings despite the push towards centralization of research oversight. [4] Some have suggested that centralizing the research review process would reduce arbitrariness and accelerate approval by streamlining the process. However, despite the merits of these concerns and the anticipated benefits of single IRB review in multi-institutional trials, centralization can neither fully correct IRB variability nor eliminate genuine moral disagreement, and it may reduce the flexibility necessary to account for regional and cultural differences.

Additionally, variability among IRB assessments can have positive features. Variability may reflect flexibility among research sites attempting to account for relevant regional and cultural differences, allowing IRBs to tailor a research program and its protocols to the needs and protection of the local population. Variation between IRBs may also encourage a more comprehensive discourse on how best to protect human subjects.

Streamlining or Forced Uniformity?

While it might seem that IRB decentralization and subjectivity are inherently troublesome, they are not necessarily negative features of the research oversight process. If IRBs are all attempting, in good faith, to meet the same rigorous general standard of human subjects protection, then this variability may offer IRBs an opportunity to assess risk and benefit in a manner that better reflects their specific subject population. For example, an IRB at a site primarily serving active duty military members may evaluate certain risks differently than one at a site primarily serving an elderly population. The subjective evaluations of risks and benefits, in this example, may be significantly different due to “considerations such as the extent to which [a harm] alters or affects lifestyle.” [5] Consider that the military population may evaluate the risk of being unable to perform strenuous exercise differently than the elderly population.

Furthermore, evaluating risk and benefit uniformly across all IRBs or through a central IRB would be a disservice to many populations, given the cultural differences that exist across the United States. For example, an IRB at a health center that primarily serves a Native American population may not evaluate risks in the same way as an IRB that serves an affluent suburban population. Ignoring such differences risks further problems like those that arose in the infamous Havasupai research. In short, reasonable people can disagree about the level of risks and benefits, even within the same research protocol; thus, variability among IRBs is not automatically problematic because it allows for a pragmatic flexibility, provided that a good faith effort is made to protect subjects.

In addition to flexibility, variation offers an opportunity for constructive dialogue beyond individual IRB deliberations. There is no single way to understand risk or to identify protective measures for research participants. Therefore, variability might, in at least some cases, be viewed as legitimate moral disagreement and not as an inherent problem representing lapses in research oversight or as an example of inefficacy, as suggested by some authors. [2] For example, there has been copious writing about what constitutes “more than minimal risk” or “clinical equipoise” in the literature on research ethics. These disagreements presumably do not stop at the doors of IRBs. Research oversight is an art, not a science. Is it reasonable to assume that different people, with different values, should automatically reach the same conclusions?

Behind Closed Doors? Opening up ethical deliberations

Accordingly, the problem is not IRB variability itself, which occurs in opinions expressed within IRBs as well as between them, but rather that the reasons for variability are not readily made public or discussed. If IRBs exist as small islands of moral deliberation, variation will naturally seem arbitrary from the outside because the morally salient reasons leading to such variation will be obscured from those not present for the deliberations. Variation between IRBs represents an opportunity for reasonable and caring people to disagree about important topics, discuss why they disagree, and arrive at something resembling a public consensus.

Spaces already exist for IRB participants to promote these dialogues outside of IRB meetings, including academic journals, consortia, and similar events. However, if such dialogues were expanded and made an essential part of the research oversight process, IRB variability might be shown to be a positive feature of that process. For example, cross-IRB review could be made routine, and decisions made by IRBs could be made available for public comment. This dialogue could lead to ongoing exploration and elucidation of procedural and substantive components of IRB review.

A space for different but equally informed opinions is essential for any morally controversial topic, and research oversight is no different. Disagreement within IRBs is valuable to the process of deliberation and the consideration of varying viewpoints and, by logical extension, variability between IRBs has the potential to be equally valuable.

Risks of Decentralization

Of course, not all variability among IRBs is beneficial. For example, it would be unfortunate if certain IRBs consistently underestimated risk and approved riskier protocols that would not otherwise be approved without significant changes. The threat that inevitable human variation could create incentives for unethical behavior is real. [6] At worst, IRBs could purposely understate risk for financial or other reasons. IRBs must ensure that they are not straying from the principles outlined in the Belmont Report and the regulations of the Common Rule by systematically underestimating risk and unnecessarily placing human subjects at risk.

In addition to arising from a conflict of interest or a general spirit of “over-permissiveness,” problematic variations can result from simple misunderstanding or ignorance of the Common Rule regulations and the Belmont Report’s principles. This type of variation is neither beneficial nor pragmatic. It may make large, multi-site studies more difficult to conduct, leave patients under-protected, or, conversely, place undue and unnecessary burdens on researchers.

Though it is important to acknowledge that different research settings may necessitate differences in risk evaluations, it is essential to the research process that the overall framework established in research oversight regulations not be arbitrarily or poorly applied. Research oversight is meant to be flexible in its evaluation of risks and benefits. However, other aspects of the regulations are not flexible and should not be treated as such. Inappropriate flexibility born of ignorance is neither benign nor simply a subspecies of productive moral disagreement.

Balancing the Best of Both Approaches

Despite their best efforts at objectivity, IRB members may not always evaluate risks and benefits in exactly the same way. These diverse groups can validly differ on subjective evaluations of risks and benefits. In many ways, such differences are useful because they allow flexibility and a measure of responsiveness to local needs. These differences may also expand discussion and legitimate disagreement on morally difficult topics. However, it is important, for the protection of human research subjects, to remember that the pressure to publish, the possibility of financial conflicts of interest, and other such influences can create a market for IRBs that systematically underestimate risks and overstate benefits. Such negative influences could lead to unwarranted approval of riskier studies.

The scientific, bioethical, and regulatory communities must ensure a threshold level of competence and integrity across all IRBs, even if complete uniformity is impossible and counterproductive. Accordingly, the research oversight community should work to improve communication between IRBs, develop a more open research oversight process, and provide relevant training and continuing education for IRB members. Under these conditions, the benefits of a decentralized IRB system would be more readily apparent.

Griffen Allen, MBE '17, can be reached at  BioethicsJournal (at)

[1] Klitzman, Robert, Ekaterina Pivovarova, and Charles W. Lidz. “Single IRBs in Multisite Trials: Questions Posed by the New NIH Policy.” JAMA 317, no. 20 (2017): 2061-2062.

[2] Abbott, Lura, and Christine Grady. “A Systematic Review of the Empirical Literature Evaluating IRBs: What We Know and What We Still Need to Learn.” Journal of Empirical Research on Human Research Ethics 6, no. 1 (2011): 3-19.

[3] Hull, Sara, Henry Silverman, and Jeremy Sugarman. “Variability among Institutional Review Boards’ Decisions within the Context of a Multicenter Trial.” Critical Care Medicine 29, no. 2 (2001): 235.

[4] Brehaut, Jamie C., Dean Fergusson, Tavis P. Hayes, Michael McDonald, Stuart G. Nicholls, Raphael Saginur, and Charles Weijer. “A Scoping Review of Empirical Research Relating to Quality and Effectiveness of Research Ethics Review.” PLOS One 10, no. 7 (2015).

[5] Meslin, Eric. “Protecting Human Subjects from Harm through Improved Risk Judgments.” IRB: Ethics and Human Research 12, no. 1 (1990): 7–10.

[6] Lemmens, Trudo, and Benjamin Freedman. “Ethics Review for Sale? Conflicts of Interest and Commercial Research Review Boards.” Milbank Quarterly 78, no. 4 (2000): 8.

[7] Lemmens, Trudo and Benjamin Freedman. “Ethics Review for Sale? Conflicts of Interest and Commercial Research Review Boards.” Milbank Quarterly 78, no. 4 (2000).
