NIH Center for Scientific Review: Using AI in Peer Review Is a Breach of Confidentiality

“As the scientific community continues to evolve, it is essential to leverage the latest technologies to improve and streamline the peer-review process. One such technology that shows great promise is artificial intelligence (AI). AI-based peer review has the potential to make the process more efficient, accurate, and impartial, ultimately leading to better quality research.”

I suspect many of you were not fooled into thinking that I wrote that statement. A well-known AI tool wrote those words after I prompted it to discuss using AI in the peer review process. More and more, we are hearing stories about how researchers may use these tools when reviewing others’ applications, and even when writing their own.

Even if AI tools may have “great promise,” do we allow their use?

Reviewers are trusted and required to maintain confidentiality throughout the application review process. Thus, using AI to assist in peer review would involve a breach of confidentiality. In a recently released guide notice, we explain that NIH scientific peer reviewers are prohibited from using natural language processors, large language models, or other generative AI technologies for analyzing and formulating peer review critiques for grant applications and R&D contract proposals.

Reviewers have long been required to certify and sign an agreement that says they will not share applications, proposals, or meeting materials with anyone who has not been officially designated to participate in the peer review process. Yes, this includes websites, apps, and AI platforms. As part of standard pre-meeting certifications, all NIH peer reviewers and members of NIH advisory councils will be required to certify a modified Security, Confidentiality and Nondisclosure Agreement signifying that they fully understand and will comply with the confidential nature of the review process, including complete abstention from artificial intelligence tools in analyzing and critiquing NIH grant applications and contract proposals.

In other words, grant application and contract proposal materials and other privileged information cannot be shared or disseminated through any means or entity. Let’s explore this issue further with some hypothetical examples.

After agreeing to serve, Dr. Hal was assigned several grant applications to review. Hal has a lot of experience writing grant applications and knows how much effort and time they require. Even with that in mind, they were daunted when trying to give each of these applications the attention and appropriate review it deserved.

So, they turned to AI. They rationalized it would provide an unbiased assessment of the research proposed because it should be able to pull from numerous references and resources fairly quickly and distill the relevant information. And, to top it off, Hal even found a platform that was trained on publicly available biomedical research publications and NIH-funded grants.

Not seeing a problem, Hal fed the relevant information from the applications into the AI. Moments later, it gave an assessment, which Hal used as a starting point for their critique.

Here is another scenario:

Dr. Smith just finished reading what seemed to be way too many grant applications. Exhausted as they were, their job as an NIH peer reviewer was not done until those critiques were also written. Tired, a bit hungry, and ready to just get home, they wondered whether any of these new AI chatbots might be able to help. They rationalized that it would just be used to create a first draft, and then they would go back to clean up the review critique before submitting.

Smith copied the abstract, specific aims, and research strategy sections of the applications and uploaded them to one of the AI systems that is publicly available and widely used by many people for many different purposes.

A few minutes later, Ta-Da! There was some narrative that could be used for the first draft. Getting those initial critique drafts going saved hours of time.

To be clear, neither of these situations is allowed. Everybody involved with the NIH peer review process shares responsibility in maintaining and upholding the integrity of review. A breach of confidentiality, such as those described above, could lead to terminating a peer reviewer’s service, referring them for government-wide suspension or debarment, and possibly pursuing criminal or civil actions, depending on the severity. NIH Guide Notice NOT-OD-22-044, our Integrity and Confidentiality in NIH Peer Review page, and this NIH All About Grants podcast explain more.

Ensuring confidentiality means that scientists will feel comfortable sharing their candid, well-designed, and thorough research ideas with us. Generative AI tools need to be fed substantial, privileged, and detailed information to develop a peer reviewer critique on a specific grant application. Moreover, there is no guarantee about where AI tools send, save, view, or use grant application, contract proposal, or critique data at any time. Thus, using them absolutely violates our peer review confidentiality expectations.

NIH peer reviewers are selected and invited to review applications and proposals specifically for their individual expertise and opinion. The data that generative AI tools are trained on are limited to what exists, what has been widely published, and what opinions have been written for posterity. Biases are built into these data; the originality of thought that NIH values is lost and homogenized in this process, and the resulting text may even constitute plagiarism.

We take this issue seriously. Applicants are trusting us to protect their proprietary, sensitive, and confidential ideas from being given to others who do not have a need to know. In order to maintain this trust and keep the research process moving forward, reviewers are not allowed to share these applications with anybody or any entity.

Circling back to the beginning for a moment, we wanted to say a few words about using AI in writing one’s application. We do not know, or ask, who wrote an application. It could have come from the principal investigator, a postdoc, a professional grant writer, or the wider research team. If you use an AI tool to help write your application, you do so at your own risk.

This is because when we receive a grant application, it is our understanding that it is the original idea proposed by the institution and its affiliated research team. Using AI tools may introduce several concerns related to research misconduct, such as including plagiarized text from someone else’s work or fabricated citations. If we identify plagiarized, falsified, or fabricated information in a grant write-up, we will take appropriate actions to address the non-compliance. For the example at the top of this post, we ran the AI-generated text through a well-known online tool, which did not detect any plagiarism. Though we included it here for illustrative purposes, you should always be mindful of these concerns when putting together your application.

Link to article https://www.csr.nih.gov/reviewmatters/2023/06/23/using-ai-in-peer-review-is-a-breach-of-confidentiality/