AI-assisted Protocol Information Extraction For Improved Accuracy and Efficiency in Clinical Trial Workflows


Increasing clinical trial protocol complexity, frequent amendments, and challenges around knowledge management create a significant burden for trial teams. Structuring protocol content into standard formats has the potential to improve efficiency, support documentation quality, and strengthen compliance. We evaluate an Artificial Intelligence (AI) system that uses generative Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) for automated clinical trial protocol information extraction. We compare the extraction accuracy of our clinical-trial-specific RAG process against that of publicly available (standalone) LLMs. We also assess the operational impact of AI assistance on simulated Clinical Research Coordinator (CRC) extraction workflows. Measured against expert-supported reference annotations, our RAG process was more accurate (87.8%) than standalone LLMs with fine-tuned prompts (62.6%). In the simulated extraction workflows, AI-assisted tasks were completed 40% faster, rated as less cognitively demanding, and strongly preferred by users. While expert oversight remains essential, these results suggest that AI-assisted extraction can enable protocol intelligence at scale, motivating the integration of similar methodologies into real-world clinical workflows to further validate their impact on feasibility, study start-up, and post-activation monitoring.


💡 Research Summary

Clinical trial protocols have become increasingly complex, with frequent amendments that impose a heavy burden on trial teams. To alleviate this, the authors developed an AI‑driven information extraction system that combines large language models (LLMs) with Retrieval‑Augmented Generation (RAG). The pipeline consists of three stages: (1) document preprocessing – PDF protocols are split into semantically meaningful chunks, embedded, and stored in a searchable vector database; (2) custom retrieval – domain‑specific queries retrieve the most relevant chunks for each data element, preserving traceability; and (3) structured generation – a generation LLM receives the retrieved chunks together with carefully crafted prompts and outputs JSON‑formatted data. A special two‑stage vision‑enhanced approach is used for extracting the Schedule of Events (SoE), which often appears as multi‑page, hierarchically merged tables that traditional PDF parsers struggle with.
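The first two stages of the pipeline can be sketched in miniature. The snippet below is an illustrative toy, not the authors' implementation: it uses fixed-size word chunking and a bag-of-words "embedding" with cosine similarity in place of a real embedding model and vector database, while preserving the traceability idea by returning chunk indices alongside retrieved text. All function names and the sample protocol text are assumptions for demonstration.

```python
import math
from collections import Counter

def chunk(text, max_words=10):
    """Stage 1 (simplified): split a document into fixed-size word chunks.
    A real pipeline would split on semantic boundaries instead."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def embed(text):
    """Toy bag-of-words 'embedding'; stands in for a learned embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=2):
    """Stage 2 (simplified): rank stored chunks against a domain-specific
    query, keeping chunk indices so each answer stays traceable to its source."""
    q = embed(query)
    scored = sorted(((cosine(q, embed(c)), i, c) for i, c in enumerate(chunks)),
                    reverse=True)
    return [(i, c) for _, i, c in scored[:k]]

protocol = ("Inclusion criteria: adults aged 18 or older with confirmed diagnosis. "
            "Exclusion criteria: prior treatment with the study drug. "
            "Schedule of events: screening visit, baseline visit, follow-up at week 4.")
chunks = chunk(protocol)
hits = retrieve("inclusion criteria age", chunks)
```

In stage 3, the retrieved `(index, text)` pairs would be passed to a generation LLM with a prompt requesting JSON output, so the returned indices can be surfaced to users as audit references.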

The authors selected 23 publicly available protocols from ClinicalTrials.gov (9 oncology, 7 cardiovascular, 7 other therapeutic areas) based on drug‑intervention studies conducted in Canada or the United States. Human experts manually created a semi‑structured reference dataset covering six broad categories (general information, inclusion/exclusion criteria, adverse event definitions, interventions, site requirements, and SoE). To evaluate extraction quality, they employed an “LLM‑as‑a‑judge” framework: a separate LLM automatically scored the outputs, and low‑confidence cases were reviewed by humans, providing a scalable yet reliable ground‑truth approximation.
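The judge-with-escalation pattern described above can be sketched as follows. This is a hypothetical illustration of the routing logic only: `stub_judge` stands in for the separate judging LLM, and the confidence threshold and sample fields are invented for the example, not taken from the paper.

```python
def stub_judge(extracted, reference):
    """Stand-in for an LLM judge. Returns (verdict, confidence):
    exact match -> confident 'correct', partial word overlap -> uncertain,
    no overlap -> confident 'incorrect'."""
    if extracted == reference:
        return True, 0.95
    overlap = set(extracted.lower().split()) & set(reference.lower().split())
    if overlap:
        return True, 0.40  # ambiguous case: low confidence
    return False, 0.90

def evaluate(pairs, judge=stub_judge, review_threshold=0.6):
    """Score (field, extracted, reference) triples automatically and route
    low-confidence judgments to a human-review queue."""
    auto, needs_review = [], []
    for field, extracted, reference in pairs:
        verdict, conf = judge(extracted, reference)
        target = auto if conf >= review_threshold else needs_review
        target.append((field, verdict, conf))
    return auto, needs_review

pairs = [
    ("phase", "Phase 2", "Phase 2"),                  # exact match
    ("sponsor", "Acme Oncology", "Acme Oncology Inc."),  # partial match
    ("sites", "12", "40"),                            # clear mismatch
]
auto, needs_review = evaluate(pairs)
```

The design point is that only the ambiguous minority of cases incurs human effort, which is what makes the ground-truth approximation scalable.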

Performance comparison showed the RAG system achieving an overall accuracy of 87.8% versus 62.6% for standalone LLMs using finely tuned prompts. The advantage was most pronounced for information that is scattered across the document or embedded in complex tables. In a controlled experiment with 13 clinical research coordinators (CRCs), AI-assisted extraction tasks were completed at least 40% faster than manual extraction, and participants reported significantly lower cognitive load on the NASA-TLX scale. Users also valued the system's auditability (chunk references) and intuitive interface, with over 90% expressing a preference for the AI-assisted workflow.

Despite these gains, the authors stress that expert oversight remains essential. Ground‑truth creation is inherently subjective, and the current system cannot fully replace human judgment, especially for nuanced regulatory interpretations. Data privacy and compliance are addressed by running the RAG pipeline within a secure, closed environment that prevents any protocol content from being used to train external models.

The study concludes that AI‑assisted protocol extraction can substantially improve efficiency and data quality in clinical trial operations, paving the way for “protocol intelligence” at scale. Future work will involve real‑world deployment to measure impacts on feasibility assessments, study start‑up timelines, and ongoing monitoring, as well as extending the approach to additional data domains and multi‑institutional settings.

