Rapid Systematic Literature Review Prototype (2020)
Goal:
To create an automated process to help CDC’s Covid Mitigation team analyze and categorize legislation documents more rapidly. The existing process was entirely manual and had resulted in a backlog of thousands of unprocessed documents.
Summary:
The R&D team created the Rapid Review web application to accelerate targeted systematic reviews. The Rapid Review system enables a legal analyst to quickly narrow a collection of titles and abstracts from a literature review search down to those most relevant to the given question. The final process was shown to eliminate 60-70% of human effort while maintaining a high degree of accuracy (80-90%) compared to a fully manual process.
The Rapid Review algorithm was tested using Python’s scikit-learn library for machine learning (ML). After testing several architectural options, the team refined an algorithm that replicates the dual-reviewer process of traditional systematic reviews, using an ensemble of different ML models to help identify the documents that are most difficult to classify.
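As a rough illustration of the dual-reviewer idea, the sketch below trains two different scikit-learn models on the same labeled documents and flags the new documents on which they disagree. The choice of models, the TF-IDF features, and the function name `find_disagreements` are assumptions made for this example, not the project's actual configuration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB


def find_disagreements(train_texts, train_labels, new_texts):
    """Return indices of new_texts on which two model 'reviewers' disagree.

    Hypothetical helper: the model pair and features are assumptions.
    """
    # Turn raw text into TF-IDF features shared by both reviewers.
    vectorizer = TfidfVectorizer()
    X_train = vectorizer.fit_transform(train_texts)
    X_new = vectorizer.transform(new_texts)

    # Two different model families stand in for two independent reviewers.
    reviewer_a = LogisticRegression(max_iter=1000).fit(X_train, train_labels)
    reviewer_b = MultinomialNB().fit(X_train, train_labels)

    pred_a = reviewer_a.predict(X_new)
    pred_b = reviewer_b.predict(X_new)

    # Disagreement marks a document as hard to classify.
    return [i for i, (a, b) in enumerate(zip(pred_a, pred_b)) if a != b]
```

Documents returned by such a function could be treated as the difficult cases that deserve closer human attention, much as a disagreement between two human reviewers would be escalated in a traditional systematic review.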
Rather than building a system that classifies the documents directly, the Rapid Review system leverages the principles of active learning. In this type of process, a human makes all document classifications, with the ML model acting as a recommendation system that provides only those documents with a high probability of relevance to the research question.
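The recommendation step described above can be sketched as a ranking function: a model is fit on the documents the analyst has already labeled, and the unlabeled documents with the highest predicted probability of relevance are returned for human review. This is a minimal sketch under stated assumptions; the function name `recommend`, the parameter `k`, and the use of TF-IDF with logistic regression are illustrative choices, not details taken from the project.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


def recommend(labeled_texts, labels, unlabeled_texts, k=2):
    """Return indices of the k unlabeled documents most likely to be relevant.

    Hypothetical helper: labels are assumed to be 1 (relevant) or 0 (not).
    """
    vectorizer = TfidfVectorizer()
    X_labeled = vectorizer.fit_transform(labeled_texts)
    model = LogisticRegression(max_iter=1000).fit(X_labeled, labels)

    # Locate the probability column that corresponds to the "relevant" class.
    relevant_col = list(model.classes_).index(1)
    X_unlabeled = vectorizer.transform(unlabeled_texts)
    scores = model.predict_proba(X_unlabeled)[:, relevant_col]

    # Highest predicted relevance first; the human still makes every decision.
    ranked = sorted(range(len(unlabeled_texts)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]
```

In an active learning loop, the analyst's decisions on the recommended documents would be appended to the labeled set and the function called again, so each round of human labeling improves the next round of recommendations.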
The algorithm was bundled into a Flask-based API and hosted in a Docker container. In addition, a custom front-end app (React/Node.js) was developed to serve as the user interface. The interface provided a simple portal for user authentication, document tagging, and project management and organization.
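To give a sense of what a Flask wrapper around such an algorithm might look like, the sketch below exposes a single endpoint that accepts a list of documents and returns a ranking. The route name `/recommend`, the JSON field names, and the placeholder `score_documents` function are all assumptions for illustration; the project's actual API is not described here.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def score_documents(documents):
    # Placeholder scorer (assumption): a real service would call the trained
    # model here. Every document gets the same score in this sketch.
    return [0.0 for _ in documents]


@app.route("/recommend", methods=["POST"])
def recommend_endpoint():
    # Parse the JSON body; fall back to an empty dict if it is missing or invalid.
    payload = request.get_json(silent=True) or {}
    documents = payload.get("documents")
    if not isinstance(documents, list):
        return jsonify(error="'documents' must be a list of strings"), 400

    scores = score_documents(documents)
    # Rank document indices by score, highest first (ties keep input order).
    ranked = sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)
    return jsonify(ranking=ranked)
```

A front-end client could then POST a JSON body such as `{"documents": ["...", "..."]}` to this endpoint and display the returned ranking to the analyst.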