Text-mining for diverse review topics: preliminary analysis of a study comparing search strategies developed with and without text-mining tools

Oral session: Searching and information retrieval (2)


Wednesday 23 October 2019 - 16:00 to 17:30


All authors in correct order:

Paynter R1, Erinoff E2, Featherstone R3, Lege-Matsuura J2, Voisin C4, Fiordalisi C1, Stoeger E1, Huppert J5, Adam G6
1 Agency for Healthcare Research and Quality, Effective Health Care Program, Scientific Resource Center, USA
2 ECRI Evidence-based Practice Center, USA
3 Cochrane Editorial & Methods Department, UK
4 RTI International-University of North Carolina Evidence-based Practice Center, USA
5 Agency for Healthcare Research and Quality, USA
6 Brown University Evidence-based Practice Center, USA
Presenting author and contact person

Presenting author:

Robin Paynter

Abstract text
Background: In an era of explosive growth in biomedical evidence, improving systematic review search processes is critical. Text-mining tools (TMT) are a potentially powerful resource for improving and streamlining search strategy development.

Objectives: To compare the costs and benefits of searches developed with and without TMT. Specific questions include:
1) Do TMT decrease the time spent developing search strategies?
2) Do TMT identify groups of records that can be safely excluded?
3) Do TMT improve search performance?
4) How does the performance of TMT-assisted strategy development compare between simple and complex review topics?

Methods: In this prospective study, we plan to include 10 systematic review projects, classified by topic as simple or complex. Each project's Information Specialist will use conventional methods to create the search strategy, and a paired Information Specialist will independently create a MEDLINE search strategy for the same review project using text-mining tools. All text-mining searches will be created with freely available TMT, so that our research methods can be replicated across diverse settings and our findings are relevant to all review producers. We will collect the search results from both MEDLINE strategies, code and remove duplicates, and send the citations to the review team for screening. When the draft report is submitted, we will use its final list of included studies to calculate the sensitivity, specificity, precision, and Number Needed to Read (NNR) for both MEDLINE strategies. We will also track the time Information Specialists spend on each task in the search development process. We will analyze simple and complex topics separately to allow comparison.
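The performance metrics named above can be sketched as follows. This is a minimal illustration only: the function name and the example counts are hypothetical, not drawn from the study, and specificity assumes the size of the full screened reference set is known.

```python
def search_metrics(retrieved, included, total_records):
    """Standard search-performance metrics for a systematic review strategy.

    retrieved: set of record IDs returned by the MEDLINE strategy
    included: set of record IDs among the review's final included studies
    total_records: size of the full reference set (needed for specificity)
    """
    tp = len(retrieved & included)      # relevant records retrieved
    fn = len(included - retrieved)      # relevant records missed
    fp = len(retrieved) - tp            # irrelevant records retrieved
    tn = total_records - tp - fn - fp   # irrelevant records not retrieved

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # NNR: average number of records screened per included study found
    nnr = 1 / precision if precision else float("inf")
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "nnr": nnr}

# Hypothetical example: a strategy retrieves 500 of 10,000 records and
# captures 8 of the review's 10 included studies.
m = search_metrics(retrieved=set(range(500)),
                   included=set(range(492, 502)),
                   total_records=10_000)
# sensitivity 0.80, precision 0.016, NNR 62.5
```

In practice these counts would come from the deduplicated citation sets and the final included-study list described above, computed once per strategy per review.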

Results: We will report preliminary results from the completed systematic reviews at the Colloquium.

Patient or healthcare consumer involvement: Improvements to searching techniques positively affect the quality and efficiency of systematic review production, thereby providing better and more timely products on which consumers can base their healthcare decisions.