Background: millions of citations can be searched for similarities in their titles, abstracts and indexing. For example, some bibliographic databases suggest similar articles. The similarity between two articles is measured using algorithms that take into account the free text of both articles as well as their MeSH terms. Systematic review authors often review these suggested ‘similar articles’ to make sure that they have not missed any important studies for inclusion. Even so, to the best of our knowledge, it is not known how ‘similar’ the studies included in Cochrane Reviews are to each other.
Objectives: to evaluate the similarity of citations included in Cochrane Reviews.
Methods: an interdisciplinary team of methodologists and librarians identified all Cochrane Reviews published in 2018. We extracted all citations of included studies in these reviews and identified whether they were also indexed in PubMed. For feasibility, we picked a random sample of 100 reviews that each included at least five citations with PubMed IDs (PMIDs) for analysis. We developed a programme to interact with PubMed via the Entrez query and database system at the National Center for Biotechnology Information (NCBI). For each included citation, we submitted the PMID to PubMed and requested the PMIDs of all ‘related articles’. We tabulated and cross-referenced the retrieved PMIDs and mapped them into a network diagram for each Cochrane Review. We calculated the percentages of citations that were similar to each other. We used EndNote (X8), Microsoft Excel, and Excel add-ons (NodeXL and IrisXl) for citation management, data management and data analysis.
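The ‘related articles’ request described in the Methods can be sketched as follows. This is a minimal illustration, not the programme used in the study: it assumes the NCBI E-utilities ELink endpoint with the `pubmed_pubmed` link name, and the function names (`build_elink_url`, `parse_related_pmids`, `fetch_related_pmids`) are hypothetical. Excluding the submitted PMID from its own result list is also an assumption.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# NCBI E-utilities ELink endpoint (assumed here; the abstract only says "Entrez").
EUTILS_ELINK = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"


def build_elink_url(pmid):
    """Build an ELink URL asking PubMed for articles related to one PMID."""
    params = {
        "dbfrom": "pubmed",
        "db": "pubmed",
        "id": str(pmid),
        "linkname": "pubmed_pubmed",  # PubMed-to-PubMed 'similar articles' links
    }
    return EUTILS_ELINK + "?" + urllib.parse.urlencode(params)


def parse_related_pmids(xml_text, source_pmid):
    """Extract related PMIDs from an ELink XML response, dropping the source PMID."""
    root = ET.fromstring(xml_text)
    related = []
    for linksetdb in root.iter("LinkSetDb"):
        if linksetdb.findtext("LinkName") != "pubmed_pubmed":
            continue
        for link in linksetdb.findall("Link"):
            pmid = link.findtext("Id")
            if pmid and pmid != str(source_pmid):
                related.append(pmid)
    return related


def fetch_related_pmids(pmid):
    """Query PubMed over the network and return the related PMIDs for one citation."""
    with urllib.request.urlopen(build_elink_url(pmid), timeout=30) as response:
        return parse_related_pmids(response.read().decode("utf-8"), pmid)
```

Calling `fetch_related_pmids` once per included citation would yield the lists that are then tabulated and cross-referenced within each review. A real run should also respect NCBI's usage guidelines on request rates.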
Results: from the 100 analyzed Cochrane Reviews, we identified 2610 included citations indexed in PubMed. Each review included 29.2 ± 30.3 citations (range: 6 to 217). For almost all reviews (98%), the included citations were related to at least half of the other included citations in the same review. In 59% of reviews, the included citations were related to more than 90% of the other included citations. As expected, the number of citations related to multiple included citations in the same review decreases proportionally as the number of relationships increases (Figure 1). We also reviewed the geometry of the network plots to try to identify reasons for disconnected networks and outliers, and as quality assurance that the included citations met the inclusion criteria (Figure 2). Reasons for the anomalies were sometimes clear (e.g. poor indexing of letters to the editor); in other cases it remains unclear why a citation was not captured.
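A per-citation relatedness percentage of the kind reported here could be computed along these lines. This is a sketch under stated assumptions, not the study's calculation: the function name `related_share` is hypothetical, and two citations are treated as related if either one appears in the other's PubMed ‘related articles’ list (the abstract does not say whether direction matters).

```python
def related_share(included, related):
    """For each included PMID, return the percentage of the OTHER included
    PMIDs in the same review that it is related to.

    included: iterable of PMIDs (strings) included in one review.
    related:  dict mapping a PMID to the set of PMIDs PubMed returned as
              'related articles' for it.
    Assumption: a link in either direction counts as a relationship.
    """
    included = list(dict.fromkeys(included))  # de-duplicate, keep order
    shares = {}
    for a in included:
        others = [b for b in included if b != a]
        if not others:
            shares[a] = 0.0
            continue
        hits = sum(
            1
            for b in others
            if b in related.get(a, set()) or a in related.get(b, set())
        )
        shares[a] = 100.0 * hits / len(others)
    return shares


# Toy example with made-up PMIDs: citation "4" is an isolated node.
toy_included = ["1", "2", "3", "4"]
toy_related = {"1": {"2", "3", "99"}, "2": {"1"}, "3": set(), "4": set()}
toy_shares = related_share(toy_included, toy_related)
```

In the toy example, a citation with a share of zero corresponds to a disconnected node in the network plot, which is the kind of anomaly that would be flagged for a closer look.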
Conclusions: overall, citations of studies included in Cochrane Systematic Reviews are similar to each other. Reviewing the network plots to identify anomalies may assist review authors in assuring the quality of searches and in flagging studies that may not meet the inclusion criteria. We are further developing this strategy into a formal search model using ‘similar articles’.
Patient or healthcare consumer involvement: not involved