Assessing the agreement in retraction indexing across 4 multidisciplinary sources: Crossref, Retraction Watch, Scopus, and Web of Science

Jodi Schneider*, Jou Lee**, Heng Zheng**, Malik Salami**

*jodi@illinois.edu & jschneider@pobox.com

0000-0002-5098-5667

School of Information Sciences, University of Illinois at Urbana-Champaign, USA

**joulee2@illinois.edu; zhenghz@illinois.edu; malikos2@illinois.edu

0000-0001-8927-0370; 0000-0001-5866-7746; 0000-0002-2329-5660

School of Information Sciences, University of Illinois at Urbana-Champaign, USA

Previous research has posited a correlation between poor indexing and inadvertent post-retraction citation. However, to date, there has been limited systematic study of retraction indexing quality: we are aware of one database-wide comparison of PubMed and Web of Science, and multiple smaller studies highlighting indexing problems for items with the same reason for retraction or same field of study. To assess the agreement between multidisciplinary retraction indices, we create a union list of 49,924 publications with DOIs from the retraction indices of at least one of Crossref, Retraction Watch, Scopus, and Web of Science. Only 1,593 (3%) are deemed retracted by the intersection of all four sources. For 14,743 publications (almost 30%), there is disagreement: at least one source deems them retracted while another lacks retraction indexing. Of the items deemed retracted by at least one source, retraction indexing was lacking for 32% covered in Scopus, 7% covered in Crossref, and 4% covered in Web of Science. We manually examined 201 items from the union list and found that 115/201 (57.21%) DOIs were retracted publications while 59 (29.35%) were retraction notices. In future work we plan to use a validated version of this union list to assess the retraction indexing of subject-specific sources.

1. Introduction

Retraction has been widely studied in scientometric research, often relying on databases such as PubMed and Web of Science to determine which publications are retracted. For example, only 5.4% of post-retraction citations in PubMed Central acknowledged that the paper being cited was retracted (Hsiao & Schneider, 2021), and a case study posited a correlation between poor indexing and inadvertent post-retraction citation (Schneider et al., 2020).

Many retracted papers are not marked as retracted on publisher and aggregator sites (Badreldin et al., 2020; Decullier & Maisonneuve, 2018). Retraction status is inconsistently displayed across a wide range of sources, including publisher sites (Dal-Ré & Ayuso, 2020; Suelzer et al., 2021), search engines (Genot & Olsson, 2021), scholarly databases (Mine, 2019; Proescholdt & Schneider, 2020; Schneider et al., 2020; Suelzer et al., 2021), and other websites (Frampton et al., 2021; Mine, 2019).

Retraction indexing may also be lacking in some cases. For example, Proescholdt and Schneider (2020) found thousands of examples of apparently retracted papers that were not indexed as such, whose titles start with "RETRACTED:" or a cognate phrase. Early retractions might also pose challenges: many were issued in non-citable ways such as "tip-in" notices (Snodgrass & Pfeifer, 1992), which did not meet PubMed indexing standards (Kotzin & Schuyler, 1989) and would be missed by retraction indexing. Other studies discovered indexing issues in both document titles and the linking of retracted publications and retraction notices (Schmidt, 2018; Suelzer et al., 2021).

However, to date, there has been limited systematic study of retraction indexing quality: we are aware of one database-wide comparison of PubMed and Web of Science (Schmidt, 2018), and multiple smaller studies highlighting database indexing problems for items with the same reason for retraction (e.g., Malički et al., 2019) or the same field of study (e.g., Bakker & Riegelman, 2018; Dal-Ré & Ayuso, 2020; among many others). An analysis of PubMed's duplicate publication index in 2013 found that 48% (12/25) of retracted publications (identified by publisher notices) did not show retraction status correctly for duplicate publications, and these problems persisted after authors contacted PubMed and editors during a 5-year follow-up period (Malički et al., 2019). 38% of mental health articles and 4% of genetics articles marked as retracted in Retraction Watch were not indexed as retracted in PubMed (Bakker & Riegelman, 2018; Dal-Ré & Ayuso, 2020). An analysis of 144 retracted articles in mental health found that only 7% (10/144) of retracted items were marked as such across a variety of publisher sites and database records (i.e., EBSCO databases, MEDLINE and PsycINFO via Ovid, PubMed, Scopus, Web of Science), and of those, the majority indicated the retraction in only one place (Bakker & Riegelman, 2018).

While it is known that retraction indexes are incomplete, there has been no systematic assessment of the extent to which retraction metadata agrees in multidisciplinary databases. This study fills that gap.

2. Goals and Research Questions

We construct a union list of all DOIs indexed as retracted publications in at least one of four multidisciplinary sources: Crossref, Retraction Watch, Scopus, and Web of Science. We check the extent to which each source agrees with the union list, restricting to each source’s coverage.

Our specific research questions are:

  1. How many DOIs are indexed as retracted publications in each of Crossref, Retraction Watch, Scopus, and Web of Science? Overall, how many DOIs are indexed as retracted publications in at least one source?

  2. How much agreement does each source have with the union list, restricting to its coverage?

  3. Does the level of agreement in DOIs indexed as retracted publications vary by field, publication year, or retraction year?

  4. For a sample of DOIs with less than 100% agreement in retraction indexing, does the publisher's website indicate that they are retracted publications?

3. Methods and Data

3.1. Methods and Data for RQ1: How many DOIs are indexed as retracted publications in each of Crossref, Retraction Watch, Scopus, and Web of Science? Overall, how many DOIs are indexed as retracted publications in at least one source?

To address RQ1, we create a list of DOIs that are indexed as retracted publications in one or more of our sources. To do this, we extract metadata about retracted publications as shown in Table 1.

After retrieving DOIs indexed as retracted publications, we deduplicate the metadata within each data source, removing duplicate items with the same DOI. For ease of matching, we also remove items without a DOI. Then we combine metadata across the four sources. Each DOI is annotated with a list of the sources that indexed it as a retracted publication, which we call rp_indexed_in.
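This merge step is straightforward to express in code. Below is a minimal sketch, not the released implementation (which is linked under Open science practices); the record layout and function name are illustrative assumptions.

```python
from collections import defaultdict

def build_union_list(records_by_source):
    """Merge per-source retraction records into a union list keyed by DOI.

    records_by_source: dict mapping a source name ('crossref',
    'retraction_watch', 'scopus', 'wos') to a list of record dicts,
    each with a 'doi' key that may be missing or empty.
    Returns a dict mapping each normalized DOI to the sorted list of
    sources that index it as a retracted publication (rp_indexed_in).
    """
    rp_indexed_in = defaultdict(set)
    for source, records in records_by_source.items():
        for record in records:
            doi = (record.get("doi") or "").strip().lower()
            if not doi:  # drop records without a DOI, as in the paper
                continue
            # set membership deduplicates repeated DOIs within a source
            rp_indexed_in[doi].add(source)
    return {doi: sorted(sources) for doi, sources in rp_indexed_in.items()}
```

Using a set per DOI both deduplicates records within a source and accumulates the rp_indexed_in annotation across sources in a single pass.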

We do not seek to retrieve publications indexed as errata or corrections because, according to the Committee on Publication Ethics (COPE Council, 2019), retractions should be distinguished from other types of correction or comment.

Table 1. Retracted publications identified from multidisciplinary sources.

Crossref
  Search query: Update_type = ('retraction', 'Retraction', 'retracion', 'retration', 'partial_retraction', 'withdrawal', 'removal')
  Query results retrieved [1]: 14,745
  Search date: 2023-04-05
  Top categories (as categorized by source):
    General Medicine (1,738)
    Pharmacology (medical) (1,315)
    Multidisciplinary (883)
    General Computer Science (426)
    General Environmental Science (385)
    Biochemistry (385)

Retraction Watch
  Search query: all results
  Query results retrieved [1]: 39,301
  Search date: 2023-03-27
  Top categories (as categorized by source):
    (BLS) Biology - Cancer; (BLS) Biology - Cellular; (BLS) Genetics (838)
    (B/T) Computer Science; (B/T) Technology (719)
    (B/T) Computer Science (674)

Scopus [2]
  Search query: DOCTYPE("tb")
  Query results retrieved [1]: 21,515
  Search date: 2023-04-05
  Top categories (as categorized by source):
    Computer Science (6,911)
    Engineering (5,887)
    Medicine (3,908)
    Biochemistry, Genetics and Molecular Biology (2,935)
    Business, Management and Accounting (2,884)
    Physics and Astronomy (2,078)

Web of Science (WoS), all collections
  Search query: DT="Retracted Publication"
  Query results retrieved [1]: 16,434
  Search date: 2023-04-05
  Top categories (as categorized by source):
    Biochemistry Molecular Biology (7,920)
    Genetics Heredity (5,796)
    Cell Biology (5,495)
    Pharmacology Pharmacy (5,010)
    Oncology (4,225)
    Immunology (2,810)
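For Crossref, the update_type values in Table 1 (including Crossref's own typo variants 'retracion' and 'retration') can be retrieved through the public REST API's update-type filter with cursor-based paging. The sketch below is a hedged illustration of such a retrieval; the retrieval code actually used for the paper is in the linked repository and may differ in its details.

```python
import requests

CROSSREF_WORKS = "https://api.crossref.org/works"
# update_type values from Table 1
UPDATE_TYPES = ["retraction", "Retraction", "retracion", "retration",
                "partial_retraction", "withdrawal", "removal"]

def crossref_retraction_updates(mailto="you@example.org"):
    """Yield Crossref works whose update-type marks a retraction.

    Note: these records are the update notices themselves; the
    'update-to' field of each notice carries the DOI(s) of the
    retracted publication.
    """
    params = {
        "filter": ",".join(f"update-type:{t}" for t in UPDATE_TYPES),
        "rows": 1000,
        "cursor": "*",      # cursor-based deep paging
        "mailto": mailto,   # polite-pool identification
    }
    while True:
        message = requests.get(CROSSREF_WORKS, params=params,
                               timeout=60).json()["message"]
        items = message.get("items", [])
        if not items:
            break
        yield from items
        params["cursor"] = message["next-cursor"]
```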

3.2. Methods and Data for RQ2: How much agreement does each source have with the union list, restricting to its coverage?

An item might not be found in a given source on a given search date for one of two reasons: the item was not covered by the source, or the item was covered but not indexed as a retracted publication in that source. For a given DOI, we poll each source in which it is not rp_indexed_in (using the results from RQ1) to see whether the DOI is covered_in that source. We use APIs for Crossref, Scopus, and Web of Science; for Retraction Watch, there is nothing to check because our database dump only covers retracted publications.

In calculating agreement, we consider a source to agree if it indexes as retracted a publication that is deemed retracted by any one of our sources (including itself alone).

Taking coverage into account, we quantify the extent of agreement in retraction indexing for each source:

\(RetractionIndexingAgreement\_SOURCE = \frac{\text{Number of DOIs}\ rp\_indexed\_in(SOURCE)}{\text{Number of DOIs}\ covered\_in(SOURCE)}\)
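In code, the source-level calculation reduces to two counts per source. A minimal sketch, assuming coverage has already been resolved by the API polling described above (the data structures here are illustrative, not from the released code):

```python
def source_agreement(rp_indexed_in, covered_in):
    """RetractionIndexingAgreement_SOURCE for every source.

    rp_indexed_in: {doi: set of sources indexing it as retracted}
    covered_in:    {doi: set of sources covering the DOI at all};
                   a source that indexes a DOI also covers it.
    """
    indexed_count, covered_count = {}, {}
    for doi, indexing_sources in rp_indexed_in.items():
        covering_sources = covered_in.get(doi, set()) | indexing_sources
        for s in indexing_sources:
            indexed_count[s] = indexed_count.get(s, 0) + 1
        for s in covering_sources:
            covered_count[s] = covered_count.get(s, 0) + 1
    return {s: indexed_count.get(s, 0) / covered_count[s]
            for s in covered_count}
```

Under this definition Retraction Watch scores 100% by construction, since every DOI it covers is one it deems retracted.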

3.3. Methods and Data for RQ3: Does the level of agreement in DOIs indexed as retracted publications vary by field, publication year, or retraction year?

Analogous to RetractionIndexingAgreement_SOURCE above, we also quantify the extent of agreement in retraction indexing for each DOI:

\(RetractionIndexingAgreement\_DOI = \frac{\text{Number of sources the DOI is}\ rp\_indexed\_in}{\text{Number of sources the DOI is}\ covered\_in}\)

We then analyze the RetractionIndexingAgreement_DOI across field, publication year, and retraction year.
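The per-DOI score follows the same pattern (again a sketch over the same illustrative data structures):

```python
def doi_agreement(doi, rp_indexed_in, covered_in):
    """RetractionIndexingAgreement_DOI: the fraction of sources
    covering the DOI that also index it as a retracted publication."""
    indexing = rp_indexed_in.get(doi, set())
    covering = covered_in.get(doi, set()) | indexing
    return len(indexing) / len(covering)
```

For example, a DOI covered by three sources but indexed as retracted in only two scores 2/3, i.e., the 66% bucket in Figure 7.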

We (JL, JS) categorize DOIs based on the conference or journal in which they appear. We use Scopus's conference and journal categorization when available for titles on the Scopus source list as of January 2023 [3]: publications are one or more of Health Science, Life Science, Physical Science, Social Science, or General. For venue titles not in Scopus, we extract an initial set of topic words by running Yet Another Keyword Extractor [4] on the Scopus source list. Then, in an iterative process, we review uncategorized conference and journal titles and manually curate additional keywords [5] in English and close cognates (e.g., Kardiologie). Titles in other languages or using terminology with multiple potential meanings are left uncategorized.
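As an illustration of the keyword fallback, the sketch below matches venue titles against per-field keyword lists. The keywords shown are invented stand-ins; the actual curated lists are in the shared data deposit (see footnote 5).

```python
# Illustrative stand-in keywords; the curated lists include English
# terms plus close cognates such as "Kardiologie".
FIELD_KEYWORDS = {
    "Health Science": ["medicine", "clinical", "cardiology", "kardiologie"],
    "Life Science": ["biology", "genetics", "neuroscience"],
    "Physical Science": ["physics", "chemistry", "engineering"],
    "Social Science": ["sociology", "economics", "education"],
}

def categorize_venue(title):
    """Return every field whose keywords appear in the venue title.

    Titles matching no keyword are left uncategorized (empty set),
    mirroring the paper's handling of other-language or ambiguous titles.
    """
    lowered = title.lower()
    return {field for field, words in FIELD_KEYWORDS.items()
            if any(word in lowered for word in words)}
```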

3.4. Methods and Data for RQ4: For a sample of DOIs with less than 100% agreement in retraction indexing, does the publisher's website indicate that they are retracted publications?

We (HZ, JS) examine a sample of about 200 DOIs from our union list that are covered in multiple sources that disagree on their retraction indexing (i.e., RetractionIndexingAgreement_DOI < 100%), to check: does the publisher's website indicate that they are retracted publications?

To select the sample, we first group DOIs by the pair (RetractionIndexingAgreement_DOI score as calculated for RQ2, field as determined for RQ3) and then select items from each group, keeping other aspects, particularly the journal or conference title, as diverse as feasible. We overselect DOIs with certain features: a retraction year earlier than the publication year (especially more than 1 year earlier), a PubMed ID (since PubMed retraction status is public domain data freely available for reuse), or no retraction year in our data. [6]
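A simplified sketch of this stratified selection follows; the grouping key is from the text, while venue diversification and the overselection rules are approximated (and flagged) in the comments.

```python
import random

def stratified_sample(doi_info, per_group=5, seed=42):
    """Group disagreeing DOIs by (agreement score, field), then sample.

    doi_info: {doi: {"agreement": float, "field": str}} (illustrative).
    This sketch omits the paper's overselection of DOIs with a
    retraction year before the publication year, a PubMed ID, or a
    missing retraction year, and approximates venue diversity by
    shuffling within each group.
    """
    groups = {}
    for doi, info in doi_info.items():
        if info["agreement"] < 1.0:  # only DOIs with indexing disagreement
            key = (info["agreement"], info["field"])
            groups.setdefault(key, []).append(doi)
    rng = random.Random(seed)
    sample = []
    for dois in groups.values():
        rng.shuffle(dois)
        sample.extend(dois[:per_group])
    return sample
```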

4. Results

4.1. Results for RQ1: How many DOIs are indexed as retracted publications in each of Crossref, Retraction Watch, Scopus, and Web of Science? Overall, how many DOIs are indexed as retracted publications in at least one source?

Our union list has 49,924 unique DOIs that are indexed as a retracted publication by one or more of Crossref, Retraction Watch, Scopus, and Web of Science. As shown in Table 2, these were consolidated and merged from the 91,995 records retrieved.

Table 2. After deduplication and checking for DOIs, we get a merged list of 49,924 unique records with DOI.

Source             Query results retrieved   Records with DOI   Records without DOI removed   Duplicate records removed
Crossref                            14,745             14,742                             0                           3
Retraction Watch                    39,301             33,423                         5,828                          50
Scopus                              21,515             21,094                            49                         372
Web of Science                      16,434             15,247                         1,126                          61
Total                               91,995             84,506                         7,003                         486
Total (Unique)                                         49,924

Figures 1 and 2 show the overlap between sources. Among the 49,924 unique DOIs, only 1,593 (3%) were found in all four sources, while a total of 24,471 (49%) purportedly retracted publications were found in only one source: 9,937 (20%) in Crossref, 8,443 (17%) in Retraction Watch, 5,056 (10%) in Scopus, and 1,035 (2%) in Web of Science.

Figure 1: DOIs were retrieved as retracted publications in 1, 2, 3, or 4 different sources. [7]

Figure 2: The overlap between sources, limited to the DOIs indexed as retracted within a named source. The totals retrieved (to the right of each source name) were either retrieved as retracted publications only from that source (top left number in each box) or shared with 1, 2, or 3 other sources. Pairwise overlaps are given in the table to the right.

4.2. Results for RQ2: How much agreement does each source have with the union list, restricting to its coverage?

The RetractionIndexingAgreement_SOURCE indicates the percentage of covered items, shared with the union dataset, that are indexed as retracted. Agreement is 100% for Retraction Watch, which only provided retracted publications; 95.67% for Web of Science; 92.85% for Crossref; and 62.29% for Scopus. Coverage differs for each database, and Figure 3 compares the number of DOIs from our union list that are indexed as retracted in a source (blue) with those covered but not indexed as retracted (orange) in that source. Coverage was checked April 9, 2023 with the Crossref API, Scopus API [8], and Web of Science API [9].

Figure 3: Number of records that are covered but not indexed as retracted; and indexed as retracted in each source.

Figure 4: The proportion of our 49,924 DOIs that are: not covered; covered but not indexed as retracted; and indexed as retracted in each of Crossref, Retraction Watch, Scopus, and Web of Science.

4.3. Results for RQ3: Does the level of agreement in DOIs indexed as retracted publications vary by field, publication year, or retraction year?

While publication years range from 1940 to 2023 (Figure 6), interestingly, the first disagreement for DOIs in our union list appears in publication year 2016: about 570 DOIs were covered but not indexed as retracted in some source. The highest disagreement, over 2,000 DOIs, was recorded in publication year 2019.

Figure 6: Publication year distribution for our 49,924 DOIs.

The publication year distribution varies by RetractionIndexingAgreement_DOI, and as shown in Figure 7, agreement of 50% and 66% is found from 2016 forward. By contrast, 25% agreement is found only in publications from 2022; 33% agreement is found only in publications from 2021 to 2023; and 75% agreement is found mostly in publications from 2022 with some from 2021.

Figure 7: Publication year distribution for each RetractionIndexingAgreement_DOI score.

The retraction year distribution (Figure 8) is roughly similar to the publication year distribution.

We have the retraction year for 43,584 DOIs (87%). All DOIs from Retraction Watch include a retraction year. We currently lack a retraction year for 6,340 items: those found only in Scopus (4,869; 9.75%), only in WoS (1,035; 2.07%), only in Crossref (1; 0%), in both Scopus and WoS (245; 0.49%), in both Crossref and Scopus (154; 0.31%), and in Crossref, Scopus, and Web of Science (36; 0.07%).

Figure 8: Retraction year distribution for each RetractionIndexingAgreement_DOI score, limited to the 43,584 (87%) DOIs with retraction year in our records.

Figure 9 shows the prevalence of Life Science DOIs and, to a lesser extent, Physical Science and Health Science DOIs.

Figure 9: Field categorization of the 49,924 DOIs.

4.4. Results for RQ4: For a sample of DOIs with less than 100% agreement in retraction indexing, does the publisher's website indicate that they are retracted publications?

We confirmed that 115/201 (57.21%) DOIs were retracted publications (including withdrawn or removed articles), as shown in Table 3. The most common indexing error was retraction notices indexed as if they were retracted publications: 59/201 (29.35%). [10] Under "retraction-related publications" we group expressions of concern, temporary removals, and retracted-and-republished articles; removed or purportedly retracted publications whose retraction notice we could not immediately locate; and a few other retraction-related items, such as publications whose duplicates had been removed or retracted.

Table 3. Categorization of 201 DOIs we manually checked.

Number of DOIs   Percentage   Description
115              57.21%       Retracted publication (including withdrawn or removed articles) [11]
59               29.35%       Retraction notice
14               6.97%        Non-retracted publication that has a correction
11               5.47%        Retraction-related publications
2                1.00%        No sign of retraction

5. Discussion and Conclusions

We created a union list of DOIs indexed as retracted in one or more of Crossref, Retraction Watch, Scopus, and Web of Science. Among the 49,924 unique DOIs, only 1,593 (3%) were found in all four sources, while 24,471 (49%) purportedly retracted publications were found in only one source. Agreement with the union list, taking coverage into consideration, is 100% for Retraction Watch, which only provided retracted publications; 95.67% for Web of Science; 92.85% for Crossref; and 62.29% for Scopus. The retraction year and publication year distributions are roughly similar, with disagreements starting in publication year 2016 and most disagreements in publications from 2021 forward with retraction years of 2022 or later.

5.1 Limitations

We manually examined only a very small number of articles (201). Some DOIs indexed as retracted publications were not, in fact, retracted, withdrawn, or removed; many were retraction notices.

We removed 7,003 records that had no DOI: as shown in Table 2, Retraction Watch had 5,828 records without DOIs, Scopus 49, and Web of Science 1,126. We estimate we have lost information about 8-12% of our records that have no DOI (range: 5,828 − 1,126 − 49 = 4,653 to 7,003, out of the 7,003 + 49,924 = 56,927 total records).

In calculating agreement metrics, we have a choice in how to handle the DOIs that were uniquely contributed by each source. We have defined our agreement metric so that a source agrees on any DOI it covers and indexes as retracted, even when that source was the only one to contribute the DOI. A stronger metric would consider the presence of uniquely contributed items a disagreement.

5.2 Discussion and Future Work

Disagreement in retraction indices seems largely to be due to two types of errors: retracted publications with DOIs missing retraction indexing in a source that covers them; and misindexing of DOIs, especially retraction notices and corrigenda.

In the future we would like to better understand how metadata flows between sources, where multiple types of problems seem likely. In examining the data we also find discrepancies between publisher websites and metadata: for example, Figure 10 shows that the retraction year is 2022 on the publisher website but 2019 in the Crossref metadata.

Figure 10: Discrepancies in data for DOI:10.1016/j.yexmp.2018.12.005 as of April 15, 2023.

Left, publisher page from ScienceDirect https://doi.org/10.1016/j.yexmp.2018.12.005

Right, data from Crossref http://api.crossref.org/works/10.1016/j.yexmp.2018.12.005

Sharing hand-validated metadata as well as metadata quality procedures could be helpful in the future. Currently, only existing public domain data sources such as Crossref and PubMed can be readily shared. License agreements are another mechanism for sharing; for instance, Clarivate, the parent organization of Web of Science, licenses Retraction Watch data for EndNote and presumably could use it for Web of Science as well. More disagreement was found in items retracted in 2022 and 2023, suggesting that data sharing might be helping, but might need more frequent updating. Our results suggest significant room for improvement in retraction indexing quality in these multidisciplinary sources. Fully automatic processes will not be sufficient for creating a comprehensive union list from our current sources, in their current state of data quality.

Open science practices

Code is available at: https://github.com/infoqualitylab/retraction-indexing-agreement and archived in Zenodo as http://doi.org/10.5281/zenodo.7851298

We have shared data from the Crossref API at a temporary sharing link and will ultimately register a DOI for this data: https://databank.illinois.edu/datasets/IDB-9099305?code=AlD6KWmLk4ekAyq1Dj445-RCYJ6spZPt6TySNZwF1BM

We have shared the keywords used to manually identify fields at a temporary sharing link and will ultimately register a DOI for this data: https://databank.illinois.edu/datasets/IDB-8847584?code=Shd4NY0xgh7YWpfIMtAooESBKBcEwkV1LZPmPtXSyzc

Data for this study is licensed by each source. Only the Crossref API grants us the right to share the data we've collected. For Retraction Watch data, we used data made available by The Center for Scientific Integrity, the parent nonprofit organization of Retraction Watch, subject to a standard data use agreement. Database subscribers can retrieve the retracted publications listed in Scopus and Web of Science from the user interface, as shown in Table 1. Note that checking coverage in Scopus requires specific permission, since the Academic Use Case of the Scopus API is limited to a single subject area.

Acknowledgments

For feedback, we thank members of the NISO Communication of Retractions, Removals, and Expressions of Concern aggregator/end user subgroup. This work would not have been possible without data from the sources used. Thank you to Crossref for providing a public REST API with data that may be used for any purpose. Thank you to Web of Science for API data access. Thank you to the Elsevier ICSR Labs and Scopus API teams for facilitating data access. We particularly acknowledge The Center for Scientific Integrity for free provision of Retraction Watch data for scientometric and data quality research. We appreciate feedback we received from Tilla Edmunds, Yuanxi Fu, Tzu-Kun Hsiao, Rachael Lammey, and Ivan Oransky.

Author contributions

CRediT:

Conceptualization: Jodi Schneider

Data Curation: Heng Zheng and Jodi Schneider

Funding acquisition: Jodi Schneider

Investigation: Jou Lee, Malik Salami, Jodi Schneider, and Heng Zheng

Methodology: Malik Salami and Jodi Schneider

Project administration: Jodi Schneider

Resources: Jodi Schneider

Software: Jou Lee and Tzu-Kun Hsiao

Supervision: Jodi Schneider

Validation: Malik Salami and Heng Zheng

Visualization: Jou Lee

Writing – original draft: Jodi Schneider, Malik Salami, and Heng Zheng

Writing – review & editing: Jodi Schneider

Competing interests

JL, HZ, and MS declare no competing interests.

JS declares non-financial associations with Crossref; COPE; International Association of Scientific, Technical and Medical Publishers; the National Information Standards Organization; and the Center for Scientific Integrity (parent organization of Retraction Watch). The National Information Standards Organization is a subawardee on her Alfred P. Sloan Foundation grant G-2022-19409.

Funding information

This project was funded by Alfred P. Sloan Foundation G-2022-19409 Reducing the Inadvertent Spread of Retracted Science II: Research and Development towards the Communication of Retractions, Removals, and Expressions of Concern.

References

Bakker, C., & Riegelman, A. (2018). Retracted publications in mental health literature: Discovery across bibliographic platforms. Journal of Librarianship and Scholarly Communication, 6(1), eP2199. https://doi.org/10.7710/2162-3309.2199

COPE Council. (2019). Retraction guidelines. https://doi.org/10.24318/cope.2019.1.4

Dal-Ré, R., & Ayuso, C. (2020). For how long and with what relevance do genetics articles retracted due to research misconduct remain active in the scientific literature. Accountability in Research, 28(5), 1–17. https://doi.org/10.1080/08989621.2020.1835479

Decullier, E., & Maisonneuve, H. (2018). Correcting the literature: Improvement trends seen in contents of retraction notices. BMC Research Notes, 11(1), 490. https://doi.org/10.1186/s13104-018-3576-2

Frampton, G., Woods, L., & Scott, D. A. (2021). Inconsistent and incomplete retraction of published research: A cross-sectional study on Covid-19 retractions and recommendations to mitigate risks for research, policy and practice. PLoS ONE, 16(10), e0258935. https://doi.org/10.1371/journal.pone.0258935

Genot, E. J., & Olsson, E. J. (2021). The dissemination of scientific fake news: On the ranking of retracted articles in Google. In The Epistemology of Fake News. Oxford University Press. https://doi.org/10.1093/oso/9780198863977.003.0011

Hsiao, T.-K., & Schneider, J. (2021). Continued use of retracted papers: Temporal trends in citations and (lack of) awareness of retractions shown in citation contexts in biomedicine. Quantitative Science Studies, 2(4), 1144–1169. https://doi.org/10.1162/qss_a_00155

Kotzin, S., & Schuyler, P. L. (1989). NLM’s practices for handling errata and retractions. Bulletin of the Medical Library Association, 77(4), 337–342.

Malički, M., Utrobičić, A., & Marušić, A. (2019). Correcting duplicate publications: Follow up study of MEDLINE tagged duplications. Biochemia Medica, 29(1), 010201. https://doi.org/10.11613/BM.2019.010201

Mine, S. (2019). Toward responsible scholarly communication and innovation: A survey of the prevalence of retracted articles on scholarly communication platforms. Proceedings of the Association for Information Science and Technology, 56, 738–739. https://doi.org/10.1002/pra2.155

Proescholdt, R., & Schneider, J. (2020, October 22). Retracted papers with inconsistent document type indexing in PubMed, Scopus, and Web of Science [poster]. METRICS 2020 workshop at ASIS&T 2020. https://hdl.handle.net/2142/110134

Schmidt, M. (2018). An analysis of the validity of retraction annotation in PubMed and the Web of Science. Journal of the Association for Information Science and Technology, 69(2), 318–328. https://doi.org/10.1002/asi.23913

Schneider, J., Ye, D., Hill, A. M., & Whitehorn, A. S. (2020). Continued post-retraction citation of a fraudulent clinical trial report, 11 years after it was retracted for falsifying data. Scientometrics, 125(3), 2877–2913. https://doi.org/10.1007/s11192-020-03631-1

Snodgrass, G. L., & Pfeifer, M. P. (1992). The characteristics of medical retraction notices. Bulletin of the Medical Library Association, 80(4), 328–334.

Suelzer, E. M., Deal, J., Hanus, K., Ruggeri, B. E., & Witkowski, E. (2021). Challenges in identifying the retracted status of an article. JAMA Network Open, 4(6), e2115648. https://doi.org/10.1001/jamanetworkopen.2021.15648


  1. As retrieved from each data source, before deduplication and before checking for DOIs.

  2. This data was downloaded from the Scopus API on April 5, 2023, via http://www.scopus.com.

  3. https://www.elsevier.com/?a=91122; this contained 58.22% (4,644/7,977) of the journals and conferences associated with the DOIs.

  4. https://pypi.org/project/yake/

  5. We have shared the keywords used to manually identify fields at a temporary sharing link and will ultimately register a DOI for this data: https://databank.illinois.edu/datasets/IDB-8847584?code=Shd4NY0xgh7YWpfIMtAooESBKBcEwkV1LZPmPtXSyzc

  6. We did not retrieve retraction year from Scopus or Web of Science since it was not available in the bulk download options from the user interface.

  7. Figure 1 was created using http://bioinformatics.psb.ugent.be/webtools/Venn/

  8. Via http://api.elsevier.com and http://www.scopus.com.

  9. No separate search is needed in Retraction Watch since it only covers items it deems retracted.

  10. We counted as retracted publications 12/201 (5.97%) DOIs that are shared by both the retracted article and its retraction notice.

  11. Fully distinguishing these categories is difficult because publishers may leave in place the full text of an article they describe as withdrawn, or take down the full text of an article they describe as retracted. Of the 201 DOIs we checked, in our judgement 87/201 (43.28%) were retracted articles, 24/201 (11.94%) were withdrawn articles, and 4/201 (1.99%) were removed articles.
