This study investigates the ontological features of retraction notices. Through text mining and analysis of 12,940 retraction notices from Web of Science and the original websites, we found that most retraction notices contain basic elements such as who requested the retraction and the reasons for retraction, but fewer mention process information such as the retraction policy, who conducted the investigation, and whether the author was contacted and responded. The authorship information of retraction notices reflects the current opacity in identifying who writes them. Sentiment analysis results indicate that retraction notices satisfy the language requirements of objectivity and neutrality. This study attempts to show the overall status of current retraction practice, to inform readers about what can and cannot be learned from retraction notices, and to provide some ideas for regulating the retraction system and maintaining research integrity.
Shuying Chen1, Hui-Zhen Fu1*
Department of Information Resources Management, School of Public Affairs, Zhejiang University, China
Retracting problematic scientific publications has become a pressing concern within the scientific community, prompting researchers to view retraction as an effective and visible measure for self-correcting errors, maintaining scientific norms, and promoting academic progress (Katavić, 2014; Hesselmann et al., 2017). A retraction notice, which is a written notice identifying the retraction of a scientific publication, is a critical component of the retraction system. The content of the retraction notice not only corrects the retracted article but also reflects the current state of the retraction system and provides insight into scientific research and publication processes.
The scientific community and the public rely on retraction notices to learn about the problems of a retracted article, the reasons for its retraction, who is responsible for the retraction, and the process the paper went through from discovery to retraction. Therefore, retraction notices must include as much pertinent information as possible, be factual, and reflect the details of the problem's discovery, investigation process, results, and follow-up measures. Incomplete or vague retraction notices are not effective in informing the scientific community and the public about the problem.
Although bibliometric studies have analyzed the characteristics of and reasons for paper retractions, fewer studies have focused on retraction notices themselves. By analyzing the textual content of retraction notices, we can better understand how journals retract articles and assess the transparency of scientific research and publication processes, which in turn helps reduce anomie and maintain scientific norms.
Elia et al. (2014) based their study on a 2011 survey by the State Medical Association of Rheinland-Pfalz, Germany. The survey identified 88 articles published in 18 journals that lacked formal ethical approval and warranted retraction, which involved 79 retraction notices. These notices were compared with all criteria for adequate retraction notices in the guidelines of the Committee on Publication Ethics (COPE). The results showed that the format of retraction notices was consistent within each journal but differed between journals. Only 15 retraction notices met all predefined COPE criteria for adequate retraction notices.
A study on retractions in the journal Science between 1983 and 2017 found that in 60% of the retractions all authors of the retracted paper signed the published retraction (Wray & Andersen, 2018). Wager and Williams (2011) analyzed Medline retractions from 1988 to 2008 and found that 63% of retractions were issued by authors, while 21% were issued by editors. Vuong (2020) analyzed a dataset of 2046 retraction notices downloaded from WOS and found that 53% of the notices did not include information about who initiated the retraction. Additionally, the information included in these notices was often sparse and vague, making it difficult for readers to understand specific details.
The disclosure of retraction reasons in retraction notices varies across studies. Tripathi et al. (2018) and Wager and Williams (2011) both found that most retraction notices did not provide clear reasons for retraction, based on 249 retraction notices in the Scopus database from 2000 to 2017 and on Medline retractions, respectively. However, another study found that 91% of retraction notices mentioned information about retraction reasons, problems with the retracted paper, and related evidence (Vuong, 2020). This discrepancy may be due to the different data sources the researchers used to collect retraction notices.
Decullier and Maisonneuve (2018) found that, of 244 retraction notices published in 2008 (apart from 9 notices that could not be retrieved), only 21 (9%) did not provide clear reasons for retraction. The original paper or its location was mentioned in 233 cases (95%).
The data consist of three parts. The first part is the basic data on retraction notices, sourced primarily from the Web of Science Core Collection and covering the period from 1900 to 2022. We searched with queries such as TI="retraction notice" and TI="retracted article" to retrieve retraction notices and retracted articles, respectively, and downloaded the full records for both. Retraction notices and retracted articles were then matched on title, authors, and other information. The data were cleaned by manual inspection, matching, and removal of duplicates to produce the preliminary basic data. For retraction notices and retracted articles that could not be matched, additional information was obtained through manual searches.
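The matching step can be illustrated with a minimal Python sketch. It assumes each record is a dictionary with `id`, `title`, and `authors` keys, and that titles carry decorations such as "Retraction notice to:" or "(Retracted article. See ...)". These field names and title patterns are illustrative assumptions, not the actual export format or the procedure used in this study.

```python
import re
import unicodedata

# Hypothetical title decorations; actual Web of Science title formats may differ.
_DECORATIONS = re.compile(
    r"\(retract(?:ed article|ion of)[^)]*\)|^retraction notice(?: to)?:?",
    re.IGNORECASE,
)


def normalize_title(title):
    """Strip retraction decorations, accents, punctuation, and case from a title."""
    text = unicodedata.normalize("NFKD", title.strip())
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = _DECORATIONS.sub(" ", text)
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def match_records(notices, articles):
    """Pair notices with articles by normalized title; return (matched, unmatched)."""
    index = {}
    for article in articles:
        index.setdefault(normalize_title(article["title"]), []).append(article)

    matched, unmatched = [], []
    for notice in notices:
        candidates = index.get(normalize_title(notice["title"]), [])
        if len(candidates) > 1:
            # Tie-break on first author when several articles share a title.
            first = notice["authors"][0].lower()
            candidates = [a for a in candidates if a["authors"][0].lower() == first]
        if len(candidates) == 1:
            matched.append((notice["id"], candidates[0]["id"]))
        else:
            unmatched.append(notice["id"])  # left for manual searching
    return matched, unmatched
```

Records that end up in `unmatched` correspond to the cases that were resolved through manual searches.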
The second part of the data comprises detailed information on retraction notices. By following the links in the Web of Science, we accessed the original websites where the retraction notices were published. We recorded the full text of the retraction notices, PDF download links, and watermarks.
The third part is other relevant data. We collected guidelines or recommendations on the content and specifications of retraction notices from the COPE, ICMJE, and Retraction Watch websites, collected data on the reasons for retraction from the Retraction Watch website using crawlers, and downloaded the Journal Citation Reports for 2021.
After selecting retraction notices with relatively complete textual information, we obtained a total of 12,940 retraction notices and the corresponding basic and supplementary information for the retracted articles.
Our research focuses on text mining and analysis of the content of retraction notices, exploring their content elements and text structure through statistical analysis and sentiment analysis. We also explore the gap between retraction practice and norms by comparing the notices with international norms such as the COPE guidelines.
4.1 Content elements of a retraction notice
We quantified the observed and identified retraction notice information by recording each potential element in two dimensions. Firstly, we determined whether the retraction notice text contained the respective element or not, in other words, whether the element was mentioned in the retraction notice or not, to explore the overall content and textual structure of the retraction notice. Secondly, for some elements, we conducted a more in-depth analysis, such as categorization and proportion, typical expressions, and so on.
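The first dimension, whether a notice mentions an element at all, can be sketched as a keyword-based presence check followed by a tally. The cue phrases and element names below are hypothetical placeholders chosen for illustration; the actual coding rules used in this study are not reproduced here.

```python
import re

# Hypothetical cue phrases for each element; not the study's actual coding scheme.
ELEMENT_PATTERNS = {
    "requester": r"at the request of|retracted by the (?:authors?|editors?|publisher)",
    "reason": r"\bdue to\b|\bbecause\b|\bowing to\b",
    "investigator": r"investigation (?:by|conducted by)|investigated by",
    "policy": r"\bCOPE\b|\bICMJE\b|retraction policy",
    "author_contact": r"authors? (?:were|was|have been) (?:contacted|notified)"
                      r"|authors? (?:agree|did not respond)",
}


def detect_elements(text):
    """Return {element: True/False} for one retraction notice text."""
    return {
        name: bool(re.search(pattern, text, re.IGNORECASE))
        for name, pattern in ELEMENT_PATTERNS.items()
    }


def element_shares(notices):
    """Return {element: (count, percentage)} over a list of notice texts."""
    counts = {name: 0 for name in ELEMENT_PATTERNS}
    for text in notices:
        for name, present in detect_elements(text).items():
            counts[name] += present
    total = len(notices)
    return {
        name: (n, round(100 * n / total, 1) if total else 0.0)
        for name, n in counts.items()
    }
```

A keyword check of this kind only approximates manual coding, so in practice its output would need to be validated against a hand-labelled sample.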
Table 1 shows whether each element is mentioned in retraction notices. A large majority of retraction notices provide both the main information of who requested the retraction (93.1%) and an explanation of the reason for retraction (97.1%). Additionally, 32% of retraction notices conform to the COPE guideline of contacting the authors before retracting the article, 27.4% indicate the policy upon which the retraction is based, and 25.4% specify the party responsible for investigating the article.
Table 1. Elements of the retraction notice
|Element category|Number of retraction notices|Percentage|
|Who requested the retraction|12050|93.1%|
|Who conducted the investigation|3284|25.4%|
|Contact the author and author response|4138|32.0%|
Based on the definition of third-party investigations in the Retraction Watch reason categories, combined with our reading and analysis of the notice text, investigators were classified into five categories: authors, journals, publishers, ORI, and the third party. Of the 3284 retraction notices that mentioned an investigator, 2916 (88.8%) mentioned a single investigator while 368 (11.2%) mentioned two or more investigators. Notices from either journals or authors as investigators exceeded 30%.
Table 2. Investigator mentioned in the retraction notices
|Category|Investigator|Number of retraction notices (percentage)|
|Number of investigators mentioned|Only one investigator|2916 (88.8%)|
||Two or more investigators|368 (11.2%)|
|Who investigated the retraction|Author|809 (22.2%)|
||The third party|1098 (30.1%)|
Figure 1. Investigator mentioned in the retraction notices
The retraction policy is the basis on which journals retract papers. Here it refers to whether the notice states which standards and policies the retraction is based on, or whether those standards and policies have been followed. Reading the retraction notices showed that the retraction policy is mentioned in two main ways. The first is a reference to the COPE guidelines on retraction or to the ICMJE policy. The second is a reference to the journal's own retraction policy. Of the 3541 retraction notices that mentioned a policy, 2112 (16.3% of all 12,940 notices) mentioned the journal's retraction policy, 1425 (11.01%) mentioned the COPE guidelines on retraction, and 22 (0.17%) mentioned the ICMJE retraction policy.
Table 3. Mention of retraction policy in the retraction notice
|Category|Mention of the journal's own retraction policy|Mention of following COPE's guidelines|Mention of following ICMJE's policy|
|Number of retraction notices (share of all 12,940 notices)|2112 (16.3%)|1425 (11.01%)|22 (0.17%)|
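The percentages reported for policy mentions take all 12,940 notices as the denominator, not the 3541 notices that mention a policy. The short check below reproduces that arithmetic from the counts given in the text; the variable names are illustrative.

```python
TOTAL_NOTICES = 12940        # all retraction notices in the dataset
NOTICES_WITH_POLICY = 3541   # notices mentioning any retraction policy

# Counts of notices mentioning each kind of policy, as reported in the text.
policy_mentions = {
    "journal's own policy": 2112,
    "COPE guidelines": 1425,
    "ICMJE policy": 22,
}


def share(count, total):
    """Percentage of `total` represented by `count`, rounded to two decimals."""
    return round(100 * count / total, 2)


shares = {name: share(n, TOTAL_NOTICES) for name, n in policy_mentions.items()}
overall_share = share(NOTICES_WITH_POLICY, TOTAL_NOTICES)

# The category counts sum to more than the number of notices with a policy,
# which is consistent with some notices citing more than one policy.
surplus_mentions = sum(policy_mentions.values()) - NOTICES_WITH_POLICY
```

The three category counts add up to 3559, which is 18 more than 3541, so the categories are not mutually exclusive.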
4.2 The author of retraction notices
According to COPE, a published research article can be retracted by the author(s), journal editor, publisher, or academic organization alone or in collaboration with other parties. Additionally, the retraction entity may also involve lawyers, research integrity offices, the author's institution, or other entities. It is important to note that the retraction of an article should be considered carefully and only done in valid circumstances.
Analysis of the authorship information showed that 9861 of the retraction notices (76%) had the same authors as the retracted article. The first author and corresponding author of a paper bear the primary responsibility for its publication. The first author usually undertakes the majority of the research, writing, and editing, and makes the greatest contribution. The corresponding author has higher authority among the co-authors and is the direct contact person for the paper, responsible for ensuring its reliability. In the remaining 3079 retraction notices, 2145 (70%) had the same first author as the retracted article.
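The authorship comparison can be sketched as follows. The sketch assumes that "same authors" means identical author lists regardless of order, and that names are given as "Surname, Initials" strings; both are assumptions made for illustration, and real author strings would need more careful disambiguation.

```python
from collections import Counter


def normalize_author(name):
    """Lowercase a name and drop commas, periods, and extra whitespace."""
    return " ".join(name.replace(",", " ").replace(".", " ").lower().split())


def compare_authorship(notice_authors, article_authors):
    """Classify a notice as 'same authors', 'same first author', or 'different'."""
    notice = [normalize_author(a) for a in notice_authors]
    article = [normalize_author(a) for a in article_authors]
    if notice and sorted(notice) == sorted(article):
        return "same authors"
    if notice and article and notice[0] == article[0]:
        return "same first author"
    return "different"


def tally(pairs):
    """Count categories over (notice_authors, article_authors) pairs."""
    return Counter(compare_authorship(n, a) for n, a in pairs)
```

Notices signed by a journal or editorial office would fall into the "different" category under this scheme.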
4.3 The sentiment of retraction notices
Retraction notices serve as a critical component of scientific publishing by providing transparency and accountability to the research process. However, the language used in these notices can have a significant impact on readers and the scientific community. It is crucial that retraction notices are written objectively and accurately to ensure that readers fully understand the reason for the retraction and the severity of the issue.
According to the COPE guidelines, retraction notices should be clear, specific, and provide a detailed explanation of the reason for the retraction. The language used should be factual, concise, and free from emotional or subjective opinions. This is particularly important because the emotional impact of retracted research can have serious consequences for both the scientific community and the public at large. Similarly, the ICMJE guidelines emphasize the importance of transparency in scientific publishing and urge authors to ensure that their work is accurate and free from misconduct. The guidelines also emphasize the importance of prompt correction and retraction of any inaccuracies or errors.
As a specialized genre of scientific writing, retraction notices should adhere to a specific set of standards to maintain their integrity and value in the scientific record. By providing clear, objective, and accurate language, retraction notices can serve as a critical tool for maintaining the transparency and credibility of the scientific publishing process.
The original text of each retraction notice was pre-processed and subjected to sentiment analysis to obtain two scores: polarity and subjectivity. Polarity represents the emotional tendency of the text from negative to positive and ranges from -1.0 to 1.0. Text with negative sentiment scores between -1.0 and 0.0, neutral text scores 0.0, and text with positive sentiment scores between 0.0 and 1.0. The average polarity score was 0.04, indicating that the emotional tendency of the retraction notice text was very weak and close to neutral.
Subjectivity indicates how much personal feeling, opinion, or belief the text contains. It takes a value between 0.0 and 1.0, where 0.0 is very objective and 1.0 is very subjective. A higher subjectivity score means that the text contains more personal opinion and less factual information. The average subjectivity score was 0.36, indicating that subjective factors were present to some extent in the retraction notice text, but overall it was still relatively objective.
Figure 2. The polarity and subjectivity of the text of a retraction notice
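To make the two scores concrete, the sketch below computes polarity and subjectivity with a toy lexicon, averaging over the words that appear in the lexicon. The lexicon entries and their values are invented for illustration, and the sentiment tool actually used in this study is not specified here; this is only a simplified stand-in for a lexicon-based scorer.

```python
import re

# Toy lexicon: word -> (polarity in [-1, 1], subjectivity in [0, 1]).
# All entries and values are hypothetical.
LEXICON = {
    "regret": (-0.5, 0.8),
    "unreliable": (-0.6, 0.7),
    "serious": (-0.3, 0.7),
    "apologize": (-0.2, 0.6),
    "sincerely": (0.5, 0.8),
    "honest": (0.6, 0.9),
}


def score(text):
    """Return (polarity, subjectivity) averaged over lexicon words in `text`."""
    words = re.findall(r"[a-z]+", text.lower())
    hits = [LEXICON[w] for w in words if w in LEXICON]
    if not hits:
        return 0.0, 0.0  # no sentiment-bearing words: neutral and objective
    polarity = sum(p for p, _ in hits) / len(hits)
    subjectivity = sum(s for _, s in hits) / len(hits)
    return polarity, subjectivity


def mean_scores(texts):
    """Average polarity and subjectivity over a collection of notices."""
    scores = [score(t) for t in texts]
    if not scores:
        return 0.0, 0.0
    n = len(scores)
    return sum(p for p, _ in scores) / n, sum(s for _, s in scores) / n
```

Under this scheme a purely factual notice with no lexicon words scores 0.0 on both dimensions, which is the sense in which a corpus mean near zero polarity reads as neutral.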
The study found that retraction notices differ in content and composition, and also in the amount of information they contain. Most retraction notices disclose the reason for the retraction and indicate who requested the retraction. This helps readers who were interested in the article or researchers who cited it understand why the paper was retracted and who first detected the problem. However, information on procedural matters, such as whether the journal contacted the author and received a response, and who conducted the investigation, is still limited in most cases, and only some notices provide the basis for the retraction.
A complete retraction notice should provide more comprehensive and accurate information in accordance with the requirements of COPE and ICMJE. The mechanism and format requirements for publishing retraction notices remain largely unregulated, and there is still a long way to go.
In terms of authorship, most retraction notices have the same authors as the retracted article. Where the authors of the retraction notice and the retracted paper are not identical, they most commonly still share the same first author. Another situation of concern is that group authors are an important type of retraction notice author: many journals publish retraction notices with the journal or editorial office as the author.
Retraction notices should provide sufficient information for readers to know who is retracting and why the findings are considered unreliable, while also clearly distinguishing between misconduct and honest errors. However, retraction notices often need to strike a balance between providing adequate information and avoiding defamation or slander (Moylan & Kowalczuk, 2016). The results of sentiment analysis show that the language of retraction notices follows the principles of objectivity, truthfulness, and clarity. The sentiment inclination of the notices is very weak, close to neutral, and overall quite objective.
This study is still in progress, and analysis of other elements will be the next step. We will further refine the content analysis of retraction notices and analyze the relationship between the content structure and language expression of retraction notices and the academic impact of retracted articles.
This study reveals significant variation in the elements included in retraction notices. While over 90% of notices provide information about who requested the retraction and the reasons for retraction, fewer than half disclose who investigated the paper, the policy rationale, or whether the author was contacted. Furthermore, the authorship of retraction notices is often unclear: 76% of notices list the same authors as the retracted papers, or equate the notice authors with those who performed the retraction, which obscures the provenance of the notices. Despite this, the language used in retraction notices mostly adheres to principles of objectivity and accuracy. To improve transparency and consistency in the scientific publishing process, the scientific community must prioritize standardizing retraction practices and promote greater integrity and credibility in scientific research.
The authors have no competing interests.
Open science practices
The data for this study are available from the Web of Science Core Collection.
Decullier, E., & Maisonneuve, H. (2018). Correcting the literature: Improvement trends seen in contents of retraction notices. BMC research notes, 11(1), 1-3.
Elia, N., Wager, E., & Tramèr, M. R. (2014). Fate of articles that warranted retraction due to ethical concerns: a descriptive cross-sectional study. PLoS One, 9(1), e85846.
Hesselmann, F., Graf, V., Schmidt, M., & Reinhart, M. (2017). The visibility of scientific misconduct: A review of the literature on retracted journal articles. Current sociology, 65(6), 814-845.
Katavić, V. (2014). Retractions of scientific publications: responsibility and accountability. Biochemia medica, 24(2), 217-222.
Moylan, E. C., & Kowalczuk, M. K. (2016). Why articles are retracted: a retrospective cross-sectional study of retraction notices at BioMed Central. BMJ open, 6(11), e012047.
Tripathi, M., Dwivedi, G., Sonkar, S. K., et al. (2018). Analysing retraction notices of scholarly journals: A study. Desidoc Journal of Library & Information Technology, 38(5), 305-311.
Vuong, Q. H. (2020). The limitations of retraction notices and the heroic acts of authors who correct the scholarly record: An analysis of retractions of papers published from 1975 to 2019. Learned Publishing, 33(2), 119-130.
Wager, E., & Williams, P. (2011). Why and how do journals retract articles? An analysis of Medline retractions 1988–2008. Journal of medical ethics, 37(9), 567-570.
Wray, K. B., & Andersen, L. E. (2018). Retractions in science. Scientometrics, 117(3), 2009-2019.