Can tweets predict article retractions?

This study explores the potential of tweets to predict article retractions by analyzing the Twitter mentions of retracted articles (the treatment group) and of matched unretracted articles (the control group). The results show that machine learning models trained on tweets can predict article retractions with an accuracy of 57%-60%, whereas sentiment analysis is not effective for this task. The study sheds light on a novel method for detecting scientific misconduct at an early stage.


Introduction
Scientific misconduct and questionable research practices have become more prevalent in recent years, undermining the credibility of scientific research. Traditional methods for identifying problematic articles have focused primarily on text-based plagiarism (Eysenbach, 2000; Wager, 2011) and image manipulation (Koppers et al., 2017; Parker et al., 2022; Pflugfelder, 2022), but they are limited in detecting more sophisticated forms of misconduct, such as data falsification and authorship issues. To address this challenge, alternative sources of information can be explored, such as reader comments on social media. Twitter has emerged as a major venue for discussion of scientific articles, accounting for over 80% of comments across all platforms (Peng et al., 2022). Although some research has examined Twitter mentions of retracted articles (Bornmann & Haunschild, 2018; Haunschild & Bornmann, 2021), large-scale studies of Twitter's potential as a tool for detecting misconduct in scientific research are lacking. In this study, we aim to fill this gap by addressing the following research questions: 1. Are there differences in sentiment between tweets about retracted and unretracted articles? 2. Can machine learning models predict article retractions from tweets?

Data
We collected a total of 9,364 retracted articles from the Web of Science (WoS) and Retraction Watch databases, all published between 2012 and 2021. Of these, we identified 3,628 articles (38.7%) that had been mentioned at least once on Twitter; these served as the treatment group for our analysis.

Methods

Coarsened Exact Matching (CEM)
To establish a control group, we employed the coarsened exact matching (CEM) technique to match 3,505 unretracted articles to 3,505 retracted articles, requiring each pair to come from the same issue of the same journal and to have a similar number of authors and tweets. We then assigned each retracted article's retraction time lag to its matched unretracted article, thereby restricting the analysis to pre-retraction tweets when predicting whether an article would be retracted. As a result, we obtained 15,383 tweets mentioning the retracted articles and 11,031 tweets mentioning the unretracted articles (hereinafter, retracted tweets and unretracted tweets).
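The matching step can be sketched as follows: continuous covariates (author and tweet counts) are coarsened into bins, and articles are then matched exactly on the resulting strata. The field names and bin boundaries below are illustrative assumptions, not the study's actual configuration.

```python
from collections import defaultdict

# Hypothetical coarsening scheme for count covariates.
COUNT_BINS = {"1": (1, 1), "2-3": (2, 3), "4-9": (4, 9), "10+": (10, 10**9)}

def coarsen(n, bins=COUNT_BINS):
    """Map a raw count onto the label of the bin that contains it."""
    for label, (lo, hi) in bins.items():
        if lo <= n <= hi:
            return label
    return "other"

def cem_match(retracted, unretracted):
    """Pair each retracted article with one unretracted article from the
    same stratum (journal, issue, author bin, tweet bin); articles with
    no stratum partner are dropped, as in CEM."""
    def stratum(art):
        return (art["journal"], art["issue"],
                coarsen(art["n_authors"]), coarsen(art["n_tweets"]))
    pool = defaultdict(list)
    for art in unretracted:
        pool[stratum(art)].append(art)
    pairs = []
    for art in retracted:
        candidates = pool[stratum(art)]
        if candidates:
            pairs.append((art, candidates.pop()))
    return pairs
```

Exact matching on journal and issue with coarsened counts is what keeps the control articles comparable in venue and visibility before any tweets are compared.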

Similarity calculation
We calculated the Levenshtein distance between each tweet's text and the mentioned article's title, in order to exclude tweets that simply echo the title. Figure 1 shows the distribution of the resulting title similarity for both retracted and unretracted tweets. We excluded tweets with a similarity score greater than 90 (on a 0-100 scale), i.e., near-duplicates of the title. Finally, we obtained 10,932 retracted tweets and 6,962 unretracted tweets for further analysis.
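A minimal pure-Python sketch of the edit-distance computation and a 0-100 similarity score of the kind used for this filter (the normalization by the longer string's length is our assumption):

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def title_similarity(tweet: str, title: str) -> float:
    """Normalized similarity on a 0-100 scale; 100 = identical strings.
    Tweets scoring above a threshold (e.g., 90) would be treated as
    title echoes and excluded."""
    longest = max(len(tweet), len(title)) or 1
    return 100 * (1 - levenshtein(tweet, title) / longest)
```

In practice a library such as `python-Levenshtein` would be faster, but the quadratic DP above is the same computation.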

Sentiment analysis
Differences in sentiment between retracted and unretracted tweets may help to predict article retractions. To explore this, we used the TextBlob package in Python to calculate each tweet's sentiment polarity score (ranging from -1 to 1) and sentiment subjectivity score (ranging from 0 to 1).

Machine learning prediction
Machine learning enables computers to learn from data and make predictions or decisions without explicit programming. This study explores whether machine learning models can predict article retractions from tweets and, if so, which model achieves the highest prediction accuracy. Specifically, we employed four classical machine learning models: Naive Bayes (NB), Random Forest (RF), Support Vector Machine (SVM), and Logistic Regression (LR).
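To illustrate the classification setup, here is a minimal multinomial Naive Bayes over tweet words, one of the four model families used. This is a toy sketch with hypothetical training tweets and labels, not the study's pipeline, which would rely on established library implementations.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesText:
    """Minimal multinomial Naive Bayes with Laplace smoothing for
    classifying tweets as retracted vs. unretracted."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, y in zip(texts, labels):
            for w in text.lower().split():
                self.word_counts[y][w] += 1
                self.vocab.add(w)
        return self

    def predict(self, text):
        """Return the label with the highest log posterior."""
        best, best_lp = None, -math.inf
        total = sum(self.class_counts.values())
        for y, n in self.class_counts.items():
            lp = math.log(n / total)  # log prior
            denom = sum(self.word_counts[y].values()) + len(self.vocab)
            for w in text.lower().split():
                lp += math.log((self.word_counts[y][w] + 1) / denom)
            if lp > best_lp:
                best, best_lp = y, lp
        return best

# Hypothetical training data for illustration only.
clf = NaiveBayesText().fit(
    ["results look fabricated", "great robust study"],
    ["retracted", "unretracted"])
clf.predict("fabricated results")  # → "retracted"
```

The other three models (RF, SVM, LR) consume the same text features and differ only in the decision function they learn.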

Sentiment analysis
We calculated sentiment polarity and subjectivity scores for retracted and unretracted tweets (Figure 2) and found no significant differences between the two groups. This suggests that sentiment scores of tweets are of little use for predicting article retractions.

Machine learning prediction
Table 1 lists the accuracy of each machine learning model in predicting article retractions. The accuracies are similar across models, ranging from 57% to 60%, with Logistic Regression performing best. Overall, the results indicate that machine learning models can predict article retractions to some extent.

Conclusions
Although we found that tweets can predict article retractions to some extent, the prediction accuracy still needs to be improved. Given the diversity of social media users and the entertainment-oriented nature of sharing behavior, readers should maintain a critical perspective toward scientific articles shared on social media, as such articles may have issues that warrant careful consideration.