27th International Conference on Science, Technology and Innovation Indicators (STI 2023)
There is an updated version of this publication, open Version 2.
conference paper

Do original tweets and retweets differ in indicating research impact across various subject areas in multidisciplinary papers published in PLoS?

21/04/2023 | By Ashraf Maleki, Kim Holmberg

Abstract

Earlier altmetrics studies have often focused on investigating whether the number of tweets mentioning scientific articles could be used as an indicator of scientific impact or attention, with results showing weak to moderate correlations with citation counts and some disciplinary differences. But all tweets may not be equal, as original tweets and retweets may reflect different levels of engagement and, with that, impact. In this research, the relationship between original tweets and retweets and Scopus citations was analyzed for a total of 330,022 PLoS publications and compared over time and across 22 subject fields. The findings showed that the correlations were strongest between citations and original tweets, and the relationship was stronger in Social Science and Humanities subject fields than in Natural Science, Engineering and Medicine. The results showed that tweets and retweets are very different, and thus they should be considered two different metrics and analyzed separately.

Ashraf Maleki* and Kim Holmberg**

*ashraf.maleki@utu.fi

https://orcid.org/0000-0002-8223-4833

Department of Social Research, University of Turku, Finland

** kim.j.holmberg@utu.fi

https://orcid.org/0000-0002-4185-9298

Department of Social Research, University of Turku, Finland

Twitter is a popular platform to discuss and share scientific articles. Earlier altmetrics studies have often focused on investigating whether the number of tweets mentioning scientific articles could be used as an indicator of scientific impact or attention, with results showing weak to moderate correlations with citation counts and some disciplinary differences. But all tweets may not be equal, as original tweets and retweets may reflect different levels of engagement and, with that, impact. This research analyzed whether the correlation between citations and original tweets differs from that between citations and retweets and whether there is any disciplinary difference between the two. For this purpose, the relationship between original tweets and retweets and Scopus citations was analyzed for a total of 330,022 PLoS publications and compared over time and across subject fields. The findings showed that the correlations were strongest between citations and original tweets, and the relationship was stronger in Social Science and Humanities subject fields than in Natural Science, Engineering and Medicine. The results showed that tweets and retweets are very different, and thus they should be considered two different metrics and analyzed separately.

1. Introduction

Twitter is a popular social media platform where users (often called tweeters) can publish and share content with their network of followers. Through retweeting, tweeters can easily disseminate content that someone else originally published. While creating an original tweet takes some effort, retweeting takes only a click or tap on a button, so it seems fair to say that retweeting requires less effort than tweeting. We can therefore also argue that retweeting signals less engagement than creating and publishing an original tweet does. In altmetrics, i.e., the measurement of the engagement or attention that scientific outputs have received online, Twitter is one of the main data sources, as there is significant activity around scientific articles on the platform (Costas et al., 2015; Haustein et al., 2015). Altmetrics research often counts tweets and retweets as one measure, without distinguishing between them. We argue that because the two acts are fundamentally different, indicating different levels of engagement and possibly attention or impact, combining them in statistical analyses may lead to misleading results. The goal of this research is to investigate whether this is true, and whether original tweets and retweets should be analyzed separately in altmetric research.

2. Background

Much of early altmetrics research focused on examining whether altmetrics could be an alternative to traditional citation-based measures of impact. The research focused on testing for correlations between tweets and citation counts, providing some mixed results, with large-scale studies (e.g., Barthel et al., 2015; Costas et al., 2014, 2015) showing lower correlations between tweets and citations than studies with more focused, journal- or discipline-specific samples (e.g., Eysenbach, 2011; Shuai et al., 2012). Earlier research has also discovered disciplinary differences in how scientific articles get tweeted, as scientific articles from social sciences and biomedical and health sciences tend to attract more attention on Twitter than articles from mathematics and computer science, and natural sciences and engineering (Costas et al., 2015; Haustein et al., 2015). Other characteristics too, such as the length of the article (Haustein et al., 2015), OA status (Holmberg et al., 2020), and research funding (Didegah et al., 2018), may influence the attention scientific articles receive on Twitter. A more recent study investigated how different types of user engagement behaviors on Twitter, i.e., liking, retweeting, quoting, and replying, were used in connection with scholarly content (Fang, Costas, & Wouters, 2022). The results showed that while likes (44%) and retweets (36%) were frequently used, quotes (9%) and replies (7%) were less frequent. While earlier research has already shown disciplinary differences in the uptake of scientific articles on Twitter (e.g., Haustein, Costas, & Larivière, 2015) and in how researchers use Twitter (Holmberg & Thelwall, 2014), the results by Fang, Costas, and Wouters (2022) showed that there are also disciplinary differences in the ways in which users engage with scientific content on Twitter. But do the disciplinary differences extend to both tweeting and retweeting? Or are the possible differences evened out if tweets and retweets are treated as the same? This research investigates possible disciplinary differences between tweeting and retweeting, as well as whether there are any differences in how citation counts correlate with the number of tweets and retweets.

3. Method

3.1. Data

A total of 330,022 PLoS publications published between 2003 and 2023 were extracted from Scopus in April 2023. The extracted publications were published in nine PLoS journals and eight proceedings, with the majority of the papers (94%) being journal articles. Altmetric.com was used to extract separate datasets of 1) all tweets and 2) original tweets, which were then used to count the number of retweets for each paper.
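The paper does not spell out how retweets were counted from the two datasets. One plausible reading, sketched below under the assumption that every tweet is either an original tweet or a retweet, is a per-paper subtraction. The function name, the DOI keys and the counts are all hypothetical and only serve to illustrate the idea.

```python
def count_retweets(all_tweets: int, original_tweets: int) -> int:
    """Retweets for one paper, assuming each tweet is either original or a retweet."""
    if original_tweets > all_tweets:
        raise ValueError("original tweets cannot exceed all tweets")
    return all_tweets - original_tweets


# Hypothetical per-paper counts keyed by DOI (illustrative values, not real data).
all_counts = {
    "10.1371/example.0001": 12,
    "10.1371/example.0002": 3,
    "10.1371/example.0003": 0,
}
original_counts = {
    "10.1371/example.0001": 5,
    "10.1371/example.0002": 3,
}

# Papers missing from the original-tweet dataset are treated as having 0 original tweets.
retweet_counts = {
    doi: count_retweets(total, original_counts.get(doi, 0))
    for doi, total in all_counts.items()
}
```

Treating a missing entry as zero is itself an assumption; a stricter pipeline might flag such papers for inspection instead.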

3.2. Subject fields

As all PLoS papers are classified only as multidisciplinary in Scopus, we used the classification used by altmetric.com, the Australian and New Zealand Standard Research Classification 2020 (ANZSRC; see footnote 1), to assign subject fields to each article. For the analysis we used:

(1) first subject field of each paper (no duplicates); and

(2) all publications in a subject field (duplicates included between subject fields).
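The two counting schemes can be illustrated with a small sketch. It assumes each paper carries an ordered list of assigned fields, with the first entry treated as the primary field; the paper identifiers and assignments are hypothetical.

```python
from collections import Counter

# Hypothetical papers with ordered field assignments (illustrative, not real data).
papers = {
    "paper_a": ["Biological Sciences", "Medical and Health Sciences"],
    "paper_b": ["Psychology and Cognitive Sciences"],
    "paper_c": ["Medical and Health Sciences", "Biological Sciences", "Education"],
    "paper_d": [],  # no subject assigned
}

# Scheme (1): each paper is counted once, under its first field only.
first_field = Counter(fields[0] for fields in papers.values() if fields)

# Scheme (2): each paper is counted in every field it is assigned to,
# so the same paper appears in several fields (duplicates between fields).
all_fields = Counter(
    field for fields in papers.values() for field in set(fields)
)
```

Under scheme (1) the field counts sum to the number of papers with an assignment, while under scheme (2) the sum exceeds it whenever papers have more than one field.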

Table 1 shows the number of publications when counting only the first/primary subject field and when counting all publications within a subject field. The first 11 fields in Table 1 are from Natural Science, Engineering and Medical and Health Sciences (STEM) and the remaining 11 fields are from Social Science and Humanities (SS&H). About 19% of all publications have not been assigned to a field; these were mostly errata and non-tweeted publications.

Table 1. Number of PLoS papers according to first/primary subject field assigned and total number of publications in a subject field (including duplicates).

| Fields of Research (FoR) | First/primary field (no duplicates) | All publications in field (incl. duplicates) |
|---|---|---|
| Mathematical Sciences | 7,787 | 8,312 |
| Physical Sciences | 1,506 | 4,319 |
| Chemical Sciences | 2,855 | 8,584 |
| Earth Sciences | 2,142 | 5,407 |
| Environmental Sciences | 8,631 | 15,129 |
| Biological Sciences | 92,008 | 129,424 |
| Agricultural and Veterinary Sciences | 1,637 | 19,121 |
| Information and Computing Sciences | 6,984 | 17,317 |
| Engineering | 2,555 | 9,597 |
| Technology | 851 | 1,071 |
| Medical and Health Sciences | 123,711 | 172,521 |
| Built Environment and Design | 27 | 748 |
| Education | 818 | 1,779 |
| Economics | 1,984 | 4,337 |
| Commerce, Management, Tourism and Services | 498 | 2,830 |
| Studies in Human Society | 1,854 | 10,976 |
| Psychology and Cognitive Sciences | 10,496 | 22,991 |
| Law and Legal Studies | 123 | 1,292 |
| Studies in Creative Arts and Writing | 84 | 874 |
| Language, Communication and Culture | 529 | 1,999 |
| History and Archaeology | 792 | 2,182 |
| Philosophy and Religious Studies | 100 | 624 |
| No subject assigned | 62,050 | 62,050 |
| Total | 330,022 | |
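As a quick arithmetic check on Table 1, the first/primary-field column can be summed and compared with the reported total, and the share of unassigned publications can be recomputed. The figures below are copied from the table; the variable names are illustrative.

```python
# First/primary-field counts from Table 1, in table order
# (the first 11 entries are the STEM fields, the remaining 11 the SS&H fields).
first_field_counts = {
    "Mathematical Sciences": 7_787,
    "Physical Sciences": 1_506,
    "Chemical Sciences": 2_855,
    "Earth Sciences": 2_142,
    "Environmental Sciences": 8_631,
    "Biological Sciences": 92_008,
    "Agricultural and Veterinary Sciences": 1_637,
    "Information and Computing Sciences": 6_984,
    "Engineering": 2_555,
    "Technology": 851,
    "Medical and Health Sciences": 123_711,
    "Built Environment and Design": 27,
    "Education": 818,
    "Economics": 1_984,
    "Commerce, Management, Tourism and Services": 498,
    "Studies in Human Society": 1_854,
    "Psychology and Cognitive Sciences": 10_496,
    "Law and Legal Studies": 123,
    "Studies in Creative Arts and Writing": 84,
    "Language, Communication and Culture": 529,
    "History and Archaeology": 792,
    "Philosophy and Religious Studies": 100,
}
unassigned = 62_050

counts = list(first_field_counts.values())
stem_total = sum(counts[:11])   # first 11 fields
ssh_total = sum(counts[11:])    # remaining 11 fields
total = stem_total + ssh_total + unassigned
share_unassigned = unassigned / total

print(f"STEM: {stem_total:,}  SS&H: {ssh_total:,}  total: {total:,}")
print(f"unassigned share: {share_unassigned:.1%}")
```

The column sums to the reported 330,022, and the unassigned share rounds to the 19% stated in the text.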

3.3. Analysis

To analyze the possible relationship between citations and all tweets, original tweets and retweets, comparisons across fields and over time were conducted. For this purpose, the proportion non-zero and the geometric mean of citations, tweets and retweets were calculated and normalized for comparisons between subject fields and with the world average (here, all PLoS publications). The data was first prepared (Thelwall, 2017) and then the calculations were conducted with Webometric Analyst (lexiurl.wlv.ac.uk).

  1. Normalized proportion non-zero was used as an estimate of the share of publications with non-zero Scopus citations, tweets and retweets, with a 95% confidence interval.

  2. World normalised proportion non-zero of metrics (EMNPC) was used for comparisons. EMNPC values for fields are compared against the world average (= 1) to detect any deviation.

  3. Geometric mean was calculated from the logarithm of the raw metric counts plus one, i.e., ln(1 + raw data), as proposed by Thelwall (2017), and all calculations were made with a 95% confidence interval.

  4. World normalised mean metrics (MNLCS) were calculated on the log-transformed data, ln(1 + raw data), with a 95% confidence interval. MNLCS values are likewise compared with the value 1, which represents the world average.
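The four quantities above can be sketched in a few lines. This is a simplified illustration, not the Webometric Analyst implementation: it assumes a single "world" set (all publications), ignores normalisation by publication year, and omits the confidence intervals. The function names and the sample counts are hypothetical.

```python
import math


def proportion_nonzero(counts):
    """Share of publications with at least one citation, tweet or retweet."""
    return sum(1 for c in counts if c > 0) / len(counts)


def geometric_mean(counts):
    """Offset geometric mean: exp(mean(ln(1 + c))) - 1."""
    mean_log = sum(math.log1p(c) for c in counts) / len(counts)
    return math.expm1(mean_log)


def normalised_proportion_nonzero(field_counts, world_counts):
    """Simplified EMNPC-style ratio: field proportion non-zero over the world's."""
    return proportion_nonzero(field_counts) / proportion_nonzero(world_counts)


def normalised_log_mean(field_counts, world_counts):
    """Simplified MNLCS-style ratio: mean ln(1 + c) in the field over the world's."""
    field_mean = sum(math.log1p(c) for c in field_counts) / len(field_counts)
    world_mean = sum(math.log1p(c) for c in world_counts) / len(world_counts)
    return field_mean / world_mean


# Hypothetical tweet counts (illustrative, not real data).
field = [0, 1, 3]
world = [0, 0, 1, 3, 0, 7]

print(proportion_nonzero(field))
print(geometric_mean(field))
print(normalised_proportion_nonzero(field, world))
print(normalised_log_mean(field, world))
```

In both normalised indicators a value of 1 means the field matches the world average, which is why comparing the world set with itself returns exactly 1.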

4. Findings

4.1. Normalized Proportion Cited

Figure 1 shows that the total publication frequency of PLoS increased substantially from 87 in 2003 to just below 35,000 in 2013, after which the level drops and remains at around 20,000 annually. The proportion of publications with non-zero citations shows a cumulative increase over time. The proportion of publications mentioned in tweets rose from about 20% in 2010 (about the time when altmetric.com started to collect tweets) to 76% in 2016 and then fell to about 65% in 2022. The proportion non-zero retweets shows a delayed rise from 2013, reaching 40% by 2018 and levelling off after that, while the proportion tweeted has dropped slightly in the same period.

Figure 1: Frequency of total publications, publication cited, tweeted and retweeted and normalized proportion non-zero in the metrics

Presenting the results from the normalized proportion non-zero of metrics, Figure 2 shows that on average 92% of publications in STEM fields had been cited, while only 82% in SSH fields had received citations. On average, 75% of articles in STEM had been tweeted, compared to 85% in SSH, while only 35% of STEM articles and 50% of SSH articles had been retweeted.

Figure 2: Normalized proportion cited, tweeted and retweeted across fields.

4.2. World normalised proportion non-zero for metrics or EMNPC

Figure 3 shows that after world normalization of proportions non-zero, both tweets and retweets appear significantly above the world average in SSH fields, whereas for STEM the results are mixed, falling both below and above the world average. Mathematical Science, Earth Science, Environmental Science and Information and Computing Sciences all show EMNPC > 1 for tweets and > 1.5 for retweets, while all the other STEM fields remain below the world average. The results also showed that the deviation from the world average is larger for retweets than for tweets across all fields: for above-world-average fields, the normalized proportion non-zero for retweets was significantly higher than for tweets, and for below-world-average fields, it was significantly lower than for tweets. This may suggest a greater discrepancy across fields in terms of retweeting behaviour.

Figure 3: World normalized proportion cited, tweeted and retweeted across fields.

4.3. Geometric mean Citations vs. Original tweets and Retweets

Figure 4 illustrates the changes in geometric mean metrics over time, showing that the geometric mean for citations peaked at about 49 in 2008 before gradually dropping over the years. The trend is, however, almost reversed for the metrics from Twitter, showing a slow drop between 2003 and 2009 (<1) before rising to about 3 for total tweets in 2018 (about 2 for original tweets and 1.25 for retweets), soon after which they too start to fall. The results also show that the ratio of retweets to original tweets has been over 1 since 2017, suggesting that in the past six years a majority of tweets mentioning scientific articles have in fact been retweets rather than original tweets. The average geometric mean of citations across STEM fields is 14, compared with about 9 across SS&H fields. In contrast, the average geometric means of all tweets, original tweets and retweets across STEM fields (3, 2 and 1, respectively) are approximately half of those in the SS&H fields (6, 4, and 2).

Figure 4: Geometric mean Scopus citations, all tweets, original tweets and retweets across years.

4.4. World normalised mean metrics or MNLCS

The world normalized means of ln(1 + raw metric value) for Twitter mentions indicate subject bias (Figure 6). In STEM fields, such as Chemical Science, the results are below the world average for tweets and retweets, while slightly above it for citations, but the case is very different for the SSH fields. A majority of SSH fields perform below the world average in citations, but significantly above the world average in tweets and original tweets (up to 1.5 times the world average) and reach 3.5 times the world average in retweets (e.g., History and Archaeology, and Creative Arts and Writing).

Figure 6: Mean of world normalized ln(1+raw citation, tweet, or retweet count) across fields.

4.5. Spearman’s Correlation between Citations and different types of tweets

The correlation coefficients between citations and all tweet metrics showed stronger correlations when the zeros, i.e., articles with no citations or tweets, were included in the calculation (Figure 7). The correlations were weak but significant across the board. The strength of the relationship between citations and tweets, however, first increased over time and then started to fall from 2019. Furthermore, the correlation coefficients with all tweets were slightly stronger than those with original tweets from 2014 (r = .282 > .280, respectively) through 2018 (r = .330 > .329), whereas from 2019 (r = .343 < .346) until 2022 the relationship between citations and original tweets appears to be slightly stronger than that with all tweets in both the zero-included and non-zero datasets.

Figure 7: Spearman’s correlation coefficients between Scopus citations and all tweets, original tweets and retweets over time.

Figure 8 illustrates a heatmap of the correlation coefficients between Scopus citations and the three metrics of all tweets, original tweets, and retweets across subject fields for zero (Z) and non-zero datasets (-) and first-assigned subject (F) and all publications in a field (A). The findings suggest that the median correlation coefficient of Scopus citations across fields is highest with original tweets (median r = .310), while remaining weak but significant with retweets (median r = .087) when the first-assigned subject fields were used. Using all publications in a subject field led to even weaker correlations than with first-assigned subjects. However, for the first-assigned SSH subject fields the median correlation coefficients between citations and original tweets were at a medium level (median r = .409), in contrast to the weak correlation in STEM subject fields (median r = .162).

Including zero metric counts in the datasets resulted in stronger correlation coefficients between citation and all the other metrics in SSH fields (median r with original tweets = 0.409 in zero-included dataset, 0.329 in non-zero dataset), but weaker in STEM subject fields (median r = 0.175 in non-zero dataset, r = 0.149 zero-included dataset). It would appear that tweets are moderately likely to align with traditional research impact in Social Science and Humanities, but they indicate only a weak relationship and a limited usage in STEM subject fields.

Figure 8: Spearman’s correlation coefficients between Scopus citations and all tweets (T), original tweets (O) and retweets (R) across fields. The empty cells indicate no statistical significance (p > .05).

FZ: Raw metric counts with zeros (Z) in first (F) assigned subject field; F: Non-zero raw metric counts in first assigned subject field; AZ: Raw metric counts with zeros in all (A) assigned publications to field; A: Non-zero raw metric counts in all assigned publications to field.
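The comparison between zero-included and non-zero datasets can be sketched as follows. This is a minimal pure-Python Spearman correlation (Pearson correlation on tie-averaged ranks), not the tool used in the study. It assumes that the "non-zero" dataset drops every paper in which either count is zero, which is one reading of the text; the sample counts are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rank correlation as Pearson on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def drop_zero_pairs(x, y):
    """Keep only papers where both counts are non-zero (assumed definition)."""
    kept = [(a, b) for a, b in zip(x, y) if a > 0 and b > 0]
    return [a for a, _ in kept], [b for _, b in kept]


# Hypothetical per-paper counts (illustrative, not real data).
citations = [0, 5, 12, 0, 3, 40]
original_tweets = [0, 1, 4, 2, 0, 9]

rho_with_zeros = spearman(citations, original_tweets)
rho_nonzero = spearman(*drop_zero_pairs(citations, original_tweets))
```

Because dropping zeros removes papers from the sample, the two coefficients describe different populations, which is worth keeping in mind when comparing the Z and non-Z columns of Figure 8.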

5. Discussion

The current study compared citations, original tweets, and retweets as measures for impact assessment. The results were in line with some of the findings in earlier research (e.g., Costas et al., 2015; Haustein et al., 2015). The results showed clear disciplinary differences in how scientific articles had been mentioned and shared on Twitter, and it was also discovered that scientific articles in Social Science and Humanities receive up to two to three times as many retweets as the world average, whereas those in Natural Science and Engineering were below the world average. The results also showed that the correlations between citations and original tweets were clearly stronger than those between citations and retweets, and that the correlations overall were stronger for SSH subject fields than for STEM subject fields. The results clearly point to the differences between original tweets and retweets, confirming that the two reflect different types of actions and should therefore be treated separately, at least when it comes to altmetrics research.

Funding information

This research was funded by the Academy of Finland (funding decision: 332961).

References

Barthel, S., Tönnies, S., Köhncke, B., Siehndel, P., & Balke, W.T. (2015). What does twitter measure? Influence of diverse user groups in altmetrics. In Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries (pp. 119–128). Knoxville, Tennessee, USA: ACM Press. https://doi.org/10.1145/2756406.2756913

Costas, R., Zahedi, Z., & Wouters, P. (2014). Do “altmetrics” correlate with citations? Extensive comparison of altmetric indicators with citations from a multidisciplinary perspective. Journal of the Association for Information Science and Technology, 66(10), 2003–2019. https://doi.org/10.1002/asi.23309

Costas, R., Zahedi, Z., & Wouters, P. (2015). The thematic orientation of publications mentioned on social media: Large-scale disciplinary comparison of social media metrics with citations. Aslib Journal of Information Management, 67, 260–288. https://doi.org/10.1108/AJIM-12-2014-0173

Didegah, F., Bowman, T.D., & Holmberg, K. (2018). On the differences between citations and altmetrics: An investigation of factors driving altmetrics vs. citations for Finnish articles. Journal of the Association for Information Science and Technology, 69(6), 832–843. https://doi.org/10.1002/asi.23934

Eysenbach, G. (2011). Can tweets predict citations? Metrics of social impact based on twitter and correlation with traditional metrics of scientific impact. Journal of Medical Internet Research, 13, e123. https://doi.org/10.2196/jmir.2012

Fang, Z., Costas, R., & Wouters, P. (2022). User engagement with scholarly tweets of scientific papers: A large-scale and cross-disciplinary analysis. Scientometrics, 127, 4523–4546. https://doi.org/10.1007/s11192-022-04468-6

Haustein, S., Costas, R., & Larivière, V. (2015). Characterizing social media metrics of scholarly papers: The effect of document properties and collaboration patterns. PLOS ONE, 10(5), e0127830. https://doi.org/10.1371/journal.pone.0127830

Holmberg, K., & Thelwall, M. (2014). Disciplinary differences in Twitter scholarly communication. Scientometrics, 101(2), 1027–1042. https://doi.org/10.1007/s11192-014-1229-3

Holmberg, K., Hedman, J., Bowman, T.D., Didegah, F., & Laakso, M. (2020). Do articles in open access journals have more frequent altmetrics activity than articles in subscription-based journals? An investigation of the research output of Finnish universities. Scientometrics, 122, 645–659. https://doi.org/10.1007/s11192-019-03301-x

Thelwall, M. (2017). Web indicators for research evaluation: A practical guide. San Rafael, CA: Morgan & Claypool.


  1. https://www.abs.gov.au/AUSSTATS/abs@.nsf/mf/1297.0
