Exploring Twitter for Scientific and Public Engagement with Scholarly Articles

The “science-society connect” deals with “transferring the benefits of scientific work to meet existing and emerging societal needs”. In other words, it is about taking science to society so that society can benefit from scientific research and a scientific temper can be inculcated in citizens. The “science-science connect” refers to “sharing of ideas and resources within the knowledge ecosystem”. Traditionally, science communication happened through journals and conferences, but with the penetration of social media it has grown beyond these boundaries. The use of social media for diffusion of scholarly communication into society has therefore drawn interest across the world. This work presents an exploratory analysis of the effectiveness of Twitter as a medium for diffusion of scholarly communication beyond science-science networks.


Introduction
Science is considered to be one of the greatest collective endeavours that creates new knowledge. Not only does science help meet the needs of society and improve the quality of our lives, but it also improves our understanding of society. The science-society connect is therefore a very important aspect of the modern world, as also indicated in UNESCO's Science for Society section, which states that "Science must respond to societal needs and global challenges". This calls for increased interaction and resource sharing between scientific institutions and knowledge workers to bridge three kinds of gaps: "science-society", "science-science" and "society-science". Interactions between scientists and society can facilitate a two-way flow of facts, knowledge and ideas, which in turn can benefit both science and society.
Traditionally, scientists have used journals and magazines to report scientific discoveries. These journals and magazines are usually limited in their clientele and circulation. Owing to these limitations, science journalists have tried to convey important scientific discoveries to the general public as easy-to-understand news articles. Though such communication practices continue to operate, the emergence of new social media platforms has opened new avenues and opportunities for quicker and wider dissemination of science (Büchi, 2017; Nielsen, 2012; Veletsianos, 2016). Different kinds of social media platforms are now being used by researchers, scientists, institutions and science reporters for dissemination of scientific research and advancements. It is in this context that studies are exploring the 'inreach' and 'outreach' potential of social media, as well as the 'uptake' of social media by scientists in different countries.
The importance of social media in building research networks and in dissemination is now being underscored in different global surveys of scientists. Brossard & Scheufele (2013) showed that 60% of the U.S. public seeking information about specific scientific issues lists the Internet as its primary source of information. Based on this, they suggested that there is an urgent need for scientists to pay attention to communicating science in the new online world. A survey by the Pew Research Center in 2015 of 3,748 U.S.-based members of the American Association for the Advancement of Science (AAAS) found that 47% of them use social media to follow new discoveries and discuss science (Pew Research Center, 2015). Lee & VanDyke (2015) pointed out that science organizations continue to use social media largely for one-way communication and that social media's potential for dialogue and engagement with the public is underutilized. Collins, Shiffman & Rock (2016) showed that scientists perceive numerous potential advantages of using social media in the workplace, but that it has yet to be widely adopted. An editorial in Nature Cell Biology (Nat Cell Bio Editorial, 2018) noted that scientists are increasingly embracing social media in their professional lives and emphasized that social media engagement can positively influence their day-to-day work and scientific communication.
Among all kinds of social media platforms, Twitter has been a very popular platform for researchers to disseminate information about their research. Twitter mentions have therefore been explored along different dimensions, ranging from article impact to tweet life span (Priem & Costello, 2010; Weller et al., 2013; Kyung et al., 2017; Haustein, 2019). Owing to the popularity of Twitter in engaging non-scientists with scientific research, researchers have also explored Twitter data to measure information diffusion. Tsou et al. (2015) sampled a set of 2000 unique tweeters from a pool of tweets, posted between March 2012 and March 2013, that linked to articles published in Nature, PLoS One, PNAS and Science. Based on a survey of user biographies, the study concluded that the users were predominantly male (67%). Medical Sciences was identified as the research domain that attracted the most users. Ke et al. (2017) curated a lexicon of 322 scientist titles using occupational words from Wikipedia and used this lexicon to identify scientists on Twitter. Mohammadi et al. (2018) investigated 1912 Twitter users who tweeted scholarly articles in terms of occupation, academic discipline, age, gender, etc. They reported a similar male dominance among the users, although in their study Social Science appeared to be the more connected discipline. Alperin et al. (2019) examined the diffusion of 11 open access biology articles on Twitter by analyzing the follower network of the users who tweeted them. The focus of that study was to investigate the existence of a general audience in the communication network of scholarly data. They concluded that most of the sharing and diffusion of articles happens within a closely coupled group of tweeters who could be academicians. Joubert & Costas (2020) matched author names against tweeters to differentiate between scholars and non-scholars. Lemke et al. (2021) manually inspected the profile pages of users who tweeted about nuclear repository research and explored their sharing patterns. Toupin et al. (2022) constructed a semi-manual codebook to detect public engagement in the dissemination of Climate Change publications.
Motivated by the early evidence of the use of Twitter for dissemination of science to non-scientists, this study attempts to measure the engagement level of non-scientists with scholarly information disseminated through Twitter. Additionally, the study explores whether there exists an attraction bias towards contemporary topics. These objectives have been explored using natural language processing (NLP) and machine learning (ML) based techniques. User profiles are characterized and tagged by an automated system based on attributes extracted from the respective profiles. Furthermore, an EM clustering-based approach is implemented to visualize the professional behaviour of engaged users.

Research Questions
Based on previous studies of non-scientist engagement, this study attempts to answer the following research questions:
i. Can Twitter be a medium for scholarly article diffusion beyond scientists?
ii. Does user engagement tend towards contemporary topics?

Data & Methodology
A set of four scholarly articles was selected from three different journals: "Nature", "The Lancet", and "Science". The articles were chosen based on two criteria. First, each has a high number of tweets associated with it. Second, two of the four articles represent contemporary topics discussed on social media in 2019: one concerns the behaviour pattern of young protesters and the other humanitarian activities in Hong Kong. The remaining two articles represent science understanding: one is on the methodology of writing a fine research article and the other is on Artificial Intelligence. The data description is presented in Table 1.
The tweetIDs for all the tweets in the years 2019, 2020, 2021, 2022, and 2023 (till 2nd April) were obtained through the "Open results in API" interface of the Altmetric Explorer.
A Twitter lookup was performed using the Twitter API to capture the corresponding tweet texts from the tweetIDs. In this process some tweets could not be retrieved owing to privacy restrictions or deletion of the tweet. The collected tweets were then processed to extract the tweet date and the tweeter's user_name, screen_name, and userid. Later, the screen_name was used to extract the list metadata of the tweeter of each tweet. Any Twitter user can create a Twitter list, which is a group of carefully selected Twitter accounts with a name and a brief description that characterizes its members. The data collection process is depicted in Figure 1.
2013) with some additional processing. In the first step, the lists were filtered so that only those with English metadata remained. The texts were then cleaned of emojis. The metadata was then broken down using CamelCase, digits, and punctuation marks such as "/", "&", and ",". The resulting words were taken as candidate attributes. In the next phase, the metadata was passed through stemming and part-of-speech (POS) tagging, and the nouns, adjectives, unigrams and bigrams were chosen as candidates.
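The splitting step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the emoji character ranges and the sample list name are assumptions made for illustration, and the stemming and POS-tagging phase is omitted because it would require an external NLP library.

```python
import re

# Assumed emoji ranges (illustrative only; a real pipeline may use a dedicated library).
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F]")
# Zero-width position between a lowercase and an uppercase letter (CamelCase boundary).
CAMEL_RE = re.compile(r"(?<=[a-z])(?=[A-Z])")
# Digits, punctuation, underscores and whitespace all act as separators.
SEP_RE = re.compile(r"[\d\W_]+")


def candidate_attributes(metadata):
    """Split one list name or description into lowercase candidate words."""
    text = EMOJI_RE.sub(" ", metadata)   # remove emojis
    text = CAMEL_RE.sub(" ", text)       # "DataScience" -> "Data Science"
    return [tok.lower() for tok in SEP_RE.split(text) if tok]


def bigrams(tokens):
    """Join adjacent tokens into bigram candidates."""
    return [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]


# Hypothetical list name, not taken from the study's data.
words = candidate_attributes("DataScience & AI/ML 2020")
print(words)
print(bigrams(words))
```

Unigrams come directly from `candidate_attributes`, and `bigrams` pairs neighbouring words within the same piece of metadata; filtering the candidates down to nouns and adjectives would follow as a separate step.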

Results
The list metadata of Twitter users was analyzed to classify users into Scientists and Public. Various statistics were then computed to gauge public involvement in the scholarly information flow.
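The text does not specify the features or the mixture model used for the EM clustering, so the following is only a sketch of the general technique under stated assumptions: each user is reduced to a single hypothetical score (for example, the share of list attributes that are science-related), and a two-component one-dimensional Gaussian mixture is fitted with expectation-maximization. The scores, initial variances and function names are illustrative and are not taken from the study.

```python
import math


def em_two_clusters(scores, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with EM.

    Returns the component means and, for each score, its responsibilities.
    """
    mu = [min(scores), max(scores)]   # deterministic initial means
    var = [0.05, 0.05]                # assumed initial variances
    weight = [0.5, 0.5]
    resp = []
    for _ in range(iterations):
        # E-step: posterior probability of each component for each score.
        resp = []
        for x in scores:
            dens = [
                weight[k]
                * math.exp(-((x - mu[k]) ** 2) / (2 * var[k]))
                / math.sqrt(2 * math.pi * var[k])
                for k in range(2)
            ]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weight[k] = nk / len(scores)
            mu[k] = sum(r[k] * x for r, x in zip(resp, scores)) / nk
            var[k] = max(
                sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, scores)) / nk,
                1e-4,  # variance floor to avoid degenerate components
            )
    return mu, resp


def assign_clusters(scores):
    """Label each score with the component that has the larger responsibility."""
    _, resp = em_two_clusters(scores)
    return [1 if r[1] > r[0] else 0 for r in resp]


# Hypothetical per-user scores; low values would suggest Public, high values Scientists.
toy_scores = [0.05, 0.10, 0.08, 0.12, 0.85, 0.90, 0.80, 0.95]
print(assign_clusters(toy_scores))
```

With the means initialised at the smallest and largest score, component 0 ends up covering the low-score users and component 1 the high-score users. A practical system would work on multi-dimensional attribute vectors and would need a guard for scores that are far from both components.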

Scientists and General Public participation in scholarly communication
Measuring the societal engagement of scientific research requires a multi-faceted approach that takes into account engagement from both the scientific community and the general public. One way to quantify this is to measure public participation in scholarly communication, which reflects the level of understanding and awareness of scientific research and its potential applications. Figure 3 presents the distribution of scientists and public in the dissemination network of each of the four articles. Public engagement appears to be quite high on Twitter: for almost every paper, the number of tweets from the public is almost twice that from scientists. For the two papers that concern society directly (Paper 2 and Paper 3), public participation is somewhat higher. Although the overall diffusion appears to be independent of the nature of the paper, the socially relevant papers draw more attention from the general public. Overall, the results suggest that scientists introduce the subject matter to the world and that the public actively participates in the diffusion of such communication over Twitter. The content analysis reveals that the public understands the theme and substantially acknowledges the discoveries. This pattern is consistent across all paper types, whether the topic is contemporary or purely scientific. The results also suggest that the public follows scientists, that is, most of the time they retweet rather than post an original tweet. Thus, our results support the idea that Twitter can be used for wider dissemination of scholarly work and has a reach beyond the scientific community.
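The per-paper comparison of Public and Scientist tweets can be computed along the following lines. This is a minimal sketch: the record layout, the class labels as strings and the sample records are assumptions for illustration, and the numbers are not the study's data.

```python
from collections import Counter


def class_distribution(tweets):
    """Count tweets per paper and user class from (paper_id, user_class) pairs."""
    counts = {}
    for paper, user_class in tweets:
        counts.setdefault(paper, Counter())[user_class] += 1
    return counts


def public_to_scientist_ratio(counter):
    """Ratio of Public tweets to Scientist tweets for one paper."""
    if counter["Scientist"] == 0:
        return float("inf")
    return counter["Public"] / counter["Scientist"]


# Hypothetical records: four Public tweets and two Scientist tweets for one paper.
sample = [("Paper1", "Public")] * 4 + [("Paper1", "Scientist")] * 2
distribution = class_distribution(sample)
print(public_to_scientist_ratio(distribution["Paper1"]))
```

A ratio close to 2 for a paper corresponds to the "almost twice" pattern described above.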

Conclusion
This study explores the use of Twitter as a medium for wider dissemination of scholarly information. Public engagement in science communication has been examined by analysing the list metadata of Twitter users and the content of their tweets. The Twitter users are classified into Scientists and Public using NLP techniques and EM clustering. The information flow between the two classes has been investigated. It has been found that Scientists serve as the source of knowledge by disseminating the research work, and that the Public mostly engages with the tweets of Scientists. The public appears to comprehend the themes. It has been argued earlier that Twitter is a popular platform for scientists to communicate with the public, as its users are typically more educated than the general population, and its user base is diverse in terms of demographics and geography (Greenwood et al., 2016; Mislove et al., 2011). Therefore, the possibility of using Twitter for diffusion of scientific information to non-scientists is interesting and can be very impactful for science communication. Effective communication between scientists and the public can help to ensure that scientific research is conducted in an open and transparent manner, and that the benefits of this research are shared with the broader community.

Figure 1: Attribute Extraction Procedure

Figure 3: Tweeter Distribution Over the Papers

Figure 4: Tweet interaction between the Scientists and Public for Paper1

Figure 5: Tweet interaction between the Scientists and Public for Paper2

Figure 6: Tweet interaction between the Scientists and Public for Paper3

Figure 7: Tweet interaction between the Scientists and Public for Paper4

Table 1: Data statistics
The selected research articles provide a scenario to assess reachability and public interest across diverse topics. Thus any diffusion bias, whether towards ongoing topics or widely discussed events, can be explored. Web of Science (WoS) and Altmetric.com were used to verify the articles based on their DOI. Altmetric data for these articles were downloaded from Altmetric.com on 2nd April, 2023 using DOI lookup. Each article received thousands of tweets up to 2023. "Novelist Cormac McCarthy's tips on how to write a great science paper" (paper 1) was mentioned in 7,141 tweets. "Concerns of young protesters are justified" (paper 2) received the highest number of tweets, 9,521. The politically important paper "International humanitarian norms are violated in Hong Kong" (paper 3) received 6,628 tweets. The purely science domain article "The global landscape of AI ethics guidelines" (paper 4), however, was tweeted 1,011 times. The updated tweet counts and retrieved list statistics corresponding to each paper are presented in Table