The goal of open science is to improve the quality of publications and to overcome the shortcomings of the classic peer review process. Post-Publication Peer Review (PPPR) has been proposed as an alternative. It is of particular interest to study a non-anonymous PPPR platform in order to explore the dynamics related to the position of commenters in the scientific community. This research-in-progress describes, for the first time in detail, the publications targeted by PPPR comments on PubMed Commons (PMC) and the commenters themselves, in order to better identify the underlying issues. From the original PMC corpus, we extracted a sample of 657 authors who wrote 4514 comments. To run a bibliometric analysis, this sample was matched with the Scopus® database in order to determine the status of the commenters and of the publications. Preliminary results show that the distribution of comments over time reveals some events of intense debate. Most of the comments are rather short. The number of comments per author follows a Pareto distribution. Commenters are scientists with a high reputation, but there is no correlation between their critical activity and any bibliometric indicator. Finally, we identified only a small fraction of retracted publications. Our results seem to reveal the heterogeneity of the commenter profiles, reflecting a divergent interest in PPPR probably related to the researchers’ positions in the scientific field and their respect for the Mertonian norms of the scientific ethos. Further research is currently underway to investigate these characteristics in more detail.
Analysis of the PubMed Commons
Post-Publication Peer Review Platform
Philippe Gorry*, Léo Mignot** and Antoine Sabouraud**
* philippe.gorry@u-bordeaux.fr
ORCID # 0000-0002-5497-8069
Bordeaux School of Economics CNRS 6060, University of Bordeaux, France
**leo.mignot@u-bordeaux.fr; **antoine.sabouraud@u-bordeaux.fr
Centre Emile Durkheim CNRS 5116, University of Bordeaux, France
Peer review, introduced by Nature in the 1950s, involves single- or double-blind criticism of research publications. The evaluation is carried out by pre-selected anonymous experts; it is confidential, time-consuming, subject to bias, costly and open to fraud (Chambers et al, 2014). The early 2010s were marked by the launch of numerous open peer review projects opposing the norm represented by Science and Nature (Tennant et al, 2017). Platforms promoting open science rely on collaboration to ensure the transparency of reviews and to facilitate communication (Aleksic et al, 2015), and Post-Publication Peer Review (PPPR) is central to this new form of scientific evaluation. While classic peer review happens before publication, PPPR takes place after the release of the manuscript and opens the critique to a larger number of reviewers. Comments are made public and visible, whether on dedicated online platforms, on social media or (more traditionally) through letters to the editor. While its relevance was noted by Gibson (2007), debates between advocates of this practice (Dubois & Guaspare, 2019) and those who condemn it (Blatt, 2015) are constant, and dedicated sites are multiplying. Among them is F1000, which appeared in 2002 and proposes a journal operating on the basis of PPPR. It allows scientists to publish quickly, at a lower cost, and to receive more feedback on their work (Hunter, 2012). Kirkham and Moher (2018) showed that nearly 80% of articles published on F1000Research passed PPPR. This experience is perceived positively by both authors and reviewers. While the majority of researchers are dissatisfied with the current peer review system, the anonymity of reviewers seems, paradoxically, to be preserved (Ross-Hellauer et al, 2017). Finally, questions remain about the quality of PPPR, especially since Bohannon's sting operation (2013), which submitted a study with prohibitive flaws. Solutions such as open evaluation (Kriegeskorte, 2012) deserve further study.
PPPR platforms have so far received little scholarly attention, apart from a few publications (Vaught et al, 2017; Lane et al, 2018; Dubois & Guaspare, 2019). Studying extensively the content of comments on platforms that have removed anonymity is thus of major interest.
PubMed Commons (PMC) was a PubMed project that served as an interface between publications and their reviewers via signed comments. The platform was launched in October 2013 by the National Institutes of Health and the National Center for Biotechnology Information. It centralized thousands of comments until it was shut down in February 2018. PMC gave authors with at least one of their own publications indexed in PubMed the ability to comment on any other article in PubMed, but they could not be anonymous. Paul Lane's assessment of PMC's impact one year after its launch showed that less than 0.05% of the approximately 900,000 publications added to PubMed in 2014 were commented on. The trend continued over time (Lane et al, 2018), and PMC was discontinued in February 2018 due to a lack of interest compared to competing platforms such as PubPeer, which allows anonymous critiques. PMC thus fueled intense debate about the importance of anonymity in the PPPR process (Dolgin, 2018).
Although the PMC platform has been discussed in the academic literature, it has not yet been the subject of any scientometric analysis, despite having been closed since 2018. This work is a first attempt to describe the publications targeted by PPPR comments on PMC and the commenters. The objective is to explore the characteristics of the publications, the profiles of the authors and the nature of the comments, and to measure possible correlations between these three dimensions.
PMC archives were retrieved from the NCBI FTP site. The archive includes the following fields: “Comment ID”, “PubMed ID” (PMID), “Date”, commenter “First and Last name”, and “Comment content”. The corpus contains 7614 comments, about 6012 publications, published by 1905 authors between June 2013 and February 2018. This represents 3.996 comments per author on average, with a median of 1 and a standard deviation of 12.728. At the minimum, 1159 authors wrote a single comment; at the maximum, one author (0.052%) wrote 247 comments. The number of comments per author follows a Pareto distribution (data not shown). In order to carry out a bibliometric analysis, we matched the PMC data with a bibliographic database (Scopus®) to determine the status of the commenters (affiliation, country, research domain, number of publications, number of citations, h-index) and of the annotated publications (authors, title, year of publication, journal, number of citations). Matching the publications was straightforward using the PMID, with some exceptions requiring manual cleanup. Matching the commenters was far more complex, because of numerous errors (inversion of name and surname, problems with nobiliary particles, first names entered as initials, and homonymy) which required manual matching to disambiguate the names. 58.6% of PMC authors have more than one homonym in Scopus®, with some authors (n=55) having over 100 homonyms. As a first approach, we drew a random sample of our population of commenters. Since the comments do not follow a normal distribution, we stratified the sampling by number of comments (10 strata). After modelling the sample size in each stratum in order to obtain a 95% confidence interval (CI) and a standard error (SE) below 5%, we calculated a minimum sample size of 432 authors. After disambiguation and random sampling, we obtained a sample of 657 authors who wrote 4514 comments (59.28%), with a 95% CI and an SE of 3%.
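The sampling step above can be sketched as follows. This is a generic illustration only, using Cochran's sample-size formula with a finite-population correction and proportional allocation across strata; the study's figure of 432 authors came from the authors' own per-stratum modelling, and the function names and example numbers here are ours:

```python
import math

def cochran_n(N, z=1.96, p=0.5, e=0.05):
    """Minimum sample size for a proportion at confidence level z,
    margin of error e, with finite-population correction for population N."""
    n0 = (z ** 2) * p * (1 - p) / e ** 2       # infinite-population size
    return math.ceil(n0 / (1 + (n0 - 1) / N))  # correct for finite N

def allocate(strata_sizes, n_total):
    """Proportional allocation of n_total across strata (rounded up)."""
    N = sum(strata_sizes)
    return [math.ceil(n_total * s / N) for s in strata_sizes]

# For a population of 1905 commenters at 95% CI and 5% margin:
n = cochran_n(1905)            # -> 320 with this generic formula
shares = allocate([100, 50, 50], 40)  # hypothetical strata -> [20, 10, 10]
```

Proportional allocation is only one option; Neyman allocation, which weights strata by their internal variance, would be the natural refinement for a skewed comment distribution.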
The comments concern 6012 publications spanning different document types, mainly articles (77.88%). We also note a group of 74 retracted “Article” and 19 “Erratum” documents, as well as a group of 375 “Letter”, “Note” and “Editorial” documents (7.11%) that are characteristic of exchanges of views within the scientific community (data not shown). These documents were published in 1670 different journals, and the distribution of the number of publications per journal follows a power-law distribution (data not shown). The journal with the most publications commented on PMC is PLoS ONE, with 198 articles. In contrast, a group of 932 different journals are each represented by a single commented article. In the top 10 journals with the highest number of comments, apart from PLoS ONE (3.76%), we find high-impact medical or scientific journals such as the New England Journal of Medicine (2.39%), the Journal of the American Medical Association (1.78%), Nature (2.11%) and the Proceedings of the National Academy of Sciences (2.05%). On average, publications in the PMC corpus received 165.44 citations, with a median of 22 citations, a standard deviation of 237.20, a minimum of zero citations (for 218 publications) and a maximum of 186,038 citations. We then built a keyword map of the publications, based on the co-occurrence of abstract and author keywords, to identify the main research topics covered by the PMC publications, using VOSviewer (data not shown). The analysis yielded 7 clusters: the main topic is “humans”, followed by “animals”, then “middle aged”, “adolescent”, “cohort studies”, “young adult” and finally “neoplasms”. Finally, we examined the sources of research funding reported in the PMC publications. Unfortunately, less than half (45.02%) of the publications reported this information. We found 1309 different research funding institutions, the main one being the National Institutes of Health.
We also noted the presence of 48 pharmaceutical companies thanked for their financial support on 327 publications on PMC (5.49% of total publications).
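The power-law claim for the journal distribution can be checked informally with a log-log rank-size regression: if publication counts per journal follow a power law, log(count) is roughly linear in log(rank). The sketch below (ours, not part of the study; a rigorous fit would use maximum-likelihood methods) computes the least-squares slope:

```python
import math

def loglog_slope(counts):
    """Least-squares slope of log(count) vs log(rank), a rough power-law check.
    A roughly linear log-log rank-size plot with negative slope is consistent
    with a power law; it is a screening heuristic, not a formal test."""
    sizes = sorted(counts, reverse=True)           # rank 1 = largest journal
    xs = [math.log(r) for r in range(1, len(sizes) + 1)]
    ys = [math.log(s) for s in sizes]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

# A synthetic size ∝ 1/rank series yields a slope of exactly -1:
slope = loglog_slope([1200 / r for r in range(1, 6)])
```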
Table 1. Descriptive statistics of the commenters’ bibliometric indicators
To further explore the characteristics of the commenters, we ran a bivariate correlation analysis between the number of comments and the different bibliometric indicators (Table 2).
Table 2. Correlation matrix of the author bibliometric indicators
The comments constitute a text of 7613 paragraphs and 104 529 lines, for a total of 1 038 669 words. The size of the comments follows a Pareto distribution (data not shown). The majority of comments are less than 50 words long, and a minority exceed 500 words. 42.6% of comments include an HTML link to a bibliographic reference or a blog, 11.54% refer strictly to a publication referenced in PubMed, and only 3.85% mention both. In total, 7614 comments were posted on PMC, with an annual average of 1725.25. The coefficient of variation and the trend exclude any significant weekly, monthly or quarterly variation (data not shown). However, the daily distribution of comments reveals intense debate on certain dates (Figure 1). While the average number of comments per day is about 4.41 (median of 4) with a standard deviation of about 5.55, there is a peak of 113 comments on December 10, 2014, and another of 104 comments on August 23, 2016. In total, over the observed period there are 70 days with a number of comments more than 3 standard deviations above the mean.
Figure 1: Number of comments over the years.
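The peak-detection step described above, flagging days whose comment count exceeds the mean by more than 3 standard deviations, can be sketched as follows (a minimal illustration on synthetic dates; the study's own computation may differ in detail):

```python
from collections import Counter
from statistics import mean, stdev

def spike_days(dates):
    """Return the days whose comment count exceeds mean + 3 standard
    deviations, given one date string per comment."""
    counts = Counter(dates)                       # comments per day
    mu, sigma = mean(counts.values()), stdev(counts.values())
    cutoff = mu + 3 * sigma
    return sorted(d for d, c in counts.items() if c > cutoff)

# Synthetic example: 30 quiet days with one comment each, plus one burst day.
dates = [f"2014-01-{i:02d}" for i in range(1, 31)] + ["2014-12-10"] * 50
peaks = spike_days(dates)  # -> ["2014-12-10"]
```

Note that days with zero comments are absent from the Counter; if such days should count toward the mean, the series would first need to be reindexed over the full calendar range.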
Among the corpus of 6012 PMC publications, we identified 74 retracted publications. These papers were published between 2001 and 2018, mainly before the opening of the PMC platform (Table 3). They were retracted starting in 2013, the year PMC opened, mostly before the platform closed in 2018, though some not until 2022. The average time to retraction was 7 years, with cases of immediate (0 years) and late (17 years) retraction. On average, the retracted papers accumulated 78.58 citations, with a median of 34.5 citations, a standard deviation of 123.31, a minimum of one citation and a maximum of 763 citations.
Table 3. Descriptive statistics of the retracted publications
A third of them were published in top journals, in particular “Cancer Research” (16.22%), “The Lancet” (4.05%) and the “Proceedings of the National Academy of Sciences of the United States of America” (4.05%) (data not shown). The majority of comments on these papers come from two authors: Morten Pedersen Oksvold (134 comments in total) and Ivan Oransky, one of the administrators of Retraction Watch (87 comments). Out of 80 comments, many are perfectly identical: 17 identical comments mention a series of retracted papers by Fazlul Sarkar, 6 others alert on copycat papers, 4 twin comments indicate the retraction of one article and 4 others that of another.
This article is a work in progress and presents, for the first time, a preliminary bibliometric exploration of the PMC PPPR platform. Our analysis highlighted the characteristics of the commented publications, of the commenters and of the comments that were exchanged. The commenters are mainly English-speaking males, publishing in the field of biomedicine, with a high level of scientific capital. There is no correlation between the bibliometric characteristics of commenters and their total number of comments. A small number of commenters account for a large share of the comments on PMC: some of them are high-level scientists known for their open-science activism, others are, surprisingly, non-academic watchdogs. Regarding the comments, two main characteristics stand out: they are often very short, and a minority seem to have given rise to extensive discussions at certain times. These comments targeted publications in leading journals, and only a small portion of the commented publications were retracted.
Some explanations for the lack of correlation between bibliometric indicators and involvement in the PPPR process can already be advanced. Possible explanations include: (1) the presence of outliers reinforcing the heterogeneity of profiles; (2) divergent interest in the PPPR process linked to the position of researchers in the scientific field and their respect for the Mertonian norms of the scientific ethos (Merton, 1973).
Several issues need to be addressed, both qualitatively and quantitatively, before strong statements can be made. First, data collection is still underway to gather all PMC commenters and document their profiles with additional variables for multivariate analysis. Second, further analysis will be necessary to identify clusters of commenters with similar profiles and interests, using clustering algorithms and semi-structured interviews to explore the divergent PPPR interests. Third, it would be interesting to study the content of the comments in more detail. Given the volume of text involved, it seems most relevant to conduct an analysis based on lexicometrics and sentiment analysis to identify the themes covered and the arguments mobilized, and to compare them according to commenters’ profiles and types of publication. Another interesting line of research would be to analyze the evolution of the life cycle of publications according to whether or not they are subject to negative comments.
Acknowledgments
We would like to thank Pierre Mirambet for raw data extraction help and Pascal Ragouet for his ongoing intellectual support.
Author contributions (https://credit.niso.org/)
PG: supervision, data curation, formal analysis, writing – original draft. LM: data curation, writing – original draft. AS: data curation, writing – original draft.
Competing interests
Authors have no competing interests.
Funding information
This work is supported by ANR with a research grant (project Skeptiscience, grant# ANR-20-CE26-0008) to Michel Dubois (Paris-Sorbonne University).
References
Blatt, M.R. (2015). Vigilante Science. Plant Physiology, 169, 907-909.
Bohannon, J. (2013). Who does peer review? Science, 342, 60–65.
Chambers, C.D., et al. (2014). Instead of “playing the game” it is time to change the rules: Registered Reports at AIMS Neuroscience and beyond. AIMS Neuroscience, 1, 4-17.
Dolgin, E. (2018). PubMed Commons closes its doors to comments. Nature. DOI:10.1038/d41586-018-01591-4.
Dubois, M., & Guaspare, C. (2019). « Is someone out to get me? » : la biologie moléculaire à l’épreuve du Post-Publication Peer Review. Zilsel, 6, 164-192.
Gibson, T.A. (2007). Post-publication review could aid skills and quality. Nature. 448:408.
Hunter, J. (2012). Post-publication peer review: opening up scientific conversation. Front. Comput. Neurosci. 6:63.
Kirkham, J. & Moher, D. (2018). Who and why do researchers opt to publish in post-publication peer review platforms? - findings from a review and survey of F1000 Research. F1000Research, 7, 920.
Lane, P. et al (2018). Use of PubMed Commons – still not so common? Current Medical Research and Opinion, 34, 27-27.
Merton, R.K. (1973). The sociology of science: Theoretical and empirical investigations. Chicago: University of Chicago Press.
Ross-Hellauer, T. et al. (2017). Survey on open peer review: Attitudes and experience amongst editors, authors and reviewers. PLoS ONE, 12, e0189311.
Tennant, J.P., et al. (2017). A multi-disciplinary perspective on emergent and future innovations in peer review. F1000Research, 6, 1151.
Vaught, M.D., et al. (2017). A Cross-sectional Study of Commenters and Commenting in PubMed, 2014 to 2016: Who’s Who in PubMed Commons. [online] https://peerreviewcongress.org/peer-review-congress-2022-program/
GORRY, P., MIGNOT, L. & SABOURAUD, A. (2023). Analysis of the PubMed Commons Post-Publication Peer Review Platform [preprint]. 27th International Conference on Science, Technology and Innovation Indicators (STI 2023). https://doi.org/10.55835/6442f02464eb99f94fe5a307