27th International Conference on Science, Technology and Innovation Indicators (STI 2023)
conference paper

Analysis of the PubMed Commons Post-Publication Peer Review Platform

21/04/2023 | By Philippe Gorry, Léo Mignot and Antoine Sabouraud

The goal of open science is to improve the quality of publications and to overcome the shortcomings of the classic peer review process. Post-Publication Peer Review (PPPR) has been proposed as an alternative. A non-anonymous PPPR platform is of particular interest for studying the dynamics related to the position of commenters in the scientific community. This research-in-progress describes in detail, for the first time, the publications targeted by PPPR comments on PubMed Commons (PMC) and their commenters, in order to better identify the underlying issues. From the original PMC corpus, we extracted a sample of 657 authors who wrote 4514 comments. To run a bibliometric analysis, this sample was matched with the Scopus® database to document the status of the commenters and of the publications. Preliminary results show that the distribution of comments over time reveals some events of intense debate. Most comments are rather short. The number of comments per author follows a Pareto distribution. Commenters are scientists with a high reputation, but there is no correlation between their critical activity and any bibliometric indicator. Finally, we identified only a small fraction of retracted publications. Our results seem to reveal the heterogeneity of the profiles, reflecting a divergent interest in PPPR that is probably related to the researchers' positions in the scientific field and to the respect of the Mertonian norms of the scientific ethos. Further research is currently underway to investigate these characteristics in more detail.


Analysis of the PubMed Commons Post-Publication Peer Review Platform

Philippe Gorry*, Léo Mignot** and Antoine Sabouraud**


ORCID # 0000-0002-5497-8069

Bordeaux School of Economics CNRS 6060, University of Bordeaux, France


Centre Emile Durkheim CNRS 5116, University of Bordeaux, France

1. Introduction

1.1. Post-publication peer review

Peer review, introduced by Nature in the 1950s, involves single- or double-blind criticism of research publications. The evaluation is carried out by pre-selected, anonymous experts; it is confidential, time-consuming, subject to bias, costly and open to fraud (Chambers et al., 2014). The early 2010s were marked by the launch of numerous open peer review projects opposing the norm represented by Science and Nature (Tennant et al., 2017). Platforms promoting open science rely on collaboration to ensure the transparency of reviews and to facilitate communication (Aleksic et al., 2015), and Post-Publication Peer Review (PPPR) is central to this new form of scientific evaluation. While classic peer review happens before publication, PPPR takes place after the release of the manuscript and opens the critique to a larger number of reviewers. Comments are made public and visible, whether on dedicated online platforms, on social media or (more traditionally) through letters to the editor. While its relevance was noted by Gibson (2007), debates between advocates of this practice (Dubois & Guaspare, 2019) and those who condemn it (Blatt, 2015) are constant, and dedicated sites are multiplying. Among them is F1000, which appeared in 2002 and proposed a journal operating on the basis of PPPR. It allows scientists to publish quickly, at a lower cost, and to receive more feedback on their work (Hunter, 2012). Kirkham and Moher (2018) showed that nearly 80% of articles published on F1000Research passed PPPR. This experience is perceived positively by both authors and reviewers. While the majority of researchers are dissatisfied with the current peer review system, the anonymity of reviewers paradoxically seems to be preserved (Ross-Hellauer et al., 2017). Finally, questions remain about the quality of PPPR, especially since Bohannon's sting operation (2013), which showed that a deliberately flawed study could be accepted by many journals. Solutions such as open evaluation (Kriegeskorte, 2012) deserve further study.
PPPR platforms themselves have been studied only a little, despite a few conclusive publications (Vaught et al., 2017; Lane et al., 2018; Dubois & Guaspare, 2019). Studying extensively the content of comments on platforms that have removed anonymity is thus of major interest.

1.2. PubMed Commons

PubMed Commons (PMC) was a PubMed project that served as an interface between publications and their reviewers via signed comments. The platform was launched in October 2013 by the National Institutes of Health and the National Center for Biotechnology Information, and it centralized thousands of comments until its closure was announced in February 2018. PMC gave authors with at least one of their own publications indexed in PubMed the ability to comment on any other article in PubMed, but they could not do so anonymously. Paul Lane's assessment of PMC's impact one year after its launch showed that fewer than 0.05% of the approximately 900,000 publications added to PubMed in 2014 were commented on. The trend continued over time (Lane et al., 2018), and PMC was discontinued in March 2018 due to lack of interest compared with competing platforms such as PubPeer, which allows anonymous critiques. The end of PMC therefore saw intense debate about the importance of anonymity in the PPPR process (Dolgin, 2018).

1.3. Objectives

Although the PMC platform has been the subject of comments in the academic literature, it has not yet undergone any scientometric analysis, even though it has been closed since 2018. This work is a first attempt to describe the publications targeted by PPPR comments on PMC and their commenters. The objective is to explore the characteristics of the publications, the profile of the authors and the nature of the comments, and to measure possible correlations between these three dimensions.

2. Methods

PMC archives were retrieved from the NCBI FTP site. The archive includes the following fields: “Comment ID”, “PubMed ID” (PMID), “Date”, commenter “First and Last name”, and “Comment content”. The corpus contains 7614 comments about 6012 publications, posted by 1905 authors between June 2013 and February 2018. This represents 3.996 comments on average per author, with a median of 1 and a standard deviation of 12.728. At one extreme, 1159 authors wrote a single comment; at the other, one author wrote 247 comments (0.052% of authors). The number of comments per author follows a Pareto distribution (data not shown).

In order to carry out a bibliometric analysis, it was necessary to match the PMC data with a bibliographic database (Scopus®) in order to document the status of the commenters (affiliation, country, research domain, number of publications, number of citations, H-index) and the annotated publications (authors, title, year of publication, journal, number of citations). Matching the publications was straightforward using the PMID, with some exceptions requiring manual cleanup. The process was far more complex for the commenters because of numerous errors (inversion of name and surname, problems with nobiliary particles, first names entered as initials, and homonymy), which required manual matching to disambiguate the names: 58.6% of PMC authors have more than one homonym in Scopus®, and some authors (n=55) have over 100 homonyms. In a first approach, we sampled our population of comments by random selection. As the comments do not follow a normal distribution, we sampled by strata of comments (n=10 strata). After modelling the sample size in each stratum in order to obtain a confidence interval (CI) of 95% and a standard error (SE) of less than 5%, we calculated a minimum sample size of 432 authors. After disambiguation and random sampling, we constructed a sample of 657 authors who wrote 4514 comments (59.28% of all comments), with a CI of 95% and an SE of 3%.
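As a minimal sketch of the stratified sample-size step, the snippet below applies the standard finite-population-corrected formula for a proportion at a 95% CI with a 5% margin of error. The strata sizes are hypothetical placeholders, not the actual PMC strata.

```python
import math

def stratum_sample_size(N, z=1.96, p=0.5, e=0.05):
    """Minimum sample size for a stratum of N authors at confidence z,
    worst-case proportion p, margin of error e (finite-population correction)."""
    num = N * z**2 * p * (1 - p)
    den = e**2 * (N - 1) + z**2 * p * (1 - p)
    return math.ceil(num / den)

# Hypothetical strata (the paper samples by strata of comments, n=10 strata)
strata = {"1 comment": 1159, "2-10 comments": 600, "11+ comments": 146}
sizes = {label: stratum_sample_size(N) for label, N in strata.items()}
total = sum(sizes.values())  # overall minimum sample across strata
```

With the real ten strata, the authors obtain a minimum of 432 sampled authors; the formula above reproduces the logic rather than their exact figures.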

3. Preliminary results

3.1. Publications

The comments cover about 6012 publications and concern different document types, mainly articles (77.88%). We also note the presence of a group of 74 retracted “Article” documents and 19 “Erratum” documents, as well as a group of 375 “Letter”, “Note” and “Editorial” documents (7.11%) that are characteristic of exchanges of views within the scientific community (data not shown). These documents were published in 1670 different journals, and the distribution of the number of publications per journal follows a power-law distribution (data not shown). The journal with the most publications commented on PMC is PLoS ONE, with 198 articles. In contrast, a group of 932 different journals is represented by a single article commented on PMC. In the top 10 journals with the highest number of comments, apart from PLoS ONE (3.76%), we find high-impact medical or scientific journals such as the New England Journal of Medicine (2.39%), the Journal of the American Medical Association (1.78%), Nature (2.11%) and the Proceedings of the National Academy of Sciences (2.05%). On average, publications in the PMC corpus received 165.44 citations, with a median of 22 citations, a standard deviation of 237.20, a minimum of zero citations for 218 publications and a maximum of 186,038 citations. We then built a keyword map of the publications based on the co-occurrence of abstract and author keywords, using VOSviewer, in order to identify the main research topics covered by the PMC publications (data not shown). The analysis yielded 7 clusters: the main topic is “humans”, followed by “animals”, then “middle aged”, “adolescent”, “cohort studies”, “young adult” and finally “neoplasms”. Finally, we examined the sources of research funding reported in the publications. Unfortunately, fewer than half (45.02%) of the publications reported this information. We found 1309 different research funding institutions, the main one being the National Institutes of Health. We also noted the presence of 48 pharmaceutical companies thanked for their financial support in 327 publications on PMC (5.49% of total publications).
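A quick way to check such a power-law claim is to regress log frequency on log rank: an approximately linear relationship with a negative slope is consistent with (though not proof of) a power law. The journal counts below are hypothetical placeholders echoing the figures reported in the text.

```python
import math

def loglog_slope(counts):
    """Slope of an ordinary-least-squares fit of log(count) vs log(rank);
    a steep negative slope suggests a heavy-tailed, power-law-like distribution."""
    ranked = sorted(counts, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

# One dominant journal (198 articles) and a long tail of single-article journals
journal_counts = [198, 90, 60, 45, 30] + [1] * 932
slope = loglog_slope(journal_counts)  # negative for a decaying rank-frequency curve
```

A rigorous test would require maximum-likelihood fitting and goodness-of-fit checks, but the log-log slope is a standard first diagnostic.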

3.2. Commenters

Within the original PMC corpus, we noticed the presence of 19 journal clubs, of which we could unambiguously identify only ten; they mainly belonged to American universities (n=7), along with English, Dutch and Swiss universities. Moreover, we failed to identify 12 commenters in the Scopus® database. Altogether, this represents 182 comments (2.39%), which were subsequently excluded from the analysis. The following descriptive statistics of PMC commenters are based on a sample of 657 authors drawn at random, by strata of 10 comments per author. The commenters in the sample come from 490 institutions in 319 different cities in 55 different countries. The top 2 countries, the United States (US) and the United Kingdom, account for the majority of the comments (40% and 10% respectively). These are, more or less in the same order, the main publishing countries (data not shown). The first city of origin of the commenters is London, followed by east-coast North American cities, including Canadian ones, and notably Paris (data not shown). At the institutional level, most institutions (82.85%) account for only one comment on PMC. We also note the presence of authors affiliated with consulting firms, scientific publishers and pharmaceutical companies, as well as some individual authors without academic affiliation. We characterized the authors' research fields according to the subject areas annotated in Scopus® (data not shown). Since PMC was linked to the PubMed® database, it is not surprising to find that the two main research fields are medicine (32.72%) and biochemistry (49.47%). We also find peripheral fields of research, such as agriculture, engineering, environmental or materials sciences, mathematics, astronomy and the social sciences, each generally accounting for less than 1% of the total comments. We then ran bibliometric statistics for the commenters to measure their scientific interest, their productivity and the impact of their research.
As the reader can see in Table 1, the commenters are established scientists with an H-index of around 37, well connected (hundreds of co-authors), publishing on average more than 100 publications and giving rise to thousands of citations. However, when measured by medians, extreme values and standard deviations, the commenters appear to be very heterogeneous. Since an author can be referenced in the Scopus® database under more than one research area, we calculated the number of such areas as a proxy for subject diversity; it varies between one and 29 according to the subject areas annotated in the database (Table 1: Subject diversity). In addition, we measured the level of the commenters' expertise on the basis of the “Topic Field-Weighted Citation Impact” indicator proposed by Scopus® (Table 1: Top Topics).

Table 1. Descriptive statistics of the commenter’s bibliometric indicators

To further explore the characteristics of the commenters, we ran a bivariate correlation analysis between the number of comments and the different bibliometrics indicators (Table 2).

Table 2. Correlation matrix of the author bibliometric indicators

The Pearson correlation coefficients displayed in the table measure the strength and direction of the relationship between two variables. Notably, there is no significant correlation between the number of comments and any bibliometric indicator, whereas there are strong and significant correlations between the different bibliometric variables. Even though there is no correlation, it is instructive to plot the number of comments against the number of published papers per author (data not shown). A closer look at the data shows a small group of authors who do not publish much but who comment a lot on PMC. We can further note that the author of the largest number of comments (n=247) has published only 8 scientific documents and has no academic affiliation.
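The bivariate analysis can be sketched with a plain Pearson coefficient. The author records below are hypothetical toy data (comments, publications, H-index), not values from the study.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical (n_comments, n_publications, h_index) per author
authors = [(247, 8, 3), (1, 150, 40), (3, 90, 35), (2, 210, 55), (12, 30, 12)]
comments = [a[0] for a in authors]
pubs = [a[1] for a in authors]
hindex = [a[2] for a in authors]

r_comments_pubs = pearson(comments, pubs)  # commenting vs productivity
r_pubs_hindex = pearson(pubs, hindex)      # bibliometric indicators co-vary
```

On real data, significance would also be assessed (e.g. with a t-test on r), which the sketch omits.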


3.3. Comments

The comments constitute a text of 7613 paragraphs with 104 529 lines, for a total of 1 038 669 words. The size of the comments follows a Pareto distribution (data not shown): the majority of comments are less than 50 words long, and a minority exceed 500 words. 42.6% of the comments include an HTML link to a bibliographic reference or a blog, 11.54% refer strictly to a publication referenced in PubMed, and only 3.85% mention both. In total, 7614 comments were posted on PMC, with an annual average of 1725.25. The coefficient of variation and the trend exclude any significant weekly, monthly or quarterly variation (data not shown). However, the daily distribution of comments reveals intense debate on certain dates (Figure 1). While the average number of comments per day is about 4.41 (median of 4) with a standard deviation of about 5.55, there is a maximum peak of 113 comments on December 10, 2014, and another of 104 comments on August 23, 2016. In total, there are 70 days over the observed period with a number of comments above 3 SD.
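The "above 3 SD" peak detection can be sketched as follows; the daily series is a hypothetical toy example with a quiet background and two spikes mimicking the dates mentioned in the text.

```python
import statistics

def peak_days(daily_counts, k=3):
    """Return the days whose comment count exceeds mean + k * SD."""
    values = list(daily_counts.values())
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [day for day, count in daily_counts.items() if count > mean + k * sd]

# Hypothetical series: ~4 comments/day background plus two debate spikes
daily = {f"2014-01-{d:02d}": 4 for d in range(1, 29)}
daily["2014-12-10"] = 113
daily["2016-08-23"] = 104
spikes = peak_days(daily)
```

Note that extreme spikes inflate the SD itself, so on the full series a robust threshold (e.g. median absolute deviation) might flag more days than this mean-based rule.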

Figure 1: Number of comments over the years.

3.4. Subset of comments on retracted publications

Among the corpus of 6012 PMC publications, we could identify 74 retracted publications. These papers were published between 2001 and 2018, mainly before the opening of the PMC platform (Table 3). Retractions occurred from the year PMC opened (2013) onwards, mostly before the platform closed in 2018, but some only as recently as 2022. The average time to retraction was 7 years, with cases of immediate (0 years) and late (17 years) retraction. On average, the retracted papers accumulated 78.58 citations, with a median of 34.5 citations, a standard deviation of 123.31, a minimum of one citation and a maximum of 763 citations.

Table 3. Descriptive statistics of the retracted publications

A third of them were published in top journals, in particular Cancer Research (16.22%), The Lancet (4.05%) and the Proceedings of the National Academy of Sciences of the United States of America (4.05%) (data not shown). The majority of comments on these papers come from two authors: Morten Pedersen Oksvold (134 comments in total) and Ivan Oransky, one of the administrators of Retraction Watch (87 comments). Out of 80 such comments, many are perfectly identical: 17 identical comments mention a series of retracted papers by Fazlul Sarkar, 6 others alert readers to copycat papers, 4 twin comments indicate the retraction of one article, and 4 others that of another article.
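Runs of identical comments like these can be identified with a simple whitespace-normalized grouping; the snippet is an illustrative sketch, and the sample comments are invented.

```python
from collections import Counter

def duplicate_comments(comments):
    """Group verbatim-identical comments after collapsing whitespace,
    returning each repeated text with its number of occurrences."""
    normalized = [" ".join(text.split()) for text in comments]
    counts = Counter(normalized)
    return {text: n for text, n in counts.items() if n > 1}

# Invented sample: one repeated retraction notice among unique comments
sample = [
    "This paper has been retracted.",
    "This  paper has been retracted.",  # extra whitespace, same text
    "Figure 2 duplicates Figure 1 of an earlier paper.",
]
dupes = duplicate_comments(sample)
```

Near-duplicates with small wording changes would require fuzzy matching (e.g. edit distance), which this exact-match sketch does not attempt.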

4. First discussion and further research

This article is a work in progress and presents, for the first time, a preliminary bibliometric exploration of the PMC PPPR platform. Our analysis highlighted the characteristics of the commented publications, of the commenters and of the comments that were exchanged. The commenters are mainly English-speaking men publishing in the field of biomedicine, with a high level of scientific capital. There is no correlation between the bibliometric characteristics of commenters and their total number of comments. A small number of commenters account for a large share of the comments on PMC: some are high-level scientists known for their open science activism; others, surprisingly, are non-academic watchdogs. Regarding the comments, two main characteristics stand out: they are often very short, and a minority seem to have given rise to extensive discussions at certain times. These comments targeted publications in leading journals, and only a small portion of those publications were retracted.

Some explanations for the lack of correlation between bibliometric indicators and involvement in the PPPR process can already be put forward. Possible explanations could be related to: (1) the presence of outliers reinforcing the heterogeneity of the profiles; and (2) a divergent interest in the PPPR process linked to the position of researchers in the scientific field and to the respect for the Mertonian norms of the scientific ethos (Merton, 1973).

Several issues need to be developed, both qualitatively and quantitatively, before strong statements can be made. First, data collection is still underway to gather all PMC commenters and document their profiles with different variables for multivariate analysis. Second, further analysis will be necessary to identify clusters of commenters with similar profiles and interests, using clustering algorithms and semi-structured interviews to explore the divergent PPPR interests. Third, it would be interesting to study the content of the comments in more detail. Given the volume of text involved, it seems more relevant to conduct an analysis based on lexicometrics and sentiment analysis in order to identify the themes covered and the arguments mobilized, and to compare them according to commenters' profiles and types of publication. Another interesting line of research would be to analyze the evolution of the life cycle of publications according to whether or not they are subject to negative comments.


Acknowledgements

We would like to thank Pierre Mirambet for his help with raw-data extraction and Pascal Ragouet for his ongoing intellectual support.

Author contributions

PG supervised the work, scrubbed the data, ran the statistics and wrote the original draft. LM scrubbed the data and wrote the original draft. AS scrubbed the data and wrote the original draft.

Competing interests

Authors have no competing interests.

Funding information

This work is supported by an ANR research grant (project Skeptiscience, grant # ANR-20-CE26-0008) to Michel Dubois (Paris-Sorbonne University).


References

Blatt, M.R. (2015). Vigilante Science. Plant Physiology, 169, 907-909.

Bohannon, J. (2013). Who does peer review? Science, 342, 60–65.

Chambers, C.D., et al. (2014). Instead of “playing the game” it is time to change the rules: Registered Reports at AIMS Neuroscience and beyond. AIMS Neuroscience, 1, 4-17.

Dolgin, E. (2018). PubMed Commons closes its doors to comments. Nature. DOI:10.1038/d41586-018-01591-4.

Dubois, M., & Guaspare, C. (2019). « Is someone out to get me? » : la biologie moléculaire à l'épreuve du Post-Publication Peer Review. Zilsel, 6, 164-192.

Gibson, T.A. (2007). Post-publication review could aid skills and quality. Nature. 448:408.

Hunter, J. (2012). Post-publication peer review: opening up scientific conversation. Front. Comput. Neurosci. 6:63.

Kirkham, J. & Moher, D. (2018). Who and why do researchers opt to publish in post-publication peer review platforms? - findings from a review and survey of F1000 Research. F1000Research, 7, 920.

Lane, P., et al. (2018). Use of PubMed Commons – still not so common? Current Medical Research and Opinion, 34, 27-27.

Merton, R.K. (1973). The sociology of science: Theoretical and empirical investigations. Chicago: University of Chicago Press.

Ross-Hellauer, T. et al. (2017). Survey on open peer review: Attitudes and experience amongst editors, authors and reviewers. PLoS ONE, 12, e0189311.

Tennant, J.P., et al. (2017). A multi-disciplinary perspective on emergent and future innovations in peer review. F1000Research, 6, 1151.

Vaught, M., et al. (2017). A cross-sectional study of commenters and commenting in PubMed, 2014 to 2016: Who's who in PubMed Commons [online].
