Editorial gatekeeping up and down the journal hierarchy

How do journal editors affect what papers and which authors get published? The proposed presentation builds on a novel dataset that provides coverage of all journal editors across the full ecosystem of social science journals. This enables the authors to investigate the variation in how editors affect the publication process across the journal hierarchy.


Purpose and main contributions
How do journal editors affect what papers and which authors get published? We have collected a unique dataset on journal editors that captures the full ecosystem of social science journals over the past decade. By leveraging this dataset alongside bibliometric data from the Web of Science and document embeddings from Semantic Scholar, we are able to assemble a state-of-the-art multiplex representation of the social sciences and address the core issue of how the role and influence of editors vary across types of journals (e.g. generalist vs. specialist) and across the journal hierarchy. As such, the research described here stands to make significant substantive, methodological, and even theoretical contributions to the literature on editorial influence in the science system.

Background
Top scientists play a major role in shaping the evolution of scientific research (Azoulay et al., 2019; Chu & Evans, 2021) and receive sizeable rewards for their contributions (Allison et al., 1982; Merton, 1968; Xie, 2014; Zuckerman, 1977). Opportunities to become a top scientist are limited, however, and they are distributed with clear disparities across any number of social cleavages. One of the primary avenues for joining this elite set of scientists is publishing in top journals. Indeed, many bibliometric studies treat this as the definition of being a top scientist (e.g. Heckman & Moktan, 2020).
How do journal editors, who as scientists are of course also competing for membership in the elite, affect what papers and which authors get published? There are three main concerns here:
• It is possible that editors may use their position to further their own careers and to disadvantage their rivals.
• It may be that editors give preferential treatment to people they know, to the disadvantage of people they do not know.
• Editors may also use their decision-making powers to affect the overall distribution of research in a field by moderating the visibility and impact of specific topics.
Any of these situations may result from intentional actions on the part of editors, but they could equally arise from their unconscious biases.
There is evidence to support each of the enumerated ideas, but the effect sizes are small. Editors have been shown to be more likely to support papers that are closer in topic to their own research areas (Krieger et al., 2021). Editors and reviewers are also more likely to support the publication of papers that are written by academics who are nearby in the collaboration network (Dondio et al., 2019; Ductor & Visser, 2022; Teplitskiy et al., 2018), who have won notable awards (Huber et al., 2022), and who are members of elite professional networks (Crane, 1967; Laband & Piette, 1994).

Research questions
Despite the research discussed in the preceding section, very little work has been done to conceptualize the circumstances where we might expect editorial gatekeeping to be more common, or the circumstances in which we might expect it to be more impactful. This is all the more unfortunate given that there are clear journal selection biases in the existing literature. There are, for instance, a growing number of studies set in non-elite multidisciplinary journals (Dondio et al., 2019; Teplitskiy et al., 2018), along with a fair number of studies on elite generalist journals, though almost entirely in economics (e.g. Colussi, 2018; Ductor & Visser, 2022; Laband & Piette, 1994). There are comparatively few studies of editorial gatekeeping at specialist journals, let alone studies that consider specialist journals at various levels in the journal hierarchy.
This is problematic for our understanding of editorial gatekeeping, as the opportunity for an editor to gatekeep is limited by the number of co-editors that they share editorial control with, and by the number of papers that fall within their area of expertise. Not only do generalist journals have larger editorial teams, but they also publish the widest range of research. The state of our knowledge on editorial gatekeeping may only reflect what happens in those situations where editorial gatekeeping is the least likely.
Just as important as the variation between generalist and specialist journals, but perhaps less problematic, is that editorial gatekeeping almost certainly varies from the top of the journal hierarchy to the bottom. Here again this has to do with the number of people that share editorial control, as top journals are more likely to have large editorial teams. But additionally, one would expect that gatekeeping is more common in situations where the stakes are higher. Why gatekeep a resource that is not so valuable in the first place? While there is evidence of editorial gatekeeping at different points in the journal hierarchy, there has been little to no research that investigated the issue systematically.
With this in mind, we pursue two fundamental clarifying questions regarding editorial gatekeeping:
1. How does gatekeeping vary between generalist and specialist journals in science?
2. How does gatekeeping vary from the top of the journal hierarchy to the bottom, among generalist journals on the one hand, and among specialist journals on the other?

Data
Our project will provide the most substantial evidence to date relating to an editor's effect on the publication process. We have collected the most comprehensive longitudinal dataset of journal editors in the social sciences, with roughly 3000 editors at around 1000 journals. This allows us to examine the level of editorial influence in elite journals, specialist journals, and the broader mass of journals in each social science field. The breadth of the data means that we can also provide some of the first looks at editorial gatekeeping in the social sciences outside economics.
The best comparison point for our data is a recent piece by Liu et al. (2023), which collected editor names from the Elsevier API to cover more than 1000 journals across 15 disciplines, and over multiple decades. While they do not directly assess editorial gatekeeping, the sheer scale of their data is useful in explaining the virtues of our own dataset. As they point out in their paper, these journals publish a staggering 20% of the research across all of science. But this is also the main weakness of the data. They are only able to capture editor data from Elsevier journals, which by the same figure leaves the remaining 80% of scientific research uncovered. The uncovered portion notably includes many top journals.
What we have done instead is to take a more labor-intensive route to improve coverage of our fields of interest, which in our case means the social sciences. We scraped 15 years of (English) Wikipedia pages for every journal in the social science category and recorded editor names. These data cover nearly all English-language social science journals, as well as a number of non-English social science journals. To ensure that the resulting data were accurate, we used a team of research assistants to find CV information for each editor and extract the start and end dates for their term at their journal.
Data collection is nearly complete. Our research assistants were able to identify the start and end dates for roughly 80% of the editors in the starting dataset. Note that this accounting includes "failures" for long-defunct journals that nonetheless have Wikipedia pages. The research assistants were assigned journal-editor pairs at random, with 25% of their rows set to overlap amongst them. If we consider the rows that were assigned to more than one research assistant, roughly 94% of the estimated start and end years fall within +/-1 year of each other, suggesting that the intercoder reliability for our approach is high.
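The agreement check described above can be illustrated with a minimal sketch, written in Python purely for illustration. The function name, the data layout, and the toy values are all hypothetical and are not taken from the project's data or code; the sketch only assumes that doubly-coded rows can be grouped by journal-editor pair and date type.

```python
from itertools import combinations


def within_tolerance_rate(codings, tolerance=1):
    """Share of doubly-coded year estimates that agree within +/- tolerance years.

    `codings` maps a (journal, editor, date_type) key to the list of years
    recorded for it by different research assistants (hypothetical layout).
    """
    agree = 0
    total = 0
    for years in codings.values():
        # Rows coded by a single assistant yield no pairs and are skipped.
        for a, b in combinations(years, 2):
            total += 1
            if abs(a - b) <= tolerance:
                agree += 1
    return agree / total if total else float("nan")


# Hypothetical toy data, not drawn from the actual dataset.
toy = {
    ("Journal A", "Editor X", "start"): [2012, 2013],  # within one year
    ("Journal A", "Editor X", "end"): [2016, 2016],    # exact match
    ("Journal B", "Editor Y", "start"): [2009, 2012],  # disagreement
    ("Journal B", "Editor Y", "end"): [2015],          # coded only once
}
rate = within_tolerance_rate(toy)
```

In this toy example two of the three comparable pairs agree, so the rate is two thirds. A tolerance-based rate of this kind is a simple descriptive summary rather than a chance-corrected reliability coefficient.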
To make full use of this data we are leveraging the Web of Science with state-of-the-art author disambiguation to assemble multiple measures of the distance between authors and editors.

Analytical choices
Using a series of field-specific relational event models (de Nooy, 2011; Quintane et al., 2014; Schecter & Quintane, 2021) we will report on the likelihood that someone gets published in a given journal, conditional on their distance from the current editor(s). The advantage of using this approach over standard linear models, hierarchical or otherwise, is that relational event models better account for the edgewise interdependencies in the publication process.
At the heart of these models will be a multiplex network of the social scientific system. Like existing studies, we allow researchers to be linked to papers (which further defines coauthorship), papers to be linked to journals, and editors themselves to be linked to journals, while for the sake of simplicity we represent distances to a given journal in terms of the distance from researchers to the journal's editor(s). To better account for the relational structure in science, which is not fully captured by the collaboration network alone, we further incorporate the university affiliation network to measure distance.
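One way to picture a distance that draws on both collaboration and affiliation ties is a shortest path over the union of the two layers. The Python sketch below is only an illustration under that assumption: the function name, the layer names, and the toy ties are hypothetical, affiliation is simplified to direct person-to-person ties between people at the same university, and the measure used in the actual models may be defined differently.

```python
from collections import deque


def multiplex_distance(layers, source, target):
    """Fewest ties from source to target when ties from any layer may be used.

    `layers` maps a layer name to a list of undirected (node, node) ties.
    Returns None when the target cannot be reached.
    """
    # Merge every layer into a single adjacency map.
    neighbors = {}
    for ties in layers.values():
        for u, v in ties:
            neighbors.setdefault(u, set()).add(v)
            neighbors.setdefault(v, set()).add(u)

    # Breadth-first search gives the shortest path in an unweighted graph.
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Hypothetical toy network: the author reaches the editor only by combining layers.
toy_layers = {
    "coauthorship": [("author", "colleague")],
    "affiliation": [("colleague", "editor")],  # same-university tie (simplified)
}
combined = multiplex_distance(toy_layers, "author", "editor")
coauthor_only = multiplex_distance(
    {"coauthorship": toy_layers["coauthorship"]}, "author", "editor"
)
```

Here the author is two steps from the editor in the combined network but unreachable through coauthorship alone, which is the kind of connection a single-layer distance would miss.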
We also introduce a few methodological novelties to maximize the usefulness of our data. In the first place, we aim to use a truly multiplex measure of distance in science, allowing researchers to be linked to a given editor across any combination of collaborative and affiliation ties. We also make a key methodological improvement on existing studies by controlling for the semantic similarity between authors' research and editors' research. This is important given that editors tend to specialize in the core topics of their journals, meaning that close collaborative ties may be mistakenly counted toward gatekeeping even in cases when they simply represent a close fit between the authors' work and the journal in question. We tackle this issue by linking papers to their SPECTER document embeddings from Semantic Scholar and taking the distance between papers and the typical paper in an editor's corpus, or the typical paper in a journal's corpus.
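The semantic control can be sketched as the cosine distance between a paper's embedding and the centroid of a corpus. The Python example below is a minimal illustration, assuming that the "typical paper" is the element-wise mean of the corpus embeddings and that cosine distance is the metric; both are assumptions rather than details confirmed here, and the three-dimensional vectors are toy stand-ins for real document embeddings, which are much longer.

```python
import math


def centroid(vectors):
    """Element-wise mean of equal-length vectors (the assumed 'typical paper')."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_distance(u, v):
    """1 minus cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical toy embeddings, not real Semantic Scholar vectors.
editor_corpus = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
on_topic_paper = [1.0, 1.0, 0.0]
off_topic_paper = [0.0, 0.0, 1.0]

typical = centroid(editor_corpus)
d_close = cosine_distance(on_topic_paper, typical)
d_far = cosine_distance(off_topic_paper, typical)
```

The on-topic paper points in the same direction as the centroid, so its distance is approximately zero, while the off-topic paper is orthogonal to it and has a distance of one. The same two functions apply unchanged when the corpus is a journal's papers rather than an editor's.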
Finally, we also allow the entire network to vary over time. Some studies have done this in the past with collaborative distance, but we allow every feature to vary over time. Researchers are represented by their most recent paper(s), and all distance measures are time-dependent.

Discussion
The research described above is ongoing. We are very close to being finished with data collection and cleaning, and network distances are in the process of being computed. We will begin fitting models in the next 2-3 weeks. As such, we cannot report on any results at this point in time.
It is also worth noting that there are two major limitations to our research plan. The most significant limitation is that we do not have access to reviewers, nor can we pair them to the papers they evaluated. Editors of course can and do wield power over the publication process, but it is typically less direct than that of reviewers. Another challenge is that we cannot disentangle self-selection by authors themselves from editorial decision-making. Researchers choose which journals to submit to at least in part on the basis of the identity of the current editor, though this almost certainly varies dramatically across (sub)fields. This means that while it is entirely reasonable to speak of the effect that editors have on a person's likelihood of publishing in a given journal, our design cannot conclusively establish that this effect owes to editorial gatekeeping specifically.
Still, we are optimistic about our paper's potential. On the methodological side, the data we have collected provides an ecosystem-level view on editors and academic publishing, allowing us to tackle a unique set of research questions. We are also aiming to maximize the potential of this data by incorporating a careful and cutting-edge approach to constructing the underlying network and to modeling it. Our research will further make a number of substantive contributions. It will (1) provide a rare look at publishing practices in social sciences other than economics; (2) highlight variation in how editors affect publishing across several social science fields; and (3) document variation in terms of how editors affect publication across the hierarchy of journals within a field.

Open science practices
The data used in this project are only partially open at this time. Wikipedia data are freely available through their API, and our curated dataset will be made available at the time of publication. The document embeddings we use from Semantic Scholar are freely available for download through their API. We further use an augmented bibliometric Web of Science database from CWTS at Leiden University, which is available on a subscription basis. Analysis is done using open-source packages in R. Replication code will be made available at the time of publication.

Author contributions
Andrew C. Herman contributed to the conceptualization, the methodology, and the writing, performed the formal analysis, and created the visualizations.