
Normalization of rare citation events in the context of uptake of research in the non-scientific literature

Henrique Pinheiro*, Etienne Vignola-Gagné** and David Campbell***

* h.pinheiro@elsevier.com

0000-0002-2175-7518

Science-Metrix and Analytical and Data Services, Elsevier, Montréal, Canada and Amsterdam, the Netherlands

** e.vignola-gagne@elsevier.com

0000-0002-4948-4363

Science-Metrix and Analytical and Data Services, Elsevier, Montréal, Canada and Amsterdam, the Netherlands

*** d.campbell@elsevier.com

0000-0003-3806-3237

Science-Metrix and Analytical and Data Services, Elsevier, Montréal, Canada and Amsterdam, the Netherlands

Abstract

The citation uptake of research papers in the non-scientific literature is often sparse. It is thus frequently reported as a proportion of cited papers instead of as an average of the papers' citation counts. Citation-based indicators are commonly normalized by dividing a paper's citation count (or binary score; 0 = not cited, 1 = cited) by the world average (or proportion) in the corresponding year, field and document type. Such a ratio-based method can generate outliers when dealing with binary scores. At low aggregation levels, these outliers can produce unreliable results. Here, a ratio-based method is compared to one in which the world's proportion is subtracted from the papers' scores, using a set of universities as units of analysis. This difference-based method has two main advantages: interpretation of results is more transparent and straightforward, and outliers are less problematic, leading to narrower confidence intervals.

1. Introduction

Citations of research publications within the peer-reviewed literature are widely used as markers of an entity's (e.g., a country's, institution's or researcher's) scientific influence/impact. To enable proper comparisons across entities, the common practice is to divide each paper's citation count by the world average in the corresponding year, field and document type prior to averaging the normalized scores of an entity's papers. This ratio-based method is commonly referred to as field normalization and aims to control for confounding factors that can influence the extent to which an entity's publications get cited, beyond the papers' own performance (Waltman & van Eck, 2018).

With the advent of several alternative sources tracing the uptake of research publications beyond academia, a new range of citation-based indicators emerged. These indicators commonly quantify uptake as the proportion of cited papers, instead of as an average of their citation counts, to cope with the scarcity of several types of altmetric citations. Two such indicators making use of ratio-based normalization are the Equalized Mean-based Normalized Proportion Cited (EMNPC) and the Mean-based Normalized Proportion Cited (MNPC) (Thelwall, 2017). The EMNPC applies equal weights to all normalization strata, regardless of how an entity's papers are distributed across them. As noted by Thelwall, issues associated with this strategy would require the exclusion of small groups, with no clear guideline for such a procedure. The EMNPC is therefore not considered further in this paper. The MNPC weights each paper equally, in the same manner as the well-known Mean-Normalized Citation Score (MNCS).

After several rounds of experimentation with the MNPC in an applied evaluation context, the authors concluded that ratio-based normalization of binary citation counts can generate outliers due to the rarity of citation events for some altmetrics. For example, if an entity has papers in two normalization strata (A and B) with different world proportions (0.5% for stratum A and 10% for stratum B), the ratio-normalized score of cited papers will be 200 (1/0.005) in stratum A and 10 in stratum B. In such an example, even a single cited paper in stratum A could drastically influence the score of this entity.

More recently, new ratio-based alternatives to MNPC have been proposed, such as the Mantel-Haenszel Row Risk Ratio (MHRR) (Smolinsky, Klingenberg, & Marx, 2022), an improved version of the Mantel-Haenszel quotient (MHq) (Bornmann & Haunschild, 2018). Whereas the MNPC weights each stratum in proportion to its presence in the output of a given entity, the MHRR/MHq gives more weight to strata in which the citation event is more common, which effectively reduces the impact of outliers. The MHRR effectively converts to the MNPC when the weights of the latter are applied to the former. However, the MHRR's weighting scheme adds complexity for interpretation by decision makers. Moreover, due to their respective normalization procedures, neither the MNPC nor the MHRR can be directly connected to an entity's raw proportion of cited papers calculated over the pooled normalization strata.

In this paper, a difference-based approach is introduced that effectively deals with the problem of outliers while enabling an intuitive interpretation of the data that directly connects an entity’s actual (not normalized) proportion of cited papers and the normalized score. This is achieved by subtracting the world proportion in the corresponding stratum of a paper from its binary citation score. The paper discusses the relative strengths of this approach by comparing it with MNPC, relying, as an example, on the uptake in the policy-relevant literature (UPRL) data for a selected group of British and Indian academic institutions. The authors intend to subsequently add MHRR to the comparison.

2. Methods

Scopus was used to extract the peer-reviewed publications (articles, conference papers, reviews) of a set of entities between 2016 and 2020. Overton was matched to Scopus using DOIs to uncover which of the retrieved papers were cited in the policy-relevant literature. The average of the binary UPRL variable for an entity e’s papers gives its (not normalized, or raw) share of cited papers in the policy literature (\(p_{e}\)).
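As a minimal sketch of this matching and flagging step in Python (the DataFrames, column names and DOI values are hypothetical stand-ins for illustration, not the authors' actual pipeline):

```python
import pandas as pd

# Hypothetical extracts: one row per Scopus paper of an entity, and the
# set of DOIs cited in Overton policy documents. Values are illustrative.
scopus = pd.DataFrame({
    "doi": ["10.1000/a", "10.1000/b", "10.1000/c"],
    "year": [2016, 2018, 2020],
    "doc_type": ["article", "review", "conference paper"],
})
overton_dois = {"10.1000/b", "10.9999/x"}

# Binary UPRL flag: 1 if the paper's DOI was cited in Overton, else 0.
scopus["uprl"] = scopus["doi"].isin(overton_dois).astype(int)

# The entity's raw (not normalized) share of cited papers, p_e.
p_e = scopus["uprl"].mean()
print(f"p_e = {p_e:.3f}")  # 1 of 3 papers cited -> 0.333
```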

For this analysis, all British (187) and Indian (581) universities with more than 30 papers and at least one paper cited in Overton were selected, to explore the reliability of the proposed approach across institutions of differing sizes. The Overton coverage of British and Indian policy documents differs drastically (Szomszor & Adie, 2022), resulting in a diverse set of institutions in terms of their share of cited papers.¹

Two methods were applied to normalize the binary UPRL data by year, subfield (using the Science-Metrix journal-based classification, whereby multidisciplinary journals were reclassified at the paper level) and document type. The first (ratio-based) method leads to MNPCR (MNPC as in Thelwall, 2017; the subscript R is added in reference to the ratio-based method). First, the paper-level ratio of paper i (ri) is calculated as follows:

\[r_{i} = \begin{cases} 0 & \text{if } c_{i} = 0 \\ \dfrac{1}{p_{f_{i}}^{w}} & \text{if } c_{i} > 0 \end{cases}\]

Where,

  • \(c_{i}\) is the number of UPRL (or any other altmetric) citations of paper i

  • \(p_{f_{i}}^{w}\) is the proportion of the world's papers cited in the same year, subfield and document-type combination \(f\) as paper i.

It follows that MNPCR for a given entity e equals:

\[{MNPC}_{R}^{e} = \frac{\sum_{i = 1}^{n}r_{i}}{n}\]

Where,

  • n is the number of papers of entity e.

The second (difference-based) method is introduced and defined as follows. First, the paper-level difference of paper i (di) is calculated:

\[d_{i} = \begin{cases} 0 - p_{f_{i}}^{w} & \text{if } c_{i} = 0 \\ 1 - p_{f_{i}}^{w} & \text{if } c_{i} > 0 \end{cases}\]

with \(c_{i}\) and \(p_{f_{i}}^{w}\) defined as above.

It follows that MNPCD (expressed in percentage points) for a given entity e equals:

\[{MNPC}_{D}^{e} = \frac{\sum_{i = 1}^{n}d_{i}}{n}\]

The world’s share of papers cited is weighted to reflect the distribution of entity e’s papers across normalization strata (also called synthetic world levels here) as follows:

\[p_{w}^{e} = \frac{\sum_{i = 1}^{n}p_{f_{i}}^{w}}{n}\]

Note that \({MNPC}_{D}^{e}\) corresponds to the difference between the (not normalized, or raw) share of cited papers of entity e (\(p_{e}\)) and \(p_{w}^{e}\). This property of the difference-based method makes for a simple and intuitive presentation/interpretation of results that directly connects an entity’s raw share with the normalized difference to the world level. The above indicators were computed using full counting.
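To make the two normalizations concrete, the following Python sketch computes ri, di, MNPCR, MNPCD and \(p_{w}^{e}\) for a toy entity. The citation flags and world shares are invented (the world shares deliberately echo the strata A and B example from the introduction); this is an illustration, not the authors' production code.

```python
import pandas as pd

# Illustrative paper-level data for one entity: a binary citation flag
# c_i and the world share p_{f_i}^w of the paper's normalization stratum
# (year x subfield x document type). Values are made up for the sketch.
papers = pd.DataFrame({
    "cited": [1, 0, 0, 1, 0],                          # c_i
    "world_share": [0.005, 0.005, 0.10, 0.10, 0.10],   # p_{f_i}^w
})

# Ratio-based score r_i: 0 if not cited, 1/p_{f_i}^w if cited.
# Because c_i is 0/1, a plain division implements both branches.
papers["r"] = papers["cited"] / papers["world_share"]

# Difference-based score d_i: c_i - p_{f_i}^w, bounded between -1 and +1.
papers["d"] = papers["cited"] - papers["world_share"]

mnpc_r = papers["r"].mean()           # MNPC_R (world level = 1)
mnpc_d = papers["d"].mean()           # MNPC_D (world level = 0)
p_e = papers["cited"].mean()          # raw share of cited papers
p_w_e = papers["world_share"].mean()  # synthetic world level p_w^e

# MNPC_D decomposes as the raw share minus the synthetic world level.
assert abs(mnpc_d - (p_e - p_w_e)) < 1e-12
print(mnpc_r, mnpc_d)  # 42.0 and 0.338 for this toy data
```

Note how the single cited paper in the 0.5% stratum (r = 200) dominates MNPCR here, while MNPCD remains bounded.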

3. Results and Discussion

3.1 Distribution of paper-level scores using ratio-based and difference-based normalization

Table 1 and Table 2 present the distribution of paper-level scores normalized using the ratio-based and the difference-based method, respectively, for all papers in Scopus. For a small portion of papers (~2%), the ratio-based method led to scores (> 10) far above the world level of 1. For entities with few papers, just a few such high scores could drastically change the average score. This is not the case with the difference-based method, in which scores are bounded between -1 and +1. With the ratio-based method, all non-cited papers receive the same score (0), regardless of their corresponding world-level share (\(p_{f}^{w}\)). With the difference-based method, the scores of non-cited papers are lower if they belong to normalization strata with higher world-level shares (\(p_{f}^{w}\)).

3.2 Interpretation of results using MNPCR versus MNPCD

Table 3 presents, for each of the top 10 institutions by number of papers (among those selected), the raw share of papers cited in policy-relevant documents \((p_{e})\) and the shares normalized using the ratio-based (MNPCR) and difference-based (MNPCD) methods. It also presents the world's share of cited papers weighted to reflect the distribution of each entity's papers across normalization strata (\(p_{w}^{e}\)), and the ratio of \(p_{e}\) to \(p_{w}^{e}\) to facilitate comparisons with MNPCR.

This table illustrates a key strength of difference-based normalization over the ratio-based method as pertains to the interpretation of results. Based on MNPCR, University College London scores 145% above the world average. Using the corresponding score from the difference-based method (MNPCD = 6.7 percentage points (pp) above world level), one can provide more information to support interpretation, because the measured difference allows the direct juxtaposition of the entity's raw share (12.2%) with its synthetic equivalent at world level (5.56% = 12.21% - 6.65%; data are rounded in the table). This is relevant because normalized scores carry different significance depending on the baseline, i.e., the share at world level; being 145% above world level would lead to different conclusions if the reference is 1%, 5% or 20%, for example. It is also then possible to express the finding as a ratio (2.20 = 12.2/5.6), which in this case gives a smaller difference relative to world level than MNPCR does (121% above compared to 145%). Note that MNPCR does not permit such a juxtaposition between the raw shares and their world equivalent.

3.3 Alignment between MNPCR and MNPCD and stability of scores with and without outliers

Table 4 presents the share of institutions with convergent MNPCR and MNPCD, as defined by the score of one method not exceeding the other by more than 20%, by bin of institution size. The results converged for 86% of the institutions in the bin with at least 10,000 papers. For bins with fewer than 5,000 papers, the results converged in less than 60% of the cases, suggesting that the choice of normalization method may affect findings in bibliometric assessments of smaller institutions.

Note: Both methods were compared based on the columns MNPCR and \(p_{e}/p_{w}^{e}\) as displayed in Table 3. \(p_{e}/p_{w}^{e}\) was used as it provides a direct way to express MNPCD as a ratio to the world level and, therefore, a viable approach to comparing MNPCD and MNPCR.

Share of convergent results + Share of divergent results = 100%.
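A plausible implementation of this convergence test in Python (the precise comparison rule is only summarized above, so the details here are assumptions):

```python
def convergent(mnpc_r: float, p_e: float, p_w_e: float,
               margin: float = 0.20) -> bool:
    """Classify an institution as convergent under the 20% rule.

    MNPC_D is first re-expressed as a ratio to world level (p_e / p_w^e),
    so that both scores are on the same scale; the pair is convergent if
    neither score exceeds the other by more than `margin`.
    """
    ratio_d = p_e / p_w_e
    hi, lo = max(mnpc_r, ratio_d), min(mnpc_r, ratio_d)
    return hi <= lo * (1 + margin)

# Example with the Table 3 figures for University College London:
# MNPC_R = 2.45 vs p_e/p_w^e = 2.20 -> 2.45 <= 2.20 * 1.2 -> convergent.
print(convergent(2.45, 0.122, 0.0556))  # True
```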

The following two columns on divergent cases show that scores normalized using the ratio-based method considerably exceeded those based on the difference-based method more commonly than vice versa. This uneven distribution in the direction of divergent scores is linked to the outliers generated by the ratio-based method, which are located only in the right tail of the distribution (Table 1). For smaller institutions, due to their limited number of papers, outliers will not always materialize. In such cases, the ratio-normalized scores will be more influenced by non-cited papers than difference-normalized scores. This happens because the ratio-based method assigns a score of zero, the minimum possible score in this method, to all non-cited papers, while the difference-based method allows for more nuanced scores for non-cited papers. As the number of papers increases, outliers are more likely to materialize, skewing the scores of these institutions. The last column shows the share of cases in which an observed change of sign (i.e., one of the indicators is above the world level while the other falls below it) was considered relevant according to the margins displayed in the table. Relevant changes of sign are more common among smaller institutions. For institutions with more than 1,000 papers, these discrepancies occur for less than 7% of institutions.

Table 5 presents 10 institutions with divergent MNPCR and MNPCD characterized by a relevant change of sign. They were randomly selected to include two institutions from each of the four bins spanning 30 to 4,999 papers, and one from each of the remaining bins. With one exception, MNPCR was above world level while MNPCD was below it, reflecting the distribution observed for the 72 institutions with a relevant change of sign (in 58 of them, MNPCR was above world level). Table 5 also presents scores after exclusion of outliers, which show that the effect of excluding outliers was more pronounced on MNPCR than on MNPCD. In only two cases (3 and 6) did MNPCR remain above world level after removing the outliers.

The potential effect of outliers is well illustrated by RK University. In that case, the exclusion of 1 cited paper (out of 4) moved its MNPCR from 2.14 to 0.87. This paper is from a normalization stratum (a conference paper in Networking & Telecommunications from 2017) whose world share of cited papers (\(p_{f}^{w}\)) is 0.35% (rounded). The ratio-normalized score of this paper (ri), the reciprocal of its stratum's world share, is thus 277. With all its papers included, this institution has a 1.8% share of cited papers (4 out of 217), below its synthetic equivalent at world level (\(p_{w}^{e}\) = 2.4%, not displayed in the table). This indicates that its MNPC should be below world level, which, when outliers are kept, is only achieved with the difference-based normalization (MNPCD).

Note: To remove outliers, all papers from normalization strata containing the highest ratio-normalized scores (ri) of each institution were excluded. For each institution having 10,000+ papers, the strata containing any of its top 5 papers based on ri were excluded; for institutions with 1,000 to 9,999 papers, the strata containing any of its top 3 papers were removed. For smaller institutions, a single normalization stratum was excluded based on an institution’s top ri score. This rule was applied to test the sensitivity of the two methods to outliers and is not intended as a general method for processing outliers.
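A sketch of this size-dependent exclusion rule in Python (the column names 'r', for the paper-level ratio score ri, and 'stratum' are assumptions for illustration):

```python
import pandas as pd

def drop_outlier_strata(papers: pd.DataFrame) -> pd.DataFrame:
    """Apply the size-dependent stratum-exclusion rule from the note above.

    Expects one row per paper of a single institution, with columns
    'r' (ratio score r_i) and 'stratum' (year x subfield x document type).
    """
    n = len(papers)
    # 10,000+ papers: strata of the top 5 papers by r_i;
    # 1,000-9,999: top 3; smaller institutions: top 1.
    top_k = 5 if n >= 10_000 else 3 if n >= 1_000 else 1
    flagged = set(papers.nlargest(top_k, "r")["stratum"])
    # Remove *all* papers belonging to the flagged strata, not just the
    # top-scoring papers themselves, before recomputing the indicators.
    return papers[~papers["stratum"].isin(flagged)]
```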

3.4 Reliability of MNPCR versus MNPCD at different scales

Simulations were also used to assess the sample size needed to accurately estimate an entity's true position relative to the world using MNPCR versus MNPCD. Four large institutions with similar MNPCR and MNPCD scores were selected (two below world level and two above). Each entity's position in relation to the world (i.e., above or below) is taken as the true parameter to be estimated across different sample sizes. For each sample size, MNPCR and MNPCD were computed for 2,000 random samples (drawn with replacement). Table 6 reports the share of scores obtained in these random samples that agree with the corresponding population parameter as regards position relative to the world.
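The resampling procedure can be sketched as follows in Python (column names and implementation details beyond the 2,000 draws with replacement are assumptions):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

def agreement_shares(papers: pd.DataFrame, true_above_world: bool,
                     sample_size: int, n_draws: int = 2000) -> dict:
    """Share of bootstrap samples placing the entity on the same side of
    the world level as the full population does, for each indicator.

    Expects columns 'cited' (c_i, 0/1) and 'world_share' (p_{f_i}^w).
    """
    hits_r = hits_d = 0
    for _ in range(n_draws):
        # Draw a sample of papers with replacement.
        idx = rng.integers(0, len(papers), size=sample_size)
        s = papers.iloc[idx]
        mnpc_r = (s["cited"] / s["world_share"]).mean()
        mnpc_d = (s["cited"] - s["world_share"]).mean()
        # World level is 1 on the ratio scale and 0 on the difference scale.
        hits_r += (mnpc_r > 1.0) == true_above_world
        hits_d += (mnpc_d > 0.0) == true_above_world
    return {"MNPC_R": hits_r / n_draws, "MNPC_D": hits_d / n_draws}
```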

Table 6 shows that the difference-based normalization produces findings that are more likely to align with the true population parameter for most sample sizes. The exceptions concern the institutions with true population parameters below the world level at sample sizes ranging from 30 to 100, where MNPCR agrees with the population parameters more often than MNPCD. Note that the convergence of MNPCR decreases as sample sizes increase from 30 to 300, after which convergence increases with sample size. While this may seem counterintuitive, it is likely explained by the fact that, in smaller samples, the extreme outliers in the distribution of ratio-based paper-level scores (ri) may, because they are rare, appear less frequently. For intermediate-sized samples, outliers become more likely to be selected, producing more scores that are above the world level even though the population parameter is below it. Finally, above a certain sample size, the volume of papers dilutes the effect of outliers, and the observed scores start to converge to the true population parameter.

4. Conclusion

4.1 Highlights

  • MNPCD discriminates between non-cited papers across normalization strata with differing world shares of cited papers, whereas MNPCR cannot.

  • MNPCD, compared to MNPCR, permits juxtaposing an entity's raw share with its world equivalent, in support of a more nuanced interpretation of observed differences.

  • Cases of divergence between MNPCD and MNPCR were shown to be due to outliers in the ratio-based paper-level scores (ri) used to compute MNPCR. Accordingly, the difference-based method appears more reliable, especially for smaller institutions.

4.2 Limitations of the study

Our conclusions are relevant for institutions of any size, but pending further validation we believe continued caution is warranted in the use of any normalization method when dealing with smaller institutions, especially when a substantial fraction of their papers come from subfields in which the shares of cited papers at world level are low. Computing confidence intervals for difference-normalized scores could help mitigate this limitation.

Regional coverage biases are documented in many altmetrics sources and should be taken into account within any comparative analysis.

4.3 Future directions

The authors intend to incorporate MHRR in a subsequent version of this paper for a more inclusive assessment of the performance of the proposed difference-based method.

References

Bornmann, L., & Haunschild, R. (2018). Normalization of zero-inflated data: An empirical analysis of a new indicator family and its use with altmetrics data. Journal of Informetrics, 12(3), 998–1011. https://doi.org/10.1016/J.JOI.2018.01.010

Smolinsky, L., Klingenberg, B., & Marx, B. D. (2022). Interpretation and inference for altmetric indicators arising from sparse data statistics. Journal of Informetrics, 16(1), 101250. https://doi.org/10.1016/J.JOI.2022.101250

Szomszor, M., & Adie, E. (2022). Overton -- A bibliometric database of policy document citations. ArXiv. https://doi.org/10.48550/arxiv.2201.07643

Thelwall, M. (2017). Three practical field normalised alternative indicator formulae for research evaluation. Journal of Informetrics, 11(1), 128–151. https://doi.org/10.1016/J.JOI.2016.12.002

Waltman, L., & van Eck, N. J. (2018). Field normalization of scientometric indicators. In W. Glänzel, H. F. Moed, U. Schmoch, & M. Thelwall (Eds.), Springer Handbook of Science and Technology Indicators (pp. 281–300). Springer. Retrieved from http://arxiv.org/abs/1801.09985

Open science practices

The Science-Metrix team at Elsevier uses proprietary implementations of the Scopus and Overton databases in its work. Access to Scopus data for research purposes can be requested by the academic community at ICSR Lab: https://www.elsevier.com/icsr/icsrlab.

Acknowledgments

The authors thank Fei Shu for input on an earlier draft of this manuscript.

Author contributions

Henrique Pinheiro: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Supervision, Validation, Writing—original draft, Writing—review & editing. Etienne Vignola-Gagné: Conceptualization, Investigation, Methodology, Writing—original draft, Writing—review & editing. David Campbell: Conceptualization, Formal analysis, Investigation, Methodology, Project administration, Supervision, Writing—original draft, Writing—review & editing.

Competing interests

The authors are employees of Elsevier B.V.

Funding information

Not applicable.


  1. Although not the study's focus, this selection also helped to highlight that coverage issues should be considered in selecting an appropriate reference for normalization. Using the difference-based approach (see below) for UPRL, the average rank, among selected entities, is 177 for British universities and 551 for Indian universities. In the case of India and UPRL, normalization against the national level might make more sense, with comparisons to other countries relying on within-country ranks.
