27th International Conference on Science, Technology and Innovation Indicators (STI 2023)
conference paper

Why shouldn't university rankings be used to evaluate research and researchers?

20/03/2023 | By Dmitry Kochetkov


Why Shouldn't University Rankings Be Used to Evaluate Research and Researchers?

Dmitry Kochetkov*,**

Centre for Science and Technology Studies, Leiden University, Netherlands


Department of Scientific & Information Development and Library Support,
Russian Presidential Academy of National Economy
and Public Administration (RANEPA), Russia

We address the question of why global university rankings should not be used for research evaluation. To answer this question, we analyze four groups of literature (academic vs non-academic and English-language vs Russian-language literature). The analysis shows that most researchers agree that rankings should not be used to evaluate research. However, they are still used for these purposes, directly or indirectly, although recent developments give hope that the situation will change in the near future.

1. Introduction

Rankings have long been used in many fields of human activity, and university rankings are no exception. Global university rankings, however, are a relatively new phenomenon, dating back to the Academic Ranking of World Universities (ARWU) in 2003. Rankings may have been originally conceived as a marketing and benchmarking tool, but they quickly penetrated the realm of policy and research evaluation. The appearance of the first global ranking was prompted by the implementation of excellence initiatives in China (Projects 211 and 985). In turn, ARWU stimulated the emergence of excellence initiatives in Europe (Exzellenzinitiative in Germany, Initiatives d'excellence in France, etc.). Thus, global university rankings have become a de facto tool for research evaluation and funding allocation. The idea of "one-button" evaluation looked too attractive to governments and university administrators.

Since their emergence, global university rankings, or rather their use in policy initiatives, have been the subject of heated debate. Huang (2012) provided an overview of the debate around the QS World University Rankings (QS). Iesbik Valmorbida, Rolim Ensslin, Ensslin, & Ripoll-Feliu (2016) identified 20 university rankings and the major areas of criticism. These two reviews provide a good starting point for our research. At the same time, we do not merely supplement them with more recent literature; we also aim to make our study more focused. The goal of our study is to answer the question "Why shouldn't university rankings be used to evaluate research?" on the basis of a systematic literature review. It may seem strange that we mention the evaluation of individual researchers in the title of this article, even though university rankings evaluate institutions as a whole. However, university rankings have a direct impact on assessment procedures within organizations, including the evaluation of researchers and faculty. We analyse relevant literature in English, identified using Web of Science, as well as Russian-language literature. The choice of Russia is not accidental in light of the objectives of the study. In 2013–2020, the Russian government implemented an excellence initiative called Project 5top100. The very name of the initiative stands for "five Russian universities should get into the top 100 of global university rankings"; thus, rankings played an exceptional role in this project. The review is supplemented by an analysis of grey (non-academic) literature.

2. Method

To search for relevant literature in English (with very few Spanish and Portuguese documents), we used Web of Science with the search query "ranking* NEAR/2 university AND ("research evaluation" OR "research assessment" OR "research performance" OR "research quality" OR "excellence initiative*")". The query returned 161 results (2005–2022). In the next step, we manually filtered out documents that did not match the research question; the resulting "core" comprised 32 articles. In addition, to ensure that no relevant papers were omitted, we performed a forward snowballing search, which identified four additional papers.

To search for Russian-language literature, we used similar methods, but in general we analysed a wider body of literature on university rankings in Russian. The review was supplemented by an analysis of non-academic (grey) literature in both English and Russian. In the absence of a systematic selection mechanism for grey literature, the sample was formed on the basis of expert judgment.

3. Results¹

Different approaches to global university rankings and their use in research evaluation are summarized in Fig. 1.

Figure 1: Map of views on global university rankings and their use in research evaluation

3.1. Major areas of criticism

1. First, a significant part of the academic literature points to technical problems and flaws in the methodologies used by rankings. This is the most extensive strand of criticism, so we have divided it into several narrower areas.

Technical problems associated with bibliometric data sources. Van Raan (2005) pointed out errors related to the identification of cited/citing publications as well as affiliations (see also Billaut, Bouyssou, & Vincke, 2010; Dimzov, Matošić, & Urem, 2021; Ioannidis et al., 2007). Pandiella-Dominique, Moreno-Lorente, García-Zorita, & Sanz-Casado (2018) also drew attention to the problem of incorrect data retrieval. Huang et al. (2020), based on a comparative analysis of the three largest databases, revealed discrepancies in bibliometric data that can significantly affect the positions of universities in a ranking. Krauskopf (2021), based on an analysis of the methodology of the ARWU subject rankings, revealed an uneven distribution of Web of Science categories between the different ARWU subjects (54 categories are absent altogether). It is noteworthy that most of the criticism of bibliometric data sources relates to ARWU and, accordingly, Web of Science.

Problematic application of bibliometric indicators in a number of areas (technical fields, the social sciences and, in particular, the humanities). Different disciplines are characterized by different citation levels (Ioannidis et al., 2007; van Raan, 2005).

Methodological flaws. Ioannidis et al. (2007) identified key challenges for ranking methodologies, including the need to take into account the size of the institution, the measurement of averages versus the measurement of extremes, the timeframe of measurements, and the allocation of credit for excellence. Billaut et al. (2010) singled out several methodological issues of the ARWU ranking:

  • There is a long time lag between doing research and being awarded the Nobel Prize or Fields Medal. In addition, these awards represent only a small part of the spectrum of scientific fields.

  • Highly cited researchers tend to be quite senior and to have moved between several universities over the course of their careers. Again, there is a time lag, due to which the relationship between the indicator and the research performance of the institution being assessed is not obvious.

  • Weighting coefficients for authoring articles in Nature and Science are illogical.

  • The number of articles tells nothing about the quality of the research, if only because a significant proportion of articles are never cited by anyone.

  • “The number of Full Time Equivalent (FTE) academic staff” indicator is not clearly defined, so the Productivity criterion is questionable.

Huang (2012) drew attention to issues associated with the QS reputation questionnaires. The ranking methodology suggests that questionnaires can serve as an indicator of university performance, but in fact they are only an indicator of reputation. The return rate and the lack of control over the experience and qualifications of the respondents cast doubt on the representativeness of the sample. Krauskopf (2021) highlighted the arbitrary identification of top journals (ARWU) among other methodological flaws.

Pandiella-Dominique et al. (2018) drew attention to the critical issue of reproducibility. The INORMS Research Evaluation Working Group (2022) and Waltman, Wouters, & van Eck (2020) also note problems of transparency and reproducibility among other ranking issues.

Arbitrariness in the assignment of weights in the composite index calculation. This is a potential source of distortion in the calculation and interpretation of rankings that use such an index (Bellantuono et al., 2022; Fauzi, Tan, Mukhtar, & Awalludin, 2020).

2. Methodological flaws contribute to the emergence of biases of various origins. One of the most widespread is linguistic bias, which arises because bibliometric databases cover mainly publications in English (Bellantuono et al., 2022; Billaut et al., 2010; van Raan, 2005; van Raan, van Leeuwen, & Visser, 2011). In turn, linguistic distortions give rise to territorial bias (Fauzi et al., 2020; Safón, 2013; van Raan, 2005). Huang (2012) found a correlation between the number of reputation questionnaires returned from a country and the number of that country's universities in the QS institutional rankings. Finally, there is a reputational bias that researchers mainly associate with a group of leading American universities (Safón, 2013; Safón & Docampo, 2020). Together, these distortions lead to the Matthew effect in rankings. This effect can be explained not only by experts voting for well-known universities (an effect clearly present in the QS and THE questionnaires), but also by the fact that universities in the leading group can attract financial and human resources better than the rest.

3. Rankings evaluate only certain areas of university performance, with a clear emphasis on research. None of the university rankings evaluates research and education equally; the focus is always on research (Iesbik Valmorbida et al., 2016; Ioannidis et al., 2007). Besides education, other aspects of university activity, such as university management and professional development, do not receive proper attention. None of the global university rankings is suitable for evaluating the university as a whole (Vernon, Andrew Balas, & Momani, 2018). Ranking compilers consider what is easy to count, not what should be taken into account (Abramo, 2017). Loukkola, Peterbauer, & Gover (2020) argued that a single "indicator may indeed reflect one aspect of an institution's performance, but it should not be generalized to reflect the institution's performance in relation to other aspects, or the entire institution altogether" (Loukkola et al., 2020, p. 24).

4. A number of researchers believe that global university rankings pose a threat to the national identity of higher education. Using the example of the institutional reconfiguration of Asian universities as part of excellence initiatives, Li (2016) showed that rankings have become a tool for spreading the Western archetype of higher education. Global university rankings have played a key role in the Taiwanese government's excellence initiatives (Shreeve, 2020). Shreeve noted the ambiguity of the rankings' impact on the national higher education system: on the one hand, global rankings pose a threat of academic colonialism; on the other hand, the national system cannot exist in isolation, and rankings can play the role of an integration tool. Gao & Zheng (2020) showed that excellence initiatives and new managerialism have a number of negative consequences for the development of the social sciences and humanities (SSH) in Chinese universities.

In the Russian-language literature, some researchers also consider global university rankings as a tool for promotion of the Western model in higher education (Lazar, 2019; Pyatenko, 2019). There is also a more aggressive “isolationist” discourse, which is gaining strength in the current situation (see, for example, Bolsherotov, 2013; Eskindarov, 2022).

5. Many of the problems associated with global university rankings stem not from the methodology itself but from the misinterpretation of results and the irresponsible use of rankings in general. However, this aspect is more often analysed in the non-academic literature ("More Than Our Rank," n.d.; Waltman et al., 2020).

3.2. Positive and neutral outlook on rankings

Relatively few studies of university rankings take a positive or neutral perspective. In a number of studies, the authors highlight the objective nature of rankings and the possibility of using them for cross-country comparisons (e.g., Docampo, 2011). Often, such studies amount to attempts to identify the underlying factors that determine a university's position in a ranking (Docampo & Cram, 2014; Klumpp, 2019; Pakkan, Sudhakar, Tripathi, & Rao, 2021). It is also argued that rankings can be used as a benchmarking tool (Tuesta, Bolaños-Pizarro, Neves, Fernández, & Axel-Berg, 2020).

In the final report of the Rankings in Institutional Strategies and Processes (RISP) project, Hazelkorn, Loukkola, & Zhang (2014) identified the possible areas in which universities use rankings:

  • Information source

  • Benchmarking tool

  • Evidence in the process of decision-making

  • Marketing promotion tool

In the Russian-language literature, rankings are most often viewed through the prism of competitiveness and integration into the global educational space (e.g., Leonova, Malanicheva, & Malanicheva, 2017; Puzatykh, 2019). This interpretation is largely due to the fact that in 2013–2020 Russia implemented the Project 5top100 excellence initiative. The very name of the project contains the goal of getting five Russian universities into the top 100 of the global university rankings, so rankings played a key role in this initiative. Accordingly, a significant part of the academic literature analyses university rankings in this context (Arefiev, 2014, 2015; Guzikova & Plotnikova, 2014; Kushneva, Rudskaya, & Fersman, 2014). The results of Project 5top100 were summed up by the Accounts Chamber of the Russian Federation ("Bulletin of the Accounts Chamber Vol. 2 (279)," 2021). The authors acknowledged that the main goal of the project was not achieved, although all participants experienced a significant increase in the number of publications and moved up in the rankings (mainly in subject rankings).

4. Discussion and Conclusions

We have explored four groups of literature (academic vs grey literature, Russian-language vs English-language literature). Interestingly, the individual perspectives have points of intersection, but no thesis is discussed and analysed across all four groups. English-language academic literature intersects in certain aspects with English-language non-academic literature and with Russian-language academic literature, and Russian-language academic literature additionally has points of intersection with Russian-language grey literature.

Few studies consider global university rankings a neutral or positive phenomenon. In these studies, as a rule, the authors mention objectivity and the possibilities of using ranking data for quantitative analysis and cross-country comparisons. However, our literature review showed that behind the guise of apparent transparency and objectivity lie many shortcomings that negatively affect all stakeholders in the higher education ecosystem. Rankings can be used as a marketing and benchmarking tool, but only if the limitations of their methodology are recognized.

Thus, most researchers support the view that global university rankings should not be used for research evaluation, yet they are nonetheless still used. On the other hand, constant criticism from the academic community has led to the development of rankers' "discursive resilience: the ability to engage with critics in a productive way in order to navigate a potentially hostile environment" (Hamann & Ringel, 2023). In this regard, we would like to mention two events in 2022 which may radically change the situation in the future.

Firstly, the Agreement on Reforming Research Assessment was developed (CoARA, 2022), which, as of March 12, 2023, has been signed by 487 universities. The document clearly states that the use of rankings should be avoided when evaluating research (and researchers). At the same time, the drafters of the agreement admit the possibility of using rankings for the purposes of comparative analysis, but in this case, the limitations of the methodology should be recognized.

Secondly, at the end of 2022, some prestigious American law and medical schools announced their intention to quit the U.S. News & World Report rankings (Hamann & Ringel, 2023). This is not the end of the "ranking power," but it is a clear signal from the academic community.

Open science practices

Conducting this study made me think about the Helsinki Initiative on Multilingualism in Scholarly Communication (Helsinki: Federation of Finnish Learned Societies, Committee for Public Information, Finnish Association for Scholarly Publishing, & Universities Norway & European Network for Research Evaluation in the Social Sciences and the Humanities, 2019), which is rarely mentioned in this context. In preparing the review, I found that the scientific quality of articles by Russian authors is significantly higher in English than in Russian. In my opinion, this is a direct consequence of the phenomena analysed in this paper: the race for ranking positions and quantitative indicators leads to the degradation of the national academic discourse, especially in SSH. Therefore, in parallel with the English version, I will post a preprint in Russian, and in the future I will do the same with all the studies that I publish.


Acknowledgements

The author is grateful to Ludo Waltman, whose comments significantly improved this paper. This study received no external funding.

Competing interests

The author is affiliated with the Centre for Science and Technology Studies (CWTS) of Leiden University, which is the producer of the Leiden Ranking. The author is also a former employee of the Ministry of Science and Higher Education of the Russian Federation that implemented the Project 5top100 excellence initiative. No proprietary information received by the author during the period of public service in the Ministry was used in this paper.


Abramo, G. (2017). Bibliometric evaluation of research performance: Where do we stand? Voprosy Obrazovaniya / Educational Studies Moscow, 2017(1), 112–127.

Arefiev, A. L. (2014). Global university rankings as a new phenomenon in Russian higher education. Sotsiologicheskaya Nauka i Sotsial'naya Praktika [Sociological Science and Social Practice], 7(3), 5–24.

Arefiev, A. L. (2015). On the participation of Russian universities in international rankings. Rossiya Reformiruyushchayasya [Russia in Reform], 13, 213–231.

Bellantuono, L., Monaco, A., Amoroso, N., Aquaro, V., Bardoscia, M., Loiotile, A. D., … Bellotti, R. (2022). Territorial bias in university rankings: a complex network approach. Scientific Reports, 12(1), 1–16.

Billaut, J.-C., Bouyssou, D., & Vincke, P. (2010). Should you believe in the Shanghai ranking? Scientometrics, 84(1), 237–263.

Bolsherotov, A. L. (2013). World university rankings: catch up and overtake. Is it necessary? Part 1. World University Rankings. Zhilishchnoe Stroitel’stvo [Housing Construction], (4), 17–23.

Bulletin of the Accounts Chamber Vol. 2 (279). (2021). Retrieved January 21, 2023, from website:

CoARA. (2022). Agreement on reforming research assessment (p. 23). p. 23. Retrieved from

Dimzov, S., Matošić, M., & Urem, I. (2021). University rankings and institutional affiliations: Role of academic librarians. Journal of Academic Librarianship, 47(5).

Docampo, D. (2011). On using the Shanghai ranking to assess the research performance of university systems. Scientometrics, 86(1), 77–92.

Docampo, D., & Cram, L. (2014). On the internal dynamics of the Shanghai ranking. Scientometrics, 98(2), 1347–1366.

Eskindarov, M. A. (2022). Russia needs to develop its own internal university ranking. Rektor Vuza [University Rector], (4), 42–47.

Fauzi, M. A., Tan, C. N., Mukhtar, M., & Awalludin, N. (2020). University rankings: A review of methodological flaws. Issues in Educational Research, 30(1), 79–96. Retrieved from

Gao, X., & Zheng, Y. (2020). ‘Heavy mountains’ for Chinese humanities and social science academics in the quest for world-class universities. Compare, 50(4), 554–572.

Guzikova, L. A., & Plotnikova, E. V. (2014). Positions and prospects of the participants of the 5-100-2020 project in international university rankings. Voprosy Metodiki Prepodavaniya v Vuze [Teaching Methodology in Higher Education], 17(3), 15–27.

Hamann, J., & Ringel, L. (2023). University rankings and their critics – a symbiotic relationship? Retrieved from LSE Impact Blog website:

Hazelkorn, E., Loukkola, T., & Zhang, T. (2014). Rankings in Institutional Strategies and Processes: Impact or Illusion? Retrieved from

Helsinki: Federation of Finnish Learned Societies, Committee for Public Information, Finnish Association for Scholarly Publishing, & Universities Norway & European Network for Research Evaluation in the Social Sciences and the Humanities. (2019). Helsinki Initiative on Multilingualism in Scholarly Communication.

Huang, C. K., Neylon, C., Brookes-Kenworthy, C., Hosking, R., Montgomery, L., Wilson, K., & Ozaygen, A. (2020). Comparison of bibliographic data sources: Implications for the robustness of university rankings. Quantitative Science Studies, 1(2), 445–478.

Huang, M. H. (2012). Opening the black box of QS world university rankings. Research Evaluation, 21(1), 71–78.

Iesbik Valmorbida, S. M., Rolim Ensslin, S. P., Ensslin, L., & Ripoll-Feliu, V. M. (2016). Rankings universitários mundiais. Que dizem os estudos internacionais? REICE. Revista Iberoamericana Sobre Calidad, Eficacia y Cambio En Educación, 14.2(2016), 5–29.

INORMS Research Evaluation Working Group. (2022). Fair and responsible university assessment : Application to the global university rankings and beyond. Retrieved from

Ioannidis, J. P. A., Patsopoulos, N. A., Kavvoura, F. K., Tatsioni, A., Evangelou, E., Kouri, I., … Liberopoulos, G. (2007). International ranking systems for universities and institutions: A critical appraisal. BMC Medicine, 5, 1–9.

Klumpp, M. (2019). Sisyphus revisited: Efficiency developments in European universities 2011–2016 according to ranking and budget data. Review of Higher Education, 43(1), 169–219.

Krauskopf, E. (2021). The Shanghai global ranking of academic subjects: Room for improvement. Profesional de La Informacion, 30(4), 1–13.

Kushneva, O. A., Rudskaya, I. A., & Fersman, N. G. (2014). World University Rankings and the program “5-100-2020” of the Ministry of Education and Science of the Russian Federation as a way to increase the competitiveness of Russian universities. Obshchestvo. Sreda. Razvitiye [Society. Environment. Development], 31(2), 17–26.

Lazar, M. G. (2019). Consequences of the fascination with quantitative performance indicators in science and higher education. Uchenyye Zapiski Rossiyskogo Gosudarstvennogo Gidrometeorologicheskogo Universiteta [Proceedings of the Russian State Hydrometeorological University], (54), 134–144.

Leonova, T. N., Malanicheva, N. V., & Malanicheva, A. S. (2017). International rankings as a tool for assessing the competitiveness of the university. Vestnik Universiteta [Bulletin of the University], (10), 125–130.

Li, J. (2016). The global ranking regime and the reconfiguration of higher education: Comparative case studies on research assessment exercises in China, Hong Kong, and Japan. Higher Education Policy, 29(4), 473–493.

Loukkola, T., Peterbauer, H., & Gover, A. (2020). Exploring Higher Education Indicators.

More Than Our Rank. (n.d.). Retrieved January 5, 2023, from INORMS website:

Pakkan, S., Sudhakar, C., Tripathi, S., & Rao, M. (2021). Quest for ranking excellence: impact study of research metrics. DESIDOC Journal of Library and Information Technology, 41(1), 61–69.

Pandiella-Dominique, A., Moreno-Lorente, L., García-Zorita, C., & Sanz-Casado, E. (2018). Model for estimating Academic Ranking of World Universities (Shanghai Ranking) scores. Revista Espanola de Documentacion Cientifica, 41(2), 1–14.

Puzatykh, A. N. (2019). Participation in world university rankings as a determining factor influencing the educational policy of countries and the development of universities. Psikhologiya Obrazovaniya v Polikul’turnom Prostranstve [Educational Psychology in Polycultural Space], 48(4), 105–113.

Pyatenko, S. V. (2019). Independent living rankings and educational outcomes. Originalʹnyye Issledovaniya [Original Research], 9(4), 18–28.

Safón, V. (2013). What do global university rankings really measure? The search for the X factor and the X entity. Scientometrics, 97(2), 223–244.

Safón, V., & Docampo, D. (2020). Analyzing the impact of reputational bias on global university rankings based on objective research performance data: the case of the Shanghai Ranking (ARWU). Scientometrics, 125(3), 2199–2227.

Shreeve, R. L. (2020). Globalisation or westernisation? The influence of global university rankings in the context of the Republic of China (Taiwan). Compare, 50(6), 922–927.

Tuesta, E. F., Bolaños-Pizarro, M., Neves, D. P., Fernández, G., & Axel-Berg, J. (2020). Complex networks for benchmarking in global universities rankings. Scientometrics, 125(1), 405–425.

van Raan, A. F. J. (2005). Fatal attraction: Conceptual and methodological problems in the ranking of universities by bibliometric methods. Scientometrics, 62(1), 133–143.

van Raan, A. F. J., van Leeuwen, T. N., & Visser, M. S. (2011). Severe language effect in university rankings: particularly Germany and France are wronged in citation-based rankings. Scientometrics, 88(2), 495–498.

Vernon, M. M., Andrew Balas, E., & Momani, S. (2018). Are university rankings useful to improve research? A systematic review. PLoS ONE, 13(3), 1–15.

Waltman, L., Wouters, P., & van Eck, N. J. (2020). Ten principles for the responsible use of university rankings. Retrieved from CWTS Blog website:

  1. This section presents only the key results; the full literature review will be published in two forthcoming articles: “University Rankings in the Context of Research Evaluation and Policy Initiatives: A State-of-Art Multilingual Review” and “A Review of Russian-language Academic Literature on University Rankings.”
