27th International Conference on Science, Technology and Innovation Indicators (STI 2023)
There is an updated version of this publication, open Version 2.
conference paper

Measuring the science-technology-innovation linkage and its evolution based on citation and text features of FDA-approved drugs-patents-papers

21/04/2023 | By Dongyu Zang, Chunli Liu and Shuang Chen

Dongyu Zang *, Chunli Liu ** and Shuang Chen*

*zdy3006@163.com;

ORCID: https://orcid.org/0000-0002-0849-0468;

School of Health Management, China Medical University, China

** liuchunliliangxu@163.com

ORCID: https://orcid.org/0000-0002-3671-6487

Library, China Medical University, China

* shuangchen1212@163.com

ORCID: https://orcid.org/0009-0000-6587-3388

School of Health Management, China Medical University, China

Corresponding author: Chunli Liu, liuchunliliangxu@163.com

Abstract: Science, technology and innovation are closely tied to national development and global progress, yet research on the linkage among them and its evolution remains scarce. We used data on FDA-approved drugs (1949-2022) in the Orange Book, traced back the related patents and their non-patent references, and built drug-patent-paper citation pairs and citation networks. We used citation characteristics to analyze the structure, strength and speed of the science-technology-innovation network and their evolution. In addition, we used text features to analyze topic linkage, topic evolution paths and evolution modes across science, technology and innovation. The results suggest that the linkage among science, technology and innovation grew and became closer between 1949 and 2022. The strength of the linkage increased, but its speed tended to slow down. We also found that the cross-integration and fragmentation of topics bring science, technology and innovation knowledge closer together.

Keywords: “Topic Evolution”, “Science Linkage”, “Technology Linkage”, “Science Cycle Time”, “Technology Cycle Time”

1. Introduction

Knowledge transfer, or knowledge flow, within the science, technology and innovation system is a topic of great interest to researchers in scientometrics and biomedicine. Scholars have studied science-technology linkage from roughly three perspectives. The first is linkage analysis based on citation relationships and citation networks. For example, researchers have traced science-technology knowledge flows between patents and papers through non-patent references (Sun, X. & Ding, K., 2018; Sung, Wang, et al., 2015). A more recent study measured the knowledge network structure through the coupling of knowledge elements between science and technology (S&T) (Ba, Z.C. & Liang, Z.T., 2021). Other scholars have focused on the strength and speed of the science-technology linkage (Wang, J.J. & Ye, F.Y., 2021). The strength of the science-technology interaction has been measured with scientometric indicators such as science linkage and technology linkage (Narin, F. & Olivastro, D., 1988), and citation delay has been used to quantify the speed of scientific and technological knowledge flow (Narin, F., 1994; Verbeek, A., Debackere, K., et al., 2002). The second is topic evolution analysis based on text similarity (Han, W., Han, X., et al., 2022; Chen, L.X., 2017). Measuring the text similarity between patents and papers helps to identify S&T-related topics and their evolution paths (Wang, X.F., Zhang, S., et al., 2021; Yang, X., Feng, L.Z., Yuan, J.P., 2023). The third is linkage analysis based on relationship mapping, for instance inventor-author mapping (Boyack, K.W. & Klavans, R., 2008), patent-journal classification or subject mapping (Narin, F., Hamilton, K.S., et al., 1997), and country mapping between patents and their non-patent references (Gazni, A., 2020).

In addition, the patent-drug linkage is a necessary step in transforming scientific knowledge into drug innovation. Some scholars have analysed the characteristics of the key patents of FDA-approved drugs. Van de Wiele et al. collected the key patents of 78 small-molecule drugs approved by the FDA from 2019 to 2020 and analysed the patent filing dates, the content characteristics of the patents, and whether similar patents were obtained in other countries (Van de Wiele, V.L., Torrance, A.W., et al., 2022). Ke investigated the “basicness” of the references cited by patents issued by the USPTO and found that life science patents are likely to cite basic research, whereas key patents related to FDA-approved drugs cite a significant share of clinical research papers (Ke, Q., 2020). Du et al. examined the drugs approved by the US FDA from 2006 to 2015 and calculated the patent-to-drug time delay and the technology linkage (the average number of key patents per FDA-approved drug) (Du, J., Li, P.X., et al., 2019). However, most research on drug-patent linkage revolves around the characteristics of patents and their non-patent references. As far as we know, only Du et al. evaluated the time delay and technology linkage of the key patents for FDA-approved drugs (Du, J., Li, P.X., et al., 2019).

Based on previous studies, we adopted the backward tracking model first proposed by Du et al. (Du, J., Li, P.X., et al., 2019) and constructed a dataset of backward linkages among FDA-approved drugs, patents and papers. Our study had two objectives.

(1) To analyze the linkage and evolution of science-technology-innovation from three aspects: citation network, linkage speed and linkage strength.

(2) To identify important topics across innovation, technology and science, and to trace the topic evolution paths and topic evolution patterns.

2. Sample and Methodology

2.1 Sample selection

First, we downloaded the list of drugs (1981-2022) approved by the US Food and Drug Administration (FDA), together with the drug-related patents (also known as key patents for drugs), from the Orange Book website (February 2022 version). Second, we searched Lens.org and PatCite by patent number to retrieve the patents and their non-patent references, and constructed a dataset of drug-patent-paper citation pairs comprising 963 drugs, 3121 patents and 32011 papers. Third, we acquired or mined the text information of the drugs (indications, usage and description), patents (titles and abstracts) and papers (titles and abstracts) for topic linkage analysis. Finally, we labelled each text by source (S: scientific paper, T: technology patent, I: innovation drug) and merged the three corpora into a single corpus.
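The backward tracing described above amounts to joining drug-to-patent pairs with patent-to-paper pairs. The following is a minimal sketch of that join, not the procedure actually used in the study: the function name `build_citation_triples` and all identifiers in the toy records are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical toy records; the identifiers are placeholders, not study data.
drug_patents = [("drug_A", "patent_1"), ("drug_A", "patent_2"), ("drug_B", "patent_2")]
patent_papers = [("patent_1", "paper_x"), ("patent_2", "paper_x"), ("patent_2", "paper_y")]


def build_citation_triples(drug_patents, patent_papers):
    """Join drug->patent and patent->paper pairs into drug-patent-paper triples."""
    papers_by_patent = defaultdict(list)
    for patent, paper in patent_papers:
        papers_by_patent[patent].append(paper)

    triples = []
    for drug, patent in drug_patents:
        # A patent without non-patent references contributes no triple.
        for paper in papers_by_patent.get(patent, []):
            triples.append((drug, patent, paper))
    return triples


triples = build_citation_triples(drug_patents, patent_papers)

# Unique node sets, as they would be counted for a citation network.
drugs = {d for d, _, _ in triples}
patents = {p for _, p, _ in triples}
papers = {a for _, _, a in triples}
```

Counting unique nodes separately from triples matters because one patent can protect several drugs and one paper can be cited by several patents.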

2.2 Methodology

2.2.1 Citation linkage analysis

Based on the citation relationships between drugs and patents and between patents and their non-patent references, we established drug-patent-paper citation pairs and used Gephi to construct and visualize a biomedical citation network. The network comprised 32011 paper nodes, 3121 patent nodes and 963 drug nodes and was organized by drug approval year, patent publication year and paper publication year. Science Linkage (SL) (Narin, F. & Olivastro, D., 1988) and Technology Linkage (TL) (Narin, F. & Olivastro, D., 1998) were used to measure the linkage strength between technology and science and between innovation and technology, respectively. Science Cycle Time (SCT) for patents (Verbeek, A., Debackere, K., et al., 2002) and Technology Cycle Time (TCT) for drugs (Du, J., Li, P.X., et al., 2019) were used to quantify the linkage speed from science to technology and from technology to innovation, respectively.
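As an illustration of the four indicators, the sketch below computes them on toy data. It assumes that SL and TL are the mean number of cited items per citing item, and that SCT and TCT are the mean lag in years between a citing item and the items it cites. The cited sources may define the cycle times differently (for example with a median), so the use of the mean here is an assumption, and all names and years are hypothetical.

```python
from statistics import mean

# Hypothetical toy data: years and links are placeholders, not study values.
patent_year = {"p1": 2000, "p2": 2005}
paper_year = {"a": 1990, "b": 1996, "c": 2001}
drug_year = {"d1": 2008}
patent_cites_papers = {"p1": ["a", "b"], "p2": ["b", "c"]}
drug_cites_patents = {"d1": ["p1", "p2"]}


def linkage_strength(citations):
    """Mean number of cited items per citing item (SL for patents, TL for drugs)."""
    return mean(len(cited) for cited in citations.values())


def cycle_time(citations, citing_year, cited_year):
    """Mean lag in years between each citing item and the items it cites."""
    lags = [
        citing_year[src] - cited_year[dst]
        for src, cited in citations.items()
        for dst in cited
    ]
    return mean(lags)


sl = linkage_strength(patent_cites_papers)  # science linkage
tl = linkage_strength(drug_cites_patents)   # technology linkage
sct = cycle_time(patent_cites_papers, patent_year, paper_year)  # science cycle time
tct = cycle_time(drug_cites_patents, drug_year, patent_year)    # technology cycle time
```

To obtain yearly curves, the same functions can be applied to the subset of patents published, or drugs approved, in each year.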

2.2.2 Topic linkage analysis

We divided the text corpus into seven time slices and used the BERTopic model to identify and extract topics. The cosine similarity between two topics in different time slices was used to measure their linkage and to reveal how topics evolve over time. When calculating the similarity of a topic pair, we took the top 30 representative words of each topic as determined by c-TF-IDF. The cosine similarity lies between 0 and 1; the closer the value is to 1, the more similar and the more strongly related the two topics are. The observed similarity values ranged from 0.493 to 0.953. We set a threshold of 0.860 to select highly related topic pairs and obtained 66 pairs of linked topics across time slices.
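The thresholding step can be sketched as follows. This is an illustrative reading, not the study's code: it assumes each topic is represented as a dictionary mapping its top words to c-TF-IDF weights, and the helper names, topic identifiers and weights are hypothetical. Only the 0.860 threshold comes from the text.

```python
import math


def cosine_similarity(topic_a, topic_b):
    """Cosine similarity of two topics given as {word: weight} dictionaries."""
    shared = set(topic_a) & set(topic_b)
    dot = sum(topic_a[w] * topic_b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in topic_a.values()))
    norm_b = math.sqrt(sum(v * v for v in topic_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def linked_topic_pairs(slice_earlier, slice_later, threshold=0.860):
    """Return (earlier_id, later_id, similarity) for pairs above the threshold."""
    pairs = []
    for id_a, words_a in slice_earlier.items():
        for id_b, words_b in slice_later.items():
            sim = cosine_similarity(words_a, words_b)
            if sim > threshold:
                pairs.append((id_a, id_b, sim))
    return pairs


# Hypothetical toy topics with made-up weights (the study used 30 words per topic).
slice_1 = {
    "1-1": {"kinase": 3.0, "inhibitor": 4.0},
    "1-2": {"opioid": 1.0, "receptor": 1.0},
}
slice_2 = {
    "2-1": {"kinase": 3.0, "inhibitor": 4.0},
    "2-2": {"opioid": 1.0, "epilepsy": 1.0},
}

pairs = linked_topic_pairs(slice_1, slice_2)
```

In this toy case only the two identical kinase topics are linked; the two opioid topics share one of two words and fall below the threshold.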

A total of 112 topics were extracted from the seven time slices. We selected the top five topics in each time slice as key topics and explored the evolutionary paths of the resulting 35 key topics and of the 66 highly related topic pairs separately. Finally, we used Sankey diagrams to visualize the evolution patterns of the 112 topics between time slices, with the corpus source letters attached next to each topic. Topic evolution pattern analysis not only reveals the regularities of different types of topic splitting and fusion, but also shows how the contributions of science, technology and innovation to each topic change as it evolves.
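One simple way to flag splitting and fusion in a set of topic links is to count how many successors and predecessors each topic has. The rule below is an assumption made for illustration, since the paper describes its evolution patterns qualitatively: a link is labelled a split when its source topic has more than one successor and a fusion when its target topic has more than one predecessor. The topic identifiers are hypothetical.

```python
from collections import Counter


def classify_links(links):
    """Label each (source, target) topic link by its local evolution pattern."""
    out_degree = Counter(src for src, _ in links)
    in_degree = Counter(dst for _, dst in links)

    labels = {}
    for src, dst in links:
        splits = out_degree[src] > 1  # source topic feeds several later topics
        fuses = in_degree[dst] > 1    # target topic draws on several earlier topics
        if splits and fuses:
            labels[(src, dst)] = "split+fusion"
        elif splits:
            labels[(src, dst)] = "split"
        elif fuses:
            labels[(src, dst)] = "fusion"
        else:
            labels[(src, dst)] = "independent"
    return labels


# Hypothetical links between topics in two adjacent time slices.
links = [
    ("1-1", "2-1"),                  # one-to-one link
    ("1-2", "2-2"), ("1-2", "2-3"),  # one topic splits into two
    ("1-3", "2-4"), ("1-4", "2-4"),  # two topics fuse into one
]

labels = classify_links(links)
```

Links labelled with both split and fusion are candidates for the more complex patterns, where topics split and cross-fuse at the same time.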

3. Results

3.1 The citation network structure of drug-patent-paper linkage

Figure 1 depicts the drug-patent-paper citation network and its evolution over six time slices from 1949 to 2022. In the network diagram, pink nodes represent papers, blue nodes represent patents and orange nodes represent drugs. Correspondingly, blue edges indicate patent-to-paper citations and orange edges indicate drug-to-patent citations.

Before 1990, the network contained only a few papers. From 1991 to 2000, patent-paper citations (blue lines) began to appear. From 2001 to 2005, citations from patents to papers increased gradually and drug-patent citations also began to appear. After 2006, and especially from 2011 to 2015, both patent-paper and drug-patent citations grew explosively, and the citation network reached its maximum size in 2022.

Figure 1: Citation Network Structure and its evolution

3.2 The strength of drug-patent-paper linkage

Figure 2 plots the science linkage and the technology linkage, which indicate the linkage strength between patents and papers and between drugs and patents, respectively. Before 2002, neither curve was stable. After 2003, the science linkage was almost always markedly higher than the technology linkage, and after 2016 the gap between them widened. On average, each patent cited 34 non-patent references and each drug cited 14 patents, so the patent-paper linkage was on average about 2.4 times as strong as the drug-patent linkage.

Figure 2: Science Linkage and Technology Linkage

3.3 The speed of drug-patent-paper linkage

Figure 3 shows the science cycle time and the technology cycle time, referred to below as the science life cycle and the technology life cycle. Overall, the average technology life cycle is 3.1 years, while the average science life cycle is 12.6 years. In every year, the science life cycle was markedly longer than the technology life cycle; that is, it takes much longer for papers to be translated into patents than for technical knowledge to be transferred to new drugs. Both life cycles lengthened over time. The technology life cycle rose steadily apart from two peaks in 2008 and 2012. The science life cycle fluctuated upward before 2006 and after 2020 and rose steadily between 2006 and 2020. In general, the technology life cycle is more stable than the science life cycle.

Figure 3: Science Cycle Time and Technology Cycle Time

3.4 The topic evolution path

Table 1 shows the topic evolution paths of the 35 key topics within each category. We divided the 35 key topics into seven categories according to their content and gave each category a descriptive name (for example, “Oncology Disease Drug Research”). The topic number in the evolution path is the ordinal number of the key topic among the 112 topics. Within each category, we sorted the key topics by time slice to make the evolution path easier to follow. In addition, we marked the corpus type (S/T/I) next to each key topic to show the source of the topic corpus and to analyze the contributions and development of paper, patent and drug knowledge during topic evolution.

Table 1. Topic evolution path of the key topics in the same category

| No. | Topic category | Key topics | Time slice-topic number |
| 1 | Oncology Disease Drug Research | Nucleoside drug research-S | 1-3 |
| | | Protein kinase and proteomics research-S | 2-16 |
| | | Kinase drug research for cancer therapy-SI; Inhibitors and complexes research-S | 3-26; 3-29 |
| | | Growth factor and receptor anti-tumor mechanisms-ST; Oncology disease and chemotherapy drug research-S | 4-35; 4-37 |
| | | Kinase drug research for cancer therapy-ST; Growth Factor, Receptor and Tumor Research-ST; Inhibitors and complexes research-ST; Oncology disease and chemotherapy drug research-SI | 5-51; 5-53; 5-54; 5-55 |
| | | Antitumor Inhibitor Development and Trials-STI | 6-79 |
| | | Chemical formulation and composition study of small molecule inhibitors-STI | 7-97 |
| 2 | Opioid analgesic drug research | Pharmacokinetic studies of opioids-S; Opioid Research-S | 1-2; 1-5 |
| | | Research on opioid analgesics such as cocaine-SI | 2-18 |
| | | Opioid receptor specific antagonists-S | 3-30 |
| | | Opioid receptor modulators-STI | 6-83 |
| | | Release pattern and pharmacokinetic study of opioids and their antagonists-T | 7-98 |
| 3 | Cardiovascular Disease Drug Research | Fatty acid physiological effects-S | 3-27 |
| | | Functional fatty acid research-SI | 4-39 |
| | | Cardiovascular Disease Prevention and Drug Research-TI | 7-99 |
| 4 | Psychiatric Drug Research | Psychiatric and sleep disorder drug research-S | 3-28 |
| | | Psychiatric and sleep disorder drug research-SI | 4-38 |
| 5 | Epilepsy Disease Research | Study of the efficacy and adverse effects of antiepileptic drugs-S | 4-36 |
| | | Anti-Epileptic Drug Study-S | 7-100 |
| 6 | Drug delivery and release studies | Polymeric drug delivery-SI | 1-4 |
| | | Drug release pattern study-S | 2-19 |
| | | Drug polymer release pattern and pharmacokinetic study-ST | 6-80 |
| | | Release pattern and pharmacokinetic study of opioids and their antagonists-T | 7-98 |
| 7 | Drug reagent composition ratio development study | Drug reagent proportioning and development research-STI | 5-52 |
| | | Drug reagent composition and formulation research-I | 6-81 |
| | | Chemical formulation and composition study of small molecule inhibitors-STI | 7-97 |

We analyzed the 66 highly related topic pairs (similarity above the 0.860 threshold) to identify topic pairs from different categories that were linked across time slices. Table 2 presents the resulting cross-category topic evolution paths. The topic number in Table 2 is again the ordinal label of the topic among the 112 topics extracted by BERTopic. We identified four paths: “In-depth study of antibiotics in pneumonia disease drugs”, “Research and development of gonadotropins on contraceptive drugs”, “Maintenance of the histamine nervous system in the brain and the development and treatment of epilepsy” and “Narcotic analgesic drugs in glaucoma treatment”. For example, the first path evolved from “antibiotic research-SI” to “antibiotic research and pneumonia-SI” and finally to “antimicrobial drug research and pneumonia-S”.

The corpus source annotations show that the first path evolved from science-innovation integrated topics to a science-dominated topic. The second path evolved from a science topic to a science-innovation integrated topic. The third path evolved from a science-innovation integrated topic to a science-dominated topic. The fourth path consisted of science-dominated topics throughout.

Table 2. Topic evolution paths of highly related topics across categories

| Path | Name | Highly related topics | Time slice-topic number |
| 1 | In-depth study of antibiotics in pneumonia disease drugs | Antibiotic Research-SI | 1-14 |
| | | Antibiotic Research & Pneumonia-SI | 3-34 |
| | | Antimicrobial drug research-S | 4-44 |
| 2 | Research and development of gonadotropins on contraceptive drugs | Study of gonadotropic agents-S | 1-6 |
| | | Research on gonadal hormones and contraceptive agents-SI | 5-59 |
| 3 | Maintenance of the histamine nervous system in the brain and the development and treatment of epilepsy | Study on the efficacy and safety of antihistamines-SI | 1-1 |
| | | Study of the efficacy and adverse effects of antiepileptic drugs-S | 4-36 |
| 4 | Narcotic analgesic drugs in glaucoma treatment | Analgesic research-S | 4-42 |
| | | Narcotic sedative drugs & Glaucoma treatment drugs-S | 5-66 |

3.5 Topic evolution pattern analysis

In the Sankey diagrams, the first number before each topic is its serial number among the 112 topics constructed by BERTopic, and the second number is its serial number within its time slice. The letters to the right of each topic (S/T/I) indicate the topic corpus category: S for science (papers), T for technology (patents) and I for innovation (drugs). We found roughly three patterns of topic evolution.

3.5.1 “Independent development” evolution pattern

Figure 4 shows the independent evolution pattern. Most of the topics and topic pairs of this pattern were located in the first five time slices, that is, in 2010 and before, while very few appeared in recent years. These topics tend to be older: most of them were established before 1990, and most were scientific research topics (S). After 10-20 years of development, topics of this type were transformed from purely scientific research topics into science-drug integrated topics.

Figure 4: “Independent development” evolution pattern

3.5.2 “Simple split fusion” evolution pattern

Figure 5 illustrates a simple pattern of topic evolution that exhibits topic splitting and fusion. Most of the highly related topic pairs of this pattern appeared from 2000 to 2015, and simple split-fusion topic pairs outnumbered independent-development topic pairs. Topic splitting increased gradually after 2000, and topic fusion began to increase after 2010. Splitting and fusion run through the whole course of topic evolution.

Figure 5: “Simple split fusion” evolution pattern

3.5.3 “Complex and synergistic fragmentation and fusion” pattern

Figure 6 illustrates the complex and synergistic pattern of topic splitting and fusion. This pattern first appeared after 2000. In recent years, the synergistic splitting and fusion of different topics has become faster and tighter and has become an important route for the cross-integration of science, technology and innovation. As shown in Figure 6, topics rooted in scientific research split into multiple topics while different topics cross-fuse and then split again. Overall, the cross-integration and fragmentation of different topics have accelerated knowledge transfer from scientific research to patented technology and then to drug innovation, and have brought science, technology and innovation closer together.

Figure 6: “Complex and synergistic fragmentation and fusion” pattern

4. Conclusion

This study combined citation analysis and text analysis to measure the linkage among science, technology and innovation and its evolution. The results showed that from 1949 to 2022 the linkage among science, technology and innovation grew and became closer year by year. The strength of the linkage increased, but its speed tended to slow down. Through the analysis of topic evolution paths and evolution patterns, we found that the cross-integration and fragmentation of topics bring science, technology and innovation knowledge closer together.

Acknowledgments

We would like to thank China Medical University for its support of this study.

Author contributions

Chunli Liu: Conceptualization, Writing – original draft, Writing – review & editing, Formal analysis. Dongyu Zang: Data curation, Formal analysis, Methodology, Visualization. Shuang Chen: Writing – review & editing.

Competing interests

No competing interests.

References

Ba, Z.C., & Liang, Z.T. (2021). A novel approach to measuring science-technology linkage: From the perspective of knowledge network coupling. Journal of Informetrics, 15(3). https://doi.org/10.1016/j.joi.2021.101167

Boyack, K.W., & Klavans, R. (2008). Measuring science–technology interaction using rare inventor–author names. Journal of Informetrics, 2(3): 173-182. https://doi.org/10.1016/j.joi.2008.03.001

Chen, L.X. (2017). Do patent citations indicate knowledge linkage? The evidence from text similarities between patents and their citations. Journal of Informetrics, 11(1): 63-79. https://doi.org/10.1016/j.joi.2016.04.018

Du, J., Li, P.X., et al. (2019). Measuring the knowledge translation and convergence in pharmaceutical innovation by funding-science-technology-innovation linkages analysis. Journal of Informetrics, 13(1): 132-148. https://doi.org/10.1016/j.joi.2018.12.004

Gazni, A. (2020). The growing number of patent citations to scientific papers: Changes in the world, nations, and fields. Technology in Society, 62. https://doi.org/10.1016/j.techsoc.2020.101276

Han, W., Han, X., et al. (2022). The Development History and Research Tendency of Medical Informatics: Topic Evolution Analysis. JMIR Med Inform, 10(1): e31918. https://doi.org/10.2196/31918

Ke, Q. (2020). An analysis of the evolution of science-technology linkage in biomedicine. Journal of Informetrics, 14(4). https://doi.org/10.1016/j.joi.2020.101074

Narin, F. (1994). Patent bibliometrics. Scientometrics, 30(1), 147–155.

Narin, F., & Olivastro, D. (1988). Chapter 15 – technology indicators based on patents and patent citations. Handbook of Quantitative Studies of Science & Technology, 465–507.

Narin, F., & Olivastro, D. (1998). Linkage between patents and papers: An interim epo/us comparison. Scientometrics, 41: 51–59.

Narin, F., Hamilton, K. S. & Olivastro, D. (1997). The increasing linkage between U.S. technology and public science. Research Policy, 26 (3), 317–330.

Sun, X & Ding, K. (2018). Identifying and tracking scientific and technological knowledge memes from citation networks of publications and patents. Scientometrics, 116(3): 1735-1748. https://doi.org/10.1007/s11192-018-2836-1

Sung, H-Y., Wang, C-C., et al. (2015). Measuring science-based science linkage and non-science-based linkage of patents through non-patent references. Journal of Informetrics, 9(3), 488–498. https://doi.org/10.1016/j.joi.2015.04.004

Van de Wiele, V., Torrance, A.W., Kesselheim, A.S. (2022). Characteristics Of Key Patents Covering Recent FDA-Approved Drugs. Health Aff (Millwood), 41(8): 1117-1124. https://doi.org/10.1377/hlthaff.2022.00002

Verbeek, A., Debackere, K., Luwel, M., et al. (2002). Linking science to technology: Using bibliographic references in patents to build linkage schemes. Scientometrics, 54: 399-420.

Wang, J.J., & Ye, F.Y. (2021). Probing into the interactions between papers and patents of new CRISPR/CAS9 technology: A citation comparison. Journal of Informetrics, 15(4). https://doi.org/10.1016/j.joi.2021.101189

Wang, X.F., Zhang, S., et al. (2021). How pharmaceutical innovation evolves: The path from science to technological development to marketable drugs. Technological Forecasting and Social Change, 167. https://doi.org/10.1016/j.techfore.2021.120698

Yang, X., Feng, L.Z., Yuan, J.P. (2023). Research on linkage of science and technology in the library and information science field. Data and Information Management. https://doi.org/10.1016/j.dim.2023.100033
