Biomedical Knowledge Discovery Based on Big Data Linkage Analysis

DU Jian

Journal of Library and Information Sciences in Agriculture. 2019, 31(3): 4-9

DOI: 10.13998/j.cnki.issn1002-1248.2019.03.19-0362
Invited Review

Abstract

The core of data science and information science methods is extracting knowledge and insight from data. In medicine and healthcare, which bear directly on life and health, big data analysis should go beyond correlation mining to reveal causal relationships and to improve reproducibility and interpretability. Data linkage grounded in causality is of great significance for think tank research and intelligence sensing. This paper proposes an approach to medical knowledge mining based on multidimensional and deep data linkage, and introduces the relevant data platforms and research progress in four areas: first, measuring laboratory-to-clinical knowledge translation and analyzing its critical point; second, measuring the technological impact of science; third, identifying interdisciplinary and transformative innovation frontiers; and fourth, mining uncertain medical knowledge from full text by combining bibliometrics with computational linguistics. The first three areas expand the space of medical knowledge, from the laboratory to the clinic and from the scientific space to the technological space. The fourth, mining certain and uncertain medical evidence and claims, deepens the disclosure and explanation of causal relationships in medical knowledge.

Keywords

big data linkage / biomedical knowledge discovery / non-patent literature / uncertainty argumentation mining / citation sentence analysis

Cite this article

DU Jian. Biomedical Knowledge Discovery Based on Big Data Linkage Analysis. Journal of Library and Information Sciences in Agriculture. 2019, 31(3): 4-9. https://doi.org/10.13998/j.cnki.issn1002-1248.2019.03.19-0362
