Computational approaches for predicting biomedical research collaborations
UMass Chan Affiliations
Department of Quantitative Health Sciences
Document Type
Journal Article
Publication Date
2014-11-06
Keywords
Bioinformatics
Biomedical
Biostatistics
Medicine and Health Sciences
Statistics and Probability
UMCCTS funding
Abstract
Biomedical research is increasingly collaborative, and successful collaborations often produce high-impact work. Computational approaches can be developed to predict biomedical research collaborations automatically. Previous work on collaboration prediction mainly explored the topological structure of research collaboration networks, leaving out the rich semantic information in the publications themselves. In this paper, we propose supervised machine learning approaches to predict research collaborations in the biomedical field. We explored both semantic features extracted from author research interest profiles and topological features of the author network. We found that the most informative semantic features for author collaborations are related to research interest, including similarity of out-citing citations and similarity of abstracts. Of the four supervised machine learning models (naive Bayes, naive Bayes multinomial, SVMs, and logistic regression), the best-performing model is logistic regression, with an area under the ROC curve ranging from 0.766 to 0.980 on different datasets. To our knowledge, we are the first to study in depth how research interest and productivity can be used for collaboration prediction. Our approach is computationally efficient, scalable, and simple to implement. The datasets of this study are available at https://github.com/qingzhanggithub/medline-collaboration-datasets.
Source
PLoS One. 2014 Nov 6;9(11):e111795. doi: 10.1371/journal.pone.0111795. eCollection 2014.
DOI
10.1371/journal.pone.0111795
Permanent Link to this Item
http://hdl.handle.net/20.500.14038/39672
PubMed ID
25375164
Rights
This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
Distribution License
http://creativecommons.org/publicdomain/zero/1.0/
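
Illustrative sketch
The abstract names two families of pairwise features (semantic similarity of research interests and network topology) and reports performance as an area under the ROC curve. The Python sketch below is a minimal illustration of those ideas and is not the authors' implementation. The toy author names, the bag-of-words cosine similarity, the Jaccard coauthor overlap, and the helper names (bag_of_words, cosine_similarity, jaccard_coauthors, roc_auc, pair_features) are all assumptions chosen for illustration; the record does not specify the paper's actual feature definitions, and the classifier itself is omitted.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_coauthors(graph, u, v):
    """Fraction of the combined coauthors of u and v that both share."""
    nu = graph.get(u, set()) - {v}
    nv = graph.get(v, set()) - {u}
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


def roc_auc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: one abstract and one coauthor set per author.
abstracts = {
    "author_a": "protein folding prediction with machine learning",
    "author_b": "machine learning for protein structure prediction",
    "author_c": "clinical trial design for hypertension",
}
coauthors = {
    "author_a": {"author_d", "author_e"},
    "author_b": {"author_d", "author_f"},
    "author_c": {"author_g"},
}


def pair_features(u, v):
    """(semantic, topological) feature pair for a candidate collaboration."""
    semantic = cosine_similarity(bag_of_words(abstracts[u]), bag_of_words(abstracts[v]))
    topological = jaccard_coauthors(coauthors, u, v)
    return (semantic, topological)


if __name__ == "__main__":
    print(pair_features("author_a", "author_b"))
    print(pair_features("author_a", "author_c"))
```

In a supervised setting such feature pairs would be computed for known collaborating and non-collaborating author pairs, fed to a classifier such as logistic regression, and the predicted scores evaluated with roc_auc.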