UMMS Affiliation

Department of Psychiatry

Publication Date

2020-08-20


Document Type



Disciplines

Community Health and Preventive Medicine | Health Information Technology | Health Policy | Health Services Administration | Health Services Research

Abstract


Homelessness is poorly captured in most administrative data sets, making it difficult to understand how, when, and where this population can be better served. This study sought to develop and validate a classification model of homelessness. Our sample included 5,050,639 individuals aged 11 years and older who were included in a linked data set of administrative records from multiple state-maintained databases in Massachusetts for the period 2011-2015. We used logistic regression to develop a classification model with 94 predictors and subsequently tested its performance. The model had high specificity (95.4%), moderate sensitivity (77.8%) for predicting known cases of homelessness, and excellent classification properties (area under the receiver operating characteristic curve 0.94; balanced accuracy 86.4%). To demonstrate how such a modeling approach could be used to target interventions that mitigate the risk of an adverse health outcome, we also estimated the association between model-predicted homeless status and fatal opioid overdoses, finding that model-predicted homeless status was associated with a nearly 23-fold increase in the risk of fatal opioid overdose. This study provides a novel approach for identifying homelessness using integrated administrative data. The strong performance of our model underscores the potential value of linking data from multiple service systems to improve the identification of housing instability and to assist government in developing programs that seek to improve health and other outcomes for homeless individuals.
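As a reading aid for the performance figures quoted in the abstract, the following minimal Python sketch shows how sensitivity, specificity, and balanced accuracy are conventionally derived from confusion-matrix counts. It is not the authors' code: the function name `classification_metrics` and the counts used below are hypothetical and chosen only for illustration, and balanced accuracy is assumed here to follow the standard definition as the mean of sensitivity and specificity.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, balanced_accuracy) from confusion counts.

    tp: known homeless cases the model flags as homeless
    fn: known homeless cases the model misses
    tn: non-homeless individuals the model correctly leaves unflagged
    fp: non-homeless individuals the model flags as homeless
    """
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    # Standard definition: arithmetic mean of the two rates.
    balanced_accuracy = (sensitivity + specificity) / 2
    return sensitivity, specificity, balanced_accuracy


# Hypothetical counts for illustration only; these are NOT the study's data.
sens, spec, bal = classification_metrics(tp=80, fn=20, tn=950, fp=50)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} balanced_accuracy={bal:.3f}")
```

Balanced accuracy is often preferred over raw accuracy when the outcome is rare, as known homelessness is in a population-wide data set, because a model that flags no one would still score high on raw accuracy.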


Keywords

Homeless, Opioids, Critical care and emergency medicine, Housing, Forecasting, Medical risk factors, Mental health and psychiatry, Data management, Dermatology

Rights and Permissions

Copyright: © 2020 Byrne et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

DOI of Published Version

10.1371/journal.pone.0237905



Byrne T, Baggett T, Land T, Bernson D, Hood ME, Kennedy-Perez C, Monterrey R, Smelson D, Dones M, Bharel M. A classification model of homelessness using integrated administrative data: Implications for targeting interventions to improve the housing status, health and well-being of a highly vulnerable population. PLoS One. 2020 Aug 20;15(8):e0237905. doi: 10.1371/journal.pone.0237905. PMID: 32817717; PMCID: PMC7446866.

Journal/Book/Conference Title

PLoS One

PubMed ID

32817717


Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.