Classification and regression tree analysis in public health: methodological review and comparison with logistic regression
Department of Medicine, Division of Preventive and Behavioral Medicine
*Decision Trees; Humans; *Public Health; *Regression Analysis; Research Design; Statistics as Topic
Behavioral Disciplines and Activities | Behavior and Behavior Mechanisms | Community Health and Preventive Medicine | Preventive Medicine
BACKGROUND: Audience segmentation strategies are of increasing interest to public health professionals who wish to identify easily defined, mutually exclusive population subgroups whose members share similar characteristics that help determine participation in a health-related behavior, as a basis for targeted interventions. Classification and regression tree (C&RT) analysis is a nonparametric decision tree methodology that can efficiently segment populations into meaningful subgroups. However, it is not commonly used in public health.
PURPOSE: This study provides a methodological overview of C&RT analysis for persons unfamiliar with the procedure.
METHODS AND RESULTS: An example of a C&RT analysis is provided and interpretation of results is discussed. Results are validated against those obtained from a logistic regression model that was created to replicate the C&RT findings. Results obtained from the example C&RT analysis are also compared to those obtained from a common approach to logistic regression, the stepwise selection procedure. Issues to consider when deciding whether to use C&RT are discussed, and situations in which C&RT may and may not be beneficial are described.
CONCLUSIONS: C&RT is a promising research tool for the identification of at-risk populations in public health research and outreach.
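ILLUSTRATION: The segmentation idea described in the abstract can be sketched in a few lines of Python. This is a minimal sketch, not the authors' analysis: it assumes the Gini impurity criterion commonly used in classification trees (the abstract does not name a splitting criterion), it performs only the first split of a tree, and the data, the variable names (`insured`, `age_group`, `screened`), and the helper functions `gini` and `best_binary_split` are all hypothetical and invented for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure subgroup)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_binary_split(rows, outcome):
    """Return (feature, value, weighted_gini) for the best single split.

    rows: list of dicts, one per person; outcome: key holding the class label.
    Each candidate split sends rows with row[feature] == value to one
    subgroup and all remaining rows to the other subgroup.
    """
    best = None
    n = len(rows)
    features = [key for key in rows[0] if key != outcome]
    for feature in features:
        for value in sorted({row[feature] for row in rows}):
            left = [r[outcome] for r in rows if r[feature] == value]
            right = [r[outcome] for r in rows if r[feature] != value]
            if not left or not right:
                continue  # a split must produce two non-empty subgroups
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (feature, value, score)
    return best


# Hypothetical toy data, invented for illustration only.
people = [
    {"insured": "yes", "age_group": "50-64", "screened": 1},
    {"insured": "yes", "age_group": "65+", "screened": 1},
    {"insured": "yes", "age_group": "50-64", "screened": 1},
    {"insured": "no", "age_group": "65+", "screened": 0},
    {"insured": "no", "age_group": "50-64", "screened": 0},
    {"insured": "no", "age_group": "65+", "screened": 1},
]

feature, value, score = best_binary_split(people, "screened")
print(feature, value, round(score, 3))
```

In this toy data the split on insurance status yields the lowest weighted impurity, so the two resulting subgroups (insured and uninsured) would form the first pair of mutually exclusive segments. A full tree would repeat the same search within each subgroup and would also need stopping and pruning rules, which this sketch omits.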
Rights and Permissions
Citation: Ann Behav Med. 2003 Dec;26(3):172-81. DOI: 10.1207/S15324796ABM2603_02
Lemon, Stephenie C.; Roy, Jason; Clark, Melissa A.; Friedmann, Peter D.; and Rakowski, William, "Classification and regression tree analysis in public health: methodological review and comparison with logistic regression" (2003). Preventive and Behavioral Medicine Publications and Presentations. 186.