Impact of cross-calibration methods on the interpretation of a treatment comparison study using 2 depression scales
UMass Chan Affiliations
Department of Quantitative Health Sciences
Document Type
Journal Article
Publication Date
2012-04-01
Keywords
Adult
Calibration
Data Interpretation, Statistical
Depression
Female
Humans
Male
Psychiatric Status Rating Scales
Psychometrics
Questionnaires
Severity of Illness Index
Treatment Outcome
Behavioral Disciplines and Activities
Health Services Research
Abstract
BACKGROUND: Many questionnaires assessing depressive symptoms are available. Most of these questionnaires are constructed based on classical test theory, making comparisons of individual scores difficult. Item response theory (IRT) allows the comparison of scores from different instruments. In this study, the impact of IRT-based cross-calibration methods on the results of a treatment outcome study was evaluated using 2 instruments.
METHODS: Data collected during admission and discharge procedures from 1066 inpatients in 2 psychosomatic clinics using different depression measures were analyzed. To achieve comparability across the applied depression measures, we used an IRT-based conversion table to transform scores from one instrument's scale to the other. Latent trait values were also estimated using different instruments in each clinic. We compared these methods to the traditional approach of using the same instrument in both clinics and examined their effects on the statistical analyses.
RESULTS: There was no substantial change in the interpretation of the study results when different instruments were used. However, F values, P values, and effect sizes in the analysis of variance changed significantly. This might be attributed to differences in the content or measurement properties of the instruments. Interestingly, no difference was observed between use of transformed sum scores and latent trait values.
CONCLUSIONS: IRT cross-calibration methods are a convenient way to enhance the comparability of questionnaire data in applied clinical settings but seem not to be able to overcome differences in measurement properties of the instruments. As these differences can lead to biased results, there is a need for further research into more advanced techniques.
Source
Med Care. 2012 Apr;50(4):320-6.
DOI
10.1097/MLR.0b013e31822945b4
Permanent Link to this Item
http://hdl.handle.net/20.500.14038/46594
PubMed ID
22422054