Open Access Highly Accessed Research

Purposeful selection of variables in logistic regression

Zoran Bursac1*, C Heath Gauss1, David Keith Williams1 and David W Hosmer2

Author Affiliations

1 Biostatistics, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA

2 Biostatistics, University of Massachusetts, Amherst, MA 01003, USA

Source Code for Biology and Medicine 2008, 3:17  doi:10.1186/1751-0473-3-17

Published: 16 December 2008

Abstract

Background

The main problem in many model-building situations is to choose, from a large set of covariates, those that should be included in the "best" model. A decision to keep a variable in the model might be based on its clinical or statistical significance. Several variable selection algorithms exist, but they are mechanical and as such carry some limitations. Hosmer and Lemeshow describe a purposeful selection of covariates, in which the analyst makes a variable selection decision at each step of the modeling process.

Methods

In this paper we introduce an algorithm that automates that process. We conduct a simulation study to compare the performance of this algorithm with three well-documented variable selection procedures in SAS PROC LOGISTIC: FORWARD, BACKWARD, and STEPWISE.

Results

We show that this approach is most advantageous when the analyst is interested in risk factor modeling and not just prediction. In addition to significant covariates, this variable selection procedure can retain important confounding variables, potentially resulting in a slightly richer model. Application of the macro is further illustrated with the Hosmer and Lemeshow Worcester Heart Attack Study (WHAS) data.

Conclusion

An analyst who needs an algorithm to help guide the retention of significant covariates, as well as confounding ones, should consider this macro as an alternative tool.