Objective: Genomic profiling, the use of genetic variants at
multiple loci simultaneously for the prediction of disease
risk, requires the selection of a set of genetic variants that
best predicts disease status. The goal of this work was to provide
a new selection algorithm for genomic profiling. Methods:
We propose a new algorithm for genomic profiling
based on optimizing the area under the receiver operating
characteristic curve (AUC) of the random forest (RF). The proposed
strategy implements a backward elimination process
based on the initial ranking of variables. Results and Conclusions:
We demonstrate the advantage of using the AUC instead
of the classification error as a measure of predictive
accuracy of RF. In particular, we show that the use of the classification
error is especially inappropriate when dealing with
unbalanced data sets. The new procedure for variable selection
and prediction, namely AUC-RF, is illustrated with data
from a bladder cancer study and also with simulated data.
The algorithm is publicly available as an R package, named
AUCRF, at http://cran.r-project.org/.
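
The point about unbalanced data can be illustrated with a small worked example. The sketch below is not taken from the paper or from the AUCRF package: the function names, the 0.5 threshold, and the toy scores are assumptions chosen for illustration. It compares an uninformative classifier with one that ranks every case above every control, on a data set with 5 cases and 95 controls. Both obtain the same classification error, while the AUC separates them.

```python
def auc(scores, labels):
    """AUC as the probability that a randomly chosen case (label 1)
    scores higher than a randomly chosen control (label 0); ties count 1/2."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def classification_error(scores, labels, threshold=0.5):
    """Fraction of misclassified samples when predicting 1 if score >= threshold.
    The 0.5 threshold is an assumption made for this illustration."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p != y for p, y in zip(preds, labels)) / len(labels)


# Toy unbalanced data (made-up numbers): 5 cases, 95 controls.
labels = [1] * 5 + [0] * 95

# Uninformative classifier: the same score for everyone.
trivial = [0.1] * 100

# Informative classifier: every case outscores every control,
# but all scores stay below the 0.5 threshold.
informative = [0.4] * 5 + [0.2] * 95

for name, scores in [("trivial", trivial), ("informative", informative)]:
    print(name,
          "error =", classification_error(scores, labels),
          "AUC =", auc(scores, labels))
```

Both classifiers predict "control" for every sample, so the classification error equals the proportion of cases in each instance, whereas the AUC depends only on how the scores rank cases relative to controls.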
Document type:
Article
Indexing:
Indexed in WOS/JCR
Rights:
(c) Karger
All rights reserved
Bibliographic citation:
Calle Rosingana, M. L., Urrea, V., Boulesteix, A., & Malats, N. (2011). AUC-RF: A New Strategy for Genomic Profiling with Random Forest. Human Heredity, 72(2), 121-132. doi:10.1159/000330778