Erik Andries
Research Assistant
Department of Mathematics and Statistics,
Albuquerque High Performance Computing Center (AHPCC) &
UNM Cancer Research Center
The University of New Mexico
Albuquerque, NM 87131
Phone: 505-277-6887, Fax: 505-277-8235
Email: andriese@math.unm.edu, andriese@ahpcc.unm.edu
"Computationally Tractable Gene Selection Methods for Cancer
Classification"
Classification of patients into cancer subtypes or other medically
relevant categories, from a numerical linear algebra perspective, is
an ill-posed problem with respect to certain classes of classification
algorithms (e.g., parametric hyperplane/hypersurface classifiers such
as Fisher's Linear Discriminant (FLD) or Regularized Discriminant
Analysis (RDA)). Furthermore, the process of gene selection (i.e.,
identifying highly discriminating subsets of genes that are most
responsible for the class distinctions of interest) can be
computationally burdensome, requiring high-performance computing, both
serial and parallel.
This paper/poster surveys
a) a variety of linear and nonlinear classification algorithms, e.g.,
   -- large-margin classifiers such as support vector machines and
      boosting,
   -- parametric hyperplane/hypersurface classification algorithms
      such as FLD and RDA,
and
b) strategies to make these algorithms effective and efficient with
   respect to both classification and gene selection, e.g.,
   -- gradient-based methods,
   -- linear and nonlinear programming approaches,
   -- regularization and generalized inverses.
These methodologies will be benchmarked on both publicly available
microarray data sets and leukemia microarray data sets from the
UNM Cancer Research Center.
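
As a rough illustration of the ill-posedness and the regularization
strategy mentioned in the abstract, the sketch below builds Fisher's
Linear Discriminant for a toy problem with more genes than samples per
class, where the within-class scatter matrix is singular, and then
stabilizes it by adding a ridge term lam*I. The data values, the
function names (regularized_fld, classify), and the choice of a simple
ridge penalty are assumptions made for illustration only; they are not
the methods or data sets of the paper/poster.

```python
# Toy sketch: FLD is ill-posed when genes outnumber samples; a ridge
# term (one simple form of regularization) makes the system solvable.
# All data and names below are hypothetical.

CLASS_A = [[1.0, 2.0, 3.0], [3.0, 2.0, 5.0]]  # 2 samples x 3 "genes"
CLASS_B = [[0.0, 1.0, 0.0], [0.0, 3.0, 2.0]]


def mean(rows):
    """Column-wise mean of a list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def scatter(rows, mu):
    """Sum of outer products of the deviations from the mean mu."""
    p = len(mu)
    S = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [x - m for x, m in zip(r, mu)]
        for i in range(p):
            for j in range(p):
                S[i][j] += d[i] * d[j]
    return S


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(M[r][k]))
        if abs(M[piv][k]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        M[k], M[piv] = M[piv], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):  # back substitution
        s = sum(M[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x


def regularized_fld(class_a, class_b, lam):
    """Return (w, c): w solves (S_w + lam*I) w = mu_a - mu_b, and c is
    the projection of the midpoint between the two class means."""
    mu_a, mu_b = mean(class_a), mean(class_b)
    Sa, Sb = scatter(class_a, mu_a), scatter(class_b, mu_b)
    p = len(mu_a)
    Sw = [[Sa[i][j] + Sb[i][j] + (lam if i == j else 0.0)
           for j in range(p)] for i in range(p)]
    diff = [a - b for a, b in zip(mu_a, mu_b)]
    w = solve(Sw, diff)
    c = 0.5 * sum(wi * (a + b) for wi, a, b in zip(w, mu_a, mu_b))
    return w, c


def classify(w, c, x):
    """Label x by which side of the hyperplane w.x = c it falls on."""
    return "A" if sum(wi * xi for wi, xi in zip(w, x)) > c else "B"


if __name__ == "__main__":
    w, c = regularized_fld(CLASS_A, CLASS_B, lam=0.5)
    print("discriminant direction:", w)
    print("label of class A mean:", classify(w, c, mean(CLASS_A)))
```

With lam = 0 the call raises ValueError for this toy data, because the
within-class scatter matrix built from two samples per class in three
dimensions has rank at most two. Any lam > 0 makes the matrix positive
definite, so the linear system has a unique solution. In practice the
size of lam would have to be tuned, and generalized inverses are an
alternative way to handle the singular case.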
====