Sparse feature selection for classification and prediction of metastasis in endometrial cancer
Abstract
Background: Metastasis via pelvic and/or para-aortic lymph nodes is a major risk factor for endometrial cancer. Lymph-node resection ameliorates risk but is associated with significant co-morbidities. Incidence in patients with stage I disease is 4-22% but no mechanism exists to accurately predict it. Therefore, national guidelines for primary staging surgery include pelvic and para-aortic lymph node dissection for all patients whose tumor exceeds 2cm in diameter. We sought to identify a robust molecular signature that can accurately classify risk of lymph node metastasis in endometrial cancer patients. 86 tumors matched for age and race, and evenly distributed between lymph node-positive and lymph node-negative cases, were selected as a training cohort. Genomic micro-RNA expression was profiled for each sample to serve as the predictive feature matrix. An independent set of 28 tumor samples was collected and similarly characterized to serve as a test cohort. Results: A feature selection algorithm was designed for applications where the number of samples is far smaller than the number of measured features per sample. A predictive miRNA expression signature was developed using this algorithm, which was then used to predict the metastatic status of the independent test cohort. A weighted classifier, using 18 micro-RNAs, achieved 100% accuracy on the training cohort. When applied to the testing cohort, the classifier correctly predicted 90% of node-positive cases, and 80% of node-negative cases (FDR = 6.25%). Conclusion: Results indicate that the evaluation of the quantitative sparse-feature classifier proposed here in clinical trials may lead to significant improvement in the prediction of lymphatic metastases in endometrial cancer patients.