A Raman spectroscopic-based Platform Using Advanced Data Mining Methods for In-Situ Cancer Cell Classification and Characterization

Date(s) - 05/09/2013
11:00 am

Michael Fenn, BME PhD Candidate

Abstract: Raman spectroscopy has the potential to significantly aid in the research, diagnosis and treatment of cancer. The information-dense, complex spectra generate vast datasets in which subtle correlations among peaks often provide essential clues for biological interpretation. Thus, the implementation of advanced data mining techniques is imperative for complete, rapid and accurate analysis of large spectral datasets, particularly with regard to clinical translation of the technology. Raman spectral datasets are characterized by high dimensionality, often with limited sample sizes. Standard classification models have been shown to perform poorly on such high-dimensional datasets; they typically transform the original feature space, which makes it infeasible to ascribe biological relevance to the discriminating features. In the first part of this work, Raman spectroscopy is combined with a novel data mining framework, known as Fisher-based Feature Selection-Support Vector Machines (FFS-SVM), to classify and characterize, in situ, five breast cell lines based on differences in biochemical composition (e.g. lipids, DNA, protein). Combining Raman spectral analysis with the FFS-SVM framework affords control over both the type of input features and the number of features used, scaled to sample size, in order to reduce variance and over-fitting during classification. This provides both high classification accuracy of cell type and extraction of biologically relevant ‘biomarker-type’ information based on the features selected in each classification schema. The subsequent phase of this work broadens the application of this Raman spectroscopy-based platform toward a non-invasive, real-time, in vitro assay methodology for classifying and characterizing anti-cancer agent efficacy, with simultaneous analysis of the agent’s mechanism of action (MOA).
The effects of six different anti-cancer agents with diverse MOAs are evaluated by Raman spectroscopy using an optimized multi-class extension of the FFS-SVM framework. This extension has a modified feature selection capacity so that differences in spectral fingerprints can be evaluated based on the most significantly discriminative spectral features. Assessment of efficacy and toxicity by classification of cell spectra as apoptotic, dead/necrotic, or healthy, as well as characterization of the MOA, is achieved. Correlation of the features, or spectral peaks, to the corresponding biology reveals that the Raman-based platform provides a wealth of information comparable to that obtained by combining a number of the most commonly used conventional assay methods, yet in a more efficient, non-invasive manner. The extension of FFS-SVM to a multi-class classification framework, along with an optimized cluster analysis, provides classification accuracies of greater than 95% for classification schemes of up to 8 classes, as well as biologically relevant spectral features associated with the MOAs. It is shown that the top features selected for each class are indicative of the biomolecules most affected by that particular agent. Prediction of both efficacy and potential MOA is possible when new agents are tested using the Raman-based platform. Continued development of this platform could improve the predictive capabilities of pre-clinical models while providing more effective insight into the MOA of new and potential anti-cancer agents, thus increasing the efficiency of anti-cancer agent development and screening and decreasing development cost.
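
The abstract does not spell out the exact selection criterion used in FFS-SVM, so the following is only a minimal sketch of the general idea, assuming the commonly used multi-class Fisher score (between-class scatter divided by within-class scatter, computed per feature). The function names, the toy intensities and the class labels are all hypothetical and are not taken from the work itself. Because features are ranked in the original spectral space rather than a transformed one, each selected index can still be read as a Raman peak position.

```python
from statistics import fmean, pvariance


def fisher_scores(spectra, labels):
    """Return one Fisher score per feature (e.g. per wavenumber bin).

    Assumed criterion (not confirmed by the abstract):
    sum_c n_c * (mean_c - mean)^2 / sum_c n_c * var_c
    """
    classes = sorted(set(labels))
    scores = []
    for j in range(len(spectra[0])):
        column = [row[j] for row in spectra]
        overall_mean = fmean(column)
        between = 0.0  # scatter of class means around the overall mean
        within = 0.0   # scatter of samples around their own class mean
        for c in classes:
            values = [row[j] for row, y in zip(spectra, labels) if y == c]
            between += len(values) * (fmean(values) - overall_mean) ** 2
            within += len(values) * pvariance(values)
        if within > 0:
            scores.append(between / within)
        else:
            # No within-class spread: perfectly separating or constant feature.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def top_k_features(scores, k):
    """Indices of the k highest-scoring features, best first."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Hypothetical toy data: 6 spectra, 3 intensity bins, 2 cell classes.
spectra = [
    [1.0, 5.0, 0.2],
    [1.1, 5.2, 0.9],
    [0.9, 4.8, 0.5],
    [3.0, 5.1, 0.3],
    [3.1, 4.9, 0.8],
    [2.9, 5.0, 0.6],
]
labels = ["A", "A", "A", "B", "B", "B"]

scores = fisher_scores(spectra, labels)
selected = top_k_features(scores, k=1)
```

In a full pipeline, only the columns listed in `selected` would be passed on to a support vector machine, with `k` chosen relative to the number of available spectra to limit over-fitting; that classification step is omitted here.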