Date(s) - 01/09/2014
It is a lesser-known fact that Alan Turing was among the first computer scientists to pursue mathematical treatments of how biological form encodes function. Unraveling the form-function relationship is a fundamental and persistent problem in science. Doing so for biomolecules is key to our understanding of biology and treatment of disease but poses outstanding computational challenges. Algorithmic frameworks designed for nanoscale, single-molecule treatment have to address what are often NP-hard problems. Doing so at the petascale, to extract functional information encoded across millions or more biological forms in the era of big data, seems an insurmountable computational task.
The research in my lab focuses on designing effective algorithmic treatments that balance the basic-science focus on accuracy with the big-data, application-driven demand for speed. These treatments combine concepts from computer science, statistical physics and mechanics, and biophysics.
I will summarize our work on representative form-to-function problems in the context of DNA sequences, protein structures, and a richer characterization of the relationship between sequence, structure, dynamics, and function in monomeric protein molecules and multimeric assemblies. The latter is additionally motivated by the central role of structure in proteinopathies, such as cancer and neurodegenerative disorders. Three key areas of contribution will be detailed: (1) use of biological or biophysical insight to formulate constraints imposed by function on form; (2) design of suitable representations of form to capture not only trivial explicit constraints but also non-trivial implicit constraints arising from the high coupling of the building blocks that make up a molecular system; (3) design of novel algorithmic frameworks to capture remaining constraints and provide informative summarizations of function through constraint-satisfying states of form. These frameworks are probabilistic in order to manage the complexity of molecular systems in the presence of constraints. They build on machine learning and statistics, or on stochastic search and optimization, combining concepts from sampling-based robot motion planning and evolutionary algorithms. The talk will showcase the ability of these frameworks to advance knowledge and further spur biological research, thus closing the loop between dry- and wet-lab research. I will additionally emphasize the ability of our algorithmic treatments to advance general knowledge in computer science on modeling complex modular systems operating in the presence of constraints. The talk will conclude with an outline of our ongoing and future research agenda.
Amarda Shehu is an Assistant Professor in the Department of Computer Science with affiliated appointments in the Department of Bioengineering and the School of Systems Biology. Shehu received her Ph.D. in Computer Science from Rice University in Houston, TX, where she was an NIH fellow of the Nanobiology Training Program of the Gulf Coast Consortia. Shehu’s general research interests are in the field of Artificial Intelligence. Her research contributions to date are in computational structural biology, biophysics, and bioinformatics, with a focus on the relationship between sequence, structure, dynamics, and function in biological molecules. Shehu’s research is currently supported by the NSF, the Jeffress Trust Program in Interdisciplinary Research, and the Virginia Youth Tobacco Program. Shehu is also the recent recipient of an NSF CAREER award. She is a member of the ACM, the IEEE, the Biophysical Society, the International Society for Computational Biology, the American Chemical Society, and the Council on Undergraduate Research. Research and educational materials resulting from Shehu’s work, including images, videos, publications, and software, can be found at: http://cs.gmu.edu/~ashehu.