I'm a mathematical biologist interested in developing mathematically sound approaches to the analysis of high-throughput DNA sequencing data. I often employ probabilistic and optimization techniques to facilitate the analysis of such data. My publications give a more detailed view of the kind of research I do.

On the mathematical side, my research interests include:

- probabilistic and graphical data analysis
- symbolic dynamical systems (viewing a sequence of symbols as a "set of directions" and analyzing where this takes you)
- compressive sensing and other optimization techniques (finding the simplest explanation for given observations)
- entropy techniques (quantifying the complexity of given data)

On the biological side, my application areas include the analysis of *-omics data, specializing in metagenomics (the study of microbial communities through their sequenced DNA).

As an applied mathematician working in mathematical biology, I maintain a number of collaborations with biological scientists. Here is a selection of the collaborations I am involved in:

The CAMI consortium is an international group of scientists, practitioners, and mathematicians who are concerned with the proper interpretation, comparison, and analysis of metagenomic data. We have held a competition to compare the myriad metagenomic computational techniques, and the resulting paper is currently under review at Nature Methods (preprint here). I am an organizing member in charge of the portion of CAMI regarding taxonomic profiling (answering the question of which organisms are present in a given sample and at what frequency). A brief presentation given at a recent OSU metagenomics conference is available in PowerPoint format here.

This collaboration developed from a program on metagenomics at the Cambridge Isaac Newton Institute.

SSiMBio is a broad collaboration taking place at Oregon State University across multiple departments, involving 11 researchers and 6 graduate students. Our aim is to develop a systems-biology framework around the intertidal sea anemone Anthopleura elegantissima. By integrating datasets across disciplines, we hope to develop a complete picture of the biological responses within this system, particularly those surrounding the symbiotic state of the organism.

The CGSI is a summer institute at UCLA that brings together faculty, students, and industry to focus on cutting-edge research on all things genomics. The institute (funded by the NIH) is held during the summer and consists of a week-long conference followed by a three-week program that provides a research residency in computational genomics. I am helping to organize this institute.

The CGSI grew out of (and is assisted by) the Institute for Pure and Applied Mathematics (IPAM) and was originally conceived during the program on high-throughput genomics.