  • publication
    Friday, May 11, 2018

    Along with collaborators at UCLA, we were able to detect a small but significant number of microbes in blood. This is surprising, since it is typically assumed that the immune system removes any microbial presence from human blood. I used a reference-free microbial community algorithm, called EMDeBruijn, to help corroborate the patterns we saw, which included an increase in microbial diversity in schizophrenia patients. EMDeBruijn is a metric based on the Wasserstein metric (also known as the Earth Mover's Distance) and a de Bruijn graph induced by the k-mers in a metagenomic DNA sample.

  • Event
    Tuesday, May 1, 2018

    My recent work with the CAMI project was

  • Article

    I'm pleased to announce that we've recently been funded by the NIH National Center for Advancing Translational Sciences (NCATS) along with Steve Ramsey (Oregon State).

  • publication
    Tuesday, July 4, 2017

    This work improves upon the so-called "min hash" technique (a "probabilistic data analysis" method) to develop a very fast and efficient way to estimate the similarity of two sets of objects (in terms of how much they overlap). The approach we present is orders of magnitude faster (and uses orders of magnitude less space) when the two data sets under consideration are of very different sizes. The sets we consider are sets of substrings (called k-mers) of DNA sequences from communities of microorganisms.

  • publication
    Sunday, July 2, 2017

    A gene regulatory network is a representation of how genes interact with each other. In this work, we develop the only method (to date) to assess the accuracy of so-called "motif discovery algorithms", which seek to find important sub-networks of a given gene regulatory network. We develop a provably correct mathematical approach (based on a variety of metrics that measure how close two matrices are to each other) and use it to assess the performance of a variety of motif discovery algorithms.

Improving Min Hash for Metagenomic Classification

A presentation about work with Hooman Zabeti that used probabilistic data analysis to analyze metagenomic communities.
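
For readers unfamiliar with the baseline technique, here is a minimal sketch of the classic min hash estimate of the overlap (Jaccard index) between two k-mer sets. It illustrates only the standard method that this work builds on, not the improved approach from the presentation or paper. The function names, the use of SHA-1 as the hash, and the "bottom-k" sketch variant are all assumptions made for illustration.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def stable_hash(kmer):
    """Map a k-mer to a 64-bit integer (SHA-1 is an illustrative choice)."""
    return int.from_bytes(hashlib.sha1(kmer.encode()).digest()[:8], "big")


def sketch(kmer_set, size):
    """Bottom-k sketch: keep only the `size` smallest hash values."""
    return sorted(stable_hash(x) for x in kmer_set)[:size]


def jaccard_estimate(sketch_a, sketch_b, size):
    """Estimate |A ∩ B| / |A ∪ B| using only the two sketches."""
    # The `size` smallest hashes of the union can be recovered from the sketches.
    union_bottom = sorted(set(sketch_a) | set(sketch_b))[:size]
    if not union_bottom:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    # Fraction of the union's smallest hashes that appear in both sets.
    return sum(1 for h in union_bottom if h in shared) / len(union_bottom)


if __name__ == "__main__":
    a = kmers("ACGTA", 3)  # {"ACG", "CGT", "GTA"}
    b = kmers("CGTAC", 3)  # {"CGT", "GTA", "TAC"}
    print(jaccard_estimate(sketch(a, 10), sketch(b, 10), 10))
```

In practice the sketch size is far smaller than the sets themselves, so the estimate is approximate. When one set is much larger than the other, its small hash values tend to crowd out those of the smaller set, which is the very-different-sizes regime this work targets.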

MTH 321: Introductory applications of mathematical software

This is a course that I created in 2014 (and that continues to run, typically in the Fall and Spring) to introduce students to Mathematica, Matlab, and LaTeX. In the future, I will be incorporating modules on Python and/or Julia. This hands-on course has been attended by over 80 undergraduates, as well as a handful of graduate students and faculty!

I wrote a textbook (~200 pages) to accompany this course, which can be found here.

MetaPalette Summary video

Very brief explanation of how MetaPalette works.