This is my PhD thesis from Penn State (advised by Manfred Denker).

Coding sequence density estimation via topological pressure

We demonstrate that a notion of "weighted information content" (known in the ergodic theory literature as topological pressure) can facilitate the analysis of genomic data, in particular by locating regions of a genome that contain many genes. This is a conceptual extension of the topological entropy approach presented earlier.

Substitution Markov chains and Martin boundaries

After introducing the notion of a random substitution Markov chain, we relate it to other notions of a "random substitution" and give a complete description of the Martin boundary for a few interesting examples.

Sparse recovery by means of nonnegative least squares

We prove that nonnegative least squares (typically prone to over-fitting) can be slightly modified to return sparse results.
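
For orientation, here is a minimal sketch of plain nonnegative least squares using SciPy's `scipy.optimize.nnls`. It shows only the unmodified baseline on a noiseless toy problem; the modification studied in the paper is not reproduced here. The matrix size, the random seed, the chosen support, and the threshold `1e-8` are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import nnls

# Toy problem (all sizes and values are illustrative assumptions).
rng = np.random.default_rng(0)
n_rows, n_cols = 50, 10
A = rng.standard_normal((n_rows, n_cols))

# A sparse, nonnegative ground-truth vector with three nonzero entries.
x_true = np.zeros(n_cols)
x_true[[1, 4, 7]] = [2.0, 0.5, 1.5]

# Noiseless observations.
b = A @ x_true

# Solve min ||A x - b||_2 subject to x >= 0.
x_hat, residual_norm = nnls(A, b)

# Indices of entries that are numerically nonzero.
support = np.flatnonzero(x_hat > 1e-8)
print("estimated support:", support)
print("residual norm:", residual_norm)
```

In this overdetermined, noiseless setting the true vector is itself feasible, so the baseline recovers it. The interesting regime for sparse recovery is noisy or underdetermined data, where this sketch makes no claims.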

Exact probabilities for the indeterminacy of complex networks as perceived through press perturbations

In a network of interacting quantities (such as a food web), we examine how qualitative and quantitative predictions change when a quantity (such as the abundance of an organism or a set of organisms) is increased. This is quantified in terms of which model parameters cause the largest change in predictions.
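
As a rough illustration of a press perturbation, the sketch below uses the standard formulation in which the long-term response of each quantity to a sustained increase in another is read off the negative inverse of the community (Jacobian) matrix. The three-species food chain and its interaction strengths are hypothetical, chosen only to keep the arithmetic simple, and the paper's own setup is assumed rather than confirmed to match this one.

```python
import numpy as np

# Hypothetical community matrix for a 3-level food chain
# (0 = resource, 1 = consumer, 2 = predator). Entry A[i, j] is the
# direct effect of species j on the growth of species i.
A = np.array([
    [-1.0, -1.0,  0.0],   # resource: self-limited, eaten by consumer
    [ 1.0, -1.0, -1.0],   # consumer: eats resource, eaten by predator
    [ 0.0,  1.0, -1.0],   # predator: eats consumer, self-limited
])

# Net effects of a sustained (press) increase: column j holds the
# equilibrium response of every species to a press on species j.
net_effects = -np.linalg.inv(A)

# Qualitative predictions are the signs of the net effects.
qualitative = np.sign(net_effects).astype(int)

print(np.round(net_effects, 3))
print(qualitative)
```

With these assumed values, pressing the predator (last column) raises the resource and lowers the consumer, a simple trophic cascade. Varying the entries of `A` and checking which sign predictions flip is one way to see how indeterminacy can arise.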