Battle Lab
Johns Hopkins University
Our lab is interested in modeling the impact of human genetic variation on complex traits, from cellular phenotypes to health and disease. We are a computational lab specializing in machine learning methods and statistical models appropriate for the analysis of diverse, large-scale genomic data. Using methods based on probabilistic graphical models and structured regularization, we develop integrative, informed models leveraging the underlying structure and behavior of biological systems.
Modeling the effect of genetic variation on gene expression
Unraveling the influence of genetics on the cell is a critical step toward understanding its impact on our health. Genetic variation outside of the protein-coding regions of the genome has proven particularly difficult to interpret, but large-scale studies of gene expression offer a window into the regulatory impact of functional non-coding variants. We are interested in modeling the effect of regulatory variation on the complete human transcriptome, enabled by large-scale RNA-sequencing studies. Our work includes machine learning methods for predicting the impact of non-coding variation, improving regulatory network inference, and developing methods for analysis of RNA-sequencing data.
Personal genomics: predicting the impact of rare genetic variants
Traditional studies of both cellular phenotypes and human disease rely on population-based designs, where the impact of each genetic variant is assessed through direct association testing. However, each of us also carries many genetic variants that are rare, or have never been observed before, and these cannot be assessed through standard association methods. Rare variants may have a significant impact on our health, and we are interested in methods for predicting which of these variants are likely to be harmful. We are developing new methods for integrating genomic and transcriptomic data with diverse sources of prior knowledge to evaluate the potential impact of rare variants in personal genomes.
Leveraging networks and cellular pathways in disease studies
Genes and variants with small effects on disease risk are often buried among many spurious associations in genome-wide studies. However, a key observation is that genes function in highly interconnected pathways, so multiple co-functional or pathway-connected genes often affect the same trait. We are interested in leveraging gene networks and pathways to improve analysis of disease studies, and in simultaneously improving our understanding of how genes work together to influence complex traits. We are developing novel machine learning methods for integrating pathway information into whole-genome sequencing and integrative studies of disease.