Beyond the exome

DFG-funded Research Group (FOR 2841)

P09: Determination of DNA sequence patterns crucial for transcription factor binding

Open positions: 2 PhD students

Principal investigators: Prof. Dr. Dominik Seelow and Prof. Dr. Markus Schuelke

Tight regulation of gene expression plays a major role in human development. Its dysregulation by mutations within the transcription factor binding sites (TFBSs) of enhancers or promoters may lead to disease. As the current methods for predicting human TFBSs are notoriously inaccurate, we want to improve the predictability of disease-causing variants in the human genome.

To this end, we will use random saturation mutagenesis to alter short DNA fragments containing experimentally confirmed TFBSs and capture these fragments by immunoprecipitation (IP) against the respective transcription factor (TF). Deep sequencing will determine the relative enrichment or depletion of sequences containing specific variants before and after IP. These data will allow us to quantify exactly how strongly specific bases in individual sequence motifs are enriched or depleted, and thus to determine precisely which bases are crucial for TF binding and what effect single-nucleotide variants and small indels have on it.

We will use deep learning algorithms to determine the motifs and base pair patterns responsible for TF binding, ultimately resulting in a predictor that can identify variants likely to change TFBSs in whole genome sequencing (WGS) data from patients with rare diseases. The predictor will be tested and optimized on novel TF binding incidence data generated by this research unit (RU) and on unsolved cases of patients with variants potentially affecting TFBSs. We will test the accuracy of our predictor by mutagenesis of TFBSs and subsequent functional in vivo studies in cellular models. The predictor will be made available to the public as open source software and will also be integrated into our online MutationTaster and RegulationSpotter software tools. This work is essential for the overarching goal of our RU: the reduction, or even elimination, of the diagnostic odyssey.
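As a rough illustration of the enrichment and depletion analysis described above, the following minimal Python sketch compares per-position base frequencies in a library before and after IP and reports a log2 ratio for each base. It is not the project's actual pipeline: the function names, the pseudocount, the assumption of equal-length aligned reads, and the toy reads are all hypothetical choices made for this example.

```python
import math
from collections import Counter

BASES = "ACGT"


def base_frequencies(reads, pseudocount=1.0):
    """Per-position base frequencies for equal-length reads.

    A pseudocount (an assumption of this sketch) avoids division by zero
    for bases that are never observed at a position.
    """
    length = len(reads[0])
    freqs = []
    for pos in range(length):
        counts = Counter(read[pos] for read in reads)
        total = sum(counts[b] for b in BASES) + pseudocount * len(BASES)
        freqs.append({b: (counts[b] + pseudocount) / total for b in BASES})
    return freqs


def log2_enrichment(input_reads, ip_reads, pseudocount=1.0):
    """log2(frequency after IP / frequency before IP) per position and base.

    Positive values indicate enrichment by IP, negative values depletion.
    """
    before = base_frequencies(input_reads, pseudocount)
    after = base_frequencies(ip_reads, pseudocount)
    return [
        {b: math.log2(a[b] / f[b]) for b in BASES}
        for f, a in zip(before, after)
    ]


if __name__ == "__main__":
    # Toy reads, invented for illustration only.
    input_reads = ["ACGT", "AGGT", "ATGT", "AAGT"]
    ip_reads = ["ACGT", "ACGT", "ACGT", "AGGT"]
    for pos, scores in enumerate(log2_enrichment(input_reads, ip_reads)):
        print(pos, {b: round(s, 2) for b, s in scores.items()})
```

In this toy example, a base that becomes more frequent after IP at a given position receives a positive score, which would mark it as a candidate for a residue that favours TF binding. Real data would additionally require read alignment, sequencing-error handling, and normalization for library size.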