Large-scale prediction of transcription factor binding across human cell types informs regulatory genomics and reveals promiscuous occupancy associated with chromatin contacts

This paper is a preprint and has not been certified by peer review.

Authors

Sonder, E.; Aymergen, I. G.; Sun, J.; Feuvrier, A.; Schratt, G.; Gapp, K.; Bohacek, J.; Robinson, M. D.; Germain, P.-L.

Abstract

Our understanding of the mechanisms regulating gene expression has been hampered by our limited knowledge of which transcription factors (TFs) bind where in the genome, which is highly cell type-specific. While genome-wide TF binding can be experimentally assayed for individual TFs in individual cell types, profiling every combination of over 1600 TFs and hundreds of cell types is beyond practical reach. In this work, we developed a streamlined platform, TFBlearner, to train TF-specific models and predict binding from ATAC-seq data. We focused on biologically motivated feature engineering and harnessed TF cooperativity and binding similarity across cell types to achieve state-of-the-art binding predictions in unseen cell types in a scalable fashion. This enabled us to generate a compendium of binding predictions for 1108 chromatin-associated proteins, of which 960 are TFs, across 43 human cell types, including widely used cell lines and 36 physiological cell types representing all major human cell lineages. We show that the models also provide biological insights into the TFs, and that the binding predictions can be used in downstream tasks such as TF activity inference. Our study further led to the observation of high promiscuity in TF occupancy. To investigate nonspecific occupancy, we characterized crowded, or high-occupancy target (HOT), regions across cell types, providing evidence of their functionality and reporting important cell type specificity. Finally, we show that, across cell types, crowded regions engage in more 3D contacts, and that most TF occupancy at crowded promoters can be explained as tethered binding from distal regulatory elements.
