Quantification of the effects of single nucleotide variants in NKX2.1 transcription factor binding sites
Lenihan-Geels, F.; Proft, S. A.; Bommer, M.; Heinemann, U.; Seelow, D.; Opitz, R.; Krude, H.; Schuelke, M.; Malecka, M.
Abstract
Transcription factors recognise and bind specific DNA sequence patterns in promoters and enhancers, thereby regulating gene expression. Variations in the DNA sequence of transcription factor binding sites (TFBSs) can alter gene regulation and may disrupt development. The transcription factor NKX2.1 is a crucial regulator of thyroid, lung, and neural development. Mutations in its encoding gene, NKX2-1, may cause choreoathetosis and congenital hypothyroidism with or without pulmonary dysfunction (CAHTP, OMIM #610978). Most genetically solved patients carry mutations in the coding regions of NKX2-1 that affect DNA binding, whereas the majority of patients with CAHTP-like symptoms do not carry mutations in the NKX2-1 coding sequence. We hypothesise that variations in the DNA sequence of promoter or enhancer sites to which NKX2.1 binds could also cause disease. We employed EMSA-seq to quantify the effects of genetic variation on NKX2.1 binding strength and used these data to train neural network models to predict the influence of DNA variation on NKX2.1 binding. We validated our models using microscale thermophoresis, X-ray crystallography, and publicly available ChIP-seq data sets. The neural networks were able to detect TFBSs in ChIP-seq data and can thus be used to evaluate whole-genome sequencing data from CAHTP patients in order to prioritise potential disease-causing variants in regulatory elements.