A complex CTCF binding code defines TAD boundary structure and function
Topologically Associating Domains (TADs) compartmentalize vertebrate genomes into sub-megabase functional neighborhoods for gene regulation, DNA replication, recombination and repair. TADs are formed by Cohesin-mediated loop extrusion, which compacts the DNA within the domain, followed by blocking of Cohesin by the CTCF insulator protein at the domain boundaries. CTCF blocks loop extrusion in an orientation-dependent manner, and both experimental and in-silico studies have assumed that a single site of static CTCF binding is sufficient to create a stable TAD boundary. Here, we report that most TAD boundaries in mouse cells are modular entities in which CTCF binding clusters within extended genomic intervals. Optimized ChIP-seq analysis reveals that this clustering of CTCF binding occurs not only among peaks but also frequently within individual peaks. Using a newly developed multi-contact Nano-C assay, we confirm that individual CTCF binding sites contribute additively to TAD separation. This complex code of CTCF clustering may counteract the dynamic DNA-binding kinetics of CTCF, which calls for a reevaluation of current models for the blocking of loop extrusion. Our work thus reveals an unexpectedly complex organization of TAD boundaries that provides further means for the regulation of TAD structure and can thereby help to explain how distant non-coding structural variation influences gene regulation, DNA replication, recombination and repair. Unprocessed imaging data and processed sequencing tracks (ChIP-seq, Nano-C, 4C-seq) have been deposited in this data set.