A complex CTCF binding code defines TAD boundary structure and function
Topologically Associating Domains (TADs) compartmentalize vertebrate genomes into sub-Megabase functional neighbourhoods for gene regulation, DNA replication, recombination and repair. TADs are formed by Cohesin-mediated loop extrusion, which compacts the DNA within the domain, followed by blocking of loop extrusion by the CTCF insulator protein at their boundaries. CTCF blocks loop extrusion in an orientation-dependent manner, and both experimental and in silico studies have assumed that a single site of static CTCF binding is sufficient to create a stable TAD boundary. We report that most TAD boundaries in mouse cells are modular entities in which CTCF binding clusters within extended genomic intervals. Optimized ChIP-seq analysis reveals clustering not only among CTCF ChIP-seq peaks but frequently also within those peaks. Using a newly developed multi-contact Nano-C assay, we confirm that individual CTCF binding sites (CBS) contribute additively to TAD insulation. The clustering of CBS may counteract the dynamic DNA-binding kinetics of CTCF, which calls for a re-evaluation of current models of how loop extrusion is blocked. Our work thus reveals an unexpectedly complex CTCF binding code at TAD boundaries that expands the regulatory potential for TAD structure and function and may help to explain how non-coding structural variation can influence gene regulation, DNA replication, recombination and repair.