Motif analysis of human ZNF343 using published ChIP-exo data

Published: 09-03-2018 | Version 1 | DOI: 10.17632/d7c23zbvvv.1
Zheng Zuo


For human ZNF343, we sought to derive an extended motif beyond the previously identified core motif GAAGCA. Searching for this hexamer, we found 3,237 intact binding sites within the 4,532 binding peaks called by the Trono lab (NCBI GEO GSE78099). With the hexamer fixed as the core (positions 1 to 6), the flanking sequences within ±40 bp were extracted and aligned accordingly (positions -39 to 46). For each site, the raw ChIP-exo reads (NCBI SRA SRX2512772) falling within this range were mapped by their starting positions, in either the forward or the reverse direction. The total ChIP-exo read count for the site was then calculated as the forward read count (positions -39 to +5) plus the reverse read count (positions +2 to 46). Ideally, the ChIP-exo read count for each site should be proportional to the binding probability of that site, and hence approximately to its binding affinity if the in vivo ZNF343 protein occupancy is low enough. The negative logarithm of each site's ChIP-exo read-count ratio was therefore used as its relative binding energy for data regression and motif analysis.
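The procedure described above can be illustrated with a minimal Python sketch. It is not the code used to produce this dataset. All function names are hypothetical, and several details are assumptions made only for illustration: only the given strand is scanned for the hexamer, read starts are supplied as 0-based indices of each read's 5' end, the logarithm is natural, and the reference count used in the ratio is left to the caller because the description does not specify it.

```python
import math
import re

CORE = "GAAGCA"   # core hexamer named in the text; occupies positions 1..6
FLANK = 40        # flank length per side, so windows span positions -39..46


def find_core_sites(seq, core=CORE, flank=FLANK):
    """Return (core_start, window) for each core match with full flanks.

    core_start is the 0-based index of core position 1 within seq.
    Only the given strand is scanned (an assumption of this sketch).
    """
    seq = seq.upper()
    sites = []
    for match in re.finditer(re.escape(core), seq):
        start = match.start()
        left, right = start - flank, start + len(core) + flank
        if left >= 0 and right <= len(seq):
            sites.append((start, seq[left:right]))
    return sites


def to_position(index, core_start):
    """Convert a 0-based index to the text's coordinate (core starts at 1)."""
    return index - core_start + 1


def site_read_count(core_start, fwd_starts, rev_starts):
    """Forward 5' starts in -39..+5 plus reverse 5' starts in +2..46."""
    fwd = sum(1 for i in fwd_starts if -39 <= to_position(i, core_start) <= 5)
    rev = sum(1 for i in rev_starts if 2 <= to_position(i, core_start) <= 46)
    return fwd + rev


def relative_energy(count, reference_count):
    """Negative natural log of count / reference_count.

    The choice of reference and the handling of zero counts are assumptions.
    """
    if count <= 0 or reference_count <= 0:
        return None  # undefined; such sites would need separate handling
    return -math.log(count / reference_count)
```

As a usage example, `find_core_sites("A" * 40 + CORE + "T" * 40)` yields a single site whose core starts at index 40 and whose aligned window is 86 bases long (40 + 6 + 40). A site with half as many reads as the chosen reference would receive a relative energy of ln 2 under these assumptions.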