This SuperSeries is composed of the following subset Series: GSE36735: Distribution of Drosophila insulator protein BEAF-32 in Wing imaginal tissue (Wildtype) [ChIP-seq] GSE36736: Genome wide transcriptional profiling of BEAF-32 in wing imaginal tissues of wildtype and mutants [expression array] Refer to individual Series
Contributors:Zabet, Nicolae Radu, Adryan, Boris
Quantifying the distances between Giant, Hunchback and Kruppel ChIP-seq profiles and the profiles derived with the analytical model which includes DNA accessibility data. This is the same as Figure fig:heatmapChIPseq_GT_HB_KR_NoaccGroup070, except that we included binary DNA accessibility data in the analytical model. fig:heatmapChIPseq_GT_HB_KR_AccRegionsGroup070... Quantifying the distances between Bicoid and Caudal ChIP-seq profiles and the profiles derived with the analytical model. We plotted heatmaps for the correlation ( A ) and ( B ) and mean squared error ( C ) and ( D ) between the analytical model and the ChIP-seq profile of Bicoid ( A , C ) and Caudal ( B , D ). We computed these values for different sets of parameters: N ∈ [1, 10^6] and λ ∈ [0.25, 5]. We considered only the sites that have a PWM score higher than 70 % of the difference between the lowest and the highest score. ( A , B ) Orange colour indicates high correlation between the analytical model and the ChIP-seq profile, while white colour indicates low correlation. ( C , D ) Blue colour indicates low mean squared error between the analytical model and the ChIP-seq profile, while white colour indicates high mean squared error. ( E , F ) We plotted the regions where the mean squared error is in the lower 12 % of the range of values (blue) and the correlation is in the upper 12 % of the range of values (orange). With a green rectangle we marked the optimal set of parameters in terms of mean squared error and with a black rectangle the set of parameters for which the two regions intersect. fig:heatmapChIPseq_BCD_CAD_NoaccGroup070... First, there is an inconsistency in the experimental data, in the sense that there are peaks in the ChIP-seq profile that are located in DNA inaccessible areas, e.g. there are peaks in the Bicoid ChIP-seq profile at the run, slp, eve, tll, gt and oc loci that overlap with DNA that is marked as inaccessible; see Figure fig:profileAllPositivesBCD in the Appendix.
This indicates that the DNA accessibility data, the ChIP-seq data, or both display some technical biases and, in these cases, the analytical model assumes that the DNA accessibility data is accurate and predicts that there is no binding in DNA inaccessible areas. One solution is to use continuous data for DNA accessibility, where different areas display different levels of accessibility. When using continuous values for DNA accessibility data, we did not observe any improvements of our model’s predictions. Nevertheless, we still observed ChIP-seq peaks for all five TFs that overlapped with regions of reduced or no accessibility, thus indicating that one or both data sets (ChIP-seq or DNase I) contain experimental biases.... Nevertheless, Figures fig:profileAllPositivesHB and fig:profileAllPositivesKR in the Appendix show that the ChIP-seq profiles of Hunchback and Kruppel display some sharp peaks, which suggests that these two TFs display higher specificity than predicted by our approach. This contradicts our findings, and one explanation for the few narrow ChIP-seq peaks is that these two TFs bind cooperatively to the genome. In this scenario, in the few narrow peaks for Hunchback and Kruppel, these TFs co-localise with co-factor(s), and previous studies identified that this is the case for both TFs. This means that, by using our model, one could potentially underestimate the number of peaks in the binding profile.... The influence of weak binding on Hunchback and Kruppel ChIP-seq profiles. We plotted heatmaps for the correlation ( A ) and ( B ) and mean squared error ( C ) and ( D ) between the analytical model and the ChIP-seq profile of Hunchback ( A , C ) and Kruppel ( B , D ). The analytical model includes binary DNA accessibility data (the accessibility of any site can be either 0 or 1 depending on whether the site is accessible or not). We computed these values for different sets of parameters: N ∈ [1, 10^6] and λ ∈ [0.25, 5].
Colour code as above. PWM filtering as in Figure fig:heatmapChIPseq_BCD_CAD_GT_AccRegionsGroup030. ( E , F ) We plotted the regions where the mean squared error is in the lower 12 % of the range of values (blue) and the correlation is in the upper 12 % of the range of values (orange). With a green rectangle we marked the optimal set of parameters in terms of mean squared error and with a black rectangle the set of parameters for which the two regions intersect. fig:heatmapChIPseq_HB_KR_AccRegionsGroup030... Genome-wide quality of the fit. The boxplots represent the ( A , C ) correlation and ( B , D ) mean squared error between the ChIP-seq data sets and the analytically estimated profiles. We partitioned the genome into 20 Kbp regions and we kept only the regions that had at least one DNA accessible site (4599 regions). Next, for each ChIP-seq data set we selected the regions where the mean ChIP-seq signal is higher than a proportion of the background (see Table tab:ChIPseqProfileStatistics in the Appendix). In ( A , B ), we selected the regions with a mean ChIP-seq signal higher than the background ( > B ). In ( C , D ), we selected the regions with a mean ChIP-seq signal higher than half the background ( > 0.5 ⋅ B ). The numbers of DNA regions that display a mean ChIP-seq signal higher than the thresholds are listed in Table ... Quantifying the distances between Bicoid and Caudal ChIP-seq profiles and the profiles derived with the analytical model which includes DNA accessibility data. This is the same as Figure fig:heatmapChIPseq_BCD_CAD_NoaccGroup070, except that we included binary DNA accessibility data in the analytical model. fig:heatmapChIPseq_BCD_CAD_AccRegionsGroup070... Binding profiles for Hunchback at all 21 loci.
The grey shading represents a ChIP-seq profile, the red line represents the prediction of the analytical model, the yellow shading represents the inaccessible DNA and the vertical blue lines represent the percentage of occupancy of the site (we only displayed sites with an occupancy higher than 5 %). We considered the optimal set of parameters for Hunchback (2000 molecules and λ = 3.00).... One advantage of our analytical model is that it can be used to predict the binding profiles genome-wide and, thus, we extended the analysis from the original twenty-one loci to the entire genome. We partitioned the genome into 20 Kbp regions, from which we removed regions that did not have any accessible site. For each ChIP-seq profile, we then selected the regions that display a ChIP-seq signal higher than the genome-wide background. We found that the quality of our model’s predictions varies widely; see Figure fig:genomeWideQuality ( A ) and ( B ). In particular, there are regions where the correlation between our model predictions and the ChIP-seq profile is high, but at the same time regions where this correlation is low.... Kaplan et al. found that, at loci with low binding (low ChIP-seq signal), the correlation between the statistical thermodynamics model and the ChIP-seq profile was low. To test whether this is valid genome-wide, we also analysed regions where the mean signal is higher than half of the genome-wide background (leading to an increase in the number of investigated loci). Our results confirm that there is a decrease in the mean correlation when including regions with lower ChIP-seq signal; see Figure fig:genomeWideQuality ( C ). We also performed a Kolmogorov-Smirnov test, which showed that in the case of Bicoid and Caudal this difference is statistically significant; see Figure fig:GenomeWideKSPvalue in the Appendix.
This also means that, at least for regions with strong binding, the model predictions are highly correlated with the ChIP-seq profile, as previously found; see Figure fig:genomeWideQuality. Nevertheless, for regions with low binding, in addition to the reduction in the correlation we also observed a decrease in the mean squared error, which is statistically significant in the case of Bicoid, Caudal and Kruppel; see Figure fig:GenomeWideKSPvalue in the Appendix. Note that for Giant and Hunchback the difference is not statistically significant due to the small number of loci included in the analysis; see Table tab:GenomeWideNoOfRegions in the Appendix. This indicates that our model is able to correctly capture the low signal in those regions, but there is little or no correlation to the actual ChIP-seq signal. One explanation for this result is that, in those regions, there is little or no binding and what the ChIP-seq method recovers might be considered technical noise.
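The genome-wide comparison described above (20 Kbp regions, an accessibility filter, a background threshold, then correlation and mean squared error per region) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `score_windows` and `ks_statistic`, the per-base list inputs and the `factor` parameter (1.0 for > B, 0.5 for > 0.5 ⋅ B) are assumptions introduced here. The last function computes only the two-sample Kolmogorov-Smirnov statistic; a p-value would need a further step, for example `scipy.stats.ks_2samp`.

```python
import math
from bisect import bisect_right

WINDOW = 20_000  # 20 Kbp regions, as in the text


def pearson(x, y):
    """Pearson correlation; NaN when either profile is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def mse(x, y):
    """Mean squared error between two equally long profiles."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def score_windows(chip, model, accessible, background, factor=1.0, window=WINDOW):
    """Return (correlation, mse) for every window that passes both filters.

    chip, model -- per-base signal values (hypothetical input format)
    accessible  -- per-base 0/1 accessibility flags
    background  -- genome-wide background level B
    factor      -- proportion of the background used as threshold
    """
    scores = []
    for start in range(0, len(chip), window):
        c = chip[start:start + window]
        m = model[start:start + window]
        a = accessible[start:start + window]
        if not any(a):  # drop regions without any accessible site
            continue
        if sum(c) / len(c) <= factor * background:  # keep mean signal above threshold
            continue
        scores.append((pearson(c, m), mse(c, m)))
    return scores


def ks_statistic(sample1, sample2):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    s1, s2 = sorted(sample1), sorted(sample2)
    d = 0.0
    for v in s1 + s2:
        f1 = bisect_right(s1, v) / len(s1)
        f2 = bisect_right(s2, v) / len(s2)
        d = max(d, abs(f1 - f2))
    return d
```

Under these assumptions, calling `score_windows` once with `factor=1.0` and once with `factor=0.5`, then passing the two lists of correlations (or of mean squared errors) to `ks_statistic`, mirrors the comparison between the two selections of regions.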
Loss of Lsd1 in specific cells of the Drosophila ovary results in increased BMP signaling outside the cap cell niche and an expanded germline stem cell (GSC) phenotype. To better characterize the function of Lsd1 in different cell populations within the ovary, we performed chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-seq). This analysis shows that Lsd1 associates with a surprisingly limited number of sites in escort cells and with fewer, and often different, sites in cap cells. These findings indicate that Lsd1 displays highly selective binding in specific cellular contexts. Examination of epitope-tagged Lsd1 transgenes in specific cell populations within the Drosophila ovary
A central question in biology is how enhancers are made accessible. The Drosophila embryo is a good model system to study this question as the gene regulatory networks regulating early developmental events have been well characterized. Zelda (Zld) is a uniformly distributed transcription factor (TF) integral to these networks, acting prior to and in collaboration with the patterning TFs to regulate target enhancers. Here we test the hypothesis that Zld directs TF binding, exemplified by Dorsal (Dl) which patterns the dorsoventral axis, across the genome by displacing nucleosomes at enhancers. By performing ChIP-seq and MNase-seq experiments on early embryos with or without Zld, we demonstrated that early enhancers are characterized by an intrinsically high nucleosome barrier, which is overcome by Zld, and that without Zld, Dl binding decreases at enhancers and redistributes to open regions devoid of enhancer activity. We propose that enhancers are initially specified across the genome by the binding of Zld, which locally decreases nucleosome occupancy, thereby assisting TFs in accessing their binding motifs and promoting transcriptional activity. Zld, Dl, Pol II ChIP-seq and MNase-seq profiles comparing 2-3h wild-type (wt) and zld- embryos, and MNase-seq profiles comparing 2-4h wt and gd7 embryos, all in 2 replicates
modENCODE_submission_4192 This submission comes from a modENCODE project of David MacAlpine. For full list of modENCODE projects, see http://www.genome.gov/26524648 Project Goal: We will precisely identify sequence elements that direct DNA replication by using chromatin immunoprecipitation of known replication initiation complexes. These experiments will be conducted in multiple cell types and developmental tissues. For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf EXPERIMENT TYPE: CHIP-seq. BIOLOGICAL SOURCE: Strain: Oregon-R(official name : Oregon-R-modENCODE genotype : wild type ); Developmental Stage: Embryo 4-7h; Genotype: wild type; EXPERIMENTAL FACTORS: Strain Oregon-R(official name : Oregon-R-modENCODE genotype : wild type ); read length (read_length) ; Antibody dORC2 (target is Drosophila ORC2p); Developmental Stage Embryo 4-7h
Contributors:Grzegorz Sienski, Derya Dönertas, Julius Brennecke
Related to Figure 7
(A and B) Shown are normalized density profiles of Pol II ChIP-seq (red), GRO-seq (black), RNA-seq (brown) and H3K9me3 ChIP-seq (green) for the indicated OSC knockdowns (left).
(A) Shown is the ∼140kb area with an mdg1 insertion upstream of the typically non-expressed gene CG15278. Upon loss of the piRNA pathway transcriptional bleeding from the TE insertion into the CG15278 locus leads to accumulation of RNA reads.
(B) Shown is a ∼120kb area with a 17.6 insertion in sense orientation into an intron of the Btk29A transcription unit. This insertion triggers H3K9me3 spreading, which depends on Piwi but only weakly on Mael. Loss of the piRNA pathway does not lead to upregulation of the host gene, classifying this insertion.
... The piRNA Pathway Silences TEs at the Transcriptional Level
(A) Experimental scheme of genome-wide profiling experiments performed for this study.
(B) Scatter plot of RPKM values (log2) for all TEs (n = 125) in GFP (control) or piwi knockdown samples based on RNA-seq. Four TE groups are color coded.
(C) Scatter plot of RPKM values (log2) for all TEs (n = 125) in GFP (control) or mael knockdown samples based on RNA-seq. Four TE groups are color coded.
(D) Displayed are fold changes of TE expression (groups I–III; colors as in B) in OSCs transfected with indicated siRNAs (normalized to control cells) at the level of steady-state sense RNA (RNA-seq; heatmap), Pol II occupancy (ChIP-seq), or nascent sense RNA (GRO-seq). The piRNA-seq diagram indicates Piwi-bound piRNA levels mapping antisense to indicated TEs.
(E) Density profiles of normalized reads from RNA-seq (top), Pol II ChIP-seq (middle), and GRO-seq (bottom) experiments on mdg1 (group I) and F-element (group III). Orange line indicates levels in control cells, and solid signal indicates levels in piwi KD cells.
(F–H) Box plots showing fold changes (log2) in the expression of group I, group II, and group III TEs based on RNA-seq (F), Pol II ChIP-seq (G), or GRO-seq (H) upon piwi KD (compared to control; p values based on Wilcoxon rank-sum test). Box plots show median (line), 25th–75th percentile (box) ± 1.5 interquartile range; circles represent outliers. Contrasted are sense and antisense reads (RNA-seq and GRO-seq) and IP versus input (Pol II ChIP-seq).
See also Figure S2.
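The RPKM values and log2 fold changes referred to in this legend follow the standard definitions, which can be sketched as below. This is an illustrative sketch only: the function names and the `pseudocount` parameter (added here to avoid division by zero for TEs without reads) are assumptions and are not taken from the study.

```python
import math


def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    return read_count * 1e9 / (length_bp * total_mapped_reads)


def log2_fold_change(knockdown, control, pseudocount=1.0):
    """log2 ratio of knockdown over control; pseudocount is an assumption."""
    return math.log2((knockdown + pseudocount) / (control + pseudocount))
```

For example, a 5 kb element with 500 mapped reads in a library of 10 million mapped reads has an RPKM of 10.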
... Related to Discussion
(A) Shown are normalized density profiles of Pol II ChIP-seq (red), GRO-seq (black), RNA-seq (brown), H3K9me3 ChIP-seq (green) and piRNA-seq (light green) for the indicated OSC knockdowns (left). Shown is the ∼20kb area around the transcriptional start site of the flamenco cluster. Shown are only reads mapping uniquely to the genome but we note that nearly all areas in this window are genome-unique.
(B) Western blot showing protein levels of Armi, Piwi, Lamin, Mael, HP1 and Histone 3 (H3) in cytoplasmic, nucleoplasmic, soluble and insoluble chromatin fractions of OSCs. The relative amount of each fraction loaded per lane (based on fraction volume) is given below. The following antibodies were used: α-Lamin (ADL67.10, DSHB), α-HP1 (C1A9, DSHB) and α-H3 (Abcam, ab1791).
... Related to Discussion
Shown are normalized density profiles of Pol II ChIP-seq (red), GRO-seq (black), RNA-seq (brown), H3K9me3 ChIP-seq (green) and piRNA-seq (light green) for the indicated OSC knockdowns (left). Shown is a ∼60kb area that resides in the peri-centromeric heterochromatin of chromosome 2R (cytological position 42A); the position of the mdg1 insertion (minus strand) is indicated; note the absence of piRNAs mapping to this region and the massive spreading of H3K9me3 in mael KD cells.
... Piwi-RISC Mediates TGS of TEs in Ovarian Somatic Cells
(A) Scheme of a Drosophila ovariole and an individual egg chamber (somatic cells in green, germline cells in beige). Indicated is the classification of TEs according to Malone et al. (2009).
(B) Scatter plot of Pol II ChIP-seq RPKM values (log2) for all TEs (n = 125; color code as in A) from control KD ovaries (tj-GAL4 > RNAi tej) versus armi KD ovaries (tj-GAL4 > RNAi armi).
(C) Density profiles of normalized Pol II ChIP-seq reads on ZAM and gypsy (soma dominant) and Burdock and HeT-A (germline dominant). Orange line indicates levels in control, and solid signal indicates levels in armi KD ovaries.
(D) Box plots indicating fold enrichments (log2) of Pol II ChIP-seq reads on TEs belonging to the indicated classes. Contrasted are IP (Pol II) versus input (p values based on Wilcoxon rank-sum test). Box plots are as in Figure 3.
(E) Normalized Pol II ChIP-seq read density on the gypsy-lacZ reporter in control ovaries (black line) versus armi KD ovaries (red line). Small inset displays the fold change (armi KD versus control) of Pol II occupancy on the reporter.
(F) Shown to the left are β-gal stainings of egg chambers from gypsy-restrictive ovaries (top) and gypsy-permissive ovaries (bottom) harboring the gypsy-lacZ reporter (Sarot et al., 2004). In the center, piRNA levels (black, restrictive strain; red, permissive strain) mapping to the indicated TEs (sense up, antisense down; normalized to 1 million miRNAs) are displayed, and the portion of gypsy present in the gypsy-lacZ reporter (cartoon at top) is indicated.
(G) Shown is the Pol II ChIP-qPCR analysis on the gypsy-reporter (primers 1 and 2 indicated in F) in ovaries from restrictive versus permissive strains (enrichments calculated over intergenic region; n = 3; error bars represent SD).
See also Figure S3.
Contributors:Tamer Ali, Rainer Renkawitz, Marek Bartkuhn
Insulators, chromatin domains and topologically associated domains (TADs). Interaction matrix representing a virtual Hi-C experiment (top). The grey scale above indicates interaction frequencies. Interactions occur predominantly within TADs (e.g. enhancer–promoter interactions), which are often grouped in subdomains. Interactions between TAD boundaries are thought to depend on the binding of CTCF (shown as a schematic ChIP-seq track in red) to its cognate DNA-binding motif (black arrows). CTCF sites not involved in binding to TAD boundaries are shown in pale red and grey motifs, respectively. Motifs involved in long-range chromatin interactions show an inverted repeat orientation (see Figure 2). As not all TAD boundaries are bound by CTCF it is likely that additional factors may be involved in their function (indicated by question mark). TADs are often co-incident with chromatin domains represented by a schematic ChIP-seq track for an active (H3K36me3; green) and a repressive (H3K27me3; blue) histone modification. Active TADs are gene-rich (black bars for active genes) in contrast to gene-poor repressed domains (grey bars).
... Insulator components with conserved features in vertebrates and Drosophila.
... Drosophila CP190 recruitment and strength of TAD boundaries/insulators correlate with combinatorial binding of architectural proteins. (a) The interaction matrix represents TADs. Boundaries between TADs are often marked by CP190 binding (schematic ChIP-seq track, blue). CP190 is recruited to chromatin by a wide variety of insulator binding factors (IBPs, as exemplified by CTCF, BEAF32 and Pita in schematic ChIP-seq tracks). Frequently, different insulator binding factors cluster together, suggesting a cooperative recruitment mode for targeting CP190 to chromatin. Combinatorial recruitment of CP190 to TAD boundaries may be functionally important since high occupancy of IBPs and other architectural proteins such as cohesin, condensin and TFIIIC predict the strength of insulator function as well as TAD borders [54••]. It should be noted that not all TAD boundaries are bound by known IBPs (?) and that many IBP binding sites are found within TADs. (b) The physical DNA string model summarizes the contact and binding data illustrated in (a).
Contributors:Bell O, Schwaiger M, Oakeley EJ, Lienert F, Beisel C, Stadler MB, Schübeler D
Full title: Complex patterns of genome accessibility discriminate sites of PcG repression, H4K16 acetylation and replication initiation Histone modifications have been proposed to regulate gene expression in part by modulating DNA accessibility and higher-order chromatin structure. However, there is limited direct evidence to support structural differences between euchromatic and heterochromatic fibers in the nucleus. To ask how histone modifications relate to chromatin compaction, we measured DNA accessibility throughout the genome by combining M.SssI methylase footprinting with methylated DNA immunoprecipitation (MeDIP-footprint). In the Drosophila genome, we find that accessibility to DNA methylase is variable in a manner that relates to the differential distribution of active and repressive histone modifications. Active promoters are highly permissive to M.SssI activity, yet inactive chromosomal domains decorated with H3 lysine 27 trimethylation are least accessible, providing in vivo evidence for Polycomb-mediated chromatin compaction. Conversely, DNA accessibility is increased at active chromosomal regions marked with H4 lysine 16 acetylation and at the dosage-compensated male X chromosome, suggesting that Drosophila transcriptional dosage compensation is facilitated by a more permissive chromatin structure. Interestingly, early replicating chromosomal regions and sites of replication initiation also show higher accessibility, linking temporal and spatial control of genome duplication to the structural organization of chromatin. In conclusion, using a novel protocol we generated a comprehensive view of DNA accessibility and uncovered different levels of chromatin organization, which are delineated by distinct patterns of posttranslational histone modifications and replication.
Keywords: cell type comparison, ChIP-chip, MeDIP-footprint, RNA-seq, ChIP-seq MeDIP-footprint and ChIP-chip: ChIP-chip was performed for H3K4me3, H3K36me2, H3K36me3, H3K27me3, and H3K9me2 in Kc cells. We measured DNA accessibility throughout the genome by combining M.SssI methylase footprinting with methylated DNA immunoprecipitation (MeDIP-footprint) in Kc and S2 cells. RNA-seq: cDNA from RNA from Drosophila Kc cells was sequenced using Illumina deep sequencing. Reads were mapped and the abundance of all transcripts was determined. ChIP-seq: PSC ChIP from Drosophila Kc cells was sequenced using Illumina deep sequencing in three lanes. Reads were mapped and the binding profile of PSC was determined.
Contributors:Artyom A. Alekseyenko, Shouyong Peng, Erica Larschan, Andrey A. Gorchakov, Ok-Kyung Lee, Peter Kharchenko, Sean D. McGrath, Charlotte I. Wang, Elaine R. Mardis, Peter J. Park, Mitzi I. Kuroda
MSL3-Independent Chromatin Entry Sites Are a Subset of the Wild-Type Binding Pattern for MSL Complex and Coincide with the Strongest Enrichment Peaks Detected by ChIP-seq
Two representative chromatin entry sites, CES11D1 (A) and CES15A8 (B), are shown. ChIP-chip profiles were generated from y w; MSL3-TAP; msl3 embryonic chromatin (WT) using IgG to IP the TAP epitope, or msl3 mutant embryos (CES) using anti-MSL2 antibodies. DNA resulting from ChIP was hybridized to custom NimbleGen tiling arrays (Alekseyenko et al., 2006). The y axis shows the log2 ratio of IP/Input signal. The ChIP-seq tag profile (Solexa) was obtained from an MSL3-TAP transformed male cell line, Clone 8, using IgG to IP the TAP epitope. The ChIP-seq profile displays broad distribution along WT MSL targets and high peaks that correspond to entry sites. The y axis shows the tag density. Gray lines within ChIP-chip and ChIP-seq panels indicate the regions identified as bound clusters (See Experimental Procedures for details). Genes are color-coded based on their transcriptional status (transcribed, red; nontranscribed, black; genes that are differentially transcribed in S2 and Clone 8 cells, salmon; and genes without transcriptional data, gray). Genes on the top row are transcribed left to right, and genes on the bottom row are transcribed from right to left. Numbers along the x axis refer to chromosomal position (bp) (Dm1 release coordinates). Polytene map cytological locations are indicated below.
Contributors:Nathan Boley, Kenneth H. Wan, Peter J. Bickel, Susan E. Celniker
Heat map visualization of modENCODE ChIP-seq (left) and RNA-seq (right) data.