Contributors:Joshi, Anagha, Beck, Yvonne, Michoel, Tom
Transcription factors and their number of target genes in the D. melanogaster ChIP-chip gold standard network and in the predicted networks for six Drosophila species at the 10% recall level (in brackets for each TF the number of true positive predictions). The bottom two rows are the total number of interactions in each network and the overall precision (percentage of true positives) of the predicted networks.... We collected gene expression data for 3,610 genes in six Drosophila species measured at 9–13 time points during early embryonic development with 3–8 replicates per time point (200 samples in total) . To obtain a global view on the similarities and differences between samples, we performed multi-dimensional scaling using Sammon’s nonlinear mapping criterion on the 3,610-dimensional sample vectors (cf. Methods and Figure fig:sammona). The first (horizontal) axis of variation corresponded to developmental time, with samples ordered along this dimension according to increasing developmental time points, while the second (vertical) axis of variation corresponded to evolutionary distance, with samples ordered along this dimension according to species. By expanding these two axes of variation into principal components, we found that the “developmental” dimension explained 34% of the total variation in the data, while the “evolutionary” dimension explained 11% (cf. Methods). This result confirms that variations in gene expression levels across Drosophila species at the same developmental time point are not greater than variations across time points within the same species. In this study we were interested whether this additional layer of inter-species expression variation can be harnessed in the reconstruction of gene regulatory networks.... To quantitatively compare different methods across different gold standard networks we considered the area under the recall–precision curve (AUC) and the precision at 10% recall (PREC10) as performance measures and converted them to P -values by comparison to AUCs and PREC10s of networks generated by randomly assigning ranks to all possible edges in the corresponding gold standard network (cf. Methods and Figure fig:rec-prec-aggr for the recall vs. precision curves). While the AUC assesses the overall performance of a predicted network, PREC10 measures the quality of the top-ranked predictions, a property that may be of greater practical relevance. This analysis showed that no predicted network performs best for either measure across all gold standards (Figure fig:scorea-f). The single-species virilis networks performed best for 5 out of 12 AUC and PREC10 scores, albeit not for the ChIP-seq network measured in its own species. This overall good performance is consistent with virilis having the highest number of measured time points in the data (Supplementary Table tab:data). D. melanogaster also had more data points available than the other four species, but its time series were less complete (Supplementary Table tab:data). Among the integrative methods, the centroid and union methods both performed best for 5 out of 12 AUC and PREC10 scores (Figure fig:scorea-f). Both also had higher average AUC score than the best single-species network, but only the centroid method had higher average PREC10 score than the best single-species network (Figure fig:scoreg). The most important result however is the fact that the single-species network for the species were the gold standard network was measured never has the highest single-species AUC and only twice has the highest PREC10. In contrast, the centroid method always performs as good, and in most cases better, than the single-species network for the reference species (Figure fig:scorea-f). We conclude that the centroid method is the most robust network integration method achieving consistently high AUC and PREC10 scores, at least on this dataset.... Embryonic developmental time-course expression data in 6 Drosophila species (D. melanogaster (“amel”), D. ananassae (“ana”), D. persimilis (“per”), D. pseudoobscura (“pse”), D. simulans (“sim”) and D. virilis (“vir”)) was obtained from (ArrayExpress accession code E-MTAB-404). The data consists of 10 (amel), 13 (vir) or 9 (ana, per, pse, sim) developmental time points with several replicates per time point resulting in a total of 56 (amel), 36 (vir) or 27 (ana, per, pse, sim) arrays per species (Supplementary Table tab:data). The downloaded data was processed by averaging absolute expression levels over all reporters for a gene followed by taking the log 2 transform.... a. Number of interactions found in one to six species in the inferred gene regulatory networks at 10% recall level (red dots) and in 100 randomized networks with the same in- and out-degree distribution as the inferred networks (boxplots). b. Percentage of all predicted interactions (yellow) and of all true positive predictions (blue) in one to six species c. Precision of interactions found in one to six species. d. Recall of ChIP-seq gold standard interactions conserved in one to three species (green; data for BCD, KR, HB) and one to four species (red; data for TWI). e. Phylogenetic tree between six Drosophila species reconstructed from the inferred interactions at 10% recall level, with the total number of interactions in each species shown in brackets. The tree correctly splits the species in 3 groups – melanogaster (top), obscura (middle), virilis (bottom). Each branch, (numbered 1–9) represents a inferred network state transition. At each network state transition, the number of interactions inferred to be gained (red) or lost (blue) as well as the bootstrap value for each branch (in brackets) is indicated.... Although the gold standard network reconstructed from ChIP-chip data was in D. melanogaster, perhaps surprisingly the D. melanogaster predicted network did not perform better overall than the networks predicted for the other species (Figure fig:sammonb). To get confidence in this observation, we downloaded ChIP-sequencing data for three TFs (BCD, KR, HB) in three Drosophila species (melanogaster, pseudoobscura and virilis) and one TF (TWI) in four species (melanogaster, simulans, ananassae and pseudoobscura) , and created ChIP-seq gold standard networks for five species (cf. Methods). The recall-precision curves generated from the D. melanogaster ChIP-seq gold standard network (Supplementary Figure fig:rec-prec-singleb) were in good agreement with the ChIP-chip data, demonstrating again that the D. melanogaster predicted network performed no better than other Drosophila species. We also calculated recall-precision curves using the D. ananassae, D. pseudobscura, D. simulans and D. virilis ChIP-seq gold standard networks. Again, the regulatory network in that species did not perform better compared to the other species (Supplementary Figure fig:rec-prec-singlec–f).... Performance scores with respect to the gold standard ChIP-chip network for 14 TFs in D. melanogaster (a) and the ChIP-seq networks for D. melanogaster (b, 4 TFs), D. ananassae (c, 1 TF), D. pseudoobscura (d, 4 TFs), D. simulans (e, 1 TF), D. virilis (f, 4 TFs), and their averages over all gold standard networks (g). In each panel, the left, resp. right, figure shows - log 10 P A U C , resp. - log 10 P P R E C 10 for the six single-species predicted networks (green) and the five prediction aggregation methods (red). The dashed lines indicate the performance level of the single-species network for the gold standard species (a–f) or of the best performing single-species network (g). Values with a ∗ in panel a indicate numerical underflow values truncated to the smallest non-zero P -value ( 10 -324 ).... Recall vs. precision curves for predicted regulatory networks for five multi-species meta-analysis methods. The gold standard networks were the ChIP-chip network for 14 TFs in D. melanogaster (a) and the ChIP-seq networks for D. melanogaster (b, 4 TFs), D. ananassae (c, 1 TF), D. pseudoobscura (d, 4 TFs), D. simulans (e, 1 TF) and D. virilis (f, 4 TFs). The numbers in each legend are the area under the curve for each method.... Recall vs. precision curves for predicted regulatory networks in six Drosophila species. The gold standard networks were the ChIP-chip network for 14 TFs in D. melanogaster (a) and the ChIP-seq networks for D. melanogaster (b, 4 TFs), D. ananassae (c, 1 TF), D. pseudoobscura (d, 4 TFs), D. simulans (e, 1 TF) and D. virilis (f, 4 TFs). In panel a, the numbers in the legend are the area under the curve for each species. In panel b–f, the curve for the reference species is in red while the other species are in black.
Quantifying the distances between Giant, Hunchback and Kruppel ChIP-seq profiles and the profiles derived with the analytical model which includes DNA accessibility data. This is the same as Figure fig:heatmapChIPseq_GT_HB_KR_NoaccGroup070, except that we included binary DNA accessibility data in the analytical model. fig:heatmapChIPseq_GT_HB_KR_AccRegionsGroup070... Quantifying the distances between Bicoid and Caudal ChIP-seq profiles and the profiles derived with the analytical model. We plotted heatmaps for the correlation ( A ) and ( B ) and mean squared error ( C ) and ( D ) between the analytical model and the ChIP-seq profile of Bicoid ( A , C ) and Caudal ( B , D ). We computed these values for different sets of parameters: N ∈ 1 10 6 and λ ∈ 0.25 5 . We considered only the sites that have a PWM score higher than 70 % of the difference between the lowest and the highest score. ( A , B ) Orange colour indicates high correlation between the analytical model and the ChIP-seq profile, while white colour low correlation. ( C , D ) Blue colour indicates low mean squared error between the analytical model and the ChIP-seq profile, while white colour high mean squared error. ( E , F ) We plotted the regions where the mean square error is in the lower 12 % of the range of values (blue) and the correlation is the higher 12 % of the range of values (orange). With green rectangle we marked the optimal set of parameters in terms of mean squared error and with a black rectangle the intersection of the parameters for which the two regions intersect. fig:heatmapChIPseq_BCD_CAD_NoaccGroup070... First, there is an inconsistency in the experimental data in the sense there are peaks in the ChIP-seq profile that are located in DNA inaccessible areas, e.g. there are peaks in the Bicoid ChIP-seq profile at run, slp, eve, tll, gt, oc loci that overlap with DNA that is marked as inaccessible; see Figure fig:profileAllPositivesBCD in the Appendix. This indicates that either or both the DNA accessibility or the ChIP-seq data display some technical biases, e.g. , and, in these cases, the analytical model assumes that the DNA accessibility data is accurate and predicts that there is no binding in DNA inaccessible areas. One solution is to use continuous data for DNA accessibility, where different areas display different levels of accessibility. When using continuous values for DNA accessibility data, we did not observe any improvements of our model’s predictions. Nevertheless, we still observed ChIP-seq peaks for all five TFs that were overlapping with regions with reduce or no accessibility, thus, indicating the one or both data sets (ChIP-seq or DNase I) contain experimental biases; e.g. .... Nevertheless, Figures fig:profileAllPositivesHB and fig:profileAllPositivesKR in the Appendix show that the ChIP-seq profiles of Hunchback and Kruppel display some sharp peaks, which suggest that these two TFs display higher specificity than predicted by our approach. This contradicts our findings and one explanation for the few narrow ChIP-seq peaks is that these two TFs bind cooperatively to the genome. In this scenario, in the few narrow peaks for Hunchback and Kruppel, these TFs co-localise with co-factor(s) and previous studies identified that this is the case for both TFs; e.g. . This means that, by using our model, one could potentially underestimate the number of peaks in the binding profile.... The influence of weak binding on Hunchback and Kruppel ChIP-seq profiles. We plotted heatmaps for the correlation ( A ) and ( B ) and mean squared error ( C ) and ( D ) between the analytical model and the ChIP-seq profile of Hunchback A C and Kruppel B D . The analytical model includes binary DNA accessibility data (the accessibility of any site can be either 0 or 1 depending on whether the site is accessible or not). We computed these values for different sets of parameters: N ∈ 1 10 6 and λ ∈ 0.25 5 . Colour code as above. PWM filtering as in Figure fig:heatmapChIPseq_BCD_CAD_GT_AccRegionsGroup030. ( E , F ) We plotted the regions where the mean squared error is in the lower 12 % of the range of values (blue) and the correlation is the higher 12 % of the range of values (orange). With green rectangle we marked the optimal set of parameters in terms of mean squared error and with a black rectangle the intersection of the parameters for which the two regions intersect. fig:heatmapChIPseq_HB_KR_AccRegionsGroup030... Genome-wide quality of the fit. The boxplots represent the A C correlation and B D mean squared error between the ChIP-seq data sets and the analytically estimated profiles. We partitioned the genome in 20 K b p regions and we kept only the regions that had at least one DNA accessible site ( 4599 regions). Next for each ChIP-seq data set we selected the regions where the mean ChIP-seq signal is higher than a proportion of the background (see Table tab:ChIPseqProfileStatistics in the Appendix). In A B , we selected the regions with a mean ChIP-seq signal higher than the background ( > B ). In C D , we selected the regions with a mean ChIP-seq signal higher than half the background ( > 0.5 ⋅ B ). The numbers of DNA regions that display a mean ChIP-seq signal higher than the thresholds are listed in Table ... Quantifying the distances between Bicoid and Caudal ChIP-seq profiles and the profiles derived with the analytical model which includes DNA accessibility data. This is the same as Figure fig:heatmapChIPseq_BCD_CAD_NoaccGroup070, except that we included binary DNA accessibility data in the analytical model. fig:heatmapChIPseq_BCD_CAD_AccRegionsGroup070... Binding profiles for Hunchback at all 21 loci. The grey shading represents a ChIP-seq profile, the red line represents the prediction of the analytical model, the yellow shading represents the inaccessible DNA and the vertical blue lines represent the percentage of occupancy of the site (we only displayed sites with an occupancy higher than 5 % ). We considered the optimal set of parameters for Hunchback ( 2000 m o l e c u l e s and λ = 3.00 ).... One advantage of our analytical model is that it can be used to predict the binding profiles genome-wide and, thus, we extended the analysis from the original twenty one loci to the entire genome. We partitioned the genome in 20 K b p regions, from which we removed regions that did not have any accessible site. For each ChIP-seq profile, we then selected the regions that display a ChIP-seq signal higher than the genome-wide background. We found that the quality of our model’s predictions vary widely; see Figure fig:genomeWideQuality ( A ) and ( B ). In particular, there are regions where the correlation between our model predictions and the ChIP-seq profile is high, but at the same time regions where this correlation is low.... Kaplan et al. found that, at loci with low binding (low ChIP-seq signal), the correlation between the statistical thermodynamics model and the ChIP-seq profile was low. To test whether this is valid genome-wide, we also analysed regions where the mean signal is higher than half of the genome-wide background (leading in an increase in the number of investigated loci). Our results confirm that there is a decrease in the mean correlation when including regions with lower ChIP-seq signal; see Figure fig:genomeWideQuality ( C ). We also perform a Kolmogorov-Smirnov test that showed that in the case of Bicoid and Caudal this difference is statistically significant; see Figure fig:GenomeWideKSPvalue in the Appendix. This also means that, at least for regions with strong binding, the model predictions are highly correlated with the ChIP-seq profile as previously found ; see Figure fig:genomeWideQuality. Nevertheless, for regions with low binding, in addition to the reduction in the correlation we also observed a decrease in the mean squared error, which is statistically significant in the case of Bicoid, Caudal and Kruppel; see Figure fig:GenomeWideKSPvalue in the Appendix. Note that for Giant and Hunchback the difference is not statistically significant due to the small number of loci included in the analysis; see Table tab:GenomeWideNoOfRegions in the Appendix. This indicates that our model is able to correctly capture the low signal in those regions, but there is little or no correlation to the actual ChIP-seq signal. One explanation for this result is that, in those regions, there is little or no binding and what the ChIP-seq method recovers might be considered technical noise.