With the establishment of better protocols and decreasing costs, high-throughput sequencing experiments such as RNA-seq or ChIP-seq are now accessible even to small experimental laboratories. However, comparing one or a few experiments generated by an individual lab to the vast amount of relevant data available in the public domain may be hindered by a lack of bioinformatics expertise. Though several user-friendly tools allow such comparisons at the gene or promoter level, a genome-wide picture is missing. We developed Heat*seq, a free, open-source web tool that allows genome-wide comparison of any experiment provided by the user to public datasets (RNA-seq, ChIP-seq and CAGE experiments from Bgee, Blueprint epigenome, CODEX, ENCODE, FANTOM5, FlyBase, modENCODE, and Roadmap epigenomics) in human, mouse and Drosophila. Correlation coefficients among experiments are displayed as interactive correlation heatmaps. Users can thus identify clusters of public experiments similar to their own in minutes through a user-friendly interface. This fast, interactive web application uses the R/shiny framework, allowing the generation of high-quality figures and tables that can be easily downloaded in multiple formats. Heat*seq is freely available at http://www.heatstarseq.roslin.ed.ac.uk/.
Data Types:
  • Other
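The genome-wide comparison described in the Heat*seq abstract rests on correlation coefficients between experiments. The following is a minimal Python sketch of that idea, not Heat*seq's own code (which the abstract says is built on R/shiny). It assumes each experiment has already been reduced to a numeric profile over the same set of genomic features, and that Pearson correlation is the measure of interest. The function names and the toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def rank_by_similarity(user_profile, public_profiles):
    """Correlate one user profile against each public profile, most similar first."""
    scores = {
        name: pearson(user_profile, profile)
        for name, profile in public_profiles.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: one value per genomic feature (e.g. per gene or per bin).
user = [1.0, 2.0, 3.0, 4.0]
public = {
    "experiment_A": [2.0, 4.0, 6.0, 8.0],
    "experiment_B": [4.0, 3.0, 2.0, 1.0],
    "experiment_C": [1.0, 3.0, 2.0, 4.0],
}
ranking = rank_by_similarity(user, public)
for name, r in ranking:
    print(f"{name}: r = {r:.2f}")
```

Computing the same coefficient for every pair of experiments, rather than only against the user's profile, would give the full matrix that a correlation heatmap displays.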
ChIP-seq... Drosophila simulans... We subsampled reads from R2 ChIP-seq to 100x coverage using BBNorm (v37.54) with the parameters "threads=24 prefilter=t target=100", and created de novo contigs from the subsampled ChIP-seq reads (ChIPtigs) with SPAdes v3.11.0 (-t 24 --careful --sc).... Drosophila melanogaster... Drosophila... We created a custom Drosophila-specific consensus repeat library modified from RepBase v20150807 to include all complex satellite DNAs from Drosophila melanogaster.
Data Types:
  • Other
  • Sequencing Data
  • Tabular Data
  • Dataset
  • Text
Expression profiling analysis: Transcriptome data from four biological replicates were generated using 8x15K Customized Drosophila Genome Oligo Microarrays (Agilent). Slide image data was quantified using Agilent's Feature Extraction software.... ChIP-Seq experiments were visualized as custom tracks using Integrative Genomics Viewer (Broad Institute). Total uniquely mapped tags were normalized to 10 million reads to generate tracks using HOMER.... Drosophila melanogaster
Data Types:
  • Other
  • Dataset
  • File Set
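The normalization mentioned in the entry above (total uniquely mapped tags scaled to 10 million reads) is simple arithmetic. The sketch below illustrates the scaling only and is not HOMER's implementation. The function name, the per-bin counts and the library size are hypothetical.

```python
TARGET_TAGS = 10_000_000  # normalization target stated in the entry above


def normalize_counts(counts, total_mapped_tags, target=TARGET_TAGS):
    """Scale raw per-bin tag counts so the library corresponds to `target` tags."""
    if total_mapped_tags <= 0:
        raise ValueError("total_mapped_tags must be positive")
    scale = target / total_mapped_tags
    return [count * scale for count in counts]


# Hypothetical example: a library with 20 million uniquely mapped tags.
raw_bins = [12, 40, 8]
normalized = normalize_counts(raw_bins, total_mapped_tags=20_000_000)
print(normalized)
```

Scaling every library to the same total makes track heights comparable between experiments sequenced to different depths.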
In Drosophila, graded expression of the maternal transcription factor Bicoid (Bcd) provides positional information to activate target genes at different positions along the anterior-posterior axis. We have measured the genome-wide binding profile of Bcd using ChIP-seq in embryos expressing single, uniform levels of Bcd protein, and grouped Bcd-bound targets into several “affinity” classes based on occupancy at different concentrations. By measuring the biochemical affinity of target enhancers in these classes in vitro and genome-wide chromatin accessibility by ATAC-seq, we found that the occupancy of target sequences by Bcd is not primarily determined by Bcd binding sites, but by genomic context. Bcd drives an open chromatin state at a subset of its targets. Our data support a model whereby Bcd influences chromatin structure to gain access to low affinity targets at high concentrations, while high affinity targets are found in more accessible chromatin and are bound at low concentrations.
Data Types:
  • Other
Dosage compensation is a highly conserved process across species that is necessary for correct embryonic development. It is the equalization of transcriptional activation of essential genes across sex-linked chromosomes. In Drosophila melanogaster, the Male specific lethal (MSL) complex binds to MSL recognition elements (MRE) within chromatin entry sites (CES), spreads to active genes on the male X chromosome and increases their transcription two-fold. A pre-existing spatial arrangement of the CES, established by long-range interactions between them, is necessary for MSL complex recruitment and spreading. The Schedl lab showed that an insulator complex, LBC, binds to CES. Thus, CES could potentially establish the X chromosome 3D network responsible for MSL binding, based on their insulator-like properties. In this study, I aimed to determine whether two of the most important CES, roX1 and roX2, exhibit boundary activity in vivo using a transvection assay. I show that the minimal LBC-bound region is insufficient to mediate transvection and, based on an analysis of ChIP-seq data for insulator proteins, propose that additional sequences may be required for full boundary function.
Data Types:
  • Other
The binding of transcription factors (TFs) is essential for gene expression. One important characteristic is the actual occupancy of a putative binding site in the genome. In this study, we propose an analytical model to predict genomic occupancy that incorporates the preferred target sequence of a TF in the form of a position weight matrix (PWM), DNA accessibility data (in the case of eukaryotes), the number of TF molecules expected to be bound specifically to the DNA and a parameter that modulates the specificity of the TF. Given actual occupancy data in the form of ChIP-seq profiles, we inferred copy number and specificity for five Drosophila TFs during early embryonic development: Bicoid, Caudal, Giant, Hunchback and Kruppel. Our results suggest that thousands of molecules of each of these TFs are specifically bound to the DNA and that, whilst Bicoid and Caudal display a higher specificity, the other three TFs (Giant, Hunchback and Kruppel) display lower specificity in their binding (despite having PWMs with higher information content). This study gives further weight to earlier investigations into TF copy numbers that suggest a significant proportion of molecules are not bound specifically to the DNA.
Data Types:
  • Other
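The occupancy model in the entry above combines four ingredients: a PWM score per site, DNA accessibility, the number of specifically bound TF molecules, and a specificity parameter. The abstract does not give the formula, so the sketch below is an assumed, simplified statistical-weight form chosen only to show how these ingredients can interact. It is not the model from the study. All names, the toy PWM and the parameter values are hypothetical.

```python
from math import exp


def pwm_scores(sequence, pwm):
    """Score every window of the sequence by summing per-position PWM weights."""
    width = len(pwm)
    return [
        sum(pwm[i][sequence[start + i]] for i in range(width))
        for start in range(len(sequence) - width + 1)
    ]


def occupancy(scores, accessibility, n_molecules, lam):
    """Assumed saturating form: site weight b = a * exp(w / lam), and
    occupancy = N * b / (N * b + sum of all site weights)."""
    if len(scores) != len(accessibility):
        raise ValueError("scores and accessibility must have the same length")
    if lam <= 0:
        raise ValueError("lam must be positive")
    weights = [a * exp(w / lam) for w, a in zip(scores, accessibility)]
    background = sum(weights)
    if background == 0:
        return [0.0] * len(weights)  # nothing is accessible
    return [n_molecules * b / (n_molecules * b + background) for b in weights]


# Hypothetical two-position PWM that prefers the dinucleotide "AT".
toy_pwm = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
]
scores = pwm_scores("ATGAT", toy_pwm)  # windows: AT, TG, GA, AT
access = [1.0, 1.0, 1.0, 0.0]          # the last window lies in closed chromatin
occ = occupancy(scores, access, n_molecules=100, lam=1.0)
print([round(o, 3) for o in occ])
```

In this form a smaller `lam` sharpens the contrast between strong and weak sites, which is one way a specificity parameter can act. A closed site stays unoccupied however well it matches the PWM.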
We are developing a customized version of Galaxy called G-OnRamp that will enable biologists to annotate the functional elements of eukaryotic genomes using large genomic datasets, a task that can also serve as an introduction to other “big data” biomedical analyses. Genome annotation—identifying functionally active regions within a genome—requires the use of diverse datasets and tools, including sequence similarity to known genes, gene prediction models, and high-throughput genomic data. To construct this interactive Web-based environment for genome annotation, we are building on two successful efforts, the Genomics Education Partnership (GEP) and Galaxy. GEP (http://gep.wustl.edu) is a consortium of over 100 colleges/universities that provides classroom undergraduate research experiences in genomics for students at all levels. Students perform primary research on selected regions of Drosophila genomes using genomic databases (e.g., FlyBase) and bioinformatics tools (e.g., BLAST) while learning about gene structure, evolution, programming, and other topics. GEP faculty are now interested in annotating other eukaryotic genomes, reflecting their diverse research interests. G-OnRamp will extend Galaxy by providing (a) analysis pipelines for functional genomic data (e.g., ChIP-Seq, RNA-Seq); (b) interactive visual analytics to annotate a genome (e.g., create UCSC Assembly Hubs); and (c) capacity for collaborative genome annotation. The GEP will serve as a key use case to validate and refine G-OnRamp, ensuring that it satisfies real educational needs. In this poster and demonstration, we will describe G-OnRamp’s vision and showcase its current features. G-OnRamp is available under the Academic Free License, and the software will be available via https://github.com/goeckslab
Data Types:
  • Other
Intellectual disability (ID), one of the most complex disorders, has a worldwide prevalence of approximately 2% and is a frequent cause of severe disability. This disorder therefore constitutes a major burden not only on the affected families but also on society. It has become clear that X-linked forms account for only ten percent of ID cases, which means that the vast majority of the underlying genetic defects must be autosomal, but these have so far received considerably less attention. A particularly straightforward strategy for identifying genes underlying autosomal recessive disorders is homozygosity mapping in extended consanguineous families, followed by mutation screening of candidate genes. In Western societies, where most of the research takes place, this approach has been hampered by infrequent parental consanguinity and small family sizes. Therefore, to unravel the molecular basis of autosomal recessive intellectual disability (ARID) in a systematic fashion as a prerequisite for diagnosis, counselling and therapy, we focused on large consanguineous Iranian families with several children with intellectual disability. During the course of our investigations into the causes of ARID, we have identified numerous new loci for this condition. However, no more than six hotspot loci for unspecific or non-syndromic autosomal recessive intellectual disability (NS-ARID) have been identified, which may indicate that, at least in the Iranian population, not all of the gene defects causing NS-ARID are extremely rare and that common molecular causes of NS-ARID cannot be ruled out. The work presented here is part of this large project to shed more light on the molecular causes of ARID. In this study, the investigation of two of these six hotspot loci led to the identification of the underlying gene defects. One of these involves the linkage intervals of two Iranian families with several NS-ARID patients, which overlap on Chr19q13.2-q13.31.
Two different missense mutations with high pathogenicity scores were detected in ZNF526, which encodes a Krüppel-type zinc finger protein. One of these changes was observed in DNA samples collected from two distinct families with non-syndromic ID, but closer inspection revealed that these families, which live in the same city in the northwestern part of Iran, share a common haplotype and thus must be distantly related. Each mutation affects a functional domain of ZNF526 and both alter the protein conformation, causing a putative functional impairment as suggested by in silico protein modelling. A decrease in DNA affinity was confirmed by ChIP-seq, and array-based gene expression studies showed specific changes in the expression patterns of patient lymphoblastoid cells, which could be recapitulated in ZNF526-deficient neuroblastoma cells. Functional annotation showed significant enrichment of the deregulated genes in pathways that play a role in protein synthesis, mitochondrial dysfunction, energy metabolism and gene regulation. We found evidence that the ZNF526 protein interacts with PRKRIR, a transcription factor (TF) that was added to the human TF repertoire only recently in primate evolution. ZNF526 and PRKRIR together are therefore particularly promising candidates for investigating the development and evolution of higher brain function in primates. This study also resolved the underlying gene defect of MRT5 and reported three deleterious mutations in NSUN2. These were found in two independent consanguineous Iranian families and one Turkish family with several patients suffering from non-syndromic ARID. NSUN2 encodes a methyltransferase, which catalyzes the intron-dependent formation of 5-methylcytosine at C34 of tRNA-Leu(CAA). All mutations lead to a loss of NSUN2 protein function in homozygous mutation carriers and in all likelihood cause the patient phenotype.
In order to gain further evidence for an involvement of NSUN2 in cognitive functions, we studied Drosophila mutants that lack the NSUN2 ortholog. These experiments revealed a marked learning impairment in mutant flies, which clearly underscores the relevance of NSUN2 in higher brain functions. Furthermore, this study was the first report of a mutation in patients with dysequilibrium syndrome that affects VLDLR exclusively, confirming the central role of the very low-density lipoprotein receptor in the aetiology of this condition. Mutations in this gene have been found to be associated with quadrupedal mobility in other families, but not in our patients. In summary, our results show that both NSUN2 and ZNF526 belong to the still few genes known to carry NS-ID-causing mutations in independent families, which suggests that defects in either gene are among the more common causes of NS-ARID, at least in the Iranian population. Further studies are necessary to identify the disease-causing mutations in the other four hotspot loci and to determine the contribution of the affected genes to the complex processes of human cognition. These studies will be greatly facilitated by exome enrichment and next generation sequencing (NGS), which have recently been introduced as a cost-effective and fast strategy for comprehensive mutation screening and disease-gene identification in the coding portion of the human genome.
Data Types:
  • Other