MATLAB scripts used to analyze data associated with the manuscript "A single cell atlas of the human liver tumor microenvironment". Please use MATLAB 2019b to run the following .m files.
Files:
  • inputData.mat: contains all raw and preprocessed data used in the study.
  • Create_Interactions_Network.m: calculates ligand–receptor interaction scores between different cell types. Creates the panels of Figure 3 and Table S5.
  • Hepatocytes_Reconstruction.m: reconstructs human hepatocyte zonation along the lobule axis. Creates panel 'c' of Figure 4, Figure S4, and Table S7.
  • Cancer_Cells_Spatial_Analysis.m: calculates differential gene expression between malignant cells found in different zones (malignant border, malignant core, and fibrotic zone) captured by laser microdissection. Creates panel 'd' of Figure 4.
  • helperFunctions.zip: contains required functions used by the .m files.
Data Types:
  • Software/Code
Cambridge Butterfly Collection, Loreto, Peru, Part 1. This upload contains photographs taken by Eva van der Heijden at the Butterfly Genetics Group at the University of Cambridge, from a butterfly wing collection from Loreto, Peru, in collaboration with Green Gold Forestry. Individual sample names can be found in the information sheet. Further information on individual samples from the Butterfly Genetics Group Collection can be found in the public Earthcape database and its FAQ. Please contact Chris Jiggins (c.jiggins[at]zoo.cam.ac.uk) or Gabriela Montejo-Kovacevich (gmontejokovacevich[at]gmail.com) for further information.
Data Types:
  • Other
  • Image
intronsf10k image
Data Types:
  • Image
# Installation

```shell
conda create -n deep_texture python=3.6
source activate deep_texture
conda install numpy pillow
conda install keras-gpu
conda install keras  # if GPUs are not available
pip install git+https://github.com/keras-team/keras-applications.git@d506dc82d0  # downgrade keras-applications
```

## Usage

```python
import deep_texture

(prep, dnn) = deep_texture.setup_texture(arch='nasnet', layer='normal_concat_11', cbp_dir='/tmp')
dtr = deep_texture.calc_features_file("./test.png", prep, dnn)
```
Data Types:
  • Other
  • Software/Code
Abstract
Planning for power systems with high penetrations of variable renewable energy requires higher spatial and temporal granularity. However, most publicly available test systems are of insufficient fidelity for developing methods and tools for high-resolution planning. This paper presents methods to construct open-access test systems of high spatial granularity, to more accurately represent current infrastructure, and high temporal granularity, to represent variability of demand and renewable resources. To demonstrate, a high-resolution test system representing the United States is created using only publicly available data. This test system is validated by running it in a production cost model, with results compared against historical generation to ensure that they are representative. The resulting open-source test system can support power system transition planning and aid in the development of tools to answer questions about how best to reach decarbonization goals, using the most effective combinations of transmission expansion, renewable generation, and energy storage.
Documentation of dataset development
A paper describing the process of developing the dataset is available at https://arxiv.org/abs/2002.06155. Please cite as: Y. Xu, Nathan Myhrvold, Dhileep Sivam, Kaspar Mueller, Daniel J. Olsen, Bainan Xia, Daniel Livengood, Victoria Hunt, Benjamin Rouillé d'Orfeuil, Daniel Muldrew, Merrielle Ondreicka, Megan Bettilyon, "U.S. Test System with High Spatial and Temporal Resolution for Renewable Integration Studies," 2020 IEEE PES General Meeting, Montreal, Canada, 2020.
Dataset version history
  • 0.1, January 31, 2020: initial data upload.
  • 0.2, March 10, 2020: addition of Tabular Data Package metadata; modifications to cost curves and transmission capacities aimed at more closely matching optimization results to historical data.
  • 0.2.1, March 25, 2020: corrected a bug in the wind profile generation process that pulled the wrong locations for wind farms outside the Western Interconnection.
Data Types:
  • Dataset
  • File Set
Abstract
Planning for power systems with high penetrations of variable renewable energy requires higher spatial and temporal granularity. However, most publicly available test systems are of insufficient fidelity for developing methods and tools for high-resolution planning. This paper presents methods to construct open-access test systems of high spatial granularity, to more accurately represent current infrastructure, and high temporal granularity, to represent variability of demand and renewable resources. To demonstrate, a high-resolution test system representing the United States is created using only publicly available data. This test system is validated by running it in a production cost model, with results compared against historical generation to ensure that they are representative. The resulting open-source test system can support power system transition planning and aid in the development of tools to answer questions about how best to reach decarbonization goals, using the most effective combinations of transmission expansion, renewable generation, and energy storage.
Documentation of dataset development
A paper describing the process of developing the dataset is available at https://arxiv.org/abs/2002.06155. Please cite as: Y. Xu, Nathan Myhrvold, Dhileep Sivam, Kaspar Mueller, Daniel J. Olsen, Bainan Xia, Daniel Livengood, Victoria Hunt, Benjamin Rouillé d'Orfeuil, Daniel Muldrew, Merrielle Ondreicka, Megan Bettilyon, "U.S. Test System with High Spatial and Temporal Resolution for Renewable Integration Studies," 2020 IEEE PES General Meeting, Montreal, Canada, 2020.
Dataset version history
  • 0.1, January 31, 2020: initial data upload.
  • 0.2, March 10, 2020: addition of Tabular Data Package metadata; modifications to cost curves and transmission capacities aimed at more closely matching optimization results to historical data.
  • 0.2.1, March 25, 2020: [erroneous upload]
  • 0.2.2, March 26, 2020: [erroneous upload]
Data Types:
  • Dataset
  • File Set
Planning for power systems with high penetrations of variable renewable energy requires higher spatial and temporal granularity. However, most publicly available test systems are of insufficient fidelity for developing methods and tools for high-resolution planning. This paper presents methods to construct open-access test systems of high spatial granularity, to more accurately represent current infrastructure, and high temporal granularity, to represent variability of demand and renewable resources. To demonstrate, a high-resolution test system representing the United States is created using only publicly available data. This test system is validated by running it in a production cost model, with results compared against historical generation to ensure that they are representative. The resulting open-source test system can support power system transition planning and aid in the development of tools to answer questions about how best to reach decarbonization goals, using the most effective combinations of transmission expansion, renewable generation, and energy storage. A paper describing the process of developing the dataset is available at https://arxiv.org/abs/2002.06155.
Version history
  • 0.1, January 31, 2020: initial data upload.
  • 0.2, March 10, 2020: addition of Tabular Data Package metadata; modifications to cost curves and transmission capacities aimed at more closely matching optimization results to historical data.
Data Types:
  • Dataset
  • File Set
Integrating data from multiple sources with the aim of identifying records that correspond to the same entity is required in many real-world applications, including healthcare, national security, and business. However, privacy and confidentiality concerns impede the sharing of personal identifying values to conduct linkage across different organizations. Privacy-preserving record linkage (PPRL) techniques have been developed to tackle this problem by performing clustering based on the similarity between encoded record values, such that each cluster contains (similar) records corresponding to one single entity. When employing PPRL on databases from multiple parties, one major challenge is the prohibitively large number of similarity comparisons required for clustering, especially when the number and size of databases are large. While several private blocking methods have been proposed to reduce the number of comparisons, they fall short of providing an efficient and effective solution for linking multiple large databases. Further, all of these methods depend heavily on the characteristics of the data being linked. In this paper, we propose a novel private blocking method for efficiently linking multiple databases by exploiting data characteristics in the form of probabilistic signatures, and we introduce a local blocking evaluation step for validating blocking methods without knowing the ground truth. Experimental results show the efficacy of our method in comparison to several state-of-the-art methods.
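The signature-based blocking idea can be illustrated with a short sketch. This is a simplified stand-in, not the authors' method: each record's q-grams are hashed into a few bands, and the minimum hash per band serves as a probabilistic blocking signature, so records sharing many q-grams are likely to share a signature and thus land in the same candidate block. All names and parameters here are illustrative.

```python
import hashlib
from collections import defaultdict

def qgrams(value, q=2):
    """Overlapping q-grams of a lowercased string."""
    value = value.lower()
    return {value[i:i + q] for i in range(len(value) - q + 1)}

def signatures(value, num_bands=4):
    """Hash each q-gram into one of `num_bands` bands; the minimum hash
    per band acts as a probabilistic blocking signature."""
    bands = defaultdict(list)
    for g in qgrams(value):
        h = int(hashlib.sha1(g.encode()).hexdigest(), 16)
        bands[h % num_bands].append(h)
    return {(band, min(hashes)) for band, hashes in bands.items()}

def build_blocks(records):
    """Map each signature to the ids of the records that produce it;
    records co-occurring in any block become candidate pairs for the
    (expensive) similarity comparison step."""
    blocks = defaultdict(set)
    for rid, value in records.items():
        for sig in signatures(value):
            blocks[sig].add(rid)
    return blocks

records = {"r1": "jonathan smith", "r2": "jonathon smith", "r3": "maria garcia"}
blocks = build_blocks(records)
candidate_pairs = {tuple(sorted(ids)) for ids in blocks.values() if len(ids) > 1}
```

Only pairs that share a signature are compared, which is how blocking cuts the quadratic comparison cost; in a real PPRL setting the signatures would be computed on encoded values rather than plaintext.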
Data Types:
  • Other
  • Software/Code
Abstract
Motivation: Antibodies are widely used experimental reagents to test the expression of proteins. However, they do not always behave as intended, because some do not bind specifically to the target proteins their providers designed them for, leading to unreliable and irreproducible research results. While many proposals have been developed to deal with the problem of antibody specificity, they may not scale to the millions of antibodies that have ever been designed and used in research. In this study, we investigate the feasibility of automatically extracting statements about antibody specificity reported in the literature by text mining, and of generating reports to alert scientists to problematic antibodies.
Results: We developed a deep neural network system called Antibody Watch and tested its performance on a corpus of more than two thousand articles that report uses of antibodies. We leveraged Research Resource Identifiers (RRIDs) to precisely identify antibodies mentioned in an input article, used the BERT language model to classify whether the antibodies are reported as nonspecific, and thus problematic, and inferred coreference to link statements of specificity to the antibodies they refer to. Our evaluation shows that Antibody Watch can accurately perform both classification and linking with F-scores over 0.8, given only thousands of annotated training examples. The results suggest that, with more training, Antibody Watch will provide useful reports about antibody specificity to scientists.
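The RRID-based identification step can be sketched with a regular expression over the standard antibody RRID format (`RRID:AB_<digits>`, the Antibody Registry prefix). This is only a minimal illustration, not the Antibody Watch pipeline itself; the passage and identifiers below are made-up examples.

```python
import re

# Antibody RRIDs use the Antibody Registry prefix "AB_" followed by digits,
# cited in text as e.g. "RRID:AB_123456" (sometimes with a space after the colon).
RRID_PATTERN = re.compile(r"RRID:\s*(AB_\d+)")

def extract_antibody_rrids(text):
    """Return the unique antibody RRIDs mentioned in a passage, in order of appearance."""
    seen = []
    for match in RRID_PATTERN.finditer(text):
        rrid = match.group(1)
        if rrid not in seen:
            seen.append(rrid)
    return seen

# A made-up methods sentence with placeholder identifiers, for illustration only:
passage = ("Sections were stained with anti-GFAP (RRID:AB_0000001); "
           "the anti-NeuN antibody (RRID: AB_0000002) showed no staining.")
mentioned = extract_antibody_rrids(passage)
```

In the full system, each extracted RRID anchors an antibody mention, and the surrounding specificity statements are then classified and linked to it.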
Data Types:
  • Dataset
  • File Set
This code enables the mapping of single-molecule m6A methylations.
Data Types:
  • Software/Code