# Installation

```shell
conda create -n deep_texture python=3.6
source activate deep_texture
conda install numpy pillow
conda install keras-gpu
conda install keras  # if GPUs are not available
pip install git+https://github.com/keras-team/keras-applications.git@d506dc82d0  # downgrade keras-applications
```

## Usage

```python
import deep_texture

(prep, dnn) = deep_texture.setup_texture(arch='nasnet', layer='normal_concat_11', cbp_dir='/tmp')
dtr = deep_texture.calc_features_file("./test.png", prep, dnn)
```
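The returned descriptor `dtr` can then be compared between images. Assuming `calc_features_file` returns a 1-D numeric feature vector, a minimal cosine-similarity sketch (the helper name and the stand-in vectors below are ours, not part of the package):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two 1-D feature vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# With descriptors from deep_texture (assumed to be 1-D arrays):
#   dtr1 = deep_texture.calc_features_file("./a.png", prep, dnn)
#   dtr2 = deep_texture.calc_features_file("./b.png", prep, dnn)
#   score = cosine_similarity(dtr1, dtr2)

# Toy check with stand-in vectors:
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical vectors -> 1.0
```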
Data Types:
  • Other
  • Software/Code
Abstract
Planning for power systems with high penetrations of variable renewable energy requires higher spatial and temporal granularity. However, most publicly available test systems are of insufficient fidelity for developing methods and tools for high-resolution planning. This paper presents methods to construct open-access test systems of high spatial granularity, to more accurately represent current infrastructure, and high temporal granularity, to represent variability of demand and renewable resources. To demonstrate, a high-resolution test system representing the United States is created using only publicly available data. This test system is validated by running it in a production cost model, with results compared against historical generation to ensure that they are representative. The resulting open-source test system can support power system transition planning and aid in the development of tools to answer questions about how best to reach decarbonization goals using the most effective combinations of transmission expansion, renewable generation, and energy storage.
Documentation of dataset development
A paper describing the process of developing the dataset is available at https://arxiv.org/abs/2002.06155. Please cite as:
Y. Xu, N. Myhrvold, D. Sivam, K. Mueller, D. J. Olsen, B. Xia, D. Livengood, V. Hunt, B. Rouillé d'Orfeuil, D. Muldrew, M. Ondreicka, M. Bettilyon, "U.S. Test System with High Spatial and Temporal Resolution for Renewable Integration Studies," 2020 IEEE PES General Meeting, Montreal, Canada, 2020.
Dataset version history
0.1, January 31, 2020: initial data upload.
0.2, March 10, 2020: addition of Tabular Data Package metadata; modifications to cost curves and transmission capacities aimed at more closely matching optimization results to historical data.
0.2.1, March 25, 2020: corrected a bug in the wind profile generation process that was pulling the wrong locations for wind farms outside the Western Interconnection.
Data Types:
  • Dataset
  • File Set
Abstract
Planning for power systems with high penetrations of variable renewable energy requires higher spatial and temporal granularity. However, most publicly available test systems are of insufficient fidelity for developing methods and tools for high-resolution planning. This paper presents methods to construct open-access test systems of high spatial granularity, to more accurately represent current infrastructure, and high temporal granularity, to represent variability of demand and renewable resources. To demonstrate, a high-resolution test system representing the United States is created using only publicly available data. This test system is validated by running it in a production cost model, with results compared against historical generation to ensure that they are representative. The resulting open-source test system can support power system transition planning and aid in the development of tools to answer questions about how best to reach decarbonization goals using the most effective combinations of transmission expansion, renewable generation, and energy storage.
Documentation of dataset development
A paper describing the process of developing the dataset is available at https://arxiv.org/abs/2002.06155. Please cite as:
Y. Xu, N. Myhrvold, D. Sivam, K. Mueller, D. J. Olsen, B. Xia, D. Livengood, V. Hunt, B. Rouillé d'Orfeuil, D. Muldrew, M. Ondreicka, M. Bettilyon, "U.S. Test System with High Spatial and Temporal Resolution for Renewable Integration Studies," 2020 IEEE PES General Meeting, Montreal, Canada, 2020.
Dataset version history
0.1, January 31, 2020: initial data upload.
0.2, March 10, 2020: addition of Tabular Data Package metadata; modifications to cost curves and transmission capacities aimed at more closely matching optimization results to historical data.
0.2.1, March 25, 2020: [erroneous upload]
0.2.2, March 26, 2020: [erroneous upload]
Data Types:
  • Dataset
  • File Set
Planning for power systems with high penetrations of variable renewable energy requires higher spatial and temporal granularity. However, most publicly available test systems are of insufficient fidelity for developing methods and tools for high-resolution planning. This paper presents methods to construct open-access test systems of high spatial granularity, to more accurately represent current infrastructure, and high temporal granularity, to represent variability of demand and renewable resources. To demonstrate, a high-resolution test system representing the United States is created using only publicly available data. This test system is validated by running it in a production cost model, with results compared against historical generation to ensure that they are representative. The resulting open-source test system can support power system transition planning and aid in the development of tools to answer questions about how best to reach decarbonization goals using the most effective combinations of transmission expansion, renewable generation, and energy storage.
A paper describing the process of developing the dataset is available at https://arxiv.org/abs/2002.06155.
Version history
0.1, January 31, 2020: initial data upload.
0.2, March 10, 2020: addition of Tabular Data Package metadata; modifications to cost curves and transmission capacities aimed at more closely matching optimization results to historical data.
Data Types:
  • Dataset
  • File Set
Integrating data from multiple sources with the aim of identifying records that correspond to the same entity is required in many real-world applications, including healthcare, national security, and business. However, privacy and confidentiality concerns impede the sharing of personal identifying values to conduct linkage across different organizations. Privacy-preserving record linkage (PPRL) techniques have been developed to tackle this problem by performing clustering based on the similarity between encoded record values, such that each cluster contains (similar) records corresponding to a single entity. When employing PPRL on databases from multiple parties, one major challenge is the prohibitively large number of similarity comparisons required for clustering, especially when the number and size of the databases are large. While several private blocking methods have been proposed to reduce the number of comparisons, they fall short of providing an efficient and effective solution for linking multiple large databases. Further, all of these methods are highly data-dependent. In this paper, we propose a novel private blocking method for efficiently linking multiple databases by exploiting the data characteristics in the form of probabilistic signatures, and we introduce a local blocking evaluation step for validating blocking methods without knowing the ground truth. Experimental results show the efficacy of our method in comparison to several state-of-the-art methods.
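As a generic illustration of why blocking reduces the comparison count (not the probabilistic-signature method proposed here, which operates on encoded rather than plaintext values), records sharing a cheap signature can be grouped so that similarity comparisons run only within blocks; `make_signature` and the toy name records are hypothetical:

```python
from collections import defaultdict

def make_signature(record):
    # Toy signature: first letter of surname + first letter of given name.
    # Illustrative only; real PPRL signatures are derived from encodings.
    surname, given = record
    return (surname[0].lower(), given[0].lower())

def block_records(records):
    """Group records by signature so that similarity comparisons are only
    needed within each block, not across the full cross product."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[make_signature(rec)].append(rec)
    return dict(blocks)

records = [("Smith", "John"), ("Smyth", "Jon"), ("Jones", "Mary")]
blocks = block_records(records)
# The ("s", "j") block holds both Smith/Smyth variants; Jones is separate,
# so no cross-block comparison is ever made.
print(len(blocks))  # 2
```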
Data Types:
  • Other
  • Software/Code
Abstract
Motivation: Antibodies are widely used experimental reagents for testing protein expression. However, they do not always perform as intended: an antibody may fail to bind specifically to the target protein its provider designed it for, leading to unreliable and irreproducible research results. While many proposals have been developed to address the problem of antibody specificity, they may not scale to the millions of antibodies that have been designed and used in research. In this study, we investigate the feasibility of automatically extracting statements about antibody specificity reported in the literature by text mining, and of generating reports to alert scientists to problematic antibodies.
Results: We developed a deep neural network system called Antibody Watch and tested its performance on a corpus of more than two thousand articles that report uses of antibodies. We leveraged Research Resource Identifiers (RRIDs) to precisely identify antibodies mentioned in an input article, used the BERT language model to classify whether an antibody is reported as nonspecific (and thus problematic), and inferred coreference to link statements of specificity to the antibodies they refer to. Our evaluation shows that Antibody Watch can accurately perform both classification and linking, with F-scores over 0.8, given only thousands of annotated training examples. These results suggest that with more training, Antibody Watch can provide useful reports about antibody specificity to scientists.
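Antibody RRIDs follow the pattern `RRID:AB_<digits>`, so the identifier-spotting step can be sketched with a simple regular expression. This is a minimal illustration, not the Antibody Watch pipeline; the function name and the example sentence are ours:

```python
import re

# Antibody RRIDs take the form "RRID:AB_<digits>"; a regex can pull
# candidate identifiers out of article text.
RRID_PATTERN = re.compile(r"RRID:\s*(AB_\d+)")

def extract_antibody_rrids(text):
    """Return all antibody RRID accessions mentioned in `text`."""
    return RRID_PATTERN.findall(text)

sentence = ("Cells were stained with an anti-GFAP antibody "
            "(RRID:AB_10013382), which showed nonspecific bands.")
print(extract_antibody_rrids(sentence))  # ['AB_10013382']
```

Linking such mentions to nearby specificity statements (the coreference step) is the harder part, which the paper handles with a neural model rather than rules.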
Data Types:
  • Dataset
  • File Set
ATC-Anno is an annotation tool for the transcription and semantic annotation of air traffic control utterances. It was developed at the Spoken Language Systems (LSV) group at Saarland University. The latest version of the tool can always be found on the LSV GitHub account. If you use the tool in your research, please cite the associated paper: Marc Schulder, Johannah O'Mahony, Yury Bakanouski, Dietrich Klakow (2020). ATC-Anno: Semantic Annotation for Air Traffic Control with Assistive Auto-Annotation. In Proceedings of the International Conference on Language Resources and Evaluation (LREC), Marseille, France.
Data Types:
  • Other
  • Software/Code
Compressed FASTQ files of raw sequence reads from clinical Escherichia coli isolates collected in Toronto, Canada in 2018 (Dataset 2). Sequencing details are outlined in the associated publication. Sequencing was performed on the Illumina NextSeq platform.
Data Types:
  • Document
  • File Set
MATLAB code used to model arsenic(III) remediation with a composite TiO2-Fe2O3 sorbent in batch and continuous-flow systems, using a modified form of the pseudo-second-order (PSO) adsorption kinetic model. This data supports the manuscript provisionally titled 'A kinetic adsorption model to inform the design of arsenic(III) treatment plants using photocatalyst-sorbent materials'.
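The modified PSO model in the MATLAB code is not reproduced here, but the standard pseudo-second-order model it builds on has the well-known integrated form q(t) = qe² k₂ t / (1 + qe k₂ t), sketched below in Python. The function name and parameter values are illustrative, not taken from the dataset:

```python
import numpy as np

def pso_uptake(t, qe, k2):
    """Integrated standard pseudo-second-order kinetic model:
        q(t) = qe**2 * k2 * t / (1 + qe * k2 * t)
    qe: equilibrium sorbed amount (e.g. mg/g)
    k2: PSO rate constant (e.g. g/(mg*min))"""
    t = np.asarray(t, dtype=float)
    return (qe**2 * k2 * t) / (1.0 + qe * k2 * t)

t = np.linspace(0.0, 120.0, 5)          # minutes (arbitrary grid)
q = pso_uptake(t, qe=10.0, k2=0.01)     # arbitrary illustrative parameters
print(q[-1])                            # approaches qe as t grows
```

In practice qe and k2 would be fitted to measured uptake data (e.g. by nonlinear least squares), which is the role the MATLAB code plays for the batch and continuous-flow experiments.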
Data Types:
  • Software/Code
Tools for interacting with the publicly available California Delta Fish Salvage Database, including continuous deployment of data access, analysis, and presentation.
Data Types:
  • Software/Code