26665 results
Data on competency (360 degree feedback), employability and career success
Data Types:
  • Tabular Data
Tools for interacting with the publicly available California Delta Fish Salvage Database, including continuous deployment of data access, analysis, and presentation.
Data Types:
  • Software/Code
A Forecasting Framework for Predicting Perceived Fatigue: Using Time Series Methods to Forecast Ratings of Perceived Exertion with Features from Wearable Sensors (R Markdown)
Data Types:
  • Software/Code
No description provided.
Data Types:
  • Software/Code
Julia's CUTEst Interface
Data Types:
  • Software/Code
This release includes Windows binaries for ViewportSaverc. For other binaries, please see earlier releases. Unzip the file and install the VC_redist_x64 files if needed. Built with VS2015 on Win10 running in VirtualBox.
Data Types:
  • Software/Code
The goal of this project was to understand how different temperature and nutrient concentrations affect the larvae of three Hawaiian reef coral taxa: Lobactis scutaria (broadcast spawner, aposymbiotic larvae), Pocillopora acuta (brooder, symbiotic larvae), and Montipora capitata (broadcast spawner, symbiotic larvae). Kitchen RM, Piscetta M, Lenz EA, de Souza MR, Schar D, Gates RD, Wall CB (2020) Symbiont transmission and reproductive mode influence responses of three Hawaiian coral larvae to elevated temperature and nutrients. Coral Reefs
Data Types:
  • Software/Code
No description provided.
Data Types:
  • Software/Code
Rust tokenizers (@mfuntowicz, @n1t0)
  • Tokenizers for Bert, Roberta, OpenAI GPT, OpenAI GPT2, and TransformerXL now leverage the tokenizers library for fast tokenization.
  • AutoTokenizer now defaults to the fast tokenizer implementation when available.
  • Calling batch_encode_plus on the fast version of a tokenizer makes better use of CPU cores.
  • Tokenizers leveraging the native implementation use all CPU cores by default when calling batch_encode_plus. You can change this behavior by setting the environment variable RAYON_NUM_THREADS=N.
  • An exception is raised when tokenizing an input with pad_to_max_length=True but no padding token is defined.
Known issues
  • The RoBERTa fast tokenizer implementation has slightly different output compared to the original Python tokenizer (< 1%).
  • The SQuAD examples are not currently compatible with the new fast tokenizers, so they default to the plain Python tokenizer.
DistilBERT base cased (@VictorSanh)
  • The distilled version of the bert-base-cased BERT checkpoint has been released.
Model cards (@julien-c)
  • Model cards are now stored directly in the repository.
CLI script for environment information (@BramVanroy)
  • We now host a CLI script that gathers all the environment information when reporting an issue. The issue templates have been updated accordingly.
Contributors visible on repository (@clmnt)
  • The main contributors as identified by Sourcerer are now visible directly on the repository.
From fine-tuning to pre-training (@julien-c)
  • The language fine-tuning script has been renamed from run_lm_finetuning to run_lm_pretraining, as it is now able to train language models from scratch.
Extracting archives now available from cached_path (@thomwolf)
  • cached_path has been slightly modified so that zip and tar archives can be extracted automatically.
  • Archives are extracted in the same directory as the (possibly downloaded) archive, in a newly created extraction directory named after the archive.
  • Automatic extraction is activated by setting extract_compressed_file=True when calling cached_path.
  • The extraction directory is reused to avoid extracting the archive again, unless force_extract=True is set, in which case the cached extraction directory is removed and the archive is extracted again.
New activations file (@sshleifer)
  • Several activation functions (relu, swish, gelu, tanh and gelu_new) can now be accessed from the activations.py file and used in the different PyTorch models.
Community additions/bug-fixes/improvements
  • Remove redundant hidden states that broke encoder-decoder architectures (@LysandreJik)
  • Cleaner and more readable code in test_attention_weights (@sshleifer)
  • XLM can be trained on SQuAD in different languages (@yuvalpinter)
  • Improve test coverage on several models that were ill-tested (@LysandreJik)
  • Fix issue where TFGPT2 could not be saved (@neonbjb)
  • Multi-GPU evaluation on run_glue now behaves correctly (@peteriz)
  • Fix issue with TransfoXL tokenizer that couldn't be saved (@dchurchwell)
  • More robust conversion from ALBERT/BERT original checkpoints to huggingface/transformers models (@monologg)
  • FlauBERT bug fix: only add langs embeddings when there is more than one language handled by the model (@LysandreJik)
  • Fix CircleCI error with TensorFlow 2.1.0 (@mfuntowicz)
  • More specific testing advice in contributing (@sshleifer)
  • BERT decoder: fix failure with the default attention mask (@asivokon)
  • Fix a few issues regarding the data preprocessing in run_language_modeling (@LysandreJik)
  • Fix an issue with leading spaces and the RobertaTokenizer (@joeddav)
  • Added pipeline: TokenClassificationPipeline, which is an alias over NerPipeline (@julien-c)
Data Types:
  • Software/Code
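The archive-extraction behavior described in the release notes above (extract next to the archive, reuse the extraction directory, re-extract only when forced) can be illustrated with a minimal standard-library sketch. This is not the library's implementation: the function name extract_cached_archive and the "-extracted" directory suffix are hypothetical, chosen only for illustration, and the flag name force_extract is borrowed from the description.

```python
import shutil
import tarfile
import zipfile
from pathlib import Path


def extract_cached_archive(archive_path, force_extract=False):
    """Hypothetical helper sketching the described caching behavior.

    Extracts a zip or tar archive into a directory that sits next to the
    archive and is named after it. Only use this on archives you trust.
    """
    archive = Path(archive_path)
    # Assumed naming scheme: "<archive name with dots replaced>-extracted".
    target = archive.parent / (archive.name.replace(".", "-") + "-extracted")

    # Reuse an existing, non-empty extraction unless the caller forces a redo.
    if target.is_dir() and any(target.iterdir()) and not force_extract:
        return target

    # Decide how to open the archive before touching the file system.
    if zipfile.is_zipfile(str(archive)):
        opener = zipfile.ZipFile
    elif tarfile.is_tarfile(str(archive)):
        opener = tarfile.open
    else:
        raise ValueError(f"{archive} is not a zip or tar archive")

    # Remove any stale extraction, then extract from scratch.
    shutil.rmtree(target, ignore_errors=True)
    target.mkdir(parents=True)
    with opener(str(archive)) as handle:
        handle.extractall(str(target))
    return target
```

Calling the helper twice on the same archive returns the same directory without extracting again; passing force_extract=True discards the cached directory first.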
Python library written in C++ for calculation of local atomic structural environment
Data Types:
  • Software/Code