Experiments with faster dissemination of research began in the 1960s, and in the 1990s the first preprint servers emerged and became widely used in the physical sciences and economics. Since 2010, more than 30 new preprint servers have emerged and the number of deposited preprints has grown exponentially, with numerous journals now supporting the posting of preprints and accepting preprints as submissions for journal peer review and publication. Research on preprints is, however, still scarce.
The goals of this project are:
1) Study the preprint policies, submission requirements, and handling of transparency in reporting and research integrity topics of all known preprint servers that allow researchers to deposit preprints regardless of their institutional affiliation or funding.
2) Study comments deposited on preprint servers’ platforms and social media and their relation to peer review and information exchange.
3) Study differences between preprint version(s) and version of record.
Team Members (by first name alphabetical order):
Ana Jerončić,1 Gerben ter Riet,2,3 IJsbrand Jan Aalbersberg,4 John P.A. Ioannidis,5-9 Joseph Costello,10 Juan Pablo Alperin,11,12 Lauren A. Maggio,10 Lex Bouter,13,14 Mario Malički,5 Steve Goodman5-7
1 Department of Research in Biomedicine and Health, University of Split School of Medicine, Split, Croatia
2 Urban Vitality Centre of Expertise, Amsterdam University of Applied Sciences, Amsterdam, The Netherlands
3 Amsterdam UMC, University of Amsterdam, Department of Cardiology, Amsterdam, The Netherlands
4 Elsevier, Amsterdam, The Netherlands
5 Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Stanford, CA, USA
6 Department of Medicine, Stanford University School of Medicine, Stanford, California, USA
7 Department of Epidemiology and Population Health, Stanford University School of Medicine, Stanford, California, USA
8 Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, California, USA
9 Department of Statistics, Stanford University School of Humanities and Sciences, Stanford, California, USA
10 Uniformed Services University of the Health Sciences, Bethesda, Maryland, USA
11 Scholarly Communications Lab, Simon Fraser University, Vancouver, British Columbia, Canada
12 School of Publishing, Simon Fraser University, Vancouver, British Columbia, Canada
13 Department of Philosophy, Faculty of Humanities, Vrije Universiteit, Amsterdam, The Netherlands
14 Amsterdam UMC, Vrije Universiteit, Department of Epidemiology and Statistics, Amsterdam, The Netherlands
The inadequacy of breast cancer early detection methods in Sudan makes efforts in breast cancer awareness and regular breast self-examination worthwhile. Among medical students, the level of knowledge, attitude, and practice of breast self-examination is worth investigating, as they are the future professional health care providers on whom the duty of spreading awareness rests.
The levels of knowledge, attitude, and practice of breast self-examination among female undergraduate students of the Faculty of Medicine, University of Khartoum, were found to be low.
- This dataset includes Python code for sequence-to-sequence time-series forecasting by training and evaluating recurrent neural network models.
- The code was developed to enable rapid and wide-scale development, production and evaluation of time-series models and predictions.
- The RNN architecture includes a convolutional layer for handling inputs within a composite autoencoder network.
Instructions for usage:
- The Python code is located in a Jupyter notebook that can be opened online or locally using a Jupyter Notebook compatible platform such as:
https://jupyter.org (accessed 11 July 2020).
https://colab.research.google.com (accessed 11 July 2020).
- To use the code, the data source should be a CSV file named 'data_input.csv'; alternatively, an online link to the data source can be entered when executing the code. The first 4 columns of the data source are reserved for metadata. The unique name or identifier for each row should be located in the 2nd column; otherwise, the code has to be changed in the gen_data function (line 282), and line 286 has to be changed if the number of metadata columns needs to be smaller or larger. The remaining columns hold the accumulated number or value for each time point.
- target_pred: specifies which row in the data to predict.
- crop_point: specifies the data point at which to crop the time-series data, e.g. training data = before crop_point, evaluation data = after crop_point.
- time_steps: specifies the number of time steps to use, e.g. 15 or 20, meaning 15 for X and 15 for Y in the sequence-to-sequence model.
- RNN parameters: e.g. batch size, epochs, layer sizes, and RNN architecture (GRU or LSTM).
- ext: specifies the end date of predictions.
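As an illustration of how crop_point and time_steps interact, the following sketch shows one way sequence-to-sequence training windows could be built. This is not the distributed notebook itself: the function name, the toy series, and the windowing scheme are assumptions for illustration only.

```python
import numpy as np

def make_seq2seq_windows(series, crop_point, time_steps):
    """Split a 1-D series at crop_point, then slice the training part
    into (X, Y) pairs: time_steps input values followed by the next
    time_steps target values, as in a sequence-to-sequence setup."""
    train, evaluation = series[:crop_point], series[crop_point:]
    X, Y = [], []
    # each window covers 2 * time_steps consecutive values
    for start in range(len(train) - 2 * time_steps + 1):
        X.append(train[start:start + time_steps])
        Y.append(train[start + time_steps:start + 2 * time_steps])
    return np.array(X), np.array(Y), evaluation

series = np.arange(100, dtype=float)  # toy accumulated values
X, Y, evaluation = make_seq2seq_windows(series, crop_point=80, time_steps=15)
print(X.shape, Y.shape, len(evaluation))  # (51, 15) (51, 15) 20
```

With crop_point=80 the first 80 values form the training segment and the remaining 20 are held out for evaluation, matching the split described above.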
This code is licensed under the MIT license.
Contributors: Mariano Maisonnave, Fernando Delbianco, Fernando Tohme, Ana Maguitman, Evangelos Milios
This is a manually labeled data set for the task of Event Detection (ED). The task of ED consists of identifying event triggers: the words that most clearly indicate the occurrence of an event.
The data set consists of 2,200 news extracts from The New York Times (NYT) Annotated Corpus, separated into training (2,000) and testing (200) sets. Each news extract contains the plain text with the labels (event mentions), along with two metadata fields (publication date and an identifier).
We consider as an event any ongoing real-world event or situation reported in the news articles. It is important to distinguish events and situations that are in progress (or are reported as fresh events) at the moment the news is delivered from past events that are simply brought back, future events, hypothetical events, or events that will not take place. In our data set we only labeled the first type as events. Based on this criterion, some words that are typically considered events are labeled as non-event triggers when they do not refer to ongoing events at the time the analyzed news is released. Take for instance the following news extract: "devaluation is not a realistic option to the current account deficit since it would only contribute to weakening the credibility of economic policies as it did during the last crisis." The only word labeled as an event trigger in this example is "deficit", because it is the only ongoing event referred to in the news. Note that the words "devaluation", "weakening" and "crisis" could be labeled as event triggers in other news extracts, where the context of use of these words is different, but not in the given example.
For a more detailed description of the data set and the data collection process please visit: https://cs.uns.edu.ar/~mmaisonnave/resources/ED_data.
The data set is split into two folders: training and testing. The first folder contains 2,000 XML files; the second contains 200 XML files. Each XML file has the following format.
The first three tags (pubdate, file-id and sent-idx) contain metadata. The first one is the publication date of the news article that contained the text extract. The next two together form a unique identifier for the text extract: file-id uniquely identifies a news article, which can hold several text extracts, and sent-idx is the index that identifies the text extract inside the full article.
The last tag (sentence) delimits the beginning and end of the text extract. Inside that text are the event-trigger tags; each of these tags surrounds one word that was manually labeled as an event trigger.
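As a sketch of how such a file could be read with Python's standard library: the root tag name, the inner trigger tag name (event), and all example values below are illustrative assumptions, since the exact markup is not reproduced here; only pubdate, file-id, sent-idx, and sentence come from the description above.

```python
import xml.etree.ElementTree as ET

# Minimal illustrative extract; tag name "event" and all values are assumed.
xml_text = """<extract>
  <pubdate>1996-01-05</pubdate>
  <file-id>0823403</file-id>
  <sent-idx>2</sent-idx>
  <sentence>devaluation is not a realistic option to the current
  account <event>deficit</event> since it would only contribute to
  weakening the credibility of economic policies</sentence>
</extract>"""

root = ET.fromstring(xml_text)
# the three metadata tags described above
meta = {tag: root.findtext(tag) for tag in ("pubdate", "file-id", "sent-idx")}
# collect every word wrapped by an event-trigger tag inside the sentence
sentence = root.find("sentence")
triggers = [e.text for e in sentence.iter("event")]
print(meta["sent-idx"], triggers)  # 2 ['deficit']
```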
This repository contains all the data supporting the analysis of the recent volcano-tectonic activity of the Ririba rift, at the southern tip of the Ethiopian Rift Valley, near the Kenya/Ethiopia border. It consists of a PDF file, two .kmz files, and two Excel tables.
Specifically, the PDF file ('Supplementary material') includes a list of the samples collected during fieldwork (Table S1), details concerning the methodology employed in the morphometric analysis of volcanic structures (S2) and in the statistical analysis of vent clustering and its results (S3), and the chemical analyses of the collected volcanic rock samples (Table S4).
The two Excel tables report the data used for the morphometric analysis of the subset of volcanic centres (specifically, 26 for Dilo VF and 41 for Mega VF) that could be well delimited from satellite images (Table S2d contains information on the volcanic cones and lava flows, while Table S2e covers maars and tuff rings).
The data collected from the remote sensing analysis of the two volcanic fields are also reported as two .kmz files ("Dilo VF" and "Mega VF"). In each volcanic field, the subset of volcanic centres used to extract morphometric measurements, the lava flows, the characteristic alignment and vent elongation trends, and the sampling sites of the collected rocks are marked.
This simplified MATLAB demo code shows how to use the new Mayfly Algorithm to solve global continuous optimization problems.
Zervoudakis, K., & Tsafarakis, S. (2020). A mayfly optimization algorithm. Computers & Industrial Engineering, 145, 106559. https://doi.org/10.1016/j.cie.2020.106559