Adera2.0 Text mining dataset for training neural networks

Published: 8 September 2022 | Version 1 | DOI: 10.17632/whr7wrh42y.1
Michel Edwar Khalil Mickael


This is a text mining dataset. It consists of 2000 entries arranged in two columns of 1000 entries each. The first column contains sentences. The second column contains drug compound names. The dataset is intended primarily for training neural networks to extract drug compounds from sentences parsed from the medical literature. The sentences were pulled from DrugBank, Wikipedia, and other sources. The compound names were extracted manually.
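As a rough illustration of how the two columns might be turned into training examples, the sketch below reads sentence and compound pairs and derives token-level BIO tags for a sequence tagger. It is only a sketch under stated assumptions: the CSV layout, the header names `sentence` and `compound`, the `B-DRUG`/`I-DRUG` tag names, and the two sample rows are hypothetical and are not taken from the dataset, and it assumes that each row pairs a sentence with a compound that appears in it.

```python
import csv
import io

# Hypothetical sample rows for illustration only. The real file name,
# delimiter, and column headers of the dataset are assumptions here.
SAMPLE = """sentence,compound
Aspirin is used to reduce fever and pain.,Aspirin
Patients were treated with metformin for type 2 diabetes.,metformin
"""


def load_pairs(handle):
    """Read (sentence, compound) pairs from a CSV file-like object."""
    reader = csv.DictReader(handle)
    return [(row["sentence"], row["compound"]) for row in reader]


def bio_tags(sentence, compound):
    """Split on whitespace and tag the first occurrence of the compound."""
    tokens = sentence.split()
    tags = ["O"] * len(tokens)
    target = compound.lower().split()
    if not target:
        return tokens, tags
    # Compare case-insensitively, ignoring punctuation attached to tokens.
    normalized = [tok.strip(".,;:()").lower() for tok in tokens]
    width = len(target)
    for start in range(len(tokens) - width + 1):
        if normalized[start:start + width] == target:
            tags[start] = "B-DRUG"
            for inside in range(start + 1, start + width):
                tags[inside] = "I-DRUG"
            break
    return tokens, tags


pairs = load_pairs(io.StringIO(SAMPLE))
examples = [bio_tags(sentence, compound) for sentence, compound in pairs]
```

To use the real data, the `io.StringIO(SAMPLE)` argument would be replaced with an open file handle for the downloaded file, with the header names adjusted to match. Whitespace tokenization is a simplification; a model-specific tokenizer would normally take its place.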



Artificial Intelligence, Artificial Neural Networks, Natural Language Processing, Neural Networks (Biological Sciences)