A corpus for mining drug-related knowledge from Twitter chatter: Language models and their utilities
Description of this data
Language models, as described in the publication titled above.
DSM-langauge-models-3M-LARGE was generated from over 3M posts using a window size of 5 and a vector dimension of 400. Use this model for applications; the other models are intended for development.
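To make the window-size setting concrete, here is a minimal sketch of how a window of 5 might define the context of each word. It assumes the common word2vec-style reading, in which the context is every token up to 5 positions to the left and right of the target. This page does not state the training tool or its exact windowing behaviour, so the function `context_pairs` and the sample post are hypothetical and for illustration only.

```python
def context_pairs(tokens, window=5):
    """Yield (target, context) pairs for tokens within `window` positions.

    Assumption: a symmetric window, as in word2vec-style models. The
    dataset page does not confirm the exact scheme that was used.
    """
    for i, target in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                yield target, tokens[j]


# Hypothetical tokenized post, invented for illustration only.
post = "took my meds this morning and still feel a bit dizzy".split()
pairs = list(context_pairs(post, window=5))

# Context words seen for the first token of the post.
first_token_context = [c for t, c in pairs if t == "took"]
```

A larger window captures broader topical associations, while a smaller one favours words that appear in near-identical positions; the dimension of 400 is the length of the vector learned for each word.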
Experiment data files
Cite this dataset
Sarker, Abeed; Gonzalez, Graciela (2017), “A corpus for mining drug-related knowledge from Twitter chatter: Language models and their utilities”, Mendeley Data, v2 http://dx.doi.org/10.17632/dwr4xn8kcv.2
The files associated with this dataset are licensed under a Creative Commons Attribution 4.0 International licence.