A Customizable Pipeline for Social Media Text Normalization

Published: 11 Sep 2017 | Version 1 | DOI: 10.17632/fsxrypxnym.1

Description of this data

First release of the resources associated with the following publication:

Sarker A. A Customizable Pipeline for Social Media Text Normalization. Soc. Netw. Anal. Min. (2017) 7:45. DOI 10.1007/s13278-017-0464-z.

Updates will be published sporadically.

  • See the README inside the compressed folder for details about each release.
  • Always use the latest version for research tasks.

    Cite this dataset

    Sarker, Abeed (2017), “A Customizable Pipeline for Social Media Text Normalization”, Mendeley Data, v1 http://dx.doi.org/10.17632/fsxrypxnym.1


University of Pennsylvania


Computational Linguistics, Natural Language Processing, Social Networks, Text Mining


CC BY 4.0

The files associated with this dataset are licensed under a Creative Commons Attribution 4.0 International licence.

What does this mean?

You can share, copy and modify this dataset so long as you give appropriate credit, provide a link to the CC BY license, and indicate if changes were made, but you may not do so in a way that suggests the rights holder has endorsed you or your use of the dataset. Note that further permission may be required for any content within the dataset that is identified as belonging to a third party.