Data for: Words are important: A textual content based identity resolution scheme across multiple online social networks

Published: 4 April 2020 | Version 1 | DOI: 10.17632/rcpxp7m3tn.1
Contributor:
Deepesh Srivastava

Description

1. Training dataset
1.1. Columns: the columns hold the features extracted from the content of the source profile (Twitter) and the target profiles (Facebook). The last column records whether the pair is a match or a no-match. The dataset has 31 columns in total.
1.2. Rows: each row represents one source and target profile pair, labelled as a match or a no-match. The dataset has 31,882 rows in total.

2. Test dataset
2.1. Columns: the columns hold the features extracted from the content of the source profile (Twitter) and the target profiles (Facebook). The last column records whether the pair is a match or a no-match. The dataset has 31 columns in total.
2.2. Rows: each row represents one source and target profile pair, labelled as a match or a no-match. The dataset has 31,882 rows in the training dataset and 17,392 rows in this test dataset.
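
The layout described above (31 columns, with the match/no-match label in the last one) can be checked when loading the data. The following is a minimal sketch in Python, not the author's code. It assumes the files are comma-separated text with one header row, which this record does not state. The function name load_pairs, the column names f1 to f30 and "label", and the label values "1" and "0" are all hypothetical and used only for the synthetic demonstration.

```python
import csv
import io

# Stated in the dataset description: 31 columns, last one is the label.
EXPECTED_COLUMNS = 31


def load_pairs(handle, has_header=True):
    """Read profile-pair rows and split them into features and labels.

    Assumption: the data is comma-separated text. `handle` is any
    open text file object (or an io.StringIO for testing).
    """
    reader = csv.reader(handle)
    if has_header:
        next(reader, None)  # skip the header row if one is present

    features, labels = [], []
    first_data_line = 2 if has_header else 1
    for line_no, row in enumerate(reader, start=first_data_line):
        if not row:
            continue  # ignore blank lines
        if len(row) != EXPECTED_COLUMNS:
            raise ValueError(
                f"line {line_no}: expected {EXPECTED_COLUMNS} columns, "
                f"got {len(row)}"
            )
        features.append(row[:-1])  # first 30 columns: extracted features
        labels.append(row[-1])     # last column: match / no-match
    return features, labels


# Synthetic two-row example (hypothetical values, not taken from the dataset).
header = ",".join([f"f{i}" for i in range(1, 31)] + ["label"])
row_match = ",".join(["0.5"] * 30 + ["1"])
row_no_match = ",".join(["0.1"] * 30 + ["0"])
sample = io.StringIO("\n".join([header, row_match, row_no_match]) + "\n")

demo_features, demo_labels = load_pairs(sample)
```

To run this against the real files, open each one with open(path, newline="") and pass the handle to load_pairs. The number of rows returned should then equal the row counts given in the description.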

Categories

Natural Language Processing, Machine Learning, Machine Learning Algorithm, Social Network Analysis, Social Networks, Text Mining, Social Media Analytics

Licence