Dataset of 50 Online Services Advertised in the Internet Marketing Forum searchengines.guru

Name: Dataset of 50 Online Services Advertised in the Internet Marketing Forum searchengines.guru
Creator: Veronica Valeros
Published: 2021-05-17T12:59:52.562Z
Keywords: Social Sciences, Computer Science Applications, Internet Marketing, Crime Analysis

Valeros, Veronica; Garcia, Sebastian

doi:10.17632/48gyrs6y37.2

Published: 17 May 2021 | Version 2 | DOI: 10.17632/48gyrs6y37.2

Contributors:

Veronica Valeros, Sebastian Garcia

Description

Our dataset of organized services contains the top 50 services offering products or services in the searchengines.guru forum. These services were the most discussed in their respective forum categories. The data was gathered throughout 2020 and in January 2021. For every service, the dataset contains the following features:

* User Forum Name: forum user that posted about the service.
* User Forum Rating: the rating of the user in the forum.
* User Forum Registration: the date the user registered in the forum.
* Type of Service Offered: personal, professional, training, computer programs, conferences and events.
* Types of Advertisement: individuals, groups, organizations.
* Service Overall Maturity Level: low, medium, high.
* Type of User Advertising: on their own behalf, on behalf of a group, on behalf of an organization.
* Service Label: personal service, emerging service, organized service.
* searchengines.guru Category: forum main category where the service was posted.
* searchengines.guru Sub Category: forum sub-category where the service was posted.
* Country of Legal Terms: the country whose laws the service states it abides by.
* Domain Registration: year the domain name of the service was first registered.
* Domain Registrar: registrar organization associated with the domain name.
* Site ASN: autonomous system information associated with the website.
* Site ISP: internet service provider associated with the website.
* Site Hosting Country: country where the website is hosted.
* Site Alexa Ranking: Alexa global website popularity ranking.
* Web Reputation: Cisco Umbrella Security Score: security score based on the domain name. The possible risk scores are Low Risk, Medium Risk, and High Risk.
* Web Reputation: BrightCloud® TI Risk: risk level based on the domain name (Trustworthy, Low Risk, Moderate Risk, Suspicious, and High Risk).
* Web Reputation: Suspicious Activity from VirusTotal Intelligence Indicators: a positive value (yes) means that at least one indicator retrieved from VirusTotal associates the domain with malicious behavior.
* Web Reputation: OSINT Reports on Suspicious Behavior: a positive value (yes) means that at least one OSINT report ties the domain to malicious behavior.
* Trustworthiness Label #1: untrustworthy if at least one of the web reputation indicators is positive. Otherwise trustworthy.
* Trustworthiness Label #2: untrustworthy if at least two of the web reputation indicators are positive. Otherwise trustworthy.
* Trustworthiness Label #3: untrustworthy if at least three of the web reputation indicators are positive. Otherwise trustworthy.
* Trustworthiness Label #4: untrustworthy if all four of the web reputation indicators are positive. Otherwise trustworthy.
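
The four trustworthiness labels differ only in how many positive web reputation indicators they require. The following is a minimal Python sketch of that threshold rule, assuming the four indicators have already been reduced to booleans (True meaning positive). The function name `trustworthiness_labels` and the dictionary keys are illustrative and are not taken from the dataset files.

```python
from typing import Dict, Sequence


def trustworthiness_labels(indicators: Sequence[bool]) -> Dict[str, str]:
    """Derive Trustworthiness Label #1..#4 from four boolean indicators.

    Assumption: each element is True when the corresponding web
    reputation indicator is positive (suspicious), False otherwise.
    """
    if len(indicators) != 4:
        raise ValueError("expected exactly four web reputation indicators")

    # Count how many of the four indicators are positive.
    positives = sum(bool(flag) for flag in indicators)

    # Label #k is "untrustworthy" when at least k indicators are positive.
    return {
        f"Trustworthiness Label #{k}": (
            "untrustworthy" if positives >= k else "trustworthy"
        )
        for k in range(1, 5)
    }


# Hypothetical service with two positive indicators out of four.
labels = trustworthiness_labels([True, False, True, False])
for name, value in labels.items():
    print(f"{name}: {value}")
```

With two positive indicators, labels #1 and #2 come out as untrustworthy and labels #3 and #4 as trustworthy, which shows how the higher-numbered labels are progressively stricter about calling a service untrustworthy.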

Steps to reproduce

Our dataset of organized services contains the top 50 services offering products or services in the searchengines.guru forum. These services were the most discussed in their respective categories. The data was gathered throughout 2020 and in January 2021. The dataset was enriched using three types of reputation information.

First, for each organized service, we obtained its public information, such as the organization’s main domain name, domain registration information, and legal country of operation. This information was extracted from public company records. Website owners state in their terms of service and privacy policies which country’s laws they abide by; we take that country as the one associated with the service provided.

Second, we computed the reputation of the main domain name of each organization using three online services: (i) Cisco Umbrella Security Score, (ii) BrightCloud Threat Intelligence, and (iii) VirusTotal. The Cisco Umbrella Security Score can take three values: Low Risk, Medium Risk, and High Risk. Any organization with a domain scoring Medium Risk or High Risk was considered as having a suspicious reputation. BrightCloud Threat Intelligence can take five values: Trustworthy, Low Risk, Moderate Risk, Suspicious, and High Risk. Any evaluation from Low Risk to High Risk was considered as having a suspicious reputation. VirusTotal can take two values: yes, if the domain name has at least one direct antivirus detection, and no otherwise.

Third, we complemented these indicators with an indicator based on Open-Source Intelligence (OSINT) searches. This indicator evaluates whether public reports exist that link these organizations to malicious online activities, such as spamming, malware, or fraud. It can take two values: yes, if there is a public report with malicious indications associated with the domain, and no otherwise.

Alexa ranking, ISP, ASN, registrar, and other domain-related values were obtained from the above-mentioned services. Forum-related information was obtained from searchengines.guru.
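
The mapping from raw reputation values to suspicious or not suspicious, as described in the steps above, can be sketched in Python as follows. This is an illustration under stated assumptions and not the authors' code: the function name `reputation_indicators`, the dictionary keys, and the exact value spellings are hypothetical and may not match the column names or values in the dataset files.

```python
from typing import Dict

# Values treated as suspicious, following the thresholds stated in the text.
CISCO_LEVELS = {"Low Risk", "Medium Risk", "High Risk"}
CISCO_SUSPICIOUS = {"Medium Risk", "High Risk"}

BRIGHTCLOUD_LEVELS = {
    "Trustworthy", "Low Risk", "Moderate Risk", "Suspicious", "High Risk",
}
# Everything except "Trustworthy" counts as suspicious.
BRIGHTCLOUD_SUSPICIOUS = BRIGHTCLOUD_LEVELS - {"Trustworthy"}


def _yes_no(value: str) -> bool:
    """Convert a yes/no field to a boolean, rejecting anything else."""
    normalized = value.strip().lower()
    if normalized not in {"yes", "no"}:
        raise ValueError(f"expected 'yes' or 'no', got {value!r}")
    return normalized == "yes"


def reputation_indicators(
    cisco: str, brightcloud: str, virustotal: str, osint: str
) -> Dict[str, bool]:
    """Return one boolean per indicator; True means suspicious/positive."""
    if cisco not in CISCO_LEVELS:
        raise ValueError(f"unknown Cisco Umbrella score: {cisco!r}")
    if brightcloud not in BRIGHTCLOUD_LEVELS:
        raise ValueError(f"unknown BrightCloud risk level: {brightcloud!r}")
    return {
        "cisco_umbrella": cisco in CISCO_SUSPICIOUS,
        "brightcloud": brightcloud in BRIGHTCLOUD_SUSPICIOUS,
        "virustotal": _yes_no(virustotal),
        "osint": _yes_no(osint),
    }


# Hypothetical record: note that "Low Risk" is not suspicious for
# Cisco Umbrella but is suspicious for BrightCloud.
flags = reputation_indicators("Low Risk", "Low Risk", "no", "yes")
print(flags)
```

The same label, Low Risk, falls on different sides of the threshold for the two services, so each service needs its own mapping rather than a shared one.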

Institutions

Ceske Vysoke Uceni Technicke v Praze
