ROI article contributions
This repository contains the contributions of the article "ROI: A method for identifying organizations receiving personal data". The datasets are organized as follows:

Privacy Policies dataset
This dataset ["Policies_urls.csv"] contains 142 privacy policy URLs, each with its corresponding organization. The URLs were obtained with the two methods (Selenium and Google) described in the article, which is why some URLs are duplicated.

300 Domain Holders
This dataset ["300_domain_holders.xlsx"] contains three sheets, one for each dataset used for the validations described in the article: Fortune 500, PII_receivers_1 (for the technique's evaluation), and PII_receivers_2 (for ROI's evaluation).

Recipient Domains
This dataset ["Domains_receiving_PII.csv"] contains 40,493 dataflows corresponding to 1,112 unique domains, along with the type of personal data that each domain received from an Android app.
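
Because the policy URLs can be duplicated and the dataflows repeat domains, a quick consistency check on the CSV files can be useful. The following is a minimal sketch using only the Python standard library. The column names "url" and "domain" are assumptions made for illustration (the actual headers are not documented here), and the helper names are hypothetical; adjust both to the real files.

```python
import csv
from collections import Counter


def read_rows(path):
    """Load a CSV file into a list of dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def unique_values(rows, column):
    """Return the distinct values of `column`, in first-seen order."""
    seen = {}
    for row in rows:
        seen.setdefault(row[column], None)
    return list(seen)


def count_per_value(rows, column):
    """Count how many rows share each value of `column`."""
    return Counter(row[column] for row in rows)


# Example usage (column names are assumed, not taken from the files):
# policies = read_rows("Policies_urls.csv")
# print(len(policies), len(unique_values(policies, "url")))
#
# flows = read_rows("Domains_receiving_PII.csv")
# print(len(flows), len(unique_values(flows, "domain")))
# print(count_per_value(flows, "domain").most_common(10))
```

If the assumed column names match, the second check should report the dataflow and unique-domain counts stated above, which makes it a simple way to confirm that a download is complete.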