Privacy Policy Ontology

Published: 15-12-2020 | Version 1 | DOI: 10.17632/46r2vd7jnv.1
Mitra Bokaei Hosseini,
Travis Breaux


Government regulations increasingly require mobile and web-based application (app) companies to standardize their data practices concerning the collection, use, and sharing of various types of information. A summary of these practices is communicated to users through online privacy policies. The challenge of acquiring requirements from data practice descriptions, however, is that privacy policies often contain ambiguities. Abstract and ambiguous terminology in requirements statements concerning information types (e.g., "we collect your device information") can reduce shared understanding among app developers, policy writers, and users. To address this challenge, we propose a syntax-driven method that first parses a given information type phrase (e.g., "mobile device identifier") into its constituents using a context-free grammar, and then infers semantic relationships between those constituents using semantic rules. The inferred semantic relationships between a given phrase and its constituents generate a hierarchy that models the generality and ambiguity of phrases. Through this method, we infer relations from a lexicon consisting of a set of information type phrases to populate a partial ontology. The resulting ontology is a knowledge graph that can be used to guide requirements authors in selecting the most appropriate information type terms. We evaluate the method’s performance using two criteria: (1) expert assessment of relations between information types; and (2) non-expert preferences for relations between information types. The results suggest improved performance compared to a previously proposed method. We also evaluate the reliability of the method on information types extracted from different data practices (e.g., collection, usage, and sharing) in privacy policies for mobile or web-based apps in various app domains. This data repository contains the lexicons and ontologies that we used to construct and evaluate our method.
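
As a rough illustration of how a phrase can be decomposed into a generality hierarchy, the Python sketch below applies one simplified, hypothetical rule: dropping the leftmost modifier of a phrase yields a more general phrase. This is not the context-free grammar or the semantic rules used in this work; the function names, the `is_a_kind_of` label, and the whitespace-based splitting are all assumptions made for the example.

```python
from typing import Iterable, List, Set, Tuple

# (specific phrase, relation label, general phrase); the label is hypothetical.
Relation = Tuple[str, str, str]


def infer_hypernyms(phrase: str) -> List[Relation]:
    """Apply one assumed rule: removing the leftmost word generalizes a phrase."""
    words = phrase.lower().split()
    relations: List[Relation] = []
    for i in range(1, len(words)):
        specific = " ".join(words[i - 1:])  # phrase before dropping a modifier
        general = " ".join(words[i:])       # phrase after dropping it
        relations.append((specific, "is_a_kind_of", general))
    return relations


def build_hierarchy(lexicon: Iterable[str]) -> Set[Relation]:
    """Collect the inferred relations for every phrase in a lexicon."""
    edges: Set[Relation] = set()
    for phrase in lexicon:
        edges.update(infer_hypernyms(phrase))
    return edges


if __name__ == "__main__":
    for edge in infer_hypernyms("mobile device identifier"):
        print(edge)
```

Under this assumed rule, "mobile device identifier" is related to "device identifier", which in turn is related to "identifier". A grammar-based approach would distinguish the roles of the constituents instead of treating every leading word as a modifier.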