Dataset: Measuring Vulnerabilities in a Business Process Model

Published: 16 March 2023 | Version 1 | DOI: 10.17632/c7v5t6kdhc.1
Eva Hariyanti,
Rifqi Hanief,
Nania Nuzulita


This dataset was used in a study entitled “Measuring Vulnerabilities in a Business Process Model”. In this study, we developed a method for scoring information security vulnerabilities in a business process model by adopting the CVSS metrics. Our method consists of three stages. First, we measured technical impact using the vignette matrix. Second, we measured the exploitability components using assumptions about the system’s implementation plan. Third, we predicted the base score using linear regression, because calculating the base score for business process vulnerabilities directly with the CVSS formula produces a significant error. This dataset was used in the third stage.

We retrieved e-commerce application vulnerability data from the National Vulnerability Database (NVD) and processed it to obtain the CVSS components. We used this data as the training dataset for the linear regression model; it is on the first sheet, “Training”. The second to fourth sheets contain the testing datasets. We used the trained linear regression model to predict the base score of vulnerabilities in business processes. For each vulnerability, we scored the CVSS components; the scoring rules for business process vulnerabilities are described in our published article.

The second sheet, “Testing-Manage Account”, contains vulnerabilities in the account management process of e-commerce applications. The third sheet, “Testing-Manage Comm Channel”, and the fourth sheet, “Testing-Manage Payment”, contain vulnerabilities in the communication channel management and payment management processes, respectively. Both have the same structure as the “Testing-Manage Account” sheet.
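For reference, the CVSS formula that the description contrasts with the regression prediction can be sketched as follows. This is a minimal sketch that assumes CVSS v3.1; the description does not state the CVSS version, and v3.1 is inferred only from the ISS, Impact, and Exploitability terminology. The function name `cvss_base_score` and its parameters are hypothetical, and the inputs are the numeric metric weights rather than the letter codes.

```python
import math


def roundup(value):
    """Round up to one decimal place, following the CVSS v3.1 Roundup definition."""
    int_input = round(value * 100000)
    if int_input % 10000 == 0:
        return int_input / 100000.0
    return (math.floor(int_input / 10000) + 1) / 10.0


def cvss_base_score(c, i, a, av, ac, pr, ui, scope_changed):
    """Compute ISS, Impact, Exploitability, and BaseScore (assumed CVSS v3.1).

    All arguments except scope_changed are numeric metric weights,
    e.g. 0.56 for a High confidentiality impact.
    """
    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    if scope_changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = 8.22 * av * ac * pr * ui

    if impact <= 0:
        base = 0.0
    elif scope_changed:
        base = roundup(min(1.08 * (impact + exploitability), 10))
    else:
        base = roundup(min(impact + exploitability, 10))

    return {
        "ISS": iss,
        "Impact": impact,
        "Exploitability": exploitability,
        "BaseScore": base,
    }


# Vector AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H expressed as v3.1 weights.
example = cvss_base_score(0.56, 0.56, 0.56, 0.85, 0.77, 0.85, 0.85, False)
print(example["BaseScore"])  # 9.8
```

The PR weight in CVSS v3.1 depends on the scope, so the caller is expected to pass the weight that matches `scope_changed`.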


Steps to reproduce

Other researchers can use our dataset to repeat the experiment. The dataset on the first sheet is the training dataset used to fit the linear regression model. The model uses nine features. Three features relate to security impact: C (confidentiality), I (integrity), and A (availability). Six features relate to exploitability: SU (scope unchanged), SC (scope changed), AV (attack vector), AC (attack complexity), PR (privileges required), and UI (user interaction). ISS, Impact, Exploitability, and BaseScore are additional values calculated using the CVSS formula. The datasets on the second to fourth sheets are the testing datasets used to predict the base score of business process vulnerabilities. The predicted base score for each vulnerability is listed in the “LinRegBSC” column, while the base score calculated using the CVSS formula is in the “Base Score” column. Comparing the two columns shows that the scores differ.
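The training and prediction steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the procedure used in the study: the rows are synthetic stand-ins rather than values from the dataset, SU and SC are assumed to be 0/1 indicators, the other features are assumed to hold CVSS v3.1 metric weights, and the helper names (`solve`, `fit`, `predict`) are hypothetical. No separate intercept is fitted because SU + SC = 1 under that encoding, and a tiny ridge term is added to the normal equations only to keep the system solvable if columns turn out to be collinear.

```python
import random

FEATURES = ["C", "I", "A", "SU", "SC", "AV", "AC", "PR", "UI"]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[k][n] / m[k][k] for k in range(n)]


def fit(rows, targets, ridge=1e-8):
    """Least-squares weights from (X^T X + ridge * I) w = X^T y."""
    k = len(rows[0])
    xtx = [
        [sum(r[p] * r[q] for r in rows) + (ridge if p == q else 0.0) for q in range(k)]
        for p in range(k)
    ]
    xty = [sum(r[p] * t for r, t in zip(rows, targets)) for p in range(k)]
    return solve(xtx, xty)


def predict(weights, row):
    """Predicted base score for one feature row."""
    return sum(w * x for w, x in zip(weights, row))


# Synthetic stand-in data (NOT taken from the dataset).
rng = random.Random(0)


def synthetic_row():
    scope_changed = rng.random() < 0.5
    return [
        rng.choice([0.0, 0.22, 0.56]),         # C
        rng.choice([0.0, 0.22, 0.56]),         # I
        rng.choice([0.0, 0.22, 0.56]),         # A
        0.0 if scope_changed else 1.0,         # SU (assumed 0/1 indicator)
        1.0 if scope_changed else 0.0,         # SC (assumed 0/1 indicator)
        rng.choice([0.85, 0.62, 0.55, 0.2]),   # AV
        rng.choice([0.77, 0.44]),              # AC
        rng.choice([0.85, 0.62, 0.27]),        # PR
        rng.choice([0.85, 0.62]),              # UI
    ]


# Arbitrary weights used only to generate targets for this illustration.
TRUE_WEIGHTS = [3.0, 3.0, 3.0, 0.5, 1.0, 2.0, 1.5, 1.0, 1.0]

train_rows = [synthetic_row() for _ in range(60)]
train_targets = [predict(TRUE_WEIGHTS, row) for row in train_rows]

weights = fit(train_rows, train_targets)

test_row = synthetic_row()
predicted_score = predict(weights, test_row)
for name, w in zip(FEATURES, weights):
    print(f"{name}: {w:.3f}")
```

To repeat the experiment with the real data, `train_rows` and `train_targets` would be replaced by the nine feature columns and the BaseScore column of the “Training” sheet, and `predict` would be applied to each row of the testing sheets for comparison with the “LinRegBSC” column.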


Universitas Airlangga


e-Commerce, Information Security, Linear Regression, Vulnerability