Understandability of Global Post-hoc Explanations of Black-box Models: Dataset and Analysis
This dataset contains the data collected in a user study carried out to evaluate the impact of using domain knowledge, in the form of ontologies, in the creation of global post-hoc explanations of black-box models. The research hypothesis was that the use of ontologies could enhance the understandability of explanations by humans. To validate this hypothesis, we ran a user study in which participants were asked to carry out several tasks. For each task, the answers, time of response, and user understandability and confidence were collected and measured. The data analysis revealed that the use of ontologies does enhance the understandability of explanations of black-box models by human users, in particular when the explanations take the form of decision trees explaining artificial neural networks.
Steps to reproduce
The data folder contains the dataset used for the analysis:
- all_data_subject_per_row_anon.csv: the data collected from the questionnaire.
- all_data_anon.csv: processed data for the analysis of Tasks 1, 2 and 3.
- all_data_anon_empow.csv: processed data for the analysis of Task 4.
- tree_data.csv: data about the trees (explanations) extracted from neural networks.

The script folder contains the scripts used to analyse the data. Some scripts are in R and others are in Python (files with the .r extension are not allowed to be uploaded, so they have been renamed to .r.txt):
- general_statistics.py: Python script that generates the statistics presented in Table 2.
- trees_fidelity_accuracy_complexity.py: Python script that generates the statistics about the extracted trees in Tables 1 and 3.
- task1_task2_accuracy_analysis.r: R script with the analysis of answer accuracy carried out for the classification and inspection tasks (Figure 5).
- task1_task2_time_of_response_analysis.r: R script with the analysis of time of response carried out for the classification and inspection tasks (Figure 6).
- task1_task2_user_understandability_analysis.py: R script with the analysis of user understandability carried out for the classification and inspection tasks (Figure 7).
- task3_analysis.r.txt: R script with the analysis of the comparison task.
- task3_analysis_ratios.r.txt: R script that generates the statistics presented in the comparison task analysis.
- task4_semantic_similarity.py: Python script that generates statistics about correct answers, time of response, and semantic similarity of free-text answers in the empowerment task (Figure 7).
- task4_analysis.r.txt: R script with the analysis of accuracy and time of response in the empowerment task.
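
Because the R scripts were uploaded with a .r.txt extension, they may need to be renamed back to .r before being run or opened in an R editor. The following is a minimal sketch of one way to do this, not part of the published scripts; the helper name restore_r_extensions and the folder path passed to it are assumptions made for illustration.

```python
from pathlib import Path


def restore_r_extensions(script_dir):
    """Rename every *.r.txt file in script_dir back to *.r.

    Hypothetical helper: returns the list of restored paths.
    """
    restored = []
    for path in sorted(Path(script_dir).glob("*.r.txt")):
        # with_suffix("") drops only the trailing ".txt", leaving "name.r".
        target = path.with_suffix("")
        if target.exists():
            # Never overwrite an existing .r file; leave the .r.txt in place.
            continue
        path.rename(target)
        restored.append(target)
    return restored
```

Calling restore_r_extensions("script") would, under the assumption that the scripts sit in a folder named script, turn task3_analysis.r.txt into task3_analysis.r and leave the Python files untouched.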