Underpricing Fraud Detection using Artificial Intelligence Technology to reduce Tax evasion in Tanzania

Published: 27 October 2025 | Version 1 | DOI: 10.17632/ypb66kydvw.1
Contributor:
Benitho Chengula

Description

This repository includes qualitative data collected through interviews, with responses gathered and analyzed using ATLAS.ti. Videos and magazine notes were also coded in ATLAS.ti. The results of these analyses are presented in the research article submitted for journal publication. The repository also contains quantitative data from human participants: taxpayers, tax officers, and customers (buyers). The questions differed between respondent groups, so a question that was not put to a given group has missing values for that group. The quantitative data were analyzed using SPSS (Statistical Package for the Social Sciences). Supplementary data, such as government and official reports, were analyzed using Excel, and the results are included in the research paper. Lastly, the dataset used to train the machine learning model behind the AI tool is attached. The AI tool is integrated into a simulated EFD (Electronic Fiscal Device) application to detect underpricing fraud in real time.

Files

Steps to reproduce

1. Access the Repository
  • Download the repository containing both qualitative and quantitative datasets, supplementary reports, and the trained machine learning model.
  • Ensure that the folder structure remains unchanged after extraction.

2. Qualitative Data Analysis (ATLAS.ti)
  • Open the Qualitative_Data folder in ATLAS.ti.
  • Load the project file (.atlproj).
  • Review coded interviews, video transcripts, and magazine notes.
  • Generate thematic reports to verify the patterns presented in the research article.

3. Quantitative Data Analysis (SPSS)
  • Open the Quantitative_Data.sav file in SPSS.
  • Run descriptive and inferential statistics as described in the research paper.
  • Note that some variables contain missing values due to differences in respondent types (taxpayers, tax officers, and customers).

4. Supplementary Data Review (Excel)
  • Open the Government_Reports.xlsx file in Excel.
  • Verify calculations, summaries, and charts used in the research paper.

5. Machine Learning Model Reproduction
  • Load the Underpricing_Model.ipynb (or .py) file in Jupyter Notebook or your preferred Python environment.
  • Import the training dataset located in ML_Dataset.csv.
  • Run all cells to preprocess data, train the model, and generate performance metrics.
  • Save the trained model as a .pkl or .tflite file.

6. Integration with the Simulated EFD Application
  • Open the EFD Simulation App (in the EFD_AI_Tool folder).
  • Integrate the trained AI model as instructed in the README.md.
  • Launch the simulated app to test real-time detection of underpricing fraud.

7. Verification of Results
  • Compare outputs (themes, statistics, and AI predictions) with the results presented in the published research article to ensure consistency.
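
The machine learning step above follows a load, fit, save, predict cycle. The sketch below illustrates that cycle only and is not the model shipped in Underpricing_Model.ipynb. It assumes two hypothetical columns, `item` and `price`, which may not match the real ML_Dataset.csv. In place of a trained classifier it uses a simple per-item median baseline, with an illustrative threshold `ratio` that is also an assumption. The sample rows are synthetic.

```python
import csv
import io
import pickle
import statistics
from collections import defaultdict


def fit_reference_prices(csv_file):
    """Learn a median price per item.

    Expects columns named 'item' and 'price' (hypothetical names; adjust
    them to whatever ML_Dataset.csv actually contains).
    """
    prices = defaultdict(list)
    for row in csv.DictReader(csv_file):
        prices[row["item"]].append(float(row["price"]))
    return {item: statistics.median(values) for item, values in prices.items()}


def is_underpriced(model, item, declared_price, ratio=0.7):
    """Flag a sale whose declared price is below `ratio` times the learned median.

    Items not seen during fitting are never flagged, because there is no
    reference price to compare against.
    """
    reference = model.get(item)
    if reference is None:
        return False
    return declared_price < ratio * reference


# Tiny synthetic sample standing in for ML_Dataset.csv (values are made up).
SAMPLE_CSV = """item,price
sugar_1kg,3000
sugar_1kg,3200
sugar_1kg,3100
cement_50kg,17000
cement_50kg,16500
"""

model = fit_reference_prices(io.StringIO(SAMPLE_CSV))

# Round-trip through pickle, mirroring the "save as .pkl" step.
restored = pickle.loads(pickle.dumps(model))

print(is_underpriced(restored, "sugar_1kg", 1500))  # True: 1500 < 0.7 * 3100
print(is_underpriced(restored, "sugar_1kg", 3000))  # False: 3000 >= 0.7 * 3100
```

To run it against the real file, replace `io.StringIO(SAMPLE_CSV)` with an open handle on ML_Dataset.csv and write the result with `pickle.dump`. The notebook in the repository remains the reference for the actual preprocessing, model, and metrics.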

Institutions

  • Nelson Mandela African Institute of Science and Technology School of Mathematics Computational and Communication Science and Engineering

Categories

Fraud

Licence