Underpricing Fraud Detection Using Artificial Intelligence Technology to Reduce Tax Evasion in Tanzania
Description
The dataset contains qualitative and quantitative data collected during research on the prevalence of tax evasion through product underpricing. It is intended to support machine learning applications that detect underpricing fraud in real time. The qualitative data can be analyzed with Atlas.ti, and the quantitative data is compatible with SPSS.
The questionnaire data was gathered from traders in the Arusha region, with ethical clearance from the Nelson Mandela African Institution of Science and Technology (NM-AIST) and informed consent from participants. Questionnaires were distributed through Google Forms to traders with internet access, and researchers visited businesses in person to reach participants without smartphones or computers. In line with ethical guidelines, no data was collected from anyone who declined to participate. All collected data is anonymous, except for publicly available information.
Interviews were conducted with tax authority officials based at the headquarters in Dar es Salaam, including officers from the EFD department, ICT officers, and economists. These interviews provided insights into the extent of tax evasion in Tanzania, the existing mechanisms to combat it, the loopholes that persist, and opinions on proposed measures to address underpricing fraud. In addition, publicly available government reports were analyzed to assess trends in tax collection. Media sources, including images, videos, and magazines, were analyzed qualitatively to better understand the intensity of tax evasion, the challenges the government faces, and the efforts made to boost revenue.
Steps to reproduce
The data for this research was collected through interviews, questionnaires, and document analysis. Taxpayers in the Arusha region were interviewed to understand their perspectives on tax evasion. The discussions focused on the techniques used to evade taxes, how often the tax authority visits to inspect compliance, the methods used to report sales and tax liabilities, and how effective those methods are. Questionnaire data, shared via Google Forms, was collected with ethical clearance from the Nelson Mandela African Institution of Science and Technology (NM-AIST) and with participant consent. Document analysis involved a thorough literature review.
The qualitative data from tax officers was gathered through interviews conducted face-to-face or by telephone, depending on the officers' availability given their demanding schedules. The researchers submitted an ethical clearance letter together with a formal request to the Tanzania Revenue Authority (TRA) for permission to interview its officers. The TRA facilitated the process by introducing the researchers to the participants, which helped build trust and encourage open responses. The interview responses were recorded and documented in Word files.
For analysis, the qualitative data was coded and processed in Atlas.ti, and the codes extracted from the documents were grouped into themes that summarize participant insights. The quantitative data from Google Forms was coded in SPSS for statistical analysis, giving a structured and measurable view of the study findings.
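For readers without SPSS, the coding step for questionnaire answers can be approximated in Python. The sketch below is illustrative only: the codebook, the answer labels, and the sample answers are hypothetical and are not taken from the dataset, and the function names are made up for this example. It maps text answers to numeric codes and builds a simple frequency table, which is roughly what value coding followed by a frequencies run would produce.

```python
from collections import Counter

# Hypothetical codebook mapping answer labels to numeric codes.
# These labels are an assumption, not the dataset's actual coding scheme.
CODEBOOK = {"never": 0, "rarely": 1, "sometimes": 2, "often": 3}


def code_responses(responses, codebook):
    """Return numeric codes for text answers; blank or unknown answers become None."""
    coded = []
    for answer in responses:
        key = answer.strip().lower() if isinstance(answer, str) else ""
        coded.append(codebook.get(key))
    return coded


def frequency_table(codes):
    """Map each code to (count, share of valid answers), ignoring missing values."""
    valid = [c for c in codes if c is not None]
    total = len(valid)
    if total == 0:
        return {}
    counts = Counter(valid)
    return {code: (n, n / total) for code, n in sorted(counts.items())}


# Made-up answers, for illustration only.
sample = ["Often", "never", " sometimes ", "often", "", "Rarely"]
codes = code_responses(sample, CODEBOOK)
table = frequency_table(codes)
```

Missing answers are kept as None rather than dropped during coding, so the position of each respondent is preserved and only the frequency table excludes them.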
Funding
HEET project Tanzania - Ardhi University