Adaptive Prefix Filtering for Accurate Code Clone Detection in conjunction with Meta-Learning
This work advances code clone detection by introducing a new testing method and constructing a highly accurate meta-classifier. Several architectures were evaluated on a dataset of 19,988 data points comprising code metrics from both Java and Python code. Training on this cross-language dataset allowed the model to generalize across language paradigms, so the resulting classifier can detect duplicate code in both languages. The proposed model outperformed the state-of-the-art models it was compared against, which supports its use as a meta-classifier for code clone detection.
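For illustration only, the general idea of a meta-classifier over code-metric features can be sketched as below. This is a minimal sketch under stated assumptions, not the architecture evaluated in this work: the threshold-based base classifiers, the accuracy-weighted vote, the feature layout (absolute metric differences between two code fragments), and all function names are hypothetical.

```python
from typing import Callable, List, Sequence

# Assumption: each sample holds absolute differences of code metrics between
# two fragments, e.g. (|delta lines of code|, |delta cyclomatic complexity|).
Sample = Sequence[float]
BaseClassifier = Callable[[Sample], int]  # 1 = "clone", 0 = "not a clone"


def threshold_rule(feature_index: int, threshold: float) -> BaseClassifier:
    """Hypothetical base classifier: 'clone' when one metric difference is small."""
    def predict(sample: Sample) -> int:
        return 1 if sample[feature_index] <= threshold else 0
    return predict


def fit_vote_weights(bases: List[BaseClassifier],
                     samples: List[Sample],
                     labels: List[int]) -> List[float]:
    """Weight each base classifier by its accuracy on held-out labelled pairs."""
    weights = []
    for clf in bases:
        correct = sum(1 for x, y in zip(samples, labels) if clf(x) == y)
        weights.append(correct / len(samples))
    return weights


def meta_predict(bases: List[BaseClassifier],
                 weights: List[float],
                 sample: Sample) -> int:
    """Meta-level decision: weighted vote over the base predictions."""
    clone_score = sum(w for clf, w in zip(bases, weights) if clf(sample) == 1)
    return 1 if clone_score > sum(weights) / 2 else 0


# Toy held-out data (invented for the example, not taken from the dataset).
bases = [threshold_rule(0, 5.0), threshold_rule(1, 2.0), threshold_rule(0, 1.0)]
held_out = [(1.0, 1.0), (4.0, 3.0), (10.0, 1.0), (12.0, 6.0)]
held_out_labels = [1, 1, 0, 0]
weights = fit_vote_weights(bases, held_out, held_out_labels)

print(weights)
print(meta_predict(bases, weights, (3.0, 1.0)))
print(meta_predict(bases, weights, (10.0, 1.0)))
```

Because the features are language-independent metric values rather than tokens, the same sketch applies unchanged to pairs of Java or Python fragments. A learned meta-level model could replace the weighted vote without changing the interface.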