POLB SNPs Dataset

Published: 17 April 2024 | Version 1 | DOI: 10.17632/d6385g6kv6.1

Description

Research Hypothesis: Our study focuses on the critical role of DNA polymerase β (Polβ), encoded by the POLB gene, in base excision repair, a DNA repair pathway that is important for maintaining genomic stability. We hypothesize that Single Nucleotide Polymorphisms (SNPs) within the POLB gene may influence the enzyme's ability to repair DNA and thereby affect the likelihood of cancer development. By using bioinformatics tools to extract significant features from SNPs and applying machine learning algorithms, we aim to predict cancer development associated with specific POLB gene mutations.

Data Overview: The dataset comprises SNPs within the Polymerase Beta (POLB) gene that are relevant to studying genetic variation related to cancer. It includes raw and processed data:
• Positive SNPs: 232 known cancer-related SNPs from the COSMIC and NCI databases, mapped to the human genome assembly hg19.
• Unknown SNPs: Around 12,000 SNPs (hg19) obtained from dbSNP whose association with cancer has not yet been determined to be positive or negative.
• Processed Data: A matrix of 813 SNPs with bioinformatics features, prepared for classification via machine learning.

Data Interpretation: The classification code available via GitHub facilitates further research and study replication. Researchers can use the processed data for immediate machine learning classification or explore the raw data for more in-depth study.

Summary of Findings: Our study highlights the critical role of the POLB gene in DNA repair and cancer prevention, and the potential of combining bioinformatics tools and machine learning to develop predictive models for genetic and cancer research based on SNPs within the POLB gene.
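As an illustration of how a processed SNP feature matrix of the kind described under Data Overview might be read before classification, here is a minimal Python sketch. The column names (snp_id, feature_a, feature_b, label), the 1/0 label encoding, and the inline sample rows are all hypothetical placeholders; the actual file layout in this dataset may differ and should be checked against the downloaded files.

```python
import csv
import io

# Hypothetical excerpt: the column names, label encoding, and values
# below are assumptions for illustration, not taken from the dataset.
SAMPLE_CSV = """snp_id,feature_a,feature_b,label
snp_1,0.12,1.0,1
snp_2,0.80,0.0,0
snp_3,0.45,1.0,1
"""


def load_feature_matrix(handle, id_col="snp_id", label_col="label"):
    """Split a CSV feature matrix into ids, feature names, feature rows, and labels."""
    reader = csv.DictReader(handle)
    # Every column that is neither the identifier nor the label is a feature.
    feature_cols = [c for c in reader.fieldnames if c not in (id_col, label_col)]
    ids, X, y = [], [], []
    for row in reader:
        ids.append(row[id_col])
        X.append([float(row[c]) for c in feature_cols])
        y.append(int(row[label_col]))
    return ids, feature_cols, X, y


ids, feature_cols, X, y = load_feature_matrix(io.StringIO(SAMPLE_CSV))
print(len(ids), "SNPs,", len(feature_cols), "features,", sum(y), "labelled positive")
```

To use this with a real file, replace io.StringIO(SAMPLE_CSV) with an open file handle and adjust id_col and label_col to match the actual headers. The resulting X and y lists can then be passed to a classifier of choice.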

Institutions

Khalifa University of Science and Technology

Categories

DNA Polymerase, Single Nucleotide Polymorphism
