A Deep Learning and XGBoost-based Method for Predicting Protein-protein Interaction Sites

Published: 10 September 2021 | Version 4 | DOI: 10.17632/9tft3vz5tm.4
Contributor:

Description

local_feature_training_set.csv: Preprocessed data from the feature extractor, with 65869 rows and 344 columns. The first 343 columns are features and the last column is the label. The CSV file includes row indexes and a column index.

local_feature_testing_set.csv: Preprocessed data from the feature extractor, with 11791 rows and 344 columns. The first 343 columns are features and the last column is the label. The CSV file includes row indexes and a column index.

global&local_feature_training_set.csv: Preprocessed data from the feature extractor, with 65869 rows and 1028 columns. The first 1027 columns are features and the last column is the label. The CSV file includes row indexes and a column index.

global&local_feature_testing_set.csv: Preprocessed data from the feature extractor, with 11791 rows and 1028 columns. The first 1027 columns are features and the last column is the label. The CSV file includes row indexes and a column index.

raw_feature_training_set.csv: Raw feature data (secondary structure, raw protein sequence, position-specific scoring matrix features), with 65869 rows and 24844 columns. The first 24843 columns are features and the last column is the label. The CSV file includes a column index.

raw_feature_testing_set.csv: Raw feature data (secondary structure, raw protein sequence, position-specific scoring matrix features), with 11791 rows and 24844 columns. The first 24843 columns are features and the last column is the label. The CSV file includes a column index.
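The layout described above (features first, label in the last column) can be read with a few lines of Python. This is a minimal sketch and not part of the dataset: the function name load_feature_csv and the has_row_index parameter are hypothetical. It assumes that the "column index" is a single header row, that the row index, where present, is the first column of each row, and that all feature and label values parse as numbers. The description does not say whether the stated column counts include the row-index column, so the sketch does not check them.

```python
import csv
from typing import List, Tuple


def load_feature_csv(
    path: str, has_row_index: bool
) -> Tuple[List[List[float]], List[float]]:
    """Split a feature CSV into (features, labels).

    Assumptions (not confirmed by the dataset description):
    - the first line is a header row (the "column index");
    - if has_row_index is True, the first field of each row is a row index;
    - the last field of each row is the label.
    """
    features: List[List[float]] = []
    labels: List[float] = []
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader)  # skip the header row
        for row in reader:
            if not row:
                continue  # ignore blank lines
            values = row[1:] if has_row_index else row
            features.append([float(v) for v in values[:-1]])
            labels.append(float(values[-1]))
    return features, labels
```

Under these assumptions, load_feature_csv("local_feature_training_set.csv", has_row_index=True) would return the feature rows and labels for the local-feature training set, while the raw_feature files would be read with has_row_index=False. The sketch uses only the standard library; for the raw-feature files, which have 24844 columns, an array library may be more practical.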

Files