Dataset 4 - Membrane Protein Types

Published: 27 Jun 2018 | Version 1 | DOI: 10.17632/dbzdybks82.1

Description of this data

To establish a quality benchmark dataset for developing a predictor that identifies the functional types of membrane proteins, sequences were collected from the UniProtKB/Swiss-Prot release 2018_04 at http://www.uniprot.org/ according to the following steps (Lin et al. 2013). Proteins belonging to all eight types were collected. Proteins annotated with “fragment” were removed, and proteins with sequences shorter than 50 residues were also excluded to avoid the influence of fragments. Sequences annotated with ambiguous or uncertain terms, such as “potential,” “probable,” “probably,” “maybe,” or “by similarity,” were removed from further consideration.
Dataset 4 is divided into a training dataset and a testing dataset with 1332 and 1033 sequences, respectively.
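
The exclusion rules described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the contributors' actual pipeline: the `ProteinRecord` type, its field names, and the plain substring matching on annotation text are all hypothetical, and the dataset does not document how the UniProtKB/Swiss-Prot entries were parsed.

```python
from dataclasses import dataclass

# Thresholds and terms taken from the description above.
MIN_LENGTH = 50
AMBIGUOUS_TERMS = ("potential", "probable", "probably", "maybe", "by similarity")


@dataclass(frozen=True)
class ProteinRecord:
    """Hypothetical container for one entry; not a UniProt data structure."""
    accession: str   # entry identifier
    annotation: str  # free-text annotation, e.g. the subcellular location note
    sequence: str    # amino acid sequence, one letter per residue


def keep_record(record: ProteinRecord) -> bool:
    """Return True if the record passes all three exclusion rules."""
    text = record.annotation.lower()
    # Rule 1: drop entries annotated as fragments.
    if "fragment" in text:
        return False
    # Rule 2: drop sequences shorter than 50 residues.
    if len(record.sequence) < MIN_LENGTH:
        return False
    # Rule 3: drop entries whose annotation uses ambiguous or uncertain terms.
    # Assumption: a case-insensitive substring match is good enough here.
    return not any(term in text for term in AMBIGUOUS_TERMS)


def filter_records(records: list[ProteinRecord]) -> list[ProteinRecord]:
    """Keep only the records that pass keep_record, preserving order."""
    return [record for record in records if keep_record(record)]
```

Note that substring matching is a simplification: it would also reject an annotation that mentions, for example, “membrane potential”, so a real pipeline would more likely inspect the structured evidence qualifiers of each entry.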

This data is associated with the following publication:

Predicting membrane protein types by incorporating a novel feature set into Chou's general PseAAC

Published in: Journal of Theoretical Biology

Cite this dataset

Siva Sankari, Elangovan; megalai, mani (2018), “Dataset 4 - Membrane Protein Types”, Mendeley Data, v1 http://dx.doi.org/10.17632/dbzdybks82.1

Statistics

Views: 541
Downloads: 30

Institutions

Government College of Engineering

Categories

Membrane Proteins

Licence

CC BY 4.0

The files associated with this dataset are licensed under a Creative Commons Attribution 4.0 International licence.

What does this mean?

You can share, copy and modify this dataset so long as you give appropriate credit, provide a link to the CC BY licence, and indicate if changes were made, but you may not do so in a way that suggests the rights holder has endorsed you or your use of the dataset. Note that further permission may be required for any content within the dataset that is identified as belonging to a third party.
