Discrimination of GO term annotated proteins based on amino acid occurrence and composition

dataset

posted on 2017-11-21, 00:21 authored by Taguchi, Y. H., Gromiha, M. Michael

In this paper, we applied linear discriminant analysis and support vector machines to predict GO term annotated proteins from amino acid occurrence/composition in the UniRef50 data set, i.e., UniProt with less than 50% sequence identity. We found that our method could discriminate between proteins with at least one known GO term and those without any annotation with an AUC of 0.82 in a three-fold cross-validation test. Discrimination of the 38 most frequent GO terms was achieved with a maximum AUC of 0.91. Because our method is based solely on amino acid sequence, it will be useful for predicting GO term associations of newly obtained amino acid sequences that have no annotated homolog.

PRIB 2008 proceedings found at: http://dx.doi.org/10.1007/978-3-540-88436-1

Contributors: Monash University. Faculty of Information Technology. Gippsland School of Information Technology ; Chetty, Madhu ; Ahmad, Shandar ; Ngom, Alioune ; Teng, Shyh Wei ; Third IAPR International Conference on Pattern Recognition in Bioinformatics (PRIB) (3rd : 2008 : Melbourne, Australia)

Coverage:

Rights: Copyright by Third IAPR International Conference on Pattern Recognition in Bioinformatics. All rights reserved.
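The abstract names two sequence-derived features (amino acid occurrence and composition) and reports results as AUC. The sketch below illustrates one plausible reading of those terms and is not taken from the paper: it assumes occurrence means raw counts of the 20 standard residues and composition means those counts divided by their sum. The function names `occurrence`, `composition`, and `auc` are hypothetical, and the AUC shown is the generic pairwise-ranking definition rather than the authors' evaluation code.

```python
from collections import Counter

# Assumption: only the 20 standard residues (one-letter codes) are counted.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def occurrence(sequence):
    """Return the count of each standard residue, in AMINO_ACIDS order."""
    counts = Counter(sequence.upper())
    return [counts[aa] for aa in AMINO_ACIDS]


def composition(sequence):
    """Return each residue count divided by the total of standard residues."""
    occ = occurrence(sequence)
    total = sum(occ)
    if total == 0:
        # Empty sequence or no standard residues: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [n / total for n in occ]


def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Either 20-dimensional vector could then be passed to any linear discriminant analysis or support vector machine implementation, with `auc` applied to the classifier scores on each held-out fold.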

Keywords

Bioinformatics -- Congresses ; Computational biology -- Congresses ; Computer vision in medicine -- Congresses ; Computational biology -- Methods -- Congresses ; Pattern recognition, automated -- Methods -- Congresses ; 2008 ; conference paper ; 1959.1/63680 ; monash:7848 ; Bioinformatics Software ; Bioinformatics ; Pattern Recognition and Data Mining

Licence

In Copyright
