Multiple sequence alignment for functional correlation among low similarity sequences
Dataset posted on 21.11.2017 by Chou, Wei-Yao; Chou, Wei-I; Pai, Tun-Wen; Lin, Shu-Chuan; Chang, Fan-Yu; Sun, Yuh-Ju; Tang, Chuan-Yi; Chang, Margaret Dah-Tsyr
Multiple sequence alignment is a widely used method in biological applications. It is expected to locate consensus sequence stretches that are evolutionarily and functionally conserved. However, it performs poorly when sequence similarity among the queries is low. The aim of this study is to incorporate important biological knowledge and assumptions to improve the quality of general alignment for low-similarity sequences such as carbohydrate binding module (CBM) families. Because the recognition of characteristic patterns in CBMs does not fit a general model, a more accurate scoring function employing secondary-structure-based and key-residue-weighted algorithms was designed for alignment. Our results indicate that the new method can identify the key residues in terms of three-dimensional structures in cases where conventional tools can fail.
PRIB 2008 proceedings found at: http://dx.doi.org/10.1007/978-3-540-88436-1
Contributors: Monash University. Faculty of Information Technology. Gippsland School of Information Technology; Chetty, Madhu; Ahmad, Shandar; Ngom, Alioune; Teng, Shyh Wei; Third IAPR International Conference on Pattern Recognition in Bioinformatics (PRIB) (3rd : 2008 : Melbourne, Australia)
Rights: Copyright by Third IAPR International Conference on Pattern Recognition in Bioinformatics. All rights reserved.