Improvements in the score matrix calculation method using parallel score estimating algorithm

Abstract

The growing number of sequences stored in genomic databases has made sequential analysis infeasible. Parallel computing has therefore been brought to Bioinformatics through parallel algorithms for aligning and analyzing sequences, with the main benefit being shorter running times. In many situations, a parallel strategy also helps reduce the computational cost of large problems. This work presents results obtained with an implementation of a parallel score estimating technique for the score matrix calculation stage, which is the first stage of a progressive multiple sequence alignment. The performance and quality of the parallel score estimating technique are compared with those of a dynamic programming approach that was also implemented in parallel. The comparison shows a significant reduction in running time. In addition, the quality of the final alignment produced with the new strategy is analyzed and compared with the quality obtained with the dynamic programming approach.
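To make the score matrix calculation stage concrete, the following is a minimal sketch, not the authors' implementation. It fills a symmetric pairwise score matrix in two ways: with a Needleman-Wunsch dynamic programming score, and with a cheaper estimated score. The abstract does not describe the estimating technique itself, so `kmer_estimate` (Jaccard similarity of k-mer sets) is a hypothetical stand-in, and the function names, scoring parameters, and the thread pool are all assumptions chosen for illustration. A real implementation would more likely distribute the pairs across processes or cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score by dynamic programming (Needleman-Wunsch).

    Costs O(len(a) * len(b)) time per pair; scoring values are assumptions.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def kmer_estimate(a, b, k=2):
    """Hypothetical estimator (assumption): Jaccard similarity of k-mer sets.

    Avoids the quadratic DP table, at the price of an approximate score.
    """
    ka = {a[i:i + k] for i in range(len(a) - k + 1)}
    kb = {b[i:i + k] for i in range(len(b) - k + 1)}
    union = ka | kb
    return len(ka & kb) / len(union) if union else 0.0


def score_matrix(seqs, pair_score, workers=4):
    """Fill a symmetric score matrix, scoring each pair as an independent task.

    The diagonal is left at 0 because a sequence is not scored against itself.
    """
    n = len(seqs)
    matrix = [[0] * n for _ in range(n)]
    pairs = list(combinations(range(n), 2))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = pool.map(lambda p: pair_score(seqs[p[0]], seqs[p[1]]), pairs)
        for (i, j), s in zip(pairs, scores):
            matrix[i][j] = matrix[j][i] = s
    return matrix
```

Calling `score_matrix(seqs, nw_score)` and `score_matrix(seqs, kmer_estimate)` on the same input gives the two matrices whose cost and downstream alignment quality can then be compared. Each pair is independent of the others, which is what makes this stage straightforward to parallelize.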

Share and Cite:

Zafalon, G., Marucci, E., Momente, J., Amazonas, J., Sato, L. and Machado, J. (2013) Improvements in the score matrix calculation method using parallel score estimating algorithm. Journal of Biophysical Chemistry, 4, 47-51. doi:10.4236/jbpc.2013.42006

Conflicts of Interest

The authors declare no conflicts of interest.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.