Share This Article:

A Bioinformatics-Inspired Adaptation to Ukkonen’s Edit Distance Calculating Algorithm and Its Applicability Towards Distributed Data Mining

Abstract Full-Text HTML Download Download as PDF (Size:189KB) PP. 8-12
DOI: 10.4236/jsea.2008.11002    6,067 Downloads   9,328 Views   Citations
Author(s)    Leave a comment

ABSTRACT

Edit distance measures the similarity between two strings (as the minimum number of change, insert or delete operations that transform one string to the other). An edit sequence s is a sequence of such operations and can be used to represent the string resulting from applying s to a reference string. We present a modification to Ukkonen’s edit distance calculating algorithm based upon representing strings by edit sequences. We conclude with a demonstration of how using this representation can improve mitochondrial DNA query throughput performance in a distributed computing environment.

Conflicts of Interest

The authors declare no conflicts of interest.

Cite this paper

J. Bruce, "A Bioinformatics-Inspired Adaptation to Ukkonen’s Edit Distance Calculating Algorithm and Its Applicability Towards Distributed Data Mining," Journal of Software Engineering and Applications, Vol. 1 No. 1, 2008, pp. 8-12. doi: 10.4236/jsea.2008.11002.

References

[1] M. D. Vose, “A formal analysis of edit distance,” UT CS Technical Report ut-cs-04-517, February 2004.
[2] R. O. Duda and P. E. Hart, Pattern Classification (2nd ed.). Wiley Interscience, 2000.
[3] A. Wagner and M. I. Fischer, “The string-to-string correction problem,” Journal of the ACM, 21(1) (Jan. 1974), pp. 168-173, 1974.
[4] E. Ukkonen, “Algorithms for approximate string matching,” International Control 64, pp. 100-118, 1985.
[5] E. Ukkonen, “On approximate string matching,” International Conference Fundamentals of Computation Theory, Lecture Notes in Computer Science, pp. 158:487-495, 1983.
[6] N. Campbell and J. Reese, Biology (6th ed.), Addison Wesley, 1997.
[7] S. Anderson, et al., “Sequence and organization of the human mitochondrial genome,” Nature, 290(5806) (April 9, 1981), pp. 457-265, 1981.
[8] K. L. Monson, et al., “The mtDNA population database: An integrated software and database resource for forensic comparison,” Forensic Science Communications, 4(2), April 2002. DOI=http://www.fbi.gov/hq/lab/fsc/ backissu/april2002/miller1.htm.
[9] http://iperf.sourceforge.net.

  
comments powered by Disqus

Copyright © 2019 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.