A Novel Method for Transforming XML Documents to Time Series and Clustering Them Based on Delaunay Triangulation
ABSTRACT
Exchanging data in XML format has become increasingly popular and widely applied because XML documents are simple to maintain and transfer. Accelerating search within such documents therefore improves search engine efficiency. In this paper, we propose a technique for detecting structural similarity between XML documents and then cluster the documents using Delaunay Triangulation. The technique is based on representing the structure of an XML document as a time series in which each occurrence of a tag corresponds to an impulse. The Discrete Fourier Transform then provides a simple way to analyze these signals in the frequency domain and to build similarity matrices from a distance measure, so that the documents can be grouped into clusters. We use Delaunay Triangulation as the clustering method for the d-dimensional points that represent the XML documents. The results show significant efficiency and accuracy compared with common methods.
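The first stages of this pipeline (tag occurrences as impulses, Discrete Fourier Transform, distance between spectra) can be illustrated with a minimal Python sketch. The abstract does not specify the encoding, so the choices here are assumptions made only for illustration: each distinct tag name receives an integer amplitude from a shared table, series are zero-padded to a fixed length, and similarity is measured as the Euclidean distance between magnitude spectra. The function names `xml_to_series`, `dft_magnitudes`, and `spectral_distance` are hypothetical, and the Delaunay Triangulation clustering step is not covered.

```python
import cmath
import math
import xml.etree.ElementTree as ET


def xml_to_series(xml_text, tag_ids):
    """Encode document structure as a series with one impulse per element.

    Assumption (not from the paper): each distinct tag name is mapped to an
    integer amplitude, shared across documents through `tag_ids`.
    """
    root = ET.fromstring(xml_text)
    series = []
    for elem in root.iter():  # visits elements in document order
        amplitude = tag_ids.setdefault(elem.tag, len(tag_ids) + 1)
        series.append(float(amplitude))
    return series


def dft_magnitudes(series, n):
    """Magnitudes |X_k| for k = 0..n//2 of the series zero-padded (or cut) to length n."""
    padded = list(series[:n]) + [0.0] * (n - len(series))
    mags = []
    for k in range(n // 2 + 1):
        x_k = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(padded)
        )
        mags.append(abs(x_k))
    return mags


def spectral_distance(series_a, series_b, n=8):
    """Euclidean distance between the magnitude spectra of two series."""
    spec_a = dft_magnitudes(series_a, n)
    spec_b = dft_magnitudes(series_b, n)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(spec_a, spec_b)))


# Two documents with the same structure but different text, and one
# structurally different document.
doc_a = "<lib><book><title/><author/></book><book><title/><author/></book></lib>"
doc_b = ("<lib><book><title>A</title><author>B</author></book>"
         "<book><title>C</title><author>D</author></book></lib>")
doc_c = "<lib><journal><issue/></journal></lib>"

tags = {}  # shared tag-to-amplitude table
s_a, s_b, s_c = (xml_to_series(d, tags) for d in (doc_a, doc_b, doc_c))
print(spectral_distance(s_a, s_b), spectral_distance(s_a, s_c))
```

Because only tag occurrences are encoded, `doc_a` and `doc_b` map to the same series and so to the same spectrum, while `doc_c` does not. A pairwise matrix of such distances is the kind of similarity matrix the abstract refers to; how the paper derives the d-dimensional points passed to the triangulation is not stated here.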
Share and Cite:
Shafieian, N. (2015) A Novel Method for Transforming XML Documents to Time Series and Clustering Them Based on Delaunay Triangulation. Applied Mathematics, 6, 1076-1085. doi: 10.4236/am.2015.66098.