VATdt: Visual Assessment of Cluster Tendency Using Diagonal Tracing

The visual assessment of cluster tendency (VAT) technique, developed by J. C. Bezdek, R. J. Hathaway and J. M. Huband for visually finding the number of meaningful clusters in data, is very useful, but there is room for improvement. Instead of displaying the ordered dissimilarity matrix (ODM) as a 2D gray-level image for human interpretation, as VAT does, we trace the changes in dissimilarities along the diagonal of the ODM. This reduces the 2D data structure (a matrix) to 1D arrays, displayed as what we call tendency curves, so that one can concentrate on a single variable, namely the height. One of these curves, called the d-curve, clearly shows the existence of cluster structure as patterns of peaks and valleys, which can be detected not only by the human eye but also by a computer. Our numerical experiments showed that the computer can detect cluster structure from the d-curve even in some cases where the human eye sees no structure in the visual output of VAT. Success in all numerical experiments was obtained using the same (fixed) set of program parameter values.
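As a rough illustration of the idea, the following Python sketch reorders a dissimilarity matrix with the standard VAT (Prim-like) ordering and then traces a windowed mean of dissimilarities along the diagonal of the resulting ODM. The function names `vat_order` and `diagonal_trace`, the window parameter `w`, and the windowed-mean rule itself are assumptions made for illustration; the abstract does not define the tendency curves, so this is not the paper's d-curve.

```python
import numpy as np

def vat_order(R):
    """Return the VAT (Prim-like) ordering of a square dissimilarity matrix R."""
    n = R.shape[0]
    # Start from one endpoint of the largest dissimilarity.
    i, _ = np.unravel_index(np.argmax(R), R.shape)
    order = [int(i)]
    remaining = np.ones(n, dtype=bool)
    remaining[i] = False
    # min_dist[j]: smallest dissimilarity from object j to any ordered object.
    min_dist = R[i].astype(float)
    for _ in range(n - 1):
        candidates = np.where(remaining, min_dist, np.inf)
        j = int(np.argmin(candidates))
        order.append(j)
        remaining[j] = False
        min_dist = np.minimum(min_dist, R[j])
    return np.array(order)

def diagonal_trace(odm, w):
    """Hypothetical tendency curve (an assumption, not the paper's d-curve):
    at position i, the mean dissimilarity between object i and the (up to) w
    objects that precede it in the ordering."""
    n = odm.shape[0]
    trace = np.zeros(n)
    for i in range(1, n):
        trace[i] = odm[i, max(0, i - w):i].mean()
    return trace

# Toy data: two well-separated groups on a line.
x = np.concatenate([np.arange(5) * 0.1, 10 + np.arange(5) * 0.1])
R = np.abs(x[:, None] - x[None, :])

order = vat_order(R)
odm = R[np.ix_(order, order)]      # ordered dissimilarity matrix
curve = diagonal_trace(odm, w=2)   # 1D array; the group boundary appears as a peak
print(np.round(curve, 2))
```

In this sketch, low stretches of the curve correspond to dark diagonal blocks in the VAT image, and a peak marks the transition from one block to the next, which is the kind of pattern a program can test for with a simple threshold.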

Conflicts of Interest

The authors declare no conflicts of interest.

Cite this paper

Y. Hu, "VATdt: Visual Assessment of Cluster Tendency Using Diagonal Tracing," American Journal of Computational Mathematics, Vol. 2 No. 1, 2012, pp. 27-41. doi: 10.4236/ajcm.2012.21004.

[1] A. K. Jain and R. C. Dubes, “Algorithms for Clustering Data,” Prentice-Hall, Englewood Cliffs, 1988.
[2] B. S. Everitt, “Graphical Techniques for Multivariate Data,” Elsevier, New York, 1978.
[3] J. W. Tukey, “Exploratory Data Analysis,” Addison-Wesley, Reading, 1977.
[4] W. S. Cleveland, “Visualizing Data,” Hobart Press, Summit, 1993.
[5] J. C. Bezdek and R. J. Hathaway, “VAT: A Tool for Visual Assessment of (Cluster) Tendency,” Proceedings of the 2002 International Joint Conference on Neural Networks, Honolulu, 12-17 May 2002, pp. 2225-2230.
[6] J. C. Bezdek, R. J. Hathaway and J. M. Huband, “Visual Assessment of Clustering Tendency for Rectangular Dissimilarity Matrices,” IEEE Transactions on Fuzzy Systems, Vol. 15, No. 5, 2007, pp. 890-903. doi:10.1109/TFUZZ.2006.889956
[7] R. J. Hathaway, J. C. Bezdek and J. M. Huband, “Scalable Visual Assessment of Cluster Tendency for Large Data Sets,” Pattern Recognition, Vol. 39, No. 7, 2006, pp. 1315-1324. doi:10.1016/j.patcog.2006.02.011
[8] J. M. Huband, J. C. Bezdek and R. J. Hathaway, “Revised Visual Assessment of (Cluster) Tendency (reVAT),” Proceedings of the North American Fuzzy Information Processing Society (NAFIPS), Banff, 27-30 June 2004, pp. 101-104.
[9] J. M. Huband, J. C. Bezdek and R. J. Hathaway, “bigVAT: Visual Assessment of Cluster Tendency for Large Data Set,” Pattern Recognition, Vol. 38, No. 11, 2005, pp. 1875-1886. doi:10.1016/j.patcog.2005.03.018
[10] I. Borg and J. Lingoes, “Multidimensional Similarity Structure Analysis,” Springer-Verlag, New York, 1987. doi:10.1007/978-1-4612-4768-5
[11] M. Kendall and J. D. Gibbons, “Rank Correlation Methods,” Oxford University Press, New York, 1990.