High Dimensional Cluster Analysis Using Path Lengths

PP. 93-125
DOI: 10.4236/jdaip.2018.63007

ABSTRACT

A hierarchical scheme for clustering data is presented which applies to spaces with a high number of dimensions. The data set is first reduced to a smaller set of partitions (multi-dimensional bins). Multiple clustering techniques are used, including spectral clustering; in addition, new techniques are introduced based on the path length between partitions that are connected to one another. A Line-of-Sight algorithm is also developed for clustering. A test bank of 12 data sets with varying properties is used to expose the strengths and weaknesses of each technique. Finally, a robust clustering technique is discussed that reaches a consensus among the multiple approaches, overcoming the weaknesses each shows individually.
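The two ideas named in the abstract, reducing the data to multi-dimensional bins and measuring path lengths between connected partitions, can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes a uniform bin width, treats two occupied bins as connected when their indices differ by at most one in every dimension, and takes the path length to be the hop count found by breadth-first search. The function names (`bin_points`, `neighbors`, `path_lengths`, `cluster_bins`) are hypothetical and chosen only for this illustration.

```python
from collections import deque
from itertools import product


def bin_points(points, width):
    """Map each point to the integer index of its multi-dimensional bin.

    Assumption: a single uniform bin width in every dimension.
    """
    bins = {}
    for p in points:
        key = tuple(int(x // width) for x in p)
        bins.setdefault(key, []).append(p)
    return bins


def neighbors(key):
    """Yield every bin index that touches `key` (faces, edges, corners)."""
    for offset in product((-1, 0, 1), repeat=len(key)):
        if any(offset):  # skip the all-zero offset, i.e. the bin itself
            yield tuple(k + o for k, o in zip(key, offset))


def path_lengths(bins, start):
    """Breadth-first search: hop count from `start` to each reachable occupied bin."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        for nb in neighbors(cur):
            if nb in bins and nb not in dist:
                dist[nb] = dist[cur] + 1
                queue.append(nb)
    return dist


def cluster_bins(bins):
    """Label occupied bins so that bins joined by any finite path share a label."""
    labels = {}
    next_label = 0
    for key in bins:
        if key not in labels:
            for reached in path_lengths(bins, key):
                labels[reached] = next_label
            next_label += 1
    return labels


if __name__ == "__main__":
    # Illustrative data only: two well-separated groups in two dimensions.
    pts = [(0.5, 0.5), (1.5, 1.5), (2.5, 2.5), (7.5, 7.5), (8.5, 7.5)]
    occupied = bin_points(pts, width=1.0)
    print(cluster_bins(occupied))
    print(path_lengths(occupied, (0, 0)))
```

Enumerating all neighbor offsets costs 3^d - 1 lookups per bin, so this sketch is practical only for a handful of dimensions. In a genuinely high-dimensional setting, one alternative would be to compare the occupied bins against each other directly, since their number is bounded by the number of data points.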

Share and Cite:

Mcilhany, K. and Wiggins, S. (2018) High Dimensional Cluster Analysis Using Path Lengths. Journal of Data Analysis and Information Processing, 6, 93-125. doi: 10.4236/jdaip.2018.63007.

Copyright © 2024 by authors and Scientific Research Publishing Inc.


This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.