Online clustering with single-pass topology based fuzzy clustering algorithm
Department of Electrical and Computer Engineering
Master of Science
Dhawan, Atam P.
Manikopoulos, Constantine N.
Online clustering is of significant interest for real-time data analysis. Generic offline clustering methods such as K-Means, C-Means and others are computationally expensive: their computational burden increases non-linearly with the size of the data set. In addition, these methods usually require a good deal of supervised knowledge and yield a non-unique solution. For real-time data analysis, there is an important tradeoff between accuracy and computational efficiency. An unsupervised one-pass clustering method that efficiently adapts to data distribution and evaluation is proposed. This method, Topology-Based Fuzzy Clustering (TFC), uses the topology of the data to discover clusters. TFC uses the Growing Neural Gas (GNG) method of creating linked sub-clusters and extends GNG by assigning a fuzzy membership to the sub-clusters, using the link structure to form clusters, and influencing the learning nodes at each sub-cluster. This also gives a fuzzy estimate of the data distribution within each cluster. The computational burden of TFC is proportional to the size of the initial data set and increases linearly with the addition of new data.
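The general idea described above (linked sub-clusters grown in one pass, fuzzy memberships over the sub-clusters, and clusters read off the link structure) can be illustrated with a minimal sketch. This is not the TFC algorithm of the thesis, whose details are not given in this abstract. Every name and parameter here (`one_pass`, `link_clusters`, `new_node_dist`, `link_dist`, `lr`) is hypothetical, the membership formula is borrowed from standard fuzzy C-Means, and GNG's edge aging and error-driven node insertion are replaced by simple distance thresholds.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def fuzzy_memberships(x, nodes, m=2.0, eps=1e-12):
    """Fuzzy C-Means style membership of x in each node (assumed form, not from the thesis)."""
    inv = [(dist(x, c) + eps) ** (-2.0 / (m - 1.0)) for c in nodes]
    total = sum(inv)
    return [v / total for v in inv]

def one_pass(data, new_node_dist=1.0, link_dist=3.0, lr=0.1):
    """Visit each sample once: grow nodes, link nearby nodes, move nodes by membership.

    Thresholds are hypothetical stand-ins for GNG's insertion and edge-aging rules.
    """
    nodes = [[float(v) for v in data[0]]]
    edges = set()
    for x in data[1:]:
        d = [dist(x, c) for c in nodes]
        order = sorted(range(len(nodes)), key=d.__getitem__)
        win = order[0]
        if d[win] > new_node_dist:
            # Sample is far from every node: start a new sub-cluster,
            # linked to the nearest node only if that node is close enough.
            nodes.append([float(v) for v in x])
            if d[win] <= link_dist:
                edges.add((win, len(nodes) - 1))
            continue
        if len(order) > 1 and d[order[1]] <= link_dist:
            # Link the two nearest nodes, as GNG does for winner and runner-up.
            a, b = sorted((win, order[1]))
            edges.add((a, b))
        # Every node moves toward the sample, weighted by its fuzzy membership.
        u = fuzzy_memberships(x, nodes)
        for i, c in enumerate(nodes):
            nodes[i] = [ci + lr * u[i] * (xi - ci) for ci, xi in zip(c, x)]
    return nodes, edges

def link_clusters(n_nodes, edges):
    """Label nodes by connected component of the link graph (union-find)."""
    parent = list(range(n_nodes))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for a, b in edges:
        parent[find(a)] = find(b)
    return [find(i) for i in range(n_nodes)]

# Toy stream with two well-separated groups (illustrative data only).
data = [(0, 0), (0.2, 0.1), (10, 10), (10.1, 9.9),
        (0.1, 0.3), (9.8, 10.2), (1.5, 0), (11.5, 10)]
nodes, edges = one_pass(data)
labels = link_clusters(len(nodes), edges)
```

Because each sample is compared against the current nodes exactly once, the work per new sample depends on the number of nodes rather than on the number of samples already seen, which is the kind of linear growth the abstract refers to.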
As TFC is based on GNG, it is an unsupervised algorithm. A supervised learning method is proposed that can be used in conjunction with TFC to increase its accuracy with minimal computational burden. This adaptive algorithm is called Adaptive Topology-Based Fuzzy Clustering (ATFC). In this study, the performance of ATFC and TFC is also evaluated against standard datasets.
njit-etd2006-006 (174 pages ~ 12,011 KB pdf)
Created September 8, 2008