Tags:Affinity Propagation, Algorithms and Models, Data Stream Clustering and Unsupervised Learning
Abstract:
Clustering an evolving data stream is complex because the data objects are not static. Identifying clusters in real time is a challenging task, since the characteristics of the data objects themselves evolve. If these evolving patterns are labelled and defined, it may be possible to identify changes in context, and repeated context changes can reveal trends in the evolving data stream. To do so, it is necessary to understand the occurrence, re-occurrence, and diminishing of these clusters over time. The Improved Affinity Propagation (IMAP) clustering algorithm identifies these changes in cluster characteristics over time. Continuous observation and registration of new clusters makes it possible to identify trends in the evolving data stream. Analyzing these trends may help in understanding the patterns or changes in context, so that future prediction becomes possible. The proposed IMAP algorithm is more efficient than approaches that rely on pre-estimating the number of clusters, and it probes the evolving data stream at the right time without loss of information. The algorithm is also robust in identifying outliers. Experiments on real data sets are presented to demonstrate the benefits of the trend analysis method.
Identifying Trends Using Improved Affinity Propagation (IMAP) Clustering Algorithm on Evolving Data Stream