Novel dynamic k-modes clustering of categorical and non categorical dataset with optimized genetic algorithm based feature selection


Clustering partitions a dataset into homogeneous groups according to the provided features, with the aim of uncovering structure in unlabelled data. Cluster analysis is an unsupervised learning technique that finds interesting patterns in data objects without class labels. The k-modes clustering algorithm is effective for categorical data because it is easy to implement and can handle large amounts of data. However, because it selects its initial centroids at random, it tends to converge to a locally optimal solution. The main contribution of this paper is to evaluate clustering performance on various datasets with the proposed system. The proposed method uses a genetic-based metaheuristic encircle algorithm to select enriched features, and a novel dynamic k-modes clustering based on Dimensionality Reduced PSO to perform the clustering with better computational time. The encircling prey concept is incorporated to choose the fitness function and to overcome the limitations of the genetic algorithm in feature selection. The paper integrates the k-modes algorithm with the particle swarm optimization (PSO) algorithm to obtain a global optimum solution and to update the initial centroids. Several of the datasets used for evaluation achieved low accuracy in previous work. A comparative analysis against state-of-the-art methods, in terms of performance metrics such as F1 score, accuracy, and normalized mutual information (NMI), shows the proposed approach to be more effective.
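To make the baseline concrete, the following is a minimal sketch of standard k-modes with randomly chosen initial modes, the setting whose sensitivity to initialization the abstract describes. It is not the paper's dynamic k-modes, and it includes neither the PSO-based centroid update nor the genetic feature selection. The function names, the `seed` parameter, and the toy records are assumptions made for illustration only.

```python
import random
from collections import Counter


def matching_dissimilarity(a, b):
    """Count the attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(records):
    """Return the most frequent value of each attribute across the records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


def k_modes(data, k, max_iter=100, seed=0):
    """Plain k-modes: random initial modes, then alternate assign and update."""
    if max_iter < 1:
        raise ValueError("max_iter must be at least 1")
    rng = random.Random(seed)
    modes = rng.sample(data, k)  # random initialization (assumed baseline)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each record joins its nearest mode.
        new_labels = [
            min(range(k), key=lambda j: matching_dissimilarity(rec, modes[j]))
            for rec in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so the modes are too
        labels = new_labels
        # Update step: recompute each mode from its current members.
        for j in range(k):
            members = [rec for rec, lab in zip(data, labels) if lab == j]
            if members:  # keep the old mode if a cluster becomes empty
                modes[j] = column_modes(members)
    cost = sum(
        matching_dissimilarity(rec, modes[lab]) for rec, lab in zip(data, labels)
    )
    return labels, modes, cost


# Hypothetical toy records: (colour, size, shape).
toy_data = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("red", "medium", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "flat"),
    ("green", "large", "square"),
]

# Different seeds can end at different costs, which is the local-optimum issue.
costs_by_seed = {seed: k_modes(toy_data, k=2, seed=seed)[2] for seed in range(5)}
```

Restarting from several seeds and keeping the lowest cost is the simplest workaround for poor initializations. The paper instead couples k-modes with particle swarm optimization, which this sketch does not attempt to reproduce.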
