Document Type: Original Research Paper

Author

Dr. Sivanthi Aditanar College of Engineering, Tiruchendur, Tamil Nadu, India

Abstract

Data clustering is the process of partitioning a set of data objects into meaningful clusters or groups. Because clustering algorithms are widely used in many fields, research continues into finding more accurate and efficient clustering algorithms. K-means is simple and easy to implement, but it is sensitive to the initialization of the cluster centers and can therefore become trapped in a local optimum. In this paper, a new hybrid data clustering approach, named K-MKH, is proposed; it combines a modified krill herd algorithm with K-means. K-MKH exploits the quick convergence of K-means, the efficient global exploration of krill herd, and the random behaviour of the Levy flight method. The krill herd algorithm is modified by incorporating Levy flight into it to improve global exploration. The proposed algorithm is tested on artificial and real-life datasets. The simulation results are compared with other methods such as K-means, Particle Swarm Optimization (PSO), the original Krill Herd (KH) algorithm, and a hybrid of K-means and KH. The proposed algorithm is also compared with other evolutionary algorithms such as hybrid modified cohort intelligence and K-means (K-MCI), Simulated Annealing (SA), Ant Colony Optimization (ACO), Genetic Algorithm (GA), Tabu Search (TS), Honey Bee Mating Optimization (HBMO), and K-means++. The comparison shows that the proposed algorithm improves the clustering results and converges quickly.
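
To make the Levy flight component concrete, the following is a minimal sketch of how a Levy-distributed step can be drawn and used to perturb a candidate position such as a flattened set of cluster centers. It uses Mantegna's algorithm, a common way to generate such steps; the abstract does not state which sampling scheme K-MKH uses, so this is an assumption rather than the authors' method. The function names, the exponent `beta = 1.5`, and the `step_scale` factor are all illustrative choices.

```python
import math
import random
from typing import List, Optional


def mantegna_sigma(beta: float) -> float:
    """Standard deviation of the numerator Gaussian in Mantegna's algorithm."""
    num = math.gamma(1.0 + beta) * math.sin(math.pi * beta / 2.0)
    den = math.gamma((1.0 + beta) / 2.0) * beta * 2.0 ** ((beta - 1.0) / 2.0)
    return (num / den) ** (1.0 / beta)


def levy_step(beta: float = 1.5, rng: Optional[random.Random] = None) -> float:
    """Draw one heavy-tailed step: u / |v|^(1/beta), with u and v Gaussian."""
    rng = rng or random
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero (extremely unlikely)
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1.0 / beta)


def levy_perturb(
    position: List[float],
    step_scale: float = 0.01,  # assumed scaling factor, not from the paper
    beta: float = 1.5,         # assumed Levy exponent, not from the paper
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return a copy of `position` with an independent Levy step added per coordinate."""
    return [x + step_scale * levy_step(beta, rng) for x in position]


if __name__ == "__main__":
    rng = random.Random(42)
    centers = [0.2, 0.8, 0.5, 0.1]  # hypothetical flattened cluster centers
    print(levy_perturb(centers, rng=rng))
```

Most steps drawn this way are small, but the heavy tail occasionally produces a long jump, which is the property that helps a search move away from a local optimum.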
