Institutional Repository
  •   University of Economics & Technology Repository
  • Akademik Arşiv / Institutional Repository
  • Mühendislik Fakültesi / Faculty of Engineering
  • Bilgisayar Mühendisliği Bölümü / Department of Computer Engineering

High quality clustering of big data and solving empty-clustering problem with an evolutionary hybrid algorithm

Date
2015
Author
Karimov, Jeyhun
Özbayoğlu, Ahmet Murat
Abstract
Achieving high-quality clustering is one of the best-known problems in data mining. k-means is by far the most commonly used clustering algorithm. It converges fairly quickly, but it is not guaranteed to reach a good solution: the clustering quality depends heavily on the choice of initial centroids. Moreover, as the number of clusters increases, k-means starts to suffer from "empty clustering", in which some clusters end up with no points assigned to them. The motivation of this study is two-fold: we aim not only to improve k-means clustering quality, but also to avoid being affected by the empty-cluster issue. For this purpose, we developed a hybrid model, H(EC)S-2, Hybrid Evolutionary Clustering with Empty Clustering Solution. First, it selects representative points to eliminate the empty-clustering problem. Then, the hybrid algorithm uses only these points during centroid selection. The proposed model combines a Fireworks- and Cuckoo-search-based evolutionary algorithm with some centroid-calculation heuristics. The model is implemented as a Hadoop MapReduce algorithm to achieve scalability on Big Data clustering problems. The advantages of the developed model are particularly attractive as the data volume, the dimensionality, and the number of clusters increase. The results indicate that the proposed model achieves a considerable improvement in clustering quality.
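To make the empty-clustering issue concrete, the following is a minimal sketch of a single k-means assignment step in plain Python. It is not the authors' H(EC)S-2 model. The toy data and the helper names `assign` and `count_empty` are assumptions made purely for illustration. It shows that an initial centroid placed far from the data can end up with no points, while centroids drawn from distinct data points each attract at least themselves in this first assignment step.

```python
import math

def assign(points, centroids):
    """Assign each point to its nearest centroid; return one point list per centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda j: math.dist(p, centroids[j]))
        clusters[nearest].append(p)
    return clusters

def count_empty(clusters):
    """Count clusters that received no points."""
    return sum(1 for c in clusters if not c)

# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

# Arbitrary initial centroids: the third lies far from every data point.
arbitrary = [(0.0, 0.0), (10.0, 10.0), (100.0, 100.0)]

# Initial centroids restricted to (distinct) data points.
from_data = [points[0], points[3], points[5]]

print(count_empty(assign(points, arbitrary)))  # one cluster is empty
print(count_empty(assign(points, from_data)))  # no cluster is empty
```

Restricting the initial centroids to data points only guards the first assignment step; clusters can still become empty after later centroid updates, which is why a full method needs more than this initialization trick.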
URI
https://ieeexplore.ieee.org/document/7363909
http://hdl.handle.net/20.500.11851/2004
Collections
  • Bilgisayar Mühendisliği Bölümü / Department of Computer Engineering

DSpace software copyright © 2002-2016  DuraSpace
Creative Commons License
Institutional Repository by TOBB ETU Institutional Repository is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.

Customized and installed for TOBB ETU by Devinim Yazılım Eğitim Danışmanlık.