dc.contributor.author: Karimov, Jeyhun
dc.contributor.author: Özbayoğlu, Ahmet Murat
dc.contributor.author: Dogdu, Erdogan
dc.date.accessioned: 2019-07-10T14:42:45Z
dc.date.available: 2019-07-10T14:42:45Z
dc.date.issued: 2015
dc.identifier.citation: Karimov, J., Ozbayoglu, M., & Dogdu, E. (2015, June). K-means performance improvements with centroid calculation heuristics both for serial and parallel environments. In 2015 IEEE International Congress on Big Data (pp. 444-451). IEEE. [en_US]
dc.identifier.isbn: 978-1-4673-7278-7
dc.identifier.issn: 2379-7703
dc.identifier.uri: https://ieeexplore.ieee.org/document/7207256
dc.identifier.uri: http://hdl.handle.net/20.500.11851/2003
dc.description: 4th IEEE International Congress on Big Data, BigData Congress (2015 : New York City; United States)
dc.description.abstract: k-means is the most widely used clustering algorithm due to its fairly straightforward implementations in various problems. Meanwhile, when the number of clusters increases, the number of iterations also tends to increase slightly. However, there are still opportunities for improvement, as some studies in the literature indicate. In this study, improved implementations of the k-means algorithm with a centroid calculation heuristic, which results in a performance improvement over traditional k-means, are proposed. Two different versions of the algorithm are configured for different data sizes, one for small data and the other for big data implementations. Both the serial and MapReduce parallel implementations of the proposed algorithm are tested and analyzed using two different data sets with various numbers of clusters. The results show that the big data implementation model outperforms the other compared methods after a certain threshold level, and the small data implementation performs better with increasing k value. [en_US]
dc.description.sponsorship: IEEE Computer Society Technical Committee on Services Computing (TC-SVC), Services Society (SS)
dc.language.iso: eng [en_US]
dc.publisher: IEEE [en_US]
dc.relation.ispartof: Proceedings - 2015 IEEE International Congress on Big Data, BigData Congress
dc.rights: info:eu-repo/semantics/closedAccess
dc.subject: k-means [en_US]
dc.subject: Big Data [en_US]
dc.subject: Hadoop [en_US]
dc.subject: MapReduce [en_US]
dc.subject: Clustering [en_US]
dc.subject: parallel algorithms [en_US]
dc.subject: data mining [en_US]
dc.subject: unsupervised learning [en_US]
dc.title: k-means Performance Improvements with Centroid Calculation Heuristics both for Serial and Parallel environments [en_US]
dc.type: conferenceObject [en_US]
dc.contributor.department: TOBB ETU, Faculty of Engineering, Department of Computer Engineering [en_US]
dc.contributor.department: TOBB ETÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü [tr_TR]
dc.identifier.startpage: 444
dc.identifier.endpage: 451
dc.contributor.orcid: https://orcid.org/0000-0001-7998-5735
dc.identifier.wos: WOS:000380443700062
dc.identifier.scopus: 2-s2.0-84959484303
dc.contributor.tobbetuauthor: Özbayoğlu, Ahmet Murat
dc.contributor.YOKid: 142991
dc.identifier.doi: 10.1109/BigDataCongress.2015.72
dc.contributor.wosresearcherID: H-2328-2011
dc.contributor.ScopusAuthorID: 6505999525
dc.relation.publicationcategory: Konferans Öğesi - Uluslararası - Kurum Öğretim Elemanı [tr_TR]