A K-medoids based clustering scheme with an application to document clustering

dc.contributor.authorOnan A.
dc.date.accessioned2024-07-22T08:10:29Z
dc.date.available2024-07-22T08:10:29Z
dc.date.issued2017
dc.description.abstractClustering is an important unsupervised data analysis technique that divides data objects into clusters based on similarity. Clustering has been studied and applied in many different fields, including pattern recognition, data mining, decision science and statistics. Clustering algorithms can be broadly classified into hierarchical and partitional approaches. Partitioning around medoids (PAM) is a partitional clustering algorithm that is less sensitive to outliers but strongly affected by poor initialization of medoids. In this paper, we augment PAM with a randomized seeding technique to overcome the problem of poor medoid initialization. The proposed approach (PAM++) is compared with other partitional clustering algorithms, such as K-means and K-means++, on text document clustering benchmarks and evaluated in terms of F-measure. The experimental results indicate that randomized seeding can improve the performance of the PAM algorithm on text document clustering. © 2017 IEEE.
dc.identifier.DOI-ID10.1109/UBMK.2017.8093409
dc.identifier.urihttp://akademikarsiv.cbu.edu.tr:4000/handle/123456789/15262
dc.language.isoEnglish
dc.publisherInstitute of Electrical and Electronics Engineers Inc.
dc.subjectCluster analysis
dc.subjectData mining
dc.subjectInformation retrieval
dc.subjectPattern recognition
dc.subjectPulse amplitude modulation
dc.subjectText processing
dc.subjectClustering
dc.subjectDocument Clustering
dc.subjectPartitional clustering
dc.subjectPartitional clustering algorithm
dc.subjectPartitioning around medoids
dc.subjectSeeding techniques
dc.subjectText Document Clustering
dc.subjectText mining
dc.subjectClustering algorithms
dc.titleA K-medoids based clustering scheme with an application to document clustering
dc.typeConference paper
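
The abstract describes seeding PAM with a randomized technique before the usual medoid refinement. The sketch below is only an illustration of that idea, not the paper's implementation: it assumes the seeding follows the K-means++ rule (each new medoid is sampled with probability proportional to the squared distance to the nearest medoid already chosen), which the abstract does not spell out, and it uses a plain greedy swap loop for the refinement. The function names `seed_medoids`, `total_cost` and `pam_swap` are hypothetical, as is the toy data.

```python
import random


def seed_medoids(dist, k, rng):
    """Choose k medoid indices by K-means++-style sampling (assumed rule).

    dist is a full n x n dissimilarity matrix given as a list of lists.
    """
    n = len(dist)
    medoids = [rng.randrange(n)]  # first medoid: uniform at random
    while len(medoids) < k:
        # Squared distance from every point to its nearest chosen medoid.
        weights = [min(dist[i][m] for m in medoids) ** 2 for i in range(n)]
        if sum(weights) == 0:
            # Every point coincides with a medoid; fall back to any unused index.
            unused = [i for i in range(n) if i not in medoids]
            medoids.append(rng.choice(unused))
        else:
            medoids.append(rng.choices(range(n), weights=weights, k=1)[0])
    return medoids


def total_cost(dist, medoids):
    """Sum of each point's dissimilarity to its nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))


def pam_swap(dist, medoids):
    """Greedy swap phase: accept any medoid/non-medoid swap that lowers the cost."""
    medoids = list(medoids)
    best = total_cost(dist, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(len(medoids)):
            for cand in range(len(dist)):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(dist, trial)
                if cost < best:
                    medoids, best, improved = trial, cost, True
    return medoids, best


# Toy data (hypothetical): two well-separated groups on a line.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]

rng = random.Random(0)
seeds = seed_medoids(dist, 2, rng)
medoids, cost = pam_swap(dist, seeds)
print(sorted(medoids), cost)  # [1, 4] 4
```

On this toy input the swap phase reaches the same medoids from any pair of seeds, so it does not show the benefit of the seeding itself; on document collections, where a dissimilarity such as cosine distance over term vectors would replace the absolute difference used here, the starting medoids can change which local optimum the swap phase settles in.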
