hh.se Publications
A General Framework for Discovering Multiple Data Groupings
Halmstad University, School of Information Technology.
2018 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Clustering helps users gain insight into their data by discovering hidden structure in an unsupervised way. Unlike classification tasks, which are evaluated against well-defined target labels, clustering is intrinsically subjective: it depends on the interpretation, needs, and interests of the user. In many real-world applications, multiple meaningful clusterings can be hidden in the data, and different users are interested in exploring different perspectives on, and use cases of, the same data. Despite this, most existing clustering techniques attempt to produce only a single clustering of the data, which can be too restrictive. In this thesis, a general method is proposed to discover multiple alternative clusterings of the data and let users select the clustering(s) they are most interested in. To cover a large set of possible clustering solutions, a diverse set of clusterings is first generated from various projections of the data. Similar clusterings are then identified, filtered, and aggregated into one representative clustering each, so that the user only has to explore a small set of non-redundant representative clusterings. We compare the proposed method against others and analyze its advantages and disadvantages on artificial and real-world datasets, as well as on images, which enable a visual assessment of the meaningfulness of the discovered clustering solutions. In addition, the various techniques used within the method are studied and analyzed extensively. Results show that the proposed method is able to discover multiple interesting and meaningful clustering solutions.
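The pipeline outlined in the abstract (generate diverse clusterings from projections, group similar ones, keep one representative per group) can be illustrated with a minimal sketch. The abstract does not name the concrete techniques, so every choice here is an assumption made for illustration only: random feature subsets stand in for the projections, plain k-means for the base clusterer, the Rand index for clustering similarity, a greedy threshold for grouping, and a medoid pick in place of true aggregation. All function names are hypothetical and are not taken from the thesis.

```python
import random
from itertools import combinations

def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [min(range(k),
                      key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
                  for p in points]
        # Recompute each center as the mean of its members (unchanged if empty).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree (same vs. different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

def generate_clusterings(data, k, n_projections, rng):
    """Assumed projection scheme: cluster the data on random feature subsets."""
    dims = len(data[0])
    result = []
    for _ in range(n_projections):
        features = sorted(rng.sample(range(dims), rng.randint(1, dims)))
        projected = [tuple(row[f] for f in features) for row in data]
        result.append(kmeans(projected, k, rng))
    return result

def representatives(clusterings, threshold=0.9):
    """Greedily group similar clusterings, then return one medoid per group."""
    groups = []
    for labels in clusterings:
        for group in groups:
            # Compare against the first member of each existing group.
            if rand_index(labels, group[0]) >= threshold:
                group.append(labels)
                break
        else:
            groups.append([labels])
    # The medoid is the member with the highest total similarity to its group.
    return [max(g, key=lambda m: sum(rand_index(m, o) for o in g)) for g in groups]

if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data: feature 0 and feature 1 each suggest a different two-way grouping.
    data = [(x + rng.gauss(0, 0.3), y + rng.gauss(0, 0.3))
            for x in (0, 10) for y in (0, 10) for _ in range(10)]
    reps = representatives(generate_clusterings(data, k=2, n_projections=12, rng=rng))
    print(len(reps), "representative clusterings")
```

On the toy data, a projection onto the first feature alone and one onto the second alone lead to two different but equally valid two-way groupings, which is the situation the abstract describes. The similarity threshold controls how aggressively redundant clusterings are merged; a real implementation would likely need a chance-corrected similarity measure and a proper consensus step rather than a medoid.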

Place, publisher, year, edition, pages
2018. p. 114
Keywords [en]
machine learning, unsupervised learning, data mining, clustering, multiple-clusterings, clustering algorithm
National Category
Engineering and Technology Computer Systems
Identifiers
URN: urn:nbn:se:hh:diva-38047
OAI: oai:DiVA.org:hh-38047
DiVA, id: diva2:1250646
Educational program
Master's Programme in Embedded and Intelligent Systems, 120 credits
Supervisors
Examiners
Available from: 2018-09-25. Created: 2018-09-24. Last updated: 2018-09-25. Bibliographically approved.

Open Access in DiVA

fulltext (6371 kB), 77 downloads
File information
File name: FULLTEXT02.pdf
File size: 6371 kB
Checksum SHA-512:
0f01522caade05e6b9b032f63261efd86948e174e512a3d7aa1f2bf9ed875763a60e77ccce8225c0fc603affdc8eb5befbf300ee3c0277d628547a2ea190747a
Type: fulltext
Mimetype: application/pdf
