Center-based clustering is an important yet computationally difficult primitive in the realm of unsupervised learning and data analysis. We specifically focus on the k-median clustering problem in general metric spaces, in which one seeks to find a set of k centers, so to minimise the sum of distances from each point in the dataset to its closest center. In this thesis, we present and analyze efficient techniques to deal with the k-median clustering problem in the streaming setting – where the dataset is presented one point at a time and not accessible in its entirety – by leveraging the dimensionality of the dataset’s underlying metric space.
Center-based clustering is an important yet computationally difficult primitive in the realm of unsupervised learning and data analysis. We specifically focus on the k-median clustering problem in general metric spaces, in which one seeks to find a set of k centers, so to minimise the sum of distances from each point in the dataset to its closest center. In this thesis, we present and analyze efficient techniques to deal with the k-median clustering problem in the streaming setting – where the dataset is presented one point at a time and not accessible in its entirety – by leveraging the dimensionality of the dataset’s underlying metric space.
Streaming algorithms for center-based clustering in general metrics
BADIN, LUCA
2021/2022
Abstract
Center-based clustering is an important yet computationally difficult primitive in the realm of unsupervised learning and data analysis. We specifically focus on the k-median clustering problem in general metric spaces, in which one seeks to find a set of k centers, so to minimise the sum of distances from each point in the dataset to its closest center. In this thesis, we present and analyze efficient techniques to deal with the k-median clustering problem in the streaming setting – where the dataset is presented one point at a time and not accessible in its entirety – by leveraging the dimensionality of the dataset’s underlying metric space.File | Dimensione | Formato | |
---|---|---|---|
Badin_Luca.pdf
accesso aperto
Dimensione
513.43 kB
Formato
Adobe PDF
|
513.43 kB | Adobe PDF | Visualizza/Apri |
The text of this website © Università degli studi di Padova. Full Text are published under a non-exclusive license. Metadata are under a CC0 License
https://hdl.handle.net/20.500.12608/40246