Streaming algorithms for center-based clustering in general metrics
BADIN, LUCA
2021/2022
Abstract
Center-based clustering is an important yet computationally difficult primitive in unsupervised learning and data analysis. We focus on the k-median clustering problem in general metric spaces, in which one seeks a set of k centers that minimizes the sum of the distances from each point in the dataset to its closest center. In this thesis, we present and analyze efficient techniques for the k-median clustering problem in the streaming setting, where the dataset is presented one point at a time and is not accessible in its entirety, by leveraging the dimensionality of the dataset’s underlying metric space.
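To make the objective described in the abstract concrete, the following is a minimal sketch of the k-median cost of a given set of centers. It is not taken from the thesis: the function name `kmedian_cost`, its parameters, and the toy data are assumptions chosen for illustration, and the distance function is supplied by the caller so that any metric can be plugged in.

```python
from typing import Callable, Sequence, TypeVar

P = TypeVar("P")  # point type; any object the metric can compare


def kmedian_cost(
    points: Sequence[P],
    centers: Sequence[P],
    dist: Callable[[P, P], float],
) -> float:
    """Sum, over all points, of the distance to the closest center.

    Hypothetical helper for illustration; `dist` is assumed to be a metric.
    """
    if not centers:
        raise ValueError("at least one center is required")
    return sum(min(dist(p, c) for c in centers) for p in points)


# Toy example on the real line with the absolute-difference metric.
points = [0.0, 1.0, 2.0, 10.0, 11.0]
centers = [1.0, 10.0]
cost = kmedian_cost(points, centers, lambda a, b: abs(a - b))
```

This sketch only evaluates a candidate solution with the whole dataset in memory; it does not search for the best k centers and does not reflect the streaming constraints studied in the thesis.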

File: Badin_Luca.pdf (open access, 513.43 kB, Adobe PDF)
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/40246