Comparison of Isolation Forest and Clustering Methods for Outlier Detection
BEJAJ, XHACU
2024/2025
Abstract
In this thesis, we study the relationship between the notion of outlier employed by Isolation Forest and the one employed by the 3-approximation algorithm for the k-center with z outliers problem. The strategies of both algorithms are influenced by the concept of density, which motivates our comparison. We also design a new method that uses Isolation Forest as a preprocessing step to solve the k-center with z outliers problem efficiently. Our experimental analysis shows that, depending on the outlier type, the two methods do not always return similar sets of outliers; nonetheless, the outliers they return have a comparable degree of outlyingness. Furthermore, the proposed method shows substantial efficiency gains, with nearly linear complexity as opposed to the more than quadratic complexity of the classical 3-approximation algorithm.

| File | Size | Format |
|---|---|---|
| Bejaj_Xhacu.pdf (open access) | 761.42 kB | Adobe PDF |
The text of this website is © Università degli Studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/86927