As IT infrastructures grow more complex and cybersecurity threats keep increasing, demand is growing for advanced tools that can detect cyber-attacks automatically. This work presents the development and evaluation of an Intrusion Detection System based on machine learning models that identify anomalous activity within a computer network. Starting from the analysis of a large public dataset containing several types of simulated attacks, a pre-processing phase cleaned the data and made it more usable. Supervised learning models were then trained and compared: penalized regression models, classification trees, and ensemble techniques. The results show that tree-based models, particularly ensemble techniques, have excellent discriminative power both for classifying malicious traffic and for identifying different categories of attack. An analysis of variable importance identified the most informative features, indicating which factors have the greatest impact on classification and deserve attention in future developments. The developed prototype demonstrates the potential of machine learning techniques in cybersecurity and is a promising basis for future enhancements aimed at real-time analysis of network flows on network devices.
With the increasing complexity of IT infrastructures and the constant rise in security threats, the development of advanced tools for the automatic detection of cyber-attacks is increasingly necessary. This thesis presents the design and evaluation of a threat detection system based on machine learning techniques, capable of identifying anomalous activity within network traffic. Starting from the analysis of a large public dataset containing several types of attack in realistic network scenarios, a pre-processing phase was carried out to clean and prepare the data. Several supervised learning models were then trained and compared, including penalized regression models, classification trees, and ensemble methods, with the aim of identifying the most effective techniques for detecting malicious traffic. The results show that tree-based techniques, ensembles in particular, provide excellent discriminative ability both in classifying malicious traffic and in identifying different categories of attack. The developed prototype demonstrates the potential of machine learning solutions in cyber defense, with prospects for updates and integration into network devices for real-time analysis of network flows.
Machine learning per il rilevamento di minacce in una rete informatica (Machine learning for threat detection in a computer network)
SENYUVA, HURCAN ANDREI
2024/2025
| File | Size | Format |
|---|---|---|
| Senyuva_HurcanAndrei.pdf (open access) | 2.76 MB | Adobe PDF |
The text of this website is © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.
https://hdl.handle.net/20.500.12608/92977