Penetration Testing and Large Language Models: A Study on LLM-based PT with Local and Remote Models
MIELE, RICCARDO
2024/2025
Abstract
In recent years, Artificial Intelligence (AI) has rapidly evolved from being integrated only into smart assistant devices (e.g., Siri or Alexa) to becoming widespread everywhere (e.g., search engines and smartphones). This diffusion is due to the advent of Large Language Models (LLMs), which have also started to be studied in scientific fields such as Computer Engineering and Cybersecurity. This thesis explores the use of LLMs in Penetration Testing (PT), a crucial activity for evaluating the security of a computer system or network in order to improve it. In particular, it examines PentestGPT, an open-source tool built on LLMs and designed to help penetration testers during their work by suggesting next steps, explaining results, and keeping a history of what has been done. The LLMs are accessed through Application Programming Interface (API) calls, which allow both remote and local models to be run. Motivated by the goal of understanding the strengths and limitations of LLMs in PT, two studies were performed. The first evaluated the feasibility of using small local models as a replacement for remote ones, considering that they are much smaller (~10B vs. 1.5T parameters). The second analyzed the performance of remote models against Virtual Machines (VMs) from Hack The Box (HTB), a learning platform for penetration testers. The results of the first experiment showed that small local models can be integrated into PentestGPT, but they introduce several limitations (e.g., lack of coherence and of safety alignment) that require specific improvements. In contrast, the second analysis demonstrated that, while remote LLMs can provide meaningful guidance, they are not fully effective, since human intervention is still necessary to achieve successful outcomes.
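As a minimal sketch of the remote/local API setup described above: many local model servers (e.g., Ollama) expose an OpenAI-compatible endpoint, so the same client can target either a hosted model or a locally run one by changing the base URL. The endpoint, model names, and prompt below are illustrative assumptions, not PentestGPT's actual configuration.

```python
# Sketch: one OpenAI-compatible client, two backends (remote vs. local).
# Model names, prompt, and the local endpoint are illustrative assumptions.
from openai import OpenAI

# Remote model: official hosted endpoint (requires an API key).
remote = OpenAI(api_key="sk-...")  # key elided

# Local model: Ollama serves an OpenAI-compatible API on localhost by default.
local = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

for client, model in [(remote, "gpt-4o"), (local, "llama3:8b")]:
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user",
                   "content": "An nmap scan shows port 80 open. "
                              "Suggest a next penetration-testing step."}],
    )
    print(f"--- {model} ---")
    print(reply.choices[0].message.content)
```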
| File | Size | Format |
|---|---|---|
| Miele_Riccardo.pdf (restricted access) | 1.3 MB | Adobe PDF |
https://hdl.handle.net/20.500.12608/91817