Adopting Large Language Models as Assistants or Substitutes to Developers in Static Analysis

BAGGIO, MATTEO
2023/2024

Abstract

This thesis details the research work, findings and potential future applications produced by graduand Matteo Baggio during his internship at the Zucchetti S.p.A. subsidiary in Padua, Veneto, Italy. The 320-hour internship is part of the Bachelor of Science in Computer Science curriculum, and Matteo Baggio chose to pursue a research-oriented internship. Large Language Models (LLMs) are an emerging technology, with new models continuously being developed for various purposes. However, their capabilities and behaviours are still not fully understood, which makes them unreliable as assistants in complex tasks, particularly for operators with limited knowledge of the technology. To address this issue, various techniques have been developed that steer a model by designing and engineering prompts in specific ways. These techniques substantially improve the reliability and quality of the generated answers. However, their effectiveness also depends on the model being used and on characteristics and settings such as context length, parameter count and temperature. The research presented in this thesis studies and tests the behaviour of LLMs in order to improve their performance and use them in practical application contexts, under the specific constraint of using locally run models. The main practical application analysed is assisting developers with static analysis reports: the LLM examines the reported results, discards false positives (which are often numerous) and provides a contextual solution for the true positives. Zucchetti S.p.A. uses the SonarQube platform internally to perform static analysis, so the solution presented in this thesis revolves around the reports obtained through this platform.
2023
Large Language Model
Static Analysis
Prompt Engineering
Prompt Design
LLM
Files in this item:
File: Adopting Large Language Models as Assistants or Substitutes to Developers in Static Analysis.pdf
Access: open access
Size: 5.46 MB
Format: Adobe PDF

The text of this website is © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/70944