The increasing complexity of ticket management within enterprise support systems necessitates innovative solutions that enhance operational efficiency and enable intuitive data interaction. This thesis presents the design, development, and evaluation of a natural language-to-SQL system for querying proprietary ticketing databases using open-source Large Language Models (LLMs). The system empowers non-technical users to retrieve actionable insights from structured data by converting natural language queries into executable SQL commands. The solution integrates open-source models, including Llama3.2:1b, Llama2 (7b, 13b), code-llama (7b, 13b), mistral:7b and mistral-nemo:12b. It is enhanced by prompt engineering and schema-aware query generation. The underlying database reflects Xidera’s proprietary ticketing system, containing over 12,000 tickets, 3,400 shipment records, and 350 hardware returns. The system is engineered to work with Italian and English language to deliver accurate and contextually appropriate responses while operating under hardware constraints, addressed through efficient data handling and model selection. A key contribution of this work is a structured evaluation framework that assesses each model’s performance across five dimensions: SQL condition match, output value accuracy, column match accuracy, table match accuracy and row count validation. A benchmark set of hand-crafted questions in dual language English and Italian, based on the company database, was designed to evaluate a broad spectrum of query types, including filters, joins, aggregations, and time-based conditions. Through detailed analysis, common failure patterns such as incorrect JOIN logic, improper timestamp handling, and function misuse were identified and mitigated via prompt refinement and example-driven learning. Comparative evaluation across LLM models revealed that prompt design, schema integration, and system constraints significantly influence the quality of SQL generation. Overall, the results demonstrate that with proper optimization, open-source LLMs can serve as scalable, privacy-preserving alternatives to commercial solutions for enterprise-level ticket management systems.
The increasing complexity of ticket management within enterprise support systems necessitates innovative solutions that enhance operational efficiency and enable intuitive data interaction. This thesis presents the design, development, and evaluation of a natural language-to-SQL system for querying proprietary ticketing databases using open-source Large Language Models (LLMs). The system empowers non-technical users to retrieve actionable insights from structured data by converting natural language queries into executable SQL commands. The solution integrates open-source models, including Llama3.2:1b, Llama2 (7b, 13b), code-llama (7b, 13b), mistral:7b and mistral-nemo:12b. It is enhanced by prompt engineering and schema-aware query generation. The underlying database reflects Xidera’s proprietary ticketing system, containing over 12,000 tickets, 3,400 shipment records, and 350 hardware returns. The system is engineered to work with Italian and English language to deliver accurate and contextually appropriate responses while operating under hardware constraints, addressed through efficient data handling and model selection. A key contribution of this work is a structured evaluation framework that assesses each model’s performance across five dimensions: SQL condition match, output value accuracy, column match accuracy, table match accuracy and row count validation. A benchmark set of hand-crafted questions in dual language English and Italian, based on the company database, was designed to evaluate a broad spectrum of query types, including filters, joins, aggregations, and time-based conditions. Through detailed analysis, common failure patterns such as incorrect JOIN logic, improper timestamp handling, and function misuse were identified and mitigated via prompt refinement and example-driven learning. Comparative evaluation across LLM models revealed that prompt design, schema integration, and system constraints significantly influence the quality of SQL generation. Overall, the results demonstrate that with proper optimization, open-source LLMs can serve as scalable, privacy-preserving alternatives to commercial solutions for enterprise-level ticket management systems.
Development, Optimization, and Comparative Evaluation of Open-Source Language Models for Proprietary SQL-Based Ticketing Systems
POTHUGANTI, SHIREESHA
2024/2025
Abstract
The increasing complexity of ticket management within enterprise support systems necessitates innovative solutions that enhance operational efficiency and enable intuitive data interaction. This thesis presents the design, development, and evaluation of a natural language-to-SQL system for querying proprietary ticketing databases using open-source Large Language Models (LLMs). The system empowers non-technical users to retrieve actionable insights from structured data by converting natural language queries into executable SQL commands. The solution integrates open-source models, including Llama3.2:1b, Llama2 (7b, 13b), code-llama (7b, 13b), mistral:7b and mistral-nemo:12b. It is enhanced by prompt engineering and schema-aware query generation. The underlying database reflects Xidera’s proprietary ticketing system, containing over 12,000 tickets, 3,400 shipment records, and 350 hardware returns. The system is engineered to work with Italian and English language to deliver accurate and contextually appropriate responses while operating under hardware constraints, addressed through efficient data handling and model selection. A key contribution of this work is a structured evaluation framework that assesses each model’s performance across five dimensions: SQL condition match, output value accuracy, column match accuracy, table match accuracy and row count validation. A benchmark set of hand-crafted questions in dual language English and Italian, based on the company database, was designed to evaluate a broad spectrum of query types, including filters, joins, aggregations, and time-based conditions. Through detailed analysis, common failure patterns such as incorrect JOIN logic, improper timestamp handling, and function misuse were identified and mitigated via prompt refinement and example-driven learning. Comparative evaluation across LLM models revealed that prompt design, schema integration, and system constraints significantly influence the quality of SQL generation. Overall, the results demonstrate that with proper optimization, open-source LLMs can serve as scalable, privacy-preserving alternatives to commercial solutions for enterprise-level ticket management systems.| File | Dimensione | Formato | |
|---|---|---|---|
|
Pothuganti_Shireesha.pdf
Accesso riservato
Dimensione
2.43 MB
Formato
Adobe PDF
|
2.43 MB | Adobe PDF |
The text of this website © Università degli studi di Padova. Full Text are published under a non-exclusive license. Metadata are under a CC0 License
https://hdl.handle.net/20.500.12608/89835