Adapting a general-purpose multi-agent LLM-powered chatbot to the mobility domain
FANTIN, LUCA
2024/2025
Abstract
Performing data analysis in the mobility domain requires large volumes of data from diverse sources, which must then be combined into a single data store, such as a relational database or a document index. Furthermore, querying databases with SQL and organising and visualising the results can be quite complex for most users. Large Language Models (LLMs) can help with these challenges thanks to their broad knowledge bases and prompt-based interaction. However, they cannot handle the more practical tasks by themselves, such as executing SQL queries or index searches. This thesis describes a chatbot that offers a user-friendly tool to interact with these datasets and support decision-making. It is based on an agent architecture, which expands the capabilities of the core LLM by letting it interact with a series of tools that execute the more practical tasks and organise the steps needed to complete its question-answering task. This work, carried out in collaboration with Motion Analytica and beanTeach, adapts two different versions of a general-purpose architecture to two mobility domains: public transport service datasets based on an open-source standard, and proprietary data on tourism in the Tuscany region. It thus provides an example of how the developed framework and application can be adapted to other domains. We also present the testing procedure devised to assess the accuracy of the SQL data retrieval. For both domains, a dataset of test questions was created from existing examples or question templates, and a 'correct' reference query is associated with each of them. These questions are asked several times; for each run, the generated query, the retrieved data and the natural-language response are stored. This procedure yields a dataset for evaluating the chatbot's performance, especially the consistency of its answers and the correctness of the generated queries.
The comparison of the testing results for the two architectures and the two domains offers insight into the importance of properly tuning general-purpose LLMs to the specific data and application, as well as of accounting for the structure and complexity of the data.
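The evaluation procedure described in the abstract can be sketched as a small harness. This is an illustrative reconstruction, not the thesis's actual code: the function names (`ask_chatbot`, `run_sql`), the number of repetitions, and the execution-match notion of correctness are all assumptions made for the sketch.

```python
# Illustrative sketch of the repeated-query evaluation loop from the abstract.
# ask_chatbot and run_sql are hypothetical stand-ins for the chatbot pipeline
# and the database connection; correctness is checked by comparing the rows
# returned by the generated query with those of the reference query.

N_RUNS = 5  # each test question is asked several times (assumed value)

def evaluate(test_set, ask_chatbot, run_sql, n_runs=N_RUNS):
    """For every (question, reference_query) pair, store the generated query,
    the retrieved data and the natural-language answer of each run, then score
    correctness against the reference and consistency across runs."""
    records = []
    for question, reference_query in test_set:
        expected_rows = run_sql(reference_query)
        runs = []
        for _ in range(n_runs):
            generated_query, answer = ask_chatbot(question)
            rows = run_sql(generated_query)
            runs.append({
                "query": generated_query,
                "rows": rows,
                "answer": answer,
                "correct": rows == expected_rows,  # execution-accuracy check
            })
        records.append({
            "question": question,
            "runs": runs,
            "accuracy": sum(r["correct"] for r in runs) / n_runs,
            # consistency: share of runs agreeing with the most common result
            "consistency": max(
                sum(1 for r in runs if r["rows"] == s["rows"]) for s in runs
            ) / n_runs,
        })
    return records
```

The resulting records form exactly the kind of dataset the abstract mentions: per-question accuracy of the generated SQL and consistency of the retrieved data across repeated runs.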
The text of this website © Università degli studi di Padova. Full Text are published under a non-exclusive license. Metadata are under a CC0 License
https://hdl.handle.net/20.500.12608/94121