Orange Virtual Assistant: Investigating Large Language Models’ Ability to Understand and Construct Data Mining Workflows

TIVERON, ALESSANDRO

2024/2025

Abstract

This thesis investigates how Large Language Models (LLMs) such as GPT-4 can automate key workflow tasks to improve user engagement with Orange, a data mining application. The research focuses on three tasks: naming workflows, describing workflow functionality, and recommending new widgets to users based on partially completed workflows. Using methods such as prompt engineering and workflow similarity analysis, the study examines LLMs' ability to comprehend, navigate, and enhance the directed acyclic graphs that underlie Orange workflows. The findings show that LLMs can name and describe workflows with high accuracy, especially when given high-quality examples. Advanced models produced encouraging results on the widget recommendation task, but some issues remain, particularly when few existing workflow examples are available. The results show the strengths and weaknesses of current LLMs in helping users construct and understand data mining workflows. By providing insight into the integration of LLMs in a data mining setting, this thesis lays the groundwork for future advances in workflow management and user assistance.

Record

Faculty/Department	Dipartimento di Ingegneria dell'Informazione - DEI
Degree programme	Bioengineering, Master's degree (Laurea Magistrale, D.M. 270/2004)
Academic year	2024
Keywords	Large Language Model; Data mining; Data analysis; Image analysis; Virtual assistant
Supervisor	BARUZZO, GIACOMO
Appears in categories	Master's theses (Lauree magistrali)

Files in this item:

File	Size	Format
Tiveron_Alessandro.pdf (open access)	2.55 MB	Adobe PDF

The text of this website is © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/83311