Enhancing Knowledge Graph Construction from Multilingual Technical Reports using Large Language Models
FORMAGGIO, ALBERTO
2023/2024
Abstract
Switchgear maintenance is a complex task that requires expert knowledge to address a wide range of potential issues. Technical reports can assist, but they are often disorganized and difficult to navigate. Knowledge graphs offer a structured alternative, but constructing them from extensive text sources is challenging. Previous research demonstrated that Large Language Models can effectively convert unstructured data into knowledge graphs; however, their performance has been limited to short texts, and they struggle with longer documents. This work aimed to address that gap by developing a two-step extraction pipeline, consisting of entity extraction followed by relationship extraction, applied in a few-shot learning context. Preprocessing techniques were used to improve data quality, and entity alignment methods were applied to reduce graph sparsity. The results show that, with proper tuning, high-quality knowledge graphs can be generated from lengthy technical reports. These findings provide a pathway toward more efficient use of unstructured knowledge in multiple industrial domains, potentially reducing reliance on manual expertise and improving the organization of large-scale knowledge bases.

| File | Size | Format |
|---|---|---|
| Formaggio_Alberto.pdf (restricted access) | 6.01 MB | Adobe PDF |
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/77848