Terminological Study of Computational Linguistics and Natural Language Processing Domain: a Contrastive Analysis.
FERRON, SELENE
2022/2023
Abstract
Technological progress and the development of machine learning have played a crucial role in natural language processing (NLP) and computational linguistics (CL), both of which have advanced considerably in the last few years. Systems such as Google Translate, Alexa, and DeepL are increasingly able to both understand and produce human language, performing linguistic tasks such as speech recognition and machine translation. The relationship between CL and NLP is undeniable; from a terminological perspective, however, the two domains are not clearly distinguished, because their terminology overlaps and is used interchangeably in the relevant literature. This thesis investigates the difference between the two domains and delimits their boundaries. The methodology consists of studying the terminology of the CL and NLP domains through the creation of four corpora (one English and one Italian corpus per domain), terminology extraction with the Sketch Engine tool, the creation of two concept systems and four lexical networks, and finally the compilation of terminological records in the FAIRterm application. This terminological work made it possible to affirm a real difference between the CL and NLP domains, one that emerged throughout the study: first in the literature cited in the first chapter, then in the term extraction, then in the concept systems, and finally in the compilation of terminological records in the FAIRterm application.

File | Size | Format
---|---|---
Ferron_Selene.pdf (restricted access) | 7.58 MB | Adobe PDF
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/59924