From Translation Memory to Terminology Database: An Integrative Approach to Optimizing Neural Machine Translation and Terminological Consistency in Public Administration.
DI CRISTO, FEDERICA
2025/2026
Abstract
In public administration, especially in bilingual settings such as the Autonomous Province of Bolzano, consistent terminology management is essential to ensure translation quality and legal certainty. Although neural machine translation has become a valuable resource, its use in institutional domains poses complex challenges for controlling and standardizing technical language. In practice, however, the relevance of terminological consistency is often underestimated, as it is mistakenly assumed that this work is carried out implicitly by translators during the translation process. This overlooks the fact that terminology science is an independent discipline whose specific rules are not necessarily part of translator training. The problem is exacerbated by technological change: because relying solely on translation memories is often not enough to guide neural engines terminologically, gaps in consistency arise. As a result, raw machine translation output falls short of the required quality, which in turn demands increased post-editing to restore the necessary technical precision. Against this backdrop, this thesis examines the role of well-maintained terminology databases in conjunction with neural engines, with a particular focus on the resulting post-editing effort. An integrative approach is pursued, shifting the focus from the purely reactive use of translation memories to the proactive integration of structured terminology databases. The aim is to determine the concrete added value that such systematic terminology implementation can generate for the efficiency and quality of the interlingual translation and post-editing process. To answer the research question empirically, a case study was conducted in collaboration with the Translation Office of the South Tyrolean Provincial Council, focusing on the influence that systematic terminology integration has on the quality of machine translations.
Technical texts from the field of housing construction served as a representative object of investigation. The corpus consisted of four texts drawn from three different text types, so as to cover a range of technical-language requirements. These documents were machine translated and then subjected to full post-editing. This setup made it possible to analyze in detail the specific effects of the integrated terminology resources on the translation process and to determine the resulting added value for the post-editing effort. The work proceeded in several phases. First, computer-assisted term extraction was performed on an existing translation memory using Sketch Engine, followed by collaborative validation and the creation of terminological entries in MultiTerm, in order to build an operational term bank that meets institutional requirements. The core experiment consisted of translating the technical texts in Trados Studio using two neural systems in different configurations: with ModernMT, the influence of the translation memory as a contextual reference was analyzed, while with DeepL, the effects of direct glossary integration were examined against the baseline translation. Finally, all versions were post-edited to evaluate whether the editing effort was actually reduced in the scenarios with integrated terminology.
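The comparison described above ultimately hinges on whether the validated target terms actually surface in the raw MT output. As a minimal illustrative sketch of such a check (the glossary entries and example sentences below are invented for illustration and are not taken from the study's actual term bank), one could compute a simple term hit rate per translated segment:

```python
# Minimal sketch: measuring how consistently an MT output uses the
# prescribed target equivalents from a term bank (DE -> IT).
# All terms and sentences here are illustrative, not from the study.

GLOSSARY = {
    "Wohnbau": "edilizia abitativa",
    "Landtag": "Consiglio provinciale",
    "Translation Memory": "memoria di traduzione",
}

def term_hit_rate(source: str, mt_output: str, glossary: dict) -> float:
    """Share of glossary source terms present in the source segment
    whose prescribed target equivalent also appears in the MT output."""
    relevant = [src for src in glossary if src.lower() in source.lower()]
    if not relevant:
        return 1.0  # nothing to check in this segment
    hits = sum(
        1 for src in relevant
        if glossary[src].lower() in mt_output.lower()
    )
    return hits / len(relevant)

src = "Der Landtag fördert den Wohnbau."
mt_with_glossary = "Il Consiglio provinciale promuove l'edilizia abitativa."
mt_baseline = "Il parlamento provinciale promuove la costruzione di alloggi."

print(term_hit_rate(src, mt_with_glossary, GLOSSARY))  # 1.0
print(term_hit_rate(src, mt_baseline, GLOSSARY))       # 0.0
```

A lower hit rate on the baseline output corresponds directly to additional terminological corrections during post-editing, which is the effort the study sets out to quantify.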
| File | Size | Format |
|---|---|---|
| DiCristo_Federica.pdf (restricted access) | 4.24 MB | Adobe PDF |
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license; metadata are released under a CC0 license.
https://hdl.handle.net/20.500.12608/106999