Enhancing Domain-Specific Machine Translation with Retrieval-Augmented Generation (RAG)
SHEIKHI, SAHAR
2024/2025
Abstract
This thesis investigates how retrieval-augmented generation can enhance domain-specific machine translation without the need for full model fine-tuning. The proposed system integrates retrieved examples into language model prompts, allowing for more accurate and context-aware translations of structured and technical content. A supporting data processing pipeline is developed to handle noisy bilingual datasets and enable incremental updates, ensuring the system remains adaptable and scalable. Preliminary results suggest improvements in translation quality and consistency, with evaluation conducted through a combination of established automated metrics and expert human assessment.

| File | Access | Size | Format |
|---|---|---|---|
| final-thesis.pdf | Restricted access | 1.89 MB | Adobe PDF |
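
The abstract describes integrating retrieved examples into language model prompts. A minimal sketch of that general idea, not the thesis's actual implementation, might look as follows. Everything here is an assumption made for illustration: the English-Italian language pair, the sample translation memory, the function names `retrieve_examples` and `build_prompt`, and the use of character-level similarity from Python's standard `difflib` module as a stand-in for whatever retriever the thesis uses.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory of (source, target) pairs.
# The sentences and the language pair are illustrative only.
TRANSLATION_MEMORY = [
    ("Tighten the valve before starting the pump.",
     "Serrare la valvola prima di avviare la pompa."),
    ("Replace the filter every six months.",
     "Sostituire il filtro ogni sei mesi."),
    ("The invoice is due within thirty days.",
     "La fattura scade entro trenta giorni."),
]


def retrieve_examples(query, memory, k=2):
    """Return the k pairs whose source side is most similar to the query.

    Similarity is a simple character-level ratio; a real system would
    more likely use embeddings or a search index (assumption).
    """
    ranked = sorted(
        memory,
        key=lambda pair: SequenceMatcher(
            None, query.lower(), pair[0].lower()
        ).ratio(),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, examples, src_lang="English", tgt_lang="Italian"):
    """Assemble a few-shot translation prompt from the retrieved pairs."""
    lines = [
        f"Translate from {src_lang} to {tgt_lang}, "
        "following the terminology of the examples.",
        "",
    ]
    for src, tgt in examples:
        lines += [f"{src_lang}: {src}", f"{tgt_lang}: {tgt}", ""]
    # The prompt ends with an open target-language slot for the model to fill.
    lines += [f"{src_lang}: {query}", f"{tgt_lang}:"]
    return "\n".join(lines)


query = "Tighten the valve before starting the motor."
examples = retrieve_examples(query, TRANSLATION_MEMORY)
prompt = build_prompt(query, examples)
print(prompt)
```

The resulting prompt would then be sent to a language model of choice. Because the domain knowledge lives in the retrieved pairs rather than in model weights, new bilingual data can be added to the memory without retraining, which matches the abstract's motivation for avoiding full fine-tuning.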
https://hdl.handle.net/20.500.12608/102092