Brain-to-Text Translation Using Deep Learning and LLMs
COLLAUTTI, ENRICO
2024/2025
Abstract
This thesis explores the use of deep learning and retrieval-augmented generation techniques to decode natural language from non-invasive brain signals, specifically electroencephalography (EEG). The primary goal is to develop a flexible and modular brain-to-text decoding pipeline that can support assistive communication systems for individuals with severe speech impairments. The system leverages sentence-level EEG embeddings extracted via a custom convolutional neural network, trained to align with contextual embeddings from a pre-trained language model. These embeddings are stored in a vector store to enable semantic retrieval, and the top-matching sentences are refined using a Large Language Model (LLM) to generate coherent textual output. The approach is evaluated on the ZuCo dataset, with both qualitative and quantitative metrics demonstrating the feasibility of generating fluent sentences from brain activity. The findings highlight the challenges of working with noisy, single-trial EEG data, but also underline the potential of combining deep learning with language models for non-invasive brain-to-text translation. The pipeline lays a foundation for future improvements in decoding accuracy and real-time application in Brain-Computer Interfaces (BCIs).

| File | Size | Format |
|---|---|---|
| Collautti_Enrico.pdf (embargoed until 13/10/2026) | 928.04 kB | Adobe PDF |
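The retrieval step described in the abstract (matching an EEG-derived embedding against stored sentence embeddings, then handing the top matches to an LLM) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names `cosine_similarity`, `retrieve_top_k`, and `build_refinement_prompt`, the toy three-dimensional vectors, the example sentences, and the choice of cosine similarity as the matching score are all assumptions made for illustration. A real vector store and a trained EEG encoder would replace the in-memory list and the hand-written query vector.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(
    query: Sequence[float],
    store: List[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k stored sentences whose embeddings are closest to the query."""
    scored = [(sentence, cosine_similarity(query, emb)) for sentence, emb in store]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def build_refinement_prompt(candidates: List[Tuple[str, float]]) -> str:
    """Format retrieved sentences as a prompt (hypothetical wording) for an LLM."""
    lines = [f"- {sentence} (score {score:.2f})" for sentence, score in candidates]
    return "Rewrite the most likely sentence from these candidates:\n" + "\n".join(lines)


# Toy in-memory "vector store": (sentence, embedding) pairs with made-up values.
store = [
    ("The film was a great success.", [0.9, 0.1, 0.0]),
    ("He was born in Vienna.", [0.0, 1.0, 0.2]),
    ("The movie received good reviews.", [0.8, 0.2, 0.1]),
]

# Stand-in for the output of an EEG encoder (assumed, not real data).
eeg_embedding = [1.0, 0.0, 0.0]

candidates = retrieve_top_k(eeg_embedding, store, k=2)
prompt = build_refinement_prompt(candidates)
print(prompt)
```

In this sketch the prompt is only printed. In a full pipeline it would be sent to a language model, which would produce the final sentence.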
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.
https://hdl.handle.net/20.500.12608/94140