Enhanced Topic Modeling for Textual Data
JAVIDFAR, MASOUD
2022/2023
Abstract
In this thesis, we present an approach to topic modeling and text classification that combines Non-Negative Matrix Factorization (NMF), a Variational Autoencoder (VAE), and a Bidirectional Long Short-Term Memory (Bi-LSTM) network. The text is preprocessed with CountVectorizer and bigrams, which capture word frequencies and co-occurrence patterns. NMF is applied to extract latent topics, while the VAE reduces dimensionality and learns meaningful representations. The Bi-LSTM model is employed for sequential pattern learning and classification. Through extensive experiments and evaluations, we demonstrate the effectiveness of the approach in capturing topics and achieving high classification accuracy. This research contributes to the field of text analysis by offering a methodology for uncovering insights from textual data.
File | Size | Format | |
---|---|---|---|
MsC_Thesis_Report__final-.pdf (open access) | 1.05 MB | Adobe PDF | View/Open |
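The first two stages named in the abstract, bigram-aware count vectorization followed by NMF topic extraction, can be illustrated with a minimal sketch. It is not the thesis implementation: the record does not state the settings used, so the toy documents, the number of topics, the iteration count, and every function name below are assumptions made for illustration. To stay dependency-free, the sketch replaces scikit-learn's CountVectorizer with a small hand-written counter and uses the standard multiplicative-update rule for NMF under the Frobenius norm.

```python
import random
from collections import Counter


def terms_with_bigrams(text):
    """Lowercase, split on whitespace, and return unigrams plus bigrams."""
    tokens = text.lower().split()
    return tokens + [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def count_matrix(docs):
    """Build a document-term count matrix (hypothetical CountVectorizer stand-in)."""
    term_lists = [terms_with_bigrams(d) for d in docs]
    vocab = sorted({t for terms in term_lists for t in terms})
    index = {t: j for j, t in enumerate(vocab)}
    rows = []
    for terms in term_lists:
        row = [0.0] * len(vocab)
        for term, n in Counter(terms).items():
            row[index[term]] = float(n)
        rows.append(row)
    return rows, vocab


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw)) ** 0.5


def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factor V (n x m) into non-negative W (n x k) and H (k x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return w, h


def top_terms(h, vocab, n=3):
    """Return the n highest-weighted vocabulary terms for each topic row of H."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in h]


# Toy corpus (assumed for illustration only).
docs = [
    "neural network training loss",
    "neural network deep learning",
    "stock market price index",
    "stock market trading volume",
]
V, vocab = count_matrix(docs)
W, H = nmf(V, k=2)
for topic, terms in enumerate(top_terms(H, vocab)):
    print(topic, terms)
```

Each row of `W` gives a document's weight on each topic, and each row of `H` gives a topic's weight on each unigram or bigram. In a pipeline like the one the abstract describes, the document-topic weights would be candidates for the features passed to later stages, although how the thesis connects NMF, the VAE, and the Bi-LSTM is not specified on this page.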
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.
https://hdl.handle.net/20.500.12608/58767