Enhanced Topic Modeling for Textual Data

Author: JAVIDFAR, MASOUD
Academic year: 2022/2023

Abstract

In this thesis, we present an innovative approach for topic modeling and text classification using a combination of Non-Negative Matrix Factorization (NMF), Variational Autoencoder (VAE), and Bidirectional Long Short-Term Memory (Bi-LSTM) models. Our approach leverages CountVectorizer and bigrams to preprocess the text data, capturing word frequencies and co-occurrence patterns. NMF is applied to extract latent topics, while VAE reduces dimensionality and learns meaningful representations. The Bi-LSTM model is employed for sequential pattern learning and accurate classification. Through extensive experiments and evaluations, we demonstrate the effectiveness of our approach in capturing topics and achieving high classification accuracy. This research contributes to the field of text analysis by offering an advanced methodology for uncovering insights from textual data.
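As a concrete illustration of the preprocessing and topic-extraction steps described above, the sketch below pairs CountVectorizer (unigrams and bigrams) with NMF; the toy corpus, topic count, and vectorizer settings are placeholder assumptions rather than the thesis's actual configuration.

```python
# Minimal sketch of the preprocessing and topic-extraction stage, assuming
# scikit-learn and a toy corpus; all settings here are illustrative, not the
# thesis's actual data or hyperparameters.
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import CountVectorizer

documents = [
    "deep learning models classify text documents",
    "topic modeling uncovers latent themes in large corpora",
    "recurrent networks capture sequential patterns in sentences",
]

# Count word frequencies over unigrams and bigrams to capture co-occurrence patterns.
vectorizer = CountVectorizer(ngram_range=(1, 2), stop_words="english")
doc_term = vectorizer.fit_transform(documents)

# NMF factorizes the non-negative document-term matrix into a document-topic
# matrix W and a topic-term matrix H.
nmf = NMF(n_components=2, init="nndsvda", random_state=0)
doc_topics = nmf.fit_transform(doc_term)   # W: documents x topics
topic_terms = nmf.components_              # H: topics x terms

terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(topic_terms):
    top_terms = [terms[i] for i in weights.argsort()[::-1][:5]]
    print(f"Topic {k}: {', '.join(top_terms)}")
```

Including bigrams alongside unigrams is what lets the factorization surface co-occurrence patterns rather than isolated word frequencies alone.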
Supervisor: Professor Tomaso Erseghe (tomaso.erseghe@unipd.it)
Keywords: NMF, VAE, Bi-LSTM, NNDL
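For the classification stage, the following is a hypothetical Keras sketch of a Bi-LSTM classifier; the vocabulary size, sequence length, layer widths, and class count are assumed values, and since the abstract does not specify how the VAE-compressed representations are wired into the classifier, the sketch simply takes padded token sequences as input.

```python
# Hypothetical sketch of the Bi-LSTM classification stage, assuming a
# Keras/TensorFlow implementation; every size below is a placeholder, not a
# hyperparameter taken from the thesis.
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE = 10_000   # assumed vocabulary size after preprocessing
MAX_LEN = 200         # assumed padded sequence length
NUM_CLASSES = 4       # assumed number of target classes

model = models.Sequential([
    layers.Input(shape=(MAX_LEN,)),
    layers.Embedding(VOCAB_SIZE, 128),
    layers.Bidirectional(layers.LSTM(64)),        # forward + backward sequential patterns
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()

# Training would then look like:
# model.fit(padded_sequences, labels, validation_split=0.1, epochs=10)
```

The bidirectional wrapper runs one LSTM over the sequence left-to-right and another right-to-left, giving the classifier access to both preceding and following context when learning sequential patterns.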
Use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12608/58767