Email mining to uncover automation opportunities

PUTINA, ANNA
2024/2025

Abstract

In the modern working environment, people are overloaded with data. We collect, store, and process vast amounts of information for business, safety, and legal purposes. Managing this information manually is increasingly challenging, and employees spend significant time on repetitive tasks. Automation is therefore a promising way to substantially reduce costs. This project applies data mining to email data, focusing initially on email, with the potential to extend the findings to other areas. We aim to identify patterns in email interactions in order to automate repetitive tasks such as reading, responding, attaching files, forwarding information, and managing spam. Such automation could significantly enhance productivity and user satisfaction by reducing the manual effort involved in email management. Our approach analyzes a large dataset of business emails to detect recurring interaction patterns. To reconstruct email chains and convert them into structured data, we use a systematic methodology that relies on the emails' metadata and content. After automatically processing the texts, we generate embeddings that convert the text into numerical representations. A time-aware distance metric then measures the similarity between email sequences, and the sequences are clustered on that basis, revealing potential automation opportunities. The results demonstrate that the proposed solution can feasibly extract processes and similar interactions from emails. The solution serves as a model pipeline for future projects, in which individual steps can be modified to meet different task requirements, improve performance, and support other data formats.
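The abstract does not specify the time-aware distance metric or the clustering algorithm, so the following is only a minimal sketch of how such a step could look, not the thesis's method. It assumes each email chain is a list of (hours since the first message, embedding vector) pairs, and it swaps in dynamic time warping with a time penalty and a simple threshold-based single-linkage grouping. The names `thread_distance`, `cluster_threads`, `time_weight`, and `threshold` are hypothetical.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0

def thread_distance(s, t, time_weight=0.1):
    """Dynamic time warping between two chains (assumed metric, see lead-in).

    Each chain is a list of (hours_since_first_message, embedding) pairs.
    The local cost adds the embedding distance to a weighted time offset.
    """
    n, m = len(s), len(t)
    inf = float("inf")
    dp = [[inf] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            (time_s, emb_s), (time_t, emb_t) = s[i - 1], t[j - 1]
            cost = cosine_distance(emb_s, emb_t) + time_weight * abs(time_s - time_t)
            dp[i][j] = cost + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
    return dp[n][m]

def cluster_threads(threads, threshold):
    """Single-linkage grouping: chains closer than `threshold` share a label."""
    labels = list(range(len(threads)))
    for i in range(len(threads)):
        for j in range(i + 1, len(threads)):
            if thread_distance(threads[i], threads[j]) < threshold:
                old, new = labels[j], labels[i]
                labels = [new if label == old else label for label in labels]
    return labels

# Toy data with made-up 2-dimensional "embeddings" (illustrative only).
threads = [
    [(0.0, [1.0, 0.0]), (2.0, [0.0, 1.0])],   # request, reply after 2 hours
    [(0.0, [1.0, 0.0]), (2.5, [0.0, 1.0])],   # same pattern, similar timing
    [(0.0, [0.0, 1.0]), (48.0, [1.0, 0.0])],  # different content and timing
]
labels = cluster_threads(threads, threshold=1.0)
```

In this toy run the first two chains receive the same label and the third stays separate. In practice the embeddings would come from a text embedding model, and both `time_weight` and `threshold` would need tuning on the data.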
2024
Data mining
Process extraction
Process automation
Text embeddings
Clustering
Files in this item:
thesis_anna_putina_email_mining.pdf (open access, 4.1 MB, Adobe PDF)

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/84864