Email mining to uncover automation opportunities
PUTINA, ANNA
2024/2025
Abstract
In the modern working environment, people are overloaded with data. We collect, store, and process vast amounts of information for business, safety, and legal purposes. Managing this information manually is increasingly challenging, and employees spend significant time performing repetitive tasks. Automation therefore offers a promising way to substantially reduce costs. This project applies data mining to email data, focusing initially on emails, with the potential to extend the findings to other areas. We aim to identify patterns within email interactions in order to automate repetitive tasks such as reading, responding, attaching files, forwarding information, and managing spam. This automation could significantly enhance productivity and user satisfaction by reducing the manual effort involved in email management. Our approach analyzes a large dataset of business emails to detect recurring interaction patterns. To reconstruct email chains and convert them into structured data, we use a systematic methodology that relies on the emails' metadata and content. After automatically processing the texts, we generate embeddings to convert the text into numerical representations. A time-aware distance metric then assesses the similarity between sequences so that the emails can be clustered, revealing potential automation opportunities. The results demonstrate the feasibility of extracting processes and similar interactions from emails using the proposed solution. The work serves as a model pipeline for future projects, in which specific steps can be adapted to meet different task requirements, improve performance, and support other data formats.
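The abstract does not specify the time-aware distance metric, so the following is only an illustrative sketch of what such a metric could look like, not the thesis's actual method. It assumes each email in a chain is represented as a pair (embedding, hours since the previous email) and compares two chains with dynamic time warping. The function names, the weight `alpha`, and the time scale `tau` are all hypothetical choices made for this example.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # treat an all-zero embedding as maximally dissimilar
    return 1.0 - dot / (norm_u * norm_v)


def step_cost(x, y, alpha=0.5, tau=24.0):
    """Cost of matching two emails given as (embedding, gap_hours).

    alpha (assumed) weights timing against content; tau (assumed) is the
    time scale in hours at which a gap difference starts to matter.
    """
    (emb_x, gap_x), (emb_y, gap_y) = x, y
    content = cosine_distance(emb_x, emb_y)
    # Squash the absolute gap difference into the range [0, 1).
    timing = 1.0 - math.exp(-abs(gap_x - gap_y) / tau)
    return (1.0 - alpha) * content + alpha * timing


def time_aware_dtw(seq_a, seq_b, alpha=0.5, tau=24.0):
    """Dynamic time warping distance between two email chains.

    Returns 0.0 for two empty chains and infinity if only one is empty.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    dp = [[inf] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = step_cost(seq_a[i - 1], seq_b[j - 1], alpha, tau)
            dp[i][j] = cost + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
    return dp[n][m]


# Toy chains with 2-dimensional stand-in embeddings (not real model output).
chain_a = [([1.0, 0.0], 0.0), ([0.0, 1.0], 2.0)]
chain_b = [([1.0, 0.0], 0.0), ([0.0, 1.0], 2.0)]   # same content, same timing
chain_c = [([1.0, 0.0], 0.0), ([0.0, 1.0], 72.0)]  # same content, slow reply

print(time_aware_dtw(chain_a, chain_b))
print(time_aware_dtw(chain_a, chain_c))
```

In this sketch, two chains with identical content but very different reply delays end up farther apart than two identical chains. A matrix of such pairwise distances could then be passed to any clustering method that accepts precomputed distances.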
File | Size | Format |
---|---|---|
thesis_anna_putina_email_mining.pdf (open access) | 4.1 MB | Adobe PDF |
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/84864