Augmentation of Event Logs under User-Defined Process Constraints

CIMBRO, LETIZIA
2024/2025

Abstract

In Process Mining, the augmentation of event logs plays a crucial role in overcoming the limitations posed by the scarcity of real-world data. Current augmentation techniques focus mainly on reproducing global patterns and provide little control over domain-specific rules. However, in practice, augmented logs may need to comply with explicit constraints to be meaningful for further analysis. This thesis introduces a constraint-aware framework for event log augmentation. The approach combines a set of automata, one per rule, which are intersected with the probabilistic automaton built on all input traces, including those that are not compliant, ensuring that the augmented logs preserve both the variability of real data and the conditions imposed by the constraints. The framework was evaluated on several case studies and compared with a probabilistic automaton trained on a preprocessed event log, in which non-compliant traces are removed to enforce a set of human-defined rules, before generating new synthetic traces. Results based on entropy metrics indicate that the proposed method ensures higher generalization by enabling the generation of a larger number of traces while still satisfying the imposed rules, with computational times that remain competitive. These findings confirm the practicality of constraint-aware augmentation and open promising directions for extensions involving additional process perspectives such as resources, attributes, and temporal relations.
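The intersection idea described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the thesis's actual construction: the probabilistic automaton is approximated by a first-order transition-count model, the single rule is a hypothetical "response" constraint (every `a` is eventually followed by `b`), and names such as `build_product` and `sample_trace` are invented. The product keeps only states from which a compliant ending is still reachable, and sampling renormalizes the observed weights over those states.

```python
import random
from collections import defaultdict

START, END = "<start>", "<end>"  # hypothetical boundary markers

def build_probabilistic_automaton(traces):
    """Count transitions between consecutive activities (first-order assumption)."""
    counts = defaultdict(lambda: defaultdict(int))
    for trace in traces:
        prev = START
        for activity in trace:
            counts[prev][activity] += 1
            prev = activity
        counts[prev][END] += 1
    return {state: dict(nxt) for state, nxt in counts.items()}

def response_dfa(a, b):
    """Rule automaton: every occurrence of `a` is eventually followed by `b`."""
    def step(q, activity):
        if activity == a:
            return 1  # a response is now pending
        if activity == b:
            return 0  # the pending response is satisfied
        return q
    return {"init": 0, "accepting": {0}, "step": step}

def build_product(pa, dfas):
    """Intersect the probabilistic automaton with all rule automata."""
    init = (START, tuple(d["init"] for d in dfas))
    succ, stack = {}, [init]
    while stack:
        state = stack.pop()
        if state in succ:
            continue
        pa_state, qs = state
        edges = {}
        for activity, weight in pa.get(pa_state, {}).items():
            if activity == END:
                # Ending is allowed only if every rule automaton accepts.
                if all(q in d["accepting"] for q, d in zip(qs, dfas)):
                    edges[END] = (weight, None)
            else:
                nxt = (activity, tuple(d["step"](q, activity) for q, d in zip(qs, dfas)))
                edges[activity] = (weight, nxt)
                stack.append(nxt)
        succ[state] = edges
    # Backward fixpoint: keep states from which a compliant ending is reachable.
    live, changed = set(), True
    while changed:
        changed = False
        for state, edges in succ.items():
            if state not in live and any(n is None or n in live for _, n in edges.values()):
                live.add(state)
                changed = True
    return init, succ, live

def sample_trace(init, succ, live, rng):
    """Random walk over live product states, weighted by observed counts."""
    if init not in live:
        raise ValueError("no compliant trace can be generated")
    state, trace = init, []
    while True:
        options = [(act, w, n) for act, (w, n) in succ[state].items() if n is None or n in live]
        act, _, nxt = rng.choices(options, weights=[w for _, w, _ in options])[0]
        if nxt is None:
            return trace
        trace.append(act)
        state = nxt

if __name__ == "__main__":
    log = [["a", "b", "c"], ["a", "c"], ["c", "a", "b"]]  # ["a", "c"] violates the rule
    pa = build_probabilistic_automaton(log)
    init, succ, live = build_product(pa, [response_dfa("a", "b")])
    rng = random.Random(0)
    for _ in range(5):
        print(sample_trace(init, succ, live, rng))
```

In this toy setting the non-compliant trace `a, c` still contributes the transition from `a` to `c`, so the product can emit a compliant trace such as `a, c, a, b`. A first-order model trained only on the two compliant traces has no such transition and could not generate it, which mirrors the comparison with the filtered baseline described in the abstract.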
2024
Event Log Accuracy
Final State Automata
Predictive Analytics
Files in this item:
File: Cimbro_Letizia.pdf
Access: open access
Size: 3.14 MB
Format: Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/91824