SynBI: Bridging Analytics and Strategy through AI-Driven Experimentation and Context-Grounded Decision Support
RUGGERO, NICOLÒ
2025/2026
Abstract
This thesis introduces SynBI, an integrated decision support framework that combines advanced data analytics, experimental design, and context-aware AI reasoning to enhance managerial decision-making in retail-oriented small and medium-sized enterprises (SMEs). The project addresses a common gap in SME environments: although large volumes of transactional data are available, firms often lack the analytical capabilities, methodological discipline, and interpretability required to translate such data into strategic and operational decisions. SynBI is structured around three complementary pillars. First, a data validation and preprocessing layer enforces schema consistency, detects anomalies, performs winsorization of extreme monetary values, and supports automated column recognition through fuzzy matching. This ensures that all downstream analyses operate on a reliable and auditable data foundation. Second, an analytical pipeline integrates customer segmentation through RFM-based clustering, association-rule mining for cross-selling insights, time-series forecasting using Prophet and ARIMA, cohort-based retention analysis, and anomaly detection via Isolation Forest. These modules allow managers to explore customer dynamics, purchasing patterns, seasonality, and irregular events in a structured and interpretable manner. Third, SynBI incorporates a Design of Experiments mindset, operationalized through Bayesian Optimization, to identify optimal analytical configurations without relying on exhaustive or manual search. The optimizer explores combinations of clustering granularity, winsorization thresholds, and rule-mining filters, scoring each configuration with a composite objective function that balances segmentation quality and association-rule relevance. This approach reduces computational cost and enables more systematic, reproducible experimentation even in resource-constrained SME contexts.
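As a rough illustration of two ingredients named above, the following Python sketch shows percentile-based winsorization and one possible composite objective. Everything specific in it is an assumption made for illustration: the function names, the 1st/99th percentile bounds, the choice of a silhouette score and a mean lift as the two components, the equal weighting, and the lift cap. The abstract does not state how SynBI defines any of these. A Bayesian optimizer would call a scoring function of this kind once per candidate configuration; the optimizer itself is not shown.

```python
from statistics import quantiles


def winsorize(values, lower_pct=1, upper_pct=99):
    """Clip values to the given lower/upper percentiles (winsorization).

    The percentile bounds are illustrative defaults, not SynBI's settings.
    """
    if not (1 <= lower_pct < upper_pct <= 99):
        raise ValueError("percentiles must satisfy 1 <= lower < upper <= 99")
    if len(values) < 2:
        return list(values)
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
    cuts = quantiles(values, n=100, method="inclusive")
    lo, hi = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(v, lo), hi) for v in values]


def composite_score(silhouette, mean_lift, weight=0.5, lift_cap=5.0):
    """Hypothetical composite objective in [0, 1].

    Blends segmentation quality (silhouette, in [-1, 1]) with rule
    relevance (mean lift, capped at lift_cap). Both component choices
    and the weighting are assumptions, not taken from the thesis.
    """
    seg = (silhouette + 1.0) / 2.0                        # rescale to [0, 1]
    rules = min(max(mean_lift, 0.0), lift_cap) / lift_cap  # rescale to [0, 1]
    return weight * seg + (1.0 - weight) * rules


# Usage with made-up numbers: tame one extreme monetary value,
# then score a hypothetical configuration.
monetary = [12.0, 15.5, 9.9, 14.2, 11.0, 5000.0]
clipped = winsorize(monetary, lower_pct=5, upper_pct=95)
score = composite_score(silhouette=0.4, mean_lift=2.0)
```

Rescaling both components to the same range before weighting keeps either one from dominating the objective purely because of its units.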
An additional layer built on Large Language Models (LLMs) transforms validated analytical outputs into context-grounded natural language explanations. Instead of generating unsupported insights, the system uses structured JSON payloads containing cluster metrics, top products, forecast breakdowns, and association rules. This ensures transparency, minimizes hallucinations, and allows managers to interact with analytics through a disciplined, human-in-the-loop paradigm. The framework is empirically validated using the Online Retail II dataset, a large-scale collection of UK e-commerce transactions. Experiments demonstrate that SynBI can reduce the search space for analytical configurations by an order of magnitude, improve segmentation stability, and enhance the interpretability of forecasting and cross-selling insights. Moreover, the combination of experimental design and LLM-based reasoning shows potential for bridging the gap between technical analytics and managerial action, especially in organizations without dedicated data science teams. The thesis concludes by discussing current limitations (the restricted number of factors optimized, the reliance on a single dataset, and the need for more comprehensive experimental plans) and outlines future directions. These include extending Bayesian Optimization to multi-objective formulations, integrating SynBI with CRM and S&OP processes, enabling automated A/B testing, and enhancing the governance of AI-driven decision systems. SynBI ultimately demonstrates how an engineering approach that blends experimental design, data analytics, and controlled AI reasoning can support SMEs in transitioning from reactive reporting to proactive, evidence-based strategic management.
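The grounding step described in the abstract, in which the LLM receives structured JSON payloads rather than free-form context, might look roughly like the Python sketch below. It is an assumption-laden illustration: the function name, the payload field names, the prompt wording, and the placeholder values are all hypothetical, and no particular LLM API is called.

```python
import json


def build_grounded_prompt(payload: dict, question: str) -> str:
    """Embed validated analytics as JSON in a prompt (hypothetical format).

    The instruction text asks the model to answer only from the supplied
    data, which is one common way to discourage unsupported claims.
    """
    context = json.dumps(payload, indent=2, sort_keys=True, ensure_ascii=False)
    return (
        "Answer using ONLY the analytics below. "
        "If the answer is not in the data, say so.\n\n"
        f"ANALYTICS (JSON):\n{context}\n\n"
        f"QUESTION: {question}\n"
    )


# Placeholder payload: field names and values are invented for illustration.
example_payload = {
    "clusters": [
        {"id": 0, "size": 120, "avg_recency_days": 12.5},
        {"id": 1, "size": 340, "avg_recency_days": 95.0},
    ],
    "association_rules": [
        {"antecedent": ["A"], "consequent": ["B"], "lift": 2.1},
    ],
}

prompt = build_grounded_prompt(
    example_payload, "Which segment was active most recently?"
)
```

Serializing with sorted keys keeps the prompt deterministic for a given payload, which makes the explanations easier to audit and reproduce.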
File: RUGGERO_NICOLÒ.pdf (open access), 7.88 MB, Adobe PDF
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/108039