Re-engineering Legacy Data Platforms with Cloud-Native Technologies

LENARTAVICIUS, VAIDAS
2023/2024

Abstract

In recent years, data management has changed significantly, driven by growing demand for scalable, flexible, and cost-effective solutions. Traditional on-premises data platforms struggle to keep pace with growing data volumes and the need for real-time processing. These pressures have led to the emergence of cloud-native architectures, which offer elastic scalability, seamless integration, and advanced analytics capabilities. As organizations modernize their data infrastructures, cloud-based platforms have become essential tools for long-term operational efficiency and business intelligence. This thesis focuses on the re-engineering and modernization of a legacy, on-premises data platform into a cloud-native architecture, with a strong emphasis on scalability, reliability, and performance. The project aims to transform an outdated system, limited in infrastructure scalability, code management, task orchestration, and observability, into a flexible, cloud-based solution built on Snowflake, AWS services (Lambda, S3, SNS, DMS), and serverless technologies. The new architecture addresses key challenges such as growing data workloads and the increasing complexity of business operations by using cloud resources that provide elastic scaling, efficient data processing, and enhanced monitoring. The work also incorporates knowledge gained through an in-depth exploration and analysis of Snowflake, a fully managed data warehousing solution that has proven to be a powerful tool in modern data management. The study provided both theoretical and practical insights into Snowflake's architectural features and resource management. Key objectives included analyzing best practices for data modeling, optimizing queries, and evaluating partitioning strategies.
By combining practical experience with a theoretical understanding of these technologies, this thesis aims to contribute to the field of cloud-native data platforms by offering insights into system design and best practices for modernizing legacy infrastructures.
2023
Data Engineering
Cloud Infrastructure
Data Warehousing
DevOps Automation
Legacy Data Platform
Files in this item:
Lenartavicius_Vaidas.pdf (open access, 2.62 MB, Adobe PDF)

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/77248