Implementation of an ETL process in AWS
VAROTTO, MARCO
2021/2022
Abstract
The objective of this thesis was to implement an ETL (Extract, Transform, Load) process. I collected data from a VR device, then transformed and stored it, and finally used several services to visualize the data in real time on a Grafana dashboard. I built two different architectures and compared their efficiency, cost, and response speed. The first solution uses the Kinesis Data Analytics service, which runs the Apache Flink applications I built continuously and scales them automatically, with no setup cost and no servers to manage. The second solution instead uses Step Functions and a non-relational database, DynamoDB. To build the second solution I used an infrastructure-as-code service that permits the creation of a production-ready architecture: the architecture is defined in code, which also brings versioning and reproducibility.
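As a rough illustration of the first architecture, the following is a minimal sketch of a Flink streaming job of the kind that Kinesis Data Analytics can run, consuming records from a Kinesis data stream. The stream name `vr-telemetry`, the AWS region, and the trivial transform are illustrative assumptions, not details taken from the thesis.

```java
import java.util.Properties;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer;
import org.apache.flink.streaming.connectors.kinesis.config.AWSConfigConstants;
import org.apache.flink.streaming.connectors.kinesis.config.ConsumerConfigConstants;

public class VrTelemetryJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Consumer configuration; the region and starting position are assumptions.
        Properties props = new Properties();
        props.setProperty(AWSConfigConstants.AWS_REGION, "eu-west-1");
        props.setProperty(ConsumerConfigConstants.STREAM_INITIAL_POSITION, "LATEST");

        // "vr-telemetry" is a hypothetical stream name standing in for the
        // stream that carries the VR-device data.
        env.addSource(new FlinkKinesisConsumer<>(
                        "vr-telemetry", new SimpleStringSchema(), props))
           // Placeholder transform step; the thesis's actual transformations differ.
           .map(new MapFunction<String, String>() {
               @Override
               public String map(String record) {
                   return record.trim();
               }
           })
           // A real job would write to a sink feeding the Grafana dashboard;
           // print() keeps the sketch self-contained.
           .print();

        env.execute("vr-etl-job");
    }
}
```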
https://hdl.handle.net/20.500.12608/36798