Table of Contents
∘ Problem Statement
∘ AWS Architecture
∘ Data Storage with AWS S3
∘ Designing the Schema
∘ ETL with AWS Glue
∘ Data Warehousing with AWS Redshift
∘ Extracting Insights with AWS Redshift
∘ Visualizing data with Power BI
∘ Future Steps
Air travel has become an integral part of our lives. It lets businesses network and conduct commerce, and it lets families visit loved ones or travel for leisure.
Despite its influence, the aviation industry is known for facing turbulence. It is subject to continuous change due to external factors like economic booms and busts, climate change, the COVID-19 pandemic, and a push to rely more on renewable energy sources.
To stay aware of such changes and their impact on air travel, it is worth tracking flight activity over time. Such an endeavor requires a robust strategy for data warehousing, data analysis, and data visualization.
This project has two primary objectives. The first is to use Amazon Web Services (AWS) to build a data pipeline that facilitates the storage, transformation, and analysis of U.S. flight data.
The second is to build a visualization tool with Power BI that can effectively illustrate the key findings from the data.
The dataset used for this project comes from the Bureau of Transportation Statistics. It primarily reports the total number of flights, delays, and cancellations by airport and carrier from 2003 to 2023.
The following is a preview of the dataset: