How I Built A Cascading Data Pipeline Based on AWS (Part 2) | by Memphis Meng | Aug, 2023


Automatic, scalable, and powerful

Photo by Mehmet Ali Peker on Unsplash

Previously, I shared my experience of developing a data pipeline with AWS CloudFormation. That approach is not optimal, though, because it leaves three issues unresolved:

  1. Deployment has to be performed manually, which increases the chance of errors;
  2. All resources are created in a single stack, without proper boundaries or layers; as the development cycle goes on, the stack will grow heavier, and managing it will become a disaster;
  3. Many resources are supposed to persist and be reused in other projects.

In short, we are going to increase the manageability and reusability of this project, in an agile manner.

AWS lets users implement two structural patterns in CloudFormation: cross-stack references and nested stacks. With cross-stack references, stacks are developed separately, and usually independently, while a resource in one stack can still refer to resources in another through exported outputs. A nested stack is a CloudFormation stack composed of other stacks; it is created with the AWS::CloudFormation::Stack resource.
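To make the two patterns concrete, here is a minimal sketch that builds the relevant template fragments as Python dictionaries and prints one as JSON. The queue names, the export name, and the S3 template URL are all hypothetical placeholders, not values from this project.

```python
import json

# Hypothetical export name shared by the two independent stacks.
EXPORT_NAME = "shared-dead-letter-queue-arn"

# Cross-stack reference, exporting side: this stack owns a queue
# and publishes its ARN under an export name.
exporter = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {"DeadLetterQueue": {"Type": "AWS::SQS::Queue"}},
    "Outputs": {
        "DeadLetterQueueArn": {
            "Value": {"Fn::GetAtt": ["DeadLetterQueue", "Arn"]},
            "Export": {"Name": EXPORT_NAME},
        }
    },
}

# Cross-stack reference, importing side: a separately deployed stack
# looks the value up by export name with Fn::ImportValue.
importer = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "WorkQueue": {
            "Type": "AWS::SQS::Queue",
            "Properties": {
                "RedrivePolicy": {
                    "deadLetterTargetArn": {"Fn::ImportValue": EXPORT_NAME},
                    "maxReceiveCount": 3,
                }
            },
        }
    },
}

# Nested stacking: the parent declares a child stack as a resource.
# The TemplateURL below is a placeholder for wherever the child template is uploaded.
parent = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "QueueLayer": {
            "Type": "AWS::CloudFormation::Stack",
            "Properties": {
                "TemplateURL": "https://example-bucket.s3.amazonaws.com/queues.json"
            },
        }
    },
}

if __name__ == "__main__":
    print(json.dumps(parent, indent=2))
```

The same structures can be written directly in YAML; the dictionaries are used here only so the shape of each pattern is easy to inspect.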

A nested stack in real life: a nest full of nests/eggs (Photo by Giorgi Iremadze on Unsplash)

Because one of our goals is better project management, the project will be broken down into layers, and nested stacking is the pattern that helps here. However, because the artifacts of the existing stack are intrinsically interrelated, we also need a dash of cross-stack referencing.
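One way to picture this mix is a parent template whose nested stacks are layers, created in a fixed order so that a later layer can import what an earlier one exported. The sketch below is an assumption about how such a parent could be assembled, not the template used in this project: the helper `build_parent`, the layer names, and the URLs are hypothetical. It chains layers with `DependsOn` because, as far as I know, CloudFormation does not infer creation order from an `Fn::ImportValue` inside a sibling nested stack.

```python
import json


def build_parent(layers):
    """Return a parent template whose nested stacks are created in list order.

    `layers` is a list of (logical_id, template_url) pairs (hypothetical names).
    """
    resources = {}
    previous = None
    for logical_id, template_url in layers:
        stack = {
            "Type": "AWS::CloudFormation::Stack",
            "Properties": {"TemplateURL": template_url},
        }
        # Each layer waits for the one before it, so its exports already exist.
        if previous is not None:
            stack["DependsOn"] = previous
        resources[logical_id] = stack
        previous = logical_id
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


# Placeholder layers: shared IAM resources first, then a function layer.
layered_parent = build_parent(
    [
        ("IamLayer", "https://example-bucket.s3.amazonaws.com/iam.json"),
        ("FunctionLayer", "https://example-bucket.s3.amazonaws.com/function.json"),
    ]
)

if __name__ == "__main__":
    print(json.dumps(layered_parent, indent=2))
```

Passing values down through nested-stack parameters is another option; exports are shown here only because the text leans on cross-stack references.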

We created 3 Lambda functions, 3 DynamoDB tables, 1 IAM role with its attached policies, several SQS queues, and several CloudWatch alarms. Because the functions themselves are complex, in this version they are going to be defined in separate templates, together with the services used only by them, including alarms and dead-letter queues. Apart from those, IAM resources will be…


