In the last post we shared a lot of the what and why behind the things we do. In this post we want to dive deeper into our actual cloud architecture. This means we will talk about account setups, deployment pipelines, infrastructure setup (think ECS, Beanstalk, EKS etc.) and everything around it.
AWS Account setup
We deploy each of our environments (Live, QA, Dev etc.) in a separate account. These accounts serve as member accounts to a root account in an organisational setup, and users are configured only in the root account.
In the member accounts we define roles like developer, which users from the root account can assume in these member accounts. The developer role has only the permissions developers need to get their job done. In addition we have the administrator role. This gives us the flexibility of adding users only to the root account, while what they can do is handled through the permissions attached to the roles in the member accounts.
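As a rough illustration of the cross-account role idea, here is a minimal sketch of the trust policy a member-account role could carry so that users from the root account can assume it. The account IDs are placeholders and the helper names are hypothetical; the policy layout follows the standard IAM trust policy format, but this is not our actual configuration.

```python
import json


def trust_policy(root_account_id: str) -> dict:
    """Trust policy for a member-account role (e.g. developer).

    The ":root" suffix in the principal ARN is IAM's way of naming the
    whole account; it is unrelated to the term "root account" used above.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{root_account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def role_arn(member_account_id: str, role_name: str) -> str:
    """ARN a user from the root account would assume in a member account."""
    return f"arn:aws:iam::{member_account_id}:role/{role_name}"


# Placeholder account IDs, for illustration only.
policy = trust_policy("111111111111")
developer_role = role_arn("222222222222", "developer")

print(json.dumps(policy, indent=2))
print(developer_role)
```

What the role may actually do in the member account is defined separately, in the permission policies attached to the role.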
In addition to roles all our apps are deployed in these member accounts. This is where the action is.
All apps are deployed and managed by ECS. We could have opted to create a separate ECS cluster for each environment and work in one account only, but there is always additional stuff that may interfere with each other without you knowing. At the beginning of your cloud journey on AWS, Security Groups are a bit of a magic box that takes time to understand and to analyse in case of failures. The same is true for Application Load Balancers and Target Groups. So we wanted to keep all of this separate across environments, and the only real way of doing this is through different accounts.
ECS as the hosting infrastructure
When thinking about hosting the applications, there are a few options on AWS, among them ECS, EKS and Elastic Beanstalk.
All of them have their pros and cons for certain scenarios. We have chosen to go down the ECS route, mainly for the simplicity that ECS brings to the table. The concepts are very close to Docker, and ECS even runs on Docker. EKS is very rich from a feature perspective, but also opinionated. It brings its own concepts and implementations, and as a user you need to understand them. That was too much for us to chew on top of a migration.
Our opinionated and by all means incomplete take on Kubernetes and ECS is the following.
ECS and AWS infrastructure
After all this talking, let’s dive into what it actually looks like for us. All EC2 instances that we use for our applications run inside a Private Subnet. We run in one Region with multiple Availability Zones (AZs). All apps are always deployed in multiple AZs. Communication to the outside happens through the NAT Gateway and Internet Gateway, which both reside in the Public Subnet.
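To make the subnet layout a bit more tangible, the following sketch carves a VPC address range into one public and one private subnet per AZ. The CIDR range, the prefix length and the function name are assumptions made for illustration; only the public/private split across several AZs comes from our setup.

```python
import ipaddress


def plan_subnets(vpc_cidr: str, azs: list[str], new_prefix: int = 20) -> list[dict]:
    """Assign one public and one private subnet to every Availability Zone."""
    blocks = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=new_prefix)
    plan = []
    for az in azs:
        # Public subnet: hosts the NAT Gateway, routes to the Internet Gateway.
        plan.append({"az": az, "tier": "public", "cidr": str(next(blocks))})
        # Private subnet: hosts the EC2 instances that run the applications.
        plan.append({"az": az, "tier": "private", "cidr": str(next(blocks))})
    return plan


# Hypothetical address range; eu-central-1 is the Frankfurt region.
subnet_plan = plan_subnets("10.0.0.0/16", ["eu-central-1a", "eu-central-1b"])
for subnet in subnet_plan:
    print(subnet["az"], subnet["tier"], subnet["cidr"])
```

With these inputs the private subnets end up at 10.0.16.0/20 and 10.0.48.0/20, one per AZ.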
Service Registry, Discovery and Load Balancing
There is service-to-service communication. A challenge in such a scenario is that a calling service needs to know the physical address (IP + port) of the receiving service at runtime. Since service instances come and go, and their physical addresses with them, it is advisable to externalise Service Discovery and Resolution. Each ECS Service has a specific DNS entry in Route 53 that points to a specific route in our internal Application Load Balancer (ALB). Route 53 resolves the DNS name of the ALB, and the ALB balances the load between the different instances of the same service.
This setup gives us the ability to just use a fixed service name in the Spring RestTemplate, which is the same across environments. When an ECS Service instance dies, it gets automatically removed from the ALB Target Group, and a newly spawned instance gets added.
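The register and deregister behaviour can be pictured with a toy model. The class below is a simplified stand-in for a Target Group, not how the ALB is implemented: it uses plain round robin, and the class name, method names and addresses are all made up for illustration.

```python
class TargetGroup:
    """Toy model of a target group with round-robin selection."""

    def __init__(self) -> None:
        self._targets: list[str] = []
        self._cursor = 0

    def register(self, address: str) -> None:
        """Called when a new service instance has been spawned."""
        if address not in self._targets:
            self._targets.append(address)

    def deregister(self, address: str) -> None:
        """Called when a service instance has died."""
        if address in self._targets:
            self._targets.remove(address)

    def next_target(self) -> str:
        """Pick the instance that receives the next request."""
        if not self._targets:
            raise RuntimeError("no targets registered")
        target = self._targets[self._cursor % len(self._targets)]
        self._cursor += 1
        return target


group = TargetGroup()
group.register("10.0.16.11:32768")
group.register("10.0.48.12:32769")

# Requests alternate between the two instances.
picks = [group.next_target() for _ in range(3)]

# One instance dies; callers keep using the same service name and
# simply end up on the remaining instance.
group.deregister("10.0.16.11:32768")
after_failure = group.next_target()
```

The calling service never sees any of these addresses. It only knows the fixed service name, which is why the same client configuration works in every environment.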
A message from the past: In our pre-AWS times (we were on a CloudFoundry environment) we used Eureka for Service Discovery and Registry and Ribbon for client-side load balancing, both from Netflix. Initially we tried to keep using the Netflix stack, but ended up with a challenge: the Eureka server needs to be reachable from your service under a fixed URL, which brings you back to the question of how to create a URL that is static for one service in your environment. While looking into solutions to make this work, we discovered we could just apply the same approach to every service and get rid of the Eureka server altogether.
Infrastructure as code
The entire infrastructure is scripted in CloudFormation across all accounts and therefore all environments. Every change to any piece needs to be scripted, code reviewed through pull requests and successfully run in a deployment pipeline. This sounds tedious and slow, but the opposite is true.
Infrastructure is not a monolith. There are cross-cutting components like the VPC and AZs, but there are also application-specific infrastructure parts in the form of ECS Services, Route 53 entries and ALB Listener Routes. We split the infrastructure into common and application-specific parts. The common infrastructure lives in its own git repository and has a dedicated pipeline to deploy it. Every app has an app-specific CloudFormation stack, which gets deployed during the application deployment. Usually we deploy an infrastructure change before we deploy the app that may rely on it. Pro tip: make small changes to the infrastructure with CloudFormation. Big changes take a long time.
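To give an impression of what an app-specific stack can contain, here is a hedged sketch that builds a CloudFormation template as a Python dictionary and prints it as JSON. The export names (common-VpcId and so on), the host-based listener rule, the service name and the template-generating approach itself are assumptions for illustration; our real templates differ in detail.

```python
import json


def import_value(export_name: str) -> dict:
    """Reference a value exported by the common infrastructure stack."""
    return {"Fn::ImportValue": export_name}


def app_stack_template(app_name: str, internal_domain: str,
                       container_port: int, rule_priority: int) -> dict:
    host_name = f"{app_name}.{internal_domain}"
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"App-specific infrastructure for {app_name}",
        "Parameters": {"TaskDefinitionArn": {"Type": "String"}},
        "Resources": {
            "TargetGroup": {
                "Type": "AWS::ElasticLoadBalancingV2::TargetGroup",
                "Properties": {
                    "Port": container_port,
                    "Protocol": "HTTP",
                    "VpcId": import_value("common-VpcId"),
                },
            },
            # Route on the internal ALB that forwards to this app only.
            "ListenerRule": {
                "Type": "AWS::ElasticLoadBalancingV2::ListenerRule",
                "Properties": {
                    "ListenerArn": import_value("common-InternalAlbListenerArn"),
                    "Priority": rule_priority,
                    "Conditions": [{"Field": "host-header", "Values": [host_name]}],
                    "Actions": [{"Type": "forward",
                                 "TargetGroupArn": {"Ref": "TargetGroup"}}],
                },
            },
            # Fixed service name that points at the internal ALB.
            "DnsRecord": {
                "Type": "AWS::Route53::RecordSet",
                "Properties": {
                    "HostedZoneId": import_value("common-InternalHostedZoneId"),
                    "Name": host_name,
                    "Type": "CNAME",
                    "TTL": "60",
                    "ResourceRecords": [import_value("common-InternalAlbDnsName")],
                },
            },
            "Service": {
                "Type": "AWS::ECS::Service",
                # The target group must be attached to the ALB first.
                "DependsOn": "ListenerRule",
                "Properties": {
                    "Cluster": import_value("common-EcsClusterName"),
                    "TaskDefinition": {"Ref": "TaskDefinitionArn"},
                    "DesiredCount": 2,
                    "LoadBalancers": [{
                        "ContainerName": app_name,
                        "ContainerPort": container_port,
                        "TargetGroupArn": {"Ref": "TargetGroup"},
                    }],
                },
            },
        },
    }


# Hypothetical service name and domain.
template = app_stack_template("charging-service", "internal.example", 8080, 10)
print(json.dumps(template, indent=2))
```

Everything shared (VPC, cluster, ALB, hosted zone) is imported from the common stack, so the app stack only owns the pieces that live and die with the application. This is also why the common stack has to be deployed first.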
In the future we will extract roles and permissions into their own git repository to deploy them in isolation. We ran into a few issues where we weren’t able to roll back successfully after a failed infrastructure deployment, because the permissions got removed before the infrastructure components were removed, and we ended up in a deadlock.
A word on carbon emissions
Data centers take a big share of worldwide energy consumption, currently 2% of the world’s electricity. This puts data centers at the current energy consumption level of the aviation industry, and they are projected to exceed aviation by a factor of 4-5 by 2024. Comparisons of different cloud providers should therefore also include their carbon emissions. For AWS this depends on the region you run in. We run in the 100% sustainable region Frankfurt.
What does your AWS deployment look like? Do you use ECS, EKS, Beanstalk or something else? How do you deploy your infrastructure as code? Through multiple pipelines or just one? How do you solve Service Registry and Discovery?
This blog post is written by Steve Behrendt and is part of the collaboration of Netlight Munich and eeMobility in the emerging field of electric charging. Together, we build a system to enable corporate fleets to use renewable energy and pioneer in the field of smart charging.