title | date | draft |
---|---|---|
IaC (Infrastructure as Complexity?) | 2025-04-24T12:04:03-04:00 | false |
Why do IaC? (Or Infrastructure as Complexity)
This is a question I have thought a lot about lately. I have personally wrestled with it quite a bit as I grow in my career. While I certainly recognize the many benefits of IaC, I was having trouble justifying why a small team or individual should adopt this approach. After all, it does introduce overhead and complications: somewhat specialized workflows, processes and tooling, plus configs that tend to be pretty verbose.
Tech Debt
Most of the above concerns fall under the category of tech debt. For a single individual project where you are maybe only managing a handful of cloud resources, IaC can feel like overkill. It is tempting to just use the CLI tool or even the web console for most tasks. Some platforms make this easier than others. Azure, for example, will create most of the dependent resources for you when you create things through the console. Take a WebApp, for example. A basic WebApp in Azure requires all of the following:
- Resource Group
- App Service plan
- Web App
- Application Insights
- Log Analytics Workspace
Azure will happily create all of those resources for you when you walk through the process in the console. However, if you are working via the CLI or IaC, you are on your own. This can be one of the biggest hurdles to getting started with this approach: knowing all of the required pieces and how to put them together. In my experience, Azure seems to do a better job of this than AWS, but it is still up to you as an engineer to learn and understand the fundamentals.
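To make the "how to put them together" part concrete, here is a minimal Python sketch that treats the five resources above as a dependency graph and works out one valid creation order. The dependency edges are my own assumptions for illustration (for example, that Application Insights is backed by the Log Analytics Workspace), not an official Azure list, and the resource names are just labels. This is roughly the bookkeeping an IaC tool does for you.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each resource maps to the resources it is ASSUMED to depend on.
# These edges are illustrative, not an authoritative Azure dependency list.
DEPENDS_ON = {
    "resource_group": set(),
    "app_service_plan": {"resource_group"},
    "log_analytics_workspace": {"resource_group"},
    "application_insights": {"resource_group", "log_analytics_workspace"},
    "web_app": {"resource_group", "app_service_plan", "application_insights"},
}


def creation_order(depends_on):
    """Return one valid order in which the resources could be created."""
    # static_order() yields every node after all of its dependencies.
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for step, name in enumerate(creation_order(DEPENDS_ON), start=1):
        print(f"{step}. {name}")
```

With these assumed edges, the resource group always comes first and the web app always comes last; the plan and the monitoring pieces can land in either order between them.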
Disaster Recovery
With all of this complexity, IaC can seem daunting, overwhelming and possibly overkill. My thoughts around all of this changed, however, after a conversation with my good friend John Goodwin. We were discussing this exact topic and he brought up a very good, indisputable point. One of the biggest benefits you get out of IaC is that you are essentially generating a blueprint of your infrastructure.
Think about it, this is a pretty common scenario. Your company hires a "DevOps" guy to help set up and maintain all of your infrastructure. You are a small remote team and he mostly works by himself in a closet (figuratively, but sometimes also literally). Let's say he is in this role for several years and your application grows considerably. You started out with a couple of EC2 instances and now you have load balancers, container registries, custom VPCs, subnets, Elastic IPs, databases, etc.
If he were to leave, you would most likely have zero clue what infrastructure you currently have, let alone how to maintain it. Worse, if there were a catastrophic failure or disaster (hackers, ransomware, data center meltdown, cloud provider refuses you service, etc.), you would have zero chance of setting everything back up on your own, or even figuring out what you had set up in the first place. At this point, you would probably reach out to a specialist like myself to help you "recover" everything. I know this because I have seen it many times before.
The problem is that if you are reaching out to me about something like this, it is oftentimes too late and the best we can do is an educated guess on how things may have been set up previously. Unfortunately with the number of dependencies in modern software and infrastructure, there will likely be a lot of pieces we will not be able to put back together properly and I will either advise you to start from scratch (within reason), or do my best to hack and patch things together.
All of this could have been avoided if you had pushed for and implemented IaC early on in the process. Instead of the mess and guess, you would have a (hopefully) up-to-date blueprint of your infrastructure and how to put it back together. Of course, in this scenario the insurance is only worth it if the "DevOps" guy is maintaining it in the first place. If you are a small startup founder, this may be hard to enforce, but for small teams and responsible engineers it should make perfect sense.
Audit and Documentation trails
There are some other serious benefits of adopting this approach, even for small projects. As engineers, we tend to put lots of time and energy into solving problems. We often write code that is complex, or requires several layers of abstraction. Most of us like to be able to reason about our code. If this variable is defined here, where is it being passed, manipulated, etc.? Infrastructure is no different.
Think about how many times you have written a line of code that solved an annoying problem. You spent days on trying to solve this one thing and finally figured it out. Now six months later, there is a bug and you have to go back and try and remember what you did and why. This is why good commit messages and pull requests are so essential to the developer workflow. Not to mention comments and tests.
Let's take these same concepts and apply them to our infrastructure. It is pretty common to go into your cloud console and find resources hanging around with obscure names and no clear understanding of who created them and for what purpose. This is especially true with interdependent resources (VPCs, subnets, firewalls, etc.). Now if, instead of creating resources like this, you are disciplined enough to only create them using GitOps workflows and IaC tools, you can have much more confidence in what you have and why.
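As a rough illustration of that confidence, here is a small Python sketch that compares what your configs declare against what actually exists in the account. Everything in it is hypothetical: the resource names are made up, and in practice the `declared` set would come from your IaC configs and the `live` set from your cloud provider's inventory. The point is only that once a blueprint exists, the "who created this?" question becomes a simple set difference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriftReport:
    missing: frozenset    # declared in the configs but not found in the account
    unmanaged: frozenset  # found in the account but not declared anywhere


def compare(declared, live):
    """Compare declared resource names against the names that actually exist."""
    declared, live = frozenset(declared), frozenset(live)
    return DriftReport(missing=declared - live, unmanaged=live - declared)


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    declared = {"vpc-main", "subnet-public", "web-lb"}
    live = {"vpc-main", "subnet-public", "test-vm-2"}

    report = compare(declared, live)
    print("missing:", sorted(report.missing))
    print("unmanaged:", sorted(report.unmanaged))
```

Anything in `unmanaged` is a candidate for one of those obscurely named leftovers; anything in `missing` is something the blueprint says should exist but does not.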
It provides you with an audit trail, and hopefully you are disciplined enough to comment your configs and provide detailed descriptions of changes in your commits and pull requests. Your pull requests should also be a place where you can discuss these changes. Sure, it may "slow things down" in the short term because it is always easier to just run a shell script, but this approach ultimately helps to solve some of the tribal knowledge issues.
Instead of wondering what scripts Jim is using to manage these pieces of infrastructure, you have a versioned, commented (and hopefully up-to-date) configuration and plan in your source control.