After publishing 50 how-to guides on common DevOps and SecOps tasks, here is a summary of the common themes and best practices across tooling ecosystems.
At Blink, we’re always looking for how a little automation can free up a lot of time for CloudOps teams. Through working with our customers, we’ve heard firsthand about common challenges and chores that require step-by-step manual actions or creative, short-term work-arounds.
A few months ago, we started creating simple how-to guides that outline these tasks and walk through each of the manual steps to solving them.
36K words and 50 posts later, we thought it’d be a good idea to reflect, compare, contrast, and connect the dots. Also, we (and I personally) needed a little break from poring over documentation.
Here are the major takeaways across all of these 50 CloudOps tasks:
To keep true to form, let’s start where most of our how-to guides start.
Regardless of whether we were writing about an AWS, GCP, Azure, or Kubernetes-based task, we almost always had to start by finding something in the cloud stack that needed to be adjusted. It could be that you need to identify orphaned ConfigMaps or unattached Elastic IP addresses, or find resources that are missing mandatory tags. Searchability across your cloud stack is critical, but not always easy.
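As a rough illustration of the "find resources missing mandatory tags" case, here is a minimal sketch in Python. The tag policy, the inventory records, and the `find_missing_tags` helper are all hypothetical; in practice the inventory would come from whichever cloud API or CLI you use.

```python
from typing import Iterable

# Hypothetical policy: the tag keys every resource is expected to carry.
MANDATORY_TAGS = {"owner", "environment", "cost-center"}


def find_missing_tags(
    resources: Iterable[dict], required: set[str] = MANDATORY_TAGS
) -> dict[str, list[str]]:
    """Map each non-compliant resource id to the sorted tag keys it lacks."""
    report = {}
    for resource in resources:
        # Compare tag keys case-insensitively so "Owner" satisfies "owner".
        present = {key.lower() for key in resource.get("tags", {})}
        missing = sorted(required - present)
        if missing:
            report[resource["id"]] = missing
    return report


# Illustrative inventory records, not real resources.
inventory = [
    {"id": "i-0abc", "tags": {"Owner": "data-team", "Environment": "prod", "Cost-Center": "42"}},
    {"id": "vol-0def", "tags": {"owner": "web-team"}},
    {"id": "eipalloc-0123", "tags": {}},
]

missing_report = find_missing_tags(inventory)
for resource_id, missing in missing_report.items():
    print(f"{resource_id} is missing: {', '.join(missing)}")
```

The check itself is simple. The hard part, as described above, is getting a consistent inventory out of each provider in the first place.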
One of the key challenges with running queries is that each tooling ecosystem has its own CLI tool, or API, or Console navigation. If you are using several tools in your day-to-day work, it’s challenging to context-switch and jump in and out of documentation and platforms. That’s especially true if you aren’t an expert in the specific tool or aren’t highly familiar with its querying capabilities.
If it isn’t easy to find what you want to work on, you might decide not to work on it at all, or maybe you’ll do it once and accept that it might not be worth making a part of your routine. But what if you encounter an error with a deployment and you need to find an answer fast? The first step in troubleshooting is getting more information, whether it’s for a Kubernetes pod stuck in a pending state or an issue with your EC2 configuration on a private subnet.
How easy is it for you to search across your cloud stack? And if it were easier, would you get more done? One of the ways that Blink empowers teams is by letting them search across their cloud resources using a SQL query action. With simple queries and no-code/low-code steps, you can create powerful automations fast.
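To give a feel for why a single SQL interface is appealing, here is a small sketch that uses Python's built-in `sqlite3` module as a stand-in. This is not Blink's SQL query action; the `resources` table, its columns, and the sample rows are assumptions made up for the example.

```python
import sqlite3

# In-memory database standing in for a unified cloud inventory (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE resources (provider TEXT, kind TEXT, name TEXT, attached_to TEXT)"
)
conn.executemany(
    "INSERT INTO resources VALUES (?, ?, ?, ?)",
    [
        ("aws", "elastic_ip", "eipalloc-0a1", "i-0abc"),
        ("aws", "elastic_ip", "eipalloc-0b2", None),
        ("gcp", "disk", "scratch-disk-1", None),
        ("azure", "disk", "data-disk-7", "vm-web-01"),
    ],
)

# One query answers "what is unattached?" across all three providers.
unattached = conn.execute(
    "SELECT provider, kind, name FROM resources "
    "WHERE attached_to IS NULL ORDER BY provider, name"
).fetchall()
conn.close()

for provider, kind, name in unattached:
    print(f"{provider}: unattached {kind} {name}")
```

The same question asked through three separate CLIs would need three different commands and three different output formats to parse.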
There are an overwhelming number of ways to configure, deploy, and maintain your cloud stack… and there are new options and tools introduced every day.
For AWS alone, there are over 225 services that might be useful to your team depending on your situation.
If you are starting a project, you have to consider which resources and tools are the best fit for your use case, and what the short-term and long-term impacts of choosing each solution are. How easy would it be to migrate to a different solution later?
With so many options to choose from, you might face decision paralysis or make an imperfect choice just to keep the project moving forward. It’s worth embracing the uncertainty, as long as you can iterate quickly.
When it comes to estimating computing requirements, you might not have enough data to really know how much you’ll need. For example, we recently published how-to guides on resizing low usage instances for each of the major cloud providers. This type of optimization can really only take place after you have the data to act on it. Out of our 50 how-to guides, 19 of them focus on this type of iterative resource optimization.
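A minimal sketch of that "wait for the data, then act" idea might look like the following. The 10% CPU threshold, the minimum sample count, and the `find_low_usage` helper are illustrative assumptions, not values taken from our guides; the utilization samples would normally come from your provider's monitoring service.

```python
from statistics import mean


def find_low_usage(
    cpu_samples: dict[str, list[float]],
    threshold_pct: float = 10.0,  # assumed cutoff for "low usage"
    min_samples: int = 3,         # assumed minimum data before judging
) -> list[str]:
    """Return ids of instances whose average CPU utilization is below the threshold."""
    candidates = []
    for instance_id, samples in cpu_samples.items():
        if len(samples) < min_samples:
            continue  # not enough data yet to make a resizing decision
        if mean(samples) < threshold_pct:
            candidates.append(instance_id)
    return sorted(candidates)


# Illustrative average-CPU samples (percent) per instance.
usage = {
    "i-web": [55.0, 61.5, 48.2],
    "i-batch": [3.1, 4.0, 2.2],
    "i-new": [1.0],  # too new to judge
}

resize_candidates = find_low_usage(usage)
print(resize_candidates)
```

Note that the newly launched instance is skipped rather than flagged, which mirrors the point above: you can only optimize once there is enough data to act on.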
There are other Ops jobs that aren’t as clear-cut as optimizing for computing efficiency. For example, when it comes to user permissions, or resource tagging, or device management, you need to make decisions as an organization and create templated methods so you can scale your approach in a standardized way.
A term like DevSecOps exists because many believe that security is literally in the middle of many DevOps functions. Our guides demonstrate a similar overlap. Of our 50 how-to guides, at least 17 of them contribute to a strong security posture.
For example, when it comes to granting user permissions, how are you adhering to the principle of least privilege? A rigid approach might slow down your development velocity, so do you have an automated approval process that can facilitate changes? Creating an excellent permissions strategy requires a confluence of DevOps and SecOps mentalities.
Establishing and enforcing a device management strategy requires a similar combination: security guidelines that define what success looks like, and automated processes that ensure installation compliance with MDM tools like JumpCloud or Workspace ONE.
If you or team members are creating new resources, are there guardrails in place to make sure that these resources are only accessible by those who need them? On this topic, we published guides on common security checks like ensuring that your RDS instances and snapshots are not publicly accessible, or that your S3 buckets are encrypted properly.
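As one example of such a guardrail, here is a hedged sketch of the RDS public-accessibility check. It works on a dictionary shaped like the response from the boto3 RDS client's `describe_db_instances()` call, using its `DBInstances`, `DBInstanceIdentifier`, and `PubliclyAccessible` fields. The sample response and the helper name are made up for illustration.

```python
def publicly_accessible_instances(response: dict) -> list[str]:
    """Return identifiers of RDS instances flagged as publicly accessible."""
    return sorted(
        db["DBInstanceIdentifier"]
        for db in response.get("DBInstances", [])
        if db.get("PubliclyAccessible", False)
    )


# Hand-written stand-in for a describe_db_instances() response.
sample_response = {
    "DBInstances": [
        {"DBInstanceIdentifier": "orders-db", "PubliclyAccessible": False},
        {"DBInstanceIdentifier": "legacy-reporting", "PubliclyAccessible": True},
    ]
}

exposed = publicly_accessible_instances(sample_response)
print(exposed)
```

In a real check, the dictionary would come from an authenticated API call, and a non-empty result would trigger an alert or a remediation step rather than a print statement.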
Unused resources can also pose a security threat, whether that’s a non-active user account or orphaned Kubernetes secrets. Finding and cleaning up unnecessary resources reduces risk for your organization while also reducing cloud costs. Speaking of which…
Automated vendor billing puts extra responsibility on end-users to act within the limits of what they paid for or to spend more as they need more resources. For many SaaS subscriptions, the limits are easy to understand and changes in spend are handled only by account administrators.
Billing from the major cloud providers is far more complex and granular. It’s common for DevOps practitioners or developers to need to create and scale resources for their own projects. When they do this, they are creating decentralized recurring expenses.
AWS Elastic IP addresses, for example, are billed hourly, so tracking any unused IP addresses becomes a cost management necessity. To optimize your spending, you need to monitor for things like:
For every type of resource you want to create that has a recurring expense, you can either run manual checks on a regular basis and accept some waste here and there, or take the time to instrument automated systems to detect unnecessary expenses in real time. Beyond just detecting waste, many teams choose a more proactive approach by creating approval flows to ensure that the right types of resources are selected, that the resources are necessary, and that the requested resources fit within the allocated budget.
Without automated cost mitigations in place, the time you will need to spend on resource cleanup will scale as much as your projects do.
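To make the Elastic IP case concrete, here is a small sketch of an automated waste check. It assumes a dictionary shaped like the boto3 EC2 client's `describe_addresses()` response, where an address with no `AssociationId` is not attached to anything. The hourly rate is a placeholder rather than a real AWS price, and both helper names are hypothetical.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8760 / 12)


def unattached_addresses(response: dict) -> list[str]:
    """Return allocation ids of Elastic IPs that have no association."""
    return [
        address["AllocationId"]
        for address in response.get("Addresses", [])
        if "AssociationId" not in address
    ]


def estimated_monthly_waste(idle_count: int, hourly_rate: float) -> float:
    """Estimate the monthly cost of idle addresses at the given hourly rate."""
    return round(idle_count * hourly_rate * HOURS_PER_MONTH, 2)


# Hand-written stand-in for a describe_addresses() response.
sample_response = {
    "Addresses": [
        {"AllocationId": "eipalloc-0a1", "PublicIp": "203.0.113.10", "AssociationId": "eipassoc-9f"},
        {"AllocationId": "eipalloc-0b2", "PublicIp": "203.0.113.11"},
        {"AllocationId": "eipalloc-0c3", "PublicIp": "203.0.113.12"},
    ]
}

idle = unattached_addresses(sample_response)
# 0.01 is a placeholder rate; look up the current price for your region.
waste = estimated_monthly_waste(len(idle), hourly_rate=0.01)
print(f"{len(idle)} idle addresses, roughly {waste} per month at the assumed rate")
```

Run on a schedule, a check like this turns a manual cleanup chore into a recurring report, which is what keeps cleanup time from scaling with your projects.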
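So to recap, the main insights we’ve covered in this post so far are that searching across your cloud stack is critical but not always easy, that the sheer number of options makes quick iteration more valuable than a perfect first choice, that security runs through many DevOps functions, and that cloud cost management needs automation to keep up as your projects scale.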
As we mentioned at the beginning, we chose these 50 CloudOps tasks after talking with customers who were trying to solve these problems with new automations.
Some had existing scripts that were hard to maintain and track; some were handling each task manually. They needed a simple way to create smart automations.
Blink is a no-code/low-code automation platform for modern CloudOps teams.
Every how-to guide we have documented represents an example of a specific task that Blink can help solve for your team today. Whether you want to use no-code / low-code steps or scripting, you can create automations in Blink fast and start solving problems at scale. Query across your entire cloud stack and achieve operational excellence.