With the increasing shift to microservices architectures, the ecosystem seems to be in a steady cycle: complexity grows, and a new tool emerges to help us manage that complexity. Each new tool makes it a little easier to scale up and distribute our applications, until things hit critical mass and a new level of intervention is needed to take the pressure off. Before getting to the title topic, a (rather simplified) walk through the recent steps helps set the scene.
Containerisation
While the concept of Linux containers has been around significantly longer, it was really brought to the forefront by the emergence of Docker. And with good reason – containers are useful. They decouple our application from the underlying environment, meaning developers can focus on the logic of their application and, as long as it fits in a container, it’s a simple matter to deploy it wherever it needs to go. Deployment is easy, reliable and consistent, and it’s less resource-intensive than running applications on separate VMs. And because we have this new lightweight environment to run applications in, it becomes much easier to break previously monolithic systems down into smaller, isolated chunks that each spin up in their own container and speak to each other.
But with this power comes the additional burden of managing all the containers we’ve split our application into. We now have the task of tracking which containers go together to make up an application, how many of each we need for high availability and failover, and how we balance traffic among them. We must decide how all our containers speak to each other – and whether they all should be able to. How do we interact with the containers from the outside world and ensure that certain ones stay off-limits? What data needs mounting where? It quickly scales out of control.
Orchestration
And so to the next step of the cycle. We needed a new tool to give us an easier interface to specify what our application should look like and “orchestrate” all the correct containers together in the correct way. There are several tools on the market that address this problem, including Docker Swarm and AWS’s Elastic Container Service (ECS), but Kubernetes has emerged as the victor. Docker now supports Kubernetes natively, and AWS introduced an Elastic Container Service for Kubernetes (EKS). Kubernetes also formed one of the seed technologies for the Cloud Native Computing Foundation (CNCF), a partnership between Google and the Linux Foundation.
And it does just what we wanted. It provides a set of standard Resources that correspond to common configuration patterns. If we need to keep a certain number of instances of a service available, we create a Deployment and Kubernetes manages the replication logic for us. We ask for 3 instances and it makes sure we have our 3 instances. If one goes down, it brings up a new one.
This isn’t a deep dive on Kubernetes, but the idea is that we get a new level of abstraction, making it easier to ask for what we want and outsource some of the deployment logic to Kubernetes. If we want a consistent endpoint to talk to our apps on as they scale and shrink, we create a Service. If we want to expose our application to the world, we create an Ingress. If we want to store sensitive information, we create a Secret.
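For a flavour of what’s happening underneath, here’s a minimal sketch using the official client-go library to ask for that 3-instance Deployment. The app name and image are placeholders, and error handling is kept deliberately crude:

```go
package main

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func int32Ptr(i int32) *int32 { return &i }

func main() {
	// Connect using the local kubeconfig (assumes we're running outside the cluster).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	labels := map[string]string{"app": "demo-app"}
	deployment := &appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{Name: "demo-app"},
		Spec: appsv1.DeploymentSpec{
			// Ask for 3 instances; Kubernetes replaces any that go down.
			Replicas: int32Ptr(3),
			Selector: &metav1.LabelSelector{MatchLabels: labels},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{Labels: labels},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{{
						Name:  "demo-app",
						Image: "nginx:1.25", // placeholder image
					}},
				},
			},
		},
	}

	// Submit the desired state; the Deployment Controller does the rest.
	_, err = clientset.AppsV1().Deployments("default").Create(
		context.TODO(), deployment, metav1.CreateOptions{})
	if err != nil {
		panic(err)
	}
}
```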
But complexity waits for no one, and soon we end up with the same problem of a whole set of Kubernetes Resources to manage ourselves.
Packaging
Now, to help manage this growing collection of Kubernetes Resources, we need a new abstraction to package them together as a single application. At the time of writing, our package manager of choice is Helm. Helm provides a means to describe all the various Kubernetes Resources as part of a single “Chart” and to configure all the values we are interested in from a single configuration file. Once we’ve created a Chart, it’s simple to install, upgrade or delete it with a single action, rather than applying/updating/deleting many Resources individually.
So now, if we have any issue with our application, we don’t have to inspect each individual Resource that could be causing the problem. Helm can check the status of a Chart, giving us feedback on all the Resources within that Chart. And if we need to update Resources within the Chart, it’s a case of updating the relevant values in the configuration file and performing a helm upgrade, rather than updating the relevant information in individual Resources.
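Those single actions are usually run from the helm command line, but Helm 3 also exposes them as a Go library. Here’s a minimal sketch of a single install action; the chart path, release name and values are placeholders:

```go
package main

import (
	"log"

	"helm.sh/helm/v3/pkg/action"
	"helm.sh/helm/v3/pkg/chart/loader"
	"helm.sh/helm/v3/pkg/cli"
)

func main() {
	settings := cli.New()

	// Point the Helm actions at the current cluster, storing release
	// state in Secrets (Helm 3's default driver).
	cfg := new(action.Configuration)
	if err := cfg.Init(settings.RESTClientGetter(), settings.Namespace(), "secret", log.Printf); err != nil {
		log.Fatal(err)
	}

	// Load the Chart from disk ("./my-app" is a placeholder path).
	chart, err := loader.Load("./my-app")
	if err != nil {
		log.Fatal(err)
	}

	// Install the whole application in one action; the values map plays
	// the role of overrides to the Chart's values file.
	install := action.NewInstall(cfg)
	install.ReleaseName = "my-app"
	install.Namespace = settings.Namespace()

	release, err := install.Run(chart, map[string]interface{}{
		"replicaCount": 3,
	})
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("installed release %q (revision %d)", release.Name, release.Version)
}
```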
From here, it’s not necessarily an increase in the volume of Resources, but an increase in the complexity of the logic behind our applications themselves, that prompts a potential new layer of abstraction.
Kubernetes Operators
We’re not necessarily at the next great leap in the chain, but there are potential pain points with the current tools for managing our applications, and Kubernetes Operators may be a solution.
What are Operators?
Kubernetes Operators allow us to extend the Kubernetes API and declare our application as a native Kubernetes Resource. Hopefully that sounds quite powerful; if it doesn’t, rest assured it is.
They’re based on two Kubernetes concepts, Resources and Controllers. Resources are what we interact with when we specify the desired state of our application. Pods, Deployments and Services are all standard Resources that Kubernetes knows how to manage. Controllers are the part that does that management. Each Resource has a corresponding Controller that listens for changes to its Resources and performs the logic to reconcile the actual state with the desired state.
Operators allow us to specify our own Custom Resources and then implement our Custom Controller to perform the logic to manage that resource. So our app can become a truly native Resource in our cluster.
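To make that concrete, here’s a heavily trimmed sketch of a Custom Resource and its Controller, built on the controller-runtime library that most Operator tooling uses. The MyApp type and its fields are hypothetical, and a real project would generate the deep-copy code and register the type with the API machinery rather than hand-rolling it:

```go
package example

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// MyApp is a hypothetical Custom Resource. The Spec holds the desired
// state a user asks for; the Status records what the Controller observes.
type MyApp struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   MyAppSpec   `json:"spec,omitempty"`
	Status MyAppStatus `json:"status,omitempty"`
}

type MyAppSpec struct {
	Size int32 `json:"size"` // how many instances the user wants
}

type MyAppStatus struct {
	ReadyInstances int32 `json:"readyInstances"` // how many are actually up
}

// DeepCopyObject is normally generated by controller-gen; a shallow copy
// is enough for this sketch.
func (a *MyApp) DeepCopyObject() runtime.Object {
	out := *a
	return &out
}

// MyAppReconciler is the Custom Controller. It is invoked whenever a MyApp
// resource changes and reconciles actual state with the desired state.
type MyAppReconciler struct {
	client.Client
}

func (r *MyAppReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	var app MyApp
	if err := r.Get(ctx, req.NamespacedName, &app); err != nil {
		// The resource may have been deleted since the event fired.
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// Application-specific logic goes here: compare app.Spec against what
	// actually exists (Deployments, external platform resources, ...) and
	// create, update or delete things until they converge.

	return ctrl.Result{}, nil
}
```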
Operator Framework
Operators are such a good idea that they have their own framework in development. The Operator Framework gives us a more standardised format for developing Operators, as well as tooling to generate the skeleton for our Custom Resources and Custom Controllers and plumb them together in the Kubernetes environment. This lets developers focus on capturing the application-specific knowledge for controlling their application, rather than on managing the Kubernetes APIs. It even gives us some more tools for managing and monitoring the lifecycle of Operators on our cluster. And while there are admittedly some sharp edges, as with any technology that has yet to hit a major release, it certainly shows the community drive behind Operators.
Why are we interested?
Operators allow us to embed the application-specific knowledge for managing our application into the cluster. Kubernetes’ standard Resources are fine for simple, stateless applications, but once the interactions between components get more complicated, state changes may not be as simple as adding 1 pod to the existing 2 to get 3. Does the new pod need registering into a cluster? Does the load need redistributing across the new node? Does the new pod need to trigger other interactions, or have a certificate generated, before it can interact? What if part of our application isn’t a Kubernetes Resource at all, but an external platform Resource? Operators let us embed the operational logic to manage all of this, which would normally require human input, directly into the cluster.
We can also abstract as much complexity as we want away from the end user. If we want the user to only worry about giving their instance of our application a name, that’s all we expose in our Custom Resource, and the Controller manages the rest. They don’t have to worry about the internals, or even the platform, if we don’t want them to. The Controller can handle as much or as little of the decision making as we need. That’s the key difference over just packaging with Helm: the Controller can decide how to configure the application. With Helm, we can set good, but static, configuration behind the values file; if we move our application to a different environment, we have to update all of that configuration manually with the specific details of the new environment. In a Controller, we can be smarter and adapt to the deployment situation automatically.
For example, if a developer has our application deployed in AWS using Helm and decides to change provider to Azure, they have to manually change all the configuration to the Azure way of doing things. On the other hand, if we’ve written an intelligent Operator that supports multiple providers, redeploying the application could be a simple case of changing the provider name. We could even write the Controller to detect the platform it is deployed on and configure the application accordingly.
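As a sketch of how that detection might work: cloud integrations record a provider-prefixed identifier in each node’s spec.providerID, so a Controller could infer the platform from that prefix. One assumption worth flagging: on bare metal the field may simply be empty.

```go
package controllers

import (
	"context"
	"strings"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// detectProvider guesses the platform by looking at the provider prefix in
// a node's spec.providerID, e.g. "aws:///eu-west-1a/i-0abc123".
// It returns "" if no node reveals a provider (e.g. on bare metal).
func detectProvider(ctx context.Context, c client.Client) string {
	var nodes corev1.NodeList
	if err := c.List(ctx, &nodes); err != nil {
		return ""
	}
	for _, node := range nodes.Items {
		if i := strings.Index(node.Spec.ProviderID, "://"); i > 0 {
			return node.Spec.ProviderID[:i] // "aws", "azure", "gce", ...
		}
	}
	return ""
}
```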
This works in multiple ways: we can make things simpler for the user, and we can make them safer. If a piece of Resource configuration is potentially dangerous, we just don’t expose it, and there’s no danger of someone accidentally or intentionally configuring the application badly. Or we can expose different options to different users. For instance, a configuration Resource for a cluster administrator to specify a good set of properties, and an instance Resource for a developer to just say “I’ll have one of those”, as sketched below. These can both be handled by the same Controller or by different ones.
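Here’s a rough sketch of that split. All the type and field names are hypothetical, and the deep-copy and registration boilerplate a real project needs is omitted:

```go
package example

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// AppConfig is the administrator-facing Resource: a vetted set of
// properties describing how instances should be built in this cluster.
type AppConfig struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec AppConfigSpec `json:"spec,omitempty"`
}

type AppConfigSpec struct {
	Image        string `json:"image"`        // which build developers get
	StorageClass string `json:"storageClass"` // where data should live
	Memory       string `json:"memory"`       // sane resource defaults
}

// AppInstance is the developer-facing Resource: "I'll have one of those".
type AppInstance struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec AppInstanceSpec `json:"spec,omitempty"`
}

// AppInstanceSpec is deliberately empty: the instance's name already comes
// from the standard object metadata, and the Controller fills in everything
// else from the cluster's AppConfig.
type AppInstanceSpec struct{}
```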
And finally, we can do all of this through the standard Kubernetes APIs, using kubectl, like any other Kubernetes Resource.
Why would we not be interested?
With all these very useful qualities, why would we not want to use Operators? For starters, the framework is still quite young. As already mentioned, there are some sharp edges, and if there’s no pressing need for automated logic, it may be worth waiting until things are more stable.
Another consideration is how complicated the operational logic for an application actually is. For a simple application, an Operator might be overkill. Kubernetes is already quite good at handling our applications, and in most cases it’s doing just fine. If the logic for managing an application isn’t that complex, an Operator wouldn’t give much benefit beyond allowing us to refer to our application by name.
On the other end of the spectrum, it can be hard to encode the operational logic for more complex applications. Particularly with the current framework, condensing the logic to manage a resource into a single reconcile function is no simple task.
Parting thoughts
While they may not be the next revolution in application infrastructure (yet), Operators provide an interesting new way to manage our applications and offload more complexity. This leaves users free to worry about using applications, rather than deploying and managing them.