Kubernetes is an open-source project by Google that’s been around since 2014. It’s a portable, open-source platform built for managing containerized workloads and services. It was created with automation in mind, and it’s also known for its extensive support tools.
Since it’s an introduction, Kubernetes has become one of the most popular ways to create a scalable platform. Development projects today need to be created with tomorrow in mind. What’s normal today will likely be obsolete in as soon as a few months, so we need new ways to keep moving forward.
Because kubernetes is container-centric, it’s flexible and easy to build complex stacks. However, one of the main prospects of container technology is running these containers on a serverless infrastructure or cloud infrastructure. When Kubernetes is serverless, log collection becomes more complex.
The logs on a single node will increase because there’s a larger number of pods running on a node with the potential for a metric collection requirement. Similarly, various pod types running on a single node become diversified which calls for management of diversified logs. Let’s talk further about the most common trends and challenges of kubernetes log processing.
When workloads run in shared worker VMs, those from different projects are divided by namespaces. Because different projects might have their own unique logging preferences, there needs to be a new way to configure these without compromising on security.
One option is to use Kubernetes CRD (Custom Resource Definition). This is done using the kubectl command which you can find outlined in the kubectl cheatsheet. Then, role-based access control (RBAC) can be applied to this resource to protect security measures. You might also be familiar with this process in PKS as a sink resource, though this name is still catching on in the world of Kubernetes.
There’s only one pod per Kubernetes worker node, and if this pod is rescheduled, it has an effect on all of the other Pods in the worker node. This presents a challenge. Each node can run up to 100 pods, so you need to find a way to make sure your log agency can collect logs from all of these pods.
This frequently creates a noisy environment. One error might lead to more errors in the same worker node. This is why it’s essential for you to have a fast disk and to work on back-pressure issues diligently to avoid unwanted problems.
We have logs for pods, K8s, and platforms. There are often add-ons for every standard workload logs. Different logs come with different characteristics and their own priorities. There are so many layers to this container system that it’s hard to handle all of them when they’re logged together.
Unfortunately, there’s still no clear solution for handling these different layers. These different characteristics are still compounding, and we still need a security solution that addresses the root cause.
Finally, pods are likely to be deleted or recreated quickly if something is wrong. The log file will likely go as well in this situation. Failing to collect all the critical logs when something goes wrong will slow down your ability to solve the problem. This is a challenge that still needs to be solved by the Kubernetes community as a whole, but that doesn’t mean there aren’t ways around it.
While you might use Heroku logs by Loggly for Heroku, for Kubernetes you can use a log agent like Fluentd. This agent succeeds around this problem because it scans new folders or log patterns at a regular interval (like every 60 seconds) to capture even short-lived pods. You can even lower this to a 1-second interval to create a higher performance.
Kubernetes has come a long way, but it also has a long way to go. As you can see from the challenges above, there are some logging issues that still need to be worked out. However, the Kubernetes community is active and utilizing DevOps to come to fast solutions.
Is Kubernetes right for your project? What are your biggest challenges?