This article is part of a series on KubeCon Europe 2018.

KubeCon EU "Security is hard" is a tautology, especially in the fast-moving world of container orchestration. We have previously covered various aspects of Linux container security through, for example, the Clear Containers implementation or the broader question of Kubernetes and security, but those are mostly concerned with container isolation; they do not address the question of trusting a container's contents. What is a container running? Who built it and when? Even assuming we have good programmers and solid isolation layers, propagating that good code around a Kubernetes cluster and making strong assertions on the integrity of that supply chain is far from trivial. The 2018 KubeCon + CloudNativeCon Europe event featured some projects that could eventually solve that problem.

Image provenance

Adrian Mouat

A first talk, by Adrian Mouat, provided a good introduction to the broader question of "establishing image provenance and security in Kubernetes" (video, slides [PDF]). Mouat compared software to food you get from the supermarket: "you can actually tell quite a lot about the product; you can tell the ingredients, where it came from, when it was packaged, how long it's good for". He explained that "all livestock in Europe have an animal passport so we can track its movement throughout Europe and beyond". That "requires a lot of work, and time, and money, but we decided that this is was worthwhile doing so that we know [our food is] safe to eat. Can we say the same thing about the software running in our data centers?" This is especially a problem in complex systems like Kubernetes; containers have inherent security and licensing concerns, as we have recently discussed.

You should be able to easily tell what is in a container: what software it runs, where it came from, how it was created, and if it has any known security issues, he said. Mouat also expects those properties to be provable and verifiable with strong cryptographic assertions. Kubernetes can make this difficult. Mouat gave a demonstration of how, by default, the orchestration framework will allow different versions of the same container to run in parallel. In his scenario, this is because the default image pull policy (ifNotPresent) might pull a new version on some nodes and not others. This problem arises because of an inconsistency between the way Docker and Kubernetes treat image tags; the former as mutable and the latter as immutable. Mouat said that "the default semantics for pulling images in Kubernetes are confusing and dangerous." The solution here is to deploy only images with tags that refer to a unique version of a container, for example by embedding a Git hash or unique version number in the image tag. Obviously, changing the policy to AlwaysPullImages will also help in solving the particular issue he demonstrated, but will create more image churn in the cluster.

But that's only a small part of the problem; even if Kubernetes actually runs the correct image, how can you tell what is actually in that image? In theory, this should be easy. Docker seems like the perfect tool to create deterministic images that consist exactly of what you asked for: a clean and controlled, isolated environment. Unfortunately, containers are far from reproducible and the problem begins on the very first line of a Dockerfile. Mouat gave the example of a FROM debian line, which can mean different things at different times. It should normally refer to Debian "stable", but that's actually a moving target; Debian makes new stable releases once in a while, and there are regular security updates. So what first looks like a static target is actually moving. Many Dockerfiles will happily fetch random source code and binaries from the network. Mouat encouraged people to at least checksum the downloaded content to prevent basic attacks and problems.

Unfortunately, all this still doesn't get us reproducible builds since container images include file timestamps, build identifiers, and image creation time that will vary between builds, making container images hard to verify through bit-wise comparison or checksums. One solution there is to use alternative build tools like Bazel that allow you to build reproducible images. Mouat also added that there is "tension between reproducibility and keeping stuff up to date" because using hashes in manifests will make updates harder to deploy. By using FROM debian, you automatically get updates when you rebuild that container. Using FROM debian:stretch-20180426 will get you a more reproducible container, but you'll need to change your manifest regularly to follow security updates. Once we know what is in our container, there is at least a standard in the form of the OCI specification that allows attaching annotations to document the contents of containers.

Another problem is making sure containers are up to date, a "weirdly hard" question to answer according to Mouat: "why can't I ask my registry [if] there is new version of [a] tag, but as far as I know, there's no way you can do that." Mouat literally hand-waved at a slide showing various projects designed to scan container images for known vulnerabilities, introducing Aqua, Clair, NeuVector, and Twistlock. Mouat said we need a more "holistic" solution than the current whack-a-mole approach. His company is working on such a product called Trow, but not much information about it was available at the time of writing.

The long tail of the supply chain

Verifying container images is exactly the kind of problem Notary is designed to solve. Notary is a server "that allows anyone to have trust over arbitrary collections of data". In practice, that can be used by the Docker daemon as an additional check before fetching images from the registry. This allows operators to approve images with cryptographic signatures before they get deployed in the cluster.

Notary implements The Update Framework (TUF), a specification covering the nitty-gritty details of signatures, key rotation, and delegation. It keeps signed hashes of container images that can be used for verification; it can be deployed by enabling Docker's "content Trust" in any Docker daemon, or by configuring a custom admission controller with a web hook in Kubernetes. In another talk (slides [PDF], video) Liam White and Michael Hough covered the basics of Notary's design and how it interacts with Docker. They also introduced Porteiris as an admission controller hook that can implement a policy like "allow any image from the LWN Docker registry as long as it's signed by your favorite editor". Policies can be scoped by namespace as well, which can be useful in multi-tenant clusters. The downside of Porteris is that it supports only IBM Cloud Notary servers because the images need to be explicitly mapped between the Notary server and the registry. The IBM team knows only about how to map its own images but the speakers said they were open to contributions there.

Lukas Puehringer

A limitation of Notary is that it looks only at the last step of the build chain; in itself, it provides no guarantees on where the image comes from, how the image was built, or what it's made of. In yet another talk (slides [PDF] video), Wendy Dembowski and Lukas Puehringer introduced a possible solution to that problem: two projects that work hand-in-hand to provide end-to-end verification of the complete container supply chain. Puehringer first introduced the in-toto project as a tool to authenticate the integrity of individual build steps: code signing, continuous integration (CI), and deployment. It provides a specification for "open and extensible" metadata that certifies how each step was performed and the resulting artifacts. This could be, at the source step, as simple as a Git commit hash or, at the CI step, a build log and artifact checksums. All steps are "chained" as well, so that you can track which commit triggered the deployment of a specific image. The metadata is cryptographically signed by role keys to provide strong attestations as to the provenance and integrity of each step. The in-toto project is supervised by Justin Cappos, who also works on TUF, so it shares some of its security properties and integrates well with the framework. Each step in the build chain has its own public/private key pair, with support for role delegation and rotation.

Wendy Dembowski

In-toto is a generic framework allowing a complete supply chain verification by providing "attestations" that a given artifact was created by the right person using the right source. But it does not necessarily provide the hooks to do those checks in Kubernetes itself. This is where Grafeas comes in, by providing a global API to read and store metadata. That can be package versions, vulnerabilities, license or vulnerability scans, builds, images, deployments, and attestations such as those provided by in-toto. All of those can then be used by the Kubernetes admission controller to establish a policy that regulates image deployments. Dembowski referred to this tutorial by Kelsey Hightower as an example configuration to integrate Grafeas in your cluster. According to Puehringer: "It seems natural to marry the two projects together because Grafeas provides a very well-defined API where you can push metadata into, or query from, and is well integrated in the cloud ecosystem, and in-toto provides all the steps in the chain."

Dembowski said that Grafeas is already in use at Google and it has been found useful to keep track of metadata about containers. Grafeas can keep track of what each container is running, who built it, when (sometimes vulnerable) code was deployed, and make sure developers do not ship containers built on untrusted development machines. This can be useful when a new vulnerability comes out and administrators scramble to figure out if or where affected code is deployed.

Puehringer explained that in-toto's reference implementation is complete and he is working with various Linux distributions to get them to use link metadata to have their package managers perform similar verification.

Conclusion

The question of container trust hardly seems resolved at all; the available solutions are complex and would be difficult to deploy for Kubernetes rookies like me. However, it seems that Kubernetes could make small improvements to improve security and auditability, the first of which is probably setting the image pull policy to a more reasonable default. In his talk, Mouat also said it should be easier to make Kubernetes fetch images only from a trusted registry instead of allowing any arbitrary registry by default.

Beyond that, cluster operators wishing to have better control over their deployments should start looking into setting up Notary with an admission controller, maybe Portieris if they can figure out how to make it play with their own Notary servers. Considering the apparent complexity of Grafeas and in-toto, I would assume that those would probably be reserved only to larger "enterprise" deployments but who knows; Kubernetes may be complex enough as it is that people won't mind adding a service or two in there to improve its security. Keep in mind that complexity is an enemy of security, so operators should be careful when deploying solutions unless they have a good grasp of the trade-offs involved.


This article first appeared in the Linux Weekly News.

Created . Edited .