Adopting cloud-native development has become synonymous with consuming public content. To be productive, and not “reinvent the wheel”, you likely take advantage of one of the many CNCF Projects. You may build from source, simply install a package, or docker build FROM <insertBaseImage>. All of these methods assist in keeping up with the constant demand for new capabilities. But do you depend on constant access to, and the reliability and security of, these public locations from your build or production environments? Are you really comfortable being dependent upon a public endpoint out of your control? As users continue locking down their build and production environments, do you even have access to the public endpoint? You likely shouldn’t. As you promote the content into another environment, how do you know it’s still the artifact you initially started with?
The best practice involves promoting the content you depend upon into an environment you trust, control and can rely upon. See the OCI Blog: Consuming Public Content for more details.
But, what about those package references? Are they still valid within your locked down environment? Most package managers support reconfiguring the registry they depend upon. I’ve been backlogging some designs and conversations around making registries a configuration, rather than embedding the registry within the OCI Artifact reference. See Is It Time to Change How We Reference Container Images? for more info. In all of these package and registry scenarios, we have some options, although they’re not all pretty.
Supply Chain References, How to Identify the Reference
What about SBoMs, signatures, scan results, and other types that need to declare a reference to another artifact? The ORAS Artifacts Spec provides a means to define a graph of objects, where the container image, SBoM, Security Scan Result, and all the associated signatures are stored as a linked graph within an OCI Distribution based registry. To do this effectively, registries will need to extend their OCI Distribution based registry to support the Artifacts Spec, but that’s not really the point of this article. The point here is how SBoMs refer to container images, helm charts and other types when they’re not stored within the same registry, or not stored in an OCI Distribution based registry at all. And, within the SBoM, what identifier should it use to identify the very specific container image, or other package type, the SBoM was generated for? If the reference is something that can be updated, it isn’t a reliable, unique identifier. For instance, if an SBoM references the
debian:11 image by the tag, an update can be pushed to the same tag with security fixes. How would the SBoM reflect what “fixes” were put into the
debian:11 image? The unique identifier for the
debian:11 image is the digest of the image, which at the time of this post was
Ok, great, so we could embed the
<artifact>@<digest> in the SBoM reference, right? If you actually use that format, and remove the registry name, that would work, as the digest does uniquely identify the debian container image, or any OCI Artifact stored in a registry.
If you follow the best practices, and promote that image into your private registry:
reg.acme-rockets.io/public/debian, you can still reference the debian image, as the digest will remain the same in the private registry. If you’re not using ORAS Artifacts to link the SBoM to the debian image within the registry, you just need to provide the location for where the debian image is. As long as the digest is the same, you can be confident the SBoM reference is accurate.
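To make the tag-versus-digest point concrete, here is a minimal sketch in Python. The manifest bytes and the in-memory "registry" dictionary are hypothetical stand-ins, not real registry content. The only assumption it leans on is that a digest is a hash (SHA-256 here) of the exact stored bytes.

```python
import hashlib


def oci_digest(content: bytes) -> str:
    """Return a digest string in the 'sha256:<hex>' form, computed over the bytes."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


# Hypothetical stand-ins for manifest bytes (not real debian manifests).
original = b'{"layers": ["base"]}'
patched = b'{"layers": ["base", "security-fix"]}'

# A tag is a mutable pointer from a name to a digest.
public_tags = {"debian:11": oci_digest(original)}
sbom_subject = public_tags["debian:11"]  # the digest an SBoM would record

# Promoting the same bytes into a private registry leaves the digest unchanged.
private_copy = bytes(original)
promoted_digest = oci_digest(private_copy)

# Later, an update with security fixes is pushed to the same public tag.
public_tags["debian:11"] = oci_digest(patched)

print(sbom_subject == promoted_digest)           # True
print(sbom_subject == public_tags["debian:11"])  # False
```

The SBoM's recorded digest still matches the promoted copy, while the tag has quietly moved on to different content.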
The key is decoupling location from identity. The identity of the debian container image was the <artifact>@<digest> reference. The location was once docker.io/library, and now it’s reg.acme-rockets.io/public/. The identity is decoupled from the location.
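One way to picture the split is a small helper that separates a fully qualified reference into its location and its identity. This is only a sketch: the `Reference` type, the `split_reference` function, and the all-zero placeholder digest are made up for illustration, and real reference parsing has more cases than this handles.

```python
from typing import NamedTuple


class Reference(NamedTuple):
    location: str  # registry and namespace; may be empty
    identity: str  # <artifact>@<digest>


def split_reference(ref: str) -> Reference:
    """Split 'registry/namespace/name@digest' into location and identity."""
    path, sep, digest = ref.partition("@")
    if not sep or not digest:
        raise ValueError(f"reference is not pinned by digest: {ref!r}")
    location, _, name = path.rpartition("/")
    return Reference(location, f"{name}@{digest}")


# Placeholder digest for illustration only; not the real debian digest.
DIGEST = "sha256:" + "0" * 64

public = split_reference(f"docker.io/library/debian@{DIGEST}")
private = split_reference(f"reg.acme-rockets.io/public/debian@{DIGEST}")

print(public.location)                      # docker.io/library
print(private.location)                     # reg.acme-rockets.io/public
print(public.identity == private.identity)  # True
```

The two references differ only in the part an SBoM doesn't need to record.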
Decoupled Private Content
The images and other artifacts you build internally, perhaps based on the debian image, wasm, or helm charts, would also have references that decouple the location from the identity. You may use a dev-reg.acme-rockets.io registry for development, and a prod-reg.acme-rockets.io registry for your VNet isolated production environment. As you promote the content between those registries, you want to ensure the SBoMs, signatures and other reference types can still be resolved.
Location Hints, to Where
What if you really, really, really want to include the location for where the reference could be found? Assume you’re using a mirror for the VNet isolated environments that have no public access. This may work for the content we think of today, as all container images are hosted on Docker Hub, right? Well, no. Docker Hub does have a vast array of container images. However, over the last several years, product registries have started to host their own content. All Microsoft cloud native software is distributed via mcr.microsoft.com. The catalog of content is syndicated to Docker Hub, allowing a common browsing experience. Other software companies host their own registries as well, including NVIDIA, Oracle and Red Hat. To build upon the aspnet image, the Dockerfile would have FROM mcr.microsoft.com/dotnet/aspnet:5.0. In the SBoM, you would identify the image as aspnet@<digest>. You may even want to store a hint to the registry, mcr.microsoft.com/dotnet/. But location will get more complicated, and not just for the product registries.
Public Images From Multiple Registries
As users recognize the need to have registry content close to their deployment, the content currently located in Docker Hub is being replicated to other registries. The exact same debian image can be pulled from Docker Hub or ECR Public, with others “coming soon”.
This will evolve further over the next year or so. This isn’t just for the Docker Hub base images. Hippie Hacker has been working on a way to distribute the k8s images to multiple locale-dependent registries.
If you own publishing the debian image, and you publish it to n registries, or it gets syndicated to a dozen registries, what would the registry URL hint be?
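An alternative to baking a URL hint into the reference is to treat location as configuration: each environment carries its own ordered list of registries, and the identity is looked up against that list. The sketch below assumes exactly that. The `resolve` function, the in-memory `REGISTRIES` table, the mirror.example.com name, and the placeholder digest are all hypothetical, and no network calls are made.

```python
from typing import Dict, List, Optional, Set

# Placeholder identity for illustration; the digest is not a real one.
IDENTITY = "debian@sha256:" + "0" * 64

# Hypothetical in-memory stand-ins for registries: location -> identities held.
REGISTRIES: Dict[str, Set[str]] = {
    "docker.io/library": {IDENTITY},
    "mirror.example.com/library": {IDENTITY},
    "reg.acme-rockets.io/public": {IDENTITY},
}


def resolve(identity: str, search_order: List[str],
            registries: Dict[str, Set[str]]) -> Optional[str]:
    """Return the first configured location that holds the identity, if any."""
    for location in search_order:
        if identity in registries.get(location, set()):
            return f"{location}/{identity}"
    return None


# A developer laptop may be allowed to reach public registries.
laptop = resolve(IDENTITY, ["mirror.example.com/library", "docker.io/library"], REGISTRIES)

# A locked-down environment is configured with the private registry only.
production = resolve(IDENTITY, ["reg.acme-rockets.io/public"], REGISTRIES)

print(laptop.startswith("mirror.example.com/library/"))      # True
print(production.startswith("reg.acme-rockets.io/public/"))  # True
```

The identity is the same in both calls. Only the environment's configuration decides where it is fetched from, so it doesn't matter how many registries the publisher syndicates to.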
Separating Uniqueness from Identity with Signing
Now we understand separating the location from the identity. But, what does identity really mean? The digest is the unique identity of an artifact stored in an OCI Distribution based registry. However, do you want to know who created the debian image? If you can’t track this back to a single registry, and know the owner that created it, and every build is yet another digest, how do you know who built the debian image, or the SBoM for the debian image?
This is where signing comes into play. If you right-click on a dll in Windows, you can see the critical components are signed. If the file is copied from one location to another, even through a USB stick, the file is still signed and verifiable.
The unique identity has an association to Microsoft. This association of the unique identity of the artifact (a dll file in this case) is attested to by the Microsoft signature. The trust of the signature and the artifact can be verified independently of the location, whether that is the c:\windows\system32 directory, a USB stick or another computer. The same can be done for cloud native artifacts stored in registries.
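The location independence of a signature can be sketched in a few lines. Be aware of a loud assumption: to stay within the Python standard library, this uses a shared-secret HMAC as a stand-in for a real publisher signature. Actual signing, including what Notary v2 does, uses asymmetric keys and certificates, so read this as an illustration of the idea rather than how signing is implemented. The key, artifact bytes, and function names are all hypothetical.

```python
import hashlib
import hmac

# ASSUMPTION: a shared-secret HMAC stands in for a real asymmetric signature.
PUBLISHER_KEY = b"hypothetical-publisher-key"


def digest_of(content: bytes) -> str:
    return "sha256:" + hashlib.sha256(content).hexdigest()


def sign(digest: str, key: bytes) -> str:
    """Produce a keyed hash over the digest; the location is never an input."""
    return hmac.new(key, digest.encode(), hashlib.sha256).hexdigest()


def verify(content: bytes, signature: str, key: bytes) -> bool:
    """Recompute the digest from the bytes in hand and check the signature."""
    expected = sign(digest_of(content), key)
    return hmac.compare_digest(expected, signature)


artifact = b"hypothetical artifact bytes"
signature = sign(digest_of(artifact), PUBLISHER_KEY)

# The same bytes and signature, held at two different locations.
copies = {
    "docker.io/library": artifact,
    "reg.acme-rockets.io/public": artifact,
}
for location, content in copies.items():
    print(location, verify(content, signature, PUBLISHER_KEY))  # True for both

print(verify(artifact + b" tampered", signature, PUBLISHER_KEY))  # False
```

Because the signature covers only the digest, copying the artifact somewhere else changes nothing about whether it verifies.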
Spin Off Innovations – Artifact References
While developing Notary v2, we identified the need to associate signatures with container images, and other OCI Artifacts stored in OCI distribution based registries. We realized that adding signatures could be done through references, which evolved into the ORAS Artifacts Spec. Through artifact references, you can associate signatures, SBoMs, Security Scan Results, and all sorts of other references you haven’t yet thought of. Best of all, these graphs of content will be capable of being copied within and across registries that support the artifacts-spec. Just as you can copy a file through a USB stick to another machine, while still verifying the signature, you’ll also be able to copy a rich graph of reference types.
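To illustrate what copying a graph of references might look like, here is a toy in-memory model. It is not the ORAS Artifacts API: the `Registry` class, the `copy_graph` function, and the short placeholder digests are all invented for this sketch. The one idea it borrows from the text is that each reference artifact points at its subject by digest, so the links survive a move between registries.

```python
from collections import defaultdict


class Registry:
    """Toy registry: stores artifacts by digest and tracks what refers to what."""

    def __init__(self, name):
        self.name = name
        self.blobs = {}                    # digest -> artifact type
        self.referrers = defaultdict(set)  # subject digest -> referring digests

    def push(self, digest, artifact_type, subject=None):
        self.blobs[digest] = artifact_type
        if subject is not None:
            self.referrers[subject].add(digest)


def copy_graph(src, dst, digest, subject=None):
    """Copy an artifact and, recursively, everything that refers to it."""
    dst.push(digest, src.blobs[digest], subject)
    for child in sorted(src.referrers.get(digest, ())):
        copy_graph(src, dst, child, subject=digest)


# Placeholder digests for illustration only.
dev = Registry("dev-reg.acme-rockets.io")
dev.push("sha256:image", "container image")
dev.push("sha256:sbom", "SBoM", subject="sha256:image")
dev.push("sha256:sig-image", "signature", subject="sha256:image")
dev.push("sha256:sig-sbom", "signature", subject="sha256:sbom")

prod = Registry("prod-reg.acme-rockets.io")
copy_graph(dev, prod, "sha256:image")

print(sorted(prod.blobs))
# ['sha256:image', 'sha256:sbom', 'sha256:sig-image', 'sha256:sig-sbom']
print(prod.referrers == dev.referrers)  # True
```

Nothing in the graph mentions dev-reg or prod-reg, which is why the same links hold after promotion.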
The adoption of cloud native development, where developers can benefit from public content, does not mean we must depend on where that content was acquired. As a best practice, users should promote the content they depend upon into environments they trust, control and rely upon. This means promoting public content to private registries, and promoting dev to production.
To scale, the same public content will be available from multiple locations. It may be a China hosted registry, compared to a US or EU hosted registry. Or, it may be regional cloud located syndications of the public content.
The current or past location of content should not impact the unique identity. Identity and location are and must be separable concepts. A unique identity can be associated with an entity you trust through that entity’s signature.
Just don’t embed location into the references, as you only need to know the unique id to ensure the association. Location is just a relative point in the space-time continuum.