Is It Time to Change How We Reference Container Images?

Most package managers let you reference a thing without also stating where to get that thing. The where is a separate configuration.

Let’s say you want to install react. npm install react will pull from npmjs.com.

Now let’s say you don’t want your development team to depend on the public registry. For reliability and security reasons, you want all packages to be maintained within your company, Acme Rockets. Or you have a bunch of private packages that you don’t want to be public.

By reconfiguring the default registry:

npm config set registry https://npm.acme-rockets.io/

the same npm install react now comes from your private registry.

So, why do we bake in registry names to our image deployment manifests?

kube-deploy.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: registry.acme-rockets.io/nginx:1.14.2
        ports:
        - containerPort: 80

Docker’s Progressive Disclosure Design

Depending on the example you look at, you may not see a registry name. In the below example, nginx is referenced, but no registry name is provided.

    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2

Docker has been highly successful because of its developer focus. It makes the simple things simple, while enabling the harder elements by progressively adding parameters.

This makes running hello-world as simple as:

docker run hello-world

Notice, there’s no tag, not even the dreaded :latest. :latest is assumed when no tag is provided. Likewise, you don’t even think about a registry when running hello-world. The image magically appears if it isn’t already on the local machine, which is another great aspect of the docker experience: the passive caching.

Smooth Transitions and Cliffs

This is where it gets a bit more interesting. Rather than provide a configuration option for the registry, docker requires the registry as a parameter to the image reference. To use our registry.acme-rockets.io reference, we can’t simply say:

docker config registry=registry.acme-rockets.io

We must embed the registry as part of the image reference:

docker run registry.acme-rockets.io/nginx:1.14.2

At first, this doesn’t appear to be a huge issue, as it does follow the progressive disclosure model, where the non-default simply requires additional information. Most container clients know to substitute docker.io when no registry name is provided.
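The defaulting described above can be sketched in a few lines of shell. This is an approximation of the commonly documented rules (a first path component that looks like a host is the registry, otherwise docker.io; single-name Docker Hub images live under library/; :latest is implied), not Docker’s actual parser. normalize_ref is a hypothetical helper name.

```shell
# Sketch: expand a short image name into a fully qualified reference.
# normalize_ref is a hypothetical helper, not a docker command.
normalize_ref() (
  ref=$1
  first=${ref%%/*}
  case $ref in
    */*)
      case $first in
        # A first component containing '.' or ':' (or "localhost") is treated as a registry host
        *.*|*:*|localhost) registry=$first; path=${ref#*/} ;;
        # Otherwise it is a Docker Hub namespace such as jetstack/
        *) registry=docker.io; path=$ref ;;
      esac ;;
    # A bare name is an official Docker Hub image under library/
    *) registry=docker.io; path=library/$ref ;;
  esac
  # Add the implied tag when the last component has neither a tag nor a digest
  case ${path##*/} in
    *:*|*@*) ;;
    *) path=${path}:latest ;;
  esac
  printf '%s/%s\n' "$registry" "$path"
)

normalize_ref hello-world    # prints docker.io/library/hello-world:latest
normalize_ref nginx:1.14.2   # prints docker.io/library/nginx:1.14.2
```

The point is that the registry is always there; it is just hidden when it happens to be Docker Hub.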

Complexity Manifesto

Where this gets complex is balancing the use of public content in our private environments. While it seems pretty trivial to simply change the registry reference, even just prepending registry.acme-rockets.io/ to nginx:1.14.2, the problem manifests itself when we start consuming public deployment artifacts.

Even our nginx sample above would default to pulling from Docker Hub. While fine for a demo, customers are looking to copy/paste, or simply consume as-is, without having to make changes.

If you have a large kube-deploy.yaml file, it may reference images from Docker Hub, gcr, quay, or other public registries like the recently announced GitHub Container Registry.

Are you expected to copy/replace all the references? How many people do? I can tell you from our Azure experiences that most don’t, until they get burned.
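Before replacing anything, it helps to see which references a manifest actually contains. The following is a rough, line-based sketch (it is not a YAML parser, and it does not handle trailing comments or templated values); list_images is a hypothetical helper name.

```shell
# Sketch: print the distinct values of every "image:" key in a manifest.
# list_images is a hypothetical helper; it matches lines, it does not parse YAML.
list_images() {
  # Strip leading indentation, an optional list dash, and the "image:" key,
  # then drop surrounding quotes and de-duplicate.
  sed -n 's/^[[:space:]-]*image:[[:space:]]*//p' "$1" | tr -d "\"'" | sort -u
}

# Usage, assuming a manifest named kube-deploy.yaml in the current directory:
#   list_images kube-deploy.yaml
```

Any entry in the output without a registry host is an implicit Docker Hub reference.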

The Pros and Cons of Helm Chart Usage

As we were evaluating the impact of the Docker Terms of Service changes, and helping customers understand the value of keeping content they depend upon under their own control, we focused on 3 aspects:

Are all the 1st Party Azure Services pulling from Azure infrastructure?
Are customers pulling from Docker Hub, or other public registries that may be impacted by outages from DNS, CDN or other frailties of the internet?
Does the Azure documentation promote customers importing and maintaining public content into their privately controlled registry?

We quickly realized it wasn’t that simple, not that those three questions were simple to begin with.

Customers highly benefit from using Helm Charts to deploy “applications”, “application components” or “infrastructure services”. For instance, customers that use cert-manager may follow the helm instructions to:

helm install cert-manager jetstack/cert-manager

Helm got the private repository model correct. To get the jetstack version of cert-manager, you’re guided to configure the jetstack repository:

helm repo add jetstack https://charts.jetstack.io

Where the problem manifests is how helm charts reference images.

Looking at deployment.yaml, we see the containers reference:

containers:
  - name: {{ .Chart.Name }}
    image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"

The values.yaml provides the actual references:

image:
  repository: quay.io/jetstack/cert-manager-controller
  tag: v0.6.2

While values can be set at install time, most users don’t take on this extra complexity to reference their managed copy of the cert-manager-controller image:

helm upgrade cert-manager ./charts/cert-manager \
  --install \
  --set image.repository=registry.acme-rockets.io/jetstack/cert-manager-controller

Further, many charts don’t expose the image reference as a value, nor do users easily know all the values they must override in the primary chart or its sub-charts.

What To Do?

As with most things, there’s what we can do today, and what we can do when we have more time.

What we can do today:

Docker Run: With Env Vars

By simply extracting the registry name, even for a Docker Hub reference, we can see a simple change can unlock greater flexibility:

REGISTRY_NAME=
docker run ${REGISTRY_NAME}hello-world

To run hello-world from our acme-rockets registry:

REGISTRY_NAME=registry.acme-rockets.io/
docker run ${REGISTRY_NAME}hello-world

Note: the trailing / is part of the REGISTRY_NAME value. If the / were written into the image reference instead, a blank registry would cause a failure when pulling from Docker Hub.

# blank for docker hub
REGISTRY_NAME=
docker run ${REGISTRY_NAME}/hello-world
# becomes
docker run /hello-world
docker: invalid reference format.
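One way to sidestep the trailing-slash pitfall is to let the shell add the / only when a registry is set. This is a minimal sketch using standard parameter expansion; image_ref is a hypothetical helper name, and it accepts the registry with or without a trailing slash.

```shell
# Sketch: build an image reference, adding "/" only when a registry is given.
# image_ref is a hypothetical helper, not a docker command.
image_ref() (
  # $1: registry host, may be empty, trailing slash optional
  # $2: repository, optionally with a tag
  registry=${1%/}
  # ${registry:+...} expands to nothing when registry is empty
  printf '%s%s\n' "${registry:+$registry/}" "$2"
)

image_ref "" hello-world                          # prints hello-world
image_ref registry.acme-rockets.io/ hello-world   # prints registry.acme-rockets.io/hello-world

# Possible usage with the variable from the examples above:
#   docker run "$(image_ref "$REGISTRY_NAME" hello-world)"
```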

Dockerfiles: Abstract the Registry Reference with Build Args

Docker added build-args for just this purpose:

Dockerfile:
ARG REGISTRY_NAME=mcr.microsoft.com/
FROM ${REGISTRY_NAME}dotnet/core/runtime

Using the above, you can use commands like az acr import to import the mcr.microsoft.com/dotnet/core/runtime image to registry.acme-rockets.io/dotnet/core/runtime

Using build-args, you can then run the following command without having to change the dockerfile:

# Set an env var for simplicity
REGISTRY_NAME=registry.acme-rockets.io/
# Build the new hello-world image, in the same registry as our imported base image:
docker build \
  -t ${REGISTRY_NAME}hello-world:1 \
  --build-arg REGISTRY_NAME=${REGISTRY_NAME} \
  .

Helm Chart Values – Registry by Convention

While helm repo supports customizable repositories, there’s no default mechanism for abstracting the registry.

Helm Charts are essentially Go templates. So, we just need a standard parameter model for the registry.

An alternative, which is arguably just another level of abstraction, is to standardize on abstracting the registry name by convention, one that could transcend all sub-charts as well.

deployment.yaml:

containers:
  - name: {{ .Chart.Name }}
    image: "{{ .Values.image.registry }}{{ .Values.image.repository }}:{{ .Values.image.tag }}"

values.yaml:

image:
  registry: quay.io/
  repository: jetstack/cert-manager-controller
  tag: v0.6.2

The simplicity here is more akin to the progressive disclosure model. In most cases, the user would simply provide the registry name. If they import source images to the same path within their private registry, no other parameter changes are required:

helm upgrade cert-manager ./charts/cert-manager \
  --install \
  --set image.registry=registry.acme-rockets.io/

When charts reference multiple images, or even sub-charts, setting this one registry parameter gives us an abstraction similar to package managers: we just need to provide the unique registry name. Everything else remains the same.

The Long Run

To be fair, all the above items are things that we can do today, but what we can do today is not what we’d like to do tomorrow.

The question is whether it’s time to lift the registry name as a configuration.

A collection of 1, or More

The challenge is we’ll likely have multiple registries we need to work with. We may want to build from a shared base-artifact registry, and build to our team’s dev registry. In this case, changing the single default won’t provide the flexibility needed.

However, we could lift the named reference pattern, along with a default registry.

Wild hair idea warning: these ideas are 5 mins worth of thought for how to specify multiple registries. I’m sure we can do better.

We’ll build a hello-world image, with a dockerfile that uses the default registry:

Dockerfile:
FROM dotnet/core/runtime
...

We’ll configure the default registry, along with a target registry:

docker configure --default-registry mcr.microsoft.com

docker configure --add-registry target base-images.acme-rockets.io

We now build our image, with a dockerfile we won’t change, using the target registry moniker followed by :// to tag the new image:

docker build -t target://hello-world:42 .
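To make the idea concrete, here is a small sketch of how a client might resolve such references. Everything in it is hypothetical: docker has no such configuration today, and DEFAULT_REGISTRY, registry_for and resolve_ref are names invented for illustration, with the registry mapping hard-coded where a real client would read configuration.

```shell
# Sketch of the proposed lookup; none of this exists in docker today.
DEFAULT_REGISTRY=mcr.microsoft.com

# Hypothetical stand-in for "docker configure --add-registry target ..."
registry_for() {
  case $1 in
    target) echo base-images.acme-rockets.io ;;
    *) return 1 ;;
  esac
}

# Resolve "name://repo:tag" through the named registry,
# and a plain "repo:tag" through the default registry.
resolve_ref() (
  ref=$1
  case $ref in
    *://*)
      name=${ref%%://*}
      registry=$(registry_for "$name") || {
        echo "unknown registry name: $name" >&2
        exit 1
      }
      printf '%s/%s\n' "$registry" "${ref#*://}" ;;
    *)
      printf '%s/%s\n' "$DEFAULT_REGISTRY" "$ref" ;;
  esac
)

resolve_ref dotnet/core/runtime       # prints mcr.microsoft.com/dotnet/core/runtime
resolve_ref target://hello-world:42   # prints base-images.acme-rockets.io/hello-world:42
```

Failing on an unknown name, rather than falling back to the default registry, keeps the mapping deterministic.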

Change Takes Time, and a First Step

A change as impactful as making registries configurable across a growing suite of tools, from container builders to container hosts, will take a while. We’ll need to get some consensus on the patterns, including support for multiple registries. We’ll need the various projects to agree and commit to doing the actual coding. And, we’ll need to wait for all of this to be deployed across the various clouds and end user environments.

Every long journey begins with the first step. If we know it will take 2 years to complete, every month we delay starting is another month beyond those 2 years before we’ll have a viable solution.

I’m hoping the community would agree it’s time to take these first steps toward acknowledging that customers work with many registries, both public and private.

Steve

stevelas on February 1, 2021 at 10:28 pm

Here are some updates to this thread: https://github.com/notaryproject/nv2/discussions/31
There are some great lessons to be learned from the RedHat Short name designs. My hope is the ability to deterministically map a registry/namespace to a referenced name will solve the environmental configuration, while not exposing any security holes.
– https://www.redhat.com/sysadmin/container-image-short-names
– https://www.redhat.com/en/blog/be-careful-when-pulling-images-short-name
