When referring to an image, an artifact, a registry, a tag, what exactly is the reference? Do we mean:
For the sake of clear communication, there are several elements that make up an artifact or image name, and they are fairly important when we think about artifacts moving from one registry to another. See Choosing a Docker Container Registry for more context.
Should we really refer to an image as tied to a specific location? As humans, would we really say the fully qualified name, or would we use shorthand references? And what terminology would, or should, we use? There are several terms we use interchangeably, whose meanings I’ll call out:
- Image / Artifact
- Unique Registry
- Repo (repository)
Images & Artifacts
The first thing you may notice is that I reference Images and Artifacts interchangeably. It turns out the infrastructure we use to store container images in registries can be used to store any type of artifact, from a simple text file, a Helm chart, or a Docker container, to a VM image. Docker and OCI container images are just one of many types we can store, and therefore I refer to the things in a registry as artifacts.
When we refer to an artifact as hello-world, are we referring to where we got it, or are we simply using shorthand to refer to the thing, regardless of where it was retrieved? Are we assuming a version, or did we want to qualify the version?
In the container space, we have a shorthand of :latest to refer to a versionless reference. If not otherwise stated, hello-world equates to hello-world:latest. Whether :latest is the devil’s spittle is beyond the scope of this article. Suffice to say, “friends don’t let friends build against :latest”. That said, docker run something:latest for samples or demo purposes is perfectly reasonable. Just don’t ever deploy using updateable tags. See Docker Tagging: Best practices for tagging and versioning docker images for more background.
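As a small illustration of the versionless shorthand, here is a minimal Python sketch that expands a reference without a tag to :latest. The function name with_default_tag is made up for this example; it is not part of any Docker tooling, and it only handles the simple name and name:tag shapes discussed here.

```python
def with_default_tag(reference: str) -> str:
    """Append :latest when the reference carries no tag (illustrative helper)."""
    # Only inspect the last path segment, so a registry port such as
    # localhost:5000/hello-world is not mistaken for a tag.
    last_segment = reference.rsplit("/", 1)[-1]
    if ":" in last_segment:
        return reference
    return f"{reference}:latest"


if __name__ == "__main__":
    for ref in ("hello-world", "hello-world:1.0", "localhost:5000/hello-world"):
        print(ref, "->", with_default_tag(ref))
```

A reference that already names a tag passes through untouched, which is the behavior you want when you deploy with a stable, versioned tag.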
For the location, when we refer to product installers like app.msi, or a Maven package, we may have assumptions about where we retrieved it. But do we ever say the fully qualified reference? Or is it assumed that npm packages are retrieved from the public npm registry, or your company’s private instance?
For the point of clarity, we’ll use artifact, or the specific artifact to mean the versioned, or default version
|spoken reference||inferred meaning|
|The web image, with the build id of |
When people refer to a registry, they usually mean a specific instance within a multi-tenant registry. Docker Hub is a multi-tenant registry, where there’s a set of official / public images, which are either promoted by Docker, or simply made public by a specific org, such as the nginx sample image. The Docker official images are simply images that come from the Docker org.
A unique registry can be referenced in one of two ways, by namespace, or by domain.
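To make the two shapes concrete, here is a minimal Python sketch. The host registry.example.com, the base domain, and the tenant name contoso are all made up for illustration; they don't describe any particular registry operator.

```python
def by_namespace(host: str, tenant: str, repo: str, tag: str) -> str:
    """Unique by namespace: one shared domain, tenant is the root namespace."""
    return f"{host}/{tenant}/{repo}:{tag}"


def by_domain(tenant: str, base_domain: str, repo: str, tag: str) -> str:
    """Unique by domain: each tenant gets its own sub domain (FQDN)."""
    return f"{tenant}.{base_domain}/{repo}:{tag}"


if __name__ == "__main__":
    # Hypothetical tenant "contoso" storing the same artifact both ways.
    print(by_namespace("registry.example.com", "contoso", "web", "v1.0"))
    print(by_domain("contoso", "registry.example.com", "web", "v1.0"))
```

In the first form the tenant lives in the path; in the second it lives in the host name, which is what lets network rules be applied per tenant.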
Unique by Namespace
Unique by Domain
Namespace or Domain Pros & Cons
There are pros and cons to each. They balance the complexity in hosting a multi-tenant registry, and the capabilities offered.
Hosting a unique by namespace registry is easier for the registry operator, as they host a single domain, and use the namespace to isolate each customer.
Hosting a unique by domain registry is harder, as sub domains must be managed by the registry operator. However, running a production registry almost always comes with networking requirements, such as placing the registry in a VNet, or establishing firewall rules. Using an FQDN, two fiercely competitive customers can safely share the same multi-tenant registry, as each can establish VNets and/or firewall rules that isolate their network traffic.
There are other benefits, such as which region a registry may get resolved to. Customer A may only operate in California, so their registry FQDN would resolve to a westus location. Customer B may operate in the US and Europe, so requests to their FQDN can be routed to the closest host. Registry operators can also implement custom domains over a registry that’s unique by domain.
Regardless of the approach, the unique registry can be either
We often refer to a registry repo as shorthand for the repository. A repo/repository refers to the unique location, within a registry. When referring to a repo, the registry is assumed, based on the context.
In the above case, repo is the repository. Notice, there’s a set of namespaces to the left, and the only thing to the right is the tag.
What Should a Repo Contain
A repository can contain a collection of things, and this has been the source of much debate. While you could put everything a specific app or service depends upon in a single repo, this can quickly get daunting.
When you look at a tag listing of the repo, what do you expect to see?
As a best practice, we’ve found a repository should contain a collection of like things, only differentiated by version, and in some cases, platform and architecture. It should not contain the supporting technologies. See namespaces for separation.
Namespaces are simply the path between the unique registry and the repo.
Let’s say the above fx is a compiled language requiring an SDK to compile the binaries, which may require a runtime, or may generate native binaries. The above fx could be broken up into two namespaces.
Whether the runtime has a namespace, or you have a default namespace isn’t as important as keeping the repos separated for clarity.
Whether it’s tag listing, mirroring, purge policies, or choosing which base image updates you wish to track, having separate namespaces will dramatically simplify your life.
Different registry operators support different levels of depth. Docker Hub supports two levels: org/repo. Azure Container Registry supports as many as you wish, up to the Docker client’s max length of 256 characters. This includes the registry name, up to the repo. The Microsoft Container Registry, which is where all supported Microsoft products are sourced from, uses nested namespaces to represent product groupings.
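If you move artifacts between registries, it can help to check how deep a repo path is before you push it somewhere with a shallower limit. The following Python sketch counts path levels; the helper names and the max_levels parameter are my own, and the limit of 2 is simply my reading of the two-level org/repo shape mentioned above.

```python
def path_levels(repo_path: str) -> int:
    """Count the slash-separated levels in a repo path (registry excluded)."""
    return len([segment for segment in repo_path.split("/") if segment])


def fits_max_levels(repo_path: str, max_levels: int) -> bool:
    """True when the path is no deeper than the registry allows."""
    return path_levels(repo_path) <= max_levels


if __name__ == "__main__":
    for path in ("org/repo", "dotnet/core/sdk"):
        print(path, path_levels(path), fits_max_levels(path, max_levels=2))
```

A path such as dotnet/core/sdk has three levels, so it would need to be flattened before landing in a registry limited to two.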
The microsoft/dotnet-core namespace represents the following nested repos:
- dotnet/core/sdk: .NET Core SDK
- dotnet/core/aspnet: ASP.NET Core Runtime
- dotnet/core/runtime: .NET Core Runtime
- dotnet/core/runtime-deps: .NET Core Runtime Dependencies
- dotnet/core/samples: .NET Core Samples
The Windows namespace represents the following nested repos:
- windows/servercore: Windows Server Core base OS image
- windows/nanoserver: Nano Server base OS image
- windows/iotcore: Windows IoT Core base OS image
- windows: Windows base OS image
Within your org, you may use nested namespaces to represent different development teams
You may also use namespaces to separate your environments
When combining multiple teams into a single registry (a single registry as noted above, not different registries within a multi-tenant registry), you’ll want to make sure your registry supports repository scoped permissions. By popular demand, Azure Container Registry now supports repository scoped permissions.
Namespace & Repo overlap
You’ll notice the namespace and repo have some overlap. The namespace is the path between the registry and the artifact: in org.example.com/dev/blue-team/web:v1.0, the namespace is dev/blue-team. The repo name, dev/blue-team/web, includes the namespace. This is important, as a repo is only unique based on its namespace path. An artifact may be stored in a registry multiple times, for a dev team and for several environments. It may get moved to an archive repo when it’s no longer in production.
Potential life cycle of the web:v1.0 artifact. In the following example, the same web:v1.0 artifact is copied across several repos, using different namespaces, as it makes its way through production, to an archived state.
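One way to picture that life cycle is the short Python sketch below. Only dev/blue-team and org.example.com come from the reference used earlier; the staging, prod, and archive namespace names are assumptions I picked for illustration.

```python
REGISTRY = "org.example.com"
ARTIFACT = "web:v1.0"

# Hypothetical namespaces the artifact passes through, in order.
STAGES = ["dev/blue-team", "staging", "prod", "archive"]


def lifecycle(registry: str, artifact: str, stages: list[str]) -> list[str]:
    """Return the fully qualified reference of the artifact at each stage."""
    return [f"{registry}/{namespace}/{artifact}" for namespace in stages]


if __name__ == "__main__":
    for reference in lifecycle(REGISTRY, ARTIFACT, STAGES):
        print(reference)
```

The artifact name and tag never change; only the namespace in front of it does.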
Isolation by Namespace or Registry
This is a larger conversation, relating to how you segment your registry. Some features like VNet and Firewall rules, or geo-replication require different registries. I’ll save this conversation for a different post, specific to ACR.
Fully Qualified Artifact Reference
This encompasses all the elements, from the unique registry, to the namespace, the repo, and the tag. This is what you must submit to the artifact host, like Kubernetes. Without the fully qualified reference, the host wouldn’t know where to find the artifact.
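To show how the elements fit together, here is a minimal Python sketch that splits a fully qualified reference using the terminology from this article. It is not the parser any container tool uses; it assumes the first path segment is always the registry, and the names ArtifactReference and parse_reference are invented for the example.

```python
from typing import NamedTuple


class ArtifactReference(NamedTuple):
    registry: str   # unique registry, e.g. org.example.com
    namespace: str  # path between the registry and the leaf name
    repo: str       # namespace plus the leaf artifact name
    tag: str        # version, defaulting to "latest"


def parse_reference(reference: str) -> ArtifactReference:
    """Split a fully qualified reference (assumes the registry comes first)."""
    registry, _, path = reference.partition("/")
    if not path:
        raise ValueError(f"not a fully qualified reference: {reference!r}")
    # A tag, if present, follows the last colon of the last path segment.
    if ":" in path.rsplit("/", 1)[-1]:
        path, _, tag = path.rpartition(":")
    else:
        tag = "latest"
    namespace, _, _leaf = path.rpartition("/")
    return ArtifactReference(registry, namespace, path, tag)


if __name__ == "__main__":
    print(parse_reference("org.example.com/dev/blue-team/web:v1.0"))
```

Note that repo deliberately includes the namespace, matching the point that a repo is only unique based on its namespace path.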
Should Artifact Reference Include the Registry
This is a great question, which would require a time-machine, or a massive breaking change. When you reference other package managers, like npm or NuGet, the registry is a configuration option. This is not the case with Docker.
You can use environment variables or build-args to separate the registry, even the namespace, from the repo. However, this is also a topic for another article.
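As a rough idea of the environment variable approach, the Python sketch below assembles a fully qualified reference from configuration. The variable names REGISTRY and NAMESPACE and the fallback org.example.com are assumptions for this example, not a convention defined by Docker.

```python
import os
from typing import Mapping


def qualify(repo_and_tag: str, env: Mapping[str, str] = os.environ) -> str:
    """Prefix a short reference with the configured registry and namespace."""
    # REGISTRY and NAMESPACE are hypothetical variable names.
    registry = env.get("REGISTRY", "org.example.com")
    namespace = env.get("NAMESPACE", "").strip("/")
    parts = [registry, namespace, repo_and_tag]
    # Skip empty parts so a missing namespace doesn't leave a double slash.
    return "/".join(part for part in parts if part)


if __name__ == "__main__":
    print(qualify("web:v1.0"))
```

The short reference, web:v1.0, stays the same in source control, while the location is decided by whoever sets the environment.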
Summing It Up
- Image: A specific type of artifact.
- Artifact: A specific, versioned reference to a thing, regardless of which registry it may be stored in. An artifact without a :tag is assumed to be :latest (for example, hello-world). A versioned artifact includes the :tag reference.
- Repo: A collection of like things, differentiated by version, and optionally platform/architecture. Includes the namespace and the leaf node artifact name.
- Namespace: The path between the unique registry and the repo. Depending on the registry, it may be nested, or single depth.
- Registry: A unique registry within a multi-tenant registry. It could be unique by domain or a root namespace.
- Fully Qualified Artifact: The combination of the unique registry, namespace and versioned artifact.
Sailing is known for having a unique language. Instead of left and right, we say port and starboard, and bow and stern refer to the front and back. The front of the sail is called the luff, while the back is called the leech, and the bottom is appropriately called the foot. However, these names exist to ensure clarity. Regardless of which way you’re facing (front or back), you should know port and starboard.
Hopefully, with the above terminology, we can bring clarity to the references we make to our various artifacts, and avoid confusion.