Docker's V1 and V2 Image Specification
In 2016, Docker officially updated its image specification from V1 to V2, adopting a more sophisticated scheme that is in line with the OCI Container Image Specification.
There are only a few minor differences between Docker's image specification V2 and the OCI image specification (see Compatibility Matrix). Here we will discuss some major changes from V1 to V2, and why Docker has moved towards these changes.
Docker images contain the underlying changes in the root filesystem and the
execution parameters of a container. When we write a Dockerfile, the FROM clause
brings in a base image from an external registry, and each subsequent
instruction adds a new layer or attribute to the base image. Eventually,
docker build gives us an image that we can deploy, share or run as a container.
It is important to distinguish an image from a container. When a container is run, the image (which is a stack of tarballs with manifest JSON files) is processed and transformed into the underlying filesystem, mounts, environment variables and so on. The OCI Container Image Specification and the OCI Container Runtime Specification give a good modern understanding of what an image is.
Content Addressability
In Docker Image Specification V1.0, each image layer has a randomly generated
256-bit ID that uniquely references the layer. The ID space is large enough to
ensure the uniqueness of all layers, but it does not guarantee that the image
you pull is always the image you expect. Imagine that you have an image
pychat:1.0 that uses python:3 as its base image. You upload your image to an
image registry, and over the years someone comes along and swaps out some files
in python:3. All of a sudden your code breaks, and you might have to dig through
a deep chain of dependencies to figure out why your container doesn't work
anymore. This is why having a unique ID is not enough: we want a way to
efficiently address the content of a container.
Content addressing is achieved by using a collision-resistant hash function. In Docker Image Specification V2, each image layer is referenced by a unique content address computed from the canonical representation of the layer changeset. We can look at it like this:
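"""
Note that this is an over-simplified version of how
image IDs are generated.
"""
import hashlib

# Placeholder changesets; a real changeset is a tar archive of file changes.
layers = [b"changeset of layer 1", b"changeset of layer 2"]

layer_ids = []
for layer_changeset in layers:
    # The layer ID is the hash of the changeset's bytes.
    layer_id = hashlib.sha256(layer_changeset).hexdigest()
    layer_ids.append(layer_id)

# The image ID is the hash of the manifest, which lists the layer IDs.
image_manifest = "\n".join(layer_ids).encode()
image_id = hashlib.sha256(image_manifest).hexdigest()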
Due to the nature of cryptographic hash functions, the hashes can be used to determine whether two layers contain exactly the same content. We can use this property to ensure that the image we download is the same image we used before, i.e. tell Docker to fetch an image by the content address of a previously used and tested image. This ensures that the image we get has not changed since, as any change to the image results in an entirely different content address.
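The tamper-detection property can be sketched in a few lines of Python. This is a minimal illustration, assuming SHA-256 as the hash function (the algorithm named in the sha256: digests shown later in this post). The helpers content_address and verify are hypothetical names, not Docker APIs, and the byte strings merely stand in for real layer tarballs.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Return a digest string in the 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected: str) -> bool:
    """Check that downloaded bytes match the content address we pinned."""
    return content_address(data) == expected


# Hypothetical layer content, standing in for a real layer tarball.
original = b"python interpreter files"
pinned = content_address(original)

# A single changed byte yields a different address, so the swap is detected.
tampered = b"python interpreter filez"
print(verify(original, pinned))
print(verify(tampered, pinned))
```

The first check succeeds and the second fails, which is exactly the signal we want when a base image has been modified behind our back.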
A manifest is a JSON file that contains all necessary configuration for an
image. In Docker image spec V2, a manifest contains the content addresses of
all layers in an image. Since it contains the hashes of all layers, we can
simply download this small JSON file and take its hash to verify that the image
has all the layers we want. This saves us the time of downloading and checking
the content of the entire image.
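As a rough sketch of why this helps: once the manifest itself has been checked against a pinned digest, the layer digests it lists can be trusted, and any layer already stored locally under that digest does not need to be downloaded again. The function name layers_to_fetch, the local digest set, and the shortened digest values below are made up for illustration; only the layers and digest field names come from the V2 manifest format.

```python
import hashlib
import json


def layers_to_fetch(manifest_bytes: bytes, pinned_digest: str,
                    local_digests: set) -> list:
    """Verify the manifest, then list the layers we still need to download."""
    actual = "sha256:" + hashlib.sha256(manifest_bytes).hexdigest()
    if actual != pinned_digest:
        raise ValueError("manifest does not match the pinned digest")
    manifest = json.loads(manifest_bytes)
    return [layer["digest"] for layer in manifest["layers"]
            if layer["digest"] not in local_digests]


# Hypothetical manifest with shortened, made-up digests.
manifest_bytes = json.dumps({
    "schemaVersion": 2,
    "layers": [{"digest": "sha256:aaa"}, {"digest": "sha256:bbb"}],
}).encode()
pinned = "sha256:" + hashlib.sha256(manifest_bytes).hexdigest()

# One layer is already cached locally, so only the other one is fetched.
missing = layers_to_fetch(manifest_bytes, pinned, {"sha256:aaa"})
print(missing)
```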
An example V2 image manifest:
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  ...
  "layers": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 32654,
      "digest": "sha256:e692418e4cbaf90ca...b51fab815ad7fc331f"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 16724,
      "digest": "sha256:3c3a4604a545cdc12...f4a9c1905b15da2eb4"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 73109,
      "digest": "sha256:ec4b8955958665577...7f12184802ad867736"
    }
  ]
}
Example Dockerfile utilizing the image described in the above manifest:
FROM python@sha256:b22de77...f118cb
...
CMD ["python3", "pychat.py"]
This way pychat:1.0 is guaranteed to pull the same base image every time it is
built, and the container will remain deterministic until our selected hash
algorithm is found to be broken, which hopefully will not happen in the near
future.
One thing to look out for when pulling images by content address: there are
some inconsistencies between the content of the image manifest on the local
system and the content of the manifest that exists in the registry, i.e. the
image ID displayed by docker image list --no-trunc is different from the
address you use to fetch an image (see the relevant blog post and GitHub issue).
Image Specific Properties
Another important change is the location of image-specific properties. In V1,
each layer contains a JSON file that specifies what the image should do up to
this layer. The file contains information such as entrypoint, env, cmd, memory
and so on. However, these attributes should not be layer-specific. Each layer
essentially represents a change in the image's filesystem at a point in time,
so associating it with runtime properties such as environment variables,
entrypoints and CPU usage is completely unnecessary. Runtime properties should
be related to an image as a whole, not to a single layer in the filesystem.
We can take a look at Docker's V1 image format by running
docker save <image name> | tar -x in an empty directory. docker save and
docker load are the two Docker commands that still use the V1 standard for
legacy reasons.
$ docker pull ubuntu@sha256:d2518289e66fd3892c2dae5003218117abeeed2edbb470cba544aef480fb6b3a
$ docker save ubuntu@sha256:d2518289e66fd3892c2dae5003218117abeeed2edbb470cba544aef480fb6b3a | tar -x
$ tree
.
├── 08ca6384a97957eac5a5a69cdc799434739655c88e69efb23d2bb963110dbf48
│ ├── json
│ ├── layer.tar
│ └── VERSION
├── 0fa211e5edebeb29d3e29cc2c8c87e9a6a8306901816c19b7f6fb6a7392c3cef
│ ├── json
│ ├── layer.tar
│ └── VERSION
├── 452a96d81c30a1e426bc250428263ac9ca3f47c9bf086f876d11cb39cf57aeec.json
├── 614c02cb92ee20d3cd51770f07d67503f87a75602ddf032a0a6163527fcf97e0
│ ├── json
│ ├── layer.tar
│ └── VERSION
├── cc8487ed6373e8b38c60ff8fc5bdfdd9576aa49226a6e4dcac522f61f5f19d31
│ ├── json
│ ├── layer.tar
│ └── VERSION
├── daf8616e33b20539309a114814ba9864367630ad8da63d4e96bea40dd22841ba
│ ├── json
│ ├── layer.tar
│ └── VERSION
└── manifest.json
We can see that the image has 5 underlying layers (represented by the
sub-directories) and a top-level layer (represented by the root directory).
The manifest.json file tells us the ordering of the layers and which of them is
the top-level directory. We can also see that there is a file named json in
each sub-directory; this is the configuration of each layer in V1 format, as
described earlier. Read the json files and pay special attention to the
container_config attribute set: these attributes should be associated with the
image as a whole, yet they exist at each layer instead. Having a hierarchy of
inheritance on these runtime attributes, layer by layer, often leads to
unnecessary complexity.
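To inspect such a directory programmatically, something like the following could be used. It is only a sketch: it assumes the extracted docker save output has a manifest.json holding a list of entries with a Layers key (paths of the form <id>/layer.tar), and a json file next to each layer.tar, as in the tree above. The helper describe_saved_image is a hypothetical name, and not every layer's json file is guaranteed to carry a container_config.

```python
import json
from pathlib import Path


def describe_saved_image(root: str) -> list:
    """Return (layer directory, has container_config) pairs in manifest order."""
    root_path = Path(root)
    manifest = json.loads((root_path / "manifest.json").read_text())
    result = []
    # manifest.json is a list with one entry per saved image; use the first.
    for layer_path in manifest[0]["Layers"]:
        layer_dir = Path(layer_path).parent  # "<id>/layer.tar" -> "<id>"
        layer_json = json.loads((root_path / layer_dir / "json").read_text())
        result.append((str(layer_dir), "container_config" in layer_json))
    return result
```

Calling describe_saved_image(".") from the directory where the image was extracted lists the layers in order and shows which of them carry their own runtime configuration.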
In V2, the layer JSON is discarded; it was decided that the layer changeset
itself is enough to represent a layer in the image. Layer hierarchies are
specified in the layers attribute of the root-level manifest, which references
the layers by their content addresses. The attributes of container_config are
now stored in a separate file, in line with the OCI Runtime Specification. The
new specification makes sure that the runtime configuration is associated with
the image as a whole, and that inheritance happens at the image level rather
than the layer level. Note that the content address of this runtime
configuration is also included in the V2 manifest.
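The relationship between the manifest and the image-wide configuration can be sketched as follows. This is an illustration only: the configuration contents are invented, descriptor and build_manifest are hypothetical helpers, and the JSON serialization here is not claimed to match the canonical form Docker uses. The media types are the ones used by Docker's V2 manifests.

```python
import hashlib
import json


def descriptor(media_type: str, blob: bytes) -> dict:
    """Describe a blob by media type, size and content address."""
    return {
        "mediaType": media_type,
        "size": len(blob),
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
    }


def build_manifest(config_blob: bytes, layer_blobs: list) -> dict:
    """Assemble a V2-style manifest referencing one config and N layers."""
    return {
        "schemaVersion": 2,
        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
        "config": descriptor(
            "application/vnd.docker.container.image.v1+json", config_blob),
        "layers": [
            descriptor("application/vnd.docker.image.rootfs.diff.tar.gzip", b)
            for b in layer_blobs
        ],
    }


# Runtime properties live in one image-wide config, not in each layer.
config_blob = json.dumps(
    {"config": {"Env": ["LANG=C"], "Cmd": ["python3"]}}).encode()
manifest = build_manifest(config_blob, [b"layer one", b"layer two"])
print(json.dumps(manifest, indent=2))
```

Changing a runtime property alters only the config digest; the layer descriptors stay the same, because the filesystem changesets themselves are untouched.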
Multi-Architecture Images
V2 also introduces a new configuration component called a manifest list, or an image index in OCI's terminology. A manifest list is essentially a list of platform-specific image manifests with similar contents, for example an Ubuntu:16.04 container for a macOS host and an Ubuntu:16.04 container for a Linux host. This allows multi-architecture containers to be coupled together and treated as a whole, which is useful for packaging a set of similar images and distributing them to an image registry.
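Picking the right image out of a manifest list can be sketched like this. The structure mirrors the manifests and platform fields of a manifest list; the digests are placeholders, the two platforms are chosen arbitrarily, and select_manifest is a hypothetical helper rather than part of any Docker client.

```python
from typing import Optional

# Hypothetical manifest list with placeholder digests.
manifest_list = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
    "manifests": [
        {"digest": "sha256:amd64-placeholder",
         "platform": {"architecture": "amd64", "os": "linux"}},
        {"digest": "sha256:arm64-placeholder",
         "platform": {"architecture": "arm64", "os": "linux"}},
    ],
}


def select_manifest(index: dict, os_name: str, arch: str) -> Optional[str]:
    """Return the digest of the manifest matching the host platform, if any."""
    for entry in index["manifests"]:
        platform = entry["platform"]
        if platform["os"] == os_name and platform["architecture"] == arch:
            return entry["digest"]
    return None


print(select_manifest(manifest_list, "linux", "arm64"))
```

The digest returned points at an ordinary image manifest, which is then fetched and verified like any other.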
Conclusion
Here's a summary of the differences mentioned in this article:
- Image Spec V1
  - Randomly generated image ID
  - Image-specific properties are defined at layer level
  - No multi-architecture support
- Image Spec V2
  - Image IDs are content addresses
  - Image-specific properties are defined at image level
  - Multi-architecture support
There are other advantages, which we'll examine in later posts.