Images

What's a Docker image?

An image is an executable package that includes everything needed to run an application: the code, a runtime, libraries, environment variables, and configuration files.

A container is a runtime instance of an image.

Container Images Are Big

Container images can be pretty big (though some are really small: Alpine Linux is about 2.5MB). Ubuntu 16.04 is about 27MB, and the Anaconda Python distribution is 800MB to 1.5GB.

Every container you start from an image starts out with the same blank slate, as if it made a copy of the image just for that container to use.

But for big container images, like that 800MB Anaconda image, making a copy would be both a waste of disk space and pretty slow.

So Docker doesn’t make copies; instead, it uses a layering technique called overlay.

How overlays work

Overlay filesystems, also known as “union filesystems” or “union mounts”, let you mount a filesystem using 2 directories: a “lower” directory and an “upper” directory.

Basically:

  • The lower directory of the filesystem is read-only

  • The upper directory of the filesystem can be both readable and writable

When a process reads a file, the overlayfs filesystem driver looks in the upper directory and reads the file from there if it’s present. Otherwise, it looks in the lower directory.

When a process writes a file, overlayfs will just write it to the upper directory.
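The lookup rule can be imitated with two ordinary directories. This is only a sketch of the read and write behaviour described above, not a real overlay mount (which needs root privileges and mount -t overlay); the helper names overlay_read and overlay_write and the file names are made up for the illustration.

```shell
# Two plain directories stand in for the overlay's layers.
lower=$(mktemp -d)   # plays the read-only layer
upper=$(mktemp -d)   # plays the writable layer

echo "from lower" > "$lower/a.txt"
echo "from lower" > "$lower/b.txt"
echo "from upper" > "$upper/b.txt"   # shadows lower/b.txt

# Read: prefer the upper directory, fall back to the lower one.
overlay_read() {
    if [ -e "$upper/$1" ]; then
        cat "$upper/$1"
    else
        cat "$lower/$1"
    fi
}

# Write: always goes to the upper directory; lower is never touched.
overlay_write() {
    printf '%s\n' "$2" > "$upper/$1"
}

overlay_read a.txt              # prints: from lower (only in lower)
overlay_read b.txt              # prints: from upper (upper wins)
overlay_write a.txt "changed"
overlay_read a.txt              # prints: changed (now served from upper)
cat "$lower/a.txt"              # prints: from lower (lower is unchanged)
```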

Understanding Layering with Docker Images

Images are made up of multiple read-only layers. Multiple containers are typically based on the same image. When an image is instantiated into a container, a writable layer is created on top; it is deleted when the container is removed.

Docker uses storage drivers to manage the content of the image layers and the writable container layer.

Each storage driver handles the implementation differently, but all drivers use stackable image layers and the copy-on-write (CoW) strategy.

The copy-on-write (CoW) strategy
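With copy-on-write, a file that lives in a read-only image layer is copied into the container's writable layer only the first time it is modified; until then, every container shares the single copy in the image. A minimal sketch of that idea with plain directories follows. It is not how a real storage driver is implemented, and cow_append and the file names are hypothetical.

```shell
image_layer=$(mktemp -d)       # shared, treated as read-only
container_layer=$(mktemp -d)   # per-container writable layer

echo "line 1" > "$image_layer/app.log"

# Append to a file, copying it up from the image layer on the first write.
cow_append() {
    file=$1
    text=$2
    if [ ! -e "$container_layer/$file" ] && [ -e "$image_layer/$file" ]; then
        # The "copy" in copy-on-write: happens once, just before the write.
        cp "$image_layer/$file" "$container_layer/$file"
    fi
    printf '%s\n' "$text" >> "$container_layer/$file"
}

ls "$container_layer"            # empty: nothing has been copied yet
cow_append app.log "line 2"
cat "$container_layer/app.log"   # the container sees both lines
cat "$image_layer/app.log"       # the image layer still has only line 1
```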

The storage location of Docker images and containers

A Docker container consists of network settings, volumes, and images. The location of Docker files depends on your operating system.

Here is an overview for the most commonly used operating systems:

  • Ubuntu: /var/lib/docker/

  • Fedora: /var/lib/docker/

  • Debian: /var/lib/docker/

  • Windows: C:\ProgramData\DockerDesktop

  • macOS: ~/Library/Containers/com.docker.docker/Data/vms/0/

To find the location on your system, run docker info | grep -i root.

Dockerfile

You might create your own images or you might only use those created by others and published in a registry.

To build your own image, you create a Dockerfile with a simple syntax for defining the steps needed to create the image and run it.

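Each instruction in a Dockerfile adds a layer to the image (instructions such as RUN, COPY, and ADD add filesystem content; the others only record metadata). When you change the Dockerfile and rebuild the image, only the changed layers and the layers after them are rebuilt; earlier layers are reused from the cache.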

This is part of what makes images so lightweight, small, and fast, when compared to other virtualization technologies. A Dockerfile is executed by the docker build command.

Let's take a look at a sample Dockerfile:
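A minimal sketch of what such a file might contain, written from the shell with a here-document. The base image, the app.sh script, the GREETING variable, and the /app path are all illustrative assumptions rather than values taken from this section.

```shell
# Create a tiny build context in a scratch directory.
ctx=$(mktemp -d)

# A hypothetical script for the image to run.
printf '#!/bin/sh\necho "hello from the container"\n' > "$ctx/app.sh"
chmod +x "$ctx/app.sh"

# The Dockerfile itself; every name below is an example value.
cat > "$ctx/Dockerfile" <<'EOF'
FROM alpine:3.19
LABEL description="sample image for this section"
WORKDIR /app
COPY app.sh .
ENV GREETING=hello
CMD ["./app.sh"]
EOF

cat "$ctx/Dockerfile"
```

Running docker build "$ctx" afterwards would build an image from this context.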

Dockerfile Instructions

  • FROM: defines the base image; FROM must be the first instruction in a Dockerfile (only ARG may precede it).

  • LABEL: adds metadata to the image as key-value pairs, such as a description or version.

  • ADD: copies files into the image; unlike COPY, it also extracts local tar archives and accepts remote URLs.

  • COPY: copies files into the image; preferred over ADD.

  • VOLUME: creates a mount point as defined when the container is run.

  • ENTRYPOINT: the executable that runs when the container starts.

  • EXPOSE: documents the ports that should be published.

  • CMD: sets the default command or arguments for the container. The CMD instruction has three forms:

    • CMD ["executable","param1","param2"] (exec form, this is the preferred form)

    • CMD ["param1","param2"] (as default parameters to ENTRYPOINT)

    • CMD command param1 param2 (shell form)

    Only one CMD instruction takes effect in a Dockerfile. If you list more than one CMD, only the last one applies.

  • ENV: used to define environment variables in the container.

  • MAINTAINER: (deprecated; use LABEL instead) documents the author of the Dockerfile, typically with an email address.

  • ONBUILD: only used as a trigger when this image is used to build other images; will define commands to run "on build"

  • RUN: runs a command at build time and commits the result as a new layer.

  • WORKDIR: defines the working directory of the container.

Now to build an image from this Dockerfile, we'll use the build command.

The generic syntax for the command is docker build [OPTIONS] PATH | URL | -.

The build command requires a Dockerfile and the build's context. The context is the set of files and directories located in the specified location.

Docker will look for a Dockerfile in the context and use that to build the image.

Open up a terminal window inside that directory and execute docker build . (the command followed by a space and a single dot).

We're passing . as the build context which means the current directory.

If you put the Dockerfile inside another directory like ./src/Dockerfile, then the context will be ./src.

The build process may take some time to finish.

If everything goes fine, you should see something like Successfully built fc32da11d651 at the end. This string is the image ID, not a container ID.

Try docker image inspect <image id> to get information about this image.

To see the layers our image includes, try docker image history <image id>.
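The build, inspect, and history steps can be chained in a small helper. This is a sketch only: the function name build_and_show is hypothetical, it assumes a running Docker daemon, and it relies on docker build -q, which suppresses the build output and prints just the image ID on success.

```shell
# Build the Dockerfile found in a context directory, then show the
# resulting image's metadata and its layers.
build_and_show() {
    context=$1

    # -q prints only the image ID, so it can be captured in a variable.
    image_id=$(docker build -q "$context") || return 1

    docker image inspect "$image_id" &&   # full metadata as JSON
    docker image history "$image_id"      # one line per layer
}
```

Called as build_and_show . from the directory that holds the Dockerfile, it uses the current directory as the build context.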

Listing Images

To list local images, use docker image ls.

We can also use docker images, the older form of the same command.

The image we recently built shows up on the first line. We haven't tagged our image during the build process; we will talk about tagging images later in this section.

Pulling an image from default registry

To download a particular image, or set of images, use docker pull NAME[:TAG].

As we mentioned, Docker images can consist of multiple layers. In the example above, the image consists of two layers.

Remove one or more specific images

Use the docker images command to locate the ID of the images you want to remove.

When you’ve located the images you want to delete, you can pass their ID or tag to docker rmi:

For example, let's remove the debian image with docker rmi debian.

Remove dangling images

Docker images consist of multiple layers. Dangling images are layers that have no relationship to any tagged images.

They no longer serve a purpose and consume disk space. They can be located by adding the filter flag, -f with a value of dangling=true to the docker images command.

When you’re sure you want to delete them, you can use the docker image prune command.

As we haven't tagged our image, let's tag it before pruning dangling images.

docker image prune -a removes all images that do not have at least one container associated with them. The good news is that images being used by containers won't be deleted.
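The tag-then-prune sequence might look like the sketch below. The function name tag_then_prune is hypothetical, the image ID and name are supplied by the caller, and a running Docker daemon is assumed.

```shell
# Tag an untagged image so it is no longer dangling, then prune the rest.
tag_then_prune() {
    image_id=$1
    name=$2

    docker tag "$image_id" "$name" || return 1
    docker images -f dangling=true    # show what is about to be removed
    docker image prune -f             # -f skips the confirmation prompt
}
```

For instance, tag_then_prune fc32da11d651 myapp:1.0 would keep the image built earlier and remove the remaining dangling ones. Swapping the last command for docker image prune -a -f would also remove every image that no container uses.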

Tagging images

In simple words, Docker tags add useful information about a specific image version or variant.

They are aliases to the ID of your image which often look like this: f1477ec11d12.

It’s just a way of referring to your image. A good analogy is how Git tags refer to a particular commit in your history.

The two most common cases where tags come into play are:

  1. When building an image, we use docker build -t image_name:tag_name . (with the trailing dot).

It tells the Docker daemon to use the Dockerfile present in the current directory (that’s what the . at the end does). Next, we tell the Docker daemon to build the image and give it the specified tag.

If you need to push your image to a registry, use docker build -t username/image_name:tag_name . instead.

  2. Explicitly tagging an image through the tag command, docker tag SOURCE_IMAGE[:TAG] TARGET_IMAGE[:TAG].

It just creates an alias (a reference) by the name of the TARGET_IMAGE that refers to the SOURCE_IMAGE. That’s all it does. It’s like assigning an existing image another name to refer to it.

Notice how the tag is specified as optional here as well, by the [:TAG].

What happens when you don’t specify a tag?

Alright, now let’s uncover what happens when you don’t specify a tag while tagging an image. This is where the latest tag comes into the picture.

Whenever an image is tagged without an explicit tag, it’s given the latest tag by default. It’s an unfortunate naming choice that causes a lot of confusion.

But I like to think of it as the default tag that’s given to images when you don’t specify one.
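The defaulting rule can be illustrated with a few lines of shell. This is a simplified imitation, not Docker's actual reference parser: the function name with_default_tag is hypothetical, and it only checks the part of the name after the last slash, so that a registry port such as localhost:5000 is not mistaken for a tag.

```shell
# Print an image reference, appending :latest when no tag or digest is given.
with_default_tag() {
    ref=$1
    last=${ref##*/}    # the component after the final slash

    case $last in
        *:*|*@*) printf '%s\n' "$ref" ;;            # tag or digest already present
        *)       printf '%s\n' "${ref}:latest" ;;   # fall back to the default tag
    esac
}

with_default_tag nginx                  # prints: nginx:latest
with_default_tag myapp:1.0              # prints: myapp:1.0
with_default_tag localhost:5000/myapp   # prints: localhost:5000/myapp:latest
```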

Storing images in Docker Registry

A Docker registry is a stateless, highly scalable application that stores and lets you distribute Docker images. Registries can be local (private) or cloud-based (private or public).

Examples of Docker Registries:

  1. Docker Registry (local open-source registry)

  2. Docker Trusted Registry (DTR) [available in Docker Enterprise Edition]

  3. Docker Hub [Default Registry]

You need to create an account in Docker Hub first.

The first thing to remember is that any time you are going to use a registry, you first need to log in to that registry with docker login.

If we had a local Docker registry, it would be docker login localhost:5000.

And when you finish your job, log out with docker logout.

Pushing an image to the Default Registry

Use docker push to push an image or a repository to a registry.
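A push usually needs the image to carry a registry-qualified name first, so tagging and pushing tend to go together. A minimal sketch, assuming you are already logged in; publish_image is a hypothetical helper and both image names are placeholders:

```shell
# Give a local image a registry-qualified name and push it.
publish_image() {
    local_image=$1     # for example myapp:1.0
    remote_image=$2    # for example username/myapp:1.0

    docker tag "$local_image" "$remote_image" &&
    docker push "$remote_image"
}
```

A typical session would be docker login, then publish_image myapp:1.0 username/myapp:1.0, then docker logout.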

Searching for an image

Whether you are using a public or a private registry you can search that registry to find the image that you need.

And that is what the docker search command does for us.

Docker search has a very useful filtering option (--filter). You can filter the output based on these conditions:

  • stars=<numberOfStars>

  • is-automated=(true|false)

  • is-official=(true|false)

For example, docker search --filter is-official=true --filter stars=90 ubuntu searches for official ubuntu images that have at least 90 stars.

The --limit flag limits the maximum number of results returned by a search.

Saving and loading images

Pushing to Docker Hub is great, but it does have some disadvantages:

  1. Bandwidth - many ISPs have much lower upload bandwidth than download bandwidth.

  2. Unless you’re paying extra for the private repositories, pushing equals publishing.

  3. When working on some clusters, each time you launch a job that uses a Docker container it pulls the container from Docker Hub, and if you are running many jobs, this can be really slow.

One solution to these problems is to save the Docker image locally as a tar archive, which you can then easily load back as an image when needed.

To save a Docker image after you have pulled, committed or built it you use the docker save command.

For example, let's save a local copy of the myapp Docker image we made.

Docker supports two different methods for saving to a single tarball:

  • docker save: saves an image, including its layers and history, to a file

  • docker export: saves the filesystem of a container (running, paused, or stopped) to a file

If we want to load that Docker image from the archived tar file in the future, we can use the docker load command.

Difference between loading a saved image and importing an exported container as an image

Loading an image using the load command creates a new image including its history. Importing a container as an image using the import command creates a new image without the history, which results in a smaller image size compared to loading an image.
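The two round trips can be sketched as small helpers. The function names are hypothetical and the image, container, and archive names are supplied by the caller; the flags used (-o for save, -i for load, and - to make import read from standard input) are standard Docker CLI options.

```shell
# Save an image, with all of its layers and history, to a tar archive.
save_image() {
    docker save -o "$2" "$1"      # $1 = image, $2 = archive path
}

# Load an archive produced by docker save back into the local image store.
load_image() {
    docker load -i "$1"           # $1 = archive path
}

# Flatten a container's filesystem into a new image without history.
flatten_container() {
    docker export "$1" | docker import - "$2"   # $1 = container, $2 = new image name
}
```

For example, save_image myapp myapp.tar followed later by load_image myapp.tar restores the image as it was, while flatten_container web myapp:flat produces a new image from a container named web.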

Committing changes to an image

When working with Docker images and containers, one of the basic features is committing changes to a Docker image.

When you commit changes, you essentially create a new image with an additional layer on top of the base image's layers.

For example, let's run a container based on the nginx image.

Now let's attach to it and modify index.html.

Now let's press CTRL+P and then CTRL+Q to exit from the container without stopping it.

And finally, let's create a new image from this running container using the commit command.

And see the result:

Now we can run as many containers as we like from this image.
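The same workflow can be scripted without an interactive session. The sketch below is a variation on the steps above, using docker exec in place of attaching to the container; snapshot_nginx, the container name, the new image name, and the page content are hypothetical, and a running Docker daemon is assumed.

```shell
# Start nginx, change its index page, and commit the result as a new image.
snapshot_nginx() {
    container=$1
    new_image=$2

    # Run nginx in the background under a known name.
    docker run -d --name "$container" nginx || return 1

    # Overwrite the default page inside the running container.
    docker exec "$container" sh -c \
        'echo "<h1>Hello from a committed image</h1>" > /usr/share/nginx/html/index.html' || return 1

    # Record the container's writable layer as a new image.
    docker commit "$container" "$new_image"
}
```

After snapshot_nginx web mynginx:v1, a command such as docker run -d -p 8080:80 mynginx:v1 would start a new container that serves the modified page.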
