Building Efficient and Secure Containers
Dockerfiles serve as the blueprint for containerized applications, and adhering to best practices ensures efficiency, security, and maintainability. In this comprehensive guide, we'll explore step-by-step Dockerfile best practices, covering multi-stage builds, .dockerignore usage, ephemeral containers, and much more.
1. Multi-Stage Builds
Multi-stage builds help create smaller and more efficient Docker images. They involve using multiple FROM
statements in a single Dockerfile, each representing a build stage. This reduces the final image size by discarding unnecessary build artifacts.
# Build stage
FROM node:13.12.0-alpine as build
WORKDIR /app
COPY . .
RUN npm install
RUN npm run build
# Production stage
FROM nginx:1.17.9-alpine
COPY --from=build /app/dist /usr/share/nginx/html
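To see the effect, you can build the image and inspect its size (a sketch assuming the Dockerfile above and a hypothetical tag name of myapp):

```shell
# Build the multi-stage image; only the final nginx stage is kept in the tag
docker build -t myapp:latest .

# The node build stage and its node_modules are discarded, so the result
# stays close to the size of nginx:alpine
docker images myapp:latest
```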
2. Exclude with .dockerignore
Create a .dockerignore file to exclude unnecessary files from being copied into the Docker image. This reduces the build context and accelerates image builds.
node_modules
.git
Dockerfile
docker-compose.yml
3. Create Ephemeral Containers
Ephemeral containers are lightweight, disposable containers used for specific tasks. They help reduce the attack surface and enhance security.
Imagine you need to download a file from a specific URL and you want to achieve this using an ephemeral container. Here's how you can create a Dockerfile for this scenario:
# Use Alpine Linux as the base image
FROM alpine:3.14
# Install curl for downloading files
RUN apk --no-cache add curl
# Set the default command to download a file
CMD ["curl", "-O", "https://example.com/sample-file.txt"]
Now, let's build and run the ephemeral container:
# Build the Docker image
docker build -t download-container .
# Run the container to download the file
docker run --rm download-container
In this example:
The Dockerfile uses Alpine Linux as a lightweight base image.
It installs the curl command-line tool using the Alpine package manager (apk).
The default command (CMD) is set to use curl to download a sample file (sample-file.txt) from https://example.com. The -O flag saves the file under the same name it has on the server.
When you run the container with docker run --rm download-container, the container starts, executes the curl command to download the file, and then automatically removes itself (--rm) once the command completes. This ensures that the container is ephemeral, existing only for the duration of the task.
This approach is useful for tasks such as downloading files, fetching data, or performing one-time operations without the need for a long-running container. Ephemeral containers provide a clean and disposable environment for specific tasks, minimizing the attack surface and simplifying resource management.
4. Don't Install Unnecessary Packages
Minimize the number of installed packages in the final image. Remove unnecessary dependencies after installing the required packages.
RUN apk --no-cache add --virtual .build-deps \
build-dependency-package && \
apk del .build-deps
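As a fuller sketch (the package names are illustrative), build tools can be installed, used, and removed inside a single RUN instruction so they never persist in an image layer:

```dockerfile
FROM python:3.9-alpine
COPY requirements.txt .
# gcc and musl-dev are needed only to compile wheels; grouping the install,
# build, and removal in one RUN keeps them out of the final image layers
RUN apk --no-cache add --virtual .build-deps gcc musl-dev && \
    pip install --no-cache-dir -r requirements.txt && \
    apk del .build-deps
```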
5. Sort Multi-line Arguments
Sort multi-line arguments for better readability and to leverage Docker's build cache effectively.
RUN apk --no-cache add \
    package1 \
    package2 \
    package3
6. Leverage Build Cache
Leverage Docker's build cache by ordering your commands intelligently. Commands that change frequently should be placed towards the end of the Dockerfile.
Consider a scenario where you have a simple Python application that depends on external libraries specified in a requirements.txt file. Here's a Dockerfile for this scenario:
# Use an official Python image as the base image
FROM python:3.9
# Set the working directory in the container
WORKDIR /app
# Copy the requirements file into the container at /app
COPY requirements.txt .
# Install the dependencies
RUN pip install --no-cache-dir -r requirements.txt
# Copy the content of the local src directory to the /app directory in the container
COPY src/ .
# Set the default command to run the application
CMD ["python", "app.py"]
In this Dockerfile:
FROM python:3.9: Specifies the base image.
WORKDIR /app: Sets the working directory within the container.
COPY requirements.txt .: Copies the project's requirements.txt file into the container.
RUN pip install --no-cache-dir -r requirements.txt: Installs the Python dependencies. The --no-cache-dir flag ensures that pip's cache is not used during installation.
COPY src/ .: Copies the application source code into the container.
CMD ["python", "app.py"]: Sets the default command to run the Python application.
Now, consider the scenario where you make changes only to your source code (located in the src/ directory) and not to the requirements.txt file. When you rebuild the Docker image, Docker can leverage the build cache for the unchanged dependencies layer.
# Build the Docker image
docker build -t mypythonapp .
# Subsequent builds with no changes in requirements.txt will use the cache
docker build -t mypythonapp .
In the second build command, Docker detects that the requirements.txt file hasn't changed since the previous build, so it reuses the existing cached layers. This significantly speeds up the build process, especially when dealing with larger dependencies that don't change frequently.
By organizing your Dockerfile to take advantage of the build cache, you optimize the build process and reduce the time it takes to build your Docker images, making development and CI/CD pipelines more efficient.
7. Rootless Containers
Run containers as non-root users to enhance security. Create a non-root user and switch to it during the container execution.
RUN adduser -D myuser
USER myuser
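A slightly fuller sketch (the file names are illustrative) that also gives the non-root user ownership of the application directory:

```dockerfile
FROM alpine:3.14
RUN adduser -D myuser
WORKDIR /app
# Copy the application with the non-root user as owner
COPY --chown=myuser:myuser app.sh .
USER myuser
CMD ["./app.sh"]
```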
8. Make Executables Owned by Root and Not Writable
To minimize security risks, ensure that executables are owned by root and not writable by the non-root user that runs them.
COPY --chown=root:root myscript.sh /usr/bin/myscript.sh
RUN chmod +x /usr/bin/myscript.sh
9. Use Distroless or From Scratch
Consider using distroless images or a minimal base image like scratch for security and efficiency.
FROM gcr.io/distroless/base
COPY --from=build /app/myapp /myapp
CMD ["/myapp"]
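For a statically linked binary, scratch works the same way. A minimal sketch using Go (the module layout and binary name are assumptions):

```dockerfile
# Build stage: produce a static binary with no libc dependency
FROM golang:1.21 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /myapp .

# Final stage: the empty scratch image contains only the binary
FROM scratch
COPY --from=build /myapp /myapp
ENTRYPOINT ["/myapp"]
```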
10. Use Trusted Base Images
Choose base images from trusted sources to reduce the risk of vulnerabilities. Official images from Docker Hub or well-known repositories are recommended.
11. Update Your Images Frequently
Keep your base images and dependencies up to date to patch security vulnerabilities.
12. Exposed Ports
Explicitly document and expose only necessary ports in your Dockerfile.
EXPOSE 80
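Note that EXPOSE is documentation for image users; it does not publish the port by itself. Publishing still happens at run time (the image name here is hypothetical):

```shell
# Map host port 8080 to the container's exposed port 80
docker run -p 8080:80 myimage
```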
13. Prevent Confidential Data Leaks
Avoid including sensitive information directly in the Dockerfile. Use environment variables or Docker secrets instead.
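With BuildKit, one option is a secret mount: the credential is available to a single RUN step but is never written into an image layer. A sketch where the secret id, token file, and download URL are all illustrative:

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine:3.14
RUN apk --no-cache add curl
# The secret is mounted at /run/secrets/api_token only for this RUN step
RUN --mount=type=secret,id=api_token \
    curl -H "Authorization: Bearer $(cat /run/secrets/api_token)" \
    -o /artifact https://example.com/artifact
```

Build it with the secret supplied from a local file: docker build --secret id=api_token,src=token.txt .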
14. ADD, COPY
Prefer COPY over ADD for clarity. Use ADD only when you need its extra behavior, such as automatic tar extraction.
COPY src/ /app/src
15. Linting
Linting your Dockerfiles is a critical step to ensure they adhere to best practices and avoid potential issues. One popular tool for linting Dockerfiles is hadolint. Let's see how you can integrate linting into your Dockerfile development process:
Install hadolint
# Using Homebrew on macOS
brew install hadolint
# On Linux, download the static binary from the GitHub releases page
sudo wget -O /usr/local/bin/hadolint \
    https://github.com/hadolint/hadolint/releases/latest/download/hadolint-Linux-x86_64
sudo chmod +x /usr/local/bin/hadolint
Create a .hadolint.yaml Configuration File (Optional)
You can create a configuration file to customize linting rules. For example, create a .hadolint.yaml file in your project directory:
# .hadolint.yaml
ignored:
  - DL3006
This configuration ignores the DL3006 rule, which requires every FROM image to be tagged with an explicit version.
Run hadolint
Navigate to the directory containing your Dockerfile and run hadolint:
hadolint Dockerfile
Integrating hadolint into your development process helps catch syntax errors, adhere to best practices, and maintain consistency across Dockerfiles.
16. Locally Scan Images During Development
Scanning your Docker images for vulnerabilities during development is crucial to identify and address security issues early on. Tools like Trivy or Clair can help with this. Let's explore how to use Trivy for local image scanning:
Install Trivy
# Using Homebrew on macOS
brew install trivy
# Using the package manager on Linux (after adding Aqua Security's apt repository)
sudo apt-get install -y trivy
Scan a Docker Image
Assuming you have already built your Docker image, you can scan it using Trivy:
# Scan a local image
trivy image mypythonapp:latest
Trivy will analyze the image and provide information about known vulnerabilities.
Integrating image scanning into your CI/CD pipeline or development workflow ensures that you are aware of potential security risks in your Docker images. Regularly scan images and address vulnerabilities promptly to enhance the overall security of your containerized applications.
17. Include Health/Liveness Checks
Improve container reliability by including health checks in your Dockerfile.
HEALTHCHECK --interval=5s --timeout=3s \
CMD curl --fail http://localhost/ || exit 1
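Note that the check runs inside the container, so curl must exist in the image. A self-contained sketch based on the nginx image used earlier:

```dockerfile
FROM nginx:1.17.9-alpine
# curl must be present in the image for the health check to run
RUN apk --no-cache add curl
HEALTHCHECK --interval=5s --timeout=3s --retries=3 \
  CMD curl --fail http://localhost/ || exit 1
```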
18. Understand CMD and ENTRYPOINT
Understanding the distinction between CMD and ENTRYPOINT in a Dockerfile is crucial for defining the behavior of your container. Both CMD and ENTRYPOINT are instructions that specify what command should be run when a container starts, but they serve different purposes.
CMD Instruction
The CMD instruction sets the default command and/or parameters for the container. If a user runs a container without specifying a command, the CMD instruction will be used. It is often used for providing default behavior, and it can be overridden when running the container.
Example:
FROM node:14
# Set the default command for the container
CMD ["npm", "start"]
In this example, if someone runs the container without specifying a command, it will automatically run npm start. However, users can still override this command when running the container.
ENTRYPOINT Instruction
The ENTRYPOINT instruction sets the primary command that will be executed when the container starts. Unlike CMD, the command and any parameters specified in ENTRYPOINT cannot be easily overridden by users when running the container. If a user provides a command at runtime, it will be treated as arguments to the ENTRYPOINT.
Example:
FROM node:14
# Set the entry point for the container
ENTRYPOINT ["npm", "start"]
In this example, when someone runs the container, the specified command (npm start) becomes the primary command for the container, and users cannot easily override it.
Combining CMD and ENTRYPOINT
It's common to use both CMD and ENTRYPOINT together. When used together, CMD provides default arguments for the ENTRYPOINT instruction.
Example:
FROM node:14
# Set the entry point for the container
ENTRYPOINT ["npm", "start"]
# Set default arguments for the entry point
CMD ["--production"]
In this example, the container will run npm start --production by default. Users can still override the --production argument when running the container.
When to Use CMD and ENTRYPOINT
Use CMD for specifying default command arguments that can be easily overridden by users.
Use ENTRYPOINT for defining the main executable command of the container, especially when you want to enforce a specific behavior and prevent easy overrides.
Understanding and appropriately using CMD and ENTRYPOINT can help you design more flexible and predictable Docker images.
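The override behavior can be seen at the command line (assuming a hypothetical image named myimage built from the examples above):

```shell
# With CMD ["npm", "start"]: a command given at run time replaces CMD entirely
docker run myimage npm test        # runs "npm test" instead of "npm start"

# With ENTRYPOINT ["npm", "start"]: run-time arguments are appended
docker run myimage --production    # runs "npm start --production"
```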
Conclusion
Adhering to Dockerfile best practices is crucial for building secure, efficient, and maintainable containerized applications. These guidelines help you create containers that are lightweight, secure, and follow industry best practices. Regularly review and update your Dockerfiles to incorporate the latest best practices and security measures.