Docker security » History » Version 1
Peter Amstutz, 10/07/2016 03:02 AM
| 1 | 1 | Peter Amstutz | h1. Docker security |
|---|---|---|---|
| 2 | |||
| 3 | The fundamental Docker security issue is that a "root" (uid 0) user |
||
| 4 | inside a container is equivalent to "root" outside, unless steps are |
||
| 5 | taken to limit container permissions. We want to disallow containers |
||
| 6 | from sending data outside the private Arvados network, prevent |
||
| 7 | breakout from the container, and limit access if a breakout does |
||
| 8 | occur. We don't allow end users to invoke Docker directly, so we can |
||
| 9 | impose security measures both in the daemon configuration and the |
||
| 10 | individual container invocation. |
||
| 11 | |||
| 12 | Some of the knobs we have include: |
||
| 13 | |||
| 14 | h2. Setting the uid/gid of pid 1 in the container |
||
| 15 | |||
| 16 | We can explicitly set the uid/gid of pid 1 inside the container so it |
||
| 17 | is not uid 0. This overrides the USER directive of the image. One |
||
| 18 | drawback is that some programs behave badly when the current uid |
||
| 19 | cannot be found in /etc/passwd. |
||
| 20 | |||
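A minimal sketch of such an invocation, assuming @docker run --user@ is the mechanism used. The uid/gid values and the image name @example/image@ are placeholders chosen for illustration, and the function only prints the command so it can be inspected without a running Docker daemon.

```shell
# Print (not execute) a docker run command that starts pid 1 as a
# non-root uid:gid. All argument values are illustrative placeholders.
run_as_user_cmd() {
    uid=$1; gid=$2; image=$3
    echo "docker run --user ${uid}:${gid} ${image} id"
}

run_as_user_cmd 1000 1000 example/image
```

Running the printed command should report the chosen uid and gid instead of uid 0. If the uid has no entry in the image's /etc/passwd, expect it to appear without a user name, which is the situation that trips up some programs.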
| 21 | h2. User id mapping |
||
| 22 | docker daemon --userns-remap |
||
| 23 | |||
| 24 | User ids inside the container correspond to different host user ids. |
||
| 25 | This can map uid 0 inside the container to a non-root user outside the |
||
| 26 | container. It is unclear whether uid 0 inside the container still has some "root |
||
| 27 | powers" (like bypassing file access checks when accessing files inside |
||
| 28 | the container), or if this means uid 0 is just a regular unprivileged |
||
| 29 | user who happens to have a uid of 0. (More research necessary) |
||
| 30 | |||
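To make the mapping concrete, the sketch below translates a container uid into the host uid using the three fields of a @/proc/<pid>/uid_map@ line (first uid inside the namespace, first uid outside, length of the range). The range starting at host uid 100000 is only an example; the actual range depends on the host's /etc/subuid configuration.

```shell
# Translate a container uid to the corresponding host uid, given the
# three fields of one uid_map line: <inside-start> <outside-start> <count>.
host_uid() {
    container_uid=$1; inside=$2; outside=$3; count=$4
    if [ "$container_uid" -ge "$inside" ] && [ "$container_uid" -lt $((inside + count)) ]; then
        echo $((outside + container_uid - inside))
    else
        # The uid falls outside the mapped range.
        echo "unmapped"
    fi
}

host_uid 0 0 100000 65536      # prints 100000
host_uid 1000 0 100000 65536   # prints 101000
```

With a mapping like this, files created by uid 0 inside the container would be owned by host uid 100000, which has no special privileges on the host.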
| 31 | h2. Dropping capabilities |
||
| 32 | docker run --cap-drop |
||
| 33 | |||
| 34 | Drop capabilities of the root user inside the container ("man |
||
| 35 | capabilities" for list). Dropping all capabilities effectively |
||
| 36 | neuters the root user (for example, without CAP_DAC_OVERRIDE the root |
||
| 37 | user is subject to the same file permission checks as regular users). |
||
| 38 | Unclear if this is necessary when user id remapping is in effect; it |
||
| 39 | may be the case that when user id mapping is in effect |
||
| 40 | |||
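One possible shape for such an invocation is to drop everything with @--cap-drop=ALL@ and add back only what a given container needs with @--cap-add@. The sketch below only prints the command; the image name and the capabilities added back are illustrative assumptions, not a recommended set.

```shell
# Print a docker run command that drops every capability, then adds back
# only the ones listed after the image name (written without the CAP_
# prefix, e.g. CHOWN for CAP_CHOWN).
drop_caps_cmd() {
    image=$1; shift
    cmd="docker run --cap-drop=ALL"
    for cap in "$@"; do
        cmd="$cmd --cap-add=$cap"
    done
    echo "$cmd $image"
}

drop_caps_cmd example/image
drop_caps_cmd example/image CHOWN SETUID
```

Starting from an empty set and adding capabilities back keeps the list of granted privileges explicit and easy to review.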
| 41 | h2. Restrict container networking |
||
| 42 | Crunch v2 communicates via arv-mount, which means most containers |
||
| 43 | don't need networking to read/write to Keep. Crunch v2 policy is that |
||
| 44 | networking is disabled by default but can be enabled with the runtime |
||
| 45 | constraint API: true (necessary for Arvados-aware containers). The |
||
| 46 | Docker network bridge should be configured with a whitelist firewall |
||
| 47 | that limits communication to essential Arvados services (API server + |
||
| 48 | Keep server). |
||
| 49 | |||
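A sketch of how the default-off policy might translate into a @docker run@ flag. The mapping from the API runtime constraint to @--network=none@ or @--network=bridge@ is an assumption made for illustration, not a description of what crunch-run actually does.

```shell
# Choose the docker run network flag from the value of the "API" runtime
# constraint: anything other than "true" leaves networking disabled.
network_flag() {
    if [ "$1" = "true" ]; then
        echo "--network=bridge"
    else
        echo "--network=none"
    fi
}

echo "docker run $(network_flag false) example/image"
echo "docker run $(network_flag true) example/image"
```

With @--network=none@ the container gets only a loopback interface, so the firewall on the bridge matters only for containers that set API: true.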
| 50 | h2. Disable inter-container communication |
||
| 51 | docker daemon --icc=false |
||
| 52 | |||
| 53 | Our containers don't need to talk to each other. |
||
| 54 | |||
| 55 | h2. Resource limits via cgroups |
||
| 56 | |||
| 57 | Slurm can set up a cgroup (control group) to dictate resource limits, |
||
| 58 | and crunch-run can instruct Docker to put the container in the cgroup |
||
| 59 | set up by Slurm. Note, for this to work, we may need to invoke the |
||
| 60 | Docker daemon with this option: |
||
| 61 | --exec-opt native.cgroupdriver=cgroupfs |
||
| 62 | |||
| 63 | Further research is required to see if Slurm cgroup settings are |
||
| 64 | sufficient to prevent overloading the node or denial-of-service, or if |
||
| 65 | we need to set other limits (for example, a limit on the number of |
||
| 66 | processes inside the container to prevent fork bomb attacks). |
||
| 67 | |||
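As a sketch of the container side of this, @docker run@ accepts @--cgroup-parent@ to place the container under an existing cgroup and @--pids-limit@ to cap the number of processes. The cgroup path, the limit and the image name below are hypothetical placeholders; the real parent path would have to come from whatever Slurm sets up for the job.

```shell
# Print a docker run command that places the container under an existing
# cgroup and caps the number of processes it may create.
cgroup_run_cmd() {
    parent=$1; pids=$2; image=$3
    echo "docker run --cgroup-parent=${parent} --pids-limit=${pids} ${image}"
}

cgroup_run_cmd /slurm/uid_1000/job_42 256 example/image
```

A process limit of this kind is one candidate for the fork bomb protection mentioned above, independent of whatever memory and CPU limits the parent cgroup imposes.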
| 68 | h2. Resource limits via ulimit |
||
| 69 | |||
| 70 | We can also set ulimits on daemon invocation (--default-ulimit) and on |
||
| 71 | container invocation (--ulimit). ulimit has some overlap with cgroups |
||
| 72 | but the difference seems to be that most ulimit settings apply |
||
| 73 | per-process rather than to a group of processes. |
||
| 74 | |||
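Both flags take values of the form @<name>=<soft>[:<hard>]@. The helper below is a small sketch that builds a per-container @--ulimit@ argument; the limit names and numbers are examples only.

```shell
# Build a --ulimit argument; if no hard limit is given, reuse the soft one.
ulimit_flag() {
    name=$1; soft=$2; hard=${3:-$2}
    echo "--ulimit ${name}=${soft}:${hard}"
}

ulimit_flag nofile 1024 4096   # prints --ulimit nofile=1024:4096
ulimit_flag nproc 512          # prints --ulimit nproc=512:512
```

The same value syntax should apply to the daemon-wide @--default-ulimit@ option, which per-container @--ulimit@ settings override.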
| 75 | h2. seccomp |
||
| 76 | |||
| 77 | Seccomp filters system calls that can be made by programs inside the |
||
| 78 | container; many system calls it filters can also be blocked by |
||
| 79 | dropping capabilities. |
||
| 80 | https://docs.docker.com/engine/security/seccomp/ |
||
| 81 | |||
| 82 | h2. AppArmor |
||
| 83 | |||
| 84 | AppArmor can further limit what programs (including those running as "root") |
||
| 85 | inside the container can do. To be really effective, profiles need to be |
||
| 86 | tailored to specific application containers. |
||
| 87 | |||
| 88 | https://docs.docker.com/engine/security/apparmor/ |
||
| 89 | |||
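Custom AppArmor and seccomp profiles are both selected per container through @docker run --security-opt@. The sketch below builds those arguments; the profile name @arvados-container@ and the path @/etc/docker/seccomp-arvados.json@ are hypothetical and would have to be created and loaded separately.

```shell
# Build a --security-opt argument for a given mechanism
# ("apparmor" or "seccomp") and profile name or path.
security_opt() {
    kind=$1; profile=$2
    echo "--security-opt ${kind}=${profile}"
}

echo "docker run $(security_opt apparmor arvados-container) $(security_opt seccomp /etc/docker/seccomp-arvados.json) example/image"
```

When no option is given, Docker applies its default profiles where the host supports them, so a custom profile is only needed to tighten restrictions further.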
| 90 | h2. SELinux |
||
| 91 | |||
| 92 | docker daemon --selinux-enabled |
||
| 93 | |||
| 94 | Enable SELinux support. I don't know what that entails. |