This moves the bind mounts like /.dockerinit, /etc/hostname, volumes,
etc into the container namespace, by setting them up using lxc.
This is useful to avoid littering the global namespace with a lot of
mounts that are internal to each container and are not generally
needed on the outside. In particular, it seems that having a lot of
mounts is problematic wrt scaling to a lot of containers on systems
where the root filesystem is mounted --rshared.
Note that the "private" option is only supported by the native driver, as
lxc doesn't support setting this. This is not a huge problem, but it does
mean that some mounts are unnecessarily shared inside the container if you're
using the lxc driver.
Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
Use the DOCKER_RAMDISK env var to tell the native driver not to use
a pivot root when setting up the rootfs of a container.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
This reverts commit 82f797f14096430c3edbace1cd30e04a483ec41f.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
This reverts commit bd263f5b15b51747e3429179fef7fcb425ccbe4a.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
This reverts commit 757b5775725fb90262cee1fa6068fa9dcbbff59f.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
This reverts commit 5b5c884cc8266d0c2a56da0bc2df14cc9d5d85e8.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
We can't keep file descriptors without close-on-exec except with
syscall.ForkLock held, as otherwise they could leak by accident into
other children from forks in other threads.
Instead we just use Cmd.ExtraFiles which handles all this for us.
This fixes https://github.com/dotcloud/docker/issues/4493
Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
Now that we unmount all the mounts from the global namespace we can
use a private namespace rather than a slave one (as we have no need
for unmounts of inherited global mounts to propagate into the
container).
Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
Instead of keeping all the old mounts in the container namespace and
just using subtree as root we pivot_root so that the actual root in
the namespace is the root we want, and then we unmount the previous
mounts.
This has multiple advantages:
* The namespace mount tree is smaller (in the kernel)
* If you break out of the chroot you could previously access the host
filesystem. Now the host filesystem is fully invisible to the namespace.
* We get rid of all unrelated mounts from the parent namespace, which means
we don't hog these. This is important if we later switch to MS_PRIVATE instead
of MS_SLAVE as otherwise these mounts would be impossible to unmount from the
parent namespace.
Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
Remove logging for now because it is complicating things
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)