Commit graph

71 commits

Author SHA1 Message Date
Victor Marmol
c6e60b57a2 Merge pull request #5903 from alexlarsson/writable-proc
Make /proc writable, but not /proc/sys and /proc/sysrq-trigger
2014-05-19 12:21:15 -07:00
Alexander Larsson
6b97c80b4d Make /proc writable, but not /proc/sys and /proc/sysrq-trigger
Some applications want to write to /proc. For instance:

docker run -it centos groupadd foo

Gives: groupadd: failure while writing changes to /etc/group

And strace reveals why:

open("/proc/self/task/13/attr/fscreate", O_RDWR) = -1 EROFS (Read-only file system)

I've looked at what other systems do, and systemd-nspawn makes /proc read-write
and /proc/sys readonly, while lxc allows "proc:mixed" which does the same,
plus it makes /proc/sysrq-trigger also readonly.

The later seems like a prudent idea, so we follows lxc proc:mixed.
Additionally we make /proc/irq and /proc/bus, as these seem to let
you control various hardware things.

Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
2014-05-19 20:46:05 +02:00
Victor Marmol
c3b01dfb59 Merge pull request #5792 from bernerdschaefer/nsinit-supports-pdeathsig
Add PDEATHSIG support to nsinit library
2014-05-19 11:13:23 -07:00
Sridhar Ratnakumar
18a7cee3c7 fix panic when passing empty environment
Docker-DCO-1.1-Signed-off-by: Sridhar Ratnakumar <github@srid.name> (github: srid)
2014-05-16 11:55:34 -07:00
Bernerd Schaefer
f6ddb6051c nsinit.Init() restores parent death signal before exec
Docker-DCO-1.1-Signed-off-by: Bernerd Schaefer <bj.schaefer@gmail.com> (github: bernerdschaefer)
2014-05-16 13:48:41 +02:00
Michael Crosby
bafc6a6233 Merge pull request #5630 from rjnagal/libcontainer-fixes
Check supplied hostname before using it.
2014-05-06 09:49:52 -07:00
Rohit Jnagal
10b377c0fb Check supplied hostname before using it.
Docker-DCO-1.1-Signed-off-by: Rohit Jnagal <jnagal@google.com> (github: rjnagal)
2014-05-05 18:12:25 +00:00
Michael Crosby
65fb57349d Don't restrict lxc because of apparmor
We don't have the flexibility to do extra things with lxc because it is
a black box and most fo the magic happens before we get a chance to
interact with it in dockerinit.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-02 11:14:24 -07:00
Michael Crosby
593c632113 Apply apparmor before restrictions
There is not need for the remount hack, we use aa_change_onexec so the
apparmor profile is not applied until we exec the users app.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-01 19:09:12 -07:00
Michael Crosby
af5420420b Update restrictions for better handling of mounts
This also cleans up some of the left over restriction paths code from
before.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-01 15:26:58 -07:00
Michael Crosby
5d1a3b2ab5 Update to enable cross compile
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-01 15:26:58 -07:00
Jérôme Petazzoni
a5364236a7 Mount /proc and /sys read-only, except in privileged containers.
It has been pointed out that some files in /proc and /sys can be used
to break out of containers. However, if those filesystems are mounted
read-only, most of the known exploits are mitigated, since they rely
on writing some file in those filesystems.

This does not replace security modules (like SELinux or AppArmor), it
is just another layer of security. Likewise, it doesn't mean that the
other mitigations (shadowing parts of /proc or /sys with bind mounts)
are useless. Those measures are still useful. As such, the shadowing
of /proc/kcore is still enabled with both LXC and native drivers.

Special care has to be taken with /proc/1/attr, which still needs to
be mounted read-write in order to enable the AppArmor profile. It is
bind-mounted from a private read-write mount of procfs.

All that enforcement is done in dockerinit. The code doing the real
work is in libcontainer. The init function for the LXC driver calls
the function from libcontainer to avoid code duplication.

Docker-DCO-1.1-Signed-off-by: Jérôme Petazzoni <jerome@docker.com> (github: jpetazzo)
2014-05-01 15:26:58 -07:00
Michael Crosby
8cd88f75fa Remove command factory and NsInit interface from libcontainer
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-30 17:55:15 -07:00
Michael Crosby
2db754f3ee Export more functions from libcontainer
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-30 17:18:07 -07:00
Michael Crosby
f98f4455b9 Export SetupUser
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-30 15:27:59 -07:00
Michael Crosby
32e9beb86e Remove logger from nsinit struct
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-30 15:24:18 -07:00
Guillaume J. Charmes
b6344f992e Merge pull request #5448 from crosbymichael/selinux-defaults
Add selinux label support for processes and mount
2014-04-30 14:14:39 -07:00
Michael Crosby
8e22ca2eed Merge pull request #5464 from tianon/close-leftover-fds 2014-04-30 12:27:52 -07:00
Tianon Gravi
c1dad4d063 Close extraneous file descriptors in containers
Without this patch, containers inherit the open file descriptors of the daemon, so my "exec 42>&2" allows us to "echo >&42 some nasty error with some bad advice" directly into the daemon log. :)

Also, "hack/dind" was already doing this due to issues caused by the inheritance, so I'm removing that hack too since this patch obsoletes it by generalizing it for all containers.

Docker-DCO-1.1-Signed-off-by: Andrew Page <admwiggin@gmail.com> (github: tianon)
2014-04-29 16:45:28 -06:00
Michael Crosby
48d893cc6b Initial work on selinux patch
This has every container using the docker daemon's pid for the processes
label so it does not work correctly.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-29 03:40:05 -07:00
Tianon Gravi
c54bc4ca04 Remove "root" and "" special cases in libcontainer
These are unnecessary since the user package handles these cases properly already (as evidenced by the LXC backend not having these special cases).

I also updated the errors returned to match the other libcontainer error messages in this same file.

Also, switching from Setresuid to Setuid directly isn't a problem, because the "setuid" system call will automatically do that if our own effective UID is root currently: (from `man 2 setuid`)

    setuid() sets the effective user ID of the calling process.  If the
    effective UID of the caller is root, the real UID and saved set-user-
    ID are also set.

Docker-DCO-1.1-Signed-off-by: Andrew Page <admwiggin@gmail.com> (github: tianon)
2014-04-28 16:46:03 -06:00
Michael Crosby
2ecea22c8c Update init for new apparmor import path
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-24 10:35:20 -07:00
Michael Crosby
e40bde54a5 Move capabilities into security pkg
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-24 10:35:20 -07:00
Michael Crosby
824ee83816 Move rest of console functions to pkg
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-24 10:35:20 -07:00
Michael Crosby
a77846506b Refactor mounts into pkg to make changes easier
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-24 10:35:20 -07:00
Michael Crosby
3d546f20db Add restrictions to proc in libcontainer
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-24 10:35:19 -07:00
Michael Crosby
b5434b5d7f Move apparmor into security sub dir
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-24 10:35:19 -07:00
Michael Crosby
ca3224687b Move apparmor to top level pkg
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-13 23:33:25 +00:00
Victor Vieux
534990bda9 Merge pull request #4953 from rhatdan/selinux
These two patches should fix problems we see with running docker in the wild.
2014-04-02 16:36:41 -07:00
Dan Walsh
f71121b1fa In certain cases, setting the process label will not happen.
When the code attempts to set the ProcessLabel, it checks if SELinux Is
enabled.  We have seen a case with some of our patches where the code
is fooled by the container to think that SELinux is not enabled.  Calling
label.Init before setting up the rest of the container, tells the library that
SELinux is enabled and everything works fine.

Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: rhatdan)
2014-04-01 13:30:10 -04:00
Michael Crosby
6a7bbbdddc Don't send prctl to be consistent with other drivers
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-01 07:12:50 +00:00
Michael Crosby
42be9fb9d2 Merge branch 'master' into pluginflag
Conflicts:
	pkg/cgroups/cgroups.go
	pkg/libcontainer/nsinit/exec.go
	pkg/libcontainer/nsinit/init.go
	pkg/libcontainer/nsinit/mount.go
	runconfig/hostconfig.go
	runconfig/parse.go
	runtime/execdriver/driver.go
	runtime/execdriver/lxc/lxc_template.go
	runtime/execdriver/lxc/lxc_template_unit_test.go
	runtime/execdriver/native/default_template.go
	runtime/execdriver/native/driver.go

Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-03-27 08:00:18 +00:00
Dan Walsh
2b01982d1f This patch adds SELinux labeling support.
docker will run the process(es) within the container with an SELinux label and will label
all of  the content within the container with mount label.  Any temporary file systems
created within the container need to be mounted with the same mount label.

The user can override the process label by specifying

-Z With a string of space separated options.

-Z "user=unconfined_u role=unconfined_r type=unconfined_t level=s0"

Would cause the process label to run with unconfined_u:unconfined_r:unconfined_t:s0"

By default the processes will run execute within the container as svirt_lxc_net_t.
All of the content in the container as svirt_sandbox_file_t.

The process mcs level is based of the PID of the docker process that is creating the container.

If you run the container in --priv mode, the labeling will be disabled.

Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: rhatdan)
2014-03-26 15:30:40 -04:00
Johan Euphrosine
76d3c99530 libcontainer/nsinit/init: move mount namespace after network
Docker-DCO-1.1-Signed-off-by: Johan Euphrosine <proppy@google.com> (github: proppy)
2014-03-18 16:18:04 -07:00
Timothy Hobbs
1cad0a1d6c Fix issue #4681 - No loopback interface within container when networking is disabled.
Docker-DCO-1.1-Signed-off-by: Timothy Hobbs <timothyhobbs@seznam.cz> (github: https://github.com/timthelion)

Remove loopback code from veth strategy

Docker-DCO-1.1-Signed-off-by: Timothy Hobbs <timothyhobbs@seznam.cz> (github: https://github.com/timthelion)

Looback strategy: Get rid of uneeded code in Create
Docker-DCO-1.1-Signed-off-by: Timothy Hobbs <timothyhobbs@seznam.cz> (github: https://github.com/timthelion)

Use append when building network strategy list

Docker-DCO-1.1-Signed-off-by: Timothy Hobbs <timothyhobbs@seznam.cz> (github: https://github.com/timthelion)

Swap loopback and veth strategies in Networks list

Docker-DCO-1.1-Signed-off-by: Timothy Hobbs <timothyhobbs@seznam.cz> (github: https://github.com/timthelion)

Revert "Swap loopback and veth strategies in Networks list"

This reverts commit 3b8b2c8454171d79bed5e9a80165172617e92fc7.

Docker-DCO-1.1-Signed-off-by: Timothy Hobbs <timothyhobbs@seznam.cz> (github: https://github.com/timthelion)

When initializing networks, only return from the loop if there is an error

Docker-DCO-1.1-Signed-off-by: Timothy Hobbs <timothyhobbs@seznam.cz> (github: https://github.com/timthelion)
2014-03-17 22:01:24 +01:00
Guillaume J. Charmes
dce2f20bec Merge pull request #4645 from crosbymichael/add-logger
Add logger to libcontainer
2014-03-17 11:30:14 -07:00
Michael Crosby
a518a10209 Send sigterm to child instead of sigkill
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-03-14 15:42:05 -07:00
Michael Crosby
9183242a5d Add stderr log ouput if in debug
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-03-14 09:55:05 -07:00
Michael Crosby
f7eec3dd13 Add initial logging to libcontainer
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-03-14 09:55:05 -07:00
Alexander Larsson
eebaba3215 Move all bind-mounts in the container inside the namespace
This moves the bind mounts like /.dockerinit, /etc/hostname, volumes,
etc into the container namespace, by setting them up using lxc.

This is useful to avoid littering the global namespace with a lot of
mounts that are internal to each container and are not generally
needed on the outside. In particular, it seems that having a lot of
mounts is problematic wrt scaling to a lot of containers on systems
where the root filesystem is mounted --rshared.

Note that the "private" option is only supported by the native driver, as
lxc doesn't support setting this. This is not a huge problem, but it does
mean that some mounts are unnecessarily shared inside the container if you're
using the lxc driver.

Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
2014-03-13 20:01:29 +01:00
Michael Crosby
da605a43d6 Add env var to toggle pivot root or ms_move
Use the  DOCKER_RAMDISK env var to tell the native driver not to use
a pivot root when setting up the rootfs of a container.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-03-06 19:30:52 -08:00
Michael Crosby
fd8470acba Ensure that native containers die with the parent
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-03-06 16:30:56 -08:00
Michael Crosby
c0d5c529bb Remove the ghosts and kill everything
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-03-06 15:30:26 -08:00
Guillaume J. Charmes
0ecd2aa284 Use CGO for apparmor profile switch
Docker-DCO-1.1-Signed-off-by: Guillaume J. Charmes <guillaume.charmes@docker.com> (github: creack)
2014-03-06 11:10:58 -08:00
Michael Crosby
0eb4ea2f79 Some cleanup around logs
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-03-05 13:50:49 -08:00
Guillaume J. Charmes
73233223de Add AppArmor support to native driver + change pipe/dup logic
Docker-DCO-1.1-Signed-off-by: Guillaume J. Charmes <guillaume.charmes@docker.com> (github: creack)
2014-03-05 13:08:24 -08:00
Michael Crosby
7dc071dca5 Factor out finalize namespace
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-03-03 12:15:47 -08:00
Michael Crosby
85696fdb67 Allow child process to live if daemon dies
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-02-27 09:33:36 -08:00
Michael Crosby
4f6cdc6f08 Make network a slice to support multiple types
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-02-26 14:20:41 -08:00
Michael Crosby
6daf56799f Refactor and improve libcontainer and driver
Remove logging for now because it is complicating things
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-02-24 21:11:52 -08:00