Commit graph

78 commits

Author SHA1 Message Date
Samuel Ortiz
be5084387c server: Serialize container/pod creation with updates
Interleaving asynchronous updates with pod or container creations can
lead to unrecoverable races and corruptions of the pod or container hash
tables. This is fixed by serializing update against pod or container
creation operations, while pod and container creation operations can
run in parallel.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2017-04-04 18:43:21 +02:00
Samuel Ortiz
d1006fdfbc server: Add new sandboxes to the sandbox hash table first
We want new sandboxes to be added to the sandbox hash table before
adding their ID to the pod Index registrar, in order to avoid potential
Update() races.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2017-04-04 17:22:34 +02:00
Mrunal Patel
4ccc5bbe7c Set the container hostnames same as pod hostname
Signed-off-by: Mrunal Patel <mpatel@redhat.com>
2017-03-29 16:11:57 -07:00
Mrunal Patel
d69ad9b5a3 Fix lint issues
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2017-03-27 10:21:30 -07:00
Samuel Ortiz
72129ee3fb sandbox: Track and store the pod resolv.conf path
When we get a pod with DNS settings, we need to build
a resolv.conf file and mount it in all pod containers.
In order to do that, we have to track the built resolv.conf
file and store/load it.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2017-03-24 15:28:14 +01:00
Daniel J Walsh
19620f3d1e Switch to using opencontainers/selinux
We have moved selinux support out of opencontainers/runc into its
own package.  This patch moves to using the new selinux go bindings.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2017-03-23 15:53:09 -04:00
Daniel J Walsh
ff950a8e37 Set SELinux mount label for pod sandbox
The pause container is creating an AVC since the /dev/null device
is not labeled correctly.  Looks like we are only setting the label of
the process not the label of the content inside of the container.
This change will label content in the pause container correctly and
eliminate the AVC.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2017-03-16 14:09:38 -04:00
Mrunal Patel
8c0ff7d904 Run conmon under cgroups (systemd)
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2017-03-06 15:08:46 -08:00
Pengfei Ni
3195f45904 Merge pull request #367 from sameo/topic/host-privileged-runtime
Support alternate runtime for host privileged operations
2017-03-05 07:53:20 +08:00
Mrunal Patel
38f497a701 Fix cgroup parent
We were using a variable before it was set.

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2017-03-03 16:38:46 -08:00
Samuel Ortiz
2ec696be41 server: Set sandbox and container privileged flags
The sandbox privileged flag is set to true only if either the
pod configuration privileged flag is set to true or when any
of the pod namespaces are the host ones.

A container inherit its privileged flag from its sandbox, and
will be run by the privileged runtime only if it's set to true.
In other words, the privileged runtime (when defined) will be
when one of the below conditions is true:

- The sandbox will be asked to run at least one privileged container.
- The sandbox requires access to either the host IPC or networking
  namespaces.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2017-03-03 19:06:04 +01:00
Andrew Pilloud
44e7e88ff3 Run without seccomp support
Signed-off-by: Andrew Pilloud <andrewpilloud@igneoussystems.com>
2017-02-21 16:47:03 -08:00
Michał Żyłowski
5c81217e09 Applying k8s.io v3 API for ocic and ocid
Signed-off-by: Michał Żyłowski <michal.zylowski@intel.com>
2017-02-06 13:05:10 +01:00
Nalin Dahyabhai
c0333b102b Integrate containers/storage
Use containers/storage to store images, pod sandboxes, and containers.
A pod sandbox's infrastructure container has the same ID as the pod to
which it belongs, and all containers also keep track of their pod's ID.

The container configuration that we build using the data in a
CreateContainerRequest is stored in the container's ContainerDirectory
and ContainerRunDirectory.

We catch SIGTERM and SIGINT, and when we receive either, we gracefully
exit the grpc loop.  If we also think that there aren't any container
filesystems in use, we attempt to do a clean shutdown of the storage
driver.

The test harness now waits for ocid to exit before attempting to delete
the storage root directory.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2017-01-18 10:23:30 -05:00
Jacek J. Łakis
b034072d6a sandbox_run: Do not run net plugin in host namespace
Signed-off-by: Jacek J. Łakis <jacek.lakis@intel.com>
2017-01-16 16:53:29 +01:00
Mrunal Patel
6df58df215 Add support for systemd cgroups
Signed-off-by: Mrunal Patel <mpatel@redhat.com>
2016-12-19 16:31:29 -08:00
Harry Zhang
02dfe877e4 Add container to pod qos cgroup
Signed-off-by: Harry Zhang <harryz@hyper.sh>
2016-12-15 14:42:59 +08:00
Samuel Ortiz
a9724c2c9c
sandbox: Fix gocyclo complexity
With the networking namespace code added, we were reaching a
gocyclo complexitiy of 52. By moving the container creation and
starting code path out, we're back to reasonable levels.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-12-12 19:48:23 +01:00
Samuel Ortiz
482eb460d6
sandbox: Setup networking namespace before sandbox creation
In order for hypervisor based container runtimes to be able to
fully prepare their pod virtual machines networking interfaces,
this patch sets the pod networking namespace before creating the
sandbox container.

Once the sandbox networking namespace is prepared, the runtime
can scan the networking namespace interfaces and build the pod VM
matching interfaces (typically TAP interfaces) at pod sandbox
creation time. Not doing so means those runtimes would have to
rely on all hypervisors to support networking interfaces hotplug.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-12-12 19:48:23 +01:00
Samuel Ortiz
4cab8ed06a
sandbox: Use persistent networking namespace
Because they need to prepare the hypervisor networking interfaces
and have them match the ones created in the pod networking
namespace (typically to bridge TAP and veth interfaces), hypervisor
based container runtimes need the sandbox pod networking namespace
to be set up before it's created. They can then prepare and start
the hypervisor interfaces when creating the pod virtual machine.

In order to do so, we need to create per pod persitent networking
namespaces that we pass to the CNI plugin. This patch leverages
the CNI ns package to create such namespaces under /var/run/netns,
and assign them to all pod containers.
The persitent namespace is removed when either the pod is stopped
or removed.

Since the StopPodSandbox() API can be called multiple times from
kubelet, we track the pod networking namespace state (closed or
not) so that we don't get a containernetworking/ns package error
when calling its Close() routine multiple times as well.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-12-12 19:48:23 +01:00
Antonio Murdaca
430297dd81
store annotations and image for a container
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2016-12-12 11:12:03 +01:00
Mrunal Patel
a0177ced09 Remove unnecessary check for mount label for /dev/shm
Signed-off-by: Mrunal Patel <mpatel@redhat.com>
2016-12-09 09:37:47 -08:00
Mrunal Patel
be29524ba4 Add support for pod /dev/shm that is shared by the pod ctrs
Signed-off-by: Mrunal Patel <mpatel@redhat.com>
2016-12-08 15:32:17 -08:00
Samuel Ortiz
60123a77ce server: Export more container metadata for VM containers
VM base container runtimes (e.g. Clear Containers) will run each pod
in a VM and will create containers within that pod VM. Unfortunately
those runtimes will get called by ocid with the same commands
(create and start) for both the pause containers and subsequent
containers to be added to the pod namespace. Unless they work around
that by e.g. infering that a container which rootfs is under
"/pause" would represent a pod, they have no way to decide if they
need to create/start a VM or if they need to add a container to an
already running VM pod.

This patch tries to formalize this difference through pod
annotations. When starting a container or a sandbox, we now add 2
annotations for the container type (Infrastructure or not) and the
sandbox name. This will allow VM based container runtimes to handle
2 things:

- Decide if they need to create a pod VM or not.
- Keep track of which pod ID runs in a given VM, so that they
  know to which sandbox they have to add containers.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-11-29 10:24:33 +01:00
Mrunal Patel
b6f1b027eb Merge pull request #213 from runcom/bump-runtime-tools
*: bump opencontainers/runtime-tools
2016-11-24 08:29:43 -08:00
Antonio Murdaca
70481bc5af
*: bump opencontainers/runtime-tools
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2016-11-24 12:26:18 +01:00
HaoZhang
d1e1b7c183 pass sysctls down to oci runtime
Signed-off-by: HaoZhang <crazykev@zju.edu.cn>
2016-11-24 16:29:37 +08:00
Antonio Murdaca
ebe2ea0dba
server: split sandboxes actions
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2016-11-22 23:23:01 +01:00