We add 2 ocid options for choosing the CNI configuration and plugin
binaries directories: --cni-config-dir and --cni-plugin-dir.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Add an intermediate API layer that uses containers/storage, and a
containers/image that has been patched to use it, to manage images and
containers, storing the data that we need to know about containers and
pods in the metadata fields provided by containers/storage.
While ocid manages pods and containers as different types of items, with
disjoint sets of IDs and names, it remains true that every pod includes
at least one container. When a container's only purpose is to serve as
a home for namespaces that are shared with the other containers in the
pod, it is referred to as the pod's infrastructure container.
At the storage level, a pod is stored as its set of containers. We keep
track of both pod IDs and container IDs in the metadata field of
Container objects that the storage library manages for us. Containers
which bear the same pod ID are members of the pod which has that ID.
Other information about the pod, which ocid needs to remember in order
to answer requests for information about the pod, is also kept in the
metadata field of its member containers.
The container's runtime configuration should be stored in the
container's ContainerDirectory, and used as a template. Each time the
container is about to be started, its layer should be mounted, that
configuration template should be read, the template's rootfs location
should be replaced with the mountpoint for the container's layer, and
the result should be saved to the container's ContainerRunDirectory,
for use as the configuration for the container.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Any binary that will be managing storage needs to initialize the reexec
package in order to be able to apply or read image layers.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Update the versions of containers/storage and containers/image, and add
new dependencies that they pull in.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Add the necessary build tags and configuration so that integration tests
can properly build against device mapper and btrfs libraries.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
ns.Close() will not remove and unmount the networking namespace
if it's not currently marked as mounted.
When we restore a sandbox, we generate the sandbox netns from
ns.GetNS() which does not mark the sandbox as mounted.
There currently is a PR open to fix that in the ns package:
https://github.com/containernetworking/cni/pull/342
but meanwhile this patch fixes a netns leak when restoring a pod.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
In order to workaround a bug introduced with runc commit bc84f833,
we create a symbolic link to our permanent networking namespace so
that runC realizes that this is not the host namespace.
Although this bug is now fixed upstream (See commit f33de5ab4), this
patch works with pre rc3 runC versions.
We may want to revert that patch once runC 1.0.0 is released.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
With the networking namespace code added, we were reaching a
gocyclo complexitiy of 52. By moving the container creation and
starting code path out, we're back to reasonable levels.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
In order for hypervisor based container runtimes to be able to
fully prepare their pod virtual machines networking interfaces,
this patch sets the pod networking namespace before creating the
sandbox container.
Once the sandbox networking namespace is prepared, the runtime
can scan the networking namespace interfaces and build the pod VM
matching interfaces (typically TAP interfaces) at pod sandbox
creation time. Not doing so means those runtimes would have to
rely on all hypervisors to support networking interfaces hotplug.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Because they need to prepare the hypervisor networking interfaces
and have them match the ones created in the pod networking
namespace (typically to bridge TAP and veth interfaces), hypervisor
based container runtimes need the sandbox pod networking namespace
to be set up before it's created. They can then prepare and start
the hypervisor interfaces when creating the pod virtual machine.
In order to do so, we need to create per pod persitent networking
namespaces that we pass to the CNI plugin. This patch leverages
the CNI ns package to create such namespaces under /var/run/netns,
and assign them to all pod containers.
The persitent namespace is removed when either the pod is stopped
or removed.
Since the StopPodSandbox() API can be called multiple times from
kubelet, we track the pod networking namespace state (closed or
not) so that we don't get a containernetworking/ns package error
when calling its Close() routine multiple times as well.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>