We build paths using g_build_filename and g_strdup_printf() instead
which means we don't have any arbitrary pathname lenght issue, and
the code becomes cleaner.
We also convert asprintf to g_strdup_printf so that we can use
the glib OOM checker instead of open coding it everywhere.
Signed-off-by: Alexander Larsson <alexl@redhat.com>
This moves the timeout handling from the go code to conmon, whic
removes some of the complexity from criod, and additionally it will
makes it possible to do the double-fork in the exec case too.
Signed-off-by: Alexander Larsson <alexl@redhat.com>
We need to do this, because otherwise we will continue and exit the
pid before systemd has a chance to look at it.
Signed-off-by: Alexander Larsson <alexl@redhat.com>
This patch also hides the profile under the debug flag as there's
runtime cost to enable the profiler.
This removes the old way of profiling (CPU) as that's not really
needed.
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
This patch isn't adding a test for /etc/hosts as that requires host
network and we don't want to play with host's /etc/hosts when running
make localintegration on our laptops. That may change in the future
moving to some sort of in-container testing.
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
Currently, when creating containers we never call Wait on the
conmon exec.Command, which means that the child hangs around
forever as a zombie after it dies.
However, instead of doing this waitpid() in the parent we instead
do a double-fork in conmon, to daemonize it. That makes a lot of
sense, as conmon really is not tied to the launcher, but needs
to outlive it if e.g. the cri-o daemon restarts.
However, this makes even more obvious a race condition which we
already have. When crio-d puts the conmon pid in a cgroup there
is a race where conmon could already have spawned a child, and
it would then not be part of the cgroup. In order to fix this
we add another synchronization pipe to conmon, which we block
on before we create any children. The parent then makes sure the
pid is in the cgroup before letting it continue.
Signed-off-by: Alexander Larsson <alexl@redhat.com>
If we get a kubelet annotation about the sandbox trust level, we use it
to toggle our sandbox trust flag.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Container runtimes provide different levels of isolation, from kernel
namespaces to hardware virtualization. When starting a specific
container, one may want to decide which level of isolation to use
depending on how much we trust the container workload. Fully verified
and signed containers may not need the hardware isolation layer but e.g.
CI jobs pulling packages from many untrusted sources should probably not
run only on a kernel namespace isolation layer.
Here we allow CRI-O users to define a container runtime for trusted
containers and another one for untrusted containers, and also to define
a general, default trust level. This anticipates future kubelet
implementations that would be able to tag containers as trusted or
untrusted. When missing a kubelet hint, containers are trusted by
default.
A container becomes untrusted if we get a hint in that direction from
kubelet or if the default trust level is set to "untrusted" and the
container is not privileged. In both cases CRI-O will try to use the
untrusted container runtime. For any other cases, it will switch to the
trusted one.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>