Commit graph

86 commits

Author SHA1 Message Date
Alexander Larsson
af4fbcd942 conmon: Don't leave zombies and fix cgroup race
Currently, when creating containers we never call Wait on the
conmon exec.Command, which means that the child hangs around
forever as a zombie after it dies.

However, rather than doing this waitpid() in the parent, we do a
double-fork in conmon to daemonize it. That makes a lot of
sense, as conmon really is not tied to the launcher, but needs
to outlive it if e.g. the cri-o daemon restarts.

However, this makes even more obvious a race condition which we
already have. When crio-d puts the conmon pid in a cgroup there
is a race where conmon could already have spawned a child, and
it would then not be part of the cgroup. In order to fix this
we add another synchronization pipe to conmon, which we block
on before we create any children. The parent then makes sure the
pid is in the cgroup before letting it continue.
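
The handshake described above can be sketched roughly as follows. This is
an in-process simulation in Go, not conmon's actual C code: a goroutine
stands in for conmon, the caller stands in for the launcher, and the name
runWithStartPipe and the event strings are illustrative assumptions.

```go
package main

import (
	"fmt"
	"os"
)

// runWithStartPipe simulates the synchronization pipe in a single process.
// The goroutine plays the role of conmon: it blocks on the read end of the
// pipe before doing anything that would create children. The caller plays
// the launcher: it finishes its cgroup bookkeeping and only then writes a
// byte to release the other side. The returned slice records the order of
// the two steps.
func runWithStartPipe() ([]string, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return nil, err
	}
	defer r.Close()

	events := make(chan string, 2)
	done := make(chan error, 1)

	go func() {
		buf := make([]byte, 1)
		// Blocks until the launcher writes to the pipe.
		if _, err := r.Read(buf); err != nil {
			done <- err
			return
		}
		events <- "child may spawn children"
		done <- nil
	}()

	// Stand-in for "put the conmon pid in the cgroup".
	events <- "pid added to cgroup"

	// Release the blocked side.
	if _, err := w.Write([]byte{0}); err != nil {
		w.Close()
		return nil, err
	}
	w.Close()

	if err := <-done; err != nil {
		return nil, err
	}
	close(events)

	var order []string
	for e := range events {
		order = append(order, e)
	}
	return order, nil
}

func main() {
	order, err := runWithStartPipe()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, e := range order {
		fmt.Printf("%d: %s\n", i+1, e)
	}
}
```

Because the blocked side cannot proceed until the byte arrives, the cgroup
step is always recorded first, which is the ordering the fix relies on.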

Signed-off-by: Alexander Larsson <alexl@redhat.com>
2017-06-15 14:20:40 +02:00
Mrunal Patel
7b9032bac7 Merge pull request #579 from alexlarsson/non-terminal-attach
Implement non-terminal attach
2017-06-14 21:45:44 -07:00
Alexander Larsson
7bb957bf75 Implement non-terminal attach
We use a SOCK_SEQPACKET socket for the attach unix domain socket, which
means the kernel will ensure that the reading side only ever gets the
data from one write operation. We use this for framing, where the
first byte identifies the pipe that the remaining bytes are for. We have to make sure
that all reads from the socket are using at least the same size of buffer
as the write side, because otherwise the extra data in the message
will be dropped.
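
A minimal sketch of this framing, written in Go for illustration. The pipe
identifier values, the maxPayload bound and the function names are all
assumptions; the commit message does not give the real constants.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical pipe identifiers; the real values are not stated in the
// commit message.
const (
	pipeStdin  byte = 1
	pipeStdout byte = 2
	pipeStderr byte = 3
)

// maxPayload is an assumed upper bound on the data carried by one message.
// A reader must receive into a buffer of at least maxPayload+1 bytes,
// otherwise the tail of a SOCK_SEQPACKET message is dropped.
const maxPayload = 8192

// encodeFrame builds one message: a single pipe byte followed by the data.
func encodeFrame(pipe byte, payload []byte) ([]byte, error) {
	if len(payload) > maxPayload {
		return nil, errors.New("payload too large for one message")
	}
	msg := make([]byte, 0, 1+len(payload))
	msg = append(msg, pipe)
	return append(msg, payload...), nil
}

// decodeFrame splits one received message into its pipe byte and data.
func decodeFrame(msg []byte) (byte, []byte, error) {
	if len(msg) == 0 {
		return 0, nil, errors.New("empty message")
	}
	return msg[0], msg[1:], nil
}

func main() {
	msg, err := encodeFrame(pipeStdout, []byte("hello"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	pipe, data, err := decodeFrame(msg)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("pipe=%d data=%q\n", pipe, data)
}
```

Since each write arrives as exactly one message, no length prefix is needed;
the message boundary itself delimits the payload.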

This also adds a stdin pipe for the container, similar to the ones we
use for stdout/err, because we need a way for an attached client
to write to stdin, even if not using a tty.

This fixes https://github.com/kubernetes-incubator/cri-o/issues/569

Signed-off-by: Alexander Larsson <alexl@redhat.com>
2017-06-14 22:59:50 +02:00
Mrunal Patel
62c9caeb83 oci: Add debugs for container create failures
This makes it easier to debug container creation failures
by looking at cri-o logs.

Signed-off-by: Mrunal Patel <mpatel@redhat.com>
2017-06-14 07:33:07 -07:00
Antonio Murdaca
0b2f6b5354
adjust status on container start failure
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-06-12 12:48:50 +02:00
Mrunal Patel
065f12490c conmon: Add unix domain socket for attach
Signed-off-by: Mrunal Patel <mpatel@redhat.com>
2017-06-06 07:36:52 -07:00
Antonio Murdaca
88fb9094d0
oci: do not error out on runtime state failure
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-06-01 17:37:17 +02:00
Antonio Murdaca
b4f1cee2a2
server: store and use image's stop signal to stop containers
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-05-27 10:21:04 +02:00
Antonio Murdaca
b4251aebd8
execsync: rewrite to fix a bug in conmon
conmon has many flags that are parsed when it's executed; one of them
is "-c". In PR #510, where we vendored the latest kube master code,
upstream changed a test to call "ctr execsync" with a command of
"sh -c command ...".
Turns out:

a) conmon has a "-c" flag which refers to the container name/id
b) the exec command has a "-c" flag, but it's for "sh"

That leads to conmon parsing the second "-c" flag from the exec
command, causing an error. The executed command looks like:

conmon -c [..other flags..] CONTAINERID -e sh -c echo hello world

This patch rewrites the exec sync code so that the exec command is no
longer passed down to conmon via the command line. Instead, we now create
an OCI runtime process spec in a temp file, pass _the path_ down to conmon,
and have runc exec the command using "runc exec --process
/path/to/process-spec.json CONTAINERID". This is far better because we
no longer need to worry about conflicts with conmon's flags.

Added and fixed some tests also.

Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-05-25 22:36:33 +02:00
Mrunal Patel
ea9a90abce Set Container Status Reason when OOM Killed
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2017-05-25 11:30:58 -07:00
Antonio Murdaca
4a8debe6c5
oci: do not serialize empty fields on disk
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-05-18 21:19:51 +02:00
Antonio Murdaca
6622feb480
server: still update status on container not found in runc
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-05-18 21:19:51 +02:00
Antonio Murdaca
358dac96d4
server: ignore runc not exist errors
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-05-18 21:19:50 +02:00
Antonio Murdaca
a41ca975c1
server: restore containers state from disk on startup
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-05-18 21:19:50 +02:00
Antonio Murdaca
da0b8a6157
server: store containers state on disk
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-05-18 21:19:50 +02:00
Antonio Murdaca
2ddc062bbe
oci: ignore non existing containers on delete
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-05-18 21:19:45 +02:00
Antonio Murdaca
fbc5e49a60
oci: save container's finished time
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-05-18 18:49:55 +02:00
Antonio Murdaca
790c6d891a
server: store creation in containers
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-05-18 18:49:54 +02:00
Antonio Murdaca
1f4a4742cb
oci: add container directory to Container struct
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-05-18 18:49:54 +02:00
Antonio Murdaca
1ca660e3b7 Merge pull request #512 from runcom/stop-timeout
server: honor container stop timeout from CRI
2017-05-16 10:06:47 +02:00
Antonio Murdaca
b3683ab184
server: honor container stop timeout from CRI
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-05-15 22:56:31 +02:00
Mrunal Patel
0a0533cdfc Capture errors from runtime create failures
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2017-05-15 13:35:18 -07:00
Dan Walsh
4493b6f176 Rename ocid to crio.
The ocid project was renamed to CRI-O months ago; it is time that we moved
all of the code to the new name.  We want to eliminate the name ocid from use.
Move fully to crio.

Also cric is being renamed to crioctl for the time being.

Signed-off-by: Dan Walsh <dwalsh@redhat.com>
2017-05-12 09:56:06 -04:00
f1fd06bfc1
oci: more grep'able interface name
`git grep -wi store` is not nearly useful enough. Taking steps for
readability.

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2017-04-19 16:12:59 -04:00
Aleksa Sarai
87faf98447
oci: make ExecSync handle split std{out,err}
Now that conmon splits std{out,err} for !terminal containers, ExecSync
can parse that output to return the correct std{out,err} split to the
kubelet. Invalid log lines are ignored but complained about.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-04-12 21:59:25 +10:00
Aleksa Sarai
8a928d06e7
oci: make ExecSync with ExitCode != 0 act properly
Previously we returned an internal error result when a program had a
non-zero exit code, which was incorrect. Fix this as well as change the
tests to actually check the "ExitCode" response from ExecSync (rather
than expecting ocic-ctr to return an internal error).

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-04-11 20:32:18 +10:00
Suraj Deshmukh
da89d28473 Print received container pid as int
Previously the received container pid was printed as a unicode
character; it is now printed as an integer.
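
The commit message does not say where the conversion went wrong. One common
way to hit this symptom in Go, shown here purely as an assumed illustration,
is converting an integer to a string by type conversion instead of by
formatting. The function names are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
)

// pidAsChar reproduces the symptom: the integer is interpreted as a
// Unicode code point, so 65 becomes "A".
func pidAsChar(pid int) string {
	return string(rune(pid))
}

// pidAsInt formats the pid as a decimal number, so 65 becomes "65".
func pidAsInt(pid int) string {
	return strconv.Itoa(pid)
}

func main() {
	pid := 65
	fmt.Printf("as character: %q\n", pidAsChar(pid))
	fmt.Printf("as integer:   %q\n", pidAsInt(pid))
}
```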

Fixes #431
2017-04-06 22:14:29 +05:30
Aleksa Sarai
c290c0d9c3
conmon: implement logging to logPath
This adds a very simple implementation of logging within conmon, where
every buffer read from the masterfd of the container is also written to
the log file (with errors during writing to the log file ignored).

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-04-05 02:45:57 +10:00
Mrunal Patel
d69ad9b5a3 Fix lint issues
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2017-03-27 10:21:30 -07:00
Mrunal Patel
be47583041 Increase the timeout value for create container
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2017-03-23 10:06:52 -07:00
Daniel J Walsh
d679da0645 If the container exit file is missing default exit code to -1
If I create a sandbox pod and then restart the ocid service, the
pod ends up in a stopped state without an exit file.  Whether this is
a bug in ocid or not we should handle this case where a container exits
so that we can clean up the container.

This change just defaults the exit code to -1 if the container is not
running and does not have an exit file.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2017-03-21 08:00:04 -04:00
YaoZengzeng
3b7d815af1 add timeout when wait to get container pid from conmon
Signed-off-by: Yao Zengzeng <yaozengzeng@zju.edu.cn>
2017-03-20 21:49:30 +08:00
Mrunal Patel
8c0ff7d904 Run conmon under cgroups (systemd)
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2017-03-06 15:08:46 -08:00
Samuel Ortiz
2ec696be41 server: Set sandbox and container privileged flags
The sandbox privileged flag is set to true only if either the
pod configuration privileged flag is set to true or when any
of the pod namespaces are the host ones.

A container inherits its privileged flag from its sandbox, and
will be run by the privileged runtime only if it's set to true.
In other words, the privileged runtime (when defined) will be
used when one of the below conditions is true:

- The sandbox will be asked to run at least one privileged container.
- The sandbox requires access to either the host IPC or networking
  namespaces.
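
The rule can be written down as a small predicate. This Go sketch uses
hypothetical type and field names; it treats the network, PID and IPC
namespaces as "the pod namespaces", which is an assumption based on the
first paragraph above (the list itself only names IPC and networking).

```go
package main

import "fmt"

// namespaceOptions records which host namespaces the pod asks for.
// The field set is an assumption made for this example.
type namespaceOptions struct {
	HostNetwork bool
	HostPID     bool
	HostIPC     bool
}

// sandboxPrivileged reports whether the sandbox should be flagged as
// privileged: either the pod configuration asks for it, or the pod
// shares at least one namespace with the host.
func sandboxPrivileged(podPrivileged bool, ns namespaceOptions) bool {
	return podPrivileged || ns.HostNetwork || ns.HostPID || ns.HostIPC
}

func main() {
	fmt.Println(sandboxPrivileged(false, namespaceOptions{}))
	fmt.Println(sandboxPrivileged(false, namespaceOptions{HostNetwork: true}))
	fmt.Println(sandboxPrivileged(true, namespaceOptions{}))
}
```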

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2017-03-03 19:06:04 +01:00
Samuel Ortiz
eab6b00ea6 oci: Support for the host privileged runtime path
We add a privileged flag to the container and sandbox structures
and can now select the appropriate runtime path for any container
operations depending on that flag.

Here again, the default runtime will be used for non-privileged
containers, and for privileged ones when no privileged
runtime is defined.
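
The selection logic amounts to a fallback, sketched here in Go. The function
name, parameter names and example paths are hypothetical; an empty
privileged path is assumed to mean "no privileged runtime defined".

```go
package main

import "fmt"

// runtimePath picks the runtime binary for a container operation.
// The privileged runtime is used only for privileged containers and
// only when one is configured; otherwise the default runtime is used.
func runtimePath(defaultPath, privilegedPath string, privileged bool) string {
	if privileged && privilegedPath != "" {
		return privilegedPath
	}
	return defaultPath
}

func main() {
	// Example paths are placeholders, not real configuration values.
	fmt.Println(runtimePath("/usr/bin/default-runtime", "/usr/bin/priv-runtime", true))
	fmt.Println(runtimePath("/usr/bin/default-runtime", "", true))
	fmt.Println(runtimePath("/usr/bin/default-runtime", "/usr/bin/priv-runtime", false))
}
```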

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2017-03-03 17:22:09 +01:00
Mrunal Patel
8e5b17cf13 Switch to github.com/golang/dep for vendoring
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
2017-01-31 16:45:59 -08:00
Nalin Dahyabhai
c0333b102b Integrate containers/storage
Use containers/storage to store images, pod sandboxes, and containers.
A pod sandbox's infrastructure container has the same ID as the pod to
which it belongs, and all containers also keep track of their pod's ID.

The container configuration that we build using the data in a
CreateContainerRequest is stored in the container's ContainerDirectory
and ContainerRunDirectory.

We catch SIGTERM and SIGINT, and when we receive either, we gracefully
exit the grpc loop.  If we also think that there aren't any container
filesystems in use, we attempt to do a clean shutdown of the storage
driver.

The test harness now waits for ocid to exit before attempting to delete
the storage root directory.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2017-01-18 10:23:30 -05:00
Samuel Ortiz
4c7583b467
oci: Do not call the container runtime from ExecSync
Some OCI container runtimes (in particular the hypervisor
based ones) will typically create a shim process between
the hypervisor and the runtime caller, in order to not
rely on the hypervisor process for e.g. forwarding the
output streams or getting a command exit code.

When executing a command inside a running container those
runtimes will create that shim process and terminate.
Therefore calling and monitoring them directly from
ExecSync() will fail. Instead we need to have a subreaper
calling the runtime and monitoring the shim process.
This change uses conmon as the subreaper from ExecSync(),
monitors the shim process and reads the exec'ed command
exit code from the synchronization pipe.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2017-01-14 02:02:43 +01:00
Samuel Ortiz
9a4a1092fe
conmon: Return the exit status code
waitpid fills its second argument with a status value that
encodes the process exit code (on Linux, in bits 8 to 15).
Instead of returning the complete value and then
converting it in ocid, return the exit status directly
by using WEXITSTATUS from conmon.
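
For reference, the extraction that WEXITSTATUS performs can be spelled out
in a few lines. This Go sketch assumes the Linux encoding of the wait
status, where a normally exited process has its exit code stored in bits 8
to 15; the function name is made up for the example.

```go
package main

import "fmt"

// exitStatusFromRaw mirrors what WEXITSTATUS does on Linux: shift the
// raw wait status right by 8 bits and keep the low byte.
func exitStatusFromRaw(raw int) int {
	return (raw >> 8) & 0xff
}

func main() {
	// A process that called exit(3) yields a raw status of 3<<8 on Linux.
	raw := 3 << 8
	fmt.Printf("raw=%d exit=%d\n", raw, exitStatusFromRaw(raw))
}
```

Returning the already decoded value means the receiving side does not need
to know anything about the platform's status encoding.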

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2017-01-14 02:00:45 +01:00
Xianglin Gao
ab4a408b66 fix typo to make go report more happy
Signed-off-by: Xianglin Gao <xlgao@zju.edu.cn>
2017-01-04 14:24:11 +08:00
Aleksa Sarai
da975261e7
oci: fix runc kill usage
In later versions of runC, `runc kill` *requires* the signal parameter
to know what signal needs to be sent.
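
A small sketch of building the argument list so that the signal is always
explicit. It is written in Go with a hypothetical helper name, and it
assumes the signal is passed by name after the container ID.

```go
package main

import (
	"errors"
	"fmt"
)

// runcKillArgs builds the arguments for "runc kill" and refuses to
// produce a command line without an explicit signal.
func runcKillArgs(containerID, signal string) ([]string, error) {
	if containerID == "" {
		return nil, errors.New("container ID must not be empty")
	}
	if signal == "" {
		return nil, errors.New("signal must be given explicitly")
	}
	return []string{"kill", containerID, signal}, nil
}

func main() {
	args, err := runcKillArgs("CONTAINERID", "TERM")
	fmt.Println(args, err)
}
```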

Signed-off-by: Aleksa Sarai <asarai@suse.com>
2016-12-31 17:01:19 +11:00
Mrunal Patel
6df58df215 Add support for systemd cgroups
Signed-off-by: Mrunal Patel <mpatel@redhat.com>
2016-12-19 16:31:29 -08:00
Mrunal Patel
5eab56e002 Pass cgroup manager to oci runtime manager
Signed-off-by: Mrunal Patel <mpatel@redhat.com>
2016-12-19 15:05:32 -08:00
Mrunal Patel
2800b83f2f Add operation lock for containers
Signed-off-by: Mrunal Patel <mpatel@redhat.com>
2016-12-14 10:54:42 -08:00
Antonio Murdaca
d2f6a4c0e2
server: remove reaper, let runc take care of reaping
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2016-12-14 12:15:20 +01:00
Samuel Ortiz
4cab8ed06a
sandbox: Use persistent networking namespace
Because they need to prepare the hypervisor networking interfaces
and have them match the ones created in the pod networking
namespace (typically to bridge TAP and veth interfaces), hypervisor
based container runtimes need the sandbox pod networking namespace
to be set up before it's created. They can then prepare and start
the hypervisor interfaces when creating the pod virtual machine.

In order to do so, we need to create per-pod persistent networking
namespaces that we pass to the CNI plugin. This patch leverages
the CNI ns package to create such namespaces under /var/run/netns,
and assigns them to all pod containers.
The persistent namespace is removed when the pod is either stopped
or removed.

Since the StopPodSandbox() API can be called multiple times from
kubelet, we track the pod networking namespace state (closed or
not) so that we don't get a containernetworking/ns package error
when calling its Close() routine multiple times as well.
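
The state tracking can be sketched as a thin wrapper that makes Close()
safe to call repeatedly. This Go example does not use the real CNI ns
package; closeFn is a hypothetical stand-in for the underlying namespace's
Close routine, and podNetNS is an invented name.

```go
package main

import (
	"fmt"
	"sync"
)

// podNetNS wraps a namespace handle and remembers whether it has
// already been closed.
type podNetNS struct {
	mu      sync.Mutex
	closed  bool
	closeFn func() error // stand-in for the underlying Close routine
}

// Close closes the namespace once. Later calls are no-ops, so calling
// StopPodSandbox several times does not surface an error.
func (n *podNetNS) Close() error {
	n.mu.Lock()
	defer n.mu.Unlock()

	if n.closed {
		return nil
	}
	if err := n.closeFn(); err != nil {
		return err
	}
	n.closed = true
	return nil
}

func main() {
	calls := 0
	ns := &podNetNS{closeFn: func() error { calls++; return nil }}

	fmt.Println(ns.Close(), ns.Close())
	fmt.Println("underlying close calls:", calls)
}
```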

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-12-12 19:48:23 +01:00
Antonio Murdaca
430297dd81
store annotations and image for a container
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2016-12-12 11:12:03 +01:00
9ce0a55c35
oci: pass through error output from runc
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
2016-12-09 15:53:56 -05:00
Antonio Murdaca
cbe2a68ce5
execsync: return proper error description
The gRPC execsync client call doesn't populate `ExecSyncResponse` on
error at all. You just get an error.
This patch modifies the code to include the command's streams, exit code
and error directly in the error, so that `ocic` can print useful
information in the CLI.

Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2016-11-24 12:11:04 +01:00
Antonio Murdaca
5c94544fb8 Merge pull request #203 from mrunalp/exec_sync
Exec sync
2016-11-21 23:22:20 +01:00