cri-o

Author	SHA1	Message	Date
Alexander Larsson	81cb788004	conmon: Clean up execsync This moves the timeout handling from the Go code to conmon, which removes some of the complexity from criod, and additionally it will make it possible to do the double-fork in the exec case too. Signed-off-by: Alexander Larsson <alexl@redhat.com>	2017-06-21 21:03:17 +02:00
Mrunal Patel	88037b143b	Merge pull request #583 from alexlarsson/conmon-reap-zombies conmon: Don't leave zombies and fix cgroup race	2017-06-20 07:53:52 -07:00
Alexander Larsson	72686c78b4	fixup! conmon: Don't leave zombies and fix cgroup race Signed-off-by: Alexander Larsson <alexl@redhat.com>	2017-06-20 12:18:07 +02:00
Mrunal Patel	2b8e3a0d0f	Merge pull request #602 from runcom/busy-loop oci: remove busy loop	2017-06-15 10:29:33 -07:00
Alexander Larsson	af4fbcd942	conmon: Don't leave zombies and fix cgroup race Currently, when creating containers we never call Wait on the conmon exec.Command, which means that the child hangs around forever as a zombie after it dies. However, instead of doing this waitpid() in the parent we instead do a double-fork in conmon, to daemonize it. That makes a lot of sense, as conmon really is not tied to the launcher, but needs to outlive it if e.g. the cri-o daemon restarts. However, this makes even more obvious a race condition which we already have. When crio-d puts the conmon pid in a cgroup there is a race where conmon could already have spawned a child, and it would then not be part of the cgroup. In order to fix this we add another synchronization pipe to conmon, which we block on before we create any children. The parent then makes sure the pid is in the cgroup before letting it continue. Signed-off-by: Alexander Larsson <alexl@redhat.com>	2017-06-15 14:20:40 +02:00
Antonio Murdaca	9e6359b6f7	oci: remove busy loop Signed-off-by: Antonio Murdaca <runcom@redhat.com>	2017-06-15 12:22:32 +02:00
Samuel Ortiz	0e51bbb778	oci: Support mixing trusted and untrusted workloads Container runtimes provide different levels of isolation, from kernel namespaces to hardware virtualization. When starting a specific container, one may want to decide which level of isolation to use depending on how much we trust the container workload. Fully verified and signed containers may not need the hardware isolation layer but e.g. CI jobs pulling packages from many untrusted sources should probably not run only on a kernel namespace isolation layer. Here we allow CRI-O users to define a container runtime for trusted containers and another one for untrusted containers, and also to define a general, default trust level. This anticipates future kubelet implementations that would be able to tag containers as trusted or untrusted. When missing a kubelet hint, containers are trusted by default. A container becomes untrusted if we get a hint in that direction from kubelet or if the default trust level is set to "untrusted" and the container is not privileged. In both cases CRI-O will try to use the untrusted container runtime. For any other cases, it will switch to the trusted one. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2017-06-15 10:04:36 +02:00
Mrunal Patel	7b9032bac7	Merge pull request #579 from alexlarsson/non-terminal-attach Implement non-terminal attach	2017-06-14 21:45:44 -07:00
Alexander Larsson	7bb957bf75	Implement non-terminal attach We use a SOCK_SEQPACKET socket for the attach unix domain socket, which means the kernel will ensure that the reading side only ever gets the data from one write operation. We use this for framing, where the first byte is the pipe that the next bytes are for. We have to make sure that all reads from the socket are using at least the same size of buffer as the write side, because otherwise the extra data in the message will be dropped. This also adds a stdin pipe for the container, similar to the ones we use for stdout/err, because we need a way for an attached client to write to stdin, even if not using a tty. This fixes https://github.com/kubernetes-incubator/cri-o/issues/569 Signed-off-by: Alexander Larsson <alexl@redhat.com>	2017-06-14 22:59:50 +02:00
Mrunal Patel	62c9caeb83	oci: Add debugs for container create failures This makes it easier to debug container creation failures by looking at cri-o logs. Signed-off-by: Mrunal Patel <mpatel@redhat.com>	2017-06-14 07:33:07 -07:00
Antonio Murdaca	0b2f6b5354	adjust status on container start failure Signed-off-by: Antonio Murdaca <runcom@redhat.com>	2017-06-12 12:48:50 +02:00
Mrunal Patel	065f12490c	conmon: Add unix domain socket for attach Signed-off-by: Mrunal Patel <mpatel@redhat.com>	2017-06-06 07:36:52 -07:00
Antonio Murdaca	88fb9094d0	oci: do not error out on runtime state failure Signed-off-by: Antonio Murdaca <runcom@redhat.com>	2017-06-01 17:37:17 +02:00
Antonio Murdaca	b4f1cee2a2	server: store and use image's stop signal to stop containers Signed-off-by: Antonio Murdaca <runcom@redhat.com>	2017-05-27 10:21:04 +02:00
Antonio Murdaca	b4251aebd8	execsync: rewrite to fix a bug in conmon conmon has many flags that are parsed when it's executed, one of them is "-c". During PR #510 where we vendor latest kube master code, upstream has changed a test to call a "ctr execsync" with a command of "sh -c command ...". Turns out: a) conmon has a "-c" flag which refers to the container name/id b) the exec command has a "-c" flag but it's for "sh" That leads to conmon parsing the second "-c" flag from the exec command, causing an error. The executed command looks like: conmon -c [..other flags..] CONTAINERID -e sh -c echo hello world This patch rewrites the exec sync code to not pass the exec command down to conmon via the command line. Rather, we're now creating an OCI runtime process spec in a temp file, pass _the path_ down to conmon, and have runc exec the command using "runc exec --process /path/to/process-spec.json CONTAINERID". This is far better because we no longer need to worry about conflicts with flags in conmon. Added and fixed some tests also. Signed-off-by: Antonio Murdaca <runcom@redhat.com>	2017-05-25 22:36:33 +02:00
Mrunal Patel	ea9a90abce	Set Container Status Reason when OOM Killed Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2017-05-25 11:30:58 -07:00
Antonio Murdaca	6622feb480	server: still update status on container not found in runc Signed-off-by: Antonio Murdaca <runcom@redhat.com>	2017-05-18 21:19:51 +02:00
Antonio Murdaca	358dac96d4	server: ignore runc not exist errors Signed-off-by: Antonio Murdaca <runcom@redhat.com>	2017-05-18 21:19:50 +02:00
Antonio Murdaca	2ddc062bbe	oci: ignore non existing containers on delete Signed-off-by: Antonio Murdaca <runcom@redhat.com>	2017-05-18 21:19:45 +02:00
Antonio Murdaca	fbc5e49a60	oci: save container's finished time Signed-off-by: Antonio Murdaca <runcom@redhat.com>	2017-05-18 18:49:55 +02:00
Antonio Murdaca	1ca660e3b7	Merge pull request #512 from runcom/stop-timeout server: honor container stop timeout from CRI	2017-05-16 10:06:47 +02:00
Antonio Murdaca	b3683ab184	server: honor container stop timeout from CRI Signed-off-by: Antonio Murdaca <runcom@redhat.com>	2017-05-15 22:56:31 +02:00
Mrunal Patel	0a0533cdfc	Capture errors from runtime create failures Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2017-05-15 13:35:18 -07:00
Dan Walsh	4493b6f176	Rename ocid to crio. The ocid project was renamed to CRI-O months ago; it is time that we moved all of the code to the new name. We want to eliminate the name ocid from use. Move fully to crio. Also cric is being renamed to crioctl for the time being. Signed-off-by: Dan Walsh <dwalsh@redhat.com>	2017-05-12 09:56:06 -04:00
Vincent Batts	f1fd06bfc1	oci: more grep'able interface name `git grep -wi store` is not nearly useful enough. Taking steps for readability. Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2017-04-19 16:12:59 -04:00
Aleksa Sarai	87faf98447	oci: make ExecSync handle split std{out,err} Now that conmon splits std{out,err} for !terminal containers, ExecSync can parse that output to return the correct std{out,err} split to the kubelet. Invalid log lines are ignored but complained about. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2017-04-12 21:59:25 +10:00
Aleksa Sarai	8a928d06e7	oci: make ExecSync with ExitCode != 0 act properly Previously we returned an internal error result when a program had a non-zero exit code, which was incorrect. Fix this as well as change the tests to actually check the "ExitCode" response from ExecSync (rather than expecting ocic-ctr to return an internal error). Signed-off-by: Aleksa Sarai <asarai@suse.de>	2017-04-11 20:32:18 +10:00
Suraj Deshmukh	da89d28473	Print received container pid as int Earlier the received container pid was printed as unicode character, this is fixed to print integer. Fixes #431	2017-04-06 22:14:29 +05:30
Aleksa Sarai	c290c0d9c3	conmon: implement logging to logPath This adds a very simple implementation of logging within conmon, where every buffer read from the masterfd of the container is also written to the log file (with errors during writing to the log file ignored). Signed-off-by: Aleksa Sarai <asarai@suse.de>	2017-04-05 02:45:57 +10:00
Mrunal Patel	d69ad9b5a3	Fix lint issues Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2017-03-27 10:21:30 -07:00
Mrunal Patel	be47583041	Increase the timeout value for create container Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2017-03-23 10:06:52 -07:00
Daniel J Walsh	d679da0645	If the container exit file is missing default exit code to -1 If I create a sandbox pod and then restart the ocid service, the pod ends up in a stopped state without an exit file. Whether this is a bug in ocid or not, we should handle this case where a container exits so that we can clean up the container. This change just defaults the exit code to -1 if the container is not running and does not have an exit file. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>	2017-03-21 08:00:04 -04:00
YaoZengzeng	3b7d815af1	add timeout when wait to get container pid from conmon Signed-off-by: Yao Zengzeng <yaozengzeng@zju.edu.cn>	2017-03-20 21:49:30 +08:00
Mrunal Patel	8c0ff7d904	Run conmon under cgroups (systemd) Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2017-03-06 15:08:46 -08:00
Samuel Ortiz	2ec696be41	server: Set sandbox and container privileged flags The sandbox privileged flag is set to true only if either the pod configuration privileged flag is set to true or when any of the pod namespaces are the host ones. A container inherits its privileged flag from its sandbox, and will be run by the privileged runtime only if it's set to true. In other words, the privileged runtime (when defined) will be used when one of the below conditions is true: - The sandbox will be asked to run at least one privileged container. - The sandbox requires access to either the host IPC or networking namespaces. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2017-03-03 19:06:04 +01:00
Samuel Ortiz	eab6b00ea6	oci: Support for the host privileged runtime path We add a privileged flag to the container and sandbox structures and can now select the appropriate runtime path for any container operations depending on that flag. Here again, the default runtime will be used for non privileged containers and for privileged ones in case there is no privileged runtime defined. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2017-03-03 17:22:09 +01:00
Mrunal Patel	8e5b17cf13	Switch to github.com/golang/dep for vendoring Signed-off-by: Mrunal Patel <mrunalp@gmail.com>	2017-01-31 16:45:59 -08:00
Nalin Dahyabhai	c0333b102b	Integrate containers/storage Use containers/storage to store images, pod sandboxes, and containers. A pod sandbox's infrastructure container has the same ID as the pod to which it belongs, and all containers also keep track of their pod's ID. The container configuration that we build using the data in a CreateContainerRequest is stored in the container's ContainerDirectory and ContainerRunDirectory. We catch SIGTERM and SIGINT, and when we receive either, we gracefully exit the grpc loop. If we also think that there aren't any container filesystems in use, we attempt to do a clean shutdown of the storage driver. The test harness now waits for ocid to exit before attempting to delete the storage root directory. Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>	2017-01-18 10:23:30 -05:00
Samuel Ortiz	4c7583b467	oci: Do not call the container runtime from ExecSync Some OCI container runtimes (in particular the hypervisor based ones) will typically create a shim process between the hypervisor and the runtime caller, in order to not rely on the hypervisor process for e.g. forwarding the output streams or getting a command exit code. When executing a command inside a running container those runtimes will create that shim process and terminate. Therefore calling and monitoring them directly from ExecSync() will fail. Instead we need to have a subreaper calling the runtime and monitoring the shim process. This change uses conmon as the subreaper from ExecSync(), monitors the shim process and reads the exec'ed command exit code from the synchronization pipe. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2017-01-14 02:02:43 +01:00
Samuel Ortiz	9a4a1092fe	conmon: Return the exit status code waitpid fills its second argument with a value that contains the process exit code in the 8 least significant bits. Instead of returning the complete value and then convert it from ocid, return the exit status directly by using WEXITSTATUS from conmon. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2017-01-14 02:00:45 +01:00
Aleksa Sarai	da975261e7	oci: fix runc kill usage In later versions of runC, `runc kill` requires the signal parameter to know what signal needs to be sent. Signed-off-by: Aleksa Sarai <asarai@suse.com>	2016-12-31 17:01:19 +11:00
Mrunal Patel	6df58df215	Add support for systemd cgroups Signed-off-by: Mrunal Patel <mpatel@redhat.com>	2016-12-19 16:31:29 -08:00
Mrunal Patel	5eab56e002	Pass cgroup manager to oci runtime manager Signed-off-by: Mrunal Patel <mpatel@redhat.com>	2016-12-19 15:05:32 -08:00
Mrunal Patel	2800b83f2f	Add operation lock for containers Signed-off-by: Mrunal Patel <mpatel@redhat.com>	2016-12-14 10:54:42 -08:00
Antonio Murdaca	d2f6a4c0e2	server: remove reaper, let runc take care of reaping Signed-off-by: Antonio Murdaca <runcom@redhat.com>	2016-12-14 12:15:20 +01:00
Samuel Ortiz	4cab8ed06a	sandbox: Use persistent networking namespace Because they need to prepare the hypervisor networking interfaces and have them match the ones created in the pod networking namespace (typically to bridge TAP and veth interfaces), hypervisor based container runtimes need the sandbox pod networking namespace to be set up before it's created. They can then prepare and start the hypervisor interfaces when creating the pod virtual machine. In order to do so, we need to create per pod persistent networking namespaces that we pass to the CNI plugin. This patch leverages the CNI ns package to create such namespaces under /var/run/netns, and assign them to all pod containers. The persistent namespace is removed when either the pod is stopped or removed. Since the StopPodSandbox() API can be called multiple times from kubelet, we track the pod networking namespace state (closed or not) so that we don't get a containernetworking/ns package error when calling its Close() routine multiple times as well. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2016-12-12 19:48:23 +01:00
Antonio Murdaca	430297dd81	store annotations and image for a container Signed-off-by: Antonio Murdaca <runcom@redhat.com>	2016-12-12 11:12:03 +01:00
Vincent Batts	9ce0a55c35	oci: pass through error output from runc Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2016-12-09 15:53:56 -05:00
Antonio Murdaca	cbe2a68ce5	execsync: return proper error description The gRPC execsync client call doesn't populate `ExecSyncResponse` on error at all. You just get an error. This patch modifies the code to include the command's streams, exit code and error directly in the error. `ocic` will then print useful information in the cli, otherwise it won't. Signed-off-by: Antonio Murdaca <runcom@redhat.com>	2016-11-24 12:11:04 +01:00
Antonio Murdaca	5c94544fb8	Merge pull request #203 from mrunalp/exec_sync Exec sync	2016-11-21 23:22:20 +01:00
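
The framing described in commit 7bb957bf75 (one SOCK_SEQPACKET message per write, first byte naming the pipe, remaining bytes carrying the data) can be sketched roughly as below. This is only an illustration in Go, not the conmon implementation, which is written in C. The stream numbers, the `maxPayload` limit, and the `encodeFrame`/`decodeFrame` helpers are all hypothetical names chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical pipe identifiers. The commit message only says that the
// first byte of each message names the pipe; these values are assumptions.
const (
	pipeStdin  byte = 1
	pipeStdout byte = 2
	pipeStderr byte = 3
)

// maxPayload is an assumed upper bound on the data carried by one message.
// A reader must use a buffer of at least 1+maxPayload bytes, because with
// SOCK_SEQPACKET any part of a message that does not fit is dropped.
const maxPayload = 8192

// encodeFrame builds one message: the pipe byte followed by the payload.
func encodeFrame(pipe byte, payload []byte) ([]byte, error) {
	if len(payload) > maxPayload {
		return nil, errors.New("payload exceeds maxPayload")
	}
	frame := make([]byte, 0, 1+len(payload))
	frame = append(frame, pipe)
	return append(frame, payload...), nil
}

// decodeFrame splits one received message into its pipe byte and payload.
func decodeFrame(msg []byte) (byte, []byte, error) {
	if len(msg) == 0 {
		return 0, nil, errors.New("empty message")
	}
	return msg[0], msg[1:], nil
}

func main() {
	frame, err := encodeFrame(pipeStdout, []byte("hello\n"))
	if err != nil {
		panic(err)
	}
	pipe, data, err := decodeFrame(frame)
	if err != nil {
		panic(err)
	}
	fmt.Printf("pipe=%d data=%q\n", pipe, data)
}
```

Because the kernel preserves message boundaries on this socket type, no length prefix is needed: each read returns exactly one frame, provided the read buffer is large enough.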


85 commits