Revert "fs/exec: allow to unshare a time namespace on vfork+exec"

This reverts commit 133e2d3e81.

Alexey pointed out a few undesirable side effects of the reverted change.
First, it doesn't take into account that CLONE_VFORK can be used with
CLONE_THREAD. Second, a child process doesn't enter a target time name-space,
if its parent dies before the child calls exec. It happens because the parent
clears vfork_done.

Eric W. Biederman suggests installing a time namespace as a task gets a new mm.
It includes all new processes cloned without CLONE_VM and all tasks that call
exec(). This is an user API change, but we think there aren't users that depend
on the old behavior.

It is too late to make such changes in this release, so let's roll back
this patch and introduce the right one in the next release.

Cc: Alexey Izbyshev <izbyshev@ispras.ru>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220913102551.1121611-3-avagin@google.com
This commit is contained in:
Andrei Vagin 2022-09-13 03:25:51 -07:00 committed by Kees Cook
parent 2b1e8921fc
commit 33a2d6bc34
3 changed files with 2 additions and 13 deletions

View File

@ -65,7 +65,6 @@
#include <linux/io_uring.h>
#include <linux/syscall_user_dispatch.h>
#include <linux/coredump.h>
#include <linux/time_namespace.h>
#include <linux/uaccess.h>
#include <asm/mmu_context.h>
@ -979,12 +978,10 @@ static int exec_mmap(struct mm_struct *mm)
{
struct task_struct *tsk;
struct mm_struct *old_mm, *active_mm;
bool vfork;
int ret;
/* Notify parent that we're no longer interested in the old VM */
tsk = current;
vfork = !!tsk->vfork_done;
old_mm = current->mm;
exec_mm_release(tsk, old_mm);
if (old_mm)
@ -1029,10 +1026,6 @@ static int exec_mmap(struct mm_struct *mm)
tsk->mm->vmacache_seqnum = 0;
vmacache_flush(tsk);
task_unlock(tsk);
if (vfork)
timens_on_fork(tsk->nsproxy, tsk);
if (old_mm) {
mmap_read_unlock(old_mm);
BUG_ON(active_mm != old_mm);

View File

@ -2046,11 +2046,8 @@ static __latent_entropy struct task_struct *copy_process(
/*
* If the new process will be in a different time namespace
* do not allow it to share VM or a thread group with the forking task.
*
* On vfork, the child process enters the target time namespace only
* after exec.
*/
if ((clone_flags & (CLONE_VM | CLONE_VFORK)) == CLONE_VM) {
if (clone_flags & (CLONE_THREAD | CLONE_VM)) {
if (nsp->time_ns != nsp->time_ns_for_children)
return ERR_PTR(-EINVAL);
}

View File

@ -179,8 +179,7 @@ int copy_namespaces(unsigned long flags, struct task_struct *tsk)
if (IS_ERR(new_ns))
return PTR_ERR(new_ns);
if ((flags & CLONE_VM) == 0)
timens_on_fork(new_ns, tsk);
timens_on_fork(new_ns, tsk);
tsk->nsproxy = new_ns;
return 0;