Date: Thu, 10 May 2018 14:15:32 +0200
From: Oleg Nesterov
To: "Eric W. Biederman"
Cc: Johannes Weiner, Michal Hocko, Kirill Tkhai,
    akpm@linux-foundation.org, peterz@infradead.org,
    viro@zeniv.linux.org.uk, mingo@kernel.org,
    paulmck@linux.vnet.ibm.com, keescook@chromium.org, riel@redhat.com,
    tglx@linutronix.de, kirill.shutemov@linux.intel.com,
    marcos.souza.org@gmail.com, hoeun.ryu@gmail.com,
    pasha.tatashin@oracle.com, gs051095@gmail.com, dhowells@redhat.com,
    rppt@linux.vnet.ibm.com, linux-kernel@vger.kernel.org,
    Balbir Singh, Tejun Heo
Subject: Re: [RFC][PATCH] cgroup: Don't mess with tasks in exec
Message-ID: <20180510121531.GA29222@redhat.com>
In-Reply-To: <871sekp378.fsf_-_@xmission.com>
References: <87r2mrh4is.fsf@xmission.com> <20180504145435.GA26573@redhat.com>
    <87y3gzfmjt.fsf@xmission.com> <20180504162209.GB26573@redhat.com>
    <871serfk77.fsf@xmission.com> <20180507143358.GA3071@redhat.com>
    <87vabyvnw0.fsf@xmission.com> <20180509144016.GA25742@redhat.com>
    <87vabwp5p6.fsf@xmission.com> <871sekp378.fsf_-_@xmission.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Mailing-List: linux-kernel@vger.kernel.org

On 05/09, Eric W. Biederman wrote:
>
> Semantically exec is supposed to be atomic with no user space visible
> intermediate points. Migrating tasks during exec may change that and
> lead to all manner of difficult to analyze and maintain corner cases.

Apart from the race with copy_strings() we are discussing in another thread?

> So avoid the problems by simply blocking cgroup migration over the
> entirety of exec.

This patch, even if it were correct, would bring many more problems.

If nothing else, exec() is very slow. If it races with a migration, which
needs this sem for writing, new readers will be blocked. This means that
clone(), exit(), or another exec() will block too.

Now, if some IO path does kthread_stop() we have a deadlock. Or
request_module() in search_binary_handler(): deadlock.

Plus this adds a nice security problem: a PTRACE_O_TRACEEXEC'ed task will
sleep in TASK_TRACED while holding cgroup_threadgroup_rwsem.

Oleg.

> Reported-by: Oleg Nesterov
> Signed-off-by: "Eric W. Biederman"
> ---
>
> Unless this leads to some kind of deadlock
>
>  fs/exec.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/fs/exec.c b/fs/exec.c
> index 32461a1543fc..54bb01cfc635 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1101,7 +1099,6 @@ static int de_thread(struct task_struct *tsk)
>  	struct task_struct *leader = tsk->group_leader;
>
>  	for (;;) {
> -		cgroup_threadgroup_change_begin(tsk);
>  		write_lock_irq(&tasklist_lock);
>  		/*
>  		 * Do this under tasklist_lock to ensure that
> @@ -1112,7 +1109,6 @@ static int de_thread(struct task_struct *tsk)
>  			break;
>  		__set_current_state(TASK_KILLABLE);
>  		write_unlock_irq(&tasklist_lock);
> -		cgroup_threadgroup_change_end(tsk);
>  		schedule();
>  		if (unlikely(__fatal_signal_pending(tsk)))
>  			goto killed;
> @@ -1750,6 +1746,7 @@ static int do_execveat_common(int fd, struct filename *filename,
>  	if (retval)
>  		goto out_free;
>
> +	cgroup_threadgroup_change_begin(current);
>  	check_unsafe_exec(bprm);
>  	current->in_execve = 1;
>
> @@ -1822,6 +1819,7 @@ static int do_execveat_common(int fd, struct filename *filename,
>  	/* execve succeeded */
>  	current->fs->in_exec = 0;
>  	current->in_execve = 0;
> +	cgroup_threadgroup_change_end(current);
>  	membarrier_execve(current);
>  	acct_update_integrals(current);
>  	task_numa_free(current);
> @@ -1841,6 +1839,7 @@ static int do_execveat_common(int fd, struct filename *filename,
>  out_unmark:
>  	current->fs->in_exec = 0;
>  	current->in_execve = 0;
> +	cgroup_threadgroup_change_end(current);
>
>  out_free:
>  	free_bprm(bprm);
> --
> 2.14.1
>
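
As a rough illustration of the blocking chain described in the reply above
(exec() holding the read side for a long time, a migration queued for
writing, and later readers such as clone() and exit() stuck behind it),
here is a toy single-threaded model. It is only a sketch: the toy_* names
and the scenario function are hypothetical, this is not the kernel's
implementation of cgroup_threadgroup_rwsem, and the rule that new readers
queue behind a waiting writer is taken from the reply rather than from the
kernel source.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of a writer-fair rw semaphore (hypothetical, userspace only).
 * "trylock" returning false stands for "the caller would go to sleep".
 */
struct toy_rwsem {
	int  readers;        /* readers currently holding the lock */
	bool writer_active;  /* a writer holds the lock */
	bool writer_waiting; /* a writer is queued behind the readers */
};

/* Read side: refused while a writer holds the lock or is waiting for it. */
static bool toy_read_trylock(struct toy_rwsem *sem)
{
	if (sem->writer_active || sem->writer_waiting)
		return false; /* new readers queue behind the pending writer */
	sem->readers++;
	return true;
}

static void toy_read_unlock(struct toy_rwsem *sem)
{
	assert(sem->readers > 0);
	sem->readers--;
}

/* Write side: needs the lock to be completely free, else records a waiter. */
static bool toy_write_trylock(struct toy_rwsem *sem)
{
	if (sem->readers > 0 || sem->writer_active) {
		sem->writer_waiting = true;
		return false;
	}
	sem->writer_waiting = false;
	sem->writer_active = true;
	return true;
}

/*
 * Replays the sequence from the reply and returns how many later callers
 * would block while exec() still holds the read side.
 */
int blocked_while_exec_holds(void)
{
	struct toy_rwsem sem = { 0, false, false };
	int blocked = 0;

	bool exec_ok = toy_read_trylock(&sem); /* exec(): long-lived reader */
	assert(exec_ok);
	(void)exec_ok;

	if (!toy_write_trylock(&sem)) /* migration: writer, waits for exec */
		blocked++;
	if (!toy_read_trylock(&sem))  /* clone(): reader, behind the writer */
		blocked++;
	if (!toy_read_trylock(&sem))  /* exit(): reader, behind the writer */
		blocked++;

	toy_read_unlock(&sem);        /* exec() finally finishes */

	bool migrated = toy_write_trylock(&sem); /* writer can now proceed */
	assert(migrated);
	(void)migrated;

	return blocked;
}
```

In this model everything after the first reader stalls until exec() drops
the lock, so anything exec() itself waits on (the kthread_stop() and
request_module() cases in the reply) can close the loop into a deadlock.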