Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp6848735imu; Mon, 3 Dec 2018 03:58:09 -0800 (PST) X-Google-Smtp-Source: AFSGD/XQF8q2qfcgCSBqvjgof6SsnxSJhwEW2JGJWil/vnSX4yrewYlJNpLteCy80KyPpvYVW8oW X-Received: by 2002:a17:902:d70b:: with SMTP id w11mr15950629ply.294.1543838289259; Mon, 03 Dec 2018 03:58:09 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1543838289; cv=none; d=google.com; s=arc-20160816; b=H2gTOXubNQ2URov+oqH2XqAQC4FH5GyHrKJbtDT73PlpKRjjcTVPkCy81CSLBei7lr Roe+EaS/iPy5mSfmq4pxfPuV+yR41cChmbDzDkpzwQkqZpF99dwbXOL81zsTRNx82nJv 56WK0cP1p8AGEU0Plaatczk9g+qMGcpS6kXAt9ZMHYSl2RD8X9wGoo6Sg3QTs5V9/yS0 qslICFLU0c8YD3O6JdvPXNNpyiiSrlZPAjqyVtpQUGFWQ97+z9AlUMru6lmTehXOzkZD lHAxqSYtmx3bd1Wxpb41iX4S3VGNCAu1WFLbdwXWoIzlruF+s6ioE/2gm4wlwRuKWtpJ Ii/Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=ZrbiIOvqdGfTSHqXFZ5idRZNBed0RtzCA3CMHw0fqvc=; b=nvFreOr/lhzprtP1gUkhiRuUjVPfMcLHLo/792ChVdINIHmYsnNUvRzio3QMXbmCH3 ms7nhTTE/rDi1nRIr6CBzCkqGbcD3ViGmcY57QTynlqMMEKAI9oypuCJadsj6PRym8S/ xFJB+jwyWFPXw6PJeGvRebsE3jgYYzPYdWASDtT3Qi/tRwi+yB/O5JGtNvSWWMvOLUiR Go0hr5Te6ESGbxpVszRrBNJ/I2t39K8qRtnJkKwdE8PI5s6SauvQL+t+ts7Etp7gJQIh R/zecdN4vN3v2fGqTtEdsbdFDiHiaaCaHzCMy0O1PZdHGpP3WZdsUBddF9bmW06NFp7a w9zQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 14si12875283pgo.511.2018.12.03.03.57.54; Mon, 03 Dec 2018 03:58:09 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726401AbeLCL4t (ORCPT + 99 others); Mon, 3 Dec 2018 06:56:49 -0500 Received: from mx1.redhat.com ([209.132.183.28]:60254 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725975AbeLCL4t (ORCPT ); Mon, 3 Dec 2018 06:56:49 -0500 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E2D6B83F4C; Mon, 3 Dec 2018 11:56:04 +0000 (UTC) Received: from dhcp-27-174.brq.redhat.com (unknown [10.43.17.12]) by smtp.corp.redhat.com (Postfix) with SMTP id 54CE2272C6; Mon, 3 Dec 2018 11:56:02 +0000 (UTC) Received: by dhcp-27-174.brq.redhat.com (nbSMTP-1.00) for uid 1000 oleg@redhat.com; Mon, 3 Dec 2018 12:56:04 +0100 (CET) Date: Mon, 3 Dec 2018 12:56:01 +0100 From: Oleg Nesterov To: Ingo Molnar Cc: Linus Torvalds , Linux List Kernel Mailing , "Rafael J. Wysocki" , Chanho Min , Thomas Gleixner , Peter Zijlstra , Pavel Machek , Michal Hocko Subject: Re: [PATCH] Revert "exec: make de_thread() freezable (was: Re: Linux 4.20-rc4) Message-ID: <20181203115601.GA31795@redhat.com> References: <20181203074700.GA21240@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20181203074700.GA21240@gmail.com> User-Agent: Mutt/1.5.24 (2015-08-30) X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Mon, 03 Dec 2018 11:56:05 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 12/03, Ingo Molnar wrote: > > So there's a new regression in v4.20-rc4, my desktop produces this > lockdep splat: > > [ 1772.588771] WARNING: pkexec/4633 still has locks held! > [ 1772.588773] 4.20.0-rc4-custom-00213-g93a49841322b #1 Not tainted > [ 1772.588775] ------------------------------------ > [ 1772.588776] 1 lock held by pkexec/4633: > [ 1772.588778] #0: 00000000ed85fbf8 (&sig->cred_guard_mutex){+.+.}, at: prepare_bprm_creds+0x2a/0x70 > [ 1772.588786] stack backtrace: > [ 1772.588789] CPU: 7 PID: 4633 Comm: pkexec Not tainted 4.20.0-rc4-custom-00213-g93a49841322b #1 > [ 1772.588792] Call Trace: > [ 1772.588800] dump_stack+0x85/0xcb > [ 1772.588803] flush_old_exec+0x116/0x890 > [ 1772.588807] ? load_elf_phdrs+0x72/0xb0 > [ 1772.588809] load_elf_binary+0x291/0x1620 > [ 1772.588815] ? sched_clock+0x5/0x10 > [ 1772.588817] ? search_binary_handler+0x6d/0x240 > [ 1772.588820] search_binary_handler+0x80/0x240 > [ 1772.588823] load_script+0x201/0x220 > [ 1772.588825] search_binary_handler+0x80/0x240 > [ 1772.588828] __do_execve_file.isra.32+0x7d2/0xa60 > [ 1772.588832] ? strncpy_from_user+0x40/0x180 > [ 1772.588835] __x64_sys_execve+0x34/0x40 > [ 1772.588838] do_syscall_64+0x60/0x1c0 My fault. Michal didn't like this patch, but I managed to confuse him ;) I even mentioned that freezable_schedule() is obviously not right with ->cred_guard_mutex held, but I somehow I thought that this patch "doesn't make things worse". > I reviewed the ->cred_guard_mutex code, and the mutex is held across all > of exec() - and we always did this. Yes, and this was always wrong. For example, this test-case hangs: #include #include #include #include void *thread(void *arg) { ptrace(PTRACE_TRACEME, 0,0,0); return NULL; } int main(void) { int pid = fork(); if (!pid) { pthread_t pt; pthread_create(&pt, NULL, thread, NULL); pthread_join(pt, NULL); execlp("echo", "echo", "passed", NULL); } sleep(1); // or anything else which needs ->cred_guard_mutex, // say open(/proc/$pid/mem) ptrace(PTRACE_ATTACH, pid, 0,0); kill(pid, SIGCONT); return 0; } we really need to narrow the (huge) scope of ->cred_guard_mutex in exec paths. my attempt to fix this was nacked, and nobody suggested a better solution so far. > This reverts commit c22397888f1eed98cd59f0a88f2a5f6925f80e15. Thanks. > --- > fs/exec.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/fs/exec.c b/fs/exec.c > index acc3a5536384..fc281b738a98 100644 > --- a/fs/exec.c > +++ b/fs/exec.c > @@ -62,7 +62,6 @@ > #include > #include > #include > -#include > > #include > #include > @@ -1084,7 +1083,7 @@ static int de_thread(struct task_struct *tsk) > while (sig->notify_count) { > __set_current_state(TASK_KILLABLE); > spin_unlock_irq(lock); > - freezable_schedule(); > + schedule(); > if (unlikely(__fatal_signal_pending(tsk))) > goto killed; > spin_lock_irq(lock); > @@ -1112,7 +1111,7 @@ static int de_thread(struct task_struct *tsk) > __set_current_state(TASK_KILLABLE); > write_unlock_irq(&tasklist_lock); > cgroup_threadgroup_change_end(tsk); > - freezable_schedule(); > + schedule(); > if (unlikely(__fatal_signal_pending(tsk))) > goto killed; > }