Received: by 2002:a25:e7d8:0:0:0:0:0 with SMTP id e207csp4228441ybh; Tue, 17 Mar 2020 14:54:50 -0700 (PDT) X-Google-Smtp-Source: ADFU+vvkj9qHSyvlbqnKFSnAVhwgTh2ocpiVftHp/Uh3z6okGSypdDGBKzkWPsHVfvR33d7pdnzC X-Received: by 2002:aca:646:: with SMTP id 67mr813658oig.4.1584482090632; Tue, 17 Mar 2020 14:54:50 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1584482090; cv=pass; d=google.com; s=arc-20160816; b=OInSSffHTDcOFVaCn09kJx5L9dSRMpne4sQbgoGERODyeP8XkCDoNMeaeCGvPIUe0i 5LuPy0LCL1lTyuJelb2v9wiXrBrRpknk7PubIKoKmIkfFLELiRzrg0g8+WnCCjf0r1+l RKmKqE34wbqbKK3JYOQUdTq5JXLItpydHtpFjm06pViny+YV1S5B/6co9MVIGDgXmQSD R1ueXPygLFc1ezAyDrGrq+ODm7I5Q3r+gx3nMJ+jyNBwiTt/sFZreJow9DZuv38Tz+kX iewlr4tIHb8e6lpQafnfYGpcR/mIwM5uPYahnDTCQazH//dCPhKlndE6huNL8aA2GYJ2 k3mA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:content-transfer-encoding :content-language:in-reply-to:user-agent:date:message-id:from :references:cc:to:subject; bh=MfMUk+TYbhQyo069Y4tQCMdCoA1e5S3Vfl146EIzwEQ=; b=hhhAsoUoK8G56yU8uYNqahZ9qZyvBJ0VSYEwigfUcw4gNQTggx7I0Bxe4rto5AsA4k aEQZznU1lQcVlZHQJMqyPsJPo/SJ5EnsqDIUNf4u0x4gyvBpMHa8eq+EVy0nPCmGi03z LRvB4GgJRXyuqKY/QFmWNSGdpHT3IlhUOoVfX2ksNf1l4IigxV5GBVZgqNh3K+ZtdXW+ grddBqCqXf9BWOSUwKD9EMyqtCaiUHfsBOaadMDk+uxOHj7h4BBsYBivLns92n2/oPvk iiDDGDdDE1CWA3kEgOwHMyhHqL6tL+/PYM+0hTI0mxLynru94E0lN/vGb/clnB89qYH8 lYyg== ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1 spf=pass spfdomain=hotmail.de dkim=pass dkdomain=hotmail.de dmarc=pass fromdomain=hotmail.de); spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id i19si1122158otp.177.2020.03.17.14.54.32; Tue, 17 Mar 2020 14:54:50 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; arc=pass (i=1 spf=pass spfdomain=hotmail.de dkim=pass dkdomain=hotmail.de dmarc=pass fromdomain=hotmail.de); spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726916AbgCQVyC (ORCPT + 99 others); Tue, 17 Mar 2020 17:54:02 -0400 Received: from mail-oln040092067070.outbound.protection.outlook.com ([40.92.67.70]:1997 "EHLO EUR02-AM5-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726476AbgCQVyC (ORCPT ); Tue, 17 Mar 2020 17:54:02 -0400 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=iw8Ndk/3sWB8zc1KPd/+HbEoqJWMw0XjLBQAREKEKXwpsyVKmtya6ycwjNUUx1nTFYs6aIFyRdA+e1lXh8dqxBZJaSL2HcZCYOmorhv4Ppu+EtM1ODh3M8h0axACBr1xok6Yuw0sIhLv/YqJukIjKeKlh6dNWaREMI8hGKN1dzT100WWwBX4P8FdvO/up2padfDWzMczo6wHbyiYzxm3r+scoJ09qiKh2ptiD3YSzQ0pWlel9XieJoQHGGEfvz0nYiuKRiMM1uPAwaVopnyPVJzaVnVgAA19wtYrXgOSBBSKtHW7dLHpusKEdFti3tRA97UEknWRrJCFVzrYBBhz8Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=MfMUk+TYbhQyo069Y4tQCMdCoA1e5S3Vfl146EIzwEQ=; b=IrqQ5wiQRpRlgDpTApu/Ko+UnsQOIOuyrZf1kbA2wp+Oi2OLf8dezEsZH2Fatl8zpTUDrOG7jDEdCeJn3ao3zq6tVssmE00w+nQKfLbhlqm6waxNKAaftY5SNzX9YEsw2P4eNutQF9T/84o91bMyrEL9H4pkZX/mJBDo7MHdWnIcOTu/kECFGBlDpxhVmXTpBVJyRRY5uGLxaCRW58Mk1iHJ3n6EXQGjie68DDvDNbufytCuGgyvVpBZCJKz1MYfmizVvwBHxpS6yHDVFD10N3yNmQWyGD7epOXl0VqL8REL/YLFcsr8xyK/xxKSFgYioYi1Qo6JLCPu58nVU0/2bw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=hotmail.de; dmarc=pass action=none header.from=hotmail.de; dkim=pass header.d=hotmail.de; arc=none Received: from VE1EUR02FT042.eop-EUR02.prod.protection.outlook.com (2a01:111:e400:7e1e::3c) by VE1EUR02HT124.eop-EUR02.prod.protection.outlook.com (2a01:111:e400:7e1e::486) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2814.13; Tue, 17 Mar 2020 21:53:55 +0000 Received: from AM6PR03MB5170.eurprd03.prod.outlook.com (10.152.12.52) by VE1EUR02FT042.mail.protection.outlook.com (10.152.13.70) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2814.13 via Frontend Transport; Tue, 17 Mar 2020 21:53:55 +0000 X-IncomingTopHeaderMarker: OriginalChecksum:079933D0050953D00B1595D7E3EC684E4BEF8D14F99B4452C7BD7C512A7578A2;UpperCasedChecksum:5E4B4C9D9934F285B603CFBB35EAAC433D00F82E1FC5487E6098010C116CAE70;SizeAsReceived:10391;Count:50 Received: from AM6PR03MB5170.eurprd03.prod.outlook.com ([fe80::1956:d274:cab3:b4dd]) by AM6PR03MB5170.eurprd03.prod.outlook.com ([fe80::1956:d274:cab3:b4dd%6]) with mapi id 15.20.2814.021; Tue, 17 Mar 2020 21:53:55 +0000 Subject: Re: [PATCH v3 5/5] exec: Add a exec_update_mutex to replace cred_guard_mutex To: Kirill Tkhai , "Eric W. Biederman" Cc: Christian Brauner , Kees Cook , Jann Horn , Jonathan Corbet , Alexander Viro , Andrew Morton , Alexey Dobriyan , Thomas Gleixner , Oleg Nesterov , Frederic Weisbecker , Andrei Vagin , Ingo Molnar , "Peter Zijlstra (Intel)" , Yuyang Du , David Hildenbrand , Sebastian Andrzej Siewior , Anshuman Khandual , David Howells , James Morris , Greg Kroah-Hartman , Shakeel Butt , Jason Gunthorpe , Christian Kellner , Andrea Arcangeli , Aleksa Sarai , "Dmitry V. Levin" , "linux-doc@vger.kernel.org" , "linux-kernel@vger.kernel.org" , "linux-fsdevel@vger.kernel.org" , "linux-mm@kvack.org" , "stable@vger.kernel.org" , "linux-api@vger.kernel.org" References: <87a74xi4kz.fsf@x220.int.ebiederm.org> <87r1y8dqqz.fsf@x220.int.ebiederm.org> <87tv32cxmf.fsf_-_@x220.int.ebiederm.org> <87v9ne5y4y.fsf_-_@x220.int.ebiederm.org> <87zhcq4jdj.fsf_-_@x220.int.ebiederm.org> <87d09hn4kt.fsf@x220.int.ebiederm.org> <87lfo5lju6.fsf@x220.int.ebiederm.org> <6002ac56-025a-d50f-e89d-1bf42a072323@virtuozzo.com> <532ce6a3-f0df-e3e4-6966-473c608246e1@virtuozzo.com> From: Bernd Edlinger Message-ID: Date: Tue, 17 Mar 2020 22:53:53 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1 In-Reply-To: <532ce6a3-f0df-e3e4-6966-473c608246e1@virtuozzo.com> Content-Type: text/plain; charset=windows-1252 Content-Language: en-US Content-Transfer-Encoding: 7bit X-ClientProxiedBy: FR2P281CA0033.DEUP281.PROD.OUTLOOK.COM (2603:10a6:d10:14::20) To AM6PR03MB5170.eurprd03.prod.outlook.com (2603:10a6:20b:ca::23) X-Microsoft-Original-Message-ID: MIME-Version: 1.0 X-MS-Exchange-MessageSentRepresentingType: 1 Received: from [192.168.1.101] (92.77.140.102) by FR2P281CA0033.DEUP281.PROD.OUTLOOK.COM (2603:10a6:d10:14::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2814.19 via Frontend Transport; Tue, 17 Mar 2020 21:53:53 +0000 X-Microsoft-Original-Message-ID: X-TMN: [8/u1as5RDCi+LfL4ERhF3MFusUC8BGOx] X-MS-PublicTrafficType: Email X-IncomingHeaderCount: 50 X-EOPAttributedMessage: 0 X-MS-Office365-Filtering-Correlation-Id: 1f106d70-f042-4d65-8777-08d7cabdb042 X-MS-TrafficTypeDiagnostic: VE1EUR02HT124: X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: HdG5v9lvssNH7faA1LgXrFWy8Zrb2UEBeqQMlMx4gin9j2EeN/XfeXY8s2ZFdOKbKe+2EDn9T9j9CQ6g3MZKW9+OTA3Ezzx1RJa4IJH2pYB1m78D2ofvq2jNboKxH0t2LUl4m0KYi68n+S91WPYXy0Jg/Zu0Ant+FDrwMhqCQTjIbbMNu795hxS4IWTt/7sOLNBPZHgg4XUzWp6UDSkrZ2NKsAVux3gXOvoZNvlMxlQ= X-MS-Exchange-AntiSpam-MessageData: 8k8bx8BQpTlW5h6QHUM9jc0hB/GeSPzqQ1bLCi2KmjJIYAnHEGlTcr0Pt8BHv6pZb5Lr134A8Lds0clpZS3F/ziWXtlvcjdB+RS7bPlMqP6n32wfDq517BssnWnC3uwsfsi0Vas3Zu/g5PB85Nuvmg== X-OriginatorOrg: outlook.com X-MS-Exchange-CrossTenant-Network-Message-Id: 1f106d70-f042-4d65-8777-08d7cabdb042 X-MS-Exchange-CrossTenant-OriginalArrivalTime: 17 Mar 2020 21:53:55.5961 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 84df9e7f-e9f6-40af-b435-aaaaaaaaaaaa X-MS-Exchange-CrossTenant-FromEntityHeader: Internet X-MS-Exchange-CrossTenant-RMS-PersistedConsumerOrg: 00000000-0000-0000-0000-000000000000 X-MS-Exchange-Transport-CrossTenantHeadersStamped: VE1EUR02HT124 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 3/17/20 9:56 AM, Kirill Tkhai wrote: > On 14.03.2020 12:11, Bernd Edlinger wrote: >> The cred_guard_mutex is problematic. The cred_guard_mutex is held >> over the userspace accesses as the arguments from userspace are read. >> The cred_guard_mutex is held of PTRACE_EVENT_EXIT as the the other >> threads are killed. The cred_guard_mutex is held over >> "put_user(0, tsk->clear_child_tid)" in exit_mm(). >> >> Any of those can result in deadlock, as the cred_guard_mutex is held >> over a possible indefinite userspace waits for userspace. >> >> Add exec_update_mutex that is only held over exec updating process >> with the new contents of exec, so that code that needs not to be >> confused by exec changing the mm and the cred in ways that can not >> happen during ordinary execution of a process. >> >> The plan is to switch the users of cred_guard_mutex to >> exec_udpate_mutex one by one. This lets us move forward while still >> being careful and not introducing any regressions. >> >> Link: https://lore.kernel.org/lkml/20160921152946.GA24210@dhcp22.suse.cz/ >> Link: https://lore.kernel.org/lkml/AM6PR03MB5170B06F3A2B75EFB98D071AE4E60@AM6PR03MB5170.eurprd03.prod.outlook.com/ >> Link: https://lore.kernel.org/linux-fsdevel/20161102181806.GB1112@redhat.com/ >> Link: https://lore.kernel.org/lkml/20160923095031.GA14923@redhat.com/ >> Link: https://lore.kernel.org/lkml/20170213141452.GA30203@redhat.com/ >> Ref: 45c1a159b85b ("Add PTRACE_O_TRACEVFORKDONE and PTRACE_O_TRACEEXIT facilities.") >> Ref: 456f17cd1a28 ("[PATCH] user-vm-unlock-2.5.31-A2") >> Signed-off-by: "Eric W. Biederman" >> Signed-off-by: Bernd Edlinger >> --- >> fs/exec.c | 17 ++++++++++++++--- >> include/linux/binfmts.h | 8 +++++++- >> include/linux/sched/signal.h | 9 ++++++++- >> init/init_task.c | 1 + >> kernel/fork.c | 1 + >> 5 files changed, 31 insertions(+), 5 deletions(-) >> >> v3: this update fixes lock-order and adds an explicit data member in linux_binprm >> >> diff --git a/fs/exec.c b/fs/exec.c >> index d820a72..11974a1 100644 >> --- a/fs/exec.c >> +++ b/fs/exec.c >> @@ -1014,12 +1014,17 @@ static int exec_mmap(struct mm_struct *mm) >> { >> struct task_struct *tsk; >> struct mm_struct *old_mm, *active_mm; >> + int ret; >> >> /* Notify parent that we're no longer interested in the old VM */ >> tsk = current; >> old_mm = current->mm; >> exec_mm_release(tsk, old_mm); >> >> + ret = mutex_lock_killable(&tsk->signal->exec_update_mutex); >> + if (ret) >> + return ret; >> + >> if (old_mm) { >> sync_mm_rss(old_mm); >> /* >> @@ -1031,9 +1036,11 @@ static int exec_mmap(struct mm_struct *mm) >> down_read(&old_mm->mmap_sem); >> if (unlikely(old_mm->core_state)) { >> up_read(&old_mm->mmap_sem); >> + mutex_unlock(&tsk->signal->exec_update_mutex); >> return -EINTR; >> } >> } >> + >> task_lock(tsk); >> active_mm = tsk->active_mm; >> membarrier_exec_mmap(mm); >> @@ -1288,11 +1295,12 @@ int flush_old_exec(struct linux_binprm * bprm) >> goto out; >> >> /* >> - * After clearing bprm->mm (to mark that current is using the >> - * prepared mm now), we have nothing left of the original >> + * After setting bprm->called_exec_mmap (to mark that current is >> + * using the prepared mm now), we have nothing left of the original >> * process. If anything from here on returns an error, the check >> * in search_binary_handler() will SEGV current. >> */ >> + bprm->called_exec_mmap = 1; > > The two below is non-breaking pair: > > exec_mmap(bprm->mm); > bprm->called_exec_mmap = 1; > > Why not move this into exec_mmap(), so nobody definitely inserts something > between them? > Hmm, could be done, but then I would probably need a different name than "called_exec_mmap". How about adding a nice function comment to exec_mmap that calls out the changed behaviour that the exec_update_mutex is taken unless the function fails? Bernd. >> bprm->mm = NULL; >> >> #ifdef CONFIG_POSIX_TIMERS >> @@ -1438,6 +1446,8 @@ static void free_bprm(struct linux_binprm *bprm) >> { >> free_arg_pages(bprm); >> if (bprm->cred) { >> + if (bprm->called_exec_mmap) >> + mutex_unlock(¤t->signal->exec_update_mutex); >> mutex_unlock(¤t->signal->cred_guard_mutex); >> abort_creds(bprm->cred); >> } >> @@ -1487,6 +1497,7 @@ void install_exec_creds(struct linux_binprm *bprm) >> * credentials; any time after this it may be unlocked. >> */ >> security_bprm_committed_creds(bprm); >> + mutex_unlock(¤t->signal->exec_update_mutex); >> mutex_unlock(¤t->signal->cred_guard_mutex); >> } >> EXPORT_SYMBOL(install_exec_creds); >> @@ -1678,7 +1689,7 @@ int search_binary_handler(struct linux_binprm *bprm) >> >> read_lock(&binfmt_lock); >> put_binfmt(fmt); >> - if (retval < 0 && !bprm->mm) { >> + if (retval < 0 && bprm->called_exec_mmap) { >> /* we got to flush_old_exec() and failed after it */ >> read_unlock(&binfmt_lock); >> force_sigsegv(SIGSEGV); >> diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h >> index b40fc63..a345d9f 100644 >> --- a/include/linux/binfmts.h >> +++ b/include/linux/binfmts.h >> @@ -44,7 +44,13 @@ struct linux_binprm { >> * exec has happened. Used to sanitize execution environment >> * and to set AT_SECURE auxv for glibc. >> */ >> - secureexec:1; >> + secureexec:1, >> + /* >> + * Set by flush_old_exec, when exec_mmap has been called. >> + * This is past the point of no return, when the >> + * exec_update_mutex has been taken. >> + */ >> + called_exec_mmap:1; >> #ifdef __alpha__ >> unsigned int taso:1; >> #endif >> diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h >> index 8805025..a29df79 100644 >> --- a/include/linux/sched/signal.h >> +++ b/include/linux/sched/signal.h >> @@ -224,7 +224,14 @@ struct signal_struct { >> >> struct mutex cred_guard_mutex; /* guard against foreign influences on >> * credential calculations >> - * (notably. ptrace) */ >> + * (notably. ptrace) >> + * Deprecated do not use in new code. >> + * Use exec_update_mutex instead. >> + */ >> + struct mutex exec_update_mutex; /* Held while task_struct is being >> + * updated during exec, and may have >> + * inconsistent permissions. >> + */ >> } __randomize_layout; >> >> /* >> diff --git a/init/init_task.c b/init/init_task.c >> index 9e5cbe5..bd403ed 100644 >> --- a/init/init_task.c >> +++ b/init/init_task.c >> @@ -26,6 +26,7 @@ >> .multiprocess = HLIST_HEAD_INIT, >> .rlim = INIT_RLIMITS, >> .cred_guard_mutex = __MUTEX_INITIALIZER(init_signals.cred_guard_mutex), >> + .exec_update_mutex = __MUTEX_INITIALIZER(init_signals.exec_update_mutex), >> #ifdef CONFIG_POSIX_TIMERS >> .posix_timers = LIST_HEAD_INIT(init_signals.posix_timers), >> .cputimer = { >> diff --git a/kernel/fork.c b/kernel/fork.c >> index 8642530..036b692 100644 >> --- a/kernel/fork.c >> +++ b/kernel/fork.c >> @@ -1594,6 +1594,7 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk) >> sig->oom_score_adj_min = current->signal->oom_score_adj_min; >> >> mutex_init(&sig->cred_guard_mutex); >> + mutex_init(&sig->exec_update_mutex); >> >> return 0; >> } >> >