Received: by 2002:a25:c205:0:0:0:0:0 with SMTP id s5csp4682084ybf; Wed, 4 Mar 2020 08:36:40 -0800 (PST) X-Google-Smtp-Source: ADFU+vuQvI1+B9bOVq8CxGRVuXnEq6F02SNRkafhL5bZvuNitSPF+ro23Fm+4XjIbe4OsNujVv3i X-Received: by 2002:aca:43c1:: with SMTP id q184mr2262326oia.116.1583339799556; Wed, 04 Mar 2020 08:36:39 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1583339799; cv=none; d=google.com; s=arc-20160816; b=G/0v7eqc5+tlmMuJzl+1sygYGZLxgcX3OBHPMTeTvThsNjDegP3PTMii1XWLtVV1xX +6Rsw393U/N+9GTBsfeQvC85sFJn6wwbQUH5jfnRKdFJyKEJd01BKpiAgF2vYv1iURec ZNlTu5U4CV9uSP1TEx5yDFaqHk7Tn9OdMe/EvVP95r9b6KtDiQz7NvODzpXCdW2FCD8t LwQsRaGRUb/iEhjUXFBePcRtk2ivbyHCIfybT8/s+/+Kae57Zv1xHhmkoUIj4OJFa1AY otZDgkBgTuK2jyrikOXRGylyYB8MZ7RwCNAlqDOLNcG5WSkgDsMLYDrEpCyfkpHLQQO6 AVMA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:subject:mime-version:user-agent :message-id:in-reply-to:date:references:cc:to:from; bh=EG3DWU1D1tjcKRcv1pDuis9gLupoPa1nC7cVcpoABNY=; b=uKwMJVMHELzGL27MzfNm086IDka2zWsP3y+9+S9Mjg0Ot6aKpKcYA6bg1QH9YBt7Ie wYQnwMOnf10u20NWDKYMnoiNxAbntZ8R5907VfniQunT4Q2JWihJzo03ZsYQ5WaxuF6G 2x1tGyIAsJ1bcAvQp2Q4oznPasg6nZEU6ZWzwOf+SEcDM128NySfGDMKOfB1/LkzBo/X WdSfb7S7enxKGA9COTzL26TrqZBl+IsRZyKaRaBrj4rrO7787mz29ZegGkUeQZLxsXJX dngRutPFTTEuB6sSV9cK5wh3Nxodq11UMyg8JYBBD3od9OapsC+5+amFsb7zeemzqUWK GqNA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=xmission.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id p3si1566055oih.186.2020.03.04.08.36.27; Wed, 04 Mar 2020 08:36:39 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=xmission.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388107AbgCDQfz (ORCPT + 99 others); Wed, 4 Mar 2020 11:35:55 -0500 Received: from out03.mta.xmission.com ([166.70.13.233]:42746 "EHLO out03.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726263AbgCDQfz (ORCPT ); Wed, 4 Mar 2020 11:35:55 -0500 Received: from in01.mta.xmission.com ([166.70.13.51]) by out03.mta.xmission.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1j9Wzo-000429-0p; Wed, 04 Mar 2020 09:35:44 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.xmission.com) by in01.mta.xmission.com with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.87) (envelope-from ) id 1j9Wzh-0000J3-B9; Wed, 04 Mar 2020 09:35:43 -0700 From: ebiederm@xmission.com (Eric W. Biederman) To: Bernd Edlinger Cc: Christian Brauner , Kees Cook , Jann Horn , Jonathan Corbet , Alexander Viro , Andrew Morton , Alexey Dobriyan , Thomas Gleixner , Oleg Nesterov , Frederic Weisbecker , Andrei Vagin , Ingo Molnar , "Peter Zijlstra \(Intel\)" , Yuyang Du , David Hildenbrand , Sebastian Andrzej Siewior , Anshuman Khandual , David Howells , James Morris , Greg Kroah-Hartman , Shakeel Butt , Jason Gunthorpe , Christian Kellner , Andrea Arcangeli , Aleksa Sarai , "Dmitry V. Levin" , "linux-doc\@vger.kernel.org" , "linux-kernel\@vger.kernel.org" , "linux-fsdevel\@vger.kernel.org" , "linux-mm\@kvack.org" , "stable\@vger.kernel.org" , "linux-api\@vger.kernel.org" References: <87a74zmfc9.fsf@x220.int.ebiederm.org> <87k142lpfz.fsf@x220.int.ebiederm.org> <875zfmloir.fsf@x220.int.ebiederm.org> <87v9nmjulm.fsf@x220.int.ebiederm.org> <202003021531.C77EF10@keescook> <20200303085802.eqn6jbhwxtmz4j2x@wittgenstein> <87v9nlii0b.fsf@x220.int.ebiederm.org> <87a74xi4kz.fsf@x220.int.ebiederm.org> Date: Wed, 04 Mar 2020 10:33:24 -0600 In-Reply-To: (Bernd Edlinger's message of "Wed, 4 Mar 2020 14:37:46 +0000") Message-ID: <87r1y8dqqz.fsf@x220.int.ebiederm.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-XM-SPF: eid=1j9Wzh-0000J3-B9;;;mid=<87r1y8dqqz.fsf@x220.int.ebiederm.org>;;;hst=in01.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX1+PbZqkG+MqZcU53qiOMlhhlYuGrYdl+NM= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on sa01.xmission.com X-Spam-Level: X-Spam-Status: No, score=-0.2 required=8.0 tests=ALL_TRUSTED,BAYES_50, DCC_CHECK_NEGATIVE,NO_DNS_FOR_FROM,T_TM2_M_HEADER_IN_MSG autolearn=disabled version=3.4.2 X-Spam-Virus: No X-Spam-Report: * -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP * 0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60% * [score: 0.4995] * 0.0 NO_DNS_FOR_FROM DNS: Envelope sender has no MX or A DNS records * 0.0 T_TM2_M_HEADER_IN_MSG BODY: No description available. * -0.0 DCC_CHECK_NEGATIVE Not listed in DCC * [sa01 1397; Body=1 Fuz1=1 Fuz2=1] X-Spam-DCC: XMission; sa01 1397; Body=1 Fuz1=1 Fuz2=1 X-Spam-Combo: ;Bernd Edlinger X-Spam-Relay-Country: X-Spam-Timing: total 6151 ms - load_scoreonly_sql: 0.04 (0.0%), signal_user_changed: 2.4 (0.0%), b_tie_ro: 1.66 (0.0%), parse: 1.54 (0.0%), extract_message_metadata: 14 (0.2%), get_uri_detail_list: 3.6 (0.1%), tests_pri_-1000: 5 (0.1%), tests_pri_-950: 1.03 (0.0%), tests_pri_-900: 0.85 (0.0%), tests_pri_-90: 36 (0.6%), check_bayes: 34 (0.6%), b_tokenize: 12 (0.2%), b_tok_get_all: 12 (0.2%), b_comp_prob: 3.0 (0.0%), b_tok_touch_all: 5.0 (0.1%), b_finish: 0.69 (0.0%), tests_pri_0: 6077 (98.8%), check_dkim_signature: 0.44 (0.0%), check_dkim_adsp: 5385 (87.5%), poll_dns_idle: 5380 (87.5%), tests_pri_10: 2.4 (0.0%), tests_pri_500: 8 (0.1%), rewrite_mail: 0.00 (0.0%) Subject: Re: [PATCHv5] exec: Fix a deadlock in ptrace X-Spam-Flag: No X-SA-Exim-Version: 4.2.1 (built Thu, 05 May 2016 13:38:54 -0600) X-SA-Exim-Scanned: Yes (on in01.mta.xmission.com) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Bernd Edlinger writes: > On 3/3/20 9:08 PM, Eric W. Biederman wrote: >> Bernd Edlinger writes: >> >>> On 3/3/20 4:18 PM, Eric W. Biederman wrote: >>>> Bernd Edlinger writes: >>>>> diff --git a/tools/testing/selftests/ptrace/vmaccess.c b/tools/testing/selftests/ptrace/vmaccess.c >>>>> new file mode 100644 >>>>> index 0000000..6d8a048 >>>>> --- /dev/null >>>>> +++ b/tools/testing/selftests/ptrace/vmaccess.c >>>>> @@ -0,0 +1,66 @@ >>>>> +// SPDX-License-Identifier: GPL-2.0+ >>>>> +/* >>>>> + * Copyright (c) 2020 Bernd Edlinger >>>>> + * All rights reserved. >>>>> + * >>>>> + * Check whether /proc/$pid/mem can be accessed without causing deadlocks >>>>> + * when de_thread is blocked with ->cred_guard_mutex held. >>>>> + */ >>>>> + >>>>> +#include "../kselftest_harness.h" >>>>> +#include >>>>> +#include >>>>> +#include >>>>> +#include >>>>> +#include >>>>> +#include >>>>> + >>>>> +static void *thread(void *arg) >>>>> +{ >>>>> + ptrace(PTRACE_TRACEME, 0, 0L, 0L); >>>>> + return NULL; >>>>> +} >>>>> + >>>>> +TEST(vmaccess) >>>>> +{ >>>>> + int f, pid = fork(); >>>>> + char mm[64]; >>>>> + >>>>> + if (!pid) { >>>>> + pthread_t pt; >>>>> + >>>>> + pthread_create(&pt, NULL, thread, NULL); >>>>> + pthread_join(pt, NULL); >>>>> + execlp("true", "true", NULL); >>>>> + } >>>>> + >>>>> + sleep(1); >>>>> + sprintf(mm, "/proc/%d/mem", pid); >>>>> + f = open(mm, O_RDONLY); >>>>> + ASSERT_LE(0, f); >>>>> + close(f); >>>>> + f = kill(pid, SIGCONT); >>>>> + ASSERT_EQ(0, f); >>>>> +} >>>>> + >>>>> +TEST(attach) >>>>> +{ >>>>> + int f, pid = fork(); >>>>> + >>>>> + if (!pid) { >>>>> + pthread_t pt; >>>>> + >>>>> + pthread_create(&pt, NULL, thread, NULL); >>>>> + pthread_join(pt, NULL); >>>>> + execlp("true", "true", NULL); >>>>> + } >>>>> + >>>>> + sleep(1); >>>>> + f = ptrace(PTRACE_ATTACH, pid, 0L, 0L); >>>> >>>> To be meaningful this code needs to learn to loop when >>>> ptrace returns -EAGAIN. >>>> >>>> Because that is pretty much what any self respecting user space >>>> process will do. >>>> >>>> At which point I am not certain we can say that the behavior has >>>> sufficiently improved not to be a deadlock. >>>> >>> >>> In this special dead-duck test it won't work, but it would >>> still be lots more transparent what is going on, since previously >>> you had two zombie process, and no way to even output debug >>> messages, which also all self respecting user space processes >>> should do. >> >> Agreed it is more transparent. So if you are going to deadlock >> it is better. >> >> My previous proposal (which I admit is more work to implement) would >> actually allow succeeding in this case and so it would not be subject to >> a dead lock (even via -EGAIN) at this point. >> >>> So yes, I can at least give a good example and re-try it several >>> times together with wait4 which a tracer is expected to do. >> >> Thank you, >> >> Eric >> > > Okay, I think it can be done with minimal API changes, > but it needs two mutexes, one that guards the execve, > and one that guards only the credentials. > > If no traced sibling thread exists, the mutexes are used this way: > lock(exec_guard_mutex) > cred_locked_in_execve = true; > de_thread() > lock(cred_guard_mutex) > unlock(cred_guard_mutex) > cred_locked_in_execve = false; > unlock(exec_guard_mutex) > > so effectively no API change at all. > > If a traced sibling thread exists, the mutexes are used differently: > lock(exec_guard_mutex) > cred_locked_in_execve = true; > unlock(exec_guard_mutex) > de_thread() > lock(cred_guard_mutex) > unlock(cred_guard_mutex) > lock(exec_guard_mutex) > cred_locked_in_execve = false; > unlock(exec_guard_mutex) > > Only the case changes that would deadlock anyway. Let me propose a slight alternative that I think sets us up for long term success. Leave cred_guard_mutex as is, but declare it undesirable. The cred_guard_mutex as designed really is something we should get rid of. As it it can sleep over several different userspace accesses. The copying of the exec arguments is technically as prone to deadlock as the ptrace case. Add a new mutex with a better name perhaps "exec_change_mutex" that is used to guard the changes that exec makes to a process. Then we gradually shift all the cred_guard_mutex users over to the new mutex. AKA one patch per user of cred_guard_mutex. At each patch that shifts things over we will have the opportunity to review the code to see that there no funny dependencies that were missed. I will sign up for working on the no_new_privs and ptrace_attach cases as I think I can make those happen. Especially no_new_privs. Getting the easier cases will resolve your issues and put things on a better footing. Eric