Date: Tue, 3 Mar 2020 09:58:02 +0100
From: Christian Brauner
To: Kees Cook
Cc: Bernd Edlinger, "Eric W. Biederman", Jann Horn, Jonathan Corbet,
 Alexander Viro, Andrew Morton, Alexey Dobriyan, Thomas Gleixner,
 Oleg Nesterov, Frederic Weisbecker, Andrei Vagin, Ingo Molnar,
 "Peter Zijlstra (Intel)", Yuyang Du, David Hildenbrand,
 Sebastian Andrzej Siewior, Anshuman Khandual, David Howells,
 James Morris, Greg Kroah-Hartman, Shakeel Butt, Jason Gunthorpe,
 Christian Kellner, Andrea Arcangeli, Aleksa Sarai, "Dmitry V. Levin",
 "linux-doc@vger.kernel.org", "linux-kernel@vger.kernel.org",
 "linux-fsdevel@vger.kernel.org", "linux-mm@kvack.org",
 "stable@vger.kernel.org"
Subject: Re: [PATCHv4] exec: Fix a deadlock in ptrace
Message-ID: <20200303085802.eqn6jbhwxtmz4j2x@wittgenstein>
References: <87a74zmfc9.fsf@x220.int.ebiederm.org>
 <87k142lpfz.fsf@x220.int.ebiederm.org>
 <875zfmloir.fsf@x220.int.ebiederm.org>
 <87v9nmjulm.fsf@x220.int.ebiederm.org>
 <202003021531.C77EF10@keescook>
In-Reply-To: <202003021531.C77EF10@keescook>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Mar 02, 2020 at 06:26:47PM -0800, Kees Cook wrote:
> On Mon, Mar 02, 2020 at 10:18:07PM +0000, Bernd Edlinger wrote:
> > This fixes a deadlock in the tracer when tracing a multi-threaded
> > application that calls execve while more than one thread are running.
> >
> > I observed that when running strace on the gcc test suite, it always
> > blocks after a while, when expect calls execve, because other threads
> > have to be terminated. They send ptrace events, but the strace is no
> > longer able to respond, since it is blocked in vm_access.
> >
> > The deadlock is always happening when strace needs to access the
> > tracees process mmap, while another thread in the tracee starts to
> > execve a child process, but that cannot continue until the
> > PTRACE_EVENT_EXIT is handled and the WIFEXITED event is received:
> >
> > strace          D    0 30614  30584 0x00000000
> > Call Trace:
> > __schedule+0x3ce/0x6e0
> > schedule+0x5c/0xd0
> > schedule_preempt_disabled+0x15/0x20
> > __mutex_lock.isra.13+0x1ec/0x520
> > __mutex_lock_killable_slowpath+0x13/0x20
> > mutex_lock_killable+0x28/0x30
> > mm_access+0x27/0xa0
> > process_vm_rw_core.isra.3+0xff/0x550
> > process_vm_rw+0xdd/0xf0
> > __x64_sys_process_vm_readv+0x31/0x40
> > do_syscall_64+0x64/0x220
> > entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >
> > expect          D    0 31933  30876 0x80004003
> > Call Trace:
> > __schedule+0x3ce/0x6e0
> > schedule+0x5c/0xd0
> > flush_old_exec+0xc4/0x770
> > load_elf_binary+0x35a/0x16c0
> > search_binary_handler+0x97/0x1d0
> > __do_execve_file.isra.40+0x5d4/0x8a0
> > __x64_sys_execve+0x49/0x60
> > do_syscall_64+0x64/0x220
> > entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >
> > The proposed solution is to take the cred_guard_mutex only
> > in a critical section at the beginning, and at the end of the
> > execve function, and let PTRACE_ATTACH fail with EAGAIN while
> > execve is not complete, but other functions like vm_access are
> > allowed to complete normally.
>
> Sorry to be bummer, but I don't think this will work. A few more things
> during the exec process depend on cred_guard_mutex being held.
>
> If I'm reading this patch correctly, this changes the lifetime of the
> cred_guard_mutex lock to be:
>     - during prepare_bprm_creds()
>     - from flush_old_exec() through install_exec_creds()
> Before, cred_guard_mutex was held from prepare_bprm_creds() through
> install_exec_creds().
>
> That means, for example, that check_unsafe_exec()'s documented invariant
> is violated:
>     /*
>      * determine how safe it is to execute the proposed program
>      * - the caller must hold ->cred_guard_mutex to protect against
>      *   PTRACE_ATTACH or seccomp thread-sync
>      */
>     static void check_unsafe_exec(struct linux_binprm *bprm) ...
> which is looking at no_new_privs as well as other details, and making
> decisions about the bprm state from the current state.
>
> I think it also means that the potentially multiple invocations
> of bprm_fill_uid() (via prepare_binprm() via binfmt_script.c and
> binfmt_misc.c) would be changing bprm->cred details (uid, gid) without
> a lock (another place where current's no_new_privs is evaluated).
>
> Related, it also means that cred_guard_mutex is unheld for every
> invocation of search_binary_handler() (which can loop via the previously
> mentioned binfmt_script.c and binfmt_misc.c), if any of them have hidden
> dependencies on cred_guard_mutex. (Though I only see bprm_fill_uid()
> currently.)

So one issue I see with having to reacquire the cred_guard_mutex might
be that this would allow tasks holding the cred_guard_mutex to block a
killed exec'ing task from exiting, right?