Date: Sun, 1 Mar 2020 16:58:29 +0100
From: Christian Brauner
To: Aleksa Sarai, Bernd Edlinger, Oleg Nesterov
Cc: Bernd Edlinger, Jonathan Corbet, Alexander Viro, Andrew Morton,
    Alexey Dobriyan, "Eric W. Biederman", Thomas Gleixner,
    Frederic Weisbecker, Andrei Vagin, Ingo Molnar,
    "Peter Zijlstra (Intel)", Yuyang Du, David Hildenbrand,
    Sebastian Andrzej Siewior, Anshuman Khandual, David Howells,
    Jann Horn, James Morris, Kees Cook, Greg Kroah-Hartman,
    Shakeel Butt, Jason Gunthorpe, Christian Kellner,
    Andrea Arcangeli, "Dmitry V. Levin",
    "linux-doc@vger.kernel.org", "linux-kernel@vger.kernel.org",
    "linux-fsdevel@vger.kernel.org", "linux-mm@kvack.org"
Subject: Re: [PATCH] exec: Fix a deadlock in ptrace
Message-ID: <20200301155829.iiupfihl6z4jkylh@wittgenstein>
References: <20200301151333.bsjfdjcjddsza2vn@yavin>
In-Reply-To: <20200301151333.bsjfdjcjddsza2vn@yavin>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Mar 02, 2020 at 02:13:33AM +1100, Aleksa Sarai wrote:
> On 2020-03-01, Bernd Edlinger wrote:
> > This fixes a deadlock in the tracer when tracing a multi-threaded
> > application that calls execve while more than one thread are running.
> >
> > I observed that when running strace on the gcc test suite, it always
> > blocks after a while, when expect calls execve, because other threads
> > have to be terminated. They send ptrace events, but the strace is no
> > longer able to respond, since it is blocked in vm_access.
> >
> > The deadlock is always happening when strace needs to access the
> > tracees process mmap, while another thread in the tracee starts to
> > execve a child process, but that cannot continue until the
> > PTRACE_EVENT_EXIT is handled and the WIFEXITED event is received:
> >
> > strace D 0 30614 30584 0x00000000
> > Call Trace:
> > __schedule+0x3ce/0x6e0
> > schedule+0x5c/0xd0
> > schedule_preempt_disabled+0x15/0x20
> > __mutex_lock.isra.13+0x1ec/0x520
> > __mutex_lock_killable_slowpath+0x13/0x20
> > mutex_lock_killable+0x28/0x30
> > mm_access+0x27/0xa0
> > process_vm_rw_core.isra.3+0xff/0x550
> > process_vm_rw+0xdd/0xf0
> > __x64_sys_process_vm_readv+0x31/0x40
> > do_syscall_64+0x64/0x220
> > entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >
> > expect D 0 31933 30876 0x80004003
> > Call Trace:
> > __schedule+0x3ce/0x6e0
> > schedule+0x5c/0xd0
> > flush_old_exec+0xc4/0x770
> > load_elf_binary+0x35a/0x16c0
> > search_binary_handler+0x97/0x1d0
> > __do_execve_file.isra.40+0x5d4/0x8a0
> > __x64_sys_execve+0x49/0x60
> > do_syscall_64+0x64/0x220
> > entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >
> > The proposed solution is to have a second mutex that is
> > used in mm_access, so it is allowed to continue while the
> > dying threads are not yet terminated.
> >
> > I also took the opportunity to improve the documentation
> > of prepare_creds, which is obviously out of sync.
> >
> > Signed-off-by: Bernd Edlinger
>
> I can't comment on the validity of the patch, but I also found and
> reported this issue in 2016[1] and the discussion quickly veered into
> the problem being more complicated (and uglier) than it seems at first
> glance.
>
> You should probably also Cc stable, given this has been a long-standing
> issue and your patch doesn't look (too) invasive.
>
> [1]: https://lore.kernel.org/lkml/20160921152946.GA24210@dhcp22.suse.cz/

Yeah, I remember you mentioning this a while back.

Bernd, we really want a reproducer for this sent alongside this patch,
added to:

tools/testing/selftests/ptrace/

Having a test for this bug, irrespective of whether or not we go with
this as the fix, seems really worth it.

Oleg seems to have suggested that a potential alternative fix is to wait
in de_thread() until all other threads in the thread-group have passed
exit_notify(). Right now we only kill them but don't wait. Currently
de_thread() only waits for the thread-group leader to pass exit_notify()
whenever a non-thread-group leader thread execs (because the exec'ing
thread becomes the new thread-group leader with the same pid as the
former thread-group leader).

Christian
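
For anyone following along, the pid handover mentioned above can be
observed from userspace. What follows is only a rough sketch, assuming
Linux and Python 3.8+; it is not from the patch or from anyone in this
thread, and TRACEE and exec_from_non_leader() are made-up names. It does
not reproduce the deadlock (there is no tracer involved). It only shows
that when a non-leader thread execs, the new image runs with the old
thread-group leader's pid and its tid equals that pid.

```python
import os
import subprocess
import sys

# Hypothetical helper script, run in a separate interpreter. Its main
# thread (the thread-group leader) blocks in join() while a second,
# non-leader thread calls execv().
TRACEE = """
import os, sys, threading

def exec_from_thread():
    # Pass the pre-exec pid and tid to the new image via argv.
    os.execv(sys.executable, [
        sys.executable, "-c",
        "import os, sys, threading;"
        "print(sys.argv[1], sys.argv[2], os.getpid(), threading.get_native_id())",
        str(os.getpid()), str(threading.get_native_id()),
    ])

t = threading.Thread(target=exec_from_thread)
t.start()
t.join()  # only returns if execv() failed; exec kills this thread
"""


def exec_from_non_leader():
    """Return (old_pid, old_tid, new_pid, new_tid) around a thread's exec."""
    out = subprocess.run(
        [sys.executable, "-c", TRACEE],
        capture_output=True, text=True, check=True, timeout=60,
    ).stdout.split()
    old_pid, old_tid, new_pid, new_tid = map(int, out)
    return old_pid, old_tid, new_pid, new_tid


if __name__ == "__main__":
    old_pid, old_tid, new_pid, new_tid = exec_from_non_leader()
    print(f"before exec: pid={old_pid} tid={old_tid}")
    print(f"after exec:  pid={new_pid} tid={new_tid}")
```

Before the exec the tid differs from the pid, since the exec'ing thread
is not the leader. Afterwards both match the original pid, which is why
the old leader has to be out of the way before exec can proceed.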