Date: Tue, 10 Mar 2020 22:29:57 +0100
From: Christian Brauner
To: "Eric W. Biederman"
Cc: Jann Horn, Bernd Edlinger, Kees Cook, Jonathan Corbet,
    Alexander Viro, Andrew Morton, Alexey Dobriyan, Thomas Gleixner,
    Oleg Nesterov, Frederic Weisbecker, Andrei Vagin, Ingo Molnar,
    "Peter Zijlstra (Intel)", Yuyang Du, David Hildenbrand,
    Sebastian Andrzej Siewior, Anshuman Khandual, David Howells,
    James Morris, Greg Kroah-Hartman, Shakeel Butt, Jason Gunthorpe,
    Christian Kellner, Andrea Arcangeli, Aleksa Sarai, "Dmitry V. Levin",
    "linux-doc@vger.kernel.org", "linux-kernel@vger.kernel.org",
    "linux-fsdevel@vger.kernel.org", "linux-mm@kvack.org",
    "stable@vger.kernel.org", "linux-api@vger.kernel.org",
    Arnd Bergmann, Sargun Dhillon
Subject: Re: [PATCH] pidfd: Stop taking cred_guard_mutex
Message-ID: <20200310212957.aatd4yzjwsyudi2g@wittgenstein>
References: <877dztz415.fsf@x220.int.ebiederm.org>
    <20200309201729.yk5sd26v4bz4gtou@wittgenstein>
    <87k13txnig.fsf@x220.int.ebiederm.org>
    <20200310085540.pztaty2mj62xt2nm@wittgenstein>
    <87wo7svy96.fsf_-_@x220.int.ebiederm.org>
    <87k13sui1p.fsf@x220.int.ebiederm.org>
    <87lfo8rkqo.fsf@x220.int.ebiederm.org>
In-Reply-To: <87lfo8rkqo.fsf@x220.int.ebiederm.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 10, 2020 at 03:57:35PM -0500, Eric W. Biederman wrote:
> Jann Horn writes:
>
> > On Tue, Mar 10, 2020 at 9:00 PM Jann Horn wrote:
> >> On Tue, Mar 10, 2020 at 8:29 PM Eric W. Biederman wrote:
> >> > Jann Horn writes:
> >> > > On Tue, Mar 10, 2020 at 7:54 PM Eric W. Biederman wrote:
> >> > >> During exec some file descriptors are closed and the files struct is
> >> > >> unshared. But all of that can happen at other times and it has the
> >> > >> same protections during exec as at ordinary times. So stop taking the
> >> > >> cred_guard_mutex as it is useless.
> >> > >>
> >> > >> Furthermore the cred_guard_mutex is a bad idea because it is deadlock
> >> > >> prone, as it is held in several places while waiting possibly indefinitely
> >> > >> for userspace to do something.
> > [...]
> >> > > If you make this change, then if this races with execution of a setuid
> >> > > program that afterwards e.g. opens a unix domain socket, an attacker
> >> > > will be able to steal that socket and inject messages into
> >> > > communication with things like DBus.
> >> > > procfs currently has the same
> >> > > race, and that still needs to be fixed, but at least procfs doesn't
> >> > > let you open things like sockets because they don't have a working
> >> > > ->open handler, and it enforces the normal permission check for
> >> > > opening files.
> >> >
> >> > It isn't only exec that can change credentials. Do we need a lock for
> >> > changing credentials?
> > [...]
> >> > If we need a lock around credential change let's design and build that.
> >> > Having a mismatch between what a lock is designed to do, and what
> >> > people use it for can only result in other bugs as people get confused.
> >>
> >> Hmm... what benefits do we get from making it a separate lock? I guess
> >> it would allow us to make it a per-task lock instead of a
> >> signal_struct-wide one? That might be helpful...
> >
> > But actually, isn't the core purpose of the cred_guard_mutex to guard
> > against concurrent credential changes anyway? That's what almost
> > everyone uses it for, and it's in the name...
>
> Having been through all of the users, nope.
>
> Maybe someone tried to repurpose it for that. I haven't traced through
> when it was renamed from cred_exec_mutex to
> cred_guard_mutex.
>
> The original purpose was to make exec and ptrace deadlock. But it
> was seen as being there to allow safely calculating the new credentials
> before the point of no return. Because whether a process is ptraced or not
> affects the new credential calculations. Unfortunately offering that
> guarantee fundamentally leads to deadlock.
>
> So ptrace_attach and seccomp use the cred_guard_mutex to guarantee
> a deadlock.
>
> The common use is to take cred_guard_mutex to guard the window when
> credentials and process details are out of sync in exec. But there
> is at least do_io_accounting that seems to have the same justification
> for holding __pidfd_fget.
>
> With effort I suspect we can replace exec_change_mutex with task_lock.
> When we are guaranteed to be single threaded placing exec_change_mutex
> in signal_struct doesn't really help us (except maybe in some races?).
>
> The deep problem is no one really understands cred_guard_mutex so it is
> a mess. Code with poorly defined semantics is always wrong somewhere

This is a good point. When discussing patches sensitive to credential
changes, cred_guard_mutex was always introduced as having the purpose of
guarding against concurrent credential changes. And I'm pretty sure that
that's how most people have been using it for quite a long time. I mean,
it's at least the case for seccomp and proc and probably quite a few
more. So the problem seems to me that it has clear _intended_ semantics
that run into issues in all sorts of cases. So if cred_guard_mutex is
not that, then we seem to need to provide something that serves its
intended purpose.
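
To make the _intended_ semantics concrete, here is a minimal userspace
sketch, not kernel code. It assumes a toy model in which one mutex guards
both a credential change and the resources opened under the new
credentials, so that a permission check and the fetch that depends on it
cannot be split by a concurrent change. All names (toy_task, cred_guard,
toy_exec_setuid, toy_getfd) are hypothetical, and the uid comparison is
only a crude stand-in for a real ptrace-style access check.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a task: credentials plus one "file". */
struct toy_task {
	pthread_mutex_t cred_guard;	/* models the intended role of the lock */
	int uid;			/* credentials the task currently runs with */
	const char *file;		/* resource opened under those credentials */
};

/*
 * Models exec of a setuid program: credentials and the resources that
 * belong to them change together while the lock is held.
 */
static void toy_exec_setuid(struct toy_task *t, int new_uid,
			    const char *new_file)
{
	pthread_mutex_lock(&t->cred_guard);
	t->uid = new_uid;
	t->file = new_file;
	pthread_mutex_unlock(&t->cred_guard);
}

/*
 * Models a pidfd_getfd()-style fetch: the permission check and the
 * fetch happen under the same lock, so a credential change cannot
 * slip in between the two steps.
 */
static int toy_getfd(struct toy_task *t, int caller_uid, const char **out)
{
	int ret = 0;

	assert(t != NULL && out != NULL);

	pthread_mutex_lock(&t->cred_guard);
	if (caller_uid == 0 || caller_uid == t->uid)
		*out = t->file;		/* allowed: hand out the resource */
	else
		ret = -EPERM;		/* denied: leave *out untouched */
	pthread_mutex_unlock(&t->cred_guard);

	return ret;
}
```

In this model a caller with uid 1000 can fetch the file while the task
runs as uid 1000, and gets -EPERM once toy_exec_setuid() has switched the
task to uid 0. The sketch only illustrates the "guard against concurrent
credential changes" reading; it says nothing about the deadlocks that come
from holding such a lock while waiting on userspace.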