Date: Fri, 27 Aug 2021 12:18:52 +0200
From: Christian Brauner
To: David Hildenbrand
Cc: Andy Lutomirski, Linus Torvalds, "Eric W. Biederman", David Laight,
    Linux Kernel Mailing List, Andrew Morton, Thomas Gleixner,
    Ingo Molnar, Borislav Petkov, "H.
    Peter Anvin", Al Viro, Alexey Dobriyan, Steven Rostedt,
    "Peter Zijlstra (Intel)", Arnaldo Carvalho de Melo, Mark Rutland,
    Alexander Shishkin, Jiri Olsa, Namhyung Kim, Petr Mladek,
    Sergey Senozhatsky, Andy Shevchenko, Rasmus Villemoes, Kees Cook,
    Greg Ungerer, Geert Uytterhoeven, Mike Rapoport, Vlastimil Babka,
    Vincenzo Frascino, Chinwen Chang, Michel Lespinasse, Catalin Marinas,
    "Matthew Wilcox (Oracle)", Huang Ying, Jann Horn, Feng Tang,
    Kevin Brodsky, Michael Ellerman, Shawn Anastasio, Steven Price,
    Nicholas Piggin, Jens Axboe, Gabriel Krisman Bertazi, Peter Xu,
    Suren Baghdasaryan, Shakeel Butt, Marco Elver, Daniel Jordan,
    Nicolas Viennot, Thomas Cedeno, Collin Fijalkovich, Michal Hocko,
    Miklos Szeredi, Chengguang Xu, Christian König,
    "linux-unionfs@vger.kernel.org", Linux API,
    the arch/x86 maintainers, linux-fsdevel@vger.kernel.org, Linux-MM,
    Florian Weimer, Michael Kerrisk
Subject: Re: [PATCH v1 0/7] Remove in-tree usage of MAP_DENYWRITE
Message-ID: <20210827101852.7vbb2pqqyixqzd3b@wittgenstein>
References: <87lf56bllc.fsf@disp2133> <87eeay8pqx.fsf@disp2133>
    <5b0d7c1e73ca43ef9ce6665fec6c4d7e@AcuMS.aculab.com>
    <87h7ft2j68.fsf@disp2133>
    <0ed69079-9e13-a0f4-776c-1f24faa9daec@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <0ed69079-9e13-a0f4-776c-1f24faa9daec@redhat.com>
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Aug 26, 2021 at 11:47:07PM +0200, David Hildenbrand wrote:
> On 26.08.21 19:48, Andy Lutomirski wrote:
> > On Fri, Aug 13, 2021, at 5:54 PM, Linus Torvalds wrote:
> > > On Fri, Aug 13, 2021 at 2:49 PM Andy Lutomirski wrote:
> > > >
> > > > I’ll bite. How about we attack this in the opposite direction: remove the deny write mechanism entirely.
> > >
> > > I think that would be ok, except I can see somebody relying on it.
> > >
> > > It's broken, it's stupid, but we've done that ETXTBUSY for a _loong_ time.
> >
> > Someone off-list just pointed something out to me, and I think we should push harder to remove ETXTBSY. Specifically, we've all been focused on open() failing with ETXTBSY, and it's easy to make fun of anyone opening a running program for write when they should be unlinking and replacing it.
> >
> > Alas, Linux's implementation of deny_write_access() is correct^Wabsurd, and deny_write_access() *also* returns ETXTBSY if the file is open for write. So, in a multithreaded program, one thread does:
> >
> > fd = open("some exefile", O_RDWR | O_CREAT | O_CLOEXEC);
> > write(fd, some stuff);
> >
> > <--- problem is here
> >
> > close(fd);
> > execve("some exefile");
> >
> > Another thread does:
> >
> > fork();
> > execve("something else");
> >
> > In between fork and execve, there's another copy of the open file description, and i_writecount is held, and the execve() fails. Whoops. See, for example:
> >
> > https://github.com/golang/go/issues/22315
> >
> > I propose we get rid of deny_write_access() completely to solve this.
> >
> > Getting rid of i_writecount itself seems a bit harder, since a handful of filesystems use it for clever reasons.
> >
> > (OFD locks seem like they might have the same problem. Maybe we should have a clone() flag to unshare the file table and close close-on-exec things?)
> >
>
> It's not like this issue is new (^2017) or relevant in practice. So no need
> to hurry IMHO. One step at a time: it might make perfect sense to remove
> ETXTBSY, but we have to be careful to not break other user space that
> actually cares about the current behavior in practice.

I agree. As I at least tried to show, removing write-protection can make
some exploits easier. I'm all for trying to remove this if it simplifies
things but for sure this shouldn't be part of this patchset and we
should be careful about it.

The removal of a (misguided or only partially functioning) protection
mechanism doesn't introduce but removes a failure point. And I don't
think removal and addition of a failure point usually have the same
consequences.

Introducing a new failure point will often mean userspace quickly
detects regressions. Such regressions are pretty common due to security
fixes we introduce. Recent examples include [1]. Right after this was
merged the regression was reported.

But when allowing behavior that used to fail like ETXTBSY it can be
difficult for userspace to detect such regressions. The reason for that
is quite often that userspace applications don't tend to do something
that they know upfront will fail. Attackers however might.

[1]: bfb819ea20ce ("proc: Check /proc/$pid/attr/ writes against file opener")

Christian