Date: Sat, 18 Jun 2022 20:40:19 +0100
From: Matthew Wilcox
To: Ralph Corderoy
Cc: Nate Karstens, Alexander Viro, Jeff Layton, "J. Bruce Fields",
    Arnd Bergmann, Richard Henderson, Ivan Kokshaysky, Matt Turner,
    "James E.J. Bottomley", Helge Deller, "David S. Miller",
    Jakub Kicinski, Eric Dumazet, David Laight,
    linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org,
    linux-alpha@vger.kernel.org, linux-parisc@vger.kernel.org,
    sparclinux@vger.kernel.org, netdev@vger.kernel.org,
    linux-kernel@vger.kernel.org, Changli Gao
Subject: Re: [PATCH v2] Implement close-on-fork
References: <20200515152321.9280-1-nate.karstens@garmin.com>
    <20220618114111.61EC71F981@orac.inputplus.co.uk>
In-Reply-To: <20220618114111.61EC71F981@orac.inputplus.co.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Jun 18, 2022 at 12:41:11PM +0100, Ralph Corderoy wrote:
> Hi Nate,
>
> > One manifestation of this is a race conditions in system(), which
> > (depending on the implementation) is non-atomic in that it first calls
> > a fork() and then an exec().
>
> The need for O_CLOFORK might be made more clear by looking at a
> long-standing Go issue, i.e. unrelated to system(3), which was started
> in 2017 by Russ Cox when he summed up the current race-condition
> behaviour of trying to execve(2) a newly created file:
> https://github.com/golang/go/issues/22315.  I raised it on linux-kernel
> in 2017, https://marc.info/?l=linux-kernel&m=150834137201488, and linked
> to a proposed patch from 2011, ‘[PATCH] fs: add FD_CLOFORK and
> O_CLOFORK’ by Changli Gao.
> As I said, long-standing.

The problem is that people advocating for O_CLOFORK understand its
value, but not its cost.  Other Google employees have a system which
has literally millions of file descriptors in a single process.
Having to maintain this extra state per-fd is a cost they don't want
to pay (and have been quite vocal about earlier in this thread).

Fundamentally, fork()+exec() is a terrible model.  Mind you, so is
spawn().  I haven't seen a good model yet.
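
For readers following the thread, here is a minimal Python sketch of the gap
being discussed. It is an illustration only and is not taken from any patch in
this thread. The existing close-on-exec flag takes effect only at exec(), so a
descriptor is still duplicated into every fork()ed child until that child
execs or closes it. The helper name fd_open_in_forked_child is hypothetical.
The sketch assumes a POSIX system and relies on Python creating descriptors
with close-on-exec set by default (PEP 446).

```python
import os


def fd_open_in_forked_child(fd: int) -> bool:
    """Fork, and report whether `fd` is open in the child before any exec.

    Hypothetical helper for illustration: the child probes the descriptor
    with fstat() and reports the result through its exit status.
    """
    pid = os.fork()
    if pid == 0:
        # Child process: never return into the caller's code.
        try:
            os.fstat(fd)
        except OSError:
            os._exit(1)  # descriptor is not open in the child
        os._exit(0)      # descriptor was duplicated by fork()
    # Parent process: collect the child's verdict.
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0


if __name__ == "__main__":
    r, w = os.pipe()
    # Python sets close-on-exec on new descriptors, so this prints False.
    print("inheritable across exec:", os.get_inheritable(r))
    # fork() ignores that flag: the child still holds a copy.
    print("open in forked child:", fd_open_in_forked_child(r))
    os.close(r)
    os.close(w)
```

The window between fork() and exec() in the child is where the copy exists,
which is what a close-on-fork flag would eliminate, at the price of the
per-descriptor state mentioned above.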