Subject: Re: [PATCH 1/4] fs: Implement close-on-fork
To: Nate Karstens, Alexander Viro, Jeff Layton, "J. Bruce Fields", Arnd Bergmann, Richard Henderson, Ivan Kokshaysky, Matt Turner, "James E.J. Bottomley", Helge Deller, "David S.
Miller", Jakub Kicinski, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-alpha@vger.kernel.org, linux-parisc@vger.kernel.org, sparclinux@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Changli Gao
References: <20200420071548.62112-1-nate.karstens@garmin.com> <20200420071548.62112-2-nate.karstens@garmin.com>
From: Eric Dumazet
Message-ID: <36dce9b4-a0bf-0015-f6bc-1006938545b1@gmail.com>
Date: Mon, 20 Apr 2020 03:25:49 -0700
In-Reply-To: <20200420071548.62112-2-nate.karstens@garmin.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 4/20/20 12:15 AM, Nate Karstens wrote:
> The close-on-fork flag causes the file descriptor to be closed
> atomically in the child process before the child process returns
> from fork(). Implement this feature and provide a method to
> get/set the close-on-fork flag using fcntl(2).
>
> This functionality was approved by the Austin Common Standards
> Revision Group for inclusion in the next revision of the POSIX
> standard (see issue 1318 in the Austin Group Defect Tracker).

Oh well... yet another feature slowing down a critical path.

>
> Co-developed-by: Changli Gao
> Signed-off-by: Changli Gao
> Signed-off-by: Nate Karstens
> ---
>  fs/fcntl.c                             |  2 ++
>  fs/file.c                              | 50 +++++++++++++++++++++++++-
>  include/linux/fdtable.h                |  7 ++++
>  include/linux/file.h                   |  2 ++
>  include/uapi/asm-generic/fcntl.h       |  5 +--
>  tools/include/uapi/asm-generic/fcntl.h |  5 +--
>  6 files changed, 66 insertions(+), 5 deletions(-)
>
> diff --git a/fs/fcntl.c b/fs/fcntl.c
> index 2e4c0fa2074b..23964abf4a1a 100644
> --- a/fs/fcntl.c
> +++ b/fs/fcntl.c
> @@ -335,10 +335,12 @@ static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
>  		break;
>  	case F_GETFD:
>  		err = get_close_on_exec(fd) ? FD_CLOEXEC : 0;
> +		err |= get_close_on_fork(fd) ? FD_CLOFORK : 0;
>  		break;
>  	case F_SETFD:
>  		err = 0;
>  		set_close_on_exec(fd, arg & FD_CLOEXEC);
> +		set_close_on_fork(fd, arg & FD_CLOFORK);
>  		break;
>  	case F_GETFL:
>  		err = filp->f_flags;
> diff --git a/fs/file.c b/fs/file.c
> index c8a4e4c86e55..de7260ba718d 100644
> --- a/fs/file.c
> +++ b/fs/file.c
> @@ -57,6 +57,8 @@ static void copy_fd_bitmaps(struct fdtable *nfdt, struct fdtable *ofdt,
>  	memset((char *)nfdt->open_fds + cpy, 0, set);
>  	memcpy(nfdt->close_on_exec, ofdt->close_on_exec, cpy);
>  	memset((char *)nfdt->close_on_exec + cpy, 0, set);
> +	memcpy(nfdt->close_on_fork, ofdt->close_on_fork, cpy);
> +	memset((char *)nfdt->close_on_fork + cpy, 0, set);
>

I suggest we group the two bits of a file (close_on_exec, close_on_fork) together,
so that we do not have to dirty two separate cache lines.

Otherwise we will add yet another cache line miss at every file opening/closing
for processes with big file tables.

Ie having a _single_ bitmap array, even bit for close_on_exec, odd bit for close_on_fork

static inline void __set_close_on_exec(unsigned int fd, struct fdtable *fdt)
{
	__set_bit(fd * 2, fdt->close_on_fork_exec);
}

static inline void __set_close_on_fork(unsigned int fd, struct fdtable *fdt)
{
	__set_bit(fd * 2 + 1, fdt->close_on_fork_exec);
}

Also the F_GETFD/F_SETFD implementation must use a single function call,
to not acquire the spinlock twice.
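
The interleaved layout and the single get/set call suggested above could look roughly like the following. This is a minimal userspace sketch, not kernel code: struct fdtable_sketch, get_fd_flags(), set_fd_flags(), MAX_FDS and the SK_CLOEXEC/SK_CLOFORK values are all hypothetical names and numbers chosen for illustration, and the locking is left out entirely. Only the close_on_fork_exec field name and the even/odd bit assignment come from the suggestion.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct fdtable;
 * only the combined bitmap from the suggestion is modelled. */
#define MAX_FDS       1024u
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define FLAG_WORDS    ((2u * MAX_FDS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Illustrative flag values for this sketch only (assumption, not the
 * values from the uapi headers touched by the patch). */
#define SK_CLOEXEC 1u
#define SK_CLOFORK 2u

struct fdtable_sketch {
	unsigned long close_on_fork_exec[FLAG_WORDS];
};

/* Even bit (fd * 2) is close-on-exec, odd bit (fd * 2 + 1) is
 * close-on-fork. BITS_PER_WORD is even, so both bits of one fd always
 * live in the same word and are read with a single load. */
static inline unsigned int get_fd_flags(const struct fdtable_sketch *fdt,
					unsigned int fd)
{
	size_t bit = (size_t)fd * 2;

	return (fdt->close_on_fork_exec[bit / BITS_PER_WORD]
		>> (bit % BITS_PER_WORD)) & 3u;
}

/* Replace both flags of one fd in a single read-modify-write, so a
 * caller would need to take its lock only once. */
static inline void set_fd_flags(struct fdtable_sketch *fdt,
				unsigned int fd, unsigned int flags)
{
	size_t bit = (size_t)fd * 2;
	unsigned long *word = &fdt->close_on_fork_exec[bit / BITS_PER_WORD];
	unsigned int shift = (unsigned int)(bit % BITS_PER_WORD);

	*word = (*word & ~(3UL << shift))
		| ((unsigned long)(flags & 3u) << shift);
}
```

With this shape, an F_GETFD handler would reduce to one get_fd_flags() call and F_SETFD to one set_fd_flags() call, each touching a single word of the bitmap. In the kernel the caller would presumably do this under one acquisition of the file table lock.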