Date: Sat, 12 Dec 2020 13:14:19 +0100
From: Christian Brauner
To: "Alejandro Colomar (man-pages)"
Cc: Stephen Kitt, linux-man@vger.kernel.org, Michael Kerrisk,
    linux-kernel@vger.kernel.org
Subject: Re: [patch] close_range.2: new page documenting close_range(2)
Message-ID: <20201212121419.odpgbaigrjhpkjnm@wittgenstein>
In-Reply-To: <0ea38a7a-1c64-086e-3d64-38686f5b7856@gmail.com>

On Thu, Dec 10, 2020 at 03:36:42PM +0100, Alejandro Colomar (man-pages) wrote:
> Hi Christian,

Hi
Alex,

> Thanks for confirming that behavior. Seems reasonable.
>
> I was wondering...
> If this call is equivalent to unshare(2)+{close(2) in a loop},
> shouldn't it fail for the same reasons those syscalls can fail?
>
> What about the following errors?:
>
> From unshare(2):
>
> EPERM  The calling process did not have the required privileges
>        for this operation.

unshare(CLONE_FILES) doesn't require any privileges. Only flags relevant
to kernel/nsproxy.c:unshare_nsproxy_namespaces() require privileges,
i.e.
CLONE_NEWNS
CLONE_NEWUTS
CLONE_NEWIPC
CLONE_NEWNET
CLONE_NEWPID
CLONE_NEWCGROUP
CLONE_NEWTIME
so the permissions are the same.

>
> From close(2):
> EBADF  fd isn't a valid open file descriptor.
>
> OK, this one can't happen with the current code.
> Let's say there are fds 1 to 10, and you call 'close_range(20,30,0)'.
> It's a no-op (although it will still unshare if the flag is set).
> But shouldn't it fail with EBADF?

CLOSE_RANGE_UNSHARE should always give you a private file descriptor
table independent of whether or not any file descriptors need to be
closed. That's also how we documented the flag:

	/* Unshare the file descriptor table before closing file descriptors. */
	#define CLOSE_RANGE_UNSHARE	(1U << 1)

A caller calling unshare(CLONE_FILES) and then an emulated close_range()
or the proper close_range() syscall wants to make sure that all unwanted
file descriptors are closed (if any) and that no new file descriptors
can be injected afterwards. If you skip the unshare(CLONE_FILES) because
there are no fds to be closed you open up a race window. It would also
be annoying for userspace if they _may_ have received a private file
descriptor table but only if any fds needed to be closed.

If people really were extremely keen on skipping the unshare when no
fd needs to be closed then this could become a new flag. But I really
don't think that's necessary, and it also doesn't make a lot of sense,
imho.
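
To make the ordering concrete, here is a minimal userspace sketch of the
"unshare(CLONE_FILES), then close() in a loop" emulation. The function
name emulated_close_range() and the sysconf(_SC_OPEN_MAX) cap are
assumptions made for illustration only; this is not the kernel
implementation. The point it shows is that the unshare happens
unconditionally when the flag is set, even if the range turns out to
contain no open fds.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>   /* unshare(), CLONE_FILES */
#include <unistd.h>  /* close(), sysconf() */

#ifndef CLOSE_RANGE_UNSHARE
#define CLOSE_RANGE_UNSHARE (1U << 1) /* value as quoted above */
#endif

/* Hypothetical emulation helper; the name is made up for this sketch. */
int emulated_close_range(unsigned int first, unsigned int last,
                         unsigned int flags)
{
    long open_max;

    if ((flags & ~CLOSE_RANGE_UNSHARE) || first > last) {
        errno = EINVAL;
        return -1;
    }

    /*
     * Unshare first and unconditionally: the caller gets a private fd
     * table even if nothing in [first, last] is open, so no other task
     * can inject fds afterwards.
     */
    if ((flags & CLOSE_RANGE_UNSHARE) && unshare(CLONE_FILES) < 0)
        return -1;

    /*
     * Assumption of this sketch: cap the loop at the current soft fd
     * limit. Fds above a lowered limit would be missed.
     */
    open_max = sysconf(_SC_OPEN_MAX);
    if (open_max <= 0)
        open_max = 1024; /* arbitrary fallback for the sketch */
    if (last >= (unsigned long)open_max)
        last = (unsigned int)(open_max - 1);

    /* Errors from close(), including EBADF, are deliberately ignored. */
    for (unsigned int fd = first; fd <= last; fd++)
        close((int)fd);

    return 0;
}
```

With fds 1 to 10 open, emulated_close_range(20, 30, CLOSE_RANGE_UNSHARE)
closes nothing but still leaves the caller with a private file
descriptor table, which is the behaviour argued for above.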
>
> EINTR  The close() call was interrupted by a signal; see
>        signal(7).
>
> EIO    An I/O error occurred.
>
> ENOSPC, EDQUOT
>        On NFS, these errors are not normally reported against
>        the first write which exceeds the available storage
>        space, but instead against a subsequent write(2),
>        fsync(2), or close().

None of these will be seen by userspace because close_range() currently
ignores all errors after it has begun closing files.

Christian