Subject: Re: [patch] close_range.2: new page documenting close_range(2)
To: Christian Brauner
Cc: Stephen Kitt, linux-man@vger.kernel.org, Michael Kerrisk, linux-kernel@vger.kernel.org
From: "Alejandro Colomar (man-pages)"
Message-ID: <47a388ca-bcd8-d917-0a0a-cdbd185d6998@gmail.com>
In-Reply-To: <20201212121419.odpgbaigrjhpkjnm@wittgenstein>
Date: Sat, 12 Dec 2020 18:58:29 +0100

Hi Christian,

Makes sense to me.

Thanks,

Alex

On 12/12/20 1:14 PM, Christian Brauner wrote:
> On Thu, Dec 10, 2020 at 03:36:42PM +0100, Alejandro Colomar (man-pages) wrote:
>> Hi Christian,
>
> Hi Alex,
>
>>
>> Thanks for confirming that behavior.  Seems reasonable.
>>
>> I was wondering...
>> If this call is equivalent to unshare(2)+{close(2) in a loop},
>> shouldn't it fail for the same reasons those syscalls can fail?
>>
>> What about the following errors?:
>>
>> From unshare(2):
>>
>>        EPERM  The calling process did not have the required
>>               privileges for this operation.
>
> unshare(CLONE_FILES) doesn't require any privileges. Only flags relevant
> to kernel/nsproxy.c:unshare_nsproxy_namespaces() require privileges,
> i.e.
> CLONE_NEWNS
> CLONE_NEWUTS
> CLONE_NEWIPC
> CLONE_NEWNET
> CLONE_NEWPID
> CLONE_NEWCGROUP
> CLONE_NEWTIME
> so the permissions are the same.
>
>>
>> From close(2):
>>        EBADF  fd isn't a valid open file descriptor.
>>
>> OK, this one can't happen with the current code.
>> Let's say there are fds 1 to 10, and you call 'close_range(20,30,0)'.
>> It's a no-op (although it will still unshare if the flag is set).
>> But shouldn't it fail with EBADF?
>
> CLOSE_RANGE_UNSHARE should always give you a private file descriptor
> table independent of whether or not any file descriptors need to be
> closed. That's also how we documented the flag:
>
> /* Unshare the file descriptor table before closing file descriptors. */
> #define CLOSE_RANGE_UNSHARE	(1U << 1)
>
> A caller calling unshare(CLONE_FILES) and then an emulated close_range()
> or the proper close_range() syscall wants to make sure that all unwanted
> file descriptors are closed (if any) and that no new file descriptors
> can be injected afterwards. If you skip the unshare(CLONE_FILES) because
> there are no fds to be closed you open up a race window. It would also
> be annoying for userspace if they _may_ have received a private file
> descriptor table but only if any fds needed to be closed.
>
> If people really were extremely keen about skipping the unshare when no
> fd needs to be closed then this could become a new flag. But I really
> don't think that's necessary and also doesn't make a lot of sense, imho.
>
>>
>>        EINTR  The close() call was interrupted by a signal; see
>>               signal(7).
>>
>>        EIO    An I/O error occurred.
>>
>>        ENOSPC, EDQUOT
>>               On NFS, these errors are not normally reported against
>>               the first write which exceeds the available storage
>>               space, but instead against a subsequent write(2),
>>               fsync(2), or close().
>
> None of these will be seen by userspace because close_range() currently
> ignores all errors after it has begun closing files.
>
> Christian
>
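
For readers following the thread, the semantics described above can be
approximated in userspace.  This is only a rough sketch, not the kernel
implementation: emulated_close_range() and EMU_CLOSE_RANGE_UNSHARE are
hypothetical names made up for illustration, and capping the loop at
sysconf(_SC_OPEN_MAX) is a simplifying assumption (fds above the current
soft limit would be missed).  It shows the two points made in the reply:
the unshare happens unconditionally when the flag is set, and errors
from the individual close(2) calls are ignored.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <limits.h>
#include <sched.h>   /* unshare(), CLONE_FILES */
#include <unistd.h>  /* close(), sysconf() */

/* Hypothetical flag for this sketch; same value as the definition quoted above. */
#define EMU_CLOSE_RANGE_UNSHARE (1U << 1)

/*
 * Userspace approximation of close_range(first, last, flags).
 * Returns 0 on success, -1 with errno set on failure.
 */
static int emulated_close_range(unsigned int first, unsigned int last,
                                unsigned int flags)
{
	long open_max;
	unsigned int fd;

	/* Reject unknown flags and inverted ranges before doing anything. */
	if (flags & ~EMU_CLOSE_RANGE_UNSHARE) {
		errno = EINVAL;
		return -1;
	}
	if (first > last) {
		errno = EINVAL;
		return -1;
	}

	/*
	 * Always unshare when asked, even if no fd in the range is open:
	 * the caller wants a private fd table regardless.
	 */
	if (flags & EMU_CLOSE_RANGE_UNSHARE) {
		if (unshare(CLONE_FILES) == -1)
			return -1;
	}

	/* Assumption: no fds are open above the current soft limit. */
	open_max = sysconf(_SC_OPEN_MAX);
	if (open_max > 0 && last >= (unsigned long)open_max)
		last = (unsigned int)(open_max - 1);
	if (last > INT_MAX)
		last = INT_MAX;

	/* Errors from close(), including EBADF for unused slots, are ignored. */
	for (fd = first; fd <= last; fd++)
		close((int)fd);

	return 0;
}
```

With this sketch, a call such as emulated_close_range(20, 30, 0) on a
process that only has fds 0 to 10 open returns 0 rather than failing
with EBADF, which matches the no-op behavior discussed in the thread.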