From: Christian Brauner
Date: Thu, 3 May 2018 15:04:36 +0200
Subject: Re: [PATCH 0/6 resend] statfs: handle mount propagation
To: Al Viro
Cc: Linus Torvalds, linux-fsdevel@vger.kernel.org, Linux Kernel Mailing List, hch@infradead.org, tglx@linutronix.de, kstewart@linuxfoundation.org, Greg KH, pombredanne@nexb.com, Linux API
In-Reply-To: <20180502160939.GU30522@ZenIV.linux.org.uk>
References: <20180502154239.14013-1-christian.brauner@ubuntu.com> <20180502160939.GU30522@ZenIV.linux.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, May 02, 2018 at 05:09:39PM +0100, Al Viro wrote:
> On Wed, May 02, 2018 at 05:42:33PM +0200, Christian Brauner wrote:
> > Hey,
> >
> > This is the second resend of this patchset. I'm not sure whether it has
> > simply been overlooked but the number of people get_maintainer.pl was
> > rather small and seemed a little random so I added Linus and Christoph,
> > two people I know that do look at VFS stuff at least from time to time,
> > although they weren't listed by get_maintainer.pl. I hope that's ok.
> >
> > This little series
> > - unifies the definition of constants in statfs.h and fs.h
> >   *Note, that Andreas has expressed doubts whether this unification is
> >   useful. Please see https://lkml.org/lkml/2018/4/13/571 .
> >   I still think it is but I'm happy to drop these two patches if others
> >   agree.*
> > - extends statfs to handle mount propagation. This will let userspace
> >   easily query a given mountpoint for MS_UNBINDABLE, MS_SHARED,
> >   MS_PRIVATE and MS_SLAVE without always having to do costly parsing of
> >   /proc//mountinfo.
> >   To this end the flags:
> >   - ST_UNBINDABLE
> >   - ST_SHARED
> >   - ST_PRIVATE
> >   - ST_SLAVE
> >   are added. They have the same value as their MS_* counterparts.
>
> How about some rationale for that in the first place? statfs() looks like
> a bad match for that - not to mention anything else, there's no way to
> get anything beyond "it is a peer of something", not even "do these two
> get propagation between them". What would be using that, what would the
> userland side of users look like, etc...

Ok, sorry if I wasn't detailed enough. From a userspace perspective we
often run into the case where we simply want to know whether a given
mountpoint is MS_SHARED or MS_SLAVE. If it is, we remount it as
MS_PRIVATE to prevent any propagation from happening. We don't care about
the peer relationship or how exactly the propagation is set up. We only
want to prevent any propagation from happening.

The above case is what I see most often. A more specific use-case is to
differentiate between MS_SLAVE and MS_SHARED mountpoints. Mountpoints
that are MS_SLAVE are kept intact and mountpoints that are MS_SHARED are
made MS_PRIVATE.

For both cases the only way to do this right now is by parsing
/proc//mountinfo. Yes, it is doable, but it is still somewhat costly and
annoying since, e.g., the mount propagation fields are optional. Another
problem is scenarios where propagation matters but /proc is not mounted.

Here are three concrete use-cases I run into frequently:
- (*buzzword alarm*) container runtimes:
  They usually clone(CLONE_NEWNS) and inherit the mount table but want to
  remount specific mountpoints to prevent propagation.
  Again, the specifics of propagation usually don't matter in this case.
- Sharing shared mountpoints:
  Sometimes it is desirable to share shared mountpoints between
  namespaces. Say, the first process sets up the shared mountpoint on the
  host and all the other ones want to know "Did someone already make this
  rshared or do I need to set this up?". It would be very helpful if you
  could just check a mount by calling statvfs().
- I want to know whether a mountpoint is unbindable so that I can, e.g.,
  skip it or remount it.

As for how this would work, what I expect, and tested with this patch,
is e.g. this:

	/* needs <sys/statvfs.h> and <sys/mount.h>; ST_SHARED is added by this series */
	int ret;
	const char *s = "/some/path";
	struct statvfs sb;

	ret = statvfs(s, &sb);
	if (ret < 0)
		return -1;

	if (sb.f_flag & ST_SHARED) {
		/* demote the shared mount to a slave, recursively */
		ret = mount("", s, NULL, MS_SLAVE | MS_REC, NULL);
		if (ret < 0)
			return -1;
	}

>
> And in any case linux-api should've been Cc'd. I'm not saying that this

Ah, thanks! Sorry I missed this.

> (or something similar) would be an inherently bad idea, but the question
> "why this way?" deserves a bit more than "parsing is costly"...

I hope some of the above helped to clarify this.

Thanks!
Christian
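
For comparison, here is a rough sketch of what the mountinfo-based check amounts to today. It is only an illustration and not code from the patchset: the function name mountinfo_propagation() and the PROP_* constants are made up for this example, and it assumes the caller has already read one line of /proc/self/mountinfo and matched it to the mountpoint it cares about. The point is that the propagation tags sit in a variable-length list of optional fields that has to be walked up to the "-" separator, and a private mount is signalled only by the absence of any tag.

```c
#include <string.h>

/* Hypothetical flag names for this sketch only; not kernel or libc constants. */
enum {
	PROP_SHARED     = 1 << 0,
	PROP_SLAVE      = 1 << 1,
	PROP_UNBINDABLE = 1 << 2,
};

/*
 * Parse one line of a mountinfo file and return the propagation bits
 * found in its optional fields, 0 for a private mount, or -1 if the
 * line is malformed.
 */
int mountinfo_propagation(const char *line)
{
	const char *p = line;
	int flags = 0;

	/* Skip the six mandatory leading fields (IDs, dev, root, mountpoint, options). */
	for (int i = 0; i < 6; i++) {
		p = strchr(p, ' ');
		if (!p)
			return -1;
		p++;
	}

	/* Walk the optional fields until the "-" separator is reached. */
	for (;;) {
		size_t len = strcspn(p, " \n");

		if (len == 0)
			return -1;
		if (len == 1 && p[0] == '-')
			return flags;

		if (len > 7 && !strncmp(p, "shared:", 7))
			flags |= PROP_SHARED;
		else if (len > 7 && !strncmp(p, "master:", 7))
			flags |= PROP_SLAVE;
		else if (len == 10 && !strncmp(p, "unbindable", 10))
			flags |= PROP_UNBINDABLE;

		p += len;
		if (*p != ' ')
			return -1; /* line ended before the separator */
		p++;
	}
}
```

A caller would still have to open the file, read it line by line and compare the mountpoint field before getting this far, and none of it works when /proc is not mounted, which is the overhead a flag in f_flag would avoid.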