Date: Tue, 3 Mar 2020 11:00:45 +0100
From: Christian Brauner
To: Miklos Szeredi
Cc: David Howells, Ian Kent, James Bottomley, Steven Whitehouse,
    Miklos Szeredi, viro, Christian Brauner, Jann Horn,
    "Darrick J. Wong", Linux API, linux-fsdevel, lkml, Greg Kroah-Hartman
Subject: Re: [PATCH 00/17] VFS: Filesystem information and notifications [ver #17]
Message-ID: <20200303100045.zqntjjjv6npvs5zl@wittgenstein>
References: <1582644535.3361.8.camel@HansenPartnership.com>
    <20200228155244.k4h4hz3dqhl7q7ks@wittgenstein>
    <107666.1582907766@warthog.procyon.org.uk>
    <0403cda7345e34c800eec8e2870a1917a8c07e5c.camel@themaw.net>
    <1509948.1583226773@warthog.procyon.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 03, 2020 at 10:26:21AM +0100, Miklos Szeredi wrote:
> On Tue, Mar 3, 2020 at 10:13 AM David Howells wrote:
> >
> > Miklos Szeredi wrote:
> >
> > > I'm doing a patch. Let's see how it fares in the face of all these
> > > preconceptions.
> >
> > Don't forget the efficiency criterion. One reason for going with fsinfo(2) is
> > that scanning /proc/mounts when there are a lot of mounts in the system is
> > slow (not to mention the global lock that is held during the read).
> >
> > Now, going with sysfs files on top of procfs links might avoid the global
> > lock, and you can avoid rereading the options string if you export a change
> > notification, but you're going to end up injecting a whole lot of pathwalk
> > latency into the system.
>
> Completely irrelevant. Cached lookup is so much optimized, that you
> won't be able to see any of it.
>
> No, I don't think this is going to be a performance issue at all, but
> if anything we could introduce a syscall
>
>   ssize_t readfile(int dfd, const char *path, char *buf,
>                    size_t bufsize, int flags);
>
> that is basically the equivalent of open + read + close, or even a
> vectored variant that reads multiple files. But that's off topic
> again, since I don't think there's going to be any performance issue
> even with plain I/O syscalls.
>
> >
> > On top of that, it isn't going to help with the case that I'm working towards
> > implementing where a container manager can monitor for mounts taking place
> > inside the container and supervise them. What I'm proposing is that during
> > the action phase (eg. FSCONFIG_CMD_CREATE), fsconfig() would hand an fd
> > referring to the context under construction to the manager, which would then
> > be able to call fsinfo() to query it and fsconfig() to adjust it, reject it or
> > permit it. Something like:
> >
> >         fd = receive_context_to_supervise();
> >         struct fsinfo_params params = {
> >                 .flags          = FSINFO_FLAGS_QUERY_FSCONTEXT,
> >                 .request        = FSINFO_ATTR_SB_OPTIONS,
> >         };
> >         fsinfo(fd, NULL, &params, sizeof(params), buffer, sizeof(buffer));
> >         supervise_parameters(buffer);
> >         fsconfig(fd, FSCONFIG_SET_FLAG, "hard", NULL, 0);
> >         fsconfig(fd, FSCONFIG_SET_STRING, "vers", "4.2", 0);
> >         fsconfig(fd, FSCONFIG_CMD_SUPERVISE_CREATE, NULL, NULL, 0);
> >         struct fsinfo_params params = {
> >                 .flags          = FSINFO_FLAGS_QUERY_FSCONTEXT,
> >                 .request        = FSINFO_ATTR_SB_NOTIFICATIONS,
> >         };
> >         struct fsinfo_sb_notifications sbnotify;
> >         fsinfo(fd, NULL, &params, sizeof(params), &sbnotify, sizeof(sbnotify));
> >         watch_super(fd, "", AT_EMPTY_PATH, watch_fd, 0x03);
> >         fsconfig(fd, FSCONFIG_CMD_SUPERVISE_PERMIT, NULL, NULL, 0);
> >         close(fd);
> >
> > However, the supervised mount may be happening in a completely different set
> > of namespaces, in which case the supervisor presumably wouldn't be able to see
> > the links in procfs and the relevant portions of sysfs.
>
> It would be a "jump" link to the otherwise invisible directory.

More magic links to beam you around sounds like a bad idea. We had a
bunch of CVEs around them in containers and they were one of the major
reasons behind us pushing for openat2(). That's why it has a
RESOLVE_NO_MAGICLINKS flag.

Christian