References: <20230913152238.905247-1-mszeredi@redhat.com>
 <20230913152238.905247-3-mszeredi@redhat.com>
 <20230914-salzig-manifest-f6c3adb1b7b4@brauner>
In-Reply-To: <20230914-salzig-manifest-f6c3adb1b7b4@brauner>
From: Miklos Szeredi
Date: Thu, 14 Sep 2023 12:13:54 +0200
Subject: Re: [RFC PATCH 2/3] add statmnt(2) syscall
To: Christian Brauner
Cc: Miklos Szeredi, Linus Torvalds, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
 linux-man@vger.kernel.org, linux-security-module@vger.kernel.org,
 Karel Zak, Ian Kent, David Howells, Al Viro, Christian Brauner,
 Amir Goldstein

On Thu, 14 Sept 2023 at 11:28, Christian Brauner wrote:
>
> On Wed, Sep 13, 2023 at 05:22:35PM +0200, Miklos Szeredi wrote:
> > Add a way to query attributes of a single mount instead of having to parse
> > the complete /proc/$PID/mountinfo, which might be huge.
> >
> > Lookup the mount by the old (32bit) or new (64bit) mount ID. If a mount
> > needs to be queried based on path, then statx(2) can be used to first query
> > the mount ID belonging to the path.
> >
> > Design is based on a suggestion by Linus:
> >
> > "So I'd suggest something that is very much like "statfsat()", which gets
> > a buffer and a length, and returns an extended "struct statfs" *AND*
> > just a string description at the end."
>
> So what we agreed to at LSFMM was that we split filesystem option
> retrieval into a separate system call and just have a very focused
> statx() for mounts with just binary and non-variable sized information.
> We even gave David a hard time about this. :) I would really love if we
> could stick to that.
>
> Linus, I realize this was your suggestion a long time ago but I would
> really like us to avoid structs with variable sized fields at the end of
> a struct. That's just so painful for userspace and universally disliked.
> If you care I can even find the LSFMM video where we have users of that
> api requesting that we please don't do this. So it'd be great if you
> wouldn't insist on it.

I completely missed that.

What I'm thinking is making it even simpler for userspace:

struct statmnt {
	...
	char *mnt_root;
	char *mountpoint;
	char *fs_type;
	u32 num_opts;
	char *opts;
};

I'd still just keep options nul delimited.

Is there a good reason not to return pointers (pointing to within the
supplied buffer obviously) to userspace?
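
For illustration, here is a minimal userspace sketch of how the pointer-based layout with nul-delimited options might be consumed. It assumes the kernel stores num_opts strings back to back behind opts, each terminated by '\0'. The struct name statmnt_ptrs and the helper statmnt_opt() are hypothetical and not part of the patch; only the fields named above are shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical userspace view of the pointer-based proposal. Only the
 * fields named in the mail appear; the exact layout is an assumption.
 */
struct statmnt_ptrs {
	char *mnt_root;
	char *mountpoint;
	char *fs_type;
	uint32_t num_opts;
	char *opts;	/* assumed: num_opts strings, each ending in '\0' */
};

/* Return the idx-th option string, or NULL if idx is out of range. */
static const char *statmnt_opt(const struct statmnt_ptrs *sm, uint32_t idx)
{
	const char *p;

	assert(sm != NULL);
	if (idx >= sm->num_opts)
		return NULL;

	p = sm->opts;
	while (idx--)
		p += strlen(p) + 1;	/* skip one option plus its terminator */
	return p;
}
```

With this layout a caller needs no offset arithmetic for the fixed strings, but each lookup in opts still walks the preceding entries, so iterating once from the start is cheaper than repeated indexed access.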
>
> This will also allow us to turn statmnt() into an extensible argument
> system call versioned by size just like we do any new system calls with
> struct arguments (e.g., mount_setattr(), clone3(), openat2() and so on).
> Which is how we should do things like that.

The mask mechanism also allows versioning of the struct.

>
> Other than that I really think this is on track for what we ultimately
> want.
>
> > +struct stmt_str {
> > +	__u32 off;
> > +	__u32 len;
> > +};
> > +
> > +struct statmnt {
> > +	__u64 mask;		/* What results were written [uncond] */
> > +	__u32 sb_dev_major;	/* Device ID */
> > +	__u32 sb_dev_minor;
> > +	__u64 sb_magic;		/* ..._SUPER_MAGIC */
> > +	__u32 sb_flags;		/* MS_{RDONLY,SYNCHRONOUS,DIRSYNC,LAZYTIME} */
> > +	__u32 __spare1;
> > +	__u64 mnt_id;		/* Unique ID of mount */
> > +	__u64 mnt_parent_id;	/* Unique ID of parent (for root == mnt_id) */
> > +	__u32 mnt_id_old;	/* Reused IDs used in proc/.../mountinfo */
> > +	__u32 mnt_parent_id_old;
> > +	__u64 mnt_attr;		/* MOUNT_ATTR_... */
> > +	__u64 mnt_propagation;	/* MS_{SHARED,SLAVE,PRIVATE,UNBINDABLE} */
> > +	__u64 mnt_peer_group;	/* ID of shared peer group */
> > +	__u64 mnt_master;	/* Mount receives propagation from this ID */
> > +	__u64 propagate_from;	/* Propagation from in current namespace */
> > +	__u64 __spare[20];
> > +	struct stmt_str mnt_root;	/* Root of mount relative to root of fs */
> > +	struct stmt_str mountpoint;	/* Mountpoint relative to root of process */
> > +	struct stmt_str fs_type;	/* Filesystem type[.subtype] */
>
> I think if we want to do this here we should add:
>
> __u64 fs_type
> __u64 fs_subtype
>
> fs_type can just be our filesystem magic number and we introduce magic

It's already there: sb_magic.

However it's not a 1:1 mapping (ext* only has one magic).

> numbers for sub types as well. So we don't need to use strings here.

Ugh.

Thanks,
Miklos
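
For comparison with the pointer variant, here is a minimal sketch of how userspace might resolve one of the quoted struct stmt_str fields. It assumes that off counts bytes from the start of the buffer passed to the syscall and that len excludes any terminating nul; the quoted patch excerpt does not state either, so treat both as assumptions. The helper stmt_str_get() is hypothetical, and uint32_t stands in for __u32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the quoted struct stmt_str; uint32_t stands in for __u32. */
struct stmt_str {
	uint32_t off;
	uint32_t len;
};

/*
 * Resolve an (off, len) pair to a pointer inside buf.
 * Assumption: off is relative to the start of buf. Returns NULL when
 * the described range does not fit inside the buffer.
 */
static const char *stmt_str_get(const void *buf, size_t bufsize,
				struct stmt_str s)
{
	assert(buf != NULL);
	if (s.off > bufsize || s.len > bufsize - s.off)
		return NULL;
	return (const char *)buf + s.off;
}
```

The bounds check is the extra work that the offset form pushes onto every caller, which is part of what returning ready-made pointers would avoid.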