Date: Thu, 24 Mar 2022 09:58:43 +1100
From: Dave Chinner
To: Miklos Szeredi
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-api@vger.kernel.org, linux-man@vger.kernel.org,
    linux-security-module@vger.kernel.org, Karel Zak, Ian Kent,
    David Howells, Linus Torvalds, Al Viro, Christian Brauner,
    Amir Goldstein, James Bottomley
Subject: Re: [RFC PATCH] getvalues(2) prototype
Message-ID: <20220323225843.GI1609613@dread.disaster.area>
References: <20220322192712.709170-1-mszeredi@redhat.com>
In-Reply-To: <20220322192712.709170-1-mszeredi@redhat.com>

On Tue, Mar 22, 2022 at 08:27:12PM +0100, Miklos Szeredi wrote:
> Add a new userspace API that allows getting multiple short values in a
> single syscall.
>
> This would be useful for the following reasons:
>
> - Calling open/read/close for many small files is inefficient.  E.g. on my
>   desktop invoking lsof(1) results in ~60k open + read + close calls under
>   /proc and 90% of those are 128 bytes or less.

How does doing the open/read/close in a single syscall make this any
more efficient? All it saves is the overhead of a couple of
syscalls; it doesn't reduce any of the setup or teardown overhead
needed to read the data itself...

> - Interfaces for getting various attributes and statistics are fragmented.
>   For files we have basic stat, statx, extended attributes, file attributes
>   (for which there are two overlapping ioctl interfaces).  For mounts and
>   superblocks we have stat*fs as well as /proc/$PID/{mountinfo,mountstats}.
>   The latter also has the problem of not allowing queries on a specific
>   mount.

https://xkcd.com/927/

> - Some attributes are cheap to generate, some are expensive.  Allowing
>   userspace to select which ones it needs should allow optimizing queries.
>
> - Adding an ascii namespace should allow easy extension and self
>   description.
>
> - The values can be text or binary, whichever fits best.
>
> The interface definition is:
>
> struct name_val {
>         const char *name;       /* in */
>         struct iovec value_in;  /* in */
>         struct iovec value_out; /* out */
>         uint32_t error;         /* out */
>         uint32_t reserved;
> };

Ahhh, XFS_IOC_ATTRMULTI_BY_HANDLE reborn. This is how xfsdump gets
and sets attributes efficiently when dumping and restoring files -
it's an interface that allows batches of xattr operations to be run
on a file in a single syscall.

I've said in the past when discussing things like statx() that maybe
everything should be addressable via the xattr namespace and
set/queried via xattr names regardless of how the filesystem stores
the data. The VFS/filesystem simply translates the name to the
storage location of the information. It might be held in xattrs, but
it could just be a flag bit in an inode field. Then we just get
named xattrs in batches from an open fd.

> int getvalues(int dfd, const char *path, struct name_val *vec, size_t num,
>               unsigned int flags);
>
> @dfd and @path are used to lookup object $ORIGIN.  @vec contains @num
> name/value descriptors.  @flags contains lookup flags for @path.
>
> The syscall returns the number of values filled or an error.
>
> A single name/value descriptor has the following fields:
>
> @name describes the object whose value is to be returned.  E.g.
>
> mnt                    - list of mount parameters
> mnt:mountpoint         - the mountpoint of the mount of $ORIGIN
> mntns                  - list of mount ID's reachable from the current root
> mntns:21:parentid      - parent ID of the mount with ID of 21
> xattr:security.selinux - the security.selinux extended attribute
> data:foo/bar           - the data contained in file $ORIGIN/foo/bar

How are these different from just declaring new xattr namespaces for
these things? E.g. open any file and list the xattrs in the
xattr:mount.mnt namespace to get the list of mount parameters for
that mount.
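
To make the "list a namespace on an open fd" idea concrete, here is a
minimal sketch using only the existing flistxattr(2) and fgetxattr(2)
calls. It is an illustration, not code from the patch or from XFS. The
helper names (name_len, has_prefix, count_prefixed, dump_xattrs) are
made up for this example, and a "mount." namespace is purely
hypothetical: nothing like it exists today, so in practice the prefix
would have to be an existing one such as "user.".

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Length of the name at p, looking at no more than max bytes. */
static size_t name_len(const char *p, size_t max)
{
	const char *end = memchr(p, '\0', max);

	return end ? (size_t)(end - p) : max;
}

/* Does the name (nlen bytes, NUL not counted) start with prefix? */
static int has_prefix(const char *name, size_t nlen, const char *prefix)
{
	size_t plen = strlen(prefix);

	return nlen >= plen && memcmp(name, prefix, plen) == 0;
}

/*
 * Count names starting with prefix in a flistxattr(2)-style buffer:
 * NUL-terminated strings packed back to back, len bytes in total.
 */
static size_t count_prefixed(const char *names, size_t len, const char *prefix)
{
	size_t off = 0, n = 0;

	assert(names != NULL && prefix != NULL);
	while (off < len) {
		size_t nlen = name_len(names + off, len - off);

		n += has_prefix(names + off, nlen, prefix);
		off += nlen + 1;
	}
	return n;
}

/*
 * Print every xattr on fd whose name starts with prefix.
 * Returns the number printed, or -1 if the names cannot be listed.
 * Values are printed as text; binary values would need other handling.
 */
static int dump_xattrs(int fd, const char *prefix)
{
	ssize_t len = flistxattr(fd, NULL, 0);	/* probe required size */
	char *names;
	int printed = 0;

	if (len < 0)
		return -1;
	if (len == 0)
		return 0;
	names = malloc((size_t)len);
	if (!names)
		return -1;
	len = flistxattr(fd, names, (size_t)len);
	if (len < 0) {
		free(names);
		return -1;
	}
	for (size_t off = 0; off < (size_t)len; ) {
		const char *name = names + off;
		size_t nlen = name_len(name, (size_t)len - off);
		char value[256];
		ssize_t vlen;

		off += nlen + 1;
		if (!has_prefix(name, nlen, prefix))
			continue;
		/* One syscall per attribute: the cost a batch API removes. */
		vlen = fgetxattr(fd, name, value, sizeof(value));
		if (vlen < 0)
			continue;	/* e.g. value larger than the buffer */
		printf("%s = %.*s\n", name, (int)vlen, value);
		printed++;
	}
	free(names);
	return printed;
}
```

A caller would open any file and run something like
dump_xattrs(fd, "user."). Note that this still issues one fgetxattr()
per attribute, which is exactly the per-value overhead a batched xattr
interface would fold into a single call.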

Why do we need a new "xattr in everything but name" interface when
we could just extend the one we've already got and formalise a new,
cleaner version of xattr batch APIs that have been around for 20-odd
years already?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com