References: <20220623220613.3014268-1-kaleshsingh@google.com> <20220623220613.3014268-2-kaleshsingh@google.com>
From: Kalesh Singh
Date: Thu, 30 Jun 2022 14:30:14 -0700
Subject: Re: [PATCH v2 1/2] procfs: Add 'size' to /proc//fdinfo/
To: Brian Foster
Cc: Christian König, Christian König, Alexander Viro, Christoph Hellwig,
    Stephen Brennan, David.Laight@aculab.com, Ioannis Ilkos, "T.J. Mercier",
    Suren Baghdasaryan, "Cc: Android Kernel", Jonathan Corbet, Sumit Semwal,
    Andrew Morton, Johannes Weiner, Christoph Anton Mitterer, Paul Gortmaker,
    Mike Rapoport, Randy Dunlap, LKML, linux-fsdevel,
    "open list:DOCUMENTATION", Linux Media Mailing List, DRI mailing list,
    "moderated list:DMA BUFFER SHARING FRAMEWORK"
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jun 30, 2022 at 5:03 AM Brian Foster wrote:
>
> On Thu, Jun 30, 2022 at 07:48:46AM -0400, Brian Foster wrote:
> > On Wed, Jun 29, 2022 at 01:43:11PM -0700, Kalesh Singh wrote:
> > > On Wed, Jun 29, 2022 at 5:23 AM Brian Foster wrote:
> > > >
> > > > On Tue, Jun 28, 2022 at 03:38:02PM -0700, Kalesh Singh wrote:
> > > > > On Tue, Jun 28, 2022 at 4:54 AM Brian Foster wrote:
> > > > > >
> > > > > > On Thu, Jun 23, 2022 at 03:06:06PM -0700, Kalesh Singh wrote:
> > > > > > > To be able to account the amount of memory a process is keeping pinned
> > > > > > > by open file descriptors add a 'size' field to fdinfo output.
> > > > > > >
> > > > > > > dmabufs fds already expose a 'size' field for this reason, remove this
> > > > > > > and make it a common field for all fds. This allows tracking of
> > > > > > > other types of memory (e.g. memfd and ashmem in Android).
> > > > > > >
> > > > > > > Signed-off-by: Kalesh Singh
> > > > > > > Reviewed-by: Christian König
> > > > > > > ---
> > > > > > >
> > > > > > > Changes in v2:
> > > > > > >   - Add Christian's Reviewed-by
> > > > > > >
> > > > > > > Changes from rfc:
> > > > > > >   - Split adding 'size' and 'path' into a separate patches, per Christian
> > > > > > >   - Split fdinfo seq_printf into separate lines, per Christian
> > > > > > >   - Fix indentation (use tabs) in documentaion, per Randy
> > > > > > >
> > > > > > >  Documentation/filesystems/proc.rst | 12 ++++++++++--
> > > > > > >  drivers/dma-buf/dma-buf.c          |  1 -
> > > > > > >  fs/proc/fd.c                       |  9 +++++----
> > > > > > >  3 files changed, 15 insertions(+), 7 deletions(-)
> > > > > > >
> > > > > > ...
> > > > > >
> > > > > > Also not sure if it matters that much for your use case, but something
> > > > > > worth noting at least with shmem is that one can do something like:
> > > > > >
> > > > > > # cat /proc/meminfo | grep Shmem:
> > > > > > Shmem:               764 kB
> > > > > > # xfs_io -fc "falloc -k 0 10m" ./file
> > > > > > # ls -alh file
> > > > > > -rw-------. 1 root root 0 Jun 28 07:22 file
> > > > > > # stat file
> > > > > >   File: file
> > > > > >   Size: 0    Blocks: 20480    IO Block: 4096    regular empty file
> > > > > > # cat /proc/meminfo | grep Shmem:
> > > > > > Shmem:             11004 kB
> > > > > >
> > > > > > ... where the resulting memory usage isn't reflected in i_size (but is
> > > > > > is in i_blocks/bytes).
> > > > >
> > > > > I tried a similar experiment a few times, but I don't see the same
> > > > > results. In my case, there is not any change in shmem. IIUC the
> > > > > fallocate is allocating the disk space not shared memory.
> > > > >
> > > >
> > > > Sorry, it was implied in my previous test was that I was running against
> > > > tmpfs. So regardless of fs, the fallocate keep_size semantics shown in
> > > > both cases is as expected: the underlying blocks are allocated and the
> > > > inode size is unchanged.
> > > >
> > > > What wasn't totally clear to me when I read this patch was 1. whether
> > > > tmpfs refers to Shmem and 2. whether tmpfs allowed this sort of
> > > > operation. The test above seems to confirm both, however, right? E.g., a
> > > > more detailed example:
> > > >
> > > > # mount | grep /tmp
> > > > tmpfs on /tmp type tmpfs (rw,nosuid,nodev,seclabel,nr_inodes=1048576,inode64)
> > > > # cat /proc/meminfo | grep Shmem:
> > > > Shmem:              5300 kB
> > > > # xfs_io -fc "falloc -k 0 1g" /tmp/file
> > > > # stat /tmp/file
> > > >   File: /tmp/file
> > > >   Size: 0    Blocks: 2097152    IO Block: 4096    regular empty file
> > > > Device: 22h/34d    Inode: 45    Links: 1
> > > > Access: (0600/-rw-------)  Uid: ( 0/ root)  Gid: ( 0/ root)
> > > > Context: unconfined_u:object_r:user_tmp_t:s0
> > > > Access: 2022-06-29 08:04:01.301307154 -0400
> > > > Modify: 2022-06-29 08:04:01.301307154 -0400
> > > > Change: 2022-06-29 08:04:01.451312834 -0400
> > > >  Birth: 2022-06-29 08:04:01.301307154 -0400
> > > > # cat /proc/meminfo | grep Shmem:
> > > > Shmem:           1053876 kB
> > > > # rm -f /tmp/file
> > > > # cat /proc/meminfo | grep Shmem:
> > > > Shmem:              5300 kB
> > > >
> > > > So clearly this impacts Shmem.. was your test run against tmpfs or some
> > > > other (disk based) fs?
> > >
> > > Hi Brian,
> > >
> > > Thanks for clarifying. My issue was tmpfs not mounted at /tmp in my system:
> > >
> > > ==> meminfo.start <==
> > > Shmem:               572 kB
> > > ==> meminfo.stop <==
> > > Shmem:             51688 kB
> > >
> >
> > Ok, makes sense.
> >
> > > >
> > > > FWIW, I don't have any objection to exposing inode size if it's commonly
> > > > useful information. My feedback was more just an fyi that i_size doesn't
> > > > necessarily reflect underlying space consumption (whether it's memory or
> > > > disk space) in more generic cases, because it sounds like that is really
> > > > what you're after here.
> > > > The opposite example to the above would be
> > > > something like an 'xfs_io -fc "truncate 1t" /tmp/file', which shows a
> > > > 1TB inode size with zero additional shmem usage.
> > >
> > > From these cases, it seems the more generic way to do this is by
> > > calculating the actual size consumed using the blocks. (i_blocks *
> > > 512). So in the latter example 'xfs_io -fc "truncate 1t" /tmp/file'
> > > the size consumed would be zero. Let me know if it sounds ok to you
> > > and I can repost the updated version.
> > >
> >
> > That sounds a bit more useful to me if you're interested in space usage,
> > or at least I don't have a better idea for you. ;)
> >
> > One thing to note is that I'm not sure whether all fs' use i_blocks
> > reliably. E.g., XFS populates stat->blocks via a separate block counter
> > in the XFS specific inode structure (see xfs_vn_getattr()). A bunch of
> > other fs' seem to touch it so perhaps that is just an outlier. You could
> > consider fixing that up, perhaps make a ->getattr() call to avoid it, or
> > just use the field directly if it's useful enough as is and there are no
> > other objections. Something to think about anyways..
> >

Hi Brian,

Thanks for pointing it out. Let me take a look into the xfs case.

>
> Oh, I wonder if you're looking for similar "file rss" information this
> series wants to collect/expose..?
>
> https://lore.kernel.org/linux-fsdevel/20220624080444.7619-1-christian.koenig@amd.com/#r

Christian's series seems to have some overlap with what we want to
achieve here. IIUC it exposes the information on the per process
granularity. Perhaps if that approach is agreed on, I think we can use
the file_rss() f_op to expose the per file size in the fdinfo for the
cases where the i_blocks are unreliable.
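
For anyone following along from userspace, the i_size vs. allocated
blocks distinction discussed above can be observed through stat(2).
Below is a minimal sketch, assuming Python 3 on Linux; the helper names
(apparent_size, allocated_bytes, make_sparse_file) are hypothetical and
not part of the patch. It only mirrors the "truncate" case from this
thread, and the allocated value it prints depends on the filesystem.

```python
import os
import tempfile


def apparent_size(path):
    """Return the inode size in bytes, i.e. what st_size (i_size) reports."""
    return os.stat(path).st_size


def allocated_bytes(path):
    """Return the allocated space in bytes; st_blocks is in 512-byte units."""
    return os.stat(path).st_blocks * 512


def make_sparse_file(directory, length):
    """Create a file of the given size without writing any data to it."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Comparable to 'xfs_io -fc "truncate <length>"': size grows,
        # but no blocks need to be allocated.
        os.ftruncate(fd, length)
    finally:
        os.close(fd)
    return path


if __name__ == "__main__":
    path = make_sparse_file(tempfile.gettempdir(), 1 << 20)
    try:
        print("size:", apparent_size(path))
        print("allocated:", allocated_bytes(path))
    finally:
        os.unlink(path)
```

The opposite (falloc -k) case is not covered here, since it needs
fallocate(2) with FALLOC_FL_KEEP_SIZE, which the Python standard
library does not expose directly.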

Thanks,
Kalesh

>
> Brian
>
> > Brian
> >
> > > Thanks,
> > > Kalesh
> > >
> > > >
> > > > Brian
> > > >
> > > > > cat /proc/meminfo > meminfo.start
> > > > > xfs_io -fc "falloc -k 0 50m" ./xfs_file
> > > > > cat /proc/meminfo > meminfo.stop
> > > > > tail -n +1 meminfo.st* | grep -i '==\|Shmem:'
> > > > >
> > > > > ==> meminfo.start <==
> > > > > Shmem:               484 kB
> > > > > ==> meminfo.stop <==
> > > > > Shmem:               484 kB
> > > > >
> > > > > ls -lh xfs_file
> > > > > -rw------- 1 root root 0 Jun 28 15:12 xfs_file
> > > > >
> > > > > stat xfs_file
> > > > >   File: xfs_file
> > > > >   Size: 0    Blocks: 102400    IO Block: 4096    regular empty file
> > > > >
> > > > > Thanks,
> > > > > Kalesh
> > > > >
> > > > > >
> > > > > > Brian
> > > > > >
> > > > > > >
> > > > > > >       /* show_fd_locks() never deferences files so a stale value is safe */
> > > > > > >       show_fd_locks(m, file, files);
> > > > > > > --
> > > > > > > 2.37.0.rc0.161.g10f37bed90-goog
> > > > > > >