References: <20220519214021.3572840-1-kaleshsingh@google.com> <4e35dc30-1157-50b3-e3b6-954481a0524d@amd.com>
In-Reply-To: <4e35dc30-1157-50b3-e3b6-954481a0524d@amd.com>
From: Kalesh Singh
Date: Fri, 20 May 2022 09:12:51 -0700
Subject: Re: [RFC PATCH] procfs: Add file path and size to /proc/<pid>/fdinfo
To: Christian König
Cc: Ioannis Ilkos, "T.J. Mercier", Suren Baghdasaryan, "Cc: Android Kernel",
    Jonathan Corbet, Sumit Semwal, Andrew Morton, Christoph Anton Mitterer,
    Kees Cook, Mike Rapoport, Colin Cross, Randy Dunlap, LKML, linux-fsdevel,
    "open list:DOCUMENTATION", Linux Media Mailing List, DRI mailing list,
    "moderated list:DMA BUFFER SHARING FRAMEWORK"
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, May 19, 2022 at 11:29 PM Christian König wrote:
>
> On 19.05.22 at 23:40, Kalesh Singh wrote:
> > Processes can pin shared memory by keeping a handle to it through a
> > file descriptor; for instance dmabufs, memfd, and ashmem (in Android).
> >
> > In the case of a memory leak, to identify the process pinning the
> > memory, userspace needs to:
> >    - Iterate the /proc/<pid>/fd/* for each process
> >    - Do a readlink on each entry to identify the type of memory from
> >      the file path.
> >    - stat() each entry to get the size of the memory.
> >
> > The file permissions on /proc/<pid>/fd/* only allow the owner
> > or root to perform the operations above; and so is not suitable for
> > capturing the system-wide state in a production environment.
> >
> > This issue was addressed for dmabufs by making /proc/*/fdinfo/*
> > accessible to a process with PTRACE_MODE_READ_FSCREDS credentials [1].
> > To allow the same kind of tracking for other types of shared memory,
> > add the following fields to /proc/<pid>/fdinfo/<fd>:
> >
> > path - This allows identifying the type of memory based on common
> >        prefixes: e.g. "/memfd...", "/dmabuf...", "/dev/ashmem..."
> >
> >        This was not an issue when dmabuf tracking was introduced
> >        because the exp_name field of dmabuf fdinfo could be used
> >        to distinguish dmabuf fds from other types.
> >
> > size - To track the amount of memory that is being pinned.
> >
> >        dmabufs expose size as an additional field in fdinfo. Remove
> >        this and make it a common field for all fds.
> >
> > Access to /proc/<pid>/fdinfo is governed by PTRACE_MODE_READ_FSCREDS
> > -- the same as for /proc/<pid>/maps which also exposes the path and
> > size for mapped memory regions.
> >
> > This allows for a system process with PTRACE_MODE_READ_FSCREDS to
> > account the pinned per-process memory via fdinfo.
>
> I think this should be split into two patches, one adding the size and
> one adding the path.
>
> Adding the size is completely unproblematic, but the path might raise
> some eyebrows.

Hi Christian,

Thanks for reviewing. "path" is exposed under the same ptrace access
check (PTRACE_MODE_READ_FSCREDS) as in /proc/pid/maps. If we want to be
more cautious, we could add "path" only for the applicable anon inodes
(dmabuf, memfd, ...). But I would prefer to keep it generic if no one
sees an issue with that.

> >
> > [1] https://lore.kernel.org/lkml/20210308170651.919148-1-kaleshsingh@google.com/
> >
> > Signed-off-by: Kalesh Singh
> > ---
> >   Documentation/filesystems/proc.rst | 22 ++++++++++++++++++++--
> >   drivers/dma-buf/dma-buf.c          |  1 -
> >   fs/proc/fd.c                       |  9 +++++++--
> >   3 files changed, 27 insertions(+), 5 deletions(-)
> >
> > diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> > index 061744c436d9..ad66d78aca51 100644
> > --- a/Documentation/filesystems/proc.rst
> > +++ b/Documentation/filesystems/proc.rst
> > @@ -1922,13 +1922,16 @@ if precise results are needed.
> >   3.8 /proc/<pid>/fdinfo/<fd> - Information about opened file
> >   ---------------------------------------------------------------
> >   This file provides information associated with an opened file. The regular
> > -files have at least four fields -- 'pos', 'flags', 'mnt_id' and 'ino'.
> > +files have at least six fields -- 'pos', 'flags', 'mnt_id', 'ino', 'size',
> > +and 'path'.
> > +
> >   The 'pos' represents the current offset of the opened file in decimal
> >   form [see lseek(2) for details], 'flags' denotes the octal O_xxx mask the
> >   file has been created with [see open(2) for details] and 'mnt_id' represents
> >   mount ID of the file system containing the opened file [see 3.5
> >   /proc/<pid>/mountinfo for details]. 'ino' represents the inode number of
> > -the file.
> > +the file, 'size' represents the size of the file in bytes, and 'path'
> > +represents the file path.
> >
> >   A typical output is::
> >
> > @@ -1936,6 +1939,8 @@ A typical output is::
> >          flags:  0100002
> >          mnt_id: 19
> >          ino:    63107
> > +        size:   0
> > +        path:   /dev/null
> >
> >   All locks associated with a file descriptor are shown in its fdinfo too::
> >
> > @@ -1953,6 +1958,8 @@ Eventfd files
> >          flags:  04002
> >          mnt_id: 9
> >          ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[eventfd]
> >          eventfd-count:  5a
> >
> >   where 'eventfd-count' is hex value of a counter.
> > @@ -1966,6 +1973,8 @@ Signalfd files
> >          flags:  04002
> >          mnt_id: 9
> >          ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[signalfd]
> >          sigmask:        0000000000000200
> >
> >   where 'sigmask' is hex value of the signal mask associated
> > @@ -1980,6 +1989,8 @@ Epoll files
> >          flags:  02
> >          mnt_id: 9
> >          ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[eventpoll]
> >          tfd:        5 events:       1d data: ffffffffffffffff pos:0 ino:61af sdev:7
> >
> >   where 'tfd' is a target file descriptor number in decimal form,
> > @@ -1998,6 +2009,8 @@ For inotify files the format is the following::
> >          flags:  02000000
> >          mnt_id: 9
> >          ino:    63107
> > +        size:   0
> > +        path:   anon_inode:inotify
> >          inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:7e9e0000640d1b6d
> >
> >   where 'wd' is a watch descriptor in decimal form, i.e. a target file
> > @@ -2021,6 +2034,8 @@ For fanotify files the format is::
> >          flags:  02
> >          mnt_id: 9
> >          ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[fanotify]
> >          fanotify flags:10 event-flags:0
> >          fanotify mnt_id:12 mflags:40 mask:38 ignored_mask:40000003
> >          fanotify ino:4f969 sdev:800013 mflags:0 mask:3b ignored_mask:40000000 fhandle-bytes:8 fhandle-type:1 f_handle:69f90400c275b5b4
> > @@ -2046,6 +2061,8 @@ Timerfd files
> >          flags:  02
> >          mnt_id: 9
> >          ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[timerfd]
> >          clockid: 0
> >          ticks: 0
> >          settime flags: 01
> > @@ -2070,6 +2087,7 @@ DMA Buffer files
> >          mnt_id: 9
> >          ino:    63107
> >          size:   32768
> > +        path:   /dmabuf:
> >          count:  2
> >          exp_name:  system-heap
> >
> > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > index b1e25ae98302..d61183ff3c30 100644
> > --- a/drivers/dma-buf/dma-buf.c
> > +++ b/drivers/dma-buf/dma-buf.c
> > @@ -377,7 +377,6 @@ static void dma_buf_show_fdinfo(struct seq_file *m, struct file *file)
> >   {
> >          struct dma_buf *dmabuf = file->private_data;
> >
> > -        seq_printf(m, "size:\t%zu\n", dmabuf->size);
> >          /* Don't count the temporary reference taken inside procfs seq_show */
> >          seq_printf(m, "count:\t%ld\n", file_count(dmabuf->file) - 1);
> >          seq_printf(m, "exp_name:\t%s\n", dmabuf->exp_name);
> > diff --git a/fs/proc/fd.c b/fs/proc/fd.c
> > index 913bef0d2a36..a8a968bc58f0 100644
> > --- a/fs/proc/fd.c
> > +++ b/fs/proc/fd.c
> > @@ -54,10 +54,15 @@ static int seq_show(struct seq_file *m, void *v)
> >          if (ret)
> >                  return ret;
> >
> > -        seq_printf(m, "pos:\t%lli\nflags:\t0%o\nmnt_id:\t%i\nino:\t%lu\n",
> > +        seq_printf(m, "pos:\t%lli\nflags:\t0%o\nmnt_id:\t%i\nino:\t%lu\nsize:\t%zu\n",
> >                     (long long)file->f_pos, f_flags,
> >                     real_mount(file->f_path.mnt)->mnt_id,
> > -                   file_inode(file)->i_ino);
> > +                   file_inode(file)->i_ino,
> > +                   file_inode(file)->i_size);
>
> We might consider splitting this into multiple seq_printf calls, one for
> each printed attribute.
>
> It becomes a bit unreadable and the minimal additional overhead
> shouldn't matter that much.

Agreed. Will update in the next version.

Thanks,
Kalesh

>
> Regards,
> Christian.
>
> > +
> > +        seq_puts(m, "path:\t");
> > +        seq_file_path(m, file, "\n");
> > +        seq_putc(m, '\n');
> >
> >          /* show_fd_locks() never deferences files so a stale value is safe */
> >          show_fd_locks(m, file, files);
> >
> > base-commit: b015dcd62b86d298829990f8261d5d154b8d7af5
>
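
For illustration only, here is a rough userspace sketch of how the
proposed "size" and "path" fdinfo fields could be consumed for
accounting. It is not part of the patch. The names shmem_kind,
classify_path, fdinfo_mem and parse_fdinfo are hypothetical, and the
sketch assumes the "key:<tab>value" layout shown in the documentation
hunks above. On a kernel without these RFC fields the parser simply
reports them as absent.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical categories, keyed on the path prefixes named in the RFC. */
enum shmem_kind { KIND_OTHER, KIND_MEMFD, KIND_DMABUF, KIND_ASHMEM };

/* Classify an fdinfo "path" value by its prefix. */
enum shmem_kind classify_path(const char *path)
{
	assert(path != NULL);
	if (strncmp(path, "/memfd", 6) == 0)
		return KIND_MEMFD;
	if (strncmp(path, "/dmabuf", 7) == 0)
		return KIND_DMABUF;
	if (strncmp(path, "/dev/ashmem", 11) == 0)
		return KIND_ASHMEM;
	return KIND_OTHER;
}

/* Hypothetical holder for the two fields proposed in this RFC. */
struct fdinfo_mem {
	unsigned long long size;	/* "size:" value, in bytes */
	char path[4096];		/* "path:" value, empty if absent */
	int has_size;			/* 1 if a "size:" line was seen */
	int has_path;			/* 1 if a "path:" line was seen */
};

/*
 * Parse "key:\tvalue" lines from an fdinfo stream. Keys other than
 * "size" and "path" are skipped. Returns 0 on success, -1 on a read
 * error.
 */
int parse_fdinfo(FILE *f, struct fdinfo_mem *out)
{
	char line[4352];

	assert(f != NULL && out != NULL);
	memset(out, 0, sizeof(*out));
	while (fgets(line, sizeof(line), f)) {
		char *val = strchr(line, ':');

		if (!val)
			continue;
		*val++ = '\0';			/* line now holds only the key */
		val += strspn(val, " \t");	/* skip separator whitespace */
		val[strcspn(val, "\n")] = '\0';	/* drop the trailing newline */
		if (strcmp(line, "size") == 0) {
			out->size = strtoull(val, NULL, 10);
			out->has_size = 1;
		} else if (strcmp(line, "path") == 0) {
			size_t n = strlen(val);

			if (n >= sizeof(out->path))
				n = sizeof(out->path) - 1;
			memcpy(out->path, val, n);
			out->path[n] = '\0';
			out->has_path = 1;
		}
	}
	return ferror(f) ? -1 : 0;
}
```

A caller would fopen() the fdinfo file of interest, pass the stream to
parse_fdinfo(), and then add the size to a per-process total for
whatever classify_path() returns, provided both fields were present.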