Received: by 2002:a05:6a10:8c0a:0:0:0:0 with SMTP id go10csp594186pxb; Wed, 27 Jan 2021 16:09:35 -0800 (PST) X-Google-Smtp-Source: ABdhPJx3zr0kOOcVZOZWffUndOtY8xAgvrYASa33Sxm62DLOScV0UNrTShDRkwHR3P0AEdEFqUOE X-Received: by 2002:a17:906:eb46:: with SMTP id mc6mr8821318ejb.184.1611792575288; Wed, 27 Jan 2021 16:09:35 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1611792575; cv=none; d=google.com; s=arc-20160816; b=bhyy/eFaItzkdsK2ZVVTpa1d9eP2AX58Y8IwGw3MGUnq5T922STI/NolQXyq+ia9Ae mcVzWNyiv1NcqtXOpmIP8pt4T6L690BK4+JV63B4O01B76BYWPSb/LFKLEYqrQTTexCf +SABMtR1/y2YR2WgPpwV2FZi1/6gx6G8Mfpa9Lq1qOuAYbeGVeLxZnJs5RnYtvRNKrEU 9mmcmIHJ8/TTXwW5Nm27wCwD6LgdHsFmhQsYHEkgLqnYUjroIS76C7IHKN7sfKcAFvmx EoZBx/XFmpP7w5cN4XLkyUEY2fypzfwSgF12Jy5fCevYyQD62NK39TQKgmuo5HOV0xkx RBEw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=eZkRAFReyCaCfsUebKjqhppx54eZqZXUA8gRquKcsPU=; b=lwQoaWs+eXbvwk/YYII048qfE2Rjb75DrpH0k4UXYJyDpqFvvWtxGQA+jevf2Dl7pp xyj/9txJvh7TbpxYZmudob+GaqEjJKR8ewJMIrjqPpgW1MxuPdclo03lWMCUoxd3y9kA CWC9Pm3y76i0XyuRT0G178wrpYWoEB8Pez++wq6yUljxvn23rgKl17U9YGumTQeTAol5 qofAk4tC5DDbU2cCgjTOWJTAFQnxsBO6NUguya0AII4Kdr9lNF++uSf27lb3sKtWWDJM RpU61pMtU3I8pEoBjVCYU8ZS17RWTF96lm+q5cWyTuJX94NiZsWt78AM2+oU3X014oDg bbzw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=dimGUWBu; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id zl6si1433465ejb.747.2021.01.27.16.09.10; Wed, 27 Jan 2021 16:09:35 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=dimGUWBu; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234869AbhA0RV4 (ORCPT + 99 others); Wed, 27 Jan 2021 12:21:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52568 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235601AbhA0RRP (ORCPT ); Wed, 27 Jan 2021 12:17:15 -0500 Received: from mail-pf1-x42d.google.com (mail-pf1-x42d.google.com [IPv6:2607:f8b0:4864:20::42d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 63CDDC06178A for ; Wed, 27 Jan 2021 09:16:35 -0800 (PST) Received: by mail-pf1-x42d.google.com with SMTP id m6so1618039pfk.1 for ; Wed, 27 Jan 2021 09:16:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=eZkRAFReyCaCfsUebKjqhppx54eZqZXUA8gRquKcsPU=; b=dimGUWBu88dX8OdBPtqV1K4RlZBNMOe3qLaj2DFeriW0zXGvfLXQdvpDQW90G+pq8w p6xdN0AtXLBzz2oPtDzXqxZIr8psZaQU+lB+nDni6Zl/WIpcblkhLxycbxPAvV8te05i KsVIp+ePdyQmOMOK1htw20UOkkjeFnjUvviVoIEY2ofKY9lhBjfLrheC8MM/6mQmKGDo lgZrbx94cvm3t6pqu1kCexYTDa1cFG519/QRDEJtzXaZ7GklAYFE3S3JdLgmQXJO7Olf 5tbc9e9YBFuaJlg9N990aaa1pAzjhautowBMkQc84YuILgD85YeIKeYhHvbjucdynQoo 2yeQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=eZkRAFReyCaCfsUebKjqhppx54eZqZXUA8gRquKcsPU=; b=G7djzSWBOJQkGv5P1zvOiQllApXVTQG68DluVbtDT8WWLO8+cAeHw7FcsWpjxyqRYS 6YYzWRSfSQTXwBSLqcfb3CsNPYMvwYm4wwStIByUpB735+84yNf3FICFvAa4xocJ55d7 0z4A3ZoL7nrP3lNasRWKVF7GaRNMPOn0RDdgOUG74saXGgTFyLSxjbr1hn1289TAhWSX nRdj9qXYSadKt6fqC6SCyv4EjEZ5ifFd5UVHWjBG/nkqZKlRegG5V/OZT3lS/M2NGtpq 5Otm86CullkkumL0SumKWUgcvFju/TSJu1kmf1lcHSSZfcdaa4YvMrsWyHMpMtVkpp7K R6lw== X-Gm-Message-State: AOAM532xT+aLbTSux72qg4Cnm89/JiADHCEeFqFm8M+ebdKbs+zZd3h3 h6UjaWkn9+jkicEdkCVx9SbgynGCiWX/+mhQ2QlEDA== X-Received: by 2002:a63:724a:: with SMTP id c10mr10790208pgn.124.1611767794770; Wed, 27 Jan 2021 09:16:34 -0800 (PST) MIME-Version: 1.0 References: <20210126225138.1823266-1-kaleshsingh@google.com> In-Reply-To: From: Kalesh Singh Date: Wed, 27 Jan 2021 12:16:23 -0500 Message-ID: Subject: Re: [PATCH] procfs/dmabuf: Add /proc//task//dmabuf_fds To: Jann Horn Cc: Suren Baghdasaryan , Minchan Kim , Greg Kroah-Hartman , Hridya Valsaraju , kernel-team , Alexey Dobriyan , Jonathan Corbet , Sumit Semwal , =?UTF-8?Q?Christian_K=C3=B6nig?= , Mauro Carvalho Chehab , Andrew Morton , Michal Hocko , Kees Cook , Alexey Gladkov , Szabolcs Nagy , "Eric W. Biederman" , Daniel Jordan , Michel Lespinasse , Bernd Edlinger , Andrei Vagin , Yafang Shao , Hui Su , kernel list , linux-fsdevel , "open list:DOCUMENTATION" , Linux Media Mailing List , DRI mailing list , linaro-mm-sig@lists.linaro.org, Jeffrey Vander Stoep , Linux API Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jan 27, 2021 at 5:47 AM Jann Horn wrote: > > +jeffv from Android > > On Tue, Jan 26, 2021 at 11:51 PM Kalesh Singh wr= ote: > > In order to measure how much memory a process actually consumes, it is > > necessary to include the DMA buffer sizes for that process in the memor= y > > accounting. Since the handle to DMA buffers are raw FDs, it is importan= t > > to be able to identify which processes have FD references to a DMA buff= er. > > Or you could try to let the DMA buffer take a reference on the > mm_struct and account its size into the mm_struct? That would probably > be nicer to work with than having to poke around in procfs separately > for DMA buffers. > > > Currently, DMA buffer FDs can be accounted using /proc//fd/* and > > /proc//fdinfo -- both of which are only root readable, as follows: > > That's not quite right. They can both also be accessed by the user > owning the process. Also, fdinfo is a standard interface for > inspecting process state that doesn't permit reading process memory or > manipulating process state - so I think it would be fine to permit > access to fdinfo under a PTRACE_MODE_READ_FSCRED check, just like the > interface you're suggesting. Hi everyone. Thank you for the feedback. I understand there is a deeper problem of accounting shared memory in the kernel, that=E2=80=99s not only specific to the DMA buffers. In this ca= se DMA buffers, I think Jann=E2=80=99s proposal is the cleanest way to attribu= te the shared buffers to processes. I can respin a patch modifying fdinfo as suggested, if this is not an issue from a security perspective. Thanks, Kalesh > > > > 1. Do a readlink on each FD. > > 2. If the target path begins with "/dmabuf", then the FD is a dmabuf = FD. > > 3. stat the file to get the dmabuf inode number. > > 4. Read/ proc//fdinfo/, to get the DMA buffer size. > > > > Android captures per-process system memory state when certain low memor= y > > events (e.g a foreground app kill) occur, to identify potential memory > > hoggers. To include a process=E2=80=99s dmabuf usage as part of its mem= ory state, > > the data collection needs to be fast enough to reflect the memory state= at > > the time of such events. > > > > Since reading /proc//fd/ and /proc//fdinfo/ requires root > > privileges, this approach is not suitable for production builds. > > It should be easy to add enough information to /proc//fdinfo/ so > that you don't need to look at /proc//fd/ anymore. > > > Granting > > root privileges even to a system process increases the attack surface a= nd > > is highly undesirable. Additionally this is slow as it requires many > > context switches for searching and getting the dma-buf info. > > What do you mean by "context switches"? Task switches or kernel/user > transitions (e.g. via syscall)? > > > With the addition of per-buffer dmabuf stats in sysfs [1], the DMA buff= er > > details can be queried using their unique inode numbers. > > > > This patch proposes adding a /proc//task//dmabuf_fds interfac= e. > > > > /proc//task//dmabuf_fds contains a list of inode numbers for > > every DMA buffer FD that the task has. Entries with the same inode > > number can appear more than once, indicating the total FD references > > for the associated DMA buffer. > > > > If a thread shares the same files as the group leader then its > > dmabuf_fds file will be empty, as these dmabufs are reported by the > > group leader. > > > > The interface requires PTRACE_MODE_READ_FSCRED (same as /proc//map= s) > > and allows the efficient accounting of per-process DMA buffer usage wit= hout > > requiring root privileges. (See data below) > > I'm not convinced that introducing a new procfs file for this is the > right way to go. And the idea of having to poke into multiple > different files in procfs and in sysfs just to be able to compute a > proper memory usage score for a process seems weird to me. "How much > memory is this process using" seems like the kind of question the > kernel ought to be able to answer (and the kernel needs to be able to > answer somewhat accurately so that its own OOM killer can do its job > properly)?