Received: by 2002:a05:6a10:8c0a:0:0:0:0 with SMTP id go10csp535694pxb; Wed, 27 Jan 2021 14:13:18 -0800 (PST) X-Google-Smtp-Source: ABdhPJwUfEFiSuQ8lEGJbyx8F1fP/HIuiEHcmqoIzby+lVKisQzWXivQmBgzXMEiPWS5h0GOZqoG X-Received: by 2002:a17:907:16a2:: with SMTP id hc34mr8584541ejc.9.1611785598677; Wed, 27 Jan 2021 14:13:18 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1611785598; cv=none; d=google.com; s=arc-20160816; b=q/1bao7YmikPTUlHDwn3b/QJXTiNlkZIzRsEySP1Z5IxmCWmSR2dbcesUS+iRIJJcB 6yLtLrRwtQ7JTBbp38onk/UOBJGEeT/P9EiEh9GiAGEvnZ6XU6RcvPQiYnqoo5jEtdI7 GfKUlHJCIjNZ5UWW9bFHd1muh85bK01OEegMwVRGN6E+aXAQ38QFqaZ0VPDbjmRKvEVA vzDx2x5PBYoDKVKvfaOmQK8fhVN75jF+BCJ+w+xRmvkcL/TB+rQr7X3eXHvbF7uNhSof W/FpABSE+pMRPB24LSjtk474+/uFs94FrrJJwdzB/NwLdJuyfV+1N+mZ2VfAmgLOexUj 1eWQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=nt9pdiLty5VAzHiVzQ9Z0VE2Bkzi2Zy4BkFRJGQTy+s=; b=WFIZStU6SxVFDm5fU1AaMqv2p7p1ive/jScHhpBRYL/BMRE4Y+7zxM75fPGnjZFSaq 7/+KlRjkSl5EBpIfdcu5/djVmTs0Cnt11YZWwViFvOx6MB88+nc/Sd5e+Wq77vF2zb31 CFm0OQc5aVVb5PN6a7M8t4ogxgnoO/7r6D8Gbz1W58d5R3Gk4HceClmGWLlBlDVD0OBA XiWBhY22gGJ6PFRD8bUbfgffpr7kXqPe7Uhqa661jWHTS5q0/x5SB0uUawPHWkjN2bZN 6j3O/R/jYCN8JVIkrPxzmlr6bX9OZRXS8MYvqDEzIv2rMZFGO+Qj0E70FRCq9zjoF9MT cwdw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=RkI69ax2; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id a67si1781549edf.198.2021.01.27.14.12.52; Wed, 27 Jan 2021 14:13:18 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=RkI69ax2; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235847AbhA0KwU (ORCPT + 99 others); Wed, 27 Jan 2021 05:52:20 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53462 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236226AbhA0Ksi (ORCPT ); Wed, 27 Jan 2021 05:48:38 -0500 Received: from mail-lj1-x22f.google.com (mail-lj1-x22f.google.com [IPv6:2a00:1450:4864:20::22f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 392B8C061788 for ; Wed, 27 Jan 2021 02:47:58 -0800 (PST) Received: by mail-lj1-x22f.google.com with SMTP id s18so1515743ljg.7 for ; Wed, 27 Jan 2021 02:47:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=nt9pdiLty5VAzHiVzQ9Z0VE2Bkzi2Zy4BkFRJGQTy+s=; b=RkI69ax2YOOuKLnURhraOk6rkuVkZAdIrcLC8ROzb78X3XKcsHv7ZkY1MlD4SJgGMv mvSeWdPu6cvDFRvCbp1iTBbLA1Zl8F7eXpM479ayN+NeCBRjUZ9u4HUy3EzFV1sYTJcy PYB/fNUbJlr64E477+47TxBf7b4ryCO7tAewLY6czi6NhEMjHxiI15j/4hIaqSUmrqvt hnkovTZkaCRWB504JvBAbJB02VSyoiNAOfs8HPotmXVpd8A4rbeS6mMZhro1+rQDyfcn ljPWQoMwPkLz6Wc1OU0ppRL02EqrNXZCGQXG1ygbkLlN+Jif1Tk4RsoVvqn8fhhDYg0j bsvg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=nt9pdiLty5VAzHiVzQ9Z0VE2Bkzi2Zy4BkFRJGQTy+s=; b=cz9skJQBQGigiGF836+mnHWAHBqKckd185tl7qNUhoKrCcowzIgWbe3CjPgZk9ew2M tq4LZfEcfDMCQD0WK34Nr58CEZODaPFeIROeF6dmxkZhT8n3rgY/e6Ohdsb9oRqUmpDH C9nq+a0eoHsfePNjXs9n1kwfV1tSzuU/7s0wYFpeDry15TVmtYMaXYIpBx2fYBs844gO Vv7u8M4y6vg5qz/3BGNluXWzYwCOospbzRqcT3eczQm/p6F+xKtrn4fJYUkL0A+6yHC/ 7AIUpzHeVpOkDlluNHYMW9m5R3ehZCX76QDqXFPBv+60vLWTPeBF3NZ+HdHq3UMKcvs7 028w== X-Gm-Message-State: AOAM533sJJmork4k7NhgGJ/GPwT+BVAapPVmcbE3PCZXT61EyQk8SX8Q NotP2wxAm8HAJLWMErEBp7bbe11TM38tdxNz/C/pSg== X-Received: by 2002:a2e:908e:: with SMTP id l14mr5465324ljg.226.1611744476438; Wed, 27 Jan 2021 02:47:56 -0800 (PST) MIME-Version: 1.0 References: <20210126225138.1823266-1-kaleshsingh@google.com> In-Reply-To: <20210126225138.1823266-1-kaleshsingh@google.com> From: Jann Horn Date: Wed, 27 Jan 2021 11:47:29 +0100 Message-ID: Subject: Re: [PATCH] procfs/dmabuf: Add /proc//task//dmabuf_fds To: Kalesh Singh Cc: Suren Baghdasaryan , Minchan Kim , Greg Kroah-Hartman , Hridya Valsaraju , kernel-team , Alexey Dobriyan , Jonathan Corbet , Sumit Semwal , =?UTF-8?Q?Christian_K=C3=B6nig?= , Mauro Carvalho Chehab , Andrew Morton , Michal Hocko , Kees Cook , Alexey Gladkov , Szabolcs Nagy , "Eric W. Biederman" , Daniel Jordan , Michel Lespinasse , Bernd Edlinger , Andrei Vagin , Yafang Shao , Hui Su , kernel list , linux-fsdevel , "open list:DOCUMENTATION" , Linux Media Mailing List , dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org, Jeffrey Vander Stoep , Linux API Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org +jeffv from Android On Tue, Jan 26, 2021 at 11:51 PM Kalesh Singh wrot= e: > In order to measure how much memory a process actually consumes, it is > necessary to include the DMA buffer sizes for that process in the memory > accounting. Since the handle to DMA buffers are raw FDs, it is important > to be able to identify which processes have FD references to a DMA buffer= . Or you could try to let the DMA buffer take a reference on the mm_struct and account its size into the mm_struct? That would probably be nicer to work with than having to poke around in procfs separately for DMA buffers. > Currently, DMA buffer FDs can be accounted using /proc//fd/* and > /proc//fdinfo -- both of which are only root readable, as follows: That's not quite right. They can both also be accessed by the user owning the process. Also, fdinfo is a standard interface for inspecting process state that doesn't permit reading process memory or manipulating process state - so I think it would be fine to permit access to fdinfo under a PTRACE_MODE_READ_FSCRED check, just like the interface you're suggesting. > 1. Do a readlink on each FD. > 2. If the target path begins with "/dmabuf", then the FD is a dmabuf FD= . > 3. stat the file to get the dmabuf inode number. > 4. Read/ proc//fdinfo/, to get the DMA buffer size. > > Android captures per-process system memory state when certain low memory > events (e.g a foreground app kill) occur, to identify potential memory > hoggers. To include a process=E2=80=99s dmabuf usage as part of its memor= y state, > the data collection needs to be fast enough to reflect the memory state a= t > the time of such events. > > Since reading /proc//fd/ and /proc//fdinfo/ requires root > privileges, this approach is not suitable for production builds. It should be easy to add enough information to /proc//fdinfo/ so that you don't need to look at /proc//fd/ anymore. > Granting > root privileges even to a system process increases the attack surface and > is highly undesirable. Additionally this is slow as it requires many > context switches for searching and getting the dma-buf info. What do you mean by "context switches"? Task switches or kernel/user transitions (e.g. via syscall)? > With the addition of per-buffer dmabuf stats in sysfs [1], the DMA buffer > details can be queried using their unique inode numbers. > > This patch proposes adding a /proc//task//dmabuf_fds interface. > > /proc//task//dmabuf_fds contains a list of inode numbers for > every DMA buffer FD that the task has. Entries with the same inode > number can appear more than once, indicating the total FD references > for the associated DMA buffer. > > If a thread shares the same files as the group leader then its > dmabuf_fds file will be empty, as these dmabufs are reported by the > group leader. > > The interface requires PTRACE_MODE_READ_FSCRED (same as /proc//maps) > and allows the efficient accounting of per-process DMA buffer usage witho= ut > requiring root privileges. (See data below) I'm not convinced that introducing a new procfs file for this is the right way to go. And the idea of having to poke into multiple different files in procfs and in sysfs just to be able to compute a proper memory usage score for a process seems weird to me. "How much memory is this process using" seems like the kind of question the kernel ought to be able to answer (and the kernel needs to be able to answer somewhat accurately so that its own OOM killer can do its job properly)?