Received: by 2002:a05:6358:16cc:b0:ea:6187:17c9 with SMTP id r12csp10264461rwl; Wed, 11 Jan 2023 17:23:00 -0800 (PST) X-Google-Smtp-Source: AMrXdXupxxA0XB2f5LceygtEMA64K8/vmxqMWoPxJKpLjAhXDWjxyLQaZXhRNfCnafstlGDpA7Mk X-Received: by 2002:a17:907:a708:b0:82e:a59a:5c3e with SMTP id vw8-20020a170907a70800b0082ea59a5c3emr82264341ejc.10.1673486580575; Wed, 11 Jan 2023 17:23:00 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673486580; cv=none; d=google.com; s=arc-20160816; b=S1oz2CO51aRyeLHe8SulS05YCYJD+4wooX6pRzcPrP8qYNU6o6nCSzzOikHmv/6Rm0 gnM29ZI103R65wacWprLHfmyjlOqCEiV9wfLnwU0p9ZLjtzaz3pbTirTOae9SJYn6BPL XdfiPaNRJChW6lDTph8fzpsRyzzFcwnOkLLmeRbzV37Qv2kVolDHu+BIaUapQZx8YmQ/ 0nomj29dEud8wrYdtFyf9Q+7PFMDJKz3bNWbDfia2p8/lQZJ79MKR4sFuxOLuUgHDhS2 UC7POVKrL7ny374CAI8dvigFE33gmDa3t8FkZVJDOURG01Yzj3qPg+Oay/vT0HArXqT6 29FQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=hDrqeCy2isEAcEZutQc18fIgv9oWhVupwebw91lgye4=; b=qLpCD98wEPjoWug8pTofg9monSDeTnsiUvA3tNuNqH9LfWlI00tQTyjhcPO6/u7EYc wwZ9oLGIPx8v30gMR5zm7vkkVgHtKSgQeT7+8hrrfA+cSy9oH6+/UJ3AXMNGt6cjbS2f TmkAqLiJatQlzfvkwRkmWMNjkN1kzNaBLigioh0tRTFkyPh8vL4q6x2F/clWT4g+Xfn5 zteC4gImkgWPIXv4/UUuPoCkpmFb8EkgFWaBb+C23ZNsRy5Gejl/izyXoEBvWoXkdOmC kbzYvliriwhmKOA3Zg888NA58NryIRGSQioOYXTuhdKrwykpk/ySLsPrgWaU2pZJGB+i BJQw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=qxxyHu+W; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id xa14-20020a170906fd8e00b00787b9cd2fe3si15337659ejb.844.2023.01.11.17.22.48; Wed, 11 Jan 2023 17:23:00 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=qxxyHu+W; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235778AbjALAtx (ORCPT + 50 others); Wed, 11 Jan 2023 19:49:53 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38646 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230308AbjALAtu (ORCPT ); Wed, 11 Jan 2023 19:49:50 -0500 Received: from mail-yb1-xb34.google.com (mail-yb1-xb34.google.com [IPv6:2607:f8b0:4864:20::b34]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DD2013C0D0 for ; Wed, 11 Jan 2023 16:49:48 -0800 (PST) Received: by mail-yb1-xb34.google.com with SMTP id p188so16977672yba.5 for ; Wed, 11 Jan 2023 16:49:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=hDrqeCy2isEAcEZutQc18fIgv9oWhVupwebw91lgye4=; b=qxxyHu+WDybsKdsOve3HdK7LpgzBN/3GZRqgrxvs9Q5W9Kzwd7Dzc9/jYzxkNuDrul kQkTCY47Z2auo0ah8YGi8N0fYkfJOF7YxzNDRGT2TVVKTxh3xTGn5tExmfv9+PHBUfjc yw/X/nMCu8FyYUb99VuG8LjTaud4O30po5zBH3LKaw5V6vBQRwOmH59u3sd47u/bsD8t wqFwxSQZsBtZNU5U8hrEp3462X5hXd5zeVSkG1FDkuO9G+kbmGLsdETBG2DaHv5igzc1 rEShl5bXFHof6n5bFLnvY/qrSb9Wg1ns/Lu0eWyU8NcGpi2ojYMe7+hIT1ATndOFXF0A cgSA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=hDrqeCy2isEAcEZutQc18fIgv9oWhVupwebw91lgye4=; b=tfnRmXX7JGZ8wMXc7xWo/4ost8mu0h55vjVfsDMlcrnH385Dn1w+7dJFee46dtj/Ks NCHWk95eyoRQaFNk/idqaRyXrGO+ZvDIjGPs2C2t0UAruGKX6bGWnGdzuOrSS0tVIQIz PDb+XDw++njD4jfs9vUjhRB20D02Ertb6UdtmEMrr9hQH02SpvOrMnzVGhDSywWN6etd Jniu5IZvmxe513igNIntp7a7Uis+I9c8SRqx406POjdhlngJQwb2anf9xm/CUOlj9a73 oT9mTIIrBkYKwLjUhXjfUBo7PT6G/EgD3vWgUH7n2ITj+ThNLjf1SdOhy3YbMaNspU4t 8JSA== X-Gm-Message-State: AFqh2kpEgfBOGek0LJamnovYYJPyQGE9ovqrRGQapMFn8PYeO8yAHYSU yhoIe2LPojdngPIo8e0iMMhgGBDL9096FGz3fGUikw== X-Received: by 2002:a25:dd84:0:b0:756:35b9:e2de with SMTP id u126-20020a25dd84000000b0075635b9e2demr8569422ybg.117.1673484587884; Wed, 11 Jan 2023 16:49:47 -0800 (PST) MIME-Version: 1.0 References: <20230109213809.418135-1-tjmercier@google.com> In-Reply-To: From: "T.J. Mercier" Date: Wed, 11 Jan 2023 16:49:36 -0800 Message-ID: Subject: Re: [PATCH 0/4] Track exported dma-buffers with memcg To: Shakeel Butt , "T.J. Mercier" , Tejun Heo , Zefan Li , Johannes Weiner , Jonathan Corbet , Greg Kroah-Hartman , =?UTF-8?B?QXJ2ZSBIasO4bm5ldsOlZw==?= , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Carlos Llamas , Suren Baghdasaryan , Sumit Semwal , =?UTF-8?Q?Christian_K=C3=B6nig?= , Michal Hocko , Roman Gushchin , Muchun Song , Andrew Morton , Paul Moore , James Morris , "Serge E. Hallyn" , Stephen Smalley , Eric Paris , android-mm@google.com, jstultz@google.com, cgroups@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org, linux-mm@kvack.org, linux-security-module@vger.kernel.org, selinux@vger.kernel.org Cc: daniel.vetter@ffwll.ch Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-17.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS, USER_IN_DEF_DKIM_WL,USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jan 11, 2023 at 2:56 PM Daniel Vetter wrote: > > On Mon, Jan 09, 2023 at 04:18:12PM -0800, Shakeel Butt wrote: > > Hi T.J., > > > > On Mon, Jan 9, 2023 at 1:38 PM T.J. Mercier wrote: > > > > > > Based on discussions at LPC, this series adds a memory.stat counter for > > > exported dmabufs. This counter allows us to continue tracking > > > system-wide total exported buffer sizes which there is no longer any > > > way to get without DMABUF_SYSFS_STATS, and adds a new capability to > > > track per-cgroup exported buffer sizes. The total (root counter) is > > > helpful for accounting in-kernel dmabuf use (by comparing with the sum > > > of child nodes or with the sum of sizes of mapped buffers or FD > > > references in procfs) in addition to helping identify driver memory > > > leaks when in-kernel use continually increases over time. With > > > per-application cgroups, the per-cgroup counter allows us to quickly > > > see how much dma-buf memory an application has caused to be allocated. > > > This avoids the need to read through all of procfs which can be a > > > lengthy process, and causes the charge to "stick" to the allocating > > > process/cgroup as long as the buffer is alive, regardless of how the > > > buffer is shared (unless the charge is transferred). > > > > > > The first patch adds the counter to memcg. The next two patches allow > > > the charge for a buffer to be transferred across cgroups which is > > > necessary because of the way most dmabufs are allocated from a central > > > process on Android. The fourth patch adds a SELinux hook to binder in > > > order to control who is allowed to transfer buffer charges. > > > > > > [1] https://lore.kernel.org/all/20220617085702.4298-1-christian.koenig@amd.com/ > > > > > > > I am a bit confused by the term "charge" used in this patch series. > > From the patches, it seems like only a memcg stat is added and nothing > > is charged to the memcg. > > > > This leads me to the question: Why add this stat in memcg if the > > underlying memory is not charged to the memcg and if we don't really > > want to limit the usage? > > > > I see two ways forward: > > > > 1. Instead of memcg, use bpf-rstat [1] infra to implement the > > per-cgroup stat for dmabuf. (You may need an additional hook for the > > stat transfer). > > > > 2. Charge the actual memory to the memcg. Since the size of dmabuf is > > immutable across its lifetime, you will not need to do accounting at > > page level and instead use something similar to the network memory > > accounting interface/mechanism (or even more simple). However you > > would need to handle the reclaim, OOM and charge context and failure > > cases. However if you are not looking to limit the usage of dmabuf > > then this option is an overkill. > > I think eventually, at least for other "account gpu stuff in cgroups" use > case we do want to actually charge the memory. > Yes, I've been looking at this today. > The problem is a bit that with gpu allocations reclaim is essentially "we > pass the error to userspace and they get to sort the mess out". There are > some exceptions (some gpu drivers to have shrinkers) would we need to make > sure these shrinkers are tied into the cgroup stuff before we could enable > charging for them? > I'm also not sure that we can depend on the dmabuf being backed at export time 100% of the time? (They are for dmabuf heaps.) If not, that'd make calling the existing memcg folio based functions a bit difficult. > Also note that at least from the gpu driver side this is all a huge > endeavour, so if we can split up the steps as much as possible (and get > something interim useable that doesn't break stuff ofc), that is > practically need to make headway here. TJ has been trying out various > approaches for quite some time now already :-/ > -Daniel > > > Please let me know if I misunderstood something. > > > > [1] https://lore.kernel.org/all/20220824233117.1312810-1-haoluo@google.com/ > > > > thanks, > > Shakeel > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch