Received: by 2002:a05:6358:c692:b0:131:369:b2a3 with SMTP id fe18csp1295335rwb; Wed, 26 Jul 2023 10:13:37 -0700 (PDT) X-Google-Smtp-Source: APBJJlFbWsvW64v4Wm7b5ZrFjIRPYIvFvCzJmr0eY6/ZbjPStFeiwe714aMQb6V2IcFKvfYOncJY X-Received: by 2002:a17:902:bb8c:b0:1b9:e8e4:620e with SMTP id m12-20020a170902bb8c00b001b9e8e4620emr1877620pls.47.1690391617246; Wed, 26 Jul 2023 10:13:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1690391617; cv=none; d=google.com; s=arc-20160816; b=p5TD4JBz9vHZ2KsCJf4n7bZsFOmRpmPS+KRbMmTyaeqB0aSDkZU+goxhLMBqbql0kw 7sQZS+crHdQFNaYG9+osEd3wbktWPCKck6E/t7RN/1ExJPb3kEh+/r/qFYQvaRkSG8c/ U6v8Ff4r9rfrE17UHqBPmcjzeuBtLML28FEFaXk9AdjGDMPRr21GU1WQI4KeTDcpHAbB Ve/k9saUrRLvfnNtYLQdFpFGaOD/o7FDsK8ITuQhQZNUj46p3XXEZlQO8E740c6bBYP9 6g1QlKzIK4pSuu52ajtvcwOi5CJnWN3VGYmUXUfzDJ8USfpQsavjhpQzACRTCLxukxoY ZyjA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to :organization:from:references:cc:to:content-language:subject :user-agent:mime-version:date:message-id:dkim-signature; bh=sV/itwTH2wBGTBWPXDcdY9RW5SeTusxXJJVhS7J/y30=; fh=KOhHSVU547YoErcjXjLhufKKkyS+RmJk5RMy6uqxrUw=; b=vu+mxI6EwwBrbi4Iq3ti5kprn4bf8VY7IoiQj7t74yHSlVzTlDFr1mAGKB1UxfBn6r MPp+3TDczTx84oq/uZ2gOCgGME4PodMTmkFuN8GcsUwf6n2U1osc6HBOD8u2QZMmYjiC YyvxgaMKAy6Xv1m3mX7wO+N5Rj8sCkiYXvIwYXoXysrgsFs/mcx5rQ6sUhtS4++qTEw+ HKh0zvZodvZiej0eHUs331j0zftblZIk18Fv0W9v8ql35SYM+aMuLJNQpo1szq2YLLxI uAkFAryKkSVKSEt3isPGi2MprgAX3rE63yEPQxfH4yjhc9QDO4f1XvsfcwZ6NWr1hb2N FWug== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=QYmyWShx; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id o8-20020a170902778800b001bb12009a14si12509031pll.338.2023.07.26.10.13.23; Wed, 26 Jul 2023 10:13:37 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=QYmyWShx; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231612AbjGZQo4 (ORCPT + 99 others); Wed, 26 Jul 2023 12:44:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59522 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231664AbjGZQot (ORCPT ); Wed, 26 Jul 2023 12:44:49 -0400 Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C9CB926AC; Wed, 26 Jul 2023 09:44:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1690389876; x=1721925876; h=message-id:date:mime-version:subject:to:cc:references: from:in-reply-to:content-transfer-encoding; bh=4yYCYMS1X5+8+dKZLjO0tNUiV3DOrWeoyubjzbdu4rY=; b=QYmyWShxHwr52uPIUZ8l/V+HHrXshhcKDHfB3v9LBiQejEX1WHPMYWwy w4IqQKSxXslelUQYy+kGwnyIU6WC09fCfMcNiAPqigVS8adMk2ooflvgi 9g7tG/45We64uirdScr2TAAXpSsmHA7FL68/6XurwtGasQ+ppPnUJfIRb XlOMkw0P3k+kYPUka3zkIwXcyrSF278CmihtjgzMH/AiTqtXrWtk6zDN/ i/Dvh9S+Ud2job5DMFqVGyOis56wnnKNaPhW95EbsnVObsQdVY/gM11e/ l6Y36eTVvwNOj8wae/laaXytgf17/4G66aIiPfzumHi0DrutwsNj6mYZ0 Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10783"; a="370747893" X-IronPort-AV: E=Sophos;i="6.01,232,1684825200"; d="scan'208";a="370747893" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Jul 2023 09:44:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10783"; a="900512351" X-IronPort-AV: E=Sophos;i="6.01,232,1684825200"; d="scan'208";a="900512351" Received: from spsmyth-mobl1.ger.corp.intel.com (HELO [10.213.229.32]) ([10.213.229.32]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Jul 2023 09:44:30 -0700 Message-ID: <8959f665-4353-3630-a6c7-5dca60959faa@linux.intel.com> Date: Wed, 26 Jul 2023 17:44:28 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.11.0 Subject: Re: [PATCH 16/17] cgroup/drm: Expose memory stats Content-Language: en-US To: Tejun Heo Cc: Intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Johannes Weiner , Zefan Li , Dave Airlie , Daniel Vetter , Rob Clark , =?UTF-8?Q?St=c3=a9phane_Marchesin?= , "T . J . Mercier" , Kenny.Ho@amd.com, =?UTF-8?Q?Christian_K=c3=b6nig?= , Brian Welty , Tvrtko Ursulin , Maarten Lankhorst , Eero Tamminen References: <20230712114605.519432-1-tvrtko.ursulin@linux.intel.com> <20230712114605.519432-17-tvrtko.ursulin@linux.intel.com> From: Tvrtko Ursulin Organization: Intel Corporation UK Plc In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,HK_RANDOM_ENVFROM,HK_RANDOM_FROM, NICE_REPLY_A,RCVD_IN_DNSWL_MED,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 21/07/2023 23:21, Tejun Heo wrote: > On Wed, Jul 12, 2023 at 12:46:04PM +0100, Tvrtko Ursulin wrote: >> $ cat drm.memory.stat >> card0 region=system total=12898304 shared=0 active=0 resident=12111872 purgeable=167936 >> card0 region=stolen-system total=0 shared=0 active=0 resident=0 purgeable=0 >> >> Data is generated on demand for simplicty of implementation ie. no running >> totals are kept or accounted during migrations and such. Various >> optimisations such as cheaper collection of data are possible but >> deliberately left out for now. >> >> Overall, the feature is deemed to be useful to container orchestration >> software (and manual management). >> >> Limits, either soft or hard, are not envisaged to be implemented on top of >> this approach due on demand nature of collecting the stats. > > So, yeah, if you want to add memory controls, we better think through how > the fd ownership migration should work. It would be quite easy to make the implicit migration fail - just the matter of failing the first ioctl, which is what triggers the migration, after the file descriptor access from a new owner. But I don't think I can really add that in the RFC given I have no hard controls or anything like that. With GPU usage throttling it doesn't really apply, at least I don't think it does, since even when migrated to a lower budget group it would just get immediately de-prioritized. I don't think hard GPU time limits are feasible in general, and while soft might be, again I don't see that any limiting would necessarily have to run immediately on implicit migration. Second part of the story are hypothetical/future memory controls. I think first thing to say is that implicit migration is important, but it is not really established to use the file descriptor from two places or to migrate more than once. It is simply fresh fd which gets sent to clients from Xorg, which is one of the legacy ways of doing things. So we probably can just ignore that given no significant amount of memory ownership would be getting migrated. And for drm.memory.stat I think what I have is good enough - both private and shared data get accounted, for any clients that have handles to particular buffers. Maarten was working on memory controls so maybe he would have more thoughts on memory ownership and implicit migration. But I don't think there is anything incompatible with that and drm.memory.stats as proposed here, given how the categories reported are the established ones from the DRM fdinfo spec, and it is fact of the matter that we can have multiple memory regions per driver. The main thing that would change between this RFC and future memory controls in the area of drm.memory.stat is the implementation - it would have to get changed under the hood from "collect on query" to "account at allocation/free/etc". But that is just implementation details. Regards, Tvrtko