Message-ID: <56668e29-f697-bd9b-2c13-182e8456dbce@linux.intel.com>
Date: Thu, 11 May 2023 12:03:04 +0200
Subject: Re: [RFC PATCH 0/4] Add support for DRM cgroup memory accounting.
From: Maarten Lankhorst
To: Tejun Heo
Cc: dri-devel@lists.freedesktop.org, cgroups@vger.kernel.org, intel-xe@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org, Zefan Li, Johannes Weiner, David Airlie, Daniel Vetter, amd-gfx@lists.freedesktop.org, Maxime Ripard, Thomas Zimmermann, Tvrtko Ursulin
References: <20230503083500.645848-1-maarten.lankhorst@linux.intel.com> <4d6fbce3-a676-f648-7a09-6f6dcc4bdb46@linux.intel.com>

Hey,

On 2023-05-10 20:46, Tejun Heo wrote:
> Hello,
>
> On Wed, May 10, 2023 at 04:59:01PM +0200, Maarten Lankhorst wrote:
>> The misc controller is not granular enough. A single computer may have
>> any number of graphics cards, some of them with multiple regions of
>> vram inside a single card.
>
> Extending the misc controller to support dynamic keys shouldn't be that
> difficult.
>
> ...
>
>> In the next version, I will move all the code for handling the resource
>> limit to TTM's eviction layer, because otherwise it cannot handle the
>> resource limit correctly.
>>
>> The effect of moving the code to TTM is that it will make the code even
>> more generic for drivers that have vram and use TTM. When using TTM, you
>> only have to describe your VRAM, update some fields in the TTM manager
>> and (un)register your device with the cgroup handler on (un)load. It's
>> quite trivial to add vram accounting to amdgpu and nouveau. [2]
>>
>> If you want to add a knob for scheduling weight for a process, it makes
>> sense to also add resource usage as a knob, otherwise the effect of that
>> knob is very limited. So even for Tvrtko's original proposed use case,
>> it would make sense.
>
> It does make sense but unlike Tvrtko's scheduling weights what's being
> proposed doesn't seem to encapsulate GPU memory resource in a generic
> enough manner, at least to my untrained eyes. ie. w/ drm.weight, I don't
> need any specific knowledge of how a specific GPU operates to say "this
> guy should get 2x processing power over that guy". This more or less
> holds for other major resources including CPU, memory and IO. What
> you're proposing seems a lot more tied to hardware details, and users
> would have to know a lot more about how memory is configured on that
> particular GPU.

There's not much need to know the specifics of a card, but there may be a
need to know the workload in order to decide what allocation limits to set.

I've left the region names implementation specific, but it would make sense
to standardise them. TTM, the layer used by drivers that support VRAM, has
the following regions:

* sysmem - all system memory allocated; includes evicted VRAM.
* mapped - all physical system memory that is mapped to the GPU; when
  unbound it moves to sysmem. When evicting VRAM to sysmem, it is
  temporarily mapped here.
* vramN - the VRAM regions of the device.
* driver-specific regions - these probably don't make sense to put in a
  cgroup at all; this includes "stolen" from the PoC.

That leaves the question: what regions would make sense for a cgroup? A
rough sketch of what this could look like from userspace follows below.
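To make that concrete, here is a hypothetical sketch of how per-region
capacity and limits could surface in the cgroup filesystem. The file names,
key format and the "gpu-jobs" cgroup below are my assumptions for
illustration only (loosely modelled on the drm.capacity knob from this
series), not the final interface:

  # Assumed interface files (illustrative, not final):
  #   drm.capacity - at the cgroup root, total size of each region per device
  #   drm.max      - per cgroup, hard allocation limit per region
  $ cat /sys/fs/cgroup/drm.capacity
  card0 region.vram0 8589934592
  card0 region.mapped 34359738368
  card0 region.sysmem 34359738368

  # Allow the hypothetical "gpu-jobs" cgroup at most half of card0's vram0;
  # other regions stay unlimited.
  $ echo "card0 region.vram0 4294967296" > /sys/fs/cgroup/gpu-jobs/drm.max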
Since vramN can be moved to mapped and sysmem (VRAM eviction,
suspend/resume, driver_madvise), it becomes a subject of debate whether we
should include the other regions, since things become complicated fast. For
the first iteration I focus on a single category, vramN. Even without
knowing anything about a GPU, it will be easy to partition its memory like
that. If you can assign a weight for the scheduler, then you can also
partition its vram by parsing /drm.capacity for the total amount and then
splitting it across cgroups.

> Now, if this is inherent to how all, or at least most, GPUs operate,
> sure, but otherwise let's start small in terms of interface and not take
> up space which should be for something universal. If this turns out to
> be the way, expanding to take up the generic interface space isn't
> difficult.
>
> I don't know GPU space so please educate me where I'm wrong.

Most GPUs have dedicated vram that works roughly the same way; some
integrated chips, like i915 or ARM ones, only use shared memory from the
host system. I would say AMD, NVIDIA and Intel's chips with dedicated
memory work roughly the same way for vram.

I hope this explains it a little bit more,

~Maarten