From: Rob Clark
Date: Wed, 12 Apr 2023 13:09:47 -0700
Subject: Re: [Freedreno] [PATCH v2 0/2] drm: fdinfo memory stats
To: Rodrigo Vivi
Cc: Dmitry Baryshkov, dri-devel@lists.freedesktop.org, Rob Clark, Tvrtko Ursulin, "open list:DOCUMENTATION", linux-arm-msm@vger.kernel.org, Emil Velikov, Christopher Healy, open list, Sean Paul, Boris Brezillon, freedreno@lists.freedesktop.org
References: <20230410210608.1873968-1-robdclark@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Apr 12, 2023 at 5:47 AM Rodrigo Vivi wrote:
>
> On Wed, Apr 12, 2023
> at 10:11:32AM +0200, Daniel Vetter wrote:
> > On Wed, Apr 12, 2023 at 01:36:52AM +0300, Dmitry Baryshkov wrote:
> > > On 11/04/2023 21:28, Rob Clark wrote:
> > > > On Tue, Apr 11, 2023 at 10:36 AM Dmitry Baryshkov wrote:
> > > > >
> > > > > On Tue, 11 Apr 2023 at 20:13, Rob Clark wrote:
> > > > > >
> > > > > > On Tue, Apr 11, 2023 at 9:53 AM Daniel Vetter wrote:
> > > > > > >
> > > > > > > On Tue, Apr 11, 2023 at 09:47:32AM -0700, Rob Clark wrote:
> > > > > > > > On Mon, Apr 10, 2023 at 2:06 PM Rob Clark wrote:
> > > > > > > > >
> > > > > > > > > From: Rob Clark
> > > > > > > > >
> > > > > > > > > Similar motivation to other similar recent attempt[1]. But with an
> > > > > > > > > attempt to have some shared code for this. As well as documentation.
> > > > > > > > >
> > > > > > > > > It is probably a bit UMA-centric, I guess devices with VRAM might want
> > > > > > > > > some placement stats as well. But this seems like a reasonable start.
> > > > > > > > >
> > > > > > > > > Basic gputop support: https://patchwork.freedesktop.org/series/116236/
> > > > > > > > > And already nvtop support: https://github.com/Syllo/nvtop/pull/204
> > > > > > > >
> > > > > > > > On a related topic, I'm wondering if it would make sense to report
> > > > > > > > some more global things (temp, freq, etc) via fdinfo? Some of this,
> > > > > > > > tools like nvtop could get by trawling sysfs or other driver specific
> > > > > > > > ways. But maybe it makes sense to have these sort of things reported
> > > > > > > > in a standardized way (even though they aren't really per-drm_file)
> > > > > > >
> > > > > > > I think that's a bit much layering violation, we'd essentially have to
> > > > > > > reinvent the hwmon sysfs uapi in fdinfo. Not really a business I want to
> > > > > > > be in :-)
> > > > > >
> > > > > > I guess this is true for temp (where there are thermal zones with
> > > > > > potentially multiple temp sensors..
> > > > > > but I'm still digging my way thru
> > > > > > the thermal_cooling_device stuff)
> > > > >
> > > > > It is slightly ugly. All thermal zones and cooling devices are virtual
> > > > > devices (so, even no connection to the particular tsens device). One
> > > > > can either enumerate them by checking
> > > > > /sys/class/thermal/thermal_zoneN/type or enumerate them through
> > > > > /sys/class/hwmon. For cooling devices again the only enumeration is
> > > > > through /sys/class/thermal/cooling_deviceN/type.
> > > > >
> > > > > Probably it should be possible to push cooling devices and thermal
> > > > > zones under corresponding providers. However I do not know if there is
> > > > > a good way to correlate cooling device (ideally a part of GPU) to the
> > > > > thermal_zone (which in our case is provided by tsens / temp_alarm
> > > > > rather than GPU itself).
> > > > >
> > > > > > But what about freq? I think, esp for cases where some "fw thing" is
> > > > > > controlling the freq we end up needing to use gpu counters to measure
> > > > > > the freq.
> > > > >
> > > > > For the freq it is slightly easier: /sys/class/devfreq/*, devices are
> > > > > registered under proper parent (IOW, GPU). So one can read
> > > > > /sys/class/devfreq/3d00000.gpu/cur_freq or
> > > > > /sys/bus/platform/devices/3d00000.gpu/devfreq/3d00000.gpu/cur_freq.
> > > > >
> > > > > However because of the components usage, there is no link from
> > > > > /sys/class/drm/card0
> > > > > (/sys/devices/platform/soc@0/ae00000.display-subsystem/ae01000.display-controller/drm/card0)
> > > > > to /sys/devices/platform/soc@0/3d00000.gpu, the GPU unit.
> > > > >
> > > > > Getting all these items together in a platform-independent way would
> > > > > be definitely an important but complex topic.
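To make the enumeration Dmitry describes concrete, the userspace-side crawl could look roughly like this (a sketch only: the zone type strings, the "gpu" substring match, and the 3d00000.gpu name are platform-specific assumptions, and devfreq's cur_freq is in Hz):

```python
from pathlib import Path

def find_thermal_zones(base="/sys/class/thermal", wanted=("gpu",)):
    # Enumerate zones via /sys/class/thermal/thermal_zoneN/type, since the
    # zone devices themselves are virtual and carry no link to the sensor.
    zones = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        ztype = (zone / "type").read_text().strip()
        if any(w in ztype for w in wanted):
            zones[ztype] = zone
    return zones

def read_cur_freq_hz(devfreq_dir):
    # devfreq exposes cur_freq in Hz, e.g. /sys/class/devfreq/3d00000.gpu
    return int((Path(devfreq_dir) / "cur_freq").read_text())
```

i.e. something like find_thermal_zones() followed by read_cur_freq_hz("/sys/class/devfreq/3d00000.gpu"), with the caveat above that nothing ties either result back to /sys/class/drm/card0.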
> > > >
> > > > But I don't believe any of the pci gpu's use devfreq ;-)
> > > >
> > > > And also, you can't expect the CPU to actually know the freq when fw
> > > > is the one controlling freq. We can, currently, have a reasonable
> > > > approximation from devfreq but that stops if IFPC is implemented. And
> > > > other GPUs have even less direct control. So freq is a thing that I
> > > > don't think we should try to get from "common frameworks"
> > >
> > > I think it might be useful to add another passive devfreq governor type for
> > > external frequencies. This way we can use the same interface to export
> > > non-CPU-controlled frequencies.
> >
> > Yeah this sounds like a decent idea to me too. It might also solve the fun
> > of various pci devices having very non-standard freq controls in sysfs
> > (looking at least at i915 here ...)
>
> I also like the idea of having some common infrastructure for the GPU freq.
>
> hwmon has a good infrastructure, but it is more focused on individual
> monitoring devices and not very welcoming to embedded monitoring and control.
> I still want to check the opportunity to see if at least some freq control
> could be aligned there.
>
> Another thing that complicates that is that there are multiple frequency
> domains and controls with multipliers in Intel GPU that are not very
> standard or easy to integrate.
>
> On a quick glance this devfreq seems neat because it aligns with the cpufreq
> and governors. But again it would be hard to align with the multiple domains
> and controls. But it deserves a look.
>
> I will take a look at both fronts for Xe: hwmon and devfreq. Right now on
> Xe we have a lot fewer controls than i915, but I can imagine soon there
> will be requirements to make that grow and I fear that we end up just
> like i915. So I will take a look before that happens.

So it looks like i915 (dgpu only) and nouveau already use hwmon.. so
maybe this is a good way to expose temp.
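If drivers did tie an hwmon node to the drm device like that, the userspace discovery step could be sketched as below (illustrative only: the device/hwmon/hwmonN placement under card0 is an assumption, based on the standard hwmon sysfs layout where tempN_input is millidegrees Celsius and tempN_label is optional):

```python
from pathlib import Path

def gpu_temps_mdegC(card="/sys/class/drm/card0"):
    # Walk hwmon nodes hanging off the drm device (card/device/hwmon/hwmonN).
    # Per the standard hwmon sysfs ABI, tempN_input holds millidegrees C and
    # tempN_label, when present, names the sensor.
    temps = {}
    for hw in Path(card, "device", "hwmon").glob("hwmon*"):
        for f in sorted(hw.glob("temp*_input")):
            label = f.with_name(f.name.replace("_input", "_label"))
            key = label.read_text().strip() if label.exists() else f.name
            temps[key] = int(f.read_text())
    return temps
```

The point being that a tool starts from the drm card and finds temps without any driver-specific knowledge.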
Maybe we can wire up some sort of helper for drivers which use
thermal_cooling_device (which can be composed of multiple sensors) to
give back an aggregate temp for hwmon to report?

Freq could possibly be added to hwmon (ie. seems like a reasonable
attribute to add). Devfreq might also be an option but on arm it isn't
necessarily associated with the drm device, whereas we could associate
the hwmon with the drm device to make it easier for userspace to find.

BR,
-R

> >
> > I guess it would minimally be a good idea if we could document this, or
> > maybe have a reference implementation in nvtop or whatever the cool thing
> > is rn.
> > -Daniel
> >
> > > >
> > > > BR,
> > > > -R
> > > >
> > > > > > >
> > > > > > > What might be needed is better glue to go from the fd or fdinfo to the
> > > > > > > right hw device and then crawl around the hwmon in sysfs automatically. I
> > > > > > > would not be surprised at all if we really suck on this, probably more
> > > > > > > likely on SoC than pci gpus where at least everything should be under the
> > > > > > > main pci sysfs device.
> > > > > >
> > > > > > yeah, I *think* userspace would have to look at /proc/device-tree to
> > > > > > find the cooling device(s) associated with the gpu..
> > > > > > at least I don't
> > > > > > see a straightforward way to figure it out just for sysfs
> > > > > >
> > > > > > BR,
> > > > > > -R
> > > > > > >
> > > > > > > -Daniel
> > > > > > > >
> > > > > > > > BR,
> > > > > > > > -R
> > > > > > > > >
> > > > > > > > > [1] https://patchwork.freedesktop.org/series/112397/
> > > > > > > > >
> > > > > > > > > Rob Clark (2):
> > > > > > > > >   drm: Add fdinfo memory stats
> > > > > > > > >   drm/msm: Add memory stats to fdinfo
> > > > > > > > >
> > > > > > > > >  Documentation/gpu/drm-usage-stats.rst | 21 +++++++
> > > > > > > > >  drivers/gpu/drm/drm_file.c            | 79 +++++++++++++++++++++++++++
> > > > > > > > >  drivers/gpu/drm/msm/msm_drv.c         | 25 ++++++++-
> > > > > > > > >  drivers/gpu/drm/msm/msm_gpu.c         |  2 -
> > > > > > > > >  include/drm/drm_file.h                | 10 ++++
> > > > > > > > >  5 files changed, 134 insertions(+), 3 deletions(-)
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > 2.39.2
> > > > > > >
> > > > > > > --
> > > > > > > Daniel Vetter
> > > > > > > Software Engineer, Intel Corporation
> > > > > > > http://blog.ffwll.ch
> > > > >
> > > > > --
> > > > > With best wishes
> > > > > Dmitry
> > >
> > > --
> > > With best wishes
> > > Dmitry
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch