Date: Fri, 13 Sep 2019 13:43:43 -0400
From: Alyssa Rosenzweig
To: Steven Price
Cc: Daniel Vetter, David Airlie, Rob Herring, Tomeu Vizoso, Mark Brown,
    dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] drm/panfrost: Simplify devfreq utilisation tracking
Message-ID: <20190913174343.GB5387@kevin>
References: <20190912112804.10104-1-steven.price@arm.com>
 <20190912112804.10104-3-steven.price@arm.com>
In-Reply-To: <20190912112804.10104-3-steven.price@arm.com>

Patch 1 is: Acked-by: Alyssa Rosenzweig
Patch 2 is: Reviewed-by: Alyssa Rosenzweig

Good stuff as always!

On Thu, Sep 12, 2019 at 12:28:04PM +0100, Steven Price wrote:
> Instead of tracking per-slot utilisation, track a single value for the
> entire GPU. Ultimately it doesn't matter if the GPU is busy with only
> vertex or a combination of vertex and fragment processing - if it's busy
> then it's busy and devfreq should be scaling appropriately.
> 
> This also makes way for being able to submit multiple jobs per slot,
> which requires more values than the original boolean per slot.
> 
> Signed-off-by: Steven Price
> ---
>  drivers/gpu/drm/panfrost/panfrost_devfreq.c | 64 ++++++++-------------
>  drivers/gpu/drm/panfrost/panfrost_devfreq.h |  3 +-
>  drivers/gpu/drm/panfrost/panfrost_device.h  | 12 ++--
>  drivers/gpu/drm/panfrost/panfrost_job.c     | 14 ++---
>  4 files changed, 38 insertions(+), 55 deletions(-)
> 
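As a sanity check on the new scheme, here is a tiny standalone userspace
model of the accounting I put together while reading the patch. To be
clear, this is not the driver code: record_busy()/record_idle() mirror
the names in the patch, everything else is made up for illustration.

/*
 * Model of the utilisation tracking in this patch: one GPU-wide count
 * of in-flight jobs, plus busy/idle time buckets that are brought up
 * to date on every transition. Build with: cc -std=c11 -o model model.c
 */
#define _POSIX_C_SOURCE 199309L
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

static atomic_int busy_count;       /* jobs in flight, across all slots */
static int64_t busy_ns, idle_ns;    /* accumulated busy/idle time */
static int64_t last_update_ns;      /* when we last accounted time */

static int64_t now_ns(void)
{
        struct timespec ts;

        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (int64_t)ts.tv_sec * 1000000000 + ts.tv_nsec;
}

/* Analogue of panfrost_devfreq_update_utilization(): charge the time
 * since the last update to the busy bucket if any job is in flight,
 * otherwise to the idle bucket. */
static void update_utilization(void)
{
        int64_t now = now_ns();

        if (atomic_load(&busy_count) > 0)
                busy_ns += now - last_update_ns;
        else
                idle_ns += now - last_update_ns;
        last_update_ns = now;
}

static void record_busy(void)           /* job submitted to the HW */
{
        update_utilization();
        atomic_fetch_add(&busy_count, 1);
}

static void record_idle(void)           /* job completed or timed out */
{
        update_utilization();
        /* The driver uses atomic_dec_if_positive() here so the counter
         * can never actually go negative; this model just complains. */
        if (atomic_fetch_sub(&busy_count, 1) <= 0)
                fprintf(stderr, "unbalanced record_idle()\n");
}

int main(void)
{
        const struct timespec ten_ms = { 0, 10 * 1000 * 1000 };

        last_update_ns = now_ns();

        /* Two overlapping jobs on different slots: the window is
         * counted busy exactly once, which is the point of the patch. */
        record_busy();
        record_busy();
        nanosleep(&ten_ms, NULL);
        record_idle();
        record_idle();
        nanosleep(&ten_ms, NULL);
        update_utilization();

        printf("busy %ld ns, idle %ld ns\n", (long)busy_ns, (long)idle_ns);
        return 0;
}

Running it should show the ~10 ms window with two overlapping jobs
counted as busy exactly once, which is the behaviour the commit message
describes.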
> diff --git a/drivers/gpu/drm/panfrost/panfrost_devfreq.c b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
> index 7ded282a5ca8..4c4e8a30a1ac 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_devfreq.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
> @@ -13,7 +13,7 @@
>  #include "panfrost_gpu.h"
>  #include "panfrost_regs.h"
>  
> -static void panfrost_devfreq_update_utilization(struct panfrost_device *pfdev, int slot);
> +static void panfrost_devfreq_update_utilization(struct panfrost_device *pfdev);
>  
>  static int panfrost_devfreq_target(struct device *dev, unsigned long *freq,
>                                     u32 flags)
> @@ -32,37 +32,23 @@ static int panfrost_devfreq_target(struct device *dev, unsigned long *freq,
>  
>  static void panfrost_devfreq_reset(struct panfrost_device *pfdev)
>  {
> -        ktime_t now = ktime_get();
> -        int i;
> -
> -        for (i = 0; i < NUM_JOB_SLOTS; i++) {
> -                pfdev->devfreq.slot[i].busy_time = 0;
> -                pfdev->devfreq.slot[i].idle_time = 0;
> -                pfdev->devfreq.slot[i].time_last_update = now;
> -        }
> +        pfdev->devfreq.busy_time = 0;
> +        pfdev->devfreq.idle_time = 0;
> +        pfdev->devfreq.time_last_update = ktime_get();
>  }
>  
>  static int panfrost_devfreq_get_dev_status(struct device *dev,
>                                             struct devfreq_dev_status *status)
>  {
>          struct panfrost_device *pfdev = dev_get_drvdata(dev);
> -        int i;
>  
> -        for (i = 0; i < NUM_JOB_SLOTS; i++) {
> -                panfrost_devfreq_update_utilization(pfdev, i);
> -        }
> +        panfrost_devfreq_update_utilization(pfdev);
>  
>          status->current_frequency = clk_get_rate(pfdev->clock);
> -        status->total_time = ktime_to_ns(ktime_add(pfdev->devfreq.slot[0].busy_time,
> -                                                   pfdev->devfreq.slot[0].idle_time));
> -
> -        status->busy_time = 0;
> -        for (i = 0; i < NUM_JOB_SLOTS; i++) {
> -                status->busy_time += ktime_to_ns(pfdev->devfreq.slot[i].busy_time);
> -        }
> +        status->total_time = ktime_to_ns(ktime_add(pfdev->devfreq.busy_time,
> +                                                   pfdev->devfreq.idle_time));
>  
> -        /* We're scheduling only to one core atm, so don't divide for now */
> -        /* status->busy_time /= NUM_JOB_SLOTS; */
> +        status->busy_time = ktime_to_ns(pfdev->devfreq.busy_time);
>  
>          panfrost_devfreq_reset(pfdev);
>  
> @@ -134,14 +120,10 @@ void panfrost_devfreq_fini(struct panfrost_device *pfdev)
>  
>  void panfrost_devfreq_resume(struct panfrost_device *pfdev)
>  {
> -        int i;
> -
>          if (!pfdev->devfreq.devfreq)
>                  return;
>  
>          panfrost_devfreq_reset(pfdev);
> -        for (i = 0; i < NUM_JOB_SLOTS; i++)
> -                pfdev->devfreq.slot[i].busy = false;
>  
>          devfreq_resume_device(pfdev->devfreq.devfreq);
>  }
> @@ -154,9 +136,8 @@ void panfrost_devfreq_suspend(struct panfrost_device *pfdev)
>          devfreq_suspend_device(pfdev->devfreq.devfreq);
>  }
>  
> -static void panfrost_devfreq_update_utilization(struct panfrost_device *pfdev, int slot)
> +static void panfrost_devfreq_update_utilization(struct panfrost_device *pfdev)
>  {
> -        struct panfrost_devfreq_slot *devfreq_slot = &pfdev->devfreq.slot[slot];
>          ktime_t now;
>          ktime_t last;
>  
> @@ -164,22 +145,27 @@ static void panfrost_devfreq_update_utilization(struct panfrost_device *pfdev, int slot)
>                  return;
>  
>          now = ktime_get();
> -        last = pfdev->devfreq.slot[slot].time_last_update;
> +        last = pfdev->devfreq.time_last_update;
>  
> -        /* If we last recorded a transition to busy, we have been idle since */
> -        if (devfreq_slot->busy)
> -                pfdev->devfreq.slot[slot].busy_time += ktime_sub(now, last);
> +        if (atomic_read(&pfdev->devfreq.busy_count) > 0)
> +                pfdev->devfreq.busy_time += ktime_sub(now, last);
>          else
> -                pfdev->devfreq.slot[slot].idle_time += ktime_sub(now, last);
> +                pfdev->devfreq.idle_time += ktime_sub(now, last);
> +
> +        pfdev->devfreq.time_last_update = now;
> +}
>  
> -        pfdev->devfreq.slot[slot].time_last_update = now;
> +void panfrost_devfreq_record_busy(struct panfrost_device *pfdev)
> +{
> +        panfrost_devfreq_update_utilization(pfdev);
> +        atomic_inc(&pfdev->devfreq.busy_count);
>  }
>  
> -/* The job scheduler is expected to call this at every transition busy <-> idle */
> -void panfrost_devfreq_record_transition(struct panfrost_device *pfdev, int slot)
> +void panfrost_devfreq_record_idle(struct panfrost_device *pfdev)
>  {
> -        struct panfrost_devfreq_slot *devfreq_slot = &pfdev->devfreq.slot[slot];
> +        int count;
>  
> -        panfrost_devfreq_update_utilization(pfdev, slot);
> -        devfreq_slot->busy = !devfreq_slot->busy;
> +        panfrost_devfreq_update_utilization(pfdev);
> +        count = atomic_dec_if_positive(&pfdev->devfreq.busy_count);
> +        WARN_ON(count < 0);
>  }
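One detail worth calling out for other readers: atomic_dec_if_positive()
never stores a negative value; it returns the would-be result, so the
WARN_ON(count < 0) flags an unbalanced record_idle() without corrupting
busy_count. The semantics are roughly this (userspace sketch only, not
the kernel implementation):

#include <stdatomic.h>

static int dec_if_positive(atomic_int *v)
{
        int old = atomic_load(v);

        /* Only decrement while the stored value is positive; a failed
         * CAS reloads 'old' and we retry. */
        while (old > 0 && !atomic_compare_exchange_weak(v, &old, old - 1))
                ;

        return old - 1; /* -1 here means the counter was already zero */
}

So a stray record_idle() on an idle GPU trips the warning but leaves the
counter at zero, which seems like the right failure mode.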
> diff --git a/drivers/gpu/drm/panfrost/panfrost_devfreq.h b/drivers/gpu/drm/panfrost/panfrost_devfreq.h
> index e3bc63e82843..0611beffc8d0 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_devfreq.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_devfreq.h
> @@ -10,6 +10,7 @@ void panfrost_devfreq_fini(struct panfrost_device *pfdev);
>  void panfrost_devfreq_resume(struct panfrost_device *pfdev);
>  void panfrost_devfreq_suspend(struct panfrost_device *pfdev);
>  
> -void panfrost_devfreq_record_transition(struct panfrost_device *pfdev, int slot);
> +void panfrost_devfreq_record_busy(struct panfrost_device *pfdev);
> +void panfrost_devfreq_record_idle(struct panfrost_device *pfdev);
>  
>  #endif /* __PANFROST_DEVFREQ_H__ */
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
> index 4c2b3c84baac..233957f88d77 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
> @@ -51,13 +51,6 @@ struct panfrost_features {
>          unsigned long hw_issues[64 / BITS_PER_LONG];
>  };
>  
> -struct panfrost_devfreq_slot {
> -        ktime_t busy_time;
> -        ktime_t idle_time;
> -        ktime_t time_last_update;
> -        bool busy;
> -};
> -
>  struct panfrost_device {
>          struct device *dev;
>          struct drm_device *ddev;
> @@ -95,7 +88,10 @@ struct panfrost_device {
>          struct {
>                  struct devfreq *devfreq;
>                  struct thermal_cooling_device *cooling;
> -                struct panfrost_devfreq_slot slot[NUM_JOB_SLOTS];
> +                ktime_t busy_time;
> +                ktime_t idle_time;
> +                ktime_t time_last_update;
> +                atomic_t busy_count;
>          } devfreq;
>  };
>  
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 05c85f45a0de..47aeadb4f161 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -155,7 +155,7 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
>  
>          cfg = panfrost_mmu_as_get(pfdev, &job->file_priv->mmu);
>  
> -        panfrost_devfreq_record_transition(pfdev, js);
> +        panfrost_devfreq_record_busy(pfdev);
>          spin_lock_irqsave(&pfdev->hwaccess_lock, flags);
>  
>          job_write(pfdev, JS_HEAD_NEXT_LO(js), jc_head & 0xFFFFFFFF);
> @@ -396,7 +396,7 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>  
>          /* panfrost_core_dump(pfdev); */
>  
> -        panfrost_devfreq_record_transition(pfdev, js);
> +        panfrost_devfreq_record_idle(pfdev);
>          panfrost_device_reset(pfdev);
>  
>          for (i = 0; i < NUM_JOB_SLOTS; i++)
> @@ -454,7 +454,7 @@ static irqreturn_t panfrost_job_irq_handler(int irq, void *data)
>  
>                  pfdev->jobs[j] = NULL;
>                  panfrost_mmu_as_put(pfdev, &job->file_priv->mmu);
> -                panfrost_devfreq_record_transition(pfdev, j);
> +                panfrost_devfreq_record_idle(pfdev);
>                  dma_fence_signal(job->done_fence);
>          }
>  
> @@ -551,14 +551,14 @@ int panfrost_job_is_idle(struct panfrost_device *pfdev)
>          struct panfrost_job_slot *js = pfdev->js;
>          int i;
>  
> +        /* Check whether the hardware is idle */
> +        if (atomic_read(&pfdev->devfreq.busy_count))
> +                return false;
> +
>          for (i = 0; i < NUM_JOB_SLOTS; i++) {
>                  /* If there are any jobs in the HW queue, we're not idle */
>                  if (atomic_read(&js->queue[i].sched.hw_rq_count))
>                          return false;
> -
> -                /* Check whether the hardware is idle */
> -                if (pfdev->devfreq.slot[i].busy)
> -                        return false;
>          }
>  
>          return true;
> -- 
> 2.20.1