2022-06-15 15:32:25

by Mauro Carvalho Chehab

Subject: [PATCH 1/6] drm/i915/gt: Ignore TLB invalidations on idle engines

From: Chris Wilson <[email protected]>

As an extension of the existing logic that skips TLB invalidations,
check whether the device is powered down prior to any engine activity.
In such cases all the TLBs have already been invalidated, so an
explicit TLB invalidation is not needed.

This becomes more significant with GuC, as the invalidation can only
be performed while the connection to the GuC is awake.

Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")

Signed-off-by: Chris Wilson <[email protected]>
Cc: Fei Yang <[email protected]>
Cc: Andi Shyti <[email protected]>
Cc: [email protected]
Acked-by: Thomas Hellström <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
---

See [PATCH 0/6] at: https://lore.kernel.org/all/[email protected]/
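
Editor's illustration of the "only if awake" scoping used by this patch.
This is a minimal userspace sketch, not the i915 implementation: struct
fake_gt, fake_gt_get_if_awake(), fake_gt_put() and with_fake_gt_if_awake()
are hypothetical stand-ins, and the wakeref is modelled as a plain counter
with no locking or asynchronous release. It only shows how a one-shot
for-loop macro runs its body when a reference could be taken and skips it
otherwise.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a GT: a bare reference counter, no locking. */
struct fake_gt {
	int wakeref_count;	/* > 0 means somebody already keeps the GT awake */
	int invalidations;	/* how many times the guarded body ran */
};

/* Take an extra reference only if the GT is already awake. */
static bool fake_gt_get_if_awake(struct fake_gt *gt)
{
	if (gt->wakeref_count == 0)
		return false;	/* asleep: do not wake it up just for this */

	gt->wakeref_count++;
	return true;
}

/* Drop the reference taken by fake_gt_get_if_awake(). */
static void fake_gt_put(struct fake_gt *gt)
{
	assert(gt->wakeref_count > 0);
	gt->wakeref_count--;
}

/*
 * One-shot loop: the body runs once if the reference was taken, and the
 * reference is dropped in the loop's increment expression. If the GT was
 * asleep, the condition is false on entry and the body never runs.
 */
#define with_fake_gt_if_awake(gt, wf) \
	for ((wf) = fake_gt_get_if_awake(gt); (wf); \
	     fake_gt_put(gt), (wf) = false)

/* Caller side: the "invalidation" is skipped when nothing is awake. */
static void maybe_invalidate(struct fake_gt *gt)
{
	bool wf;

	with_fake_gt_if_awake(gt, wf)
		gt->invalidations++;
}
```

One caveat of this idiom in general: leaving the body with break or
return skips the increment expression, so the reference would not be
dropped.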

drivers/gpu/drm/i915/gem/i915_gem_pages.c | 10 +++++----
drivers/gpu/drm/i915/gt/intel_gt.c | 26 +++++++++++++++++------
drivers/gpu/drm/i915/gt/intel_gt_pm.h | 3 +++
3 files changed, 28 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index 97c820eee115..6835279943df 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -6,14 +6,15 @@

#include <drm/drm_cache.h>

+#include "gt/intel_gt.h"
+#include "gt/intel_gt_pm.h"
+
#include "i915_drv.h"
#include "i915_gem_object.h"
#include "i915_scatterlist.h"
#include "i915_gem_lmem.h"
#include "i915_gem_mman.h"

-#include "gt/intel_gt.h"
-
void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
struct sg_table *pages,
unsigned int sg_page_sizes)
@@ -217,10 +218,11 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj)

if (test_and_clear_bit(I915_BO_WAS_BOUND_BIT, &obj->flags)) {
struct drm_i915_private *i915 = to_i915(obj->base.dev);
+ struct intel_gt *gt = to_gt(i915);
intel_wakeref_t wakeref;

- with_intel_runtime_pm_if_active(&i915->runtime_pm, wakeref)
- intel_gt_invalidate_tlbs(to_gt(i915));
+ with_intel_gt_pm_if_awake(gt, wakeref)
+ intel_gt_invalidate_tlbs(gt);
}

return pages;
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index f33290358c51..d5ed6a6ac67c 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -11,6 +11,7 @@

#include "i915_drv.h"
#include "intel_context.h"
+#include "intel_engine_pm.h"
#include "intel_engine_regs.h"
#include "intel_gt.h"
#include "intel_gt_buffer_pool.h"
@@ -1216,6 +1217,7 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt)
struct drm_i915_private *i915 = gt->i915;
struct intel_uncore *uncore = gt->uncore;
struct intel_engine_cs *engine;
+ intel_engine_mask_t awake, tmp;
enum intel_engine_id id;
const i915_reg_t *regs;
unsigned int num = 0;
@@ -1239,12 +1241,27 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt)

GEM_TRACE("\n");

- assert_rpm_wakelock_held(&i915->runtime_pm);
-
mutex_lock(&gt->tlb_invalidate_lock);
intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);

+ awake = 0;
for_each_engine(engine, gt, id) {
+ struct reg_and_bit rb;
+
+ if (!intel_engine_pm_is_awake(engine))
+ continue;
+
+ rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
+ if (!i915_mmio_reg_offset(rb.reg))
+ continue;
+
+ intel_uncore_write_fw(uncore, rb.reg, rb.bit);
+ awake |= engine->mask;
+ }
+
+ for_each_engine_masked(engine, gt, awake, tmp) {
+ struct reg_and_bit rb;
+
/*
* HW architecture suggest typical invalidation time at 40us,
* with pessimistic cases up to 100us and a recommendation to
@@ -1252,13 +1269,8 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt)
*/
const unsigned int timeout_us = 100;
const unsigned int timeout_ms = 4;
- struct reg_and_bit rb;

rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
- if (!i915_mmio_reg_offset(rb.reg))
- continue;
-
- intel_uncore_write_fw(uncore, rb.reg, rb.bit);
if (__intel_wait_for_register_fw(uncore,
rb.reg, rb.bit, 0,
timeout_us, timeout_ms,
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.h b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
index bc898df7a48c..a334787a4939 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
@@ -55,6 +55,9 @@ static inline void intel_gt_pm_might_put(struct intel_gt *gt)
for (tmp = 1, intel_gt_pm_get(gt); tmp; \
intel_gt_pm_put(gt), tmp = 0)

+#define with_intel_gt_pm_if_awake(gt, wf) \
+ for (wf = intel_gt_pm_get_if_awake(gt); wf; intel_gt_pm_put_async(gt), wf = 0)
+
static inline int intel_gt_pm_wait_for_idle(struct intel_gt *gt)
{
return intel_wakeref_wait_for_idle(&gt->wakeref);
--
2.36.1
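
Editor's illustration of the loop split in the intel_gt.c hunk above:
the first pass writes the invalidation request on every awake engine
and records those engines in a mask, and the second pass waits only on
the engines in that mask. The sketch below is a userspace model under
stated assumptions, not driver code: struct fake_engine,
kick_awake_engines(), wait_for_kicked() and fake_hw_tick() are
hypothetical names, the register is a plain integer, and the "hardware"
completes a request in a single tick.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_ENGINES 4

/* Hypothetical engine model: bit 0 of reg means "invalidation pending". */
struct fake_engine {
	bool awake;	/* engine currently holds a wakeref */
	bool has_reg;	/* engine has an invalidation register at all */
	uint32_t reg;
};

/* Pass 1: request invalidation on each awake engine, return their mask. */
static uint32_t kick_awake_engines(struct fake_engine *e, unsigned int n)
{
	uint32_t awake = 0;
	unsigned int i;

	for (i = 0; i < n; i++) {
		if (!e[i].awake || !e[i].has_reg)
			continue;	/* idle or unsupported: nothing to do */

		e[i].reg |= 1u;		/* set the request bit */
		awake |= 1u << i;
	}

	return awake;
}

/* Simulated hardware: finishes a pending invalidation in one tick. */
static void fake_hw_tick(struct fake_engine *e)
{
	e->reg &= ~1u;
}

/* Pass 2: poll only the kicked engines; return the mask that timed out. */
static uint32_t wait_for_kicked(struct fake_engine *e, unsigned int n,
				uint32_t awake, unsigned int max_polls)
{
	uint32_t timed_out = 0;
	unsigned int i, polls;

	for (i = 0; i < n; i++) {
		if (!(awake & (1u << i)))
			continue;

		for (polls = 0; (e[i].reg & 1u) && polls < max_polls; polls++)
			fake_hw_tick(&e[i]);

		if (e[i].reg & 1u)
			timed_out |= 1u << i;
	}

	return timed_out;
}
```

With this shape, all requests are outstanding before the first wait
begins, rather than each engine being written and then waited on in
turn.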


2022-06-16 07:31:23

by Tvrtko Ursulin

Subject: Re: [PATCH 1/6] drm/i915/gt: Ignore TLB invalidations on idle engines


On 15/06/2022 16:27, Mauro Carvalho Chehab wrote:
> From: Chris Wilson <[email protected]>
>
> As an extension of the current skip TLB invalidations,
> check if the device is powered down prior to any engine activity,
>
> as, on such cases, all the TLBs were already invalidated, so an
> explicit TLB invalidation is not needed.
>
> This becomes more significant with GuC, as it can only do so when
> the connection to the GuC is awake.
>
> Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")

Hmmm, is this a fix or "an extension"? The commit text mentions both
options. The GuC angle does not appear relevant for upstream yet, so the
question is whether cc: stable is really required.

Regards,

Tvrtko

>
> Signed-off-by: Chris Wilson <[email protected]>
> Cc: Fei Yang <[email protected]>
> Cc: Andi Shyti <[email protected]>
> Cc: [email protected]
> Acked-by: Thomas Hellström <[email protected]>
> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
> ---
>
> See [PATCH 0/6] at: https://lore.kernel.org/all/[email protected]/
>
> drivers/gpu/drm/i915/gem/i915_gem_pages.c | 10 +++++----
> drivers/gpu/drm/i915/gt/intel_gt.c | 26 +++++++++++++++++------
> drivers/gpu/drm/i915/gt/intel_gt_pm.h | 3 +++
> 3 files changed, 28 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> index 97c820eee115..6835279943df 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> @@ -6,14 +6,15 @@
>
> #include <drm/drm_cache.h>
>
> +#include "gt/intel_gt.h"
> +#include "gt/intel_gt_pm.h"
> +
> #include "i915_drv.h"
> #include "i915_gem_object.h"
> #include "i915_scatterlist.h"
> #include "i915_gem_lmem.h"
> #include "i915_gem_mman.h"
>
> -#include "gt/intel_gt.h"
> -
> void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
> struct sg_table *pages,
> unsigned int sg_page_sizes)
> @@ -217,10 +218,11 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj)
>
> if (test_and_clear_bit(I915_BO_WAS_BOUND_BIT, &obj->flags)) {
> struct drm_i915_private *i915 = to_i915(obj->base.dev);
> + struct intel_gt *gt = to_gt(i915);
> intel_wakeref_t wakeref;
>
> - with_intel_runtime_pm_if_active(&i915->runtime_pm, wakeref)
> - intel_gt_invalidate_tlbs(to_gt(i915));
> + with_intel_gt_pm_if_awake(gt, wakeref)
> + intel_gt_invalidate_tlbs(gt);
> }
>
> return pages;
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
> index f33290358c51..d5ed6a6ac67c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> @@ -11,6 +11,7 @@
>
> #include "i915_drv.h"
> #include "intel_context.h"
> +#include "intel_engine_pm.h"
> #include "intel_engine_regs.h"
> #include "intel_gt.h"
> #include "intel_gt_buffer_pool.h"
> @@ -1216,6 +1217,7 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt)
> struct drm_i915_private *i915 = gt->i915;
> struct intel_uncore *uncore = gt->uncore;
> struct intel_engine_cs *engine;
> + intel_engine_mask_t awake, tmp;
> enum intel_engine_id id;
> const i915_reg_t *regs;
> unsigned int num = 0;
> @@ -1239,12 +1241,27 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt)
>
> GEM_TRACE("\n");
>
> - assert_rpm_wakelock_held(&i915->runtime_pm);
> -
> mutex_lock(&gt->tlb_invalidate_lock);
> intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);
>
> + awake = 0;
> for_each_engine(engine, gt, id) {
> + struct reg_and_bit rb;
> +
> + if (!intel_engine_pm_is_awake(engine))
> + continue;
> +
> + rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
> + if (!i915_mmio_reg_offset(rb.reg))
> + continue;
> +
> + intel_uncore_write_fw(uncore, rb.reg, rb.bit);
> + awake |= engine->mask;
> + }
> +
> + for_each_engine_masked(engine, gt, awake, tmp) {
> + struct reg_and_bit rb;
> +
> /*
> * HW architecture suggest typical invalidation time at 40us,
> * with pessimistic cases up to 100us and a recommendation to
> @@ -1252,13 +1269,8 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt)
> */
> const unsigned int timeout_us = 100;
> const unsigned int timeout_ms = 4;
> - struct reg_and_bit rb;
>
> rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
> - if (!i915_mmio_reg_offset(rb.reg))
> - continue;
> -
> - intel_uncore_write_fw(uncore, rb.reg, rb.bit);
> if (__intel_wait_for_register_fw(uncore,
> rb.reg, rb.bit, 0,
> timeout_us, timeout_ms,
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.h b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
> index bc898df7a48c..a334787a4939 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_pm.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
> @@ -55,6 +55,9 @@ static inline void intel_gt_pm_might_put(struct intel_gt *gt)
> for (tmp = 1, intel_gt_pm_get(gt); tmp; \
> intel_gt_pm_put(gt), tmp = 0)
>
> +#define with_intel_gt_pm_if_awake(gt, wf) \
> + for (wf = intel_gt_pm_get_if_awake(gt); wf; intel_gt_pm_put_async(gt), wf = 0)
> +
> static inline int intel_gt_pm_wait_for_idle(struct intel_gt *gt)
> {
> return intel_wakeref_wait_for_idle(&gt->wakeref);

2022-06-23 11:39:00

by Andi Shyti

Subject: Re: [PATCH 1/6] drm/i915/gt: Ignore TLB invalidations on idle engines

Hi Mauro,

On Wed, Jun 15, 2022 at 04:27:35PM +0100, Mauro Carvalho Chehab wrote:
> From: Chris Wilson <[email protected]>
>
> As an extension of the current skip TLB invalidations,
> check if the device is powered down prior to any engine activity,
>
> as, on such cases, all the TLBs were already invalidated, so an
> explicit TLB invalidation is not needed.
>
> This becomes more significant with GuC, as it can only do so when
> the connection to the GuC is awake.
>
> Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")
>
> Signed-off-by: Chris Wilson <[email protected]>
> Cc: Fei Yang <[email protected]>
> Cc: Andi Shyti <[email protected]>
> Cc: [email protected]
> Acked-by: Thomas Hellström <[email protected]>
> Signed-off-by: Mauro Carvalho Chehab <[email protected]>

Reviewed-by: Andi Shyti <[email protected]>

Andi