Received: by 2002:ad5:4acb:0:0:0:0:0 with SMTP id n11csp3457949imw; Mon, 18 Jul 2022 08:25:13 -0700 (PDT) X-Google-Smtp-Source: AGRyM1v36LqhOgVJSGcXyscot5JLXNxTrvqmbocfvQpnlM90O4yIAEOyghJLdT6iZ2W3PwrEi/Ei X-Received: by 2002:a17:907:16ab:b0:72c:7533:7262 with SMTP id hc43-20020a17090716ab00b0072c75337262mr24429497ejc.288.1658157913064; Mon, 18 Jul 2022 08:25:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1658157913; cv=none; d=google.com; s=arc-20160816; b=weWIyaBDkfmcqXDTqKAHun2b0hgl98XoZeDhA93C9zuQ+L/ihfWp/ltQruwLdkmjil o+AfjvGkgYNwL/gqVen09lXwb8m0l8hFcbYZiJLRiwqcxn5jV3lwl00EX0Ek7md7THjk nap/hc37mgKxa3IL0xmPrwgVaiTVdsj6OlpEcJ6pHwmvuBQgVW/n4I+UgBUFIy3alYGw BkAUdhZzQxwrn9txnYogrl5/2dG82rKOwjgFA5heaBWZavWyJfcm2Ym8a6cbJLEY4Z0U XZwczXsvlZQe+GZW3XAfP1o3w0JFXZj7Rs68dv7Y1lbf79wnoOJ5Vy+EnMW99PbX+9Ob FelA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to :organization:from:references:cc:to:content-language:subject :user-agent:mime-version:date:message-id:dkim-signature; bh=20XCcW1JvQfYpeLuGuYbsAyJyV0n9kmQHJVGanhXW1M=; b=KkyfNJQCLtt2fGUxB3JsPwzGb6FTZt+DVTTtEtmma9neWyxmqKLh3pt0i45ZZ50dpo nhSBTYSH79utkx16Hmq0ayHOSv7eeDsNMuH0w3TpvSkx8sABIeCa1csfkzzfN/Gfl4cm 84weNskYJmwNc7QVv0LnSDRog4oVTtxiRQGbA8lrZ+sPAnsFnYGcPIUnoRHykJfN/vhd uYaRvaItI4Dp4lGLzKE2So6KYi22UBDpuyHLPOSKKzYy+z+nUxvgdxNqvglwKtLKt+UZ Y1pGSEussuwlD+Y/J2MpsbJAPkB626WnpafZIEnDR3fdnpIkSo5SA61b1Szv79b3AwRs d1Jg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="fo/JkCDx"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id l28-20020a056402345c00b0043585bc1c54si16106216edc.567.2022.07.18.08.24.48; Mon, 18 Jul 2022 08:25:13 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="fo/JkCDx"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232739AbiGRPCI (ORCPT + 99 others); Mon, 18 Jul 2022 11:02:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52668 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229639AbiGRPCH (ORCPT ); Mon, 18 Jul 2022 11:02:07 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D5A6A24BF9; Mon, 18 Jul 2022 08:02:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1658156526; x=1689692526; h=message-id:date:mime-version:subject:to:cc:references: from:in-reply-to:content-transfer-encoding; bh=dOad1nR/yXKty+tVe0LpjOM4v857tHTn09mRRzWe0Wg=; b=fo/JkCDx4o8kbyPwAGehoeBYLuXhi4lJ7aEjWybWQkZFMrhcOS5H+UYl d4oak69ibTA6J8YVPdSqySPw0H5HCSoYI5bARR9iurL5nbSy0nyTCSnDw qnpIy77h+GXYkZj+M0I8dWO9ed9Ywxi0yicWwM5iA/sCXP6eA4aSGVVSI JPlXMt0WUlU90h60eVczR5fqTyE8QR7I0VP6YiwVKsiu1kyNFC7L28eDS LfSyTWkRYpUn38QHK+UhQTXiddABptxUniJPuov/P0x8Dj+vvyliI+H19 7ijyuZmNUYtKmbh7OAujGiQaXBQ3ffg6fJXomWCnaBEFlsmK0OtCKWjhg w==; X-IronPort-AV: E=McAfee;i="6400,9594,10412"; a="372549474" X-IronPort-AV: E=Sophos;i="5.92,281,1650956400"; d="scan'208";a="372549474" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2022 08:02:00 -0700 X-IronPort-AV: E=Sophos;i="5.92,281,1650956400"; d="scan'208";a="686769046" Received: from smyint-mobl1.amr.corp.intel.com (HELO [10.212.107.15]) ([10.212.107.15]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2022 08:01:57 -0700 Message-ID: Date: Mon, 18 Jul 2022 16:01:56 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0 Subject: Re: [Intel-gfx] [PATCH v2 01/21] drm/i915/gt: Ignore TLB invalidations on idle engines Content-Language: en-US To: Mauro Carvalho Chehab Cc: Mauro Carvalho Chehab , David Airlie , dri-devel@lists.freedesktop.org, Chris Wilson , Matthew Auld , Dave Airlie , =?UTF-8?Q?Thomas_Hellstr=c3=b6m?= , Lucas De Marchi , intel-gfx@lists.freedesktop.org, Rodrigo Vivi , linux-kernel@vger.kernel.org, stable@vger.kernel.org References: <76318fe1-37dc-8a1e-317e-76333995b8ca@linux.intel.com> <20220718165341.30ee6e31@maurocar-mobl2> From: Tvrtko Ursulin Organization: Intel Corporation UK Plc In-Reply-To: <20220718165341.30ee6e31@maurocar-mobl2> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-3.0 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,HK_RANDOM_ENVFROM,HK_RANDOM_FROM, NICE_REPLY_A,RCVD_IN_DNSWL_MED,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 18/07/2022 15:53, Mauro Carvalho Chehab wrote: > On Mon, 18 Jul 2022 14:16:10 +0100 > Tvrtko Ursulin wrote: > >> On 14/07/2022 13:06, Mauro Carvalho Chehab wrote: >>> From: Chris Wilson >>> >>> Check if the device is powered down prior to any engine activity, >>> as, on such cases, all the TLBs were already invalidated, so an >>> explicit TLB invalidation is not needed, thus reducing the >>> performance regression impact due to it. >>> >>> This becomes more significant with GuC, as it can only do so when >>> the connection to the GuC is awake. >>> >>> Cc: stable@vger.kernel.org >>> Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store") >> >> Patch itself looks fine but I don't think we closed on the issue of >> stable/fixes on this patch? > > No, because TLB cache invalidation takes time and causes time outs, which > in turn affects applications and produce Kernel warnings. > > There's even open bugs due to TLB timeouts, like this one: > > [424.370996] i915 0000:00:02.0: [drm] *ERROR* rcs0 TLB invalidation did not complete in 4ms! > > See: > https://gitlab.freedesktop.org/drm/intel/-/issues/6424 > > So, while this is a performance regression, it ends causing a > functional regression. This test is not even particularly stressful. Fair enough - thanks for the information. Acked-by: Tvrtko Ursulin Is skipping of the ggtt only bound flush the fix for this particular test? Regards, Tvrtko > > The first part of this series (patches 1-7) are meant to reduce the > risk of such timeouts by doing TLB invalidation in batch and only > when really needed (userspace-exposed TLBs for GTs that are powered-on > and non-edged). > > As they're fixing such regressions, it makes sense c/c stable and having > a fixes tag. > >> My position here is that, if the functional issue is only with GuC >> invalidations, then the tags shouldn't be there (and the huge CC list). >> >> Regards, >> >> Tvrtko >> >>> Signed-off-by: Chris Wilson >>> Cc: Fei Yang >>> Cc: Andi Shyti >>> Cc: Thomas Hellström >>> Signed-off-by: Mauro Carvalho Chehab >>> --- >>> >>> To avoid mailbombing on a large number of people, only mailing lists were C/C on the cover. >>> See [PATCH v2 00/21] at: https://lore.kernel.org/all/cover.1657800199.git.mchehab@kernel.org/ >>> >>> drivers/gpu/drm/i915/gem/i915_gem_pages.c | 10 ++++++---- >>> drivers/gpu/drm/i915/gt/intel_gt.c | 17 ++++++++++------- >>> drivers/gpu/drm/i915/gt/intel_gt_pm.h | 3 +++ >>> 3 files changed, 19 insertions(+), 11 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c >>> index 97c820eee115..6835279943df 100644 >>> --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c >>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c >>> @@ -6,14 +6,15 @@ >>> >>> #include >>> >>> +#include "gt/intel_gt.h" >>> +#include "gt/intel_gt_pm.h" >>> + >>> #include "i915_drv.h" >>> #include "i915_gem_object.h" >>> #include "i915_scatterlist.h" >>> #include "i915_gem_lmem.h" >>> #include "i915_gem_mman.h" >>> >>> -#include "gt/intel_gt.h" >>> - >>> void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj, >>> struct sg_table *pages, >>> unsigned int sg_page_sizes) >>> @@ -217,10 +218,11 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj) >>> >>> if (test_and_clear_bit(I915_BO_WAS_BOUND_BIT, &obj->flags)) { >>> struct drm_i915_private *i915 = to_i915(obj->base.dev); >>> + struct intel_gt *gt = to_gt(i915); >>> intel_wakeref_t wakeref; >>> >>> - with_intel_runtime_pm_if_active(&i915->runtime_pm, wakeref) >>> - intel_gt_invalidate_tlbs(to_gt(i915)); >>> + with_intel_gt_pm_if_awake(gt, wakeref) >>> + intel_gt_invalidate_tlbs(gt); >>> } >>> >>> return pages; >>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c >>> index 68c2b0d8f187..c4d43da84d8e 100644 >>> --- a/drivers/gpu/drm/i915/gt/intel_gt.c >>> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c >>> @@ -12,6 +12,7 @@ >>> >>> #include "i915_drv.h" >>> #include "intel_context.h" >>> +#include "intel_engine_pm.h" >>> #include "intel_engine_regs.h" >>> #include "intel_ggtt_gmch.h" >>> #include "intel_gt.h" >>> @@ -924,6 +925,7 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt) >>> struct drm_i915_private *i915 = gt->i915; >>> struct intel_uncore *uncore = gt->uncore; >>> struct intel_engine_cs *engine; >>> + intel_engine_mask_t awake, tmp; >>> enum intel_engine_id id; >>> const i915_reg_t *regs; >>> unsigned int num = 0; >>> @@ -947,26 +949,31 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt) >>> >>> GEM_TRACE("\n"); >>> >>> - assert_rpm_wakelock_held(&i915->runtime_pm); >>> - >>> mutex_lock(>->tlb_invalidate_lock); >>> intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL); >>> >>> spin_lock_irq(&uncore->lock); /* serialise invalidate with GT reset */ >>> >>> + awake = 0; >>> for_each_engine(engine, gt, id) { >>> struct reg_and_bit rb; >>> >>> + if (!intel_engine_pm_is_awake(engine)) >>> + continue; >>> + >>> rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num); >>> if (!i915_mmio_reg_offset(rb.reg)) >>> continue; >>> >>> intel_uncore_write_fw(uncore, rb.reg, rb.bit); >>> + awake |= engine->mask; >>> } >>> >>> spin_unlock_irq(&uncore->lock); >>> >>> - for_each_engine(engine, gt, id) { >>> + for_each_engine_masked(engine, gt, awake, tmp) { >>> + struct reg_and_bit rb; >>> + >>> /* >>> * HW architecture suggest typical invalidation time at 40us, >>> * with pessimistic cases up to 100us and a recommendation to >>> @@ -974,12 +981,8 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt) >>> */ >>> const unsigned int timeout_us = 100; >>> const unsigned int timeout_ms = 4; >>> - struct reg_and_bit rb; >>> >>> rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num); >>> - if (!i915_mmio_reg_offset(rb.reg)) >>> - continue; >>> - >>> if (__intel_wait_for_register_fw(uncore, >>> rb.reg, rb.bit, 0, >>> timeout_us, timeout_ms, >>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.h b/drivers/gpu/drm/i915/gt/intel_gt_pm.h >>> index bc898df7a48c..a334787a4939 100644 >>> --- a/drivers/gpu/drm/i915/gt/intel_gt_pm.h >>> +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.h >>> @@ -55,6 +55,9 @@ static inline void intel_gt_pm_might_put(struct intel_gt *gt) >>> for (tmp = 1, intel_gt_pm_get(gt); tmp; \ >>> intel_gt_pm_put(gt), tmp = 0) >>> >>> +#define with_intel_gt_pm_if_awake(gt, wf) \ >>> + for (wf = intel_gt_pm_get_if_awake(gt); wf; intel_gt_pm_put_async(gt), wf = 0) >>> + >>> static inline int intel_gt_pm_wait_for_idle(struct intel_gt *gt) >>> { >>> return intel_wakeref_wait_for_idle(>->wakeref);