Received: by 2002:a05:6358:4e97:b0:b3:742d:4702 with SMTP id ce23csp3025869rwb; Mon, 15 Aug 2022 16:23:55 -0700 (PDT) X-Google-Smtp-Source: AA6agR4Cp21zoNcciSpQZCJqGTemxXUvCnHpIOfw3X4wVtHaDxBEcJJEr5Tc+BTSzepg3h/KpbX2 X-Received: by 2002:a17:902:b697:b0:172:65f9:d681 with SMTP id c23-20020a170902b69700b0017265f9d681mr10042648pls.137.1660605834851; Mon, 15 Aug 2022 16:23:54 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1660605834; cv=none; d=google.com; s=arc-20160816; b=SIQDl1txpd4yWvuRtVZAVy0ruwlc6dvHZBEG47W3gXp9epbGUiguXAcbSSUw6ceeTj o7zGJ3c9B3fADC4pK3y0PIl5hfZuEi1hfstBn04i6ZqcoblCrXfgGTV0MyUX/PW5ouao vti4QQyf0qSUXIaB+N7/TM5k9hDHvf0u7FUAlKKPn2YFEnhYOPMeOch9mqjBeItifMSx TtMm5WyVRDsWPldBuq0xx83BUYSiaIbBauibS+RQI6qrl0BDZsXD/jfvCKkOQaobaoq5 ppsiofjGtG4rcj4FLPaC2w0WLcNXpaRCe+A+tBILe/ZdtOEnxLfBCMP2Ycb0tvpbpq+V 9W7w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature; bh=li4gZJiiMpsyqvlaxYg18MhArdbt/yY3L7D7+cRuxYA=; b=cRflTTbwLgOKcuN5K3SQK6/cSKcP1nDk5eNpwyJgYSE3aWdn/PSgWJPqjBJBMGX3OA 1O4FoHijq1wjsTROW9EmicP90ejoYnMtxLr92cEG8V0xKzibOTL74sb34ocJ9nmF6t3r a4T29k8RFO8uj/KaWlPtu8Eok3OFcpeTx+Px7MKpljJuGiRq4eJpYPld0JwCAEeUoOW7 icUKw5NVSC4vE/Klc8sBTyBZYDsvIYjymMF58fzm57YcT+avtjUYDUg4YVEOMqsMtf3y Ual3tz7oPU+7FvXDqoVvWaKUCVSbCGUZlyit95CdC0LEil8ffiBBGmnLoJC2JxFqdexj kvqQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linuxfoundation.org header.s=korg header.b=yJRxCO4y; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id v22-20020a056a00149600b0052e33f30ca2si12490205pfu.208.2022.08.15.16.23.43; Mon, 15 Aug 2022 16:23:54 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linuxfoundation.org header.s=korg header.b=yJRxCO4y; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345145AbiHOVRk (ORCPT + 99 others); Mon, 15 Aug 2022 17:17:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46422 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1347726AbiHOVHS (ORCPT ); Mon, 15 Aug 2022 17:07:18 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 22DD737F88; Mon, 15 Aug 2022 12:15:47 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 2DA85B81106; Mon, 15 Aug 2022 19:15:46 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8B73DC433D6; Mon, 15 Aug 2022 19:15:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1660590944; bh=z+5x1ygZ0wmS3S4zFmjdaGOeyp2L82efvKH3V/p6uRc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=yJRxCO4y6yllgDM8q8ilHmbc2VyfFFUsIenbVlHBBJropzMwB8fBbPhd2xbGpAUTy 9h0JJYc5LFJctkfbWGht/kschzf/gPStd79zAgktswLoC4+Z1Vm79vmJtR7HRgr4B5 0ivyoQ+aPsSdoS41ca41EMkpqz9UCM8ubaNZ0aUI= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Douglas Anderson , Rob Clark , Rob Clark , Sasha Levin Subject: [PATCH 5.18 0421/1095] drm/msm: Avoid unclocked GMU register access in 6xx gpu_busy Date: Mon, 15 Aug 2022 19:57:00 +0200 Message-Id: <20220815180447.129052352@linuxfoundation.org> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20220815180429.240518113@linuxfoundation.org> References: <20220815180429.240518113@linuxfoundation.org> User-Agent: quilt/0.67 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-7.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Douglas Anderson [ Upstream commit 6694482a70e9536efbf2ac233cbf0c302d6e2dae ] >From testing on sc7180-trogdor devices, reading the GMU registers needs the GMU clocks to be enabled. Those clocks get turned on in a6xx_gmu_resume(). Confusingly enough, that function is called as a result of the runtime_pm of the GPU "struct device", not the GMU "struct device". Unfortunately the current a6xx_gpu_busy() grabs a reference to the GMU's "struct device". The fact that we were grabbing the wrong reference was easily seen to cause crashes that happen if we change the GPU's pm_runtime usage to not use autosuspend. It's also believed to cause some long tail GPU crashes even with autosuspend. We could look at changing it so that we do pm_runtime_get_if_in_use() on the GPU's "struct device", but then we run into a different problem. pm_runtime_get_if_in_use() will return 0 for the GPU's "struct device" the whole time when we're in the "autosuspend delay". That is, when we drop the last reference to the GPU but we're waiting a period before actually suspending then we'll think the GPU is off. One reason that's bad is that if the GPU didn't actually turn off then the cycle counter doesn't lose state and that throws off all of our calculations. Let's change the code to keep track of the suspend state of devfreq. msm_devfreq_suspend() is always called before we actually suspend the GPU and msm_devfreq_resume() after we resume it. This means we can use the suspended state to know if we're powered or not. NOTE: one might wonder when exactly our status function is called when devfreq is supposed to be disabled. The stack crawl I captured was: msm_devfreq_get_dev_status devfreq_simple_ondemand_func devfreq_update_target qos_notifier_call qos_max_notifier_call blocking_notifier_call_chain pm_qos_update_target freq_qos_apply apply_constraint __dev_pm_qos_update_request dev_pm_qos_update_request msm_devfreq_idle_work Fixes: eadf79286a4b ("drm/msm: Check for powered down HW in the devfreq callbacks") Signed-off-by: Douglas Anderson Reviewed-by: Rob Clark Patchwork: https://patchwork.freedesktop.org/patch/489124/ Link: https://lore.kernel.org/r/20220610124639.v4.1.Ie846c5352bc307ee4248d7cab998ab3016b85d06@changeid Signed-off-by: Rob Clark Signed-off-by: Sasha Levin --- drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 8 ------ drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 13 ++++----- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 12 +++------ drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 3 ++- drivers/gpu/drm/msm/msm_gpu.h | 11 +++++++- drivers/gpu/drm/msm/msm_gpu_devfreq.c | 39 +++++++++++++++++++++------ 6 files changed, 53 insertions(+), 33 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index 217615e0e850..213f75fc98e8 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -1666,18 +1666,10 @@ static u64 a5xx_gpu_busy(struct msm_gpu *gpu, unsigned long *out_sample_rate) { u64 busy_cycles; - /* Only read the gpu busy if the hardware is already active */ - if (pm_runtime_get_if_in_use(&gpu->pdev->dev) == 0) { - *out_sample_rate = 1; - return 0; - } - busy_cycles = gpu_read64(gpu, REG_A5XX_RBBM_PERFCTR_RBBM_0_LO, REG_A5XX_RBBM_PERFCTR_RBBM_0_HI); *out_sample_rate = clk_get_rate(gpu->core_clk); - pm_runtime_put(&gpu->pdev->dev); - return busy_cycles; } diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c index 3e325e2a2b1b..1863908694dc 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c @@ -102,7 +102,8 @@ bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu) A6XX_GMU_SPTPRAC_PWR_CLK_STATUS_GX_HM_CLK_OFF)); } -void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp) +void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp, + bool suspended) { struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu); @@ -127,15 +128,16 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp) /* * This can get called from devfreq while the hardware is idle. Don't - * bring up the power if it isn't already active + * bring up the power if it isn't already active. All we're doing here + * is updating the frequency so that when we come back online we're at + * the right rate. */ - if (pm_runtime_get_if_in_use(gmu->dev) == 0) + if (suspended) return; if (!gmu->legacy) { a6xx_hfi_set_freq(gmu, perf_index); dev_pm_opp_set_opp(&gpu->pdev->dev, opp); - pm_runtime_put(gmu->dev); return; } @@ -159,7 +161,6 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp) dev_err(gmu->dev, "GMU set GPU frequency error: %d\n", ret); dev_pm_opp_set_opp(&gpu->pdev->dev, opp); - pm_runtime_put(gmu->dev); } unsigned long a6xx_gmu_get_freq(struct msm_gpu *gpu) @@ -895,7 +896,7 @@ static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct a6xx_gmu *gmu) return; gmu->freq = 0; /* so a6xx_gmu_set_freq() doesn't exit early */ - a6xx_gmu_set_freq(gpu, gpu_opp); + a6xx_gmu_set_freq(gpu, gpu_opp, false); dev_pm_opp_put(gpu_opp); } diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 40fb92becc78..a8a0d798d17f 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -1658,27 +1658,21 @@ static u64 a6xx_gpu_busy(struct msm_gpu *gpu, unsigned long *out_sample_rate) /* 19.2MHz */ *out_sample_rate = 19200000; - /* Only read the gpu busy if the hardware is already active */ - if (pm_runtime_get_if_in_use(a6xx_gpu->gmu.dev) == 0) - return 0; - busy_cycles = gmu_read64(&a6xx_gpu->gmu, REG_A6XX_GMU_CX_GMU_POWER_COUNTER_XOCLK_0_L, REG_A6XX_GMU_CX_GMU_POWER_COUNTER_XOCLK_0_H); - - pm_runtime_put(a6xx_gpu->gmu.dev); - return busy_cycles; } -static void a6xx_gpu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp) +static void a6xx_gpu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp, + bool suspended) { struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu); mutex_lock(&a6xx_gpu->gmu.lock); - a6xx_gmu_set_freq(gpu, opp); + a6xx_gmu_set_freq(gpu, opp, suspended); mutex_unlock(&a6xx_gpu->gmu.lock); } diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h index 86e0a7c3fe6d..ab853f61db63 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h @@ -77,7 +77,8 @@ void a6xx_gmu_clear_oob(struct a6xx_gmu *gmu, enum a6xx_gmu_oob_state state); int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node); void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu); -void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp); +void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp, + bool suspended); unsigned long a6xx_gmu_get_freq(struct msm_gpu *gpu); void a6xx_show(struct msm_gpu *gpu, struct msm_gpu_state *state, diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h index 143c56f5185b..a4d3abc4f3e1 100644 --- a/drivers/gpu/drm/msm/msm_gpu.h +++ b/drivers/gpu/drm/msm/msm_gpu.h @@ -63,11 +63,14 @@ struct msm_gpu_funcs { /* for generation specific debugfs: */ void (*debugfs_init)(struct msm_gpu *gpu, struct drm_minor *minor); #endif + /* note: gpu_busy() can assume that we have been pm_resumed */ u64 (*gpu_busy)(struct msm_gpu *gpu, unsigned long *out_sample_rate); struct msm_gpu_state *(*gpu_state_get)(struct msm_gpu *gpu); int (*gpu_state_put)(struct msm_gpu_state *state); unsigned long (*gpu_get_freq)(struct msm_gpu *gpu); - void (*gpu_set_freq)(struct msm_gpu *gpu, struct dev_pm_opp *opp); + /* note: gpu_set_freq() can assume that we have been pm_resumed */ + void (*gpu_set_freq)(struct msm_gpu *gpu, struct dev_pm_opp *opp, + bool suspended); struct msm_gem_address_space *(*create_address_space) (struct msm_gpu *gpu, struct platform_device *pdev); struct msm_gem_address_space *(*create_private_address_space) @@ -91,6 +94,9 @@ struct msm_gpu_devfreq { /** devfreq: devfreq instance */ struct devfreq *devfreq; + /** lock: lock for "suspended", "busy_cycles", and "time" */ + struct mutex lock; + /** * idle_constraint: * @@ -134,6 +140,9 @@ struct msm_gpu_devfreq { * elapsed */ struct msm_hrtimer_work boost_work; + + /** suspended: tracks if we're suspended */ + bool suspended; }; struct msm_gpu { diff --git a/drivers/gpu/drm/msm/msm_gpu_devfreq.c b/drivers/gpu/drm/msm/msm_gpu_devfreq.c index c7dbaa4b1926..01248d340124 100644 --- a/drivers/gpu/drm/msm/msm_gpu_devfreq.c +++ b/drivers/gpu/drm/msm/msm_gpu_devfreq.c @@ -20,6 +20,7 @@ static int msm_devfreq_target(struct device *dev, unsigned long *freq, u32 flags) { struct msm_gpu *gpu = dev_to_gpu(dev); + struct msm_gpu_devfreq *df = &gpu->devfreq; struct dev_pm_opp *opp; /* @@ -32,10 +33,13 @@ static int msm_devfreq_target(struct device *dev, unsigned long *freq, trace_msm_gpu_freq_change(dev_pm_opp_get_freq(opp)); - if (gpu->funcs->gpu_set_freq) - gpu->funcs->gpu_set_freq(gpu, opp); - else + if (gpu->funcs->gpu_set_freq) { + mutex_lock(&df->lock); + gpu->funcs->gpu_set_freq(gpu, opp, df->suspended); + mutex_unlock(&df->lock); + } else { clk_set_rate(gpu->core_clk, *freq); + } dev_pm_opp_put(opp); @@ -58,15 +62,24 @@ static void get_raw_dev_status(struct msm_gpu *gpu, unsigned long sample_rate; ktime_t time; + mutex_lock(&df->lock); + status->current_frequency = get_freq(gpu); - busy_cycles = gpu->funcs->gpu_busy(gpu, &sample_rate); time = ktime_get(); - - busy_time = busy_cycles - df->busy_cycles; status->total_time = ktime_us_delta(time, df->time); + df->time = time; + if (df->suspended) { + mutex_unlock(&df->lock); + status->busy_time = 0; + return; + } + + busy_cycles = gpu->funcs->gpu_busy(gpu, &sample_rate); + busy_time = busy_cycles - df->busy_cycles; df->busy_cycles = busy_cycles; - df->time = time; + + mutex_unlock(&df->lock); busy_time *= USEC_PER_SEC; do_div(busy_time, sample_rate); @@ -175,6 +188,8 @@ void msm_devfreq_init(struct msm_gpu *gpu) if (!gpu->funcs->gpu_busy) return; + mutex_init(&df->lock); + dev_pm_qos_add_request(&gpu->pdev->dev, &df->idle_freq, DEV_PM_QOS_MAX_FREQUENCY, PM_QOS_MAX_FREQUENCY_DEFAULT_VALUE); @@ -244,12 +259,16 @@ void msm_devfreq_cleanup(struct msm_gpu *gpu) void msm_devfreq_resume(struct msm_gpu *gpu) { struct msm_gpu_devfreq *df = &gpu->devfreq; + unsigned long sample_rate; if (!has_devfreq(gpu)) return; - df->busy_cycles = 0; + mutex_lock(&df->lock); + df->busy_cycles = gpu->funcs->gpu_busy(gpu, &sample_rate); df->time = ktime_get(); + df->suspended = false; + mutex_unlock(&df->lock); devfreq_resume_device(df->devfreq); } @@ -261,6 +280,10 @@ void msm_devfreq_suspend(struct msm_gpu *gpu) if (!has_devfreq(gpu)) return; + mutex_lock(&df->lock); + df->suspended = true; + mutex_unlock(&df->lock); + devfreq_suspend_device(df->devfreq); cancel_idle_work(df); -- 2.35.1