Received: by 2002:a05:7412:419a:b0:f3:1519:9f41 with SMTP id i26csp270649rdh; Thu, 23 Nov 2023 03:43:53 -0800 (PST) X-Google-Smtp-Source: AGHT+IHGBUwuh9GpWpyRxH8c1lOMYZDfKbZ0VrkCvPts+fbZADrBq5dms8Qe0yoIx1oVAT09a4JA X-Received: by 2002:a05:6870:ab05:b0:1e9:c59b:a9ad with SMTP id gu5-20020a056870ab0500b001e9c59ba9admr6274062oab.52.1700739832735; Thu, 23 Nov 2023 03:43:52 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1700739832; cv=none; d=google.com; s=arc-20160816; b=PBhFGbU279ZbQcqqYCm8L9Nvt3ATChqttvGSwH27DryY6HozYnpIY/4niZZsT/ZNGe Fi8ckB8MFchevC3AHQDC0ZCGw/RnkQAJ0zjWZrvYjWFoxhpqBcB8JhI9/NfKrpDGSke9 PS8dcd1ggHXzDveaLwiGjmXZQ39YKqvcWrPvRm8zu10MvRminTnOXFtoOkiz9XgnkzdU 6tGGyZuYJmxxWPcjEdstJT0zhPvmCbVoQPRMRaWzFh+FTwyXy8bQB4rsjvRnq8pJhXnW YyWhtEjApucyg8zZKImnmiFgCs1UvUp1rOS6KUHBLwokEqBh7teq3pw2QL6eIs15s5j5 pzwg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:references :cc:to:from:content-language:subject:user-agent:mime-version:date :message-id:dkim-signature; bh=Mx+cGmsC0wNLN1ZqBShO/A3fq1HCZnmyoNYPONJmgZs=; fh=OPF45KW6QJ0Y/GxQIcpDGA12c0BiQ9lg3qy+hHVwrk0=; b=vVUavS8+6aEqa7s7azNQgL9ES8cgf9UkE7Wo7uiBzAhnYSJ6g3onMsHy92mwdp5O8G +OCMWCOcRUR3/5mO3mEPYZFd600hKdOiq8sKbodHYbEybqJnVQTJ1rmO3gFTnOamMqgF v/BKHYdB2xEJxDExBwrKlyVpBaouBN3rrxZgYQ1uaboAqIc6lw6I0i/kEllOU7cmFdIQ YGlYEYZlkKLTmEMLpIXkxTyidkANopHYW9vON0XgAQ5CGGty4CPdV/V05fpTkkZzNCB5 o/THQhJH01jPllq1GNrzUeFy0aPOHe5PQfePgDiihb3PLgCLKHDEh7h2dbDzhXCyh5X/ RblA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@collabora.com header.s=mail header.b=AipKP99p; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.35 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=collabora.com Return-Path: Received: from groat.vger.email (groat.vger.email. [23.128.96.35]) by mx.google.com with ESMTPS id be14-20020a656e4e000000b005bdbd683601si1249920pgb.57.2023.11.23.03.43.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 23 Nov 2023 03:43:52 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.35 as permitted sender) client-ip=23.128.96.35; Authentication-Results: mx.google.com; dkim=pass header.i=@collabora.com header.s=mail header.b=AipKP99p; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.35 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=collabora.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by groat.vger.email (Postfix) with ESMTP id 807E483271FE; Thu, 23 Nov 2023 03:43:49 -0800 (PST) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.11 at groat.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345097AbjKWLnb (ORCPT + 99 others); Thu, 23 Nov 2023 06:43:31 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48818 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345070AbjKWLna (ORCPT ); Thu, 23 Nov 2023 06:43:30 -0500 Received: from madras.collabora.co.uk (madras.collabora.co.uk [IPv6:2a00:1098:0:82:1000:25:2eeb:e5ab]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CCCF198 for ; Thu, 23 Nov 2023 03:43:35 -0800 (PST) Received: from [100.107.97.3] (cola.collaboradmins.com [195.201.22.229]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: kholk11) by madras.collabora.co.uk (Postfix) with ESMTPSA id C56AA6607399; Thu, 23 Nov 2023 11:43:33 +0000 (GMT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1700739814; bh=U0wvK8MYhX15YyDgMNWd2SteQ9Ja2JnnJjiJ4hRV3bk=; h=Date:Subject:From:To:Cc:References:In-Reply-To:From; b=AipKP99pQsEMkeP/4tm/l4h3fKnhkt6Ac4x0k2sv+Ph2wx9kgN7jH/EiFonX1Gs5N 782B3G5ywoYoUHvFTS5WTeL7S+9WhRcqQqUyym1r4szIEbCPu7tWs3zIdxZVEzR4ZO 77o6Foa0+hfWXPM/FYWXQQ1jACJJxOVipYH/VKfhLQa18rzHIJ92frHN6cyfsplYND PuxemQSYTU1K9iMtITjyARdFl5Ut4T0oPlm96FZQKn3V/IanC03hD/uFYJp95gtugY 5VkWLp7fcuxvchp+UTXi7s87cmrV0PHWRMNUgssr9dfK8gV5C4SlqJnDvE851FAwu1 4Xoani12PvnrQ== Message-ID: Date: Thu, 23 Nov 2023 12:43:31 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH] drm/panfrost: Ignore core_mask for poweroff and sync interrupts Content-Language: en-US From: AngeloGioacchino Del Regno To: Boris Brezillon Cc: steven.price@arm.com, robh@kernel.org, maarten.lankhorst@linux.intel.com, mripard@kernel.org, tzimmermann@suse.de, airlied@gmail.com, daniel@ffwll.ch, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, krzysztof.kozlowski@linaro.org, kernel@collabora.com References: <20231123095320.41433-1-angelogioacchino.delregno@collabora.com> <20231123113530.46191ded@collabora.com> <1740797f-f3ae-4868-924a-08d6d731e506@collabora.com> In-Reply-To: <1740797f-f3ae-4868-924a-08d6d731e506@collabora.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on groat.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (groat.vger.email [0.0.0.0]); Thu, 23 Nov 2023 03:43:49 -0800 (PST) Il 23/11/23 12:15, AngeloGioacchino Del Regno ha scritto: > Il 23/11/23 11:35, Boris Brezillon ha scritto: >> On Thu, 23 Nov 2023 10:53:20 +0100 >> AngeloGioacchino Del Regno >> wrote: >> >>> Some SoCs may be equipped with a GPU containing two core groups >>> and this is exactly the case of Samsung's Exynos 5422 featuring >>> an ARM Mali-T628 MP6 GPU: the support for this GPU in Panfrost >>> is partial, as this driver currently supports using only one >>> core group and that's reflected on all parts of it, including >>> the power on (and power off, previously to this patch) function. >>> >>> The issue with this is that even though executing the soft reset >>> operation should power off all cores unconditionally, on at least >>> one platform we're seeing a crash that seems to be happening due >>> to an interrupt firing which may be because we are calling power >>> transition only on the first core group, leaving the second one >>> unchanged, or because ISR execution was pending before entering >>> the panfrost_gpu_power_off() function and executed after powering >>> off the GPU cores, or all of the above. >>> >>> Finally, solve this by changing the power off flow to >>>   1. Mask and clear all interrupts: we don't need nor want any, as >>>      we are polling PWRTRANS anyway; >>>   2. Call synchronize_irq() after that to make sure that any pending >>>      ISR is executed before powering off the GPU Shaders/Tilers/L2 >>>      hence avoiding unpowered registers R/W; and >>>   3. Ignore the core_mask and ask the GPU to poweroff both core groups >> >> Could we split that in two patches? 1+2 in one patch, and 3 in another. >> These are two orthogonal fixes IMO. >> > > My initial idea was exactly that, but I opted for one patch doing 'em all > because a "full fix" comprises all of 1+2+3: the third one without the > first two and vice-versa may not fully resolve the issue that was seen > on the HC1 board. > > So, while I agree that it'd be slightly more readable as a diff if those > were two different commits I do have reasons against splitting..... > >>> >>> Of course it was also necessary to add a `irq` variable to `struct >>> panfrost_device` as we need to get that in panfrost_gpu_power_off() >>> for calling synchronize_irq() on it. >>> >>> Fixes: 123b431f8a5c ("drm/panfrost: Really power off GPU cores in >>> panfrost_gpu_power_off()") >>> [Regression detected on Odroid HC1, Exynos 5422, Mali-T628 MP6] >>> Reported-by: Krzysztof Kozlowski >>> Signed-off-by: AngeloGioacchino Del Regno >>> --- >>>   drivers/gpu/drm/panfrost/panfrost_device.h |  1 + >>>   drivers/gpu/drm/panfrost/panfrost_gpu.c    | 26 +++++++++++++++------- >>>   2 files changed, 19 insertions(+), 8 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h >>> b/drivers/gpu/drm/panfrost/panfrost_device.h >>> index 0fc558db6bfd..b4feaa99e34f 100644 >>> --- a/drivers/gpu/drm/panfrost/panfrost_device.h >>> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h >>> @@ -94,6 +94,7 @@ struct panfrost_device { >>>       struct device *dev; >>>       struct drm_device *ddev; >>>       struct platform_device *pdev; >>> +    int irq; >> >> I know it's the only irq being stored at the panfrost_device level, but >> I think it's clearer if we explicitly prefix it with gpu_. >> > > Makes sense, agreed. > >>>       void __iomem *iomem; >>>       struct clk *clock; >>> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c >>> b/drivers/gpu/drm/panfrost/panfrost_gpu.c >>> index 1cc55fb9c45b..30b395125155 100644 >>> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c >>> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c >>> @@ -425,11 +425,21 @@ void panfrost_gpu_power_on(struct panfrost_device *pfdev) >>>   void panfrost_gpu_power_off(struct panfrost_device *pfdev) >>>   { >>> -    u64 core_mask = panfrost_get_core_mask(pfdev); >>>       int ret; >>>       u32 val; >>> -    gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present & >>> core_mask); >>> +    /* We are polling PWRTRANS and we don't need nor want interrupts */ >> >> I kinda agree with that, but that's not exactly why we're >> masking+syncing IRQs here. If that was the only reason, the right fix >> would be to modify GPU_IRQ_MASK_ALL so it doesn't include the PWRTRANS >> bits. >> >> This fix should cover more than just the power transition IRQ use case: >> we want all IRQs to be disabled before the device is suspended. >> > > Eh I should reword that, effectively, because what I wrote as comments make > sense (as in, the logic of it completes) only if you read both of them "as one". > > I'll do that in the new suspend irq helper :-) > >>> +    gpu_write(pfdev, GPU_INT_MASK, 0); >>> +    gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL); >>> + >>> +    /* >>> +     * Make sure that we don't have pending ISRs, otherwise we'll be >>> +     * reading and/or writing registers while the GPU is powered off >>> +     */ >>> +    synchronize_irq(pfdev->irq); >> >> Could we move that to a panfrost_gpu_suspend_irq() helper? I'm also not >> sure making it part of panfrost_gpu_power_off() is a good idea. I'd >> rather have this panfrost_gpu_suspend_irq() helper called from >> panfrost_device_[runtime_]suspend(), along with >> panfrost_{mmu,job}_suspend_irq(). >> > > Okay I will move that to a helper, but I still want to clarify: >  - For JOB, we're checking if panfrost_job_is_idle() before trying >    to runtime_suspend() (hence before trying to power off cores), >    so implicitly no interrupt can fire I guess? Though there could >    still be a pending ISR there too.... mmh. Brain ticking :-) >  - For MMU, we're not checking anything, but I guess that if there >    is no job, the mmu can't be doing anything at all? >    ...but then you also gave me the doubt about that one as well. > > What I think that would be sensible to do is to get this commit as > a "clear" fix for the "Really power off" one, then have one or more > additional commit(s) without any fixes tag to improve the IRQ suspend > with the new mmu/job irq suspend helpers. > > Of course *this* commit would introduce the panfrost_gpu_suspend_irq() > helper directly, instead of moving the logic to a helper in a later one. > > Any reason against? :-) > >>> + >>> +    /* Now it's safe to request poweroff for Shaders, Tilers and L2 */ >> >> It was safe before too, it's just that we probably don't want to be > > In theory yes, in practice the Odroid HC1 board crashed :-P > > P.S.: Joking! I understand what you're saying :-) > >> interrupted, if all we do is ignore the interrupts we receive, hence >> the suggestion to not use GPU_IRQ_MASK_ALL, and only enable the >> IRQs we care about instead. >> >>> +    gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present); >>>       ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO, >>>                        val, !val, 1, 1000); >>>       if (ret) >>> @@ -441,7 +451,7 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev) >>>       if (ret) >>>           dev_err(pfdev->dev, "tiler power transition timeout"); >>> -    gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present & core_mask); >>> +    gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present); >> >> I really think we should have a helper doing the 'write(PWROFF_{LO,HI} + >> poll(PWRTRANS_{LO,HI})' sequence, similar to what's done here [1][2], >> such that, once we got it right for one block, we have it working for >> all of them. And if there's a fix to apply, it automatically applies >> to all blocks without having to fix the same bug in each copy. >> > > ...technically yes, but practically this driver doesn't currently support any > GPU that even fills the _LO registers. > > I guess that we can do that later? > > That's just to (paranoidly) avoid any risk to introduce other regressions in > this merge window, since we're fixing one that shouldn't have happened in the > first place... > >> Note that these panthor_gpu_block_power_{on,off}() helpers also handle >> the case where power transitions are already in progress when you ask a >> new power transition, which I don't think is checked in >> panfrost_gpu_power_{off,on}(). >> > > I admit I didn't analyze the panthor driver - but here, the only power transitions > that may happen are either because of panfrost_gpu_power_on(), or because of > panfrost_gpu_power_off(), and both are polling and blocking... so from what I > understand, there's no possibility to have "another" power transition happening > while calling poweron, or poweroff. > > That would change if we start to selectively turn on and off a number of shaders > and/or a number of tilers (not all of them) depending on the workload, but we're > not doing that... > > ...yet? > > :-) > >>>       ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_LO, >>>                        val, !val, 0, 1000); >> >> Not introduced by the patch, but I noticed args passed on the second >> line are no longer aligned on the open parens. >> > > Yeah, fixing that for v2 :-) > Sorry for the double reply - I just noticed that you're seeing this misalignment because I had a *local* commit introducing that but, on linux-next, this is not present, so there's no misalignment to fix........ :-) Cheers, Angelo