Subject: Re: [PATCH] drm/msm: Disable frequency clamping on a630
To: Bjorn Andersson, Akhil P Oommen
Cc: Rob Clark, dri-devel, freedreno, linux-arm-msm, Rob Clark, Sean Paul,
 David Airlie, Daniel Vetter, Jordan Crouse, Jonathan Marek,
 Sai Prakash Ranjan, Sharat Masetty, open list, Stephen Boyd
References: <8aa590be-6a9f-9343-e897-18e86ea48202@linaro.org>
 <6eefedb2-9e59-56d2-7703-2faf6cb0ca3a@codeaurora.org>
 <83ecbe74-caf0-6c42-e6f5-4887b3b534c6@linaro.org>
 <53d3e5b7-9dc0-a806-70e9-b9b5ff877462@codeaurora.org>
From: Caleb Connolly
Message-ID: <076cb487-7929-2af4-b2c0-385cd731ee48@linaro.org>
Date: Wed, 8 Sep 2021 14:49:31 +0100
X-Mailing-List: linux-kernel@vger.kernel.org

On 08/09/2021 03:21, Bjorn Andersson wrote:
> On Mon 09 Aug 10:26 PDT 2021, Akhil P Oommen wrote:
>
>> On 8/9/2021 9:48 PM, Caleb Connolly wrote:
>>>
>>>
>>> On 09/08/2021 17:12, Rob Clark wrote:
>>>> On Mon, Aug 9, 2021 at 7:52 AM Akhil P Oommen
>>>> wrote:
> [..]
>>>>> I am a bit confused. We don't define a power domain for gpu in dt,
>>>>> correct? Then what exactly does set_opp do here? Do you think this
>>>>> usleep is what is helping here somehow to mask the issue?
>>> The power domains (for cx and gx) are defined in the GMU DT, the OPPs in
>>> the GPU DT. For the sake of simplicity I'll refer to the lowest
>>> frequency (257000000) and OPP level (RPMH_REGULATOR_LEVEL_LOW_SVS) as
>>> the "min" state, and the highest frequency (710000000) and OPP level
>>> (RPMH_REGULATOR_LEVEL_TURBO_L1) as the "max" state. These are defined in
>>> sdm845.dtsi under the gpu node.
>>>
>>> The new devfreq behaviour unmasks what I think is a driver bug: it
>>> inadvertently puts much more strain on the GPU regulators than they
>>> usually get. With the new behaviour the GPU jumps from its min state to
>>> the max state and back again extremely rapidly under workloads as small
>>> as refreshing UI. Where previously the GPU would rarely if ever go above
>>> 342MHz when interacting with the device, it now jumps between min and
>>> max many times per second.
>>>
>>> If my understanding is correct, the current implementation of the GMU
>>> set freq is the following:
>>>  - Get OPP for frequency to set
>>>  - Push the frequency to the GMU - immediately updating the core clock
>>>  - Call dev_pm_opp_set_opp() which triggers a notify chain, this winds
>>> up somewhere in power management code and causes the gx regulator level
>>> to be updated
>>
>> Nope. dev_pm_opp_set_opp() sets the bandwidth for gpu and nothing else. We
>> were using a different API earlier which got deprecated -
>> dev_pm_opp_set_bw().
>>
>
> On the Lenovo Yoga C630 this is reproduced by starting alacritty and if
> I'm lucky I manage to hit a few keys before it crashes, so I spent a
> few hours looking into this as well...
>
> As you say, the dev_pm_opp_set_opp() will only cast an interconnect vote.
> The opp-level is just there for show and isn't used by anything, at
> least not on 845.
>
> Furthermore, I'm missing something in my tree, so the interconnect
> doesn't hit sync_state, and as such we're not actually scaling the
> buses. So the problem is not that Linux doesn't turn on the buses in
> time.
>
> So I suspect that the "AHB bus error" isn't saying that we turned off
> the bus, but rather that the GPU becomes unstable or something of that
> sort.
>
>
> Lastly, I reverted 9bc95570175a ("drm/msm: Devfreq tuning") and ran
> Aquarium for 20 minutes without a problem. I then switched the gpu
> devfreq governor to "userspace" and ran the following:
>
> while true; do
>   echo 257000000 > /sys/class/devfreq/5000000.gpu/userspace/set_freq
>   echo 710000000 > /sys/class/devfreq/5000000.gpu/userspace/set_freq
> done
>
> It took 19 iterations of this loop to crash the GPU.
>
> So the problem doesn't seem to be Rob's change, it's just that prior to
> it the chance of hitting it is way lower. Question is still what it is
> that we're triggering.

Do the opp-levels in DTS represent how the hardware behaves? If so, it
looks like whatever is responsible for scaling the GX rail voltage has
no time limits and will try to switch the regulator between min and max
voltage as often as we tell it to, which is probably not something the
hardware expects.

>
> Regards,
> Bjorn
>

-- 
Kind Regards,
Caleb (they/them)
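
For anyone who wants to check whether the switching rate matters, here is
a rough sketch of the quoted loop with a bounded iteration count and an
optional dwell time after each write. Only the sysfs path and the two
frequencies come from the thread; the toggle_freq name, the SET_FREQ
override and the dwell argument are made up for illustration. It assumes
the devfreq governor has already been switched to "userspace", as in the
quoted test, and it has not been run on real hardware.

```shell
#!/bin/sh
# Sketch only: bounded version of the min/max toggle loop from the thread.
# toggle_freq, SET_FREQ and the dwell argument are hypothetical additions.

# sysfs path taken from the quoted loop; override SET_FREQ to point elsewhere.
SET_FREQ="${SET_FREQ:-/sys/class/devfreq/5000000.gpu/userspace/set_freq}"
MIN_FREQ=257000000   # "min" state in the thread
MAX_FREQ=710000000   # "max" state in the thread

# toggle_freq COUNT [DWELL_SECONDS]
# Writes min then max COUNT times, sleeping DWELL_SECONDS after each write.
# The iteration number is printed before each pass, so the last line of
# output shows how far the loop got before a crash or write failure.
toggle_freq() {
    count="$1"
    dwell="${2:-0}"
    i=1
    while [ "$i" -le "$count" ]; do
        echo "iteration $i"
        echo "$MIN_FREQ" > "$SET_FREQ" || return 1
        sleep "$dwell"
        echo "$MAX_FREQ" > "$SET_FREQ" || return 1
        sleep "$dwell"
        i=$((i + 1))
    done
}
```

Sourcing the file and running "toggle_freq 100" should behave like the
original loop, limited to 100 passes. "toggle_freq 100 1" holds each
frequency for a second; if the crash goes away with a dwell time, that
would point towards the switching rate rather than the frequencies
themselves. Fractional dwell values depend on the local sleep
implementation.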