Received: by 2002:a05:6a10:eb17:0:0:0:0 with SMTP id hx23csp115478pxb; Tue, 7 Sep 2021 19:22:49 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzkAW1sHFqa3D89qdogQgv1Llx/afybd1pqs2G2J7YBML9gjBkvUwYSfLHb8NXCBvQP4AIX X-Received: by 2002:a17:906:4d53:: with SMTP id b19mr1496420ejv.180.1631067769240; Tue, 07 Sep 2021 19:22:49 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1631067769; cv=none; d=google.com; s=arc-20160816; b=j+IJ6lVcKtV+TX1PJfMRn2Ak+Q7qzsUozzzy5W53cvfuzM6IVC02Ry/hlVNpubFTw1 OMsH8IEUjTnLyZB2a4s3PfKg6sFeJGUK3+Dn0K8HRwftffR2NJ5cVIs25aXgSDqaHuxC 4A3BzGYWU5RR+lso0UcVgLyrnFq0vJEMg7cVI7p+nH14cDvmRJJMR5ZeQYbofhEmdffO s3re7D1USNC8yKbd8qnCbPXLudW4Ih7z5BDmfh9FdbJL6usb9mN+T2Faet9QAjeHOhWY ZSoIe8uOfEgZHpvyKyY4vlhJsP7soYvc3AZBvmltUil5Y5bTyMFJxRYppI/uqGrqWJmW LZgw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=d7N2sG5KH7lGAgEggC9zCBdx01DXAMxZT+wV9lrlBZo=; b=ZW0B+NRQpby0jowgm1jbu8mUZRrANwyzmg9Slrqijy1wvlM0CtwU1RZV9TwGPLj6xf xfTdX7YSGnNpFLiR9jaGu9j+z36mLfJC6LToE7VjRfvMsZFbTDa/vq9Cz95yQRqn2Y0V ZNvDTQiUo3Zf0gznE83cbDDlVm3Lh/Bs3QdvmoSnBkf+O4lC5KZ2O/gNl7mxouSp01X5 TJ54MiDvjEHQ3gRbwbSoKOvku+ZwIMERiAyJ3DQfNgnQlJlE6VX4L+pbvjskedZcCOKp HnJXhjt+R2ni9o76yJ8K0vtw8iqskCSSAN6X1UweK/8J/391LX2VpXeGt+3KNT7T6+o5 /twQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=d1E5oZrp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id h13si856113edv.45.2021.09.07.19.22.22; Tue, 07 Sep 2021 19:22:49 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=d1E5oZrp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1347137AbhIHCVe (ORCPT + 99 others); Tue, 7 Sep 2021 22:21:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35780 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235450AbhIHCVd (ORCPT ); Tue, 7 Sep 2021 22:21:33 -0400 Received: from mail-oo1-xc2f.google.com (mail-oo1-xc2f.google.com [IPv6:2607:f8b0:4864:20::c2f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2B4C7C061575 for ; Tue, 7 Sep 2021 19:20:26 -0700 (PDT) Received: by mail-oo1-xc2f.google.com with SMTP id m11-20020a056820034b00b0028bb60b551fso280141ooe.5 for ; Tue, 07 Sep 2021 19:20:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to; bh=d7N2sG5KH7lGAgEggC9zCBdx01DXAMxZT+wV9lrlBZo=; b=d1E5oZrpcIsuWWUnDrUU47njxjPHb/BblH3zYvHGr+MCBfMhW4jULBFqddvrjoGij4 Esn56yeCeCihAXL3SPlDeop4KwnLgw2Zth1cO7tEg6EEGYPT84FDGoOsdf9wHBXly+xK n7z7rJ5rZm8UPwA0Tzz7Z4X84paxUv5bcSBgUKCRUWG6XD+nPkOQ4wUxx6Qg6551i1so 535pxB4UK+7PUCkjW1dAkhL30cKrtgRPpGi97tbJCwuBw2YfazhVrdRKb0Ebqdx93rPR tWy8F7nj55XsSYGCz6R6F1sFVet2HvE4zEBNZbceFbgcsVHvZ84nzucACI8hCFsj98sy G5kw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to; bh=d7N2sG5KH7lGAgEggC9zCBdx01DXAMxZT+wV9lrlBZo=; b=CL4Y/lrwgKMYCzsztwl71gZ4BoDm+tkkJC+vuAy9Dpe31nWqFQ1X/ddX1kwONvGACb 0fuaWrsLawzV1jJjIwzoYTLtUhPnXAyhYgfcqM9Qs+fq5FnwoR6+LODUC0ZcmrmEij8y uQjLb4Day2zHQMQc6oJkKriwQbzXFEqZ8aHpX39gV3FByHgbIVXSd7yC6bCxV8gx7dsn 6pANhF609yuvkZTvvMO1viov9VZv45+pQAsO+aSY871zYdmwOrzEObLFnI1fEi+YuQJT BuPysQQVWjSJu5ux1hmZqrJ1VW5HyeA6NAyG/VOptuJLlsjPG8BC6O/eJPnr9SuEnrzX RAmQ== X-Gm-Message-State: AOAM5328Udrgoef6IG2A/G5meiA67IB19I/iQw/KIK5XkvvvPLZ5UXp3 NcjcDUcwqB6xqqffbYdEeQDzFA== X-Received: by 2002:a4a:de90:: with SMTP id v16mr1072428oou.42.1631067624003; Tue, 07 Sep 2021 19:20:24 -0700 (PDT) Received: from ripper (104-57-184-186.lightspeed.austtx.sbcglobal.net. [104.57.184.186]) by smtp.gmail.com with ESMTPSA id d10sm159139ooj.24.2021.09.07.19.20.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 07 Sep 2021 19:20:23 -0700 (PDT) Date: Tue, 7 Sep 2021 19:21:22 -0700 From: Bjorn Andersson To: Akhil P Oommen Cc: Caleb Connolly , Rob Clark , dri-devel , freedreno , linux-arm-msm , Rob Clark , Sean Paul , David Airlie , Daniel Vetter , Jordan Crouse , Jonathan Marek , Sai Prakash Ranjan , Sharat Masetty , open list , Stephen Boyd Subject: Re: [PATCH] drm/msm: Disable frequency clamping on a630 Message-ID: References: <8aa590be-6a9f-9343-e897-18e86ea48202@linaro.org> <6eefedb2-9e59-56d2-7703-2faf6cb0ca3a@codeaurora.org> <83ecbe74-caf0-6c42-e6f5-4887b3b534c6@linaro.org> <53d3e5b7-9dc0-a806-70e9-b9b5ff877462@codeaurora.org> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <53d3e5b7-9dc0-a806-70e9-b9b5ff877462@codeaurora.org> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon 09 Aug 10:26 PDT 2021, Akhil P Oommen wrote: > On 8/9/2021 9:48 PM, Caleb Connolly wrote: > > > > > > On 09/08/2021 17:12, Rob Clark wrote: > > > On Mon, Aug 9, 2021 at 7:52 AM Akhil P Oommen > > > wrote: [..] > > > > I am a bit confused. We don't define a power domain for gpu in dt, > > > > correct? Then what exactly set_opp do here? Do you think this usleep is > > > > what is helping here somehow to mask the issue? > > The power domains (for cx and gx) are defined in the GMU DT, the OPPs in > > the GPU DT. For the sake of simplicity I'll refer to the lowest > > frequency (257000000) and OPP level (RPMH_REGULATOR_LEVEL_LOW_SVS) as > > the "min" state, and the highest frequency (710000000) and OPP level > > (RPMH_REGULATOR_LEVEL_TURBO_L1) as the "max" state. These are defined in > > sdm845.dtsi under the gpu node. > > > > The new devfreq behaviour unmasks what I think is a driver bug, it > > inadvertently puts much more strain on the GPU regulators than they > > usually get. With the new behaviour the GPU jumps from it's min state to > > the max state and back again extremely rapidly under workloads as small > > as refreshing UI. Where previously the GPU would rarely if ever go above > > 342MHz when interacting with the device, it now jumps between min and > > max many times per second. > > > > If my understanding is correct, the current implementation of the GMU > > set freq is the following: > > ?- Get OPP for frequency to set > > ?- Push the frequency to the GMU - immediately updating the core clock > > ?- Call dev_pm_opp_set_opp() which triggers a notify chain, this winds > > up somewhere in power management code and causes the gx regulator level > > to be updated > > Nope. dev_pm_opp_set_opp() sets the bandwidth for gpu and nothing else. We > were using a different api earlier which got deprecated - > dev_pm_opp_set_bw(). > On the Lenovo Yoga C630 this is reproduced by starting alacritty and if I'm lucky I managed to hit a few keys before it crashes, so I spent a few hours looking into this as well... As you say, the dev_pm_opp_set_opp() will only cast a interconnect vote. The opp-level is just there for show and isn't used by anything, at least not on 845. Further more, I'm missing something in my tree, so the interconnect doesn't hit sync_state, and as such we're not actually scaling the buses. So the problem is not that Linux doesn't turn on the buses in time. So I suspect that the "AHB bus error" isn't saying that we turned off the bus, but rather that the GPU becomes unstable or something of that sort. Lastly, I reverted 9bc95570175a ("drm/msm: Devfreq tuning") and ran Aquarium for 20 minutes without a problem. I then switched the gpu devfreq governor to "userspace" and ran the following: while true; do echo 257000000 > /sys/class/devfreq/5000000.gpu/userspace/set_freq echo 710000000 > /sys/class/devfreq/5000000.gpu/userspace/set_freq done It took 19 iterations of this loop to crash the GPU. So the problem doesn't seem to be Rob's change, it's just that prior to it the chance to hitting it is way lower. Question is still what it is that we're triggering. Regards, Bjorn