From: Alex Deucher
Date: Wed, 29 Mar 2023 22:08:09 -0400
Subject: Re: [PATCH 1/2] drm/amdgpu: Reset GPU on S0ix when device supports BOCO
To: Kai-Heng Feng
Cc: alexander.deucher@amd.com, christian.koenig@amd.com, Xinhui.Pan@amd.com,
    Jingyu Wang, Andrey Grodzovsky, Lijo Lazar,
    dri-devel@lists.freedesktop.org, Michel Dänzer, YiPeng Chai,
    Mario Limonciello, Guchun Chen, "Rafael J. Wysocki",
    amd-gfx@lists.freedesktop.org, Jiansong Chen, Kenneth Feng, Tim Huang,
    Bokun Zhang, Hans de Goede, Maxime Ripard, Evan Quan,
    Somalapuram Amaranath, linux-kernel@vger.kernel.org, Hawking Zhang
References: <20230329095933.1203559-1-kai.heng.feng@canonical.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Mar 29, 2023 at 8:49 PM Kai-Heng Feng wrote:
>
> On Wed, Mar 29, 2023 at 9:21 PM Alex Deucher wrote:
> >
> > On Wed, Mar 29, 2023 at 6:00 AM Kai-Heng Feng wrote:
> > >
> > > When the power is lost due to ACPI power resources being turned off, the
> > > driver should reset the GPU so it can work anew.
> > >
> > > First, _PR3 support of the hierarchy needs to be found correctly. Since
> > > the GPU on some discrete GFX cards is behind a PCIe switch, checking the
> > > _PR3 on downstream port alone is not enough, as the _PR3 can associate
> > > to the root port above the PCIe switch.
> > >
> > > Once the _PR3 is found and BOCO support is correctly marked, use that
> > > information to inform the GPU should be reset. This solves an issue that
> > > system freeze on a Intel ADL desktop that uses S0ix for sleep and D3cold
> > > is supported for the GFX slot.
> >
> > I don't think we need to reset the GPU. If the power is turned off, a
> > reset shouldn't be necessary. The reset is only necessary when the
> > power is not turned off to put the GPU into a known good state. It
> > should be in that state already if the power is turn off. It sounds
> > like the device is not actually getting powered off.
>
> I had the impression that the GPU gets reset because S3 turned the
> power rail off.
>
> So the actual intention for GPU reset is because S3 doesn't guarantee
> the power is being turned off?

For S4, the reset in freeze is there because once the boot kernel
transitions to the hibernated kernel, we need the reset to bring the
GPU back to a known state. On dGPUs at least there are some engines
that can only be initialized once and then require a reset to be
initialized again. The one in suspend was originally there to deal
with aborted suspends where we'd need to reset the GPU for the same
reason as S4. However, it no longer really serves much purpose since
it got moved to noirq and it could probably be dropped.

Alex

>
> Kai-Heng
> >
> > Alex
> >
> > >
> > > Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default")
> > > Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885
> > > Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2458
> > > Signed-off-by: Kai-Heng Feng
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c   |  3 +++
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  7 ++++++-
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 12 +++++-------
> > >  3 files changed, 14 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > > index 60b1857f469e..407456ac0e84 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > > @@ -987,6 +987,9 @@ bool amdgpu_acpi_should_gpu_reset(struct amdgpu_device *adev)
> > >          if (amdgpu_sriov_vf(adev))
> > >                  return false;
> > >
> > > +        if (amdgpu_device_supports_boco(adev_to_drm(adev)))
> > > +                return true;
> > > +
> > >  #if IS_ENABLED(CONFIG_SUSPEND)
> > >          return pm_suspend_target_state != PM_SUSPEND_TO_IDLE;
> > >  #else
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > index f5658359ff5c..d56b7a2bafa6 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > @@ -2181,7 +2181,12 @@ static int amdgpu_device_ip_early_init(struct amdgpu_device *adev)
> > >
> > >          if (!(adev->flags & AMD_IS_APU)) {
> > >                  parent = pci_upstream_bridge(adev->pdev);
> > > -                adev->has_pr3 = parent ? pci_pr3_present(parent) : false;
> > > +                do {
> > > +                        if (pci_pr3_present(parent)) {
> > > +                                adev->has_pr3 = true;
> > > +                                break;
> > > +                        }
> > > +                } while ((parent = pci_upstream_bridge(parent)));
> > >          }
> > >
> > >          amdgpu_amdkfd_device_probe(adev);
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > > index ba5def374368..5d81fcac4b0a 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > > @@ -2415,10 +2415,11 @@ static int amdgpu_pmops_suspend(struct device *dev)
> > >          struct drm_device *drm_dev = dev_get_drvdata(dev);
> > >          struct amdgpu_device *adev = drm_to_adev(drm_dev);
> > >
> > > -        if (amdgpu_acpi_is_s0ix_active(adev))
> > > -                adev->in_s0ix = true;
> > > -        else if (amdgpu_acpi_is_s3_active(adev))
> > > +        if (amdgpu_acpi_is_s3_active(adev) ||
> > > +            amdgpu_device_supports_boco(drm_dev))
> > >                  adev->in_s3 = true;
> > > +        else if (amdgpu_acpi_is_s0ix_active(adev))
> > > +                adev->in_s0ix = true;
> > >          if (!adev->in_s0ix && !adev->in_s3)
> > >                  return 0;
> > >          return amdgpu_device_suspend(drm_dev, true);
> > > @@ -2449,10 +2450,7 @@ static int amdgpu_pmops_resume(struct device *dev)
> > >                  adev->no_hw_access = true;
> > >
> > >          r = amdgpu_device_resume(drm_dev, true);
> > > -        if (amdgpu_acpi_is_s0ix_active(adev))
> > > -                adev->in_s0ix = false;
> > > -        else
> > > -                adev->in_s3 = false;
> > > +        adev->in_s0ix = adev->in_s3 = false;
> > >          return r;
> > >  }
> > >
> > > --
> > > 2.34.1
> > >
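
For anyone following the _PR3 part of the thread, here is a minimal
userspace sketch of the "walk up the hierarchy" idea from the quoted
commit message: a _PR3 found on any bridge above the GPU counts, not only
the one on the immediate downstream port. `struct bridge`, its fields and
`hierarchy_has_pr3()` are hypothetical stand-ins made up for illustration;
they are not kernel API and this is not the code from the patch. The loop
tests the pointer before the first lookup, so a device with no upstream
bridge simply yields false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI bridge; NOT the kernel's struct pci_dev. */
struct bridge {
	struct bridge *parent;	/* upstream bridge, NULL above the root port */
	bool has_pr3;		/* assumed: firmware exposes _PR3 for this bridge */
};

/*
 * Return true if 'b' or any bridge above it reports _PR3.
 * A NULL 'b' (no upstream bridge at all) returns false.
 */
bool hierarchy_has_pr3(const struct bridge *b)
{
	for (; b != NULL; b = b->parent) {
		if (b->has_pr3)
			return true;
	}
	return false;
}
```

With a root port that has _PR3 and a two-level switch below it that does
not, calling hierarchy_has_pr3() on the switch's downstream port finds the
root port's _PR3 after two steps up.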