Message-ID: <294555b4-2d1b-270f-6682-3a17e9df133c@molgen.mpg.de>
Date: Thu, 21 Apr 2022 07:35:58 +0200
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.8.1
Subject: Re: [PATCHv4] drm/amdgpu: disable ASPM on Intel Alder Lake based systems
Content-Language:
 en-US
To: Richard Gong
Cc: Dave Airlie, Xinhui Pan, LKML, amd-gfx@lists.freedesktop.org, Alexander Deucher, dri-devel@lists.freedesktop.org, Daniel Vetter, Alex Deucher, Christian König, Mario Limonciello
References: <20220412215000.897344-1-richard.gong@amd.com> <91e916e3-d793-b814-6cbf-abee0667f5f8@molgen.mpg.de> <94fd858d-1792-9c05-b5c6-1b028427687d@amd.com> <237da02b-0ed8-6b1c-3eaf-5574aab4f13f@amd.com>
From: Paul Menzel
In-Reply-To: <237da02b-0ed8-6b1c-3eaf-5574aab4f13f@amd.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Dear Richard,


On 21.04.22 at 03:12, Gong, Richard wrote:

> On 4/20/2022 3:29 PM, Paul Menzel wrote:
>> On 19.04.22 at 23:46, Gong, Richard wrote:
>>
>>> On 4/14/2022 2:52 AM, Paul Menzel wrote:
>>>> [Cc: -kernel test robot ]
>>
>> […]
>>
>>>> On 13.04.22 at 15:00, Alex Deucher wrote:
>>>>> On Wed, Apr 13, 2022 at 3:43 AM Paul Menzel wrote:
>>>>
>>>>>> Thank you for sending out v4.
>>>>>>
>>>>>> On 12.04.22 at 23:50, Richard Gong wrote:
>>>>>>> Active State Power Management (ASPM) feature is enabled since
>>>>>>> kernel 5.14.
>>>>>>> There are some AMD GFX cards (such as WX3200 and RX640) that
>>>>>>> won't work
>>>>>>> with ASPM-enabled Intel Alder Lake based systems. Using these GFX
>>>>>>> cards as
>>>>>>> video/display output, Intel Alder Lake based systems will hang
>>>>>>> during
>>>>>>> suspend/resume.
>>
>> [Your email program wraps lines in cited text for some reason, making
>> the citation harder to read.]
>>
> Not sure why, I am using Mozilla Thunderbird for email. I am not using MS
> Outlook for upstream email.

Strange. I have no idea whether there were bugs in Mozilla Thunderbird 91.2.0, which was released over half a year ago. The current version is 91.8.1. [1]

>>>>>> I am still not clear, what “hang during suspend/resume” means. I
>>>>>> guess
>>>>>> suspending works fine? During resume (S3 or S0ix?), where does it
>>>>>> hang?
>>>>>> The system is functional, but there are only display problems?
>>> System freeze after suspend/resume.
>>
>> But you see certain messages still? At what point does it freeze
>> exactly? In the bug report you posted Linux messages.
>
> No, the system freeze then users have to recycle power to recover.

Then I misread the issue? Did you capture the messages over a serial log then?

>>>>>>> The issue was initially reported on one system (Dell Precision
>>>>>>> 3660 with
>>>>>>> BIOS version 0.14.81), but was later confirmed to affect at least
>>>>>>> 4 Alder
>>>>>>> Lake based systems.
>>>>>>>
>>>>>>> Add extra check to disable ASPM on Intel Alder Lake based systems.
>>>>>>>
>>>>>>> Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default")
>>>>>>> Link:
>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fdrm%2Famd%2F-%2Fissues%2F1885&data=05%7C01%7Crichard.gong%40amd.com%7Cce01de048c61456174ff08da230c750d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637860833680922036%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=vqhh3dTc%2FgBt7GrP9hKppWlrFy2F7DaivkNEuGekl0g%3D&reserved=0
>>>>>>>
>>
>> Thank you Microsoft Outlook for keeping us safe. :(
> I am not using MS Outlook for the email exchanges.

I guess it is not the client but the Microsoft email service (Exchange? No idea.) that adds these protection links. (That makes it even harder for users to actually verify the domain. No idea who comes up with these ideas, or why customers actually accept them.)

>>>>>>>
>>>>>>> Reported-by: kernel test robot
>>>>>>
>>>>>> This tag is a little confusing.
>>>>>> Maybe clarify that it was for an
>>>>>> issue
>>>>>> in a previous patch iteration?
>>>
>>> I did describe in change-list version 3 below, which corrected the
>>> build error with W=1 option.
>>>
>>> It is not good idea to add the description for that to the commit
>>> message, this is why I add descriptions on change-list version 3.
>>
>> Do as you wish, but the current style is confusing, and readers of the
>> commit are going to think, the kernel test robot reported the problem
>> with AMD VI ASICs and Intel Alder Lake systems.
>>
>>>>>>
>>>>>>> Signed-off-by: Richard Gong
>>>>>>> ---
>>>>>>> v4: s/CONFIG_X86_64/CONFIG_X86
>>>>>>>       enhanced check logic
>>>>>>> v3: s/intel_core_asom_chk/aspm_support_quirk_check
>>>>>>>       correct build error with W=1 option
>>>>>>> v2: correct commit description
>>>>>>>       move the check from chip family to problematic platform
>>>>>>> ---
>>>>>>>    drivers/gpu/drm/amd/amdgpu/vi.c | 17 ++++++++++++++++-
>>>>>>>    1 file changed, 16 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/vi.c
>>>>>>> index 039b90cdc3bc..b33e0a9bee65 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/vi.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/vi.c
>>>>>>> @@ -81,6 +81,10 @@
>>>>>>>    #include "mxgpu_vi.h"
>>>>>>>    #include "amdgpu_dm.h"
>>>>>>>
>>>>>>> +#if IS_ENABLED(CONFIG_X86)
>>>>>>> +#include
>>>>>>> +#endif
>>>>>>> +
>>>>>>>    #define ixPCIE_LC_L1_PM_SUBSTATE    0x100100C6
>>>>>>>    #define
>>>>>>> PCIE_LC_L1_PM_SUBSTATE__LC_L1_SUBSTATES_OVERRIDE_EN_MASK 0x00000001L
>>>>>>>    #define PCIE_LC_L1_PM_SUBSTATE__LC_PCI_PM_L1_2_OVERRIDE_MASK
>>>>>>> 0x00000002L
>>>>>>> @@ -1134,13 +1138,24 @@ static void vi_enable_aspm(struct
>>>>>>> amdgpu_device *adev)
>>>>>>>                WREG32_PCIE(ixPCIE_LC_CNTL, data);
>>>>>>>    }
>>>>>>>
>>>>>>> +static bool aspm_support_quirk_check(void)
>>>>>>> +{
>>>>>>> +     if (IS_ENABLED(CONFIG_X86)) {
>>>>>>> +             struct cpuinfo_x86 *c = &cpu_data(0);
>>>>>>> +
>>>>>>> +             return !(c->x86 == 6 && c->x86_model ==
>>>>>>> INTEL_FAM6_ALDERLAKE);
>>>>>>> +     }
>>>>>>> +
>>>>>>> +     return true;
>>>>>>> +}
>>>>>>> +
>>>>>>>    static void vi_program_aspm(struct amdgpu_device *adev)
>>>>>>>    {
>>>>>>>        u32 data, data1, orig;
>>>>>>>        bool bL1SS = false;
>>>>>>>        bool bClkReqSupport = true;
>>>>>>>
>>>>>>> -     if (!amdgpu_device_should_use_aspm(adev))
>>>>>>> +     if (!amdgpu_device_should_use_aspm(adev) ||
>>>>>>> !aspm_support_quirk_check())
>>>>>>>                return;
>>>>>>
>>>>>> Can users still forcefully enable ASPM with the parameter
>>>>>> `amdgpu.aspm`?
>>>>>>
>>> As Mario mentioned in a separate reply, we can't forcefully enable
>>> ASPM with the parameter 'amdgpu.aspm'.
>>
>> That would be a regression on systems where ASPM used to work. Hmm. I
>> guess, you could say, there are no such systems.
>>
>>>>>>>
>>>>>>>        if (adev->flags & AMD_IS_APU ||
>>>>>>
>>>>>> If I remember correctly, there were also newer cards, where ASPM
>>>>>> worked
>>>>>> with Intel Alder Lake, right? Can only the problematic generations
>>>>>> for
>>>>>> WX3200 and RX640 be excluded from ASPM?
>>>>>
>>>>> This patch only disables it for the generation that was problematic.
>>>>
>>>> Could that please be made clear in the commit message summary, and
>>>> message?
>>>
>>> Are you ok with the commit messages below?
>>
>> Please change the commit message summary. Maybe:
>>
>> drm/amdgpu: VI: Disable ASPM on Intel Alder Lake based systems
>>
>>> Active State Power Management (ASPM) feature is enabled since kernel
>>> 5.14.
>>>
>>> There are some AMD GFX cards (such as WX3200 and RX640) that won't work
>>> with ASPM-enabled Intel Alder Lake based systems. Using these GFX
>>> cards as
>>> video/display output, Intel Alder Lake based systems will freeze after
>>> suspend/resume.
>>
>> Something like:
>>
>> On Intel Alder Lake based systems using ASPM with AMD GFX Volcanic
>> Islands (VI) cards, like WX3200 and RX640, graphics don’t initialize
>> when resuming from S0ix(?).
>>
>>
>>> The issue was initially reported on one system (Dell Precision 3660 with
>>> BIOS version 0.14.81), but was later confirmed to affect at least 4
>>> Alder
>>> Lake based systems.
>>
>> Which ones?
> those are pre-production Alder Lake based OEM systems

Just write that then: at least four pre-production Alder Lake based systems.

>>> Add extra check to disable ASPM on Intel Alder Lake based systems with
>>> problematic generation GFX cards.
>>
>> … with the problematic Volcanic Islands GFX cards.
>>
>>>>
>>>> Loosely related, is there a public (or internal issue) to analyze
>>>> how to get ASPM working for VI generation devices with Intel Alder
>>>> Lake?
>>>
>>> As Alex mentioned, we need support from Intel. We don't have any
>>> update on that.
>>
>> It’d be great to get that fixed properly.
>>
>> Last thing, please don’t hate me, does Linux log, that ASPM is disabled?


Kind regards,

Paul


[1]: https://www.thunderbird.net/en-US/thunderbird/releases/