Date: Sun, 2 Oct 2022 19:56:48 +0200
From: Pali Rohár
To: Nathan Rossi
Cc: linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	Nathan Rossi, Bjorn Helgaas
Subject: Re: [PATCH] PCI/ASPM: Wait for data link active after retraining
Message-ID: <20221002175648.jzxcvka46vylbs2d@pali>
In-Reply-To: <20220602065544.2552771-1-nathan@nathanrossi.com>

Hello!

On Thursday 02 June 2022 06:55:44 Nathan Rossi wrote:
> From: Nathan Rossi
>
> When retraining the link either the child or the parent device may have
> the data link layer state machine of the respective devices move out of
> the active state despite the physical link training being completed.
> Depending on how long is takes for the devices to return to the active
> state, the device may not be ready and any further reads/writes to the
> device can fail.
>
> This issue is present with the pci-mvebu controller paired with a device
> supporting ASPM but without advertising the Slot Clock, where during
> boot the pcie_aspm_cap_init call would cause common clocks to be made
> consistent and then retrain the link. However the data link layer would
> not be active before any device initialization (e.g. ASPM capability
> queries, BAR configuration) causing improper configuration of the device
> without error.

There is a known issue in Marvell PCIe controllers. They completely drop
the link for PCIe Gen1 cards when Target Link Speed (Link Control 2) in
the Root Port is configured to 5.0 GT/s or higher and the OS issues
Retrain Link (Link Control).

I think the proper way is to work around the root of this issue by
programming Target Link Speed in the Link Control 2 register to the
required value, instead of hacking a couple of other places which are
just consequences of that issue...

I can reproduce it, for example, with Qualcomm Atheros ath9k/ath10k WiFi
cards, which have another issue: they go into a "broken" state when an
in-band reset (e.g. PCIe hot reset or PCIe link down) is issued multiple
times without a longer delay.

Together, these two bugs (the first in the Marvell PCIe controller, the
second in the WiFi card) mean that when the kernel configures ASPM, the
card disappears from the bus until a CPU/board reset (or a PCIe warm
reset, if the board supports it at runtime without going to POR).

I guess you are just observing the result of this issue here.

> To ensure the child device is accessible, after the link retraining use
> pcie_wait_for_link to perform the associated state checks and any needed
> delays.
>
> Signed-off-by: Nathan Rossi
> ---
>  drivers/pci/pcie/aspm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index a96b7424c9..4b8a1810be 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -288,7 +288,8 @@ static void pcie_aspm_configure_common_clock(struct pcie_link_state *link)
>  		reg16 &= ~PCI_EXP_LNKCTL_CCC;
>  	pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
>
> -	if (pcie_retrain_link(link))
> +	/* Retrain link and then wait for the link to become active */
> +	if (pcie_retrain_link(link) && pcie_wait_for_link(parent, true))
>  		return;
>
>  	/* Training failed. Restore common clock configurations */
> ---
> 2.36.1
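
To illustrate the Target Link Speed idea from the reply, here is a rough
userspace sketch of the register arithmetic only. It is not a tested
kernel patch. The helper name lnkctl2_clamp_tls() is hypothetical, and
taking "the required value" to be the Current Link Speed reported in
Link Status is an assumption made for this example; the email does not
say how the required speed should be determined. The field masks follow
the PCIe Link Status and Link Control 2 register layouts (the same
values the kernel exposes in pci_regs.h).

```c
#include <stdint.h>

/* Field masks and encodings from the PCIe register layout. */
#define PCI_EXP_LNKSTA_CLS        0x000f /* Link Status: Current Link Speed */
#define PCI_EXP_LNKCTL2_TLS       0x000f /* Link Control 2: Target Link Speed */
#define PCI_EXP_LNKCTL2_TLS_2_5GT 0x0001 /* 2.5 GT/s (Gen1) */
#define PCI_EXP_LNKCTL2_TLS_5_0GT 0x0002 /* 5.0 GT/s (Gen2) */

/*
 * Hypothetical helper, not an existing kernel function.
 *
 * Given the current Link Control 2 and Link Status values of the Root
 * Port, return the Link Control 2 value that should be written before
 * setting Retrain Link. ASSUMPTION: the "required" Target Link Speed is
 * the speed the link is currently running at, so the target is only
 * ever lowered, never raised.
 */
static uint16_t lnkctl2_clamp_tls(uint16_t lnkctl2, uint16_t lnksta)
{
	uint16_t cur = lnksta & PCI_EXP_LNKSTA_CLS;
	uint16_t tls = lnkctl2 & PCI_EXP_LNKCTL2_TLS;

	/* Link speed unknown, or target already at or below it: no change. */
	if (cur == 0 || tls <= cur)
		return lnkctl2;

	/* Replace only the Target Link Speed field, keep all other bits. */
	return (uint16_t)((lnkctl2 & ~PCI_EXP_LNKCTL2_TLS) | cur);
}
```

In a driver, the two inputs would come from pcie_capability_read_word()
on the Root Port and the result would be written back before the retrain
is requested. With a Gen1 card behind a port whose target is 5.0 GT/s,
the helper lowers the target to 2.5 GT/s and leaves the remaining
Link Control 2 bits untouched.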