From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Sinan Kaya, Bjorn Helgaas
Subject: [PATCH 4.9 27/74] PCI: Wait up to 60 seconds for device to become ready after FLR
Date: Fri, 27 Apr 2018 15:58:17 +0200
Message-Id: <20180427135711.058892150@linuxfoundation.org>
In-Reply-To: <20180427135709.899303463@linuxfoundation.org>
References: <20180427135709.899303463@linuxfoundation.org>

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sinan Kaya

commit 821cdad5c46cae94ce65b9a98614c70a6ff021f8 upstream.

Sporadic reset issues have been observed with an Intel 750 NVMe drive while
assigning the physical function to the guest machine. The sequence of
events observed is as follows:

  - perform a Function Level Reset (FLR)
  - sleep up to 1000ms total
  - read ~0 from PCI_COMMAND (CRS completion for config read)
  - warn that the device didn't return from FLR
  - touch the device before it's ready
  - device drops config writes when we restore register settings (there's
    no mechanism for software to learn about CRS completions for writes)
  - incomplete register restore leaves device in inconsistent state
  - device probe fails because device is in inconsistent state

After reset, an endpoint may respond to config requests with Configuration
Request Retry Status (CRS) to indicate that it is not ready to accept new
requests. See PCIe r3.1, sec 2.3.1 and 6.6.2.

Increase the timeout value from 1 second to 60 seconds to cover the period
where device responds with CRS and also report polling progress.

Signed-off-by: Sinan Kaya
[bhelgaas: include the mandatory 100ms in the delays we print]
Signed-off-by: Bjorn Helgaas
Signed-off-by: Greg Kroah-Hartman

---
 drivers/pci/pci.c |   52 +++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 37 insertions(+), 15 deletions(-)

--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3756,27 +3756,49 @@ int pci_wait_for_pending_transaction(str
 }
 EXPORT_SYMBOL(pci_wait_for_pending_transaction);
 
-/*
- * We should only need to wait 100ms after FLR, but some devices take longer.
- * Wait for up to 1000ms for config space to return something other than -1.
- * Intel IGD requires this when an LCD panel is attached. We read the 2nd
- * dword because VFs don't implement the 1st dword.
- */
 static void pci_flr_wait(struct pci_dev *dev)
 {
-	int i = 0;
+	int delay = 1, timeout = 60000;
 	u32 id;
 
-	do {
-		msleep(100);
+	/*
+	 * Per PCIe r3.1, sec 6.6.2, a device must complete an FLR within
+	 * 100ms, but may silently discard requests while the FLR is in
+	 * progress. Wait 100ms before trying to access the device.
+	 */
+	msleep(100);
+
+	/*
+	 * After 100ms, the device should not silently discard config
+	 * requests, but it may still indicate that it needs more time by
+	 * responding to them with CRS completions. The Root Port will
+	 * generally synthesize ~0 data to complete the read (except when
+	 * CRS SV is enabled and the read was for the Vendor ID; in that
+	 * case it synthesizes 0x0001 data).
+	 *
+	 * Wait for the device to return a non-CRS completion. Read the
+	 * Command register instead of Vendor ID so we don't have to
+	 * contend with the CRS SV value.
+	 */
+	pci_read_config_dword(dev, PCI_COMMAND, &id);
+	while (id == ~0) {
+		if (delay > timeout) {
+			dev_warn(&dev->dev, "not ready %dms after FLR; giving up\n",
+				 100 + delay - 1);
+			return;
+		}
+
+		if (delay > 1000)
+			dev_info(&dev->dev, "not ready %dms after FLR; waiting\n",
+				 100 + delay - 1);
+
+		msleep(delay);
+		delay *= 2;
 		pci_read_config_dword(dev, PCI_COMMAND, &id);
-	} while (i++ < 10 && id == ~0);
+	}
 
-	if (id == ~0)
-		dev_warn(&dev->dev, "Failed to return from FLR\n");
-	else if (i > 1)
-		dev_info(&dev->dev, "Required additional %dms to return from FLR\n",
-			 (i - 1) * 100);
+	if (delay > 1000)
+		dev_info(&dev->dev, "ready %dms after FLR\n", 100 + delay - 1);
 }
 
 static int pcie_flr(struct pci_dev *dev, int probe)