From: Srinath Mannam
Date: Tue, 17 Apr 2018 14:33:52 +0530
Subject: Re: Issue with Enable LTR while pcie_aspm off
To: Bjorn Helgaas
Cc: Bjorn Helgaas, Ray Jui, linux-pci@vger.kernel.org, Rajat Jain,
    Keith Busch, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
In-Reply-To: <20180416213536.GC28657@bhelgaas-glaptop.roam.corp.google.com>
References: <20180413135659.GC46420@bhelgaas-glaptop.roam.corp.google.com>
    <20180413211651.GA80087@bhelgaas-glaptop.roam.corp.google.com>
    <20180414160918.GA158153@bhelgaas-glaptop.roam.corp.google.com>
    <20180416213536.GC28657@bhelgaas-glaptop.roam.corp.google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Bjorn,

Thank you for the additional insight into the problem.

For us, the issue occurs before the APST feature is disabled. When the
APST quirk is set, the NVMe driver disables APST by sending a command to
the NVMe controller. We see the issue during NVMe initialization itself,
so the APST quirk did not help.

On Tue, Apr 17, 2018 at 3:05 AM, Bjorn Helgaas wrote:
> [+cc Keith, linux-nvme, LKML; another possible ASPM issue with Samsung
> NVMe SSD 960 EVO]
>
> On Mon, Apr 16, 2018 at 09:03:33PM +0530, Srinath Mannam wrote:
>> On Sat, Apr 14, 2018 at 9:39 PM, Bjorn Helgaas wrote:
>> > On Sat, Apr 14, 2018 at 09:04:05AM +0530, Srinath Mannam wrote:
>> >> I am sorry, in previous mail by mistake I have written L1.s support is
>> >> not there, Actually I wanted to write L0s support is not there.
>> >> L1 and L1ss support is there.
>> >
>> > If your endpoint (and everything in the path) advertise both LTR and
>> > L1ss support, that patch probably won't make a difference.
>> >
>> > It *might* make a difference if only part of the path supports both,
>> > because my reading of the spec is that L1ss requires LTR and LTR
>> > requires the entire path to support LTR, and we currently don't
>> > enforce that "entire path" part before enabling L1ss.
>> >
>> Yes, this patch did not work.
>
> OK, thanks for checking. Since there's only one link in the path and
> both ends advertise L1SS and LTR support, I wouldn't expect it to make
> a difference.
>
>> >> But In our platform we required to disable ASPM.
>>
>> > We're trying to figure out exactly *why* you must disable ASPM. If
>> > it's because of a hardware defect, e.g., the device advertises ASPM
>> > support but it's actually broken, we probably need to add a quirk.
>> > Given the complexity of ASPM, it's surprising we don't have similar
>> > quirks already.
>>
>> We see issues with ASPM enabled. Some link issues observed so for
>> time being we are using with aspm disabled until we fix that issue.
>
> I see other reports of ASPM issues with that Samsung 960 PRO NVMe SSD.
> Maybe they're related?
>
> https://lkml.kernel.org/r/20171214184701.GA6322@libmpq.org
> https://forums.lenovo.com/t5/ThinkCentre-A-E-M-S-Series/M900-Tiny-UEFI-Bug-M-2-NVMe-SSD-amp-8260-WiFi-ASPM-disabled-Much/td-p/3570469
>
> You might try setting NVME_QUIRK_NO_APST to see if that's related.
> There are some quirks that sound similar:
>
> 8427bbc22486 ("nvme-pci: disable APST on Samsung SSD 960 EVO + ASUS PRIME B350M-A")
> 467c77d4cbef ("nvme-pci: disable APST for Samsung NVMe SSD 960 EVO + ASUS PRIME Z370-A")
>
> Keith, et al, here's the relevant part of Srinath's lspci. Both ends
> of the link claim to support ASPM including L1SS and LTR, but Srinath
> has to disable ASPM to get the SSD to work reliably. Just FYI.
>
>> 0000:00:00.0 PCI bridge: Broadcom Limited Device d714 (prog-if 00 [Normal decode])
>>     Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
>>     Capabilities: [ac] Express (v2) Root Port (Slot-), MSI 00
>>         LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <1us, L1 <2us
>>             ClockPM+ Surprise- LLActRep- BwNot+ ASPMOptComp+
>>         LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
>>             ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>>         DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Via WAKE# ARIFwd+
>>             AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
>>         DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
>>             AtomicOpsCtl: ReqEn- EgressBlck-
>>     Capabilities: [240 v1] L1 PM Substates
>>         L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
>>             PortCommonModeRestoreTime=8us PortTPowerOnTime=10us
>>         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
>>             T_CommonMode=1us LTR1.2_Threshold=0ns
>
>> 0000:01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961 (prog-if 02 [NVM Express])
>>     Capabilities: [70] Express (v2) Endpoint, MSI 00
>>         LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
>>             ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
>>         LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
>>             ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>>         DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Not Supported
>>             AtomicOpsCap: 32bit- 64bit- 128bitCAS-
>>         DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>>             AtomicOpsCtl: ReqEn-
>>     Capabilities: [190 v1] L1 PM Substates
>>         L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
>>             PortCommonModeRestoreTime=10us PortTPowerOnTime=10us
>>         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
>>             T_CommonMode=0us LTR1.2_Threshold=0ns
>
>
>> with LTR enabled also we observed some problem, that after LTR
>> messages received from EP, we see completion timeout with config
>> write.
>
>> So I thought If LTR configuration function also part of aspm file,
>> as it was under CONFIG_ASPM. using pcie_aspm boot arg I can disable
>> both ASPM and LTR.
>
>> If this is not possible, then I will go for alternative solution of
>> quirk implementation as you suggest.
>
> Is this platform a lab prototype or is it already shipping? If it's
> already shipping, you probably need some sort of upstream solution
> like an NVMe or PCIe quirk, but if not, maybe you can just hack your
> bringup kernel to disable ASPM and LTR until you fix the root cause.
>
We are at the evaluation stage, so we need to fix this ASAP.
As you said earlier, can I add a sysfs interface to enable LTR, the same
way we do for L1SS, or enable it as part of the ASPM capability init
function?

> Bjorn

Regards,
Srinath.
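
For reference, the LTR state shown in the quoted lspci output can be checked
mechanically. The following is only a rough sketch, not part of any kernel or
pciutils tooling: the function name `ltr_flags` and the `SAMPLE` text are
hypothetical, and the sample keeps only the DevCap2/DevCtl2 lines from the
lspci excerpt in this thread. It assumes `lspci -vv` style text, where
`LTR+` on the DevCap2 line means the device advertises LTR and `LTR+` on the
DevCtl2 line means LTR is currently enabled.

```python
import re

# Trimmed copy of the lspci excerpt from this thread (assumption: only the
# device header, DevCap2 and DevCtl2 lines are needed for this check).
SAMPLE = """\
0000:00:00.0 PCI bridge: Broadcom Limited Device d714 (prog-if 00 [Normal decode])
    DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Via WAKE# ARIFwd+
    DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
0000:01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961
    DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Not Supported
    DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
"""

# A device header starts with [domain:]bus:device.function followed by a space.
DEVICE_RE = re.compile(r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s")


def ltr_flags(lspci_text):
    """Return {address: {"supported": bool|None, "enabled": bool|None}}."""
    flags = {}
    current = None
    for raw in lspci_text.splitlines():
        line = raw.strip()
        match = DEVICE_RE.match(line)
        if match:
            current = match.group(1)
            flags[current] = {"supported": None, "enabled": None}
        elif current is not None and line.startswith("DevCap2:"):
            flags[current]["supported"] = "LTR+" in line
        elif current is not None and line.startswith("DevCtl2:"):
            flags[current]["enabled"] = "LTR+" in line
    return flags


if __name__ == "__main__":
    for address, state in ltr_flags(SAMPLE).items():
        print(address, state)
```

On the excerpt from this thread, both devices advertise LTR, but only the
root port has it enabled in DevCtl2; the endpoint still shows `LTR-`.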