Date: Mon, 23 Apr 2018 13:10:44 -0600
From: Alex Williamson
To: Sinan Kaya
Cc: Bjorn Helgaas, Jason Gunthorpe, Bjorn Helgaas,
 linux-pci@vger.kernel.org, sulrich@codeaurora.org, timur@codeaurora.org,
 linux-arm-msm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 Mike Marciniszyn, Dennis Dalessandro, Doug Ledford,
 "open list:HFI1 DRIVER", open list, Alex Deucher, Rajat Jain
Subject: Re: [PATCH 1/2] IB/hfi1: Try slot reset before secondary bus reset
Message-ID: <20180423131044.53471670@w520.home>
In-Reply-To: <10d9cf68-29ed-d205-a25f-b8dade53cdd8@codeaurora.org>
References: <1524167784-5911-1-git-send-email-okaya@codeaurora.org>
 <20180419202632.GE14063@ziepe.ca>
 <20180419214722.GO28657@bhelgaas-glaptop.roam.corp.google.com>
 <290e9530-dcde-9c10-7ae0-59ac4c509db4@codeaurora.org>
 <20180420140049.GP28657@bhelgaas-glaptop.roam.corp.google.com>
 <20180420090420.03fb1e6c@w520.home>
 <10d9cf68-29ed-d205-a25f-b8dade53cdd8@codeaurora.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 23 Apr 2018 13:28:22 -0400
Sinan Kaya wrote:

> On 4/20/2018 11:04 AM, Alex Williamson wrote:
> > Is there a concern here about whether the endpoint device driver or the
> > PCI core really knows better about link retraining? This makes me
> > remember my unfinished (and need to revisit) Pericom quirk[1] where
> > errata in the PCIe switch requires that upstream and downstream links
> > are balanced (ie. same link rate) or else enabling ACS results in
> > packets not properly flowing through the switch. If an endpoint driver
> > starts deciding to retrain links, overriding quirks in the PCI core,
> > then such topology manipulation isn't possible. Why does the
> > driver .probe() function think it can retrain at a higher link rate
> > than PCI core? Thanks,
>
> The example given is for some serdes firmware load to happen in probe and
> then performing a retrain to reach to a better speed.
>
> It becomes a chicken and egg problem.
>
> 1. Endpoint HW trains to gen1 by default pre-boot.
> 2. PCI core enumerates the device.
> 3. Endpoint driver gets loaded
> 4. Driver does the firmware programming followed by a link retrain.
>
> I think it is the responsibility of the PCI core to provide reset APIs.
> However, expecting endpoint drivers to be knowledgeable about hotplug is
> too much.
>
> We can certainly contain AER change into pci directory by moving the slot
> reset function to drivers/pci.h file.
>
> But, we need to think about what to do about VFIO and other endpoint
> initiated reset cases. My suggestion was to move this into a single API and
> remove all other APIs from include/linux/pci.h.

I'm a little confused about the relation between reset and retrain.
AIUI we can retrain the link without any sort of endpoint reset and if
some sort of driver/firmware setup is required on the endpoint to
achieve the target link speed, then I'd think we only want to retrain.

How this is going to work with vfio is an interesting question. We're
only providing access to the device, not the link to the device.
Multifunction endpoints become a big problem if one function starts
requesting link retraining while another is in use elsewhere.

> Coming back to this patch...
>
> Do we need a retrain API with the speed that driver wants to reach to?
> API can return what was actually accomplished. The quirk from Alex can
> run inside this API to make a decision on what speed do we actually want
> to reach to for a given PCI topology by reprogramming the target link speed
> field.

Yes, I think the core should provide a retraining API, that would also
make the Pericom quirk easier to implement. We'd want a field/flag on
the pcidev that could be set by quirks to limit the highest target
rate, but it makes sense that this should be an interface provided by
and control point for the PCI core. Thanks,

Alex
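
For illustration only, here is a rough userspace model of the retrain interface discussed above: the driver asks for a speed, a quirk may have capped the highest target rate on the device, and the call returns the speed actually reached. This is not kernel code. Every name in it (struct model_dev, quirk_limit_speed(), model_retrain_link(), the speed enum) is hypothetical, and the model assumes training to the clamped target always succeeds.

```c
/* Userspace model only; not kernel code. All names are hypothetical. */
#include <assert.h>
#include <stddef.h>

enum link_speed {
	SPEED_UNKNOWN = 0,	/* also used here to mean "no quirk cap" */
	SPEED_GEN1 = 1,
	SPEED_GEN2 = 2,
	SPEED_GEN3 = 3,
};

struct model_dev {
	enum link_speed cur_speed;	 /* speed the link is trained to now */
	enum link_speed hw_max_speed;	 /* highest speed both link partners support */
	enum link_speed quirk_max_speed; /* cap set by a core quirk, if any */
};

static enum link_speed min_speed(enum link_speed a, enum link_speed b)
{
	return a < b ? a : b;
}

/* A core quirk (for example a switch erratum) limits the highest target rate. */
static void quirk_limit_speed(struct model_dev *dev, enum link_speed cap)
{
	assert(dev != NULL);
	dev->quirk_max_speed = cap;
}

/*
 * Core-owned retrain: clamp the driver's request to what the hardware
 * supports and to any quirk cap, "retrain", and report what was reached.
 */
static enum link_speed model_retrain_link(struct model_dev *dev,
					  enum link_speed wanted)
{
	enum link_speed target;

	assert(dev != NULL);
	target = min_speed(wanted, dev->hw_max_speed);
	if (dev->quirk_max_speed != SPEED_UNKNOWN)
		target = min_speed(target, dev->quirk_max_speed);

	dev->cur_speed = target;	/* model: training always succeeds */
	return target;
}
```

In this model an endpoint driver that has loaded its firmware at gen1 would call model_retrain_link(dev, SPEED_GEN3) and compare the return value with what it asked for; if a quirk had capped the device at gen2, the call returns gen2 and the driver never touches the link itself.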