Date: Mon, 23 Apr 2018 14:41:04 -0600
From: Alex Williamson
To: Bjorn Helgaas
Cc: Sinan Kaya, Jason Gunthorpe, Bjorn Helgaas, linux-pci@vger.kernel.org,
    sulrich@codeaurora.org, timur@codeaurora.org, linux-arm-msm@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org, Mike Marciniszyn, Dennis Dalessandro,
    Doug Ledford, "open list:HFI1 DRIVER", open list, Alex Deucher, Rajat Jain
Subject: Re: [PATCH 1/2] IB/hfi1: Try slot reset before secondary bus reset
Message-ID: <20180423144104.2dad7495@w520.home>
In-Reply-To: <20180423202311.GA164898@bhelgaas-glaptop.roam.corp.google.com>
References: <1524167784-5911-1-git-send-email-okaya@codeaurora.org>
    <20180419202632.GE14063@ziepe.ca>
    <20180419214722.GO28657@bhelgaas-glaptop.roam.corp.google.com>
    <290e9530-dcde-9c10-7ae0-59ac4c509db4@codeaurora.org>
    <20180420140049.GP28657@bhelgaas-glaptop.roam.corp.google.com>
    <20180420090420.03fb1e6c@w520.home>
    <10d9cf68-29ed-d205-a25f-b8dade53cdd8@codeaurora.org>
    <20180423131044.53471670@w520.home>
    <20180423202311.GA164898@bhelgaas-glaptop.roam.corp.google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 23 Apr 2018 15:23:11 -0500
Bjorn Helgaas wrote:

> On Mon, Apr 23, 2018 at 01:10:44PM -0600, Alex Williamson wrote:
> > On Mon, 23 Apr 2018 13:28:22 -0400
> > Sinan Kaya wrote:
> >
> > > On 4/20/2018 11:04 AM, Alex Williamson wrote:
> > > > Is there a concern here about whether the endpoint device driver or the
> > > > PCI core really knows better about link retraining? This makes me
> > > > remember my unfinished (and need to revisit) Pericom quirk[1] where
> > > > errata in the PCIe switch requires that upstream and downstream links
> > > > are balanced (ie. same link rate) or else enabling ACS results in
> > > > packets not properly flowing through the switch. If an endpoint driver
> > > > starts deciding to retrain links, overriding quirks in the PCI core,
> > > > then such topology manipulation isn't possible. Why does the
> > > > driver .probe() function think it can retrain at a higher link rate
> > > > than PCI core? Thanks,
> > >
> > > The example given is for some serdes firmware load to happen in probe and
> > > then performing a retrain to reach to a better speed.
> > >
> > > It becomes a chicken and egg problem.
> > >
> > > 1. Endpoint HW trains to gen1 by default pre-boot.
> > > 2. PCI core enumerates the device.
> > > 3. Endpoint driver gets loaded
> > > 4. Driver does the firmware programming followed by a link retrain.
> > >
> > > I think it is the responsibility of the PCI core to provide reset APIs.
> > > However, expecting endpoint drivers to be knowledgeable about hotplug is
> > > too much.
> > >
> > > We can certainly contain AER change into pci directory by moving the slot
> > > reset function to drivers/pci.h file.
> > >
> > > But, we need to think about what to do about VFIO and other endpoint
> > > initiated reset cases. My suggestion was to move this into a single API and
> > > remove all other APIs from include/linux/pci.h.
> >
> > I'm a little confused about the relation between reset and retrain.
> > AIUI we can retrain the link without any sort of endpoint reset and if
> > some sort of driver/firmware setup is required on the endpoint to
> > achieve the target link speed, then I'd think we only want to retrain.
>
> In hfi1, do_pcie_gen3_transition() resets the device. I don't know if
> retraining the link would be sufficient; maybe the reset is required
> to make the device use the new firmware. I guess we already export
> reset interfaces, so if we add a retrain interface, drivers could
> choose what they need.
>
> > How this is going to work with vfio is an interesting question. We're
> > only providing access to the device, not the link to the device.
> >
> > Multifunction endpoints become a big problem if one function starts
> > requesting link retraining while another is in use elsewhere.
>
> Can we just make it the driver's problem by returning -EPERM if one
> function requests a retrain while another function is in use?

Yes, this is basically how we handle bus resets through vfio-pci: if
multiple devices are affected by a bus reset, whether slots or
functions, the caller needs to prove ownership of all of them before we
do the reset. That's great for video cards, where the GPU and audio
functions are typically assigned together anyway. I don't know that
it's a great solution for IB, though, where you're more likely to want
to assign separate functions to separate users, and then nobody can get
the device to run at full speed unless all the functions are assigned
to one user. Thanks,

Alex
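
As an illustration of the ownership rule described above, here is a minimal userspace sketch in C. It is not kernel or vfio-pci code: struct fn_model, its owner field, and bus_reset_allowed() are hypothetical names made up for this example, and the only behavior taken from the discussion is "refuse with -EPERM unless the caller owns every affected function".

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical model of one PCI function that a bus reset (or a link
 * retrain) would affect. None of these names exist in the kernel.
 */
struct fn_model {
    int slot;   /* slot number on the affected bus */
    int func;   /* function number within the slot */
    int owner;  /* id of the user this function is assigned to */
};

/*
 * Return 0 if `caller` owns every affected function, -EPERM otherwise.
 * This mirrors the rule from the mail: the caller has to prove
 * ownership of all affected devices before the operation is allowed.
 */
int bus_reset_allowed(const struct fn_model *affected, size_t n, int caller)
{
    for (size_t i = 0; i < n; i++) {
        if (affected[i].owner != caller)
            return -EPERM;
    }
    return 0;
}
```

With a GPU and its audio function both assigned to user 1, bus_reset_allowed() returns 0 for user 1. With two IB functions assigned to users 1 and 2, it returns -EPERM for both callers, which is the case where nobody could retrain the link to full speed.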