Received: by 10.223.164.202 with SMTP id h10csp185181wrb; Tue, 7 Nov 2017 04:59:58 -0800 (PST) X-Google-Smtp-Source: ABhQp+Sk1RQCcnqFwIJndErmHQ887K4o+J+ZIfd/zNNCEwkDKuoGVJLIPR7ZN/MaRULp3Yun1UqF X-Received: by 10.101.81.131 with SMTP id h3mr18621300pgq.190.1510059598298; Tue, 07 Nov 2017 04:59:58 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1510059598; cv=none; d=google.com; s=arc-20160816; b=CJCTvu34Qk0fOaQp8+sHVJJoK6qPiE5TZPqbpcxAmDgdM8ttSy25ld8tsVdFon9TGB 3Fb6AsCmAbixNRQZti4m+wkHZzul6Ail/dxtu0+rmwfefpP/8qr3mTRSExRH6e8hVnTF oxdK2ubX/KF578uXDWHgksxzgRoD+2eiouP9g7J15MG8O9XI+tVPE0dLoaD6gFCiw78Y ItU8liyaWUvIw7V9AXLNYrwQ88kwgBytJoSMnDPetcq6BQUOjCsvR8VmFRJArmeX1v0G hnFj3xwGboncKLmMcqzzOmpEHuq99iBZ4UxrqwaEl8v6tyrNhf5M8nSv+BTRlnIkZSKU /9cA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:arc-authentication-results; bh=imU1zPV6WMGiTXgHnjBh4oGlP9onGV6bH+i69clvV8Y=; b=saEJtKwHHUCPYswWps2WqvMBw6NETnHsshivI5/f5Gn5HHnFAnL90zxEZWUIC+esT3 M8xIPqZ2cG7Y+6OvXtELE5F9M3BT4fONE1e6U+mDOIBFxDcEfjesLxqVNAi/3VN1wYjR ViW6yUMimMsI8IeAoTl3QOICHiFQq8JO+Y1T6QAXwXn/p5ojWrSLq+/6r7f3Sl9aHdob wHTtT2e0Q8/QiV4B6CFmkKD6jUtorRjfL4XmFz7KTUeMUNZyhJPv+mQiwU/mtDiOkcBE kpt7EwAy7t1ZdjH2Q0qTD1hTIBWW0mwNDFGIHuzCpAk9BikRPWFHD85BOg3tCcGaDNYR 2vWw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id i1si1152727plt.630.2017.11.07.04.59.45; Tue, 07 Nov 2017 04:59:58 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755898AbdKGLbi (ORCPT + 91 others); Tue, 7 Nov 2017 06:31:38 -0500 Received: from smtp.citrix.com.au ([103.14.252.240]:22059 "EHLO SMTP.CITRIX.COM.AU" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755000AbdKGLbg (ORCPT ); Tue, 7 Nov 2017 06:31:36 -0500 X-Greylist: delayed 592 seconds by postgrey-1.27 at vger.kernel.org; Tue, 07 Nov 2017 06:31:35 EST X-IronPort-AV: E=Sophos;i="5.44,358,1505779200"; d="scan'208";a="106688370" Date: Tue, 7 Nov 2017 11:21:23 +0000 From: Roger Pau =?iso-8859-1?Q?Monn=E9?= To: Govinda Tatti CC: , , , , Subject: Re: [Xen-devel] [PATCH] Xen/pciback: Implement PCI slot or bus reset with 'do_flr' SysFS attribute Message-ID: <20171107112123.t6w4v5irbjysmyhe@dhcp-3-128.uk.xensource.com> References: <20171106174842.20276-1-Govinda.Tatti@Oracle.COM> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Disposition: inline In-Reply-To: <20171106174842.20276-1-Govinda.Tatti@Oracle.COM> User-Agent: NeoMutt/20171013 X-ClientProxiedBy: AMSPEX02CAS02.citrite.net (10.69.22.113) To AMSPEX02CL03.citrite.net (10.69.22.127) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Nov 06, 2017 at 12:48:42PM -0500, Govinda Tatti wrote: > The life-cycle of a PCI device in Xen pciback is complex and is constrained > by the generic PCI locking mechanism. > > - It starts with the device being bound to us, for which we do a function > reset (done via SysFS so the PCI lock is held). > - If the device is unbound from us, we also do a function reset > (done via SysFS so the PCI lock is held). > - If the device is un-assigned from a guest - we do a function reset > (no PCI lock is held). > > All reset operations are done on the individual PCI function level > (so bus:device:function). > > The reset for an individual PCI function means device must support FLR > (PCIe or AF), PM reset on D3hot->D0 device specific reset, or a secondary > bus reset for a singleton device on a bus but FLR does not have widespread > support or it is not reliable in some cases. So, we need to provide an > alternate mechanism to users to perform a slot or bus level reset. > > Currently, a slot or bus reset is not exposed in SysFS as there is no good > way of exposing a bus topology there. This is due to the complexity - > we MUST know that the different functions of a PCIe device are not in use > by other drivers, or if they are in use (say one of them is assigned to a > guest and the other is idle) - it is still OK to reset the slot (assuming > both of them are owned by Xen pciback). > > This patch does that by doing a slot or bus reset (if slot not supported) > if all of the functions of a PCIe device belong to Xen PCIback. > > Due to the complexity with the PCI lock we cannot do the reset when a > device is bound ('echo $BDF > bind') or when unbound ('echo $BDF > unbind') > as the pci_[slot|bus]_reset also takes the same lock resulting in a > dead-lock. > > Putting the reset function in a work-queue or thread won't work either - > as we have to do the reset function outside the 'unbind' context (it holds > the PCI lock). But once you 'unbind' a device the device is no longer under > the ownership of Xen pciback and the pci_set_drvdata has been reset, so > we cannot use a thread for this. > > Instead of doing all this complex dance, we depend on the tool-stack doing > the right thing. As such, we implement the 'do_flr' SysFS attribute which > 'xl' uses when a device is detached or attached from/to a guest. It > bypasses the need to worry about the PCI lock. > > To not inadvertently do a bus reset that would affect devices that are in > use by other drivers (other than Xen pciback) prior to the reset, we check > that all of the devices under the bridge are owned by Xen pciback. If they > are not, we refrain from executing the bus (or slot) reset. > > Signed-off-by: Govinda Tatti > Signed-off-by: Konrad Rzeszutek Wilk > Reviewed-by: Boris Ostrovsky > --- > Documentation/ABI/testing/sysfs-driver-pciback | 12 +++ > drivers/xen/xen-pciback/pci_stub.c | 125 +++++++++++++++++++++++++ > 2 files changed, 137 insertions(+) > > diff --git a/Documentation/ABI/testing/sysfs-driver-pciback b/Documentation/ABI/testing/sysfs-driver-pciback > index 6a733bf..ccf7dc0 100644 > --- a/Documentation/ABI/testing/sysfs-driver-pciback > +++ b/Documentation/ABI/testing/sysfs-driver-pciback > @@ -11,3 +11,15 @@ Description: > #echo 00:19.0-E0:2:FF > /sys/bus/pci/drivers/pciback/quirks > will allow the guest to read and write to the configuration > register 0x0E. > + > +What: /sys/bus/pci/drivers/pciback/do_flr > +Date: Nov 2017 > +KernelVersion: 4.15 > +Contact: xen-devel@lists.xenproject.org > +Description: > + An option to perform a slot or bus reset when a PCI device > + is owned by Xen PCI backend. Writing a string of DDDD:BB:DD.F > + will cause the pciback driver to perform a slot or bus reset > + if the device supports it. It also checks to make sure that > + all of the devices under the bridge are owned by Xen PCI > + backend. > diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c > index 6331a95..2b2c269 100644 > --- a/drivers/xen/xen-pciback/pci_stub.c > +++ b/drivers/xen/xen-pciback/pci_stub.c > @@ -244,6 +244,96 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev, > return found_dev; > } > > +struct pcistub_args { > + struct pci_dev *dev; const? > + int dcount; unsigned int. > +}; > + > +static int pcistub_search_dev(struct pci_dev *dev, void *data) Seems like this function would better return a boolean rather than an int. > +{ > + struct pcistub_device *psdev; > + struct pcistub_args *arg = data; > + bool found_dev = false; > + unsigned long flags; > + > + spin_lock_irqsave(&pcistub_devices_lock, flags); > + > + list_for_each_entry(psdev, &pcistub_devices, dev_list) { > + if (psdev->dev == dev) { > + found_dev = true; > + arg->dcount++; > + break; > + } > + } > + > + spin_unlock_irqrestore(&pcistub_devices_lock, flags); > + > + /* Device not owned by pcistub, someone owns it. Abort the walk */ > + if (!found_dev) > + arg->dev = dev; > + > + return found_dev ? 0 : 1; > +} > + > +static int pcistub_reset_dev(struct pci_dev *dev) > +{ > + struct xen_pcibk_dev_data *dev_data; > + bool slot = false, bus = false; > + > + if (!dev) > + return -EINVAL; > + > + dev_dbg(&dev->dev, "[%s]\n", __func__); > + > + if (!pci_probe_reset_slot(dev->slot)) { > + slot = true; > + } else if (!pci_probe_reset_bus(dev->bus)) { > + /* We won't attempt to reset a root bridge. */ > + if (!pci_is_root_bus(dev->bus)) > + bus = true; Con't you join the two if, ie: } else if (!pci_probe_reset_bus(dev->bus) && !pci_is_root_bus(dev->bus)) { > + } > + > + if (!bus && !slot) > + return -EOPNOTSUPP; > + > + if (!slot) { > + struct pcistub_args arg = { .dev = NULL, .dcount = 0 }; > + > + /* > + * Make sure all devices on this bus are owned by the > + * PCI backend so that we can safely reset the whole bus. > + */ > + pci_walk_bus(dev->bus, pcistub_search_dev, &arg); > + > + /* All devices under the bus should be part of pcistub! */ > + if (arg.dev) { > + dev_err(&dev->dev, "%s device on the bus is not owned by pcistub\n", > + pci_name(arg.dev)); > + > + return -EBUSY; Not sure EBUSY is the best return code here, EINVAL? > + } > + > + dev_dbg(&dev->dev, "pcistub owns %d devices on the bus\n", > + arg.dcount); > + } > + > + dev_data = pci_get_drvdata(dev); > + if (!pci_load_saved_state(dev, dev_data->pci_saved_state)) > + pci_restore_state(dev); > + > + /* This disables the device. */ > + xen_pcibk_reset_device(dev); > + > + /* Cleanup up any emulated fields */ > + xen_pcibk_config_reset_dev(dev); > + > + dev_dbg(&dev->dev, "resetting %s device using %s reset\n", > + pci_name(dev), slot ? "slot" : "bus"); > + > + return slot ? pci_try_reset_slot(dev->slot) : > + pci_try_reset_bus(dev->bus); > +} > + > /* > * Called when: > * - XenBus state has been reconfigure (pci unplug). See xen_pcibk_remove_device > @@ -1434,6 +1524,34 @@ static ssize_t permissive_show(struct device_driver *drv, char *buf) > static DRIVER_ATTR(permissive, S_IRUSR | S_IWUSR, permissive_show, > permissive_add); > > +static ssize_t pcistub_do_flr(struct device_driver *drv, const char *buf, > + size_t count) > +{ > + struct pcistub_device *psdev; > + int domain, bus, slot, func; > + int err; > + > + err = str_to_slot(buf, &domain, &bus, &slot, &func); > + if (err) > + goto out; return err; > + > + psdev = pcistub_device_find(domain, bus, slot, func); if (!psdev) return -ENODEV; > + if (psdev) { > + err = pcistub_reset_dev(psdev->dev); > + pcistub_device_put(psdev); > + } else { > + err = -ENODEV; > + } > + > +out: > + if (!err) > + err = count; > + > + return err; What's the purpose of returning count here? Roger. From 1583346318922306893@xxx Mon Nov 06 19:31:58 +0000 2017 X-GM-THRID: 1583346318922306893 X-Gmail-Labels: Inbox,Category Forums,HistoricalUnread