Received: by 10.223.164.202 with SMTP id h10csp1614627wrb; Wed, 8 Nov 2017 07:02:49 -0800 (PST) X-Google-Smtp-Source: ABhQp+T2TroGB3qK+aDz4TOTCh+cGlwcZyQS/qdWPJy025aINFU+Zg24V0lq64dOaagdpFL2k2i1 X-Received: by 10.99.96.11 with SMTP id u11mr727700pgb.295.1510153369092; Wed, 08 Nov 2017 07:02:49 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1510153369; cv=none; d=google.com; s=arc-20160816; b=V+Fqcn7+VNnTIpRnsW0Y869Pj6h89jm5WFJ1XXlRxq7KxadEIBWEgNGx4opRrtqgCx H7Ugq02TF3+0xkWr7fF57U22QXhoyqj+6ifSe3FS16Spr1q+qaJaR96Ry24N/MLBTkzP FpL1pZ2b0R84wrCA7n7ji1yJVjl/p/vqNIViZoXQcfBB8Y+24FloHbj2d9lint9D03KA 8btP28W0RaIZdFC0qXUhgxE7RTHwWQ1uIs8vu9BEkNBb851T5LzZ1wjdCFK3nD0kcA5l 7gpZ8L615v35cSMTnm7n3PqfcRZT6mbT428ArXLpRH1XfrQLkVBaI5AGzKSZ0drhdMg5 VzPA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-language :content-transfer-encoding:in-reply-to:mime-version:user-agent:date :message-id:organization:from:references:cc:to:subject :arc-authentication-results; bh=9K67piH7Epe38hnGYYHUBz1tWw/loUrOPciz+TqflZQ=; b=s8HhWqz2UtOtvQ7MlsbGXPkCO2twSbMKa5JGSeOFyuxBk8whmKKGrGnsYyWI0Q4EQQ qO3y+5tzbc+eknfGJYdTpKPYOAfxoRgoUAsAeFSnM9HPbIIWFYO7C0MGCc2GOXOHqpvT r/AGC5hH4oQGf9/KXGlMN474kh6M5g8to/m07GW/yyHSB58XpaowJ9naHoIY7PZk1a2O IGT8c/kFa5G6qSjsSt/+HM69cpaFK5mrdbWDHVMqnO/lZoSI4VvvvLaRyp+LnIQ7NCH6 ZS7CeMfsZpOnwrZwhR8WEBmxKqu7tpTFp6N9aq/qw7D7cW6d4qLK2zKF9gtgB4AmqhGH W+fA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=oracle.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id b73si4152263pga.432.2017.11.08.07.02.36; Wed, 08 Nov 2017 07:02:49 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=oracle.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752673AbdKHPAw (ORCPT + 83 others); Wed, 8 Nov 2017 10:00:52 -0500 Received: from userp1040.oracle.com ([156.151.31.81]:38100 "EHLO userp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752565AbdKHPAv (ORCPT ); Wed, 8 Nov 2017 10:00:51 -0500 Received: from aserv0022.oracle.com (aserv0022.oracle.com [141.146.126.234]) by userp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id vA8F0hcF022332 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 8 Nov 2017 15:00:44 GMT Received: from userv0122.oracle.com (userv0122.oracle.com [156.151.31.75]) by aserv0022.oracle.com (8.14.4/8.14.4) with ESMTP id vA8F0gtM008618 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 8 Nov 2017 15:00:43 GMT Received: from abhmp0017.oracle.com (abhmp0017.oracle.com [141.146.116.23]) by userv0122.oracle.com (8.14.4/8.14.4) with ESMTP id vA8F0g4b026287; Wed, 8 Nov 2017 15:00:42 GMT Received: from [10.135.190.159] (/10.135.190.159) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 08 Nov 2017 07:00:41 -0800 Subject: Re: [Xen-devel] [PATCH] Xen/pciback: Implement PCI slot or bus reset with 'do_flr' SysFS attribute To: =?UTF-8?Q?Roger_Pau_Monn=c3=a9?= Cc: jgross@suse.com, xen-devel@lists.xenproject.org, boris.ostrovsky@oracle.com, linux-kernel@vger.kernel.org, konrad.wilk@oracle.com References: <20171106174842.20276-1-Govinda.Tatti@Oracle.COM> <20171107112123.t6w4v5irbjysmyhe@dhcp-3-128.uk.xensource.com> From: Govinda Tatti Organization: Oracle Corporation Message-ID: Date: Wed, 8 Nov 2017 09:00:33 -0600 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0 MIME-Version: 1.0 In-Reply-To: <20171107112123.t6w4v5irbjysmyhe@dhcp-3-128.uk.xensource.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US X-Source-IP: aserv0022.oracle.com [141.146.126.234] Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Thanks Roger for your review comments. Please see below for my comments. On 11/7/2017 5:21 AM, Roger Pau Monné wrote: > On Mon, Nov 06, 2017 at 12:48:42PM -0500, Govinda Tatti wrote: >> The life-cycle of a PCI device in Xen pciback is complex and is constrained >> by the generic PCI locking mechanism. >> >> - It starts with the device being bound to us, for which we do a function >> reset (done via SysFS so the PCI lock is held). >> - If the device is unbound from us, we also do a function reset >> (done via SysFS so the PCI lock is held). >> - If the device is un-assigned from a guest - we do a function reset >> (no PCI lock is held). >> >> All reset operations are done on the individual PCI function level >> (so bus:device:function). >> >> The reset for an individual PCI function means device must support FLR >> (PCIe or AF), PM reset on D3hot->D0 device specific reset, or a secondary >> bus reset for a singleton device on a bus but FLR does not have widespread >> support or it is not reliable in some cases. So, we need to provide an >> alternate mechanism to users to perform a slot or bus level reset. >> >> Currently, a slot or bus reset is not exposed in SysFS as there is no good >> way of exposing a bus topology there. This is due to the complexity - >> we MUST know that the different functions of a PCIe device are not in use >> by other drivers, or if they are in use (say one of them is assigned to a >> guest and the other is idle) - it is still OK to reset the slot (assuming >> both of them are owned by Xen pciback). >> >> This patch does that by doing a slot or bus reset (if slot not supported) >> if all of the functions of a PCIe device belong to Xen PCIback. >> >> Due to the complexity with the PCI lock we cannot do the reset when a >> device is bound ('echo $BDF > bind') or when unbound ('echo $BDF > unbind') >> as the pci_[slot|bus]_reset also takes the same lock resulting in a >> dead-lock. >> >> Putting the reset function in a work-queue or thread won't work either - >> as we have to do the reset function outside the 'unbind' context (it holds >> the PCI lock). But once you 'unbind' a device the device is no longer under >> the ownership of Xen pciback and the pci_set_drvdata has been reset, so >> we cannot use a thread for this. >> >> Instead of doing all this complex dance, we depend on the tool-stack doing >> the right thing. As such, we implement the 'do_flr' SysFS attribute which >> 'xl' uses when a device is detached or attached from/to a guest. It >> bypasses the need to worry about the PCI lock. >> >> To not inadvertently do a bus reset that would affect devices that are in >> use by other drivers (other than Xen pciback) prior to the reset, we check >> that all of the devices under the bridge are owned by Xen pciback. If they >> are not, we refrain from executing the bus (or slot) reset. >> >> Signed-off-by: Govinda Tatti >> Signed-off-by: Konrad Rzeszutek Wilk >> Reviewed-by: Boris Ostrovsky >> --- >> Documentation/ABI/testing/sysfs-driver-pciback | 12 +++ >> drivers/xen/xen-pciback/pci_stub.c | 125 +++++++++++++++++++++++++ >> 2 files changed, 137 insertions(+) >> >> diff --git a/Documentation/ABI/testing/sysfs-driver-pciback b/Documentation/ABI/testing/sysfs-driver-pciback >> index 6a733bf..ccf7dc0 100644 >> --- a/Documentation/ABI/testing/sysfs-driver-pciback >> +++ b/Documentation/ABI/testing/sysfs-driver-pciback >> @@ -11,3 +11,15 @@ Description: >> #echo 00:19.0-E0:2:FF > /sys/bus/pci/drivers/pciback/quirks >> will allow the guest to read and write to the configuration >> register 0x0E. >> + >> +What: /sys/bus/pci/drivers/pciback/do_flr >> +Date: Nov 2017 >> +KernelVersion: 4.15 >> +Contact: xen-devel@lists.xenproject.org >> +Description: >> + An option to perform a slot or bus reset when a PCI device >> + is owned by Xen PCI backend. Writing a string of DDDD:BB:DD.F >> + will cause the pciback driver to perform a slot or bus reset >> + if the device supports it. It also checks to make sure that >> + all of the devices under the bridge are owned by Xen PCI >> + backend. >> diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c >> index 6331a95..2b2c269 100644 >> --- a/drivers/xen/xen-pciback/pci_stub.c >> +++ b/drivers/xen/xen-pciback/pci_stub.c >> @@ -244,6 +244,96 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev, >> return found_dev; >> } >> >> +struct pcistub_args { >> + struct pci_dev *dev; > const? This field will point to first device that is not owned by pcistub. > >> + int dcount; > unsigned int. OK. > >> +}; >> + >> +static int pcistub_search_dev(struct pci_dev *dev, void *data) > Seems like this function would better return a boolean rather than an > int. pcistub_search_dev() is invoked through pci_walk_bus() and it expects int return code. > >> +{ >> + struct pcistub_device *psdev; >> + struct pcistub_args *arg = data; >> + bool found_dev = false; >> + unsigned long flags; >> + >> + spin_lock_irqsave(&pcistub_devices_lock, flags); >> + >> + list_for_each_entry(psdev, &pcistub_devices, dev_list) { >> + if (psdev->dev == dev) { >> + found_dev = true; >> + arg->dcount++; >> + break; >> + } >> + } >> + >> + spin_unlock_irqrestore(&pcistub_devices_lock, flags); >> + >> + /* Device not owned by pcistub, someone owns it. Abort the walk */ >> + if (!found_dev) >> + arg->dev = dev; >> + >> + return found_dev ? 0 : 1; >> +} >> + >> +static int pcistub_reset_dev(struct pci_dev *dev) >> +{ >> + struct xen_pcibk_dev_data *dev_data; >> + bool slot = false, bus = false; >> + >> + if (!dev) >> + return -EINVAL; >> + >> + dev_dbg(&dev->dev, "[%s]\n", __func__); >> + >> + if (!pci_probe_reset_slot(dev->slot)) { >> + slot = true; >> + } else if (!pci_probe_reset_bus(dev->bus)) { >> + /* We won't attempt to reset a root bridge. */ >> + if (!pci_is_root_bus(dev->bus)) >> + bus = true; > Con't you join the two if, ie: > > } else if (!pci_probe_reset_bus(dev->bus) && !pci_is_root_bus(dev->bus)) { Ok > >> + } >> + >> + if (!bus && !slot) >> + return -EOPNOTSUPP; >> + >> + if (!slot) { >> + struct pcistub_args arg = { .dev = NULL, .dcount = 0 }; >> + >> + /* >> + * Make sure all devices on this bus are owned by the >> + * PCI backend so that we can safely reset the whole bus. >> + */ >> + pci_walk_bus(dev->bus, pcistub_search_dev, &arg); >> + >> + /* All devices under the bus should be part of pcistub! */ >> + if (arg.dev) { >> + dev_err(&dev->dev, "%s device on the bus is not owned by pcistub\n", >> + pci_name(arg.dev)); >> + >> + return -EBUSY; > Not sure EBUSY is the best return code here, EINVAL? I don't think EINVAL is right return code for this case. > >> + } >> + >> + dev_dbg(&dev->dev, "pcistub owns %d devices on the bus\n", >> + arg.dcount); >> + } >> + >> + dev_data = pci_get_drvdata(dev); >> + if (!pci_load_saved_state(dev, dev_data->pci_saved_state)) >> + pci_restore_state(dev); >> + >> + /* This disables the device. */ >> + xen_pcibk_reset_device(dev); >> + >> + /* Cleanup up any emulated fields */ >> + xen_pcibk_config_reset_dev(dev); >> + >> + dev_dbg(&dev->dev, "resetting %s device using %s reset\n", >> + pci_name(dev), slot ? "slot" : "bus"); >> + >> + return slot ? pci_try_reset_slot(dev->slot) : >> + pci_try_reset_bus(dev->bus); >> +} >> + >> /* >> * Called when: >> * - XenBus state has been reconfigure (pci unplug). See xen_pcibk_remove_device >> @@ -1434,6 +1524,34 @@ static ssize_t permissive_show(struct device_driver *drv, char *buf) >> static DRIVER_ATTR(permissive, S_IRUSR | S_IWUSR, permissive_show, >> permissive_add); >> >> +static ssize_t pcistub_do_flr(struct device_driver *drv, const char *buf, >> + size_t count) >> +{ >> + struct pcistub_device *psdev; >> + int domain, bus, slot, func; >> + int err; >> + >> + err = str_to_slot(buf, &domain, &bus, &slot, &func); >> + if (err) >> + goto out; > return err; > >> + >> + psdev = pcistub_device_find(domain, bus, slot, func); > if (!psdev) > return -ENODEV; > >> + if (psdev) { >> + err = pcistub_reset_dev(psdev->dev); >> + pcistub_device_put(psdev); >> + } else { >> + err = -ENODEV; >> + } >> + >> +out: >> + if (!err) >> + err = count; >> + >> + return err; > What's the purpose of returning count here? Not sure. Will check with Konrad and address this comment. I will post revised patch later this week. Thanks. Cheers GOVINDA From 1583450716039652801@xxx Tue Nov 07 23:11:19 +0000 2017 X-GM-THRID: 1583346318922306893 X-Gmail-Labels: Inbox,Category Forums,HistoricalUnread