Date: Tue, 11 Apr 2017 15:51:34 -0600
From: Alex Williamson
To: bodong@mellanox.com
Cc: bhelgaas@google.com, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, saeedm@mellanox.com, Eli Cohen
Subject: Re: [v2] PCI: Add an option to control probing of VFs before enabling SR-IOV
Message-ID: <20170411155134.70108168@t450s.home>
In-Reply-To: <1490198038-20465-1-git-send-email-bodong@mellanox.com>

On Wed, 22 Mar 2017 17:53:58 +0200
bodong@mellanox.com wrote:

> From: Bodong Wang
>
> Sometimes it is not desirable to probe the virtual functions after
> SRIOV is enabled. This can save host side resource usage by VF
> instances which would be eventually probed to VMs.
>
> Add a new PCI sysfs interface "sriov_probe_vfs" to control that
> from the PF, all current callers still retain the same functionality.
> To modify it, echo 0/n/N (disable probe) or 1/y/Y (enable probe) to
>
>	/sys/bus/pci/devices//sriov_probe_vfs
>
> Note that, the choice must be made before enabling VFs. The change
> will not take effect if VFs are already enabled. Simply, one can set
> sriov_numvfs to 0, choose whether to probe or not, and then resume
> sriov_numvfs.
>
> Signed-off-by: Bodong Wang
> Signed-off-by: Eli Cohen
> Reviewed-by: Gavin Shan
> ---
>  Documentation/PCI/pci-iov-howto.txt | 10 ++++++++++
>  drivers/pci/iov.c                   |  1 +
>  drivers/pci/pci-driver.c            | 22 ++++++++++++++++++----
>  drivers/pci/pci-sysfs.c             | 28 ++++++++++++++++++++++++++++
>  drivers/pci/pci.h                   |  1 +
>  5 files changed, 58 insertions(+), 4 deletions(-)

There should be an update to Documentation/ABI/testing/sysfs-bus-pci in
here too.

>
> diff --git a/Documentation/PCI/pci-iov-howto.txt b/Documentation/PCI/pci-iov-howto.txt
> index 2d91ae2..902a528 100644
> --- a/Documentation/PCI/pci-iov-howto.txt
> +++ b/Documentation/PCI/pci-iov-howto.txt
> @@ -68,6 +68,16 @@ To disable SR-IOV capability:
>  	echo 0 > \
>  	/sys/bus/pci/devices//sriov_numvfs
>
> +To enable probing VFs by a compatible driver on the host:

nit, probably a good idea to note that probing is enabled by default.

> +Before enabling SR-IOV capabilities, do:
> +	echo 1 > \
> +	/sys/bus/pci/devices//sriov_probe_vfs
> +
> +To disable probing VFs by a compatible driver on the host:
> +Before enabling SR-IOV capabilities, do:
> +	echo 0 > \
> +	/sys/bus/pci/devices//sriov_probe_vfs
> +
>  3.2 Usage example
>
>  Following piece of code illustrates the usage of the SR-IOV API.
> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> index 2479ae8..70691de 100644
> --- a/drivers/pci/iov.c
> +++ b/drivers/pci/iov.c
> @@ -450,6 +450,7 @@ static int sriov_init(struct pci_dev *dev, int pos)
>  	iov->total_VFs = total;
>  	iov->pgsz = pgsz;
>  	iov->self = dev;
> +	iov->probe_vfs = true;
>  	pci_read_config_dword(dev, pos + PCI_SRIOV_CAP, &iov->cap);
>  	pci_read_config_byte(dev, pos + PCI_SRIOV_FUNC_LINK, &iov->link);
>  	if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END)
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index afa7271..2a1cf84 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -394,6 +394,18 @@ void __weak pcibios_free_irq(struct pci_dev *dev)
>  {
>  }
>
> +#ifdef CONFIG_PCI_IOV
> +static inline bool pci_device_can_probe(struct pci_dev *pdev)
> +{
> +	return (!pdev->is_virtfn || pdev->physfn->sriov->probe_vfs);
> +}
> +#else
> +static inline bool pci_device_can_probe(struct pci_dev *pdev)
> +{
> +	return true;
> +}
> +#endif
> +
>  static int pci_device_probe(struct device *dev)
>  {
>  	int error;
> @@ -405,10 +417,12 @@ static int pci_device_probe(struct device *dev)
>  		return error;
>
>  	pci_dev_get(pci_dev);
> -	error = __pci_device_probe(drv, pci_dev);
> -	if (error) {
> -		pcibios_free_irq(pci_dev);
> -		pci_dev_put(pci_dev);
> +	if (pci_device_can_probe(pci_dev)) {
> +		error = __pci_device_probe(drv, pci_dev);
> +		if (error) {
> +			pcibios_free_irq(pci_dev);
> +			pci_dev_put(pci_dev);
> +		}
>  	}
>
>  	return error;
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 25d010d..1d5b89d 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -526,10 +526,37 @@ static ssize_t sriov_numvfs_store(struct device *dev,
>  	return count;
>  }
>
> +static ssize_t sriov_probe_vfs_show(struct device *dev,
> +				    struct device_attribute *attr,
> +				    char *buf)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +
> +	return sprintf(buf, "%u\n", pdev->sriov->probe_vfs);
> +}
> +
> +static ssize_t sriov_probe_vfs_store(struct device *dev,
> +				     struct device_attribute *attr,
> +				     const char *buf, size_t count)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	bool probe_vfs;
> +
> +	if (kstrtobool(buf, &probe_vfs) < 0)
> +		return -EINVAL;
> +
> +	pdev->sriov->probe_vfs = probe_vfs;
> +
> +	return count;
> +}
> +
>  static struct device_attribute sriov_totalvfs_attr = __ATTR_RO(sriov_totalvfs);
>  static struct device_attribute sriov_numvfs_attr =
>  		__ATTR(sriov_numvfs, (S_IRUGO|S_IWUSR|S_IWGRP),
>  		       sriov_numvfs_show, sriov_numvfs_store);
> +static struct device_attribute sriov_probe_vfs_attr =
> +		__ATTR(sriov_probe_vfs, (S_IRUGO|S_IWUSR|S_IWGRP),
> +		       sriov_probe_vfs_show, sriov_probe_vfs_store);
>  #endif /* CONFIG_PCI_IOV */
>
>  static ssize_t driver_override_store(struct device *dev,
> @@ -1549,6 +1576,7 @@ static umode_t pci_dev_hp_attrs_are_visible(struct kobject *kobj,
>  static struct attribute *sriov_dev_attrs[] = {
>  	&sriov_totalvfs_attr.attr,
>  	&sriov_numvfs_attr.attr,
> +	&sriov_probe_vfs_attr.attr,
>  	NULL,
>  };
>
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 8dd38e6..b7d8127d 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -272,6 +272,7 @@ struct pci_sriov {
>  	struct pci_dev *self;	/* this PF */
>  	struct mutex lock;	/* lock for setting sriov_numvfs in sysfs */
>  	resource_size_t barsz[PCI_SRIOV_NUM_BARS];	/* VF BAR size */
> +	bool probe_vfs;		/* control probing of VFs */
>  };
>
>  #ifdef CONFIG_PCI_ATS

Aside from the missing ABI update and howto nit, this seems ok to me,
though I wonder if this isn't a more general problem that should maybe
be solved in the driver core for any device that may create/expose
child devices. The drivers_autoprobe interface we currently have is
pretty limited with its bus-subsystem level scope. Thanks,

Alex
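
For reference, the sequence described in the commit message (set
sriov_numvfs to 0, choose the probe setting, then restore sriov_numvfs)
might be scripted roughly as follows. This is only a sketch and not part
of the patch: set_vf_probe and its three arguments are hypothetical
names, the PF's sysfs directory is supplied by the caller rather than
assumed here, and the sriov_probe_vfs attribute exists only on a kernel
with this patch applied.

```shell
# Hypothetical helper, not part of the patch under review.
#   $1  sysfs directory of the PF (a directory under /sys/bus/pci/devices/)
#   $2  0 to disable or 1 to enable host-side probing of VFs
#   $3  number of VFs to enable afterwards
set_vf_probe() {
    dev_dir=$1
    probe=$2
    numvfs=$3

    # The choice has to be made before VFs are enabled, so drop any
    # existing VFs first.
    echo 0 > "$dev_dir/sriov_numvfs" || return 1

    # Record whether newly created VFs should be probed on the host.
    echo "$probe" > "$dev_dir/sriov_probe_vfs" || return 1

    # Re-enable the requested number of VFs under the new setting.
    echo "$numvfs" > "$dev_dir/sriov_numvfs" || return 1
}
```

Called as set_vf_probe DIR 0 4, where DIR is the PF's sysfs directory,
this would leave four VFs enabled with no host driver probed for them,
assuming the attribute behaves as the commit message describes.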