2017-09-27 13:17:48

by David Woodhouse

[permalink] [raw]
Subject: [PATCH] uio/uio_pci_generic: Add SR-IOV support

From: David Woodhouse <[email protected]>

Allow userspace to configure SR-IOV VFs through sysfs.

Currently, we need an in-kernel driver to permit this. But sometimes
*all* we want to do is enable the VFs so that we can assign them to
guests; we don't actually need to deal with the PF in any other way
from the host kernel. So let's make it possible to use UIO for that.

Signed-off-by: David Woodhouse <[email protected]>
---
It's not entirely clear to me why we require the driver to "enable"
SR-IOV like this anyway — were there some which needed to do something
special and device-specific instead of just falling through to
pci_{en,dis}able_sriov(), such that we need to effectively whitelist
this in the driver rather than blacklisting the "problematic" ones via
PCI quirks?

drivers/uio/uio_pci_generic.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/drivers/uio/uio_pci_generic.c b/drivers/uio/uio_pci_generic.c
index a56fdf9..bd196f0 100644
--- a/drivers/uio/uio_pci_generic.c
+++ b/drivers/uio/uio_pci_generic.c
@@ -108,15 +108,27 @@ static void remove(struct pci_dev *pdev)
struct uio_pci_generic_dev *gdev = pci_get_drvdata(pdev);

uio_unregister_device(&gdev->info);
+ pci_disable_sriov(pdev);
pci_disable_device(pdev);
kfree(gdev);
}

+static int sriov_configure(struct pci_dev *pdev, int num_vfs)
+{
+ if (!num_vfs) {
+ pci_disable_sriov(pdev);
+ return 0;
+ }
+
+ return pci_enable_sriov(pdev, num_vfs);
+}
+
static struct pci_driver uio_pci_driver = {
.name = "uio_pci_generic",
.id_table = NULL, /* only dynamic id's */
.probe = probe,
.remove = remove,
+ .sriov_configure = sriov_configure,
};

module_pci_driver(uio_pci_driver);
--
2.7.4

--
dwmw2


Attachments:
smime.p7s (4.82 kB)

2017-09-27 22:00:13

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH] uio/uio_pci_generic: Add SR-IOV support

[+cc Don, Alex D, Alex W, Bryant, Bodong, Michael, kvm list]

On Wed, Sep 27, 2017 at 01:59:22PM +0100, David Woodhouse wrote:
> From: David Woodhouse <[email protected]>
>
> Allow userspace to configure SR-IOV VFs through sysfs.
>
> Currently, we need an in-kernel driver to permit this. But sometimes
> *all* we want to do is enable the VFs so that we can assign them to
> guests; we don't actually need to deal with the PF in any other way
> from the host kernel. So let's make it possible to use UIO for that.
>
> Signed-off-by: David Woodhouse <[email protected]>
> ---
> It's not entirely clear to me why we require the driver to "enable"
> SR-IOV like this anyway — were there some which needed to do something
> special and device-specific instead of just falling through to
> pci_{en,dis}able_sriov(), such that we need to effectively whitelist
> this in the driver rather than blacklisting the "problematic" ones via
> PCI quirks?

IIUC, this question is basically "why doesn't the PCI core enable IOV
automatically when it sees an IOV-capable device?"

I think one reason is that an admin might want to control the number
of VFs we enable (e.g., via 1789382a72a5 ("PCI: SRIOV control and
status via sysfs" [1]). But I guess you already know about that,
since this patch uses that sysfs path, so maybe I don't understand
your question.

[1] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=1789382a72a5

> drivers/uio/uio_pci_generic.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/drivers/uio/uio_pci_generic.c b/drivers/uio/uio_pci_generic.c
> index a56fdf9..bd196f0 100644
> --- a/drivers/uio/uio_pci_generic.c
> +++ b/drivers/uio/uio_pci_generic.c
> @@ -108,15 +108,27 @@ static void remove(struct pci_dev *pdev)
> struct uio_pci_generic_dev *gdev = pci_get_drvdata(pdev);
>
> uio_unregister_device(&gdev->info);
> + pci_disable_sriov(pdev);
> pci_disable_device(pdev);
> kfree(gdev);
> }
>
> +static int sriov_configure(struct pci_dev *pdev, int num_vfs)
> +{
> + if (!num_vfs) {
> + pci_disable_sriov(pdev);
> + return 0;
> + }
> +
> + return pci_enable_sriov(pdev, num_vfs);
> +}
> +
> static struct pci_driver uio_pci_driver = {
> .name = "uio_pci_generic",
> .id_table = NULL, /* only dynamic id's */
> .probe = probe,
> .remove = remove,
> + .sriov_configure = sriov_configure,
> };
>
> module_pci_driver(uio_pci_driver);
> --
> 2.7.4
>
> --
> dwmw2


2017-09-27 22:21:02

by David Woodhouse

[permalink] [raw]
Subject: Re: [PATCH] uio/uio_pci_generic: Add SR-IOV support

On Wed, 2017-09-27 at 17:00 -0500, Bjorn Helgaas wrote:
>
>
> IIUC, this question is basically "why doesn't the PCI core enable IOV
> automatically when it sees an IOV-capable device?"
>
> I think one reason is that an admin might want to control the number
> of VFs we enable (e.g., via 1789382a72a5 ("PCI: SRIOV control and
> status via sysfs" [1]).  But I guess you already know about that,
> since this patch uses that sysfs path, so maybe I don't understand
> your question.


I mean, why doesn't the PCI core *allow* SR-IOV to be enabled via
sysfs, unless the driver does this?


Attachments:
smime.p7s (4.82 kB)

2017-09-27 23:07:02

by Alexander Duyck

[permalink] [raw]
Subject: Re: [PATCH] uio/uio_pci_generic: Add SR-IOV support

On Wed, Sep 27, 2017 at 3:20 PM, David Woodhouse <[email protected]> wrote:
> On Wed, 2017-09-27 at 17:00 -0500, Bjorn Helgaas wrote:
>>
>>
>> IIUC, this question is basically "why doesn't the PCI core enable IOV
>> automatically when it sees an IOV-capable device?"
>>
>> I think one reason is that an admin might want to control the number
>> of VFs we enable (e.g., via 1789382a72a5 ("PCI: SRIOV control and
>> status via sysfs" [1]). But I guess you already know about that,
>> since this patch uses that sysfs path, so maybe I don't understand
>> your question.
>
>
> I mean, why doesn't the PCI core *allow* SR-IOV to be enabled via
> sysfs, unless the driver does this?

The general idea is that the driver usually has to free up resources
so they can be reassigned to the VF devices.

For example in the case of the Intel NICs enabling SR-IOV reassigns
the queues to the VFs, and the PF has to be aware that this change is
happening so that it doesn't try to make use of queues that then
belong to the VFs.

- Alex

2017-09-28 12:12:32

by Donald Dutile

[permalink] [raw]
Subject: Re: [PATCH] uio/uio_pci_generic: Add SR-IOV support

On 09/27/2017 06:00 PM, Bjorn Helgaas wrote:
> [+cc Don, Alex D, Alex W, Bryant, Bodong, Michael, kvm list]
>
> On Wed, Sep 27, 2017 at 01:59:22PM +0100, David Woodhouse wrote:
>> From: David Woodhouse <[email protected]>
>>
>> Allow userspace to configure SR-IOV VFs through sysfs.
>>
>> Currently, we need an in-kernel driver to permit this. But sometimes
>> *all* we want to do is enable the VFs so that we can assign them to
>> guests; we don't actually need to deal with the PF in any other way
>> from the host kernel. So let's make it possible to use UIO for that.
>>
>> Signed-off-by: David Woodhouse <[email protected]>
>> ---
>> It's not entirely clear to me why we require the driver to "enable"
>> SR-IOV like this anyway — were there some which needed to do something
>> special and device-specific instead of just falling through to
>> pci_{en,dis}able_sriov(), such that we need to effectively whitelist
>> this in the driver rather than blacklisting the "problematic" ones via
>> PCI quirks?
>
> IIUC, this question is basically "why doesn't the PCI core enable IOV
> automatically when it sees an IOV-capable device?"
>
> I think one reason is that an admin might want to control the number
> of VFs we enable (e.g., via 1789382a72a5 ("PCI: SRIOV control and
> status via sysfs" [1]). But I guess you already know about that,
> since this patch uses that sysfs path, so maybe I don't understand
> your question.
>
The major reason is that most bios don't scan extended pci config for
SRIOV info, and thus, dont provide enough resources to configure VFs.
If an SRIOV-capable device is one of the first devices scanned in a PCI tree
(off a PCI Root Port), then it could consume all the bus resources,
and fail configuring the rest of the PCI devices, often leaving the system
unbootable.
Another reason why it's not an all-or-nothing selection when a PF's VFs are
enabled as well -- there may be enough resources available for some VFs to
be enabled, but not all. The (needs-to-be-ECN'd) req that VFs of a PF
have to be aligned to the VF BAR[n]*num_VFs_enabled is the real difficulty,
since large, aligned free mmio space is not commonly 'free' after a PCI bus(tree)
scan and resource allocation. Free bus num space (gaps), and MSI resources
add further complications.

Will try to make some time later today to dig into the patches further
and add more concrete suggestions/feedback afterward.

> [1] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=1789382a72a5
>
>> drivers/uio/uio_pci_generic.c | 12 ++++++++++++
>> 1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/uio/uio_pci_generic.c b/drivers/uio/uio_pci_generic.c
>> index a56fdf9..bd196f0 100644
>> --- a/drivers/uio/uio_pci_generic.c
>> +++ b/drivers/uio/uio_pci_generic.c
>> @@ -108,15 +108,27 @@ static void remove(struct pci_dev *pdev)
>> struct uio_pci_generic_dev *gdev = pci_get_drvdata(pdev);
>>
>> uio_unregister_device(&gdev->info);
>> + pci_disable_sriov(pdev);
>> pci_disable_device(pdev);
>> kfree(gdev);
>> }
>>
>> +static int sriov_configure(struct pci_dev *pdev, int num_vfs)
>> +{
>> + if (!num_vfs) {
>> + pci_disable_sriov(pdev);
>> + return 0;
>> + }
>> +
>> + return pci_enable_sriov(pdev, num_vfs);
>> +}
>> +
>> static struct pci_driver uio_pci_driver = {
>> .name = "uio_pci_generic",
>> .id_table = NULL, /* only dynamic id's */
>> .probe = probe,
>> .remove = remove,
>> + .sriov_configure = sriov_configure,
>> };
>>
>> module_pci_driver(uio_pci_driver);
>> --
>> 2.7.4
>>
>> --
>> dwmw2
>
>

2017-09-28 12:22:29

by Donald Dutile

[permalink] [raw]
Subject: Re: [PATCH] uio/uio_pci_generic: Add SR-IOV support

On 09/27/2017 07:06 PM, Alexander Duyck wrote:
> On Wed, Sep 27, 2017 at 3:20 PM, David Woodhouse <[email protected]> wrote:
>> On Wed, 2017-09-27 at 17:00 -0500, Bjorn Helgaas wrote:
>>>
>>>
>>> IIUC, this question is basically "why doesn't the PCI core enable IOV
>>> automatically when it sees an IOV-capable device?"
>>>
>>> I think one reason is that an admin might want to control the number
>>> of VFs we enable (e.g., via 1789382a72a5 ("PCI: SRIOV control and
>>> status via sysfs" [1]). But I guess you already know about that,
>>> since this patch uses that sysfs path, so maybe I don't understand
>>> your question.
>>
>>
>> I mean, why doesn't the PCI core *allow* SR-IOV to be enabled via
>> sysfs, unless the driver does this?
>
> The general idea is that the driver usually has to free up resources
> so they can be reassigned to the VF devices.
>
> For example in the case of the Intel NICs enabling SR-IOV reassigns
> the queues to the VFs, and the PF has to be aware that this change is
> happening so that it doesn't try to make use of queues that then
> belong to the VFs.
>
> - Alex
>
After reading Alex's response, I now understand Dave's question
better and why the patch won't work in general.

In every SRIOV capable device I've run into to date, the PF has
to know the VFs are being assigned due to some resource mgmt, if not
internal (e.g., switch) configuration reasons.
The reasons are often subtle.

From the context of the patches (uio), why aren't VFs enabled via
user-level sysfs interface? That was provided for user-mgmt apps
to have a PCIe-generic/common, device-independent method of VF enablement
(read: libvirt for device-assignment of VFs to VMs).
This also enables one to manage it at a finer, not-all-or-nothing level,
which is good due to bios strengths/weaknesses that various systems have
for SRIOV/VF-enablement.



2017-09-28 13:46:31

by David Woodhouse

[permalink] [raw]
Subject: Re: [PATCH] uio/uio_pci_generic: Add SR-IOV support

On Thu, 2017-09-28 at 08:22 -0400, Don Dutile wrote:
>
> After reading Alex's response, I now understand Dave's question
> better and why the patch won't work in general.

UIO doesn't work "in general". It requires a very *specific* userspace
driver for the hardware in question, which needs to know what it's
doing.

> In every SRIOV capable device I've run into to date, the PF has
> to know the VFs are being assigned due to some resource mgmt, if not
> internal (e.g., switch) configuration reasons.
> The reasons are often subtle.

Sure. If there's actually a userspace driver (which in my case there
isn't), and if that's needed for the hardware ins question (which in my
case it isn't), then it'd need to do that before enabling the VFs via
the user-level sysfs interface of which you speak.

>  From the context of the patches (uio), why aren't VFs enabled via
> user-level sysfs interface? That was provided for user-mgmt apps
> to have a PCIe-generic/common, device-independent method of VF
> enablement

That is *precisely* what we're doing. But the user-space sysfs
interface doesn't actually *exist* unless a driver is bound to the
device in question, with a .sriov-configure method. Which is where I
came in... :)


Attachments:
smime.p7s (4.82 kB)

2017-09-28 15:05:36

by Donald Dutile

[permalink] [raw]
Subject: Re: [PATCH] uio/uio_pci_generic: Add SR-IOV support

On 09/28/2017 09:46 AM, David Woodhouse wrote:
> On Thu, 2017-09-28 at 08:22 -0400, Don Dutile wrote:
>>
>> After reading Alex's response, I now understand Dave's question
>> better and why the patch won't work in general.
>
> UIO doesn't work "in general". It requires a very *specific* userspace
> driver for the hardware in question, which needs to know what it's
> doing.
>
>> In every SRIOV capable device I've run into to date, the PF has
>> to know the VFs are being assigned due to some resource mgmt, if not
>> internal (e.g., switch) configuration reasons.
>> The reasons are often subtle.
>
> Sure. If there's actually a userspace driver (which in my case there
> isn't), and if that's needed for the hardware ins question (which in my
> case it isn't), then it'd need to do that before enabling the VFs via
> the user-level sysfs interface of which you speak.
>
>> From the context of the patches (uio), why aren't VFs enabled via
>> user-level sysfs interface? That was provided for user-mgmt apps
>> to have a PCIe-generic/common, device-independent method of VF
>> enablement
>
> That is *precisely* what we're doing. But the user-space sysfs
> interface doesn't actually *exist* unless a driver is bound to the
> device in question, with a .sriov-configure method. Which is where I
> came in... :)
>
ah, nickel summary: no in-kernel driver w/.sriov-configure method.
if so, now I'm up to speed with you....
hmmmm....
so, that would imply we need an in-kernel, pcie-common, .sriov-configure method
that's invoked if a driver isn't bound to a device? ... yes?

2017-09-28 15:52:39

by David Woodhouse

[permalink] [raw]
Subject: Re: [PATCH] uio/uio_pci_generic: Add SR-IOV support

On Thu, 2017-09-28 at 11:05 -0400, Don Dutile wrote:
> ah, nickel summary: no in-kernel driver w/.sriov-configure method.
> if so, now I'm up to speed with you....
> hmmmm....
> so, that would imply we need an in-kernel, pcie-common, .sriov-
> configure method
> that's invoked if a driver isn't bound to a device? ... yes?

Well that was kind of the point in my question below the ---

Is that something we want to be generic? Would we want to have quirks
for the devices where we might *not* want it? 

Anything that *has* a driver for the PF, should have .sriov_configure
already. Anything that doesn't have a driver can (now) use UIO to
enable SR-IOV. So we don't *have* to make it unconditionally
available...


Attachments:
smime.p7s (4.82 kB)

2017-09-28 16:56:24

by Donald Dutile

[permalink] [raw]
Subject: Re: [PATCH] uio/uio_pci_generic: Add SR-IOV support

On 09/28/2017 11:52 AM, David Woodhouse wrote:
> On Thu, 2017-09-28 at 11:05 -0400, Don Dutile wrote:
>> ah, nickel summary: no in-kernel driver w/.sriov-configure method.
>> if so, now I'm up to speed with you....
>> hmmmm....
>> so, that would imply we need an in-kernel, pcie-common, .sriov-
>> configure method
>> that's invoked if a driver isn't bound to a device? ... yes?
>
> Well that was kind of the point in my question below the ---
>
> Is that something we want to be generic? Would we want to have quirks
> for the devices where we might *not* want it?
>
> Anything that *has* a driver for the PF, should have .sriov_configure
> already. Anything that doesn't have a driver can (now) use UIO to
> enable SR-IOV. So we don't *have* to make it unconditionally
> available...
>
Well, my point is more like: why put it in uio?
why not make it available via pcie, setup while/if no driver attached?
i.e., other non-uio users can use the mechanism.... like libvirt? ...
if a PF driver isn't required.