2008-10-01 16:07:34

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 6/6 v3] PCI: document the change

On Sat, Sep 27, 2008 at 04:28:45PM +0800, Zhao, Yu wrote:
> +++ b/Documentation/DocBook/kernel-api.tmpl
> @@ -239,6 +239,7 @@ X!Ekernel/module.c
> </sect1>
>
> <sect1><title>PCI Support Library</title>
> +!Iinclude/linux/pci.h

Why do you need to do this? Thus far, all the documentation has been
with the implementation, not in the header file.

> +1.2 What is ARI
> +
> +Alternative Routing-ID Interpretation (ARI) allows a PCI Express Endpoint
> +to use its device number field as part of function number. Traditionally,
> +an Endpoint can only have 8 functions, and the device number of all
> +Endpoints is zero. With ARI enabled, an Endpoint can have up to 256
> +functions by using device number in conjunction with function number to
> +indicate a function in the device. This is almost transparent to the Linux
> +kernel because the Linux kernel still can use 8-bit bus number field plus
> +8-bit devfn number field to locate a function. ARI is managed via the ARI
> +Forwarding bit in the Device Capabilities 2 register of the PCI Express
> +Capability on the Root Port or the Downstream Port and a new ARI Capability
> +on the Endpoint.

I don't think this section actually helps a software developer use
SR-IOV, does it?

> +2. User Guide
> +
> +2.1 How can I manage SR-IOV
> +
> +If a device supports SR-IOV, then there should be some entries under
> +Physical Function's PCI device directory. These entries are in directory:
> + - /sys/bus/pci/devices/BB:DD.F/iov/
> + (BB:DD:F is bus:dev:fun)

The 'domain:' prefix has been there for a long time now.

> +and
> + - /sys/bus/pci/devices/BB:DD.F/iov/N
> + (N is VF number from 0 to initialvfs-1)
> +
> +To enable or disable SR-IOV:
> + - /sys/bus/pci/devices/BB:DD.F/iov/enable
> + (writing 1/0 means enable/disable VFs, state change will
> + notify PF driver)
> +
> +To change number of Virtual Functions:
> + - /sys/bus/pci/devices/BB:DD.F/iov/numvfs
> + (writing positive integer to this file will change NumVFs)
> +
> +The total and initial number of VFs can get from:
> + - /sys/bus/pci/devices/BB:DD.F/iov/totalvfs
> + - /sys/bus/pci/devices/BB:DD.F/iov/initialvfs
> +
> +The identifier of a VF that belongs to this PF can get from:
> + - /sys/bus/pci/devices/BB:DD.F/iov/N/rid
> + (for all class of devices)

Wouldn't it be more useful to have the iov/N directories be a symlink to
the actual pci_dev used by the virtual function?

> +For network device, there are:
> + - /sys/bus/pci/devices/BB:DD.F/iov/N/mac
> + - /sys/bus/pci/devices/BB:DD.F/iov/N/vlan
> + (value update will notify PF driver)

We already have tools to set the MAC and VLAN parameters for network
devices.

> +To register SR-IOV service, Physical Function device driver needs to call:
> + int pci_iov_register(struct pci_dev *dev,
> + int (*notify)(struct pci_dev *, u32), char **entries)

I think a better interface would put the 'notify' into the struct
pci_driver. That would make 'notify' a bad name .... how about
'virtual'? There's also no documentation for the second parameter to
'notify'.

> +Note: entries could be NULL if PF driver doesn't want to create new entries
> +under /sys/bus/pci/devices/BB:DD.F/iov/N/.

So 'entries' is a list of names to create sysfs entries for?

> +To enable SR-IOV, Physical Function device driver needs to call:
> + int pci_iov_enable(struct pci_dev *dev, int numvfs)
> +
> +To disable SR-IOV, Physical Function device driver needs to call:
> + void pci_iov_disable(struct pci_dev *dev)

I'm not 100% convinced about this API. The assumption here is that the
driver will do it, but I think it should probably be in the core. The
driver probably wants to be notified that the PCI core is going to
create a virtual function, and would it please prepare to do so, but I'm
not convinced this should be triggered by the driver. How would the
driver decide to create a new virtual function?

> +To read or write VFs configuration:
> + - int pci_iov_read_config(struct pci_dev *dev, int id,
> + char *entry, char *buf, int size);
> + - int pci_iov_write_config(struct pci_dev *dev, int id,
> + char *entry, char *buf);

I think we'd be better off having the driver create its own sysfs
entries if it really needs to.

> +3.2 Usage example
> +
> +Following piece of code illustrates the usage of APIs above.
> +
[...]
> +static struct pci_driver dev_driver = {
> + .name = "SR-IOV Physical Function driver",
> + .id_table = dev_id_table,
> + .probe = dev_probe,
> + .remove = __devexit_p(dev_remove),
> +#ifdef CONFIG_PM
> + .suspend = dev_suspend,
> + .resume = dev_resume,
> +#endif
> +};

>From my reading of the SR-IOV spec, this isn't how it's supposed to
work. The device is supposed to be a fully functional PCI device that
on demand can start peeling off virtual functions; it's not supposed to
boot up and initialise all its virtual functions at once.

--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."


2008-10-14 00:23:44

by Dong, Eddie

[permalink] [raw]
Subject: RE: [PATCH 6/6 v3] PCI: document the change

Matthew Wilcox wrote:

> Wouldn't it be more useful to have the iov/N directories
> be a symlink to the actual pci_dev used by the virtual
> function?

The main concern here is that a VF may be disabed such as when PF enter
D3 state or undergo an reset and thus be plug-off, but user won't
re-configure the VF in case the PF return back to working state.

>
>> +For network device, there are:
>> + - /sys/bus/pci/devices/BB:DD.F/iov/N/mac
>> + - /sys/bus/pci/devices/BB:DD.F/iov/N/vlan
>> + (value update will notify PF driver)
>
> We already have tools to set the MAC and VLAN parameters
> for network devices.

Do you mean Ethtool? If yes, it is impossible for SR-IOV since the
configuration has to be done in PF side, rather than VF.

>
> I'm not 100% convinced about this API. The assumption
> here is that the driver will do it, but I think it should
> probably be in the core. The driver probably wants to be

Our concern is that the PF driver may put an default state when it is
loaded so that SR-IOV can work without any user level configuration, but
of course the driver won't dynamically change it.
Do u mean we remove this ability?

> notified that the PCI core is going to create a virtual
> function, and would it please prepare to do so, but I'm
> not convinced this should be triggered by the driver.
> How would the driver decide to create a new virtual
> function?
>
>
> From my reading of the SR-IOV spec, this isn't how it's
> supposed to work. The device is supposed to be a fully
> functional PCI device that on demand can start peeling
> off virtual functions; it's not supposed to boot up and
> initialise all its virtual functions at once.

The spec defines either we enable all VFs or Disable. Per VF enabling is
not supported.
Is this what you concern?

Thanks, eddie

2008-10-14 01:08:58

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 6/6 v3] PCI: document the change

On Tue, Oct 14, 2008 at 08:23:34AM +0800, Dong, Eddie wrote:
> Matthew Wilcox wrote:
> > Wouldn't it be more useful to have the iov/N directories
> > be a symlink to the actual pci_dev used by the virtual
> > function?
>
> The main concern here is that a VF may be disabed such as when PF enter
> D3 state or undergo an reset and thus be plug-off, but user won't
> re-configure the VF in case the PF return back to working state.

If we're relying on the user to reconfigure virtual functions on return
to D0 from D3, that's a very fragile system.

> >> +For network device, there are:
> >> + - /sys/bus/pci/devices/BB:DD.F/iov/N/mac
> >> + - /sys/bus/pci/devices/BB:DD.F/iov/N/vlan
> >> + (value update will notify PF driver)
> >
> > We already have tools to set the MAC and VLAN parameters
> > for network devices.
>
> Do you mean Ethtool? If yes, it is impossible for SR-IOV since the
> configuration has to be done in PF side, rather than VF.

I don't think ethtool has that ability; ip(8) can set mac addresses and
vconfig(8) sets vlan parameters.

The device driver already has to be aware of SR-IOV. If it's going to
support the standard tools (and it damn well ought to), then it should
call the PF driver to set these kinds of parameters.

> > I'm not 100% convinced about this API. The assumption
> > here is that the driver will do it, but I think it should
> > probably be in the core. The driver probably wants to be
>
> Our concern is that the PF driver may put an default state when it is
> loaded so that SR-IOV can work without any user level configuration, but
> of course the driver won't dynamically change it.
> Do u mean we remove this ability?
>
> > notified that the PCI core is going to create a virtual
> > function, and would it please prepare to do so, but I'm
> > not convinced this should be triggered by the driver.
> > How would the driver decide to create a new virtual
> > function?

Let me try to explain this a bit better.

The user decides they want a new ethernet virtual function. In the
scheme as you have set up:

1. User communicates to ethernet driver "I want a new VF"
2. Ethernet driver tells PCI core "create new VF".

I propose:

1. User tells PCI core "I want a new VF on PCI device 0000:01:03.0"
2. PCI core tells driver "User wants a new VF"

My scheme gives us a unified way of creating new VFs, yours requires each
driver to invent a way for the user to tell them to create a new VF.
Unless I've misunderstood your code and docs.

> > From my reading of the SR-IOV spec, this isn't how it's
> > supposed to work. The device is supposed to be a fully
> > functional PCI device that on demand can start peeling
> > off virtual functions; it's not supposed to boot up and
> > initialise all its virtual functions at once.
>
> The spec defines either we enable all VFs or Disable. Per VF enabling is
> not supported.
> Is this what you concern?

I don't think that's true. The spec requires you to enable all the
VFs from 0 to NumVFs, but NumVFs can be lower than TotalVFs. At least,
that's how I read it.

But no, that isn't my concern. My concern is that you've written a
driver here that seems to be a stub driver. That doesn't seem to be
how SR-IOV is supposed to work; it's supposed to be a fully-functional
driver that has SR-IOV knowledge added to it.

--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."

2008-10-14 02:34:01

by Dong, Eddie

[permalink] [raw]
Subject: RE: [PATCH 6/6 v3] PCI: document the change

Matthew Wilcox wrote:
> On Tue, Oct 14, 2008 at 08:23:34AM +0800, Dong, Eddie
> wrote:
>> Matthew Wilcox wrote:
>>> Wouldn't it be more useful to have the iov/N directories
>>> be a symlink to the actual pci_dev used by the virtual
>>> function?
>>
>> The main concern here is that a VF may be disabed such
>> as when PF enter D3 state or undergo an reset and thus
>> be plug-off, but user won't re-configure the VF in case
>> the PF return back to working state.
>
> If we're relying on the user to reconfigure virtual
> functions on return to D0 from D3, that's a very fragile
> system.

No. that is the concern we don't put those configuration under VF nodes
because it will disappear.
Do I miss something?

>
>>>> +For network device, there are:
>>>> + - /sys/bus/pci/devices/BB:DD.F/iov/N/mac
>>>> + - /sys/bus/pci/devices/BB:DD.F/iov/N/vlan
>>>> + (value update will notify PF driver)
>>>
>>> We already have tools to set the MAC and VLAN parameters
>>> for network devices.
>>
>> Do you mean Ethtool? If yes, it is impossible for SR-IOV
>> since the configuration has to be done in PF side,
>> rather than VF.
>
> I don't think ethtool has that ability; ip(8) can set mac
> addresses and vconfig(8) sets vlan parameters.
>
> The device driver already has to be aware of SR-IOV. If
> it's going to support the standard tools (and it damn
> well ought to), then it should call the PF driver to set
> these kinds of parameters.

OK, as if it has the VF parameter, will look into details.
BTW, the SR-IOV patch is not only for network, some other devices such
as IDE will use same code base as well and we image it could have other
parameter to set such as starting LBA of a IDE VF.



>
>>> I'm not 100% convinced about this API. The assumption
>>> here is that the driver will do it, but I think it
>>> should probably be in the core. The driver probably
>>> wants to be
>>
>> Our concern is that the PF driver may put an default
>> state when it is loaded so that SR-IOV can work without
>> any user level configuration, but of course the driver
>> won't dynamically change it.
>> Do u mean we remove this ability?
>>
>>> notified that the PCI core is going to create a virtual
>>> function, and would it please prepare to do so, but I'm
>>> not convinced this should be triggered by the driver.
>>> How would the driver decide to create a new virtual
>>> function?
>
> Let me try to explain this a bit better.
>
> The user decides they want a new ethernet virtual
> function. In the scheme as you have set up:
>
> 1. User communicates to ethernet driver "I want a new VF"
> 2. Ethernet driver tells PCI core "create new VF".
>
> I propose:
>
> 1. User tells PCI core "I want a new VF on PCI device
> 0000:01:03.0"
> 2. PCI core tells driver "User wants a new VF"

If user need a new VF, the VF must be already enabled or existed in OS.
Otherwise, we need to disable all VFs first and then change NumVFs to
re-enable VFs.
Spec says: "NumVFs may only be written while VF Enable is Clear"

>
> My scheme gives us a unified way of creating new VFs,
> yours requires each driver to invent a way for the user
> to tell them to create a new VF. Unless I've
> misunderstood your code and docs.

Assign a VF is kind of user & core job.

>
>>> From my reading of the SR-IOV spec, this isn't how it's
>>> supposed to work. The device is supposed to be a fully
>>> functional PCI device that on demand can start peeling
>>> off virtual functions; it's not supposed to boot up and
>>> initialise all its virtual functions at once.
>>
>> The spec defines either we enable all VFs or Disable.
>> Per VF enabling is not supported. Is this what you
>> concern?
>
> I don't think that's true. The spec requires you to
> enable all the
> VFs from 0 to NumVFs, but NumVFs can be lower than
> TotalVFs. At least, that's how I read it.

Yes, but setting NumVFs can only occur when VFs are disabled.
Following are from spec.

NumVFs may only be written while VF Enable is Clear. If NumVFs is
written when VF Enable is
Set, the results are undefined.
The initial value of NumVFs is undefined.

>
> But no, that isn't my concern. My concern is that you've
> written a driver here that seems to be a stub driver.
> That doesn't seem to be
> how SR-IOV is supposed to work; it's supposed to be a
> fully-functional driver that has SR-IOV knowledge added
> to it.

Yes, it is a full feature driver as if PF has resource in, for example
not all queues are assigned to VFs.

Thx, eddie

2008-10-14 03:09:52

by Zhao, Yu

[permalink] [raw]
Subject: Re: [PATCH 6/6 v3] PCI: document the change

On Tue, Oct 14, 2008 at 10:31:03AM +0800, Dong, Eddie wrote:
> Matthew Wilcox wrote:
> > On Tue, Oct 14, 2008 at 08:23:34AM +0800, Dong, Eddie
> > wrote:
> >> Matthew Wilcox wrote:
> >>> Wouldn't it be more useful to have the iov/N directories
> >>> be a symlink to the actual pci_dev used by the virtual
> >>> function?
> >>
> >> The main concern here is that a VF may be disabed such
> >> as when PF enter D3 state or undergo an reset and thus
> >> be plug-off, but user won't re-configure the VF in case
> >> the PF return back to working state.
> >
> > If we're relying on the user to reconfigure virtual
> > functions on return to D0 from D3, that's a very fragile
> > system.
>
> No. that is the concern we don't put those configuration under VF nodes because it will disappear.
> Do I miss something?
>
> >
> >>>> +For network device, there are:
> >>>> + - /sys/bus/pci/devices/BB:DD.F/iov/N/mac
> >>>> + - /sys/bus/pci/devices/BB:DD.F/iov/N/vlan
> >>>> + (value update will notify PF driver)
> >>>
> >>> We already have tools to set the MAC and VLAN parameters
> >>> for network devices.
> >>
> >> Do you mean Ethtool? If yes, it is impossible for SR-IOV
> >> since the configuration has to be done in PF side,
> >> rather than VF.
> >
> > I don't think ethtool has that ability; ip(8) can set mac
> > addresses and vconfig(8) sets vlan parameters.
> >
> > The device driver already has to be aware of SR-IOV. If
> > it's going to support the standard tools (and it damn
> > well ought to), then it should call the PF driver to set
> > these kinds of parameters.
>
> OK, as if it has the VF parameter, will look into details.

Neither ip(8) nor vconfig(8) can set MAC and VLAN address for VF when
the VF driver is not loaded.

> BTW, the SR-IOV patch is not only for network, some other devices such as IDE will use same code base as well and we image it could have other parameter to set such as starting LBA of a IDE VF.

As Eddie said, we have two problems here:
1) User has to set device specific parameters of a VF when he wants to
use this VF with KVM (assign this device to KVM guest). In this case,
VF driver is not loaded in the host environment. So operations which
are implemented as driver callback (e.g. set_mac_address()) are not
supported.
2) For security reason, some SR-IOV devices prohibit the VF driver
configuring the VF via its own register space. Instead, the configurations
must be done through the PF which the VF is associated with. This means PF
driver has to receive parameters that are used to configure its VFs. These
parameters obviously can be passed by traditional tools, if without
modification for SR-IOV.

2008-10-14 04:01:33

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 6/6 v3] PCI: document the change

On Tue, Oct 14, 2008 at 10:14:35AM +0800, Yu Zhao wrote:
> > BTW, the SR-IOV patch is not only for network, some other devices such as IDE will use same code base as well and we image it could have other parameter to set such as starting LBA of a IDE VF.
>
> As Eddie said, we have two problems here:
> 1) User has to set device specific parameters of a VF when he wants to
> use this VF with KVM (assign this device to KVM guest). In this case,
> VF driver is not loaded in the host environment. So operations which
> are implemented as driver callback (e.g. set_mac_address()) are not
> supported.

I suspect what you want to do is create, then configure the device in
the host, then assign it to the guest.

> 2) For security reason, some SR-IOV devices prohibit the VF driver
> configuring the VF via its own register space. Instead, the configurations
> must be done through the PF which the VF is associated with. This means PF
> driver has to receive parameters that are used to configure its VFs. These
> parameters obviously can be passed by traditional tools, if without
> modification for SR-IOV.

I think that idea also covers this point.

--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."

2008-10-14 04:27:41

by Dong, Eddie

[permalink] [raw]
Subject: RE: [PATCH 6/6 v3] PCI: document the change

Matthew Wilcox wrote:
> On Tue, Oct 14, 2008 at 10:14:35AM +0800, Yu Zhao wrote:
>>> BTW, the SR-IOV patch is not only for network, some
>>> other devices such as IDE will use same code base as
>>> well and we image it could have other parameter to set
>>> such as starting LBA of a IDE VF.
>>
>> As Eddie said, we have two problems here:
>> 1) User has to set device specific parameters of a VF
>> when he wants to use this VF with KVM (assign this
>> device to KVM guest). In this case,
>> VF driver is not loaded in the host environment. So
>> operations which
>> are implemented as driver callback (e.g.
>> set_mac_address()) are not supported.
>
> I suspect what you want to do is create, then configure
> the device in the host, then assign it to the guest.

That is not true. Rememver the created VFs will be destroyed no matter
for PF power event or error recovery conducted reset.
So what we want is:

Config, create, assign, and then deassign and destroy and then
recreate...

>
>> 2) For security reason, some SR-IOV devices prohibit the
>> VF driver configuring the VF via its own register space.
>> Instead, the configurations must be done through the PF
>> which the VF is associated with. This means PF driver
>> has to receive parameters that are used to configure its
>> VFs. These parameters obviously can be passed by
>> traditional tools, if without modification for SR-IOV.
>
> I think that idea also covers this point.
>
Sorry can u explain a little bit more? The SR-IOV patch won't define
what kind of entries should be created or not, we leave network
subsystem to decide what to do. Same for disk subsstem etc.

Thx, eddie

2008-10-14 04:46:54

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 6/6 v3] PCI: document the change

On Tue, Oct 14, 2008 at 12:18:40PM +0800, Dong, Eddie wrote:
> Matthew Wilcox wrote:
> > On Tue, Oct 14, 2008 at 10:14:35AM +0800, Yu Zhao wrote:
> >> As Eddie said, we have two problems here:
> >> 1) User has to set device specific parameters of a VF
> >> when he wants to use this VF with KVM (assign this
> >> device to KVM guest). In this case,
> >> VF driver is not loaded in the host environment. So
> >> operations which
> >> are implemented as driver callback (e.g.
> >> set_mac_address()) are not supported.
> >
> > I suspect what you want to do is create, then configure
> > the device in the host, then assign it to the guest.
>
> That is not true. Rememver the created VFs will be destroyed no matter
> for PF power event or error recovery conducted reset.
> So what we want is:
>
> Config, create, assign, and then deassign and destroy and then
> recreate...

Yes, but my point is this all happens in the _host_, not in the _guest_.

> Sorry can u explain a little bit more? The SR-IOV patch won't define
> what kind of entries should be created or not, we leave network
> subsystem to decide what to do. Same for disk subsstem etc.

No entries should be created. This needs to be not SR-IOV specific.

--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."

2008-10-14 05:02:17

by Zhao, Yu

[permalink] [raw]
Subject: Re: [PATCH 6/6 v3] PCI: document the change

On Tue, Oct 14, 2008 at 12:01:05PM +0800, Matthew Wilcox wrote:
> On Tue, Oct 14, 2008 at 10:14:35AM +0800, Yu Zhao wrote:
> > > BTW, the SR-IOV patch is not only for network, some other devices such as IDE will use same code base as well and we image it could have other parameter to set such as starting LBA of a IDE VF.
> >
> > As Eddie said, we have two problems here:
> > 1) User has to set device specific parameters of a VF when he wants to
> > use this VF with KVM (assign this device to KVM guest). In this case,
> > VF driver is not loaded in the host environment. So operations which
> > are implemented as driver callback (e.g. set_mac_address()) are not
> > supported.
>
> I suspect what you want to do is create, then configure the device in
> the host, then assign it to the guest.
>
> > 2) For security reason, some SR-IOV devices prohibit the VF driver
> > configuring the VF via its own register space. Instead, the configurations
> > must be done through the PF which the VF is associated with. This means PF
> > driver has to receive parameters that are used to configure its VFs. These
> > parameters obviously can be passed by traditional tools, if without
^^^
Sorry, here I meant to say 'can not'.

> > modification for SR-IOV.
>
> I think that idea also covers this point.

Can you please elaborate this?

Thanks a lot.

>
> --
> Matthew Wilcox Intel Open Source Technology Centre
> "Bill, look, we understand that you're interested in selling us this
> operating system, but compare it to ours. We can't possibly take such
> a retrograde step."
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2008-10-17 06:03:18

by Anirban Chakraborty

[permalink] [raw]
Subject: Re: [PATCH 6/6 v3] PCI: document the change


On Oct 13, 2008, at 9:46 PM, Matthew Wilcox wrote:

> On Tue, Oct 14, 2008 at 12:18:40PM +0800, Dong, Eddie wrote:
>> Matthew Wilcox wrote:
>>> On Tue, Oct 14, 2008 at 10:14:35AM +0800, Yu Zhao wrote:
>>>> As Eddie said, we have two problems here:
>>>> 1) User has to set device specific parameters of a VF
>>>> when he wants to use this VF with KVM (assign this
>>>> device to KVM guest). In this case,
>>>> VF driver is not loaded in the host environment. So
>>>> operations which
>>>> are implemented as driver callback (e.g.
>>>> set_mac_address()) are not supported.
>>>
>>> I suspect what you want to do is create, then configure
>>> the device in the host, then assign it to the guest.
>>
>> That is not true. Rememver the created VFs will be destroyed no
>> matter
>> for PF power event or error recovery conducted reset.
>> So what we want is:
>>
>> Config, create, assign, and then deassign and destroy and then
>> recreate...
>
> Yes, but my point is this all happens in the _host_, not in the
> _guest_.
>
>> Sorry can u explain a little bit more? The SR-IOV patch won't define
>> what kind of entries should be created or not, we leave network
>> subsystem to decide what to do. Same for disk subsstem etc.
>
> No entries should be created. This needs to be not SR-IOV specific.

I think we need to cover both the scenarios here, virtualization and
non virtualization. In the absence of virtualization, the VF and PF
driver should be identical. In this context, how does the PF driver
allocates a VF? Is dynamic allocation of VFs possible, or does it have
to allocate all the VFs that the device supports when the PF driver
loads? Also, will the probe function be called for the VFs, or does
the PF driver handle only the probe for the physical function? In
virtualization context things get bit more complex as the the VF
driver in guest would like to treat the VF as a physical function but
that may not be possible from the device perspective as the control
registers may well be shared between VF and PF.
I would think that the VF allocation is the job of SR PCIM. PCIM may
well ask the PF driver to configure a VF upon user request.

Thanks much,
Anirban Chakraborty

> --
> Matthew Wilcox Intel Open Source Technology Centre
> "Bill, look, we understand that you're interested in selling us this
> operating system, but compare it to ours. We can't possibly take such
> a retrograde step."
> --
> To unsubscribe from this list: send the line "unsubscribe linux-
> kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2008-11-15 12:46:29

by Yu Zhao

[permalink] [raw]
Subject: Re: [PATCH 6/6 v3] PCI: document the change

Matthew Wilcox wrote:
> On Sat, Sep 27, 2008 at 04:28:45PM +0800, Zhao, Yu wrote:
>> +To register SR-IOV service, Physical Function device driver needs to call:
>> + int pci_iov_register(struct pci_dev *dev,
>> + int (*notify)(struct pci_dev *, u32), char **entries)
>
> I think a better interface would put the 'notify' into the struct
> pci_driver. That would make 'notify' a bad name .... how about
> 'virtual'? There's also no documentation for the second parameter to
> 'notify'.

Yes, putting the callback function to the 'pci_driver' is better. Looks
like the 'virtual' is not very descriptive (and it's a adj. while other
callbacks are verb). Any other candidates?

Thanks,
Yu