Greetings,
The motivation is make VF device visible to PF driver. PF driver
may need this to access VF's PCI configuration.
Another use case is in sysfs symbolic linking. Some of VF's sysfs
entries are created by PF driver. For example, /sys/class/net/ethx/vfx
in Intel 82576 NIC driver. Makeing a symbolic link from VF's pci device
to this path also must be done in PF's driver.
Currently, there is no hint about VF's bus/devfn in PF's pci_dev.
The offset and stride entries(which are used to calculate bus/devfn
of VF devices) in VF's PCI configuration is also invisible in PF's
driver. So IMO this helper function is needed.
Any reply is appreciated, THX.
(ps: gmail will do line wrap/tab replace, use attachment instead to patch)
--
Frank Pan
Computer Science and Technology
Tsinghua University
Signed-off-by: Frank Pan <[email protected]>
---
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index ce6a366..f15aa2a 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -722,6 +722,18 @@ int pci_num_vf(struct pci_dev *dev)
}
EXPORT_SYMBOL_GPL(pci_num_vf);
+/**
+ * pci_get_vf_dev - return the PCI device of a specific VF
+ * @dev: the PCI device
+ */
+struct pci_dev *pci_get_vf_dev(struct pci_dev *dev, int id)
+{
+ return pci_get_bus_and_slot(
+ virtfn_bus(dev, id),
+ virtfn_devfn(dev, id));
+}
+EXPORT_SYMBOL_GPL(pci_get_vf_dev);
+
static int ats_alloc_one(struct pci_dev *dev, int ps)
{
int pos;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index a327322..fb6010b 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1374,6 +1374,7 @@ extern int pci_enable_sriov(struct pci_dev *dev,
int nr_virtfn);
extern void pci_disable_sriov(struct pci_dev *dev);
extern irqreturn_t pci_sriov_migration(struct pci_dev *dev);
extern int pci_num_vf(struct pci_dev *dev);
+extern struct pci_dev *pci_get_vf_dev(struct pci_dev *dev, int id);
#else
static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
{
@@ -1390,6 +1391,10 @@ static inline int pci_num_vf(struct pci_dev *dev)
{
return 0;
}
+static inline struct pci_dev *pci_get_vf_dev(struct pci_dev *dev, int id)
+{
+ return NULL;
+}
#endif
#if defined(CONFIG_HOTPLUG_PCI) || defined(CONFIG_HOTPLUG_PCI_MODULE)
On Tue, Jun 01, 2010 at 01:05:25PM +0800, Frank Pan wrote:
> Greetings,
>
> The motivation is make VF device visible to PF driver. PF driver
> may need this to access VF's PCI configuration.
What is "VF" and "PF" here?
> Another use case is in sysfs symbolic linking. Some of VF's sysfs
> entries are created by PF driver. For example, /sys/class/net/ethx/vfx
> in Intel 82576 NIC driver. Makeing a symbolic link from VF's pci device
> to this path also must be done in PF's driver.
No, the network core should create that symlink, not the driver, right?
thanks,
greg k-h
>-----Original Message-----
>From: Greg KH [mailto:[email protected]]
>Sent: Tuesday, June 01, 2010 11:39 AM
>
>On Tue, Jun 01, 2010 at 01:05:25PM +0800, Frank Pan wrote:
>> Greetings,
>>
>> The motivation is make VF device visible to PF driver. PF driver
>> may need this to access VF's PCI configuration.
>
>What is "VF" and "PF" here?
>
>> Another use case is in sysfs symbolic linking. Some of VF's sysfs
>> entries are created by PF driver. For example, /sys/class/net/ethx/vfx
>> in Intel 82576 NIC driver. Makeing a symbolic link from VF's pci device
>> to this path also must be done in PF's driver.
>
>No, the network core should create that symlink, not the driver, right?
>
>thanks,
>
>greg k-h
Furthermore, the links are already there in sysfs, created by the PCI subsystem.
/sys/class/net/ethX/device links to the PCI device entry for the PF device, which itself has symlinks to each of the VF devices.
This patch just isn't necessary.
-Mitch
> What is "VF" and "PF" here?
Virtual function/Physical function
> No, the network core should create that symlink, not the driver, right?
They are not network interfaces, so network core
won't link them. They are for VM uses. I may need
to explain this more clearly.
PF's interface in sysfs is /sys/class/net/ethx, while
VF has no interface. So 82576 driver makes directories
to provide interface-like sysfs entries: /sys/class/net/ethx/vf[1-7]
Make a link to this directory form VF's pci device is
almost impossible, because PF doesn't know VF's bdf.
Thanks for reply.
--
Frank Pan
Computer Science and Technology
Tsinghua University
> Furthermore, the links are already there in sysfs, created by the PCI subsystem.
>
> /sys/class/net/ethX/device links to the PCI device entry for the PF device, which itself has symlinks to each of the VF devices.
>
> This patch just isn't necessary.
I'm not talking about PF's pci device, because PF
driver knows it. I'm talking about VF's.
The root cause is PF cannot get VF's bdf, so it has
no idea where VF is. IMO it's abnormal.
Make symlinks is just a use case. When a userspace
app only knows VF's bdf, it has no idea how to get
into the VF's interface-like directories.
(/sys/class/net/ethX/vf[1-7])
Because of:
1. Userspace app cannot get PF's bdf from VF's bdf
2. Userspace app cannot guess the interface name of
pf(ethX)
So a symlink from /sys/bus/pci/devices/xxxxx to
/sys/class/net/ethX/vf[1-7] is useful.
This is a real issue.
Thanks for reply.
--
Frank Pan
Computer Science and Technology
Tsinghua University
>-----Original Message-----
>From: Frank Pan [mailto:[email protected]]
>
>I'm not talking about PF's pci device, because PF
>driver knows it. I'm talking about VF's.
>The root cause is PF cannot get VF's bdf, so it has
>no idea where VF is. IMO it's abnormal.
>
>Make symlinks is just a use case. When a userspace
>app only knows VF's bdf, it has no idea how to get
>into the VF's interface-like directories.
>(/sys/class/net/ethX/vf[1-7])
>Because of:
>1. Userspace app cannot get PF's bdf from VF's bdf
>2. Userspace app cannot guess the interface name of
>pf(ethX)
>So a symlink from /sys/bus/pci/devices/xxxxx to
>/sys/class/net/ethX/vf[1-7] is useful.
>This is a real issue.
>
Frank, I'm not sure what you're trying to accomplish here. All of the information you need is already in sysfs. Given the PF device, you can look at /sys/class/net/ethX/device/virtfnX to get the bus/device/function of each of the VF devices.
If the VF driver is loaded in your kernel, then given the bus/device/function of the vf device, you can look at /sys/class/net/ethX/device/virtfnX/net to see which interface corresponds to that VF.
Furthermore, if the VF driver is loaded, you can find the PF device by looking at /sys/class/net/ethX/device/physfn, and you can find out which interface it is by looking at /sys/class/net/ethX/device/physfn/net
So it's all there. You don't need anything else.
The current PF drivers (both ixgbe and igb) do not directly manipulate sysfs at all, so there is no /sys/class/net/ethX/vfX. If you see this setup, you are using a very, very old version of the igb driver, which is not supported at all. Please upgrade to a recent kernel/driver combination.
-Mitch
>Thanks for reply.
>
>--
>Frank Pan
>
>Computer Science and Technology
>Tsinghua University
hi Mitch,
> Frank, I'm not sure what you're trying to accomplish here. All of the information you need is already in sysfs. Given the PF device, you can look at /sys/class/net/ethX/device/virtfnX to get the bus/device/function of each of the VF devices.
Yes, that's the most funny part. Sysfs can only be traversed in
usespace. So the thing userspace knows isn't known by driver(pf driver
have no idea about vf's bdf), while the thing driver knows isn't known
by userspace(one cannot infer pf's bdf from vf's bdf).
Please think kernel/userspace as 2 system, they can hardly communicate
these informations. IMHO give a syscall/ioctl telling these is funny
and strange.
> If the VF driver is loaded in your kernel, then given the bus/device/function of the vf device, you can look at /sys/class/net/ethX/device/virtfnX/net to see which interface corresponds to that VF.
VF driver will never be loaded on physical machine, it can only be
loaded in a virtual machine. On a physical machine, VF won't have any
interface.
> Furthermore, if the VF driver is loaded, you can find the PF device by looking at /sys/class/net/ethX/device/physfn, and you can find out which interface it is by looking at /sys/class/net/ethX/device/physfn/net
So, when VF has no interface, this path is also unavailable. That's
why I say given vf's bdf, one cannot infer pf's.
> The current PF drivers (both ixgbe and igb) do not directly manipulate sysfs at all, so there is no /sys/class/net/ethX/vfX. If you see this setup, you are using a very, very old version of the igb driver, which is not supported at all. Please upgrade to a recent kernel/driver combination.
You are right, vfX is not available now. Anyway, I'm hacking igb
driver to provide informations about vf into userspace. When driver
don't know vf's bdf and userspace don't know pf's bdf, this thing
cannot be done. That's the motivation of this patch.
Don't forget another one: PF cannot access VF's PCI space. This is
also a <maybe common> use case. I need this to dump/change the VF's
PCI space.
Thanks for reply.
--
Frank Pan
Computer Science and Technology
Tsinghua University
> So, when VF has no interface, this path is also unavailable. That's
> why I say given vf's bdf, one cannot infer pf's.
Sorry I just missed something. Userspace can find pf's bdf by physfn.
So, the motivation is provide some information about vf from pf driver by sysfs.
I cannot avoid to create one directory for each vf, like vf0, vf1,
etc. Although userspace can find these directories, but it has no idea
about which one to use. Userspace just knows vf's bdf, it won't care
about/know vf's id. So I have the issue to create a symlink from each
vf's /sys/bus/pci/devices/xxxx/ to their corresponding directories.
And then comes the issue: PF cannot get VF's bdf in kernel space.
This is not a strange issue, hacking every SR-IOV device can met.
Thanks.
--
Frank Pan
Computer Science and Technology
Tsinghua University
>-----Original Message-----
>From: Frank Pan [mailto:[email protected]]
>Sent: Wednesday, June 02, 2010 8:11 PM
[snip]
>So, the motivation is provide some information about vf from pf driver by
>sysfs.
>
>I cannot avoid to create one directory for each vf, like vf0, vf1,
>etc. Although userspace can find these directories, but it has no idea
>about which one to use. Userspace just knows vf's bdf, it won't care
>about/know vf's id. So I have the issue to create a symlink from each
>vf's /sys/bus/pci/devices/xxxx/ to their corresponding directories.
>
>And then comes the issue: PF cannot get VF's bdf in kernel space.
>
>This is not a strange issue, hacking every SR-IOV device can met.
>
Frank, I still don't understand what you're trying to do here. What's the use case? What is the end goal that you're trying to get to?
What information do you need about the VF, and why? SR-IOV works properly today without this stuff.
If we can understand what you're trying to do, then maybe we can help.
-Mitch
>Thanks.
>
>--
>Frank Pan
>
>Computer Science and Technology
>Tsinghua University
* Frank Pan ([email protected]) wrote:
> hi Mitch,
>
> > Frank, I'm not sure what you're trying to accomplish here. All of the information you need is already in sysfs. Given the PF device, you can look at /sys/class/net/ethX/device/virtfnX to get the bus/device/function of each of the VF devices.
>
> Yes, that's the most funny part. Sysfs can only be traversed in
> usespace. So the thing userspace knows isn't known by driver(pf driver
> have no idea about vf's bdf), while the thing driver knows isn't known
> by userspace(one cannot infer pf's bdf from vf's bdf).
> Please think kernel/userspace as 2 system, they can hardly communicate
> these informations. IMHO give a syscall/ioctl telling these is funny
> and strange.
>
> > If the VF driver is loaded in your kernel, then given the bus/device/function of the vf device, you can look at /sys/class/net/ethX/device/virtfnX/net to see which interface corresponds to that VF.
>
> VF driver will never be loaded on physical machine, it can only be
> loaded in a virtual machine. On a physical machine, VF won't have any
> interface.
VF is often loaded on the physical machine. There's also a networking
specific mechanism for querying and configuring a VF via the PF.
While your patch is simple, it's unclear to me what your end goal is.
The patch itself only adds a function. if you showed how you are
planning to use it, that would really help.
It's especially confusing that you are comparing your patch with
symlinks visible in sysfs.
thanks,
-chris
> Frank, I still don't understand what you're trying to do here. What's the use case? What is the end goal that you're trying to get to?
>
> What information do you need about the VF, and why? ?SR-IOV works properly today without this stuff.
>
> If we can understand what you're trying to do, then maybe we can help.
Sorry, I used to think of this issue is common, but maybe not.
I'm hacking igb to add support of VF migration. Migration is emerging
as a standard of PCI IOV, but has not implemented in intel 82576(igb).
The idea is interact with VF's MMIO/PCI configuration in the physical
machine, which only PF's driver exists.
This work started by userspace. All I know is the PCI BDF of the VF,
because give BDF to the VMM is enough to assign a VF to a VM.
So here comes the first issue, neither userspace nor PF driver knows
the map from VF's ID to BDF. My plan is communicating with PF's driver
by sysfs, so I repeatly complain about the sysfs issue.
PF can access VF's BAR when it knows VF's ID. On the other hand, PF
can access VF's PCI configuration only when if knows VF's BDF. This is
the second issue. Even when userspace calculated and gave it an id, PF
driver still cannot access the PCI configuration of VF.
Really thanks for your help. I'm new on both kernel scope and English scope. :)
--
Frank Pan
Computer Science and Technology
Tsinghua University
> VF is often loaded on the physical machine. ?There's also a networking
> specific mechanism for querying and configuring a VF via the PF.
? I don't really understand. What do you mean load? I don't think
physical machine is able to use it as a hardware device.
> While your patch is simple, it's unclear to me what your end goal is.
> The patch itself only adds a function. if you showed how you are
> planning to use it, that would really help.
Currently my hack is applied on 2.6.18 because of xen's limitation. As
Mitch says, igb driver is significantly different with recent ones. On
the other hand, my hack just exposes several PCI configuration and
MMIO registers to the sysfs, you will only understand the use case
when you also see the userspace hacks.
> It's especially confusing that you are comparing your patch with
> symlinks visible in sysfs.
That's my fault. The most recent reply explains it.
Thanks for reply.
--
Frank Pan
Computer Science and Technology
Tsinghua University
On Tue, 1 Jun 2010 13:05:25 +0800
Frank Pan <[email protected]> wrote:
> Greetings,
>
> The motivation is make VF device visible to PF driver. PF driver
> may need this to access VF's PCI configuration.
> Another use case is in sysfs symbolic linking. Some of VF's sysfs
> entries are created by PF driver. For example, /sys/class/net/ethx/vfx
> in Intel 82576 NIC driver. Makeing a symbolic link from VF's pci device
> to this path also must be done in PF's driver.
>
> Currently, there is no hint about VF's bus/devfn in PF's pci_dev.
> The offset and stride entries(which are used to calculate bus/devfn
> of VF devices) in VF's PCI configuration is also invisible in PF's
> driver. So IMO this helper function is needed.
>
> Any reply is appreciated, THX.
>
> (ps: gmail will do line wrap/tab replace, use attachment instead to patch)
Per the discussion in this thread it sounds like this really has
nothing to do with sysfs and more to do with being a convenient API for
drivers.
Is that correct? If so, and assuming there's not some other way of
getting this info from a driver, I'm ok with it, but it should be
submitted as part of a patchset including driver code that uses it.
--
Jesse Barnes, Intel Open Source Technology Center
* Frank Pan ([email protected]) wrote:
> > VF is often loaded on the physical machine. ?There's also a networking
> > specific mechanism for querying and configuring a VF via the PF.
> ? I don't really understand. What do you mean load? I don't think
> physical machine is able to use it as a hardware device.
Yes, it is possible. Your example of igb...igbvf driver can be loaded
on physical machine and drive a VF instance. In fact, this is a pretty
normal mode for KVM.
> > While your patch is simple, it's unclear to me what your end goal is.
> > The patch itself only adds a function. if you showed how you are
> > planning to use it, that would really help.
> Currently my hack is applied on 2.6.18 because of xen's limitation. As
> Mitch says, igb driver is significantly different with recent ones. On
> the other hand, my hack just exposes several PCI configuration and
> MMIO registers to the sysfs, you will only understand the use case
> when you also see the userspace hacks.
OK, but this sounds like the wrong approach. We already have a
mechanism for using the PF to program a VF. Have a look at the netlink
interface. Specifically, do_setvfinfo().
thanks,
-chris
> Per the discussion in this thread it sounds like this really has
> nothing to do with sysfs and more to do with being a convenient API for
> drivers.
Yes.
> Is that correct? If so, and assuming there's not some other way of
> getting this info from a driver, I'm ok with it, but it should be
> submitted as part of a patchset including driver code that uses it.
Yes, I may submit them all after the driver hack is complete. I put the
issue here to get response of some thing like, is this change possible?
or is there a better way to do it?
Thanks for reply.
--
Frank Pan
Computer Science and Technology
Tsinghua University
On Fri, 11 Jun 2010 15:42:12 +0800
Frank Pan <[email protected]> wrote:
> > Per the discussion in this thread it sounds like this really has
> > nothing to do with sysfs and more to do with being a convenient API for
> > drivers.
> Yes.
>
> > Is that correct? If so, and assuming there's not some other way of
> > getting this info from a driver, I'm ok with it, but it should be
> > submitted as part of a patchset including driver code that uses it.
> Yes, I may submit them all after the driver hack is complete. I put the
> issue here to get response of some thing like, is this change possible?
> or is there a better way to do it?
>
> Thanks for reply.
I think once we see the driver usage it'll be easier to say.
Chris is the one you'll need to convince though.
--
Jesse Barnes, Intel Open Source Technology Center