2012-10-03 17:51:48

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 0/5] PCI: per pci device sysfs set_max_vfs support

First two patches are old patches in my queue.
They introduce pci_dev_type and use it with vga in sysfs.

other two will add set_max_vfs in sysfs for device that support siov.

last one is draft version for ixgbe, still need ixgbe guys to sort it
out.

Thanks

Yinghai


Yinghai Lu (5):
PCI: Add pci_dev_type
PCI, sys: Use is_visable() with boot_vga attribute for pci_dev
PCI: add set_max_vfs in pci_driver ops
PCI: Add max_vfs in sysfs per pci device where supports
ixgbe: add driver set_max_vfs support

drivers/net/ethernet/intel/ixgbe/ixgbe.h | 2 +
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 44 +++++++++---
drivers/pci/pci-sysfs.c | 96 ++++++++++++++++++++++---
drivers/pci/pci.h | 1 +
drivers/pci/probe.c | 1 +
include/linux/pci.h | 2 +
6 files changed, 126 insertions(+), 20 deletions(-)

--
1.7.7


2012-10-03 17:52:09

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 4/5] PCI: Add max_vfs in sysfs per pci device where supports

only pci device that support sriov will have max_vfs show up in /sys

when user set value in /sys, driver ops set_max_vfs will be called to enable
VF there.

Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/pci-sysfs.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++
include/linux/pci.h | 1 +
2 files changed, 52 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index fbbb97f..9b6f409 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -458,6 +458,52 @@ boot_vga_show(struct device *dev, struct device_attribute *attr, char *buf)
}
struct device_attribute vga_attr = __ATTR_RO(boot_vga);

+static ssize_t
+max_vfs_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct pci_dev *pdev = to_pci_dev(dev);
+
+ return sprintf(buf, "%u\n", pdev->max_vfs);
+}
+
+static void max_vfs_callback(struct device *dev)
+{
+ struct pci_dev *pdev = to_pci_dev(dev);
+
+ mutex_lock(&pci_remove_rescan_mutex);
+ if (pdev->is_added && dev->driver) {
+ struct pci_driver *pdrv;
+
+ pdrv = to_pci_driver(dev->driver);
+ if (pdrv->set_max_vfs)
+ pdrv->set_max_vfs(pdev);
+
+ }
+ mutex_unlock(&pci_remove_rescan_mutex);
+}
+static ssize_t
+max_vfs_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ unsigned long val;
+ int err;
+ struct pci_dev *pdev = to_pci_dev(dev);
+
+ if (kstrtoul(buf, 0, &val) < 0)
+ return -EINVAL;
+
+ pdev->max_vfs = val;
+
+ err = device_schedule_callback(dev, max_vfs_callback);
+
+ if (err)
+ return err;
+
+ return count;
+}
+struct device_attribute max_vfs_attr =
+ __ATTR(max_vfs, 0644, max_vfs_show, max_vfs_store);
+
static void
pci_config_pm_runtime_get(struct pci_dev *pdev)
{
@@ -1405,6 +1451,7 @@ late_initcall(pci_sysfs_init);

static struct attribute *pci_dev_dev_attrs[] = {
&vga_attr.attr,
+ &max_vfs_attr.attr,
NULL,
};

@@ -1418,6 +1465,10 @@ static umode_t pci_dev_attrs_are_visible(struct kobject *kobj,
if ((pdev->class >> 8) != PCI_CLASS_DISPLAY_VGA)
return 0;

+ if (a == &max_vfs_attr.attr)
+ if (!pdev->is_physfn)
+ return 0;
+
return a->mode;
}

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 7d70a5e..f7423d4 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -335,6 +335,7 @@ struct pci_dev {
unsigned int broken_intx_masking:1;
unsigned int io_window_1k:1; /* Intel P2P bridge 1K I/O windows */
pci_dev_flags_t dev_flags;
+ unsigned int max_vfs;
atomic_t enable_cnt; /* pci_enable_device has been called */

u32 saved_config_space[16]; /* config space saved at suspend time */
--
1.7.7

2012-10-03 17:51:54

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 1/5] PCI: Add pci_dev_type

need to use it for visiable attribute control in syfsfs for pci_dev.

Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/pci-sysfs.c | 24 ++++++++++++++++++++++++
drivers/pci/pci.h | 1 +
drivers/pci/probe.c | 1 +
3 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 02d107b..3d160aa 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1411,3 +1411,27 @@ static int __init pci_sysfs_init(void)
}

late_initcall(pci_sysfs_init);
+
+static struct attribute *pci_dev_dev_attrs[] = {
+ NULL,
+};
+
+static umode_t pci_dev_attrs_are_visible(struct kobject *kobj,
+ struct attribute *a, int n)
+{
+ return a->mode;
+}
+
+static struct attribute_group pci_dev_attr_group = {
+ .attrs = pci_dev_dev_attrs,
+ .is_visible = pci_dev_attrs_are_visible,
+};
+
+static const struct attribute_group *pci_dev_attr_groups[] = {
+ &pci_dev_attr_group,
+ NULL,
+};
+
+struct device_type pci_dev_type = {
+ .groups = pci_dev_attr_groups,
+};
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index bacbcba..6f6cd14 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -157,6 +157,7 @@ static inline int pci_no_d1d2(struct pci_dev *dev)
}
extern struct device_attribute pci_dev_attrs[];
extern struct device_attribute pcibus_dev_attrs[];
+extern struct device_type pci_dev_type;
#ifdef CONFIG_HOTPLUG
extern struct bus_attribute pci_bus_attrs[];
#else
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ec909af..0312f1c48 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -975,6 +975,7 @@ int pci_setup_device(struct pci_dev *dev)
dev->sysdata = dev->bus->sysdata;
dev->dev.parent = dev->bus->bridge;
dev->dev.bus = &pci_bus_type;
+ dev->dev.type = &pci_dev_type;
dev->hdr_type = hdr_type & 0x7f;
dev->multifunction = !!(hdr_type & 0x80);
dev->error_state = pci_channel_io_normal;
--
1.7.7

2012-10-03 17:52:28

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 5/5] ixgbe: add driver set_max_vfs support

Need ixgbe guys to close the loop to use set_max_vfs instead
kernel parameters.

Signed-off-by: Yinghai Lu <[email protected]>
Cc: Jeff Kirsher <[email protected]>
Cc: Jesse Brandeburg <[email protected]>
Cc: Greg Rose <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: John Fastabend <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
drivers/net/ethernet/intel/ixgbe/ixgbe.h | 2 +
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 44 +++++++++++++++++++-----
2 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index b9623e9..d39d975 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -558,6 +558,8 @@ struct ixgbe_adapter {
u32 interrupt_event;
u32 led_reg;

+ struct ixgbe_info *ixgbe_info;
+
#ifdef CONFIG_IXGBE_PTP
struct ptp_clock *ptp_clock;
struct ptp_clock_info ptp_caps;
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index ee61819..1c097c7 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -129,13 +129,6 @@ static struct notifier_block dca_notifier = {
};
#endif

-#ifdef CONFIG_PCI_IOV
-static unsigned int max_vfs;
-module_param(max_vfs, uint, 0);
-MODULE_PARM_DESC(max_vfs,
- "Maximum number of virtual functions to allocate per physical function - default is zero and maximum value is 63");
-#endif /* CONFIG_PCI_IOV */
-
static unsigned int allow_unsupported_sfp;
module_param(allow_unsupported_sfp, uint, 0);
MODULE_PARM_DESC(allow_unsupported_sfp,
@@ -4496,7 +4489,7 @@ static int __devinit ixgbe_sw_init(struct ixgbe_adapter *adapter)
#ifdef CONFIG_PCI_IOV
/* assign number of SR-IOV VFs */
if (hw->mac.type != ixgbe_mac_82598EB)
- adapter->num_vfs = (max_vfs > 63) ? 0 : max_vfs;
+ adapter->num_vfs = min_t(int, pdev->max_vfs, 63);

#endif
/* enable itr by default in dynamic mode */
@@ -7220,8 +7213,9 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,

#ifdef CONFIG_PCI_IOV
ixgbe_enable_sriov(adapter, ii);
-
#endif
+ adapter->ixgbe_info = ii;
+
netdev->features = NETIF_F_SG |
NETIF_F_IP_CSUM |
NETIF_F_IPV6_CSUM |
@@ -7683,11 +7677,43 @@ static const struct pci_error_handlers ixgbe_err_handler = {
.resume = ixgbe_io_resume,
};

+static void ixgbe_set_max_vfs(struct pci_dev *pdev)
+{
+#ifdef CONFIG_PCI_IOV
+ struct ixgbe_adapter *adapter = pci_get_drvdata(pdev);
+ struct ixgbe_hw *hw = &adapter->hw;
+ int num_vfs = 0;
+
+ /* assign number of SR-IOV VFs */
+ if (hw->mac.type != ixgbe_mac_82598EB)
+ num_vfs = min_t(int, pdev->max_vfs, 63);
+
+ /* no change */
+ if (adapter->num_vfs == num_vfs)
+ return;
+
+ if (!num_vfs) {
+ /* disable sriov */
+ ixgbe_disable_sriov(adapter);
+ adapter->num_vfs = 0;
+ } else if (!adapter->num_vfs && num_vfs) {
+ /* enable sriov */
+ adapter->num_vfs = num_vfs;
+ ixgbe_enable_sriov(adapter, adapter->ixgbe_info);
+ } else {
+ /* increase or decrease */
+ }
+
+ pdev->max_vfs = adapter->num_vfs;
+#endif
+}
+
static struct pci_driver ixgbe_driver = {
.name = ixgbe_driver_name,
.id_table = ixgbe_pci_tbl,
.probe = ixgbe_probe,
.remove = __devexit_p(ixgbe_remove),
+ .set_max_vfs = ixgbe_set_max_vfs,
#ifdef CONFIG_PM
.suspend = ixgbe_suspend,
.resume = ixgbe_resume,
--
1.7.7

2012-10-03 17:52:50

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 3/5] PCI: add set_max_vfs in pci_driver ops

Will use it enable sriov for pci devices.

Signed-off-by: Yinghai Lu <[email protected]>
---
include/linux/pci.h | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index be1de01..7d70a5e 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -590,6 +590,7 @@ struct pci_driver {
const struct pci_device_id *id_table; /* must be non-NULL for probe to be called */
int (*probe) (struct pci_dev *dev, const struct pci_device_id *id); /* New device inserted */
void (*remove) (struct pci_dev *dev); /* Device removed (NULL if not a hot-plug capable driver) */
+ void (*set_max_vfs) (struct pci_dev *dev); /* enable sriov */
int (*suspend) (struct pci_dev *dev, pm_message_t state); /* Device suspended */
int (*suspend_late) (struct pci_dev *dev, pm_message_t state);
int (*resume_early) (struct pci_dev *dev);
--
1.7.7

2012-10-03 17:52:05

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 2/5] PCI, sys: Use is_visable() with boot_vga attribute for pci_dev

That could let pci_create_sysfs_dev_files more simple.

also fix possible fix memleak during removing path.

Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/pci-sysfs.c | 21 ++++++++++-----------
1 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 3d160aa..fbbb97f 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1303,29 +1303,20 @@ int __must_check pci_create_sysfs_dev_files (struct pci_dev *pdev)
pdev->rom_attr = attr;
}

- if ((pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA) {
- retval = device_create_file(&pdev->dev, &vga_attr);
- if (retval)
- goto err_rom_file;
- }
-
/* add platform-specific attributes */
retval = pcibios_add_platform_entries(pdev);
if (retval)
- goto err_vga_file;
+ goto err_rom_file;

/* add sysfs entries for various capabilities */
retval = pci_create_capabilities_sysfs(pdev);
if (retval)
- goto err_vga_file;
+ goto err_rom_file;

pci_create_firmware_label_files(pdev);

return 0;

-err_vga_file:
- if ((pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA)
- device_remove_file(&pdev->dev, &vga_attr);
err_rom_file:
if (rom_size) {
sysfs_remove_bin_file(&pdev->dev.kobj, pdev->rom_attr);
@@ -1413,12 +1404,20 @@ static int __init pci_sysfs_init(void)
late_initcall(pci_sysfs_init);

static struct attribute *pci_dev_dev_attrs[] = {
+ &vga_attr.attr,
NULL,
};

static umode_t pci_dev_attrs_are_visible(struct kobject *kobj,
struct attribute *a, int n)
{
+ struct device *dev = container_of(kobj, struct device, kobj);
+ struct pci_dev *pdev = to_pci_dev(dev);
+
+ if (a == &vga_attr.attr)
+ if ((pdev->class >> 8) != PCI_CLASS_DISPLAY_VGA)
+ return 0;
+
return a->mode;
}

--
1.7.7

2012-10-03 17:57:49

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 5/5] ixgbe: add driver set_max_vfs support

On Wed, Oct 3, 2012 at 10:51 AM, Yinghai Lu <[email protected]> wrote:
> Need ixgbe guys to close the loop to use set_max_vfs instead
> kernel parameters.

Sorry, I should put RFC in the subject line for this one.

Thanks

Yinghai

2012-10-03 18:45:45

by Dan Carpenter

[permalink] [raw]
Subject: Re: [PATCH 5/5] ixgbe: add driver set_max_vfs support

On Wed, Oct 03, 2012 at 10:51:35AM -0700, Yinghai Lu wrote:
> Need ixgbe guys to close the loop to use set_max_vfs instead
> kernel parameters.
>
> Signed-off-by: Yinghai Lu <[email protected]>
> Cc: Jeff Kirsher <[email protected]>
> Cc: Jesse Brandeburg <[email protected]>
> Cc: Greg Rose <[email protected]>
> Cc: "David S. Miller" <[email protected]>
> Cc: John Fastabend <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> ---
> drivers/net/ethernet/intel/ixgbe/ixgbe.h | 2 +
> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 44 +++++++++++++++++++-----
> 2 files changed, 37 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> index b9623e9..d39d975 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> @@ -558,6 +558,8 @@ struct ixgbe_adapter {
> u32 interrupt_event;
> u32 led_reg;
>
> + struct ixgbe_info *ixgbe_info;
> +
> #ifdef CONFIG_IXGBE_PTP
> struct ptp_clock *ptp_clock;
> struct ptp_clock_info ptp_caps;
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index ee61819..1c097c7 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -129,13 +129,6 @@ static struct notifier_block dca_notifier = {
> };
> #endif
>
> -#ifdef CONFIG_PCI_IOV
> -static unsigned int max_vfs;
> -module_param(max_vfs, uint, 0);
> -MODULE_PARM_DESC(max_vfs,
> - "Maximum number of virtual functions to allocate per physical function - default is zero and maximum value is 63");
> -#endif /* CONFIG_PCI_IOV */
> -
> static unsigned int allow_unsupported_sfp;
> module_param(allow_unsupported_sfp, uint, 0);
> MODULE_PARM_DESC(allow_unsupported_sfp,
> @@ -4496,7 +4489,7 @@ static int __devinit ixgbe_sw_init(struct ixgbe_adapter *adapter)
> #ifdef CONFIG_PCI_IOV
> /* assign number of SR-IOV VFs */
> if (hw->mac.type != ixgbe_mac_82598EB)
> - adapter->num_vfs = (max_vfs > 63) ? 0 : max_vfs;
> + adapter->num_vfs = min_t(int, pdev->max_vfs, 63);

Could we make this min_t(uint, ...);

->max_vfs is type unsigned int. We take an unsigned long from sysfs.
We silently truncate it to an unsigned int. Then we cast it to a
negative number and compare against 63 and take the minimum...

It's root only so it's not a problem but it's a hassle to audit.

regards,
dan carpenter

2012-10-03 18:47:19

by Duyck, Alexander H

[permalink] [raw]
Subject: Re: [PATCH 5/5] ixgbe: add driver set_max_vfs support

On 10/03/2012 10:51 AM, Yinghai Lu wrote:
> Need ixgbe guys to close the loop to use set_max_vfs instead
> kernel parameters.
>
> Signed-off-by: Yinghai Lu <[email protected]>
> Cc: Jeff Kirsher <[email protected]>
> Cc: Jesse Brandeburg <[email protected]>
> Cc: Greg Rose <[email protected]>
> Cc: "David S. Miller" <[email protected]>
> Cc: John Fastabend <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> ---
> drivers/net/ethernet/intel/ixgbe/ixgbe.h | 2 +
> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 44 +++++++++++++++++++-----
> 2 files changed, 37 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> index b9623e9..d39d975 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> @@ -558,6 +558,8 @@ struct ixgbe_adapter {
> u32 interrupt_event;
> u32 led_reg;
>
> + struct ixgbe_info *ixgbe_info;
> +
> #ifdef CONFIG_IXGBE_PTP
> struct ptp_clock *ptp_clock;
> struct ptp_clock_info ptp_caps;
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index ee61819..1c097c7 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -129,13 +129,6 @@ static struct notifier_block dca_notifier = {
> };
> #endif
>
> -#ifdef CONFIG_PCI_IOV
> -static unsigned int max_vfs;
> -module_param(max_vfs, uint, 0);
> -MODULE_PARM_DESC(max_vfs,
> - "Maximum number of virtual functions to allocate per physical function - default is zero and maximum value is 63");
> -#endif /* CONFIG_PCI_IOV */
> -
> static unsigned int allow_unsupported_sfp;
> module_param(allow_unsupported_sfp, uint, 0);
> MODULE_PARM_DESC(allow_unsupported_sfp,
> @@ -4496,7 +4489,7 @@ static int __devinit ixgbe_sw_init(struct ixgbe_adapter *adapter)
> #ifdef CONFIG_PCI_IOV
> /* assign number of SR-IOV VFs */
> if (hw->mac.type != ixgbe_mac_82598EB)
> - adapter->num_vfs = (max_vfs > 63) ? 0 : max_vfs;
> + adapter->num_vfs = min_t(int, pdev->max_vfs, 63);
>
> #endif
> /* enable itr by default in dynamic mode */
> @@ -7220,8 +7213,9 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
>
> #ifdef CONFIG_PCI_IOV
> ixgbe_enable_sriov(adapter, ii);
> -
> #endif
> + adapter->ixgbe_info = ii;
> +
> netdev->features = NETIF_F_SG |
> NETIF_F_IP_CSUM |
> NETIF_F_IPV6_CSUM |
> @@ -7683,11 +7677,43 @@ static const struct pci_error_handlers ixgbe_err_handler = {
> .resume = ixgbe_io_resume,
> };
>
> +static void ixgbe_set_max_vfs(struct pci_dev *pdev)
> +{
> +#ifdef CONFIG_PCI_IOV
> + struct ixgbe_adapter *adapter = pci_get_drvdata(pdev);
> + struct ixgbe_hw *hw = &adapter->hw;
> + int num_vfs = 0;
> +
> + /* assign number of SR-IOV VFs */
> + if (hw->mac.type != ixgbe_mac_82598EB)
> + num_vfs = min_t(int, pdev->max_vfs, 63);
> +
> + /* no change */
> + if (adapter->num_vfs == num_vfs)
> + return;
> +
> + if (!num_vfs) {
> + /* disable sriov */
> + ixgbe_disable_sriov(adapter);
> + adapter->num_vfs = 0;
> + } else if (!adapter->num_vfs && num_vfs) {
> + /* enable sriov */
> + adapter->num_vfs = num_vfs;
> + ixgbe_enable_sriov(adapter, adapter->ixgbe_info);
> + } else {
> + /* increase or decrease */
> + }
> +
> + pdev->max_vfs = adapter->num_vfs;
> +#endif
> +}
> +
> static struct pci_driver ixgbe_driver = {
> .name = ixgbe_driver_name,
> .id_table = ixgbe_pci_tbl,
> .probe = ixgbe_probe,
> .remove = __devexit_p(ixgbe_remove),
> + .set_max_vfs = ixgbe_set_max_vfs,
> #ifdef CONFIG_PM
> .suspend = ixgbe_suspend,
> .resume = ixgbe_resume,

The ixgbe_set_max_vfs function has several issues. The two big ones are
that this function assumes it can just enable/disable SR-IOV without any
other changes being necessary which is not the case. I would recommend
looking at ixgbe_setup_tc for how to do this properly. Secondly is the
fact that this code will change the PF network device and as such
sections of the code should be called with the RTNL lock held. In
addition I believe you have to disable SR-IOV before enabling it again
with a different number of VFs.

Below is a link to one of the early patches for igb when we were first
introducing SR-IOV, and the in-driver sysfs value had been rejected. I
figure it might be useful as it was also using sysfs to enable/disable
VFs. It however doesn't have the correct locking on changing the queues
and as such will likely throw an error if you were to implement it the
same way now:
http://lists.openwall.net/netdev/2009/04/08/34

Thanks,

Alex

2012-10-03 18:55:42

by Donald Dutile

[permalink] [raw]
Subject: Re: [PATCH 3/5] PCI: add set_max_vfs in pci_driver ops

On 10/03/2012 01:51 PM, Yinghai Lu wrote:
> Will use it enable sriov for pci devices.
>
> Signed-off-by: Yinghai Lu<[email protected]>
> ---
> include/linux/pci.h | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index be1de01..7d70a5e 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -590,6 +590,7 @@ struct pci_driver {
> const struct pci_device_id *id_table; /* must be non-NULL for probe to be called */
> int (*probe) (struct pci_dev *dev, const struct pci_device_id *id); /* New device inserted */
> void (*remove) (struct pci_dev *dev); /* Device removed (NULL if not a hot-plug capable driver) */
> + void (*set_max_vfs) (struct pci_dev *dev); /* enable sriov */
> int (*suspend) (struct pci_dev *dev, pm_message_t state); /* Device suspended */
> int (*suspend_late) (struct pci_dev *dev, pm_message_t state);
> int (*resume_early) (struct pci_dev *dev);

I thought I stated the following in your earlier patch set....

(a) don't use 'set_max_vfs' ; it is not changing the max; the max
is whatever the device supports. This kind of terminology confuses
what is being done, and not descripting what is being done.
(b) this is equiv to the sriov_enable_vfs in the RFC set I sent.
-- in this set, it prevents the user trying to do more than one enable,
and that check should be done, and reject the request, which solves one
of the complaints Alexander had.

I'll try to mind-meld your sysfs attr creation patches to mind later today
and post a new series tonight or tomorrow. Sorry, stuck in mtgs today (and right now!),
thus the delay.

- Don

2012-10-03 19:02:40

by Donald Dutile

[permalink] [raw]
Subject: Re: [PATCH 5/5] ixgbe: add driver set_max_vfs support

On 10/03/2012 02:47 PM, Alexander Duyck wrote:
> On 10/03/2012 10:51 AM, Yinghai Lu wrote:
>> Need ixgbe guys to close the loop to use set_max_vfs instead
>> kernel parameters.
>>
>> Signed-off-by: Yinghai Lu<[email protected]>
>> Cc: Jeff Kirsher<[email protected]>
>> Cc: Jesse Brandeburg<[email protected]>
>> Cc: Greg Rose<[email protected]>
>> Cc: "David S. Miller"<[email protected]>
>> Cc: John Fastabend<[email protected]>
>> Cc: [email protected]
>> Cc: [email protected]
>> ---
>> drivers/net/ethernet/intel/ixgbe/ixgbe.h | 2 +
>> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 44 +++++++++++++++++++-----
>> 2 files changed, 37 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> index b9623e9..d39d975 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> @@ -558,6 +558,8 @@ struct ixgbe_adapter {
>> u32 interrupt_event;
>> u32 led_reg;
>>
>> + struct ixgbe_info *ixgbe_info;
>> +
>> #ifdef CONFIG_IXGBE_PTP
>> struct ptp_clock *ptp_clock;
>> struct ptp_clock_info ptp_caps;
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> index ee61819..1c097c7 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> @@ -129,13 +129,6 @@ static struct notifier_block dca_notifier = {
>> };
>> #endif
>>
>> -#ifdef CONFIG_PCI_IOV
>> -static unsigned int max_vfs;
>> -module_param(max_vfs, uint, 0);
>> -MODULE_PARM_DESC(max_vfs,
>> - "Maximum number of virtual functions to allocate per physical function - default is zero and maximum value is 63");
>> -#endif /* CONFIG_PCI_IOV */
>> -
>> static unsigned int allow_unsupported_sfp;
>> module_param(allow_unsupported_sfp, uint, 0);
>> MODULE_PARM_DESC(allow_unsupported_sfp,
>> @@ -4496,7 +4489,7 @@ static int __devinit ixgbe_sw_init(struct ixgbe_adapter *adapter)
>> #ifdef CONFIG_PCI_IOV
>> /* assign number of SR-IOV VFs */
>> if (hw->mac.type != ixgbe_mac_82598EB)
>> - adapter->num_vfs = (max_vfs> 63) ? 0 : max_vfs;
>> + adapter->num_vfs = min_t(int, pdev->max_vfs, 63);
>>
>> #endif
>> /* enable itr by default in dynamic mode */
>> @@ -7220,8 +7213,9 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
>>
>> #ifdef CONFIG_PCI_IOV
>> ixgbe_enable_sriov(adapter, ii);
>> -
>> #endif
>> + adapter->ixgbe_info = ii;
>> +
>> netdev->features = NETIF_F_SG |
>> NETIF_F_IP_CSUM |
>> NETIF_F_IPV6_CSUM |
>> @@ -7683,11 +7677,43 @@ static const struct pci_error_handlers ixgbe_err_handler = {
>> .resume = ixgbe_io_resume,
>> };
>>
>> +static void ixgbe_set_max_vfs(struct pci_dev *pdev)
>> +{
>> +#ifdef CONFIG_PCI_IOV
>> + struct ixgbe_adapter *adapter = pci_get_drvdata(pdev);
>> + struct ixgbe_hw *hw =&adapter->hw;
>> + int num_vfs = 0;
>> +
>> + /* assign number of SR-IOV VFs */
>> + if (hw->mac.type != ixgbe_mac_82598EB)
>> + num_vfs = min_t(int, pdev->max_vfs, 63);
>> +
>> + /* no change */
>> + if (adapter->num_vfs == num_vfs)
>> + return;
>> +
>> + if (!num_vfs) {
>> + /* disable sriov */
>> + ixgbe_disable_sriov(adapter);
>> + adapter->num_vfs = 0;
>> + } else if (!adapter->num_vfs&& num_vfs) {
>> + /* enable sriov */
>> + adapter->num_vfs = num_vfs;
>> + ixgbe_enable_sriov(adapter, adapter->ixgbe_info);
>> + } else {
>> + /* increase or decrease */
>> + }
>> +
>> + pdev->max_vfs = adapter->num_vfs;
>> +#endif
>> +}
>> +
>> static struct pci_driver ixgbe_driver = {
>> .name = ixgbe_driver_name,
>> .id_table = ixgbe_pci_tbl,
>> .probe = ixgbe_probe,
>> .remove = __devexit_p(ixgbe_remove),
>> + .set_max_vfs = ixgbe_set_max_vfs,
>> #ifdef CONFIG_PM
>> .suspend = ixgbe_suspend,
>> .resume = ixgbe_resume,
>
> The ixgbe_set_max_vfs function has several issues. The two big ones are
> that this function assumes it can just enable/disable SR-IOV without any
> other changes being necessary which is not the case. I would recommend
> looking at ixgbe_setup_tc for how to do this properly. Secondly is the
> fact that this code will change the PF network device and as such
> sections of the code should be called with the RTNL lock held. In
> addition I believe you have to disable SR-IOV before enabling it again
> with a different number of VFs.
>
> Below is a link to one of the early patches for igb when we were first
> introducing SR-IOV, and the in-driver sysfs value had been rejected. I
> figure it might be useful as it was also using sysfs to enable/disable
> VFs. It however doesn't have the correct locking on changing the queues
> and as such will likely throw an error if you were to implement it the
> same way now:
> http://lists.openwall.net/netdev/2009/04/08/34
>
> Thanks,
>
> Alex

Alex,
Thanks for patch set pointer.
When I started to work on the ixgbe example use based on the RFC set I posted,
I ran into the problem you outlined -- the PF uses/consumes all the queues &
MSI intrs when sriov not enabled at driver load time, which required
more network shutdown logic that I'm not familiar with... So, I was going
to defer to Greg to work that magic. :)
Greg: assume the 2 callback function interface in the RFC patch set I sent,
(primarily, just the include/linux/pci.h changes), and you can make the
necessary drivers mods from there. In the meantime, I'll make the changes
to my original/v1 RFC to reflect the changes that GKH & Yinghai recommended/implemented
for sysfs attribute creation & removal in a v2 posting.
The end result is that the current module parameter setting for max_vfs should
continue to work, and the sysfs interface will work when those pieces are provided.

-Don

2012-10-03 19:16:37

by Rose, Gregory V

[permalink] [raw]
Subject: RE: [PATCH 5/5] ixgbe: add driver set_max_vfs support

> -----Original Message-----
> From: Don Dutile [mailto:[email protected]]
> Sent: Wednesday, October 03, 2012 12:03 PM
> To: Duyck, Alexander H
> Cc: Yinghai Lu; Bjorn Helgaas; Greg Kroah-Hartman; linux-
> [email protected]; [email protected]; [email protected];
> [email protected]; Rose, Gregory V; [email protected]
> reply-to; Kirsher, Jeffrey T; Brandeburg, Jesse; David S. Miller;
> Fastabend, John R; [email protected];
> [email protected]
> Subject: Re: [PATCH 5/5] ixgbe: add driver set_max_vfs support
>
> On 10/03/2012 02:47 PM, Alexander Duyck wrote:

[snip]

> >
> > The ixgbe_set_max_vfs function has several issues. The two big ones
> > are that this function assumes it can just enable/disable SR-IOV
> > without any other changes being necessary which is not the case. I
> > would recommend looking at ixgbe_setup_tc for how to do this properly.
> > Secondly is the fact that this code will change the PF network device
> > and as such sections of the code should be called with the RTNL lock
> > held. In addition I believe you have to disable SR-IOV before
> > enabling it again with a different number of VFs.
> >
> > Below is a link to one of the early patches for igb when we were first
> > introducing SR-IOV, and the in-driver sysfs value had been rejected.
> > I figure it might be useful as it was also using sysfs to
> > enable/disable VFs. It however doesn't have the correct locking on
> > changing the queues and as such will likely throw an error if you were
> > to implement it the same way now:
> > http://lists.openwall.net/netdev/2009/04/08/34
> >
> > Thanks,
> >
> > Alex
>
> Alex,
> Thanks for patch set pointer.
> When I started to work on the ixgbe example use based on the RFC set I
> posted, I ran into the problem you outlined -- the PF uses/consumes all
> the queues & MSI intrs when sriov not enabled at driver load time, which
> required more network shutdown logic that I'm not familiar with... So, I
> was going to defer to Greg to work that magic. :)
> Greg: assume the 2 callback function interface in the RFC patch set I
> sent,
> (primarily, just the include/linux/pci.h changes), and you can make
> the
> necessary drivers mods from there. In the meantime, I'll make the
> changes
> to my original/v1 RFC to reflect the changes that GKH & Yinghai
> recommended/implemented
> for sysfs attribute creation & removal in a v2 posting.
> The end result is that the current module parameter setting for
> max_vfs should
> continue to work, and the sysfs interface will work when those
> pieces are provided.

OK, I'll start work on it.

Thanks Don,

- Greg

>
> -Don

2012-10-03 19:28:14

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 2/5] PCI, sys: Use is_visable() with boot_vga attribute for pci_dev

On Wed, Oct 03, 2012 at 10:51:32AM -0700, Yinghai Lu wrote:
> That could let pci_create_sysfs_dev_files more simple.
>
> also fix possible fix memleak during removing path.
>
> Signed-off-by: Yinghai Lu <[email protected]>

This, combined with the 1/5 patch, looks great, thanks for doing this.

Acked-by: Greg Kroah-Hartman <[email protected]>

2012-10-03 20:37:22

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 5/5] ixgbe: add driver set_max_vfs support

On Wed, Oct 3, 2012 at 11:47 AM, Alexander Duyck
<[email protected]> wrote:
> The ixgbe_set_max_vfs function has several issues. The two big ones are
> that this function assumes it can just enable/disable SR-IOV without any
> other changes being necessary which is not the case. I would recommend
> looking at ixgbe_setup_tc for how to do this properly. Secondly is the
> fact that this code will change the PF network device and as such
> sections of the code should be called with the RTNL lock held. In
> addition I believe you have to disable SR-IOV before enabling it again
> with a different number of VFs.

yes, agreed.

>
> Below is a link to one of the early patches for igb when we were first
> introducing SR-IOV, and the in-driver sysfs value had been rejected. I
> figure it might be useful as it was also using sysfs to enable/disable
> VFs. It however doesn't have the correct locking on changing the queues
> and as such will likely throw an error if you were to implement it the
> same way now:
> http://lists.openwall.net/netdev/2009/04/08/34

yes, that is almost there if put that in-driver value to per device
value and ops.

Thanks

Yinghai

2012-10-03 20:41:36

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 3/5] PCI: add set_max_vfs in pci_driver ops

On Wed, Oct 3, 2012 at 11:55 AM, Don Dutile <[email protected]> wrote:
> On 10/03/2012 01:51 PM, Yinghai Lu wrote:
>>
>> Will use it enable sriov for pci devices.
>>
>> Signed-off-by: Yinghai Lu<[email protected]>
>> ---
>> include/linux/pci.h | 1 +
>> 1 files changed, 1 insertions(+), 0 deletions(-)
>>
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index be1de01..7d70a5e 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -590,6 +590,7 @@ struct pci_driver {
>> const struct pci_device_id *id_table; /* must be non-NULL for
>> probe to be called */
>> int (*probe) (struct pci_dev *dev, const struct pci_device_id
>> *id); /* New device inserted */
>> void (*remove) (struct pci_dev *dev); /* Device removed (NULL if
>> not a hot-plug capable driver) */
>> + void (*set_max_vfs) (struct pci_dev *dev); /* enable sriov */
>> int (*suspend) (struct pci_dev *dev, pm_message_t state); /*
>> Device suspended */
>> int (*suspend_late) (struct pci_dev *dev, pm_message_t state);
>> int (*resume_early) (struct pci_dev *dev);
>
>
> I thought I stated the following in your earlier patch set....
>
> (a) don't use 'set_max_vfs' ; it is not changing the max; the max
> is whatever the device supports. This kind of terminology confuses
> what is being done, and not descripting what is being done.
> (b) this is equiv to the sriov_enable_vfs in the RFC set I sent.
> -- in this set, it prevents the user trying to do more than one enable,
> and that check should be done, and reject the request, which solves
> one
> of the complaints Alexander had.
>
> I'll try to mind-meld your sysfs attr creation patches to mind later today
> and post a new series tonight or tomorrow. Sorry, stuck in mtgs today (and
> right now!),
> thus the delay.

Sure. please update 3, 4 as your like, and ask greg.rose work on patch 5.

Thanks

Yinghai

2012-10-03 21:02:32

by Donald Dutile

[permalink] [raw]
Subject: Re: [PATCH 3/5] PCI: add set_max_vfs in pci_driver ops

On 10/03/2012 04:41 PM, Yinghai Lu wrote:
> On Wed, Oct 3, 2012 at 11:55 AM, Don Dutile<[email protected]> wrote:
>> On 10/03/2012 01:51 PM, Yinghai Lu wrote:
>>>
>>> Will use it enable sriov for pci devices.
>>>
>>> Signed-off-by: Yinghai Lu<[email protected]>
>>> ---
>>> include/linux/pci.h | 1 +
>>> 1 files changed, 1 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>>> index be1de01..7d70a5e 100644
>>> --- a/include/linux/pci.h
>>> +++ b/include/linux/pci.h
>>> @@ -590,6 +590,7 @@ struct pci_driver {
>>> const struct pci_device_id *id_table; /* must be non-NULL for
>>> probe to be called */
>>> int (*probe) (struct pci_dev *dev, const struct pci_device_id
>>> *id); /* New device inserted */
>>> void (*remove) (struct pci_dev *dev); /* Device removed (NULL if
>>> not a hot-plug capable driver) */
>>> + void (*set_max_vfs) (struct pci_dev *dev); /* enable sriov */
>>> int (*suspend) (struct pci_dev *dev, pm_message_t state); /*
>>> Device suspended */
>>> int (*suspend_late) (struct pci_dev *dev, pm_message_t state);
>>> int (*resume_early) (struct pci_dev *dev);
>>
>>
>> I thought I stated the following in your earlier patch set....
>>
>> (a) don't use 'set_max_vfs' ; it is not changing the max; the max
>> is whatever the device supports. This kind of terminology confuses
>> what is being done, and not descripting what is being done.
>> (b) this is equiv to the sriov_enable_vfs in the RFC set I sent.
>> -- in this set, it prevents the user trying to do more than one enable,
>> and that check should be done, and reject the request, which solves
>> one
>> of the complaints Alexander had.
>>
>> I'll try to mind-meld your sysfs attr creation patches to mind later today
>> and post a new series tonight or tomorrow. Sorry, stuck in mtgs today (and
>> right now!),
>> thus the delay.
>
> Sure. please update 3, 4 as your like, and ask greg.rose work on patch 5.
>
Exactly what I'm doing now.
Thanks for the initial patch split of 1 & 2.
I *think* I understand how the generic framework works for
visible/invisible attributes now works! :)

> Thanks
>
> Yinghai
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2012-10-04 14:22:29

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [PATCH 1/5] PCI: Add pci_dev_type

On Wed, Oct 03, 2012 at 10:51:31AM -0700, Yinghai Lu wrote:
> need to use it for visiable attribute control in syfsfs for pci_dev.

Please use 'ispell' before sending your patches.


Also please explain why do you want this? If I do
'git annotate' on that file (say six months from now when I've forgotten
a lot) and try to lookup what 'is_visible' attribute is, I get
this one line explanation. Worst, the explanation does not
say *why* we need it - or the *purpose* behind it. Or if there
are patches that are going to utilize it.

The Documentation/SubmittingPatches says:

2) Describe your changes.

Describe the technical detail of the change(s) your patch includes.

Be as specific as possible. The WORST descriptions possible include
things like "update driver X", "bug fix for driver X", or "this patch
includes updates for subsystem X. Please apply."

The maintainer will thank you if you write your patch description in a
form which can be easily pulled into Linux's source code management
system, git, as a "commit log". See #15, below.

If your description starts to get long, that's a sign that you probably
need to split up your patch. See #3, next.

When you submit or resubmit a patch or patch series, include the
complete patch description and justification for it. Don't just
say that this is version N of the patch (series). Don't expect the
patch merger to refer back to earlier patch versions or referenced
URLs to find the patch description and put that into the patch.
I.e., the patch (series) and its description should be self-contained.
This benefits both the patch merger(s) and reviewers. Some reviewers
probably didn't even receive earlier versions of the patch.

If the patch fixes a logged bug entry, refer to that bug entry by
number and URL.

>
> Signed-off-by: Yinghai Lu <[email protected]>
> ---
> drivers/pci/pci-sysfs.c | 24 ++++++++++++++++++++++++
> drivers/pci/pci.h | 1 +
> drivers/pci/probe.c | 1 +
> 3 files changed, 26 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 02d107b..3d160aa 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1411,3 +1411,27 @@ static int __init pci_sysfs_init(void)
> }
>
> late_initcall(pci_sysfs_init);
> +
> +static struct attribute *pci_dev_dev_attrs[] = {
> + NULL,
> +};
> +
> +static umode_t pci_dev_attrs_are_visible(struct kobject *kobj,
> + struct attribute *a, int n)
> +{
> + return a->mode;
> +}
> +
> +static struct attribute_group pci_dev_attr_group = {
> + .attrs = pci_dev_dev_attrs,
> + .is_visible = pci_dev_attrs_are_visible,
> +};
> +
> +static const struct attribute_group *pci_dev_attr_groups[] = {
> + &pci_dev_attr_group,
> + NULL,
> +};
> +
> +struct device_type pci_dev_type = {
> + .groups = pci_dev_attr_groups,
> +};
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index bacbcba..6f6cd14 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -157,6 +157,7 @@ static inline int pci_no_d1d2(struct pci_dev *dev)
> }
> extern struct device_attribute pci_dev_attrs[];
> extern struct device_attribute pcibus_dev_attrs[];
> +extern struct device_type pci_dev_type;
> #ifdef CONFIG_HOTPLUG
> extern struct bus_attribute pci_bus_attrs[];
> #else
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index ec909af..0312f1c48 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -975,6 +975,7 @@ int pci_setup_device(struct pci_dev *dev)
> dev->sysdata = dev->bus->sysdata;
> dev->dev.parent = dev->bus->bridge;
> dev->dev.bus = &pci_bus_type;
> + dev->dev.type = &pci_dev_type;
> dev->hdr_type = hdr_type & 0x7f;
> dev->multifunction = !!(hdr_type & 0x80);
> dev->error_state = pci_channel_io_normal;
> --
> 1.7.7
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2012-10-04 14:26:34

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [PATCH 4/5] PCI: Add max_vfs in sysfs per pci device where supports

On Wed, Oct 03, 2012 at 10:51:34AM -0700, Yinghai Lu wrote:
> only pci device that support sriov will have max_vfs show up in /sys
^-Only ^-devices ^-SRIOV

>
> when user set value in /sys, driver ops set_max_vfs will be called to enable
> VF there.

Huh? What value? What are they enabling? Your comment makes it sound as
if setting any value (say '0xdeadbeef') will be called to enable a VF.

I don't think that is what the code does. Can you explain what the
proper values are to be submitted to the SysFS value 'max_vfs' and
what the kernel ought to be doing? Perhaps include a little snippet
from the kernel log so if somebody has a bug they can look at see
what you got and can compare?

Thank you.
>
> Signed-off-by: Yinghai Lu <[email protected]>
> ---
> drivers/pci/pci-sysfs.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++
> include/linux/pci.h | 1 +
> 2 files changed, 52 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index fbbb97f..9b6f409 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -458,6 +458,52 @@ boot_vga_show(struct device *dev, struct device_attribute *attr, char *buf)
> }
> struct device_attribute vga_attr = __ATTR_RO(boot_vga);
>
> +static ssize_t
> +max_vfs_show(struct device *dev, struct device_attribute *attr, char *buf)
> +{
> + struct pci_dev *pdev = to_pci_dev(dev);
> +
> + return sprintf(buf, "%u\n", pdev->max_vfs);
> +}
> +
> +static void max_vfs_callback(struct device *dev)
> +{
> + struct pci_dev *pdev = to_pci_dev(dev);
> +
> + mutex_lock(&pci_remove_rescan_mutex);
> + if (pdev->is_added && dev->driver) {
> + struct pci_driver *pdrv;
> +
> + pdrv = to_pci_driver(dev->driver);
> + if (pdrv->set_max_vfs)
> + pdrv->set_max_vfs(pdev);
> +
> + }
> + mutex_unlock(&pci_remove_rescan_mutex);
> +}
> +static ssize_t
> +max_vfs_store(struct device *dev, struct device_attribute *attr,
> + const char *buf, size_t count)
> +{
> + unsigned long val;
> + int err;
> + struct pci_dev *pdev = to_pci_dev(dev);
> +
> + if (kstrtoul(buf, 0, &val) < 0)
> + return -EINVAL;
> +
> + pdev->max_vfs = val;
> +
> + err = device_schedule_callback(dev, max_vfs_callback);
> +
> + if (err)
> + return err;
> +
> + return count;
> +}
> +struct device_attribute max_vfs_attr =
> + __ATTR(max_vfs, 0644, max_vfs_show, max_vfs_store);
> +
> static void
> pci_config_pm_runtime_get(struct pci_dev *pdev)
> {
> @@ -1405,6 +1451,7 @@ late_initcall(pci_sysfs_init);
>
> static struct attribute *pci_dev_dev_attrs[] = {
> &vga_attr.attr,
> + &max_vfs_attr.attr,
> NULL,
> };
>
> @@ -1418,6 +1465,10 @@ static umode_t pci_dev_attrs_are_visible(struct kobject *kobj,
> if ((pdev->class >> 8) != PCI_CLASS_DISPLAY_VGA)
> return 0;
>
> + if (a == &max_vfs_attr.attr)
> + if (!pdev->is_physfn)
> + return 0;
> +
> return a->mode;
> }
>
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 7d70a5e..f7423d4 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -335,6 +335,7 @@ struct pci_dev {
> unsigned int broken_intx_masking:1;
> unsigned int io_window_1k:1; /* Intel P2P bridge 1K I/O windows */
> pci_dev_flags_t dev_flags;
> + unsigned int max_vfs;
> atomic_t enable_cnt; /* pci_enable_device has been called */
>
> u32 saved_config_space[16]; /* config space saved at suspend time */
> --
> 1.7.7
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2012-10-04 15:13:53

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 4/5] PCI: Add max_vfs in sysfs per pci device where supports

On Thu, Oct 4, 2012 at 7:15 AM, Konrad Rzeszutek Wilk <[email protected]> wrote:
> On Wed, Oct 03, 2012 at 10:51:34AM -0700, Yinghai Lu wrote:
>> only pci device that support sriov will have max_vfs show up in /sys
> ^-Only ^-devices ^-SRIOV
>
>>
>> when user set value in /sys, driver ops set_max_vfs will be called to enable
>> VF there.
>
> Huh? What value? What are they enabling? Your comment makes it sound as
> if setting any value (say '0xdeadbeef') will be called to enable a VF.
>
> I don't think that is what the code does. Can you explain what the
> proper values are to be submitted to the SysFS value 'max_vfs' and
> what the kernel ought to be doing? Perhaps include a little snippet
> from the kernel log so if somebody has a bug they can look at see
> what you got and can compare?

I sent out this patches as reference or base for Don and greg.rose
and other sriov guys.

Anyway, thank you for review them.

Yinghai