2016-03-24 20:02:46

by Babu Moger

Subject: [PATCH v4] sparc/PCI: Fix for panic while enabling SR-IOV

We noticed this panic while enabling SR-IOV on sparc.

mlx4_core: Mellanox ConnectX core driver v2.2-1 (Jan 1 2015)
mlx4_core: Initializing 0007:01:00.0
mlx4_core 0007:01:00.0: Enabling SR-IOV with 5 VFs
mlx4_core: Initializing 0007:01:00.1
Unable to handle kernel NULL pointer dereference
insmod(10010): Oops [#1]
CPU: 391 PID: 10010 Comm: insmod Not tainted
4.1.12-32.el6uek.kdump2.sparc64 #1
TPC: <dma_supported+0x20/0x80>
I7: <__mlx4_init_one+0x324/0x500 [mlx4_core]>
Call Trace:
[00000000104c5ea4] __mlx4_init_one+0x324/0x500 [mlx4_core]
[00000000104c613c] mlx4_init_one+0xbc/0x120 [mlx4_core]
[0000000000725f14] local_pci_probe+0x34/0xa0
[0000000000726028] pci_call_probe+0xa8/0xe0
[0000000000726310] pci_device_probe+0x50/0x80
[000000000079f700] really_probe+0x140/0x420
[000000000079fa24] driver_probe_device+0x44/0xa0
[000000000079fb5c] __device_attach+0x3c/0x60
[000000000079d85c] bus_for_each_drv+0x5c/0xa0
[000000000079f588] device_attach+0x88/0xc0
[000000000071acd0] pci_bus_add_device+0x30/0x80
[0000000000736090] virtfn_add.clone.1+0x210/0x360
[00000000007364a4] sriov_enable+0x2c4/0x520
[000000000073672c] pci_enable_sriov+0x2c/0x40
[00000000104c2d58] mlx4_enable_sriov+0xf8/0x180 [mlx4_core]
[00000000104c49ac] mlx4_load_one+0x42c/0xd40 [mlx4_core]
Disabling lock debugging due to kernel taint
Caller[00000000104c5ea4]: __mlx4_init_one+0x324/0x500 [mlx4_core]
Caller[00000000104c613c]: mlx4_init_one+0xbc/0x120 [mlx4_core]
Caller[0000000000725f14]: local_pci_probe+0x34/0xa0
Caller[0000000000726028]: pci_call_probe+0xa8/0xe0
Caller[0000000000726310]: pci_device_probe+0x50/0x80
Caller[000000000079f700]: really_probe+0x140/0x420
Caller[000000000079fa24]: driver_probe_device+0x44/0xa0
Caller[000000000079fb5c]: __device_attach+0x3c/0x60
Caller[000000000079d85c]: bus_for_each_drv+0x5c/0xa0
Caller[000000000079f588]: device_attach+0x88/0xc0
Caller[000000000071acd0]: pci_bus_add_device+0x30/0x80
Caller[0000000000736090]: virtfn_add.clone.1+0x210/0x360
Caller[00000000007364a4]: sriov_enable+0x2c4/0x520
Caller[000000000073672c]: pci_enable_sriov+0x2c/0x40
Caller[00000000104c2d58]: mlx4_enable_sriov+0xf8/0x180 [mlx4_core]
Caller[00000000104c49ac]: mlx4_load_one+0x42c/0xd40 [mlx4_core]
Caller[00000000104c5f90]: __mlx4_init_one+0x410/0x500 [mlx4_core]
Caller[00000000104c613c]: mlx4_init_one+0xbc/0x120 [mlx4_core]
Caller[0000000000725f14]: local_pci_probe+0x34/0xa0
Caller[0000000000726028]: pci_call_probe+0xa8/0xe0
Caller[0000000000726310]: pci_device_probe+0x50/0x80
Caller[000000000079f700]: really_probe+0x140/0x420
Caller[000000000079fa24]: driver_probe_device+0x44/0xa0
Caller[000000000079fb08]: __driver_attach+0x88/0xa0
Caller[000000000079d90c]: bus_for_each_dev+0x6c/0xa0
Caller[000000000079f29c]: driver_attach+0x1c/0x40
Caller[000000000079e35c]: bus_add_driver+0x17c/0x220
Caller[00000000007a02d4]: driver_register+0x74/0x120
Caller[00000000007263fc]: __pci_register_driver+0x3c/0x60
Caller[00000000104f62bc]: mlx4_init+0x60/0xcc [mlx4_core]
Kernel panic - not syncing: Fatal exception
Press Stop-A (L1-A) to return to the boot prom
---[ end Kernel panic - not syncing: Fatal exception

Details:
Here is the call sequence:
virtfn_add->__mlx4_init_one->dma_set_mask->dma_supported

The panic happened at line 760 (file arch/sparc/kernel/iommu.c):

758 int dma_supported(struct device *dev, u64 device_mask)
759 {
760 	struct iommu *iommu = dev->archdata.iommu;
761 	u64 dma_addr_mask = iommu->dma_addr_mask;
762
763 	if (device_mask >= (1UL << 32UL))
764 		return 0;
765
766 	if ((device_mask & dma_addr_mask) == dma_addr_mask)
767 		return 1;
768
769 #ifdef CONFIG_PCI
770 	if (dev_is_pci(dev))
771 		return pci64_dma_supported(to_pci_dev(dev), device_mask);
772 #endif
773
774 	return 0;
775 }
776 EXPORT_SYMBOL(dma_supported);

The same panic also happened with the Intel ixgbe driver.

The SR-IOV code looks for arch-specific data while enabling
VFs. When a VF device is added, the driver probe function makes a set
of calls to initialize the PCI device. Because the VF device is
added in a different way than the normal PF device (which happens via
of_create_pci_dev for sparc), some of the arch-specific initialization
does not happen for the VF device. That causes a panic when archdata is
accessed.

To fix this, I have used the already defined weak function
pcibios_add_device to copy archdata from PF to VF.
Also verified the fix.
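
For illustration only, here is a small userspace model of the failure and
of the fix. It is a sketch, not kernel code: struct mock_pci_dev,
mock_dma_supported() and mock_pcibios_add_device() are hypothetical
stand-ins, the structures are cut down to the fields that matter here, and
the pci64_dma_supported() fallback is left out. The model returns -1 where
the real dma_supported() would dereference a NULL iommu pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Cut-down stand-ins for the kernel structures (assumptions, not the real layouts). */
struct iommu {
	unsigned long dma_addr_mask;
};

struct dev_archdata {
	struct iommu *iommu;
	int numa_node;
};

struct mock_pci_dev {
	int is_virtfn;                /* 1 for a VF, 0 for a PF */
	struct mock_pci_dev *physfn;  /* parent PF, valid only for a VF */
	struct dev_archdata archdata;
};

/*
 * Models the mask checks in dma_supported(). Returns -1 when archdata.iommu
 * is NULL, which is the point where the real function would oops.
 */
static int mock_dma_supported(const struct mock_pci_dev *dev,
			      unsigned long long device_mask)
{
	const struct iommu *iommu = dev->archdata.iommu;

	if (iommu == NULL)
		return -1;
	if (device_mask >= (1ULL << 32))
		return 0;
	return (device_mask & iommu->dma_addr_mask) == iommu->dma_addr_mask;
}

/* Models the fix: a VF inherits the archdata of its PF. */
static int mock_pcibios_add_device(struct mock_pci_dev *dev)
{
	if (dev->is_virtfn) {
		assert(dev->physfn != NULL);
		memcpy(&dev->archdata, &dev->physfn->archdata,
		       sizeof(struct dev_archdata));
	}
	return 0;
}
```

In this model a freshly created VF has a NULL iommu pointer until
mock_pcibios_add_device() runs; afterwards it shares the PF's iommu and the
mask check can proceed.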

Signed-off-by: Babu Moger <[email protected]>
Signed-off-by: Sowmini Varadhan <[email protected]>
Reviewed-by: Ethan Zhao <[email protected]>
---
v2:
Removed RFC.
Made changes per comments from Ethan Zhao.
Now the changes are only in sparc-specific code.
Removed the changes from drivers/pci.
Implemented the already defined weak function pcibios_add_device
in arch/sparc/kernel/pci.c to initialize sriov archdata.

v3:
Fixed the compile error reported by the kbuild test robot.

v4:
Fixed indentation per comments from David Miller

arch/sparc/kernel/pci.c | 17 +++++++++++++++++
1 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/arch/sparc/kernel/pci.c b/arch/sparc/kernel/pci.c
index badf095..9f9614d 100644
--- a/arch/sparc/kernel/pci.c
+++ b/arch/sparc/kernel/pci.c
@@ -994,6 +994,23 @@ void pcibios_set_master(struct pci_dev *dev)
 	/* No special bus mastering setup handling */
 }

+#ifdef CONFIG_PCI_IOV
+int pcibios_add_device(struct pci_dev *dev)
+{
+	struct pci_dev *pdev;
+
+	/* Add sriov arch specific initialization here.
+	 * Copy dev_archdata from PF to VF
+	 */
+	if (dev->is_virtfn) {
+		pdev = dev->physfn;
+		memcpy(&dev->dev.archdata, &pdev->dev.archdata,
+		       sizeof(struct dev_archdata));
+	}
+	return 0;
+}
+#endif /* CONFIG_PCI_IOV */
+
 static int __init pcibios_init(void)
 {
 	pci_dfl_cache_line_size = 64 >> 2;
--
1.7.1


2016-03-30 00:57:24

by David Miller

Subject: Re: [PATCH v4] sparc/PCI: Fix for panic while enabling SR-IOV

From: Babu Moger <[email protected]>
Date: Thu, 24 Mar 2016 13:02:22 -0700

> We noticed this panic while enabling SR-IOV in sparc.
...
> SR-IOV code looks for arch specific data while enabling
> VFs. When VF device is added, driver probe function makes set
> of calls to initialize the pci device. Because the VF device is
> added different way than the normal PF device(which happens via
> of_create_pci_dev for sparc), some of the arch specific initialization
> does not happen for VF device. That causes panic when archdata is
> accessed.
>
> To fix this, I have used already defined weak function
> pcibios_setup_device to copy archdata from PF to VF.
> Also verified the fix.
>
> Signed-off-by: Babu Moger <[email protected]>
> Signed-off-by: Sowmini Varadhan <[email protected]>
> Reviewed-by: Ethan Zhao <[email protected]>

Looks good, applied and queued up for -stable, thanks.

Just a note, I am assuming that the VFs are not instantiated in the
device tree. Because when you just memcpy the arch data over from the
PF, one thing we end up doing is using the device node of the PF.

I slightly cringed at the memcpy, because at least one of these
pointers is to an object which is reference counted, the OF device.

Generally speaking we don't really support hot-plug for OF probed
devices, but if we did, all of the device tree pointers would have
to be refcounted properly.

So in the long term that whole sequence where we go:

struct dev_archdata *sd;
...
sd = &dev->dev.archdata;
sd->iommu = pbm->iommu;
sd->stc = &pbm->stc;
sd->host_controller = pbm;
sd->op = op = of_find_device_by_node(node);
sd->numa_node = pbm->numa_node;

should be encapsulated into a helper function, and both
of_create_pci_dev() and this new pcibios_add_device() can
invoke it.
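
A minimal sketch of what such a helper might look like, written as a
userspace model so it can be compiled on its own. The helper name
init_dev_archdata() and the *_stub types are hypothetical, and reference
counting of the OF device is deliberately not modeled; in the kernel the
caller would still have to obtain 'op' (for example through
of_find_device_by_node()) and manage its reference.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types (assumptions); the real kernel definitions are larger. */
struct iommu {
	unsigned long dma_addr_mask;
};

struct strbuf {
	int enabled;
};

struct platform_device_stub {
	int id;
};

struct pci_pbm_info_stub {
	struct iommu *iommu;
	struct strbuf stc;
	int numa_node;
};

struct dev_archdata {
	void *iommu;
	void *stc;
	void *host_controller;
	struct platform_device_stub *op;
	int numa_node;
};

/*
 * Hypothetical helper: fill in archdata from the PBM in one place, so the
 * PF path and the VF path cannot drift apart. Refcounting of 'op' is the
 * caller's job and is not modeled here.
 */
static void init_dev_archdata(struct dev_archdata *sd,
			      struct pci_pbm_info_stub *pbm,
			      struct platform_device_stub *op)
{
	assert(sd != NULL && pbm != NULL);

	sd->iommu = pbm->iommu;
	sd->stc = &pbm->stc;
	sd->host_controller = pbm;
	sd->op = op;
	sd->numa_node = pbm->numa_node;
}
```

With something along these lines, the VF path would call the helper with
the PF's pbm and OF device instead of copying the whole structure.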

2016-03-30 15:31:42

by Babu Moger

Subject: Re: [PATCH v4] sparc/PCI: Fix for panic while enabling SR-IOV

Hi David,

On 3/29/2016 7:57 PM, David Miller wrote:
> From: Babu Moger <[email protected]>
> Date: Thu, 24 Mar 2016 13:02:22 -0700
>
>> We noticed this panic while enabling SR-IOV in sparc.
> ...
>> SR-IOV code looks for arch specific data while enabling
>> VFs. When VF device is added, driver probe function makes set
>> of calls to initialize the pci device. Because the VF device is
>> added different way than the normal PF device(which happens via
>> of_create_pci_dev for sparc), some of the arch specific initialization
>> does not happen for VF device. That causes panic when archdata is
>> accessed.
>>
>> To fix this, I have used already defined weak function
>> pcibios_setup_device to copy archdata from PF to VF.
>> Also verified the fix.
>>
>> Signed-off-by: Babu Moger <[email protected]>
>> Signed-off-by: Sowmini Varadhan <[email protected]>
>> Reviewed-by: Ethan Zhao <[email protected]>
>
> Looks good, applied and queued up for -stable, thanks.

Thanks.

>
> Just a note, I am assuming that the VFs are not instantiated in the
> device tree. Because when you just memcpy the arch data over from the
> PF, one thing we end up doing is using the device node of the PF.

Right, VFs are not instantiated in the device tree (/proc/device-tree).

>
> I slightly cringed at the memcpy, because at least one of these
> pointers are to objects which are reference counted, the OF device.
>
> Generally speaking we don't really support hot-plug for OF probed
> devices, but if we did all of the device tree pointers have to be
> refcounted properly.
>
> So in the long term that whole sequence where we go:
>
> struct dev_archdata *sd;
> ...
> sd = &dev->dev.archdata;
> sd->iommu = pbm->iommu;
> sd->stc = &pbm->stc;
> sd->host_controller = pbm;
> sd->op = op = of_find_device_by_node(node);
> sd->numa_node = pbm->numa_node;
>
> should be encapsulated into a helper function, and both
> of_create_pci_dev() and this new pcibios_setup_device() can
> invoke it.
>

Yes, agreed. We need to refactor the whole of_create_pci_dev path to support
hot-plug in the long term. I will start looking at it. For now we should be
fine with the current patch. Thanks.

2016-03-30 19:37:13

by Bjorn Helgaas

Subject: Re: [PATCH v4] sparc/PCI: Fix for panic while enabling SR-IOV

On Wed, Mar 30, 2016 at 10:31:18AM -0500, Babu Moger wrote:
> Hi David,
>
> On 3/29/2016 7:57 PM, David Miller wrote:
> > From: Babu Moger <[email protected]>
> > Date: Thu, 24 Mar 2016 13:02:22 -0700
> >
> >> We noticed this panic while enabling SR-IOV in sparc.
> > ...
> >> SR-IOV code looks for arch specific data while enabling
> >> VFs. When VF device is added, driver probe function makes set
> >> of calls to initialize the pci device. Because the VF device is
> >> added different way than the normal PF device(which happens via
> >> of_create_pci_dev for sparc), some of the arch specific initialization
> >> does not happen for VF device. That causes panic when archdata is
> >> accessed.
> >>
> >> To fix this, I have used already defined weak function
> >> pcibios_setup_device to copy archdata from PF to VF.
> >> Also verified the fix.
> >>
> >> Signed-off-by: Babu Moger <[email protected]>
> >> Signed-off-by: Sowmini Varadhan <[email protected]>
> >> Reviewed-by: Ethan Zhao <[email protected]>
> >
> > Looks good, applied and queued up for -stable, thanks.
>
> Thanks.
>
> >
> > Just a note, I am assuming that the VFs are not instantiated in the
> > device tree. Because when you just memcpy the arch data over from the
> > PF, one thing we end up doing is using the device node of the PF.
>
> No. VFs are not instantiated in device tree(/proc/device-tree)
>
> >
> > I slightly cringed at the memcpy, because at least one of these
> > pointers are to objects which are reference counted, the OF device.
> >
> > Generally speaking we don't really support hot-plug for OF probed
> > devices, but if we did all of the device tree pointers have to be
> > refcounted properly.
> >
> > So in the long term that whole sequence where we go:
> >
> > struct dev_archdata *sd;
> > ...
> > sd = &dev->dev.archdata;
> > sd->iommu = pbm->iommu;
> > sd->stc = &pbm->stc;
> > sd->host_controller = pbm;
> > sd->op = op = of_find_device_by_node(node);
> > sd->numa_node = pbm->numa_node;
> >
> > should be encapsulated into a helper function, and both
> > of_create_pci_dev() and this new pcibios_setup_device() can
> > invoke it.
> >
>
> Yes. Agree. We need to refactor the whole of_create_pci_dev path to support
> hot-plug for the long term. I will start looking at it. For now we should be
> fine with the current patch. thanks

of_create_pci_dev() duplicates a lot of the code in
pci_setup_device(). I wish we didn't have to do that
because it's easy to let them get out of sync, but I
don't know if there are any reasonable alternatives.

I've wondered in the past whether it would be possible
to use the pci_setup_device() path on sparc & powerpc by
writing PCI config accessors that look up OF properties
as needed to fabricate responses to config reads. Several
of the drivers in drivers/pci/host/* do a little bit of
this fabrication, although I don't think any go to the
extent of using OF.
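
Very roughly, and only as a userspace sketch of the idea, such an accessor
could map config offsets to OF property names and fabricate the value from
the property. Everything here is hypothetical: of_node_stub,
of_stub_read_u32() and fabricated_config_read() are not kernel interfaces,
only three registers are handled, and the sample values used to exercise it
are arbitrary. The property names are the ones from the OF PCI binding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard PCI config space offsets. */
#define CFG_VENDOR_ID   0x00
#define CFG_DEVICE_ID   0x02
#define CFG_REVISION_ID 0x08

/* Hypothetical stand-in for a device tree node with integer properties. */
struct of_prop_stub {
	const char *name;
	uint32_t value;
};

struct of_node_stub {
	const struct of_prop_stub *props;
	size_t nprops;
};

/* Look up an integer property by name; returns 0 on success, -1 if absent. */
static int of_stub_read_u32(const struct of_node_stub *np, const char *name,
			    uint32_t *out)
{
	for (size_t i = 0; i < np->nprops; i++) {
		if (strcmp(np->props[i].name, name) == 0) {
			*out = np->props[i].value;
			return 0;
		}
	}
	return -1;
}

/* All-ones value for an access of 'size' bytes (1, 2 or 4). */
static uint32_t size_mask(int size)
{
	return size == 1 ? 0xffu : size == 2 ? 0xffffu : 0xffffffffu;
}

/*
 * Fabricate a config read from OF properties. Unknown registers or missing
 * properties read back as all ones and return -1.
 */
static int fabricated_config_read(const struct of_node_stub *np, int where,
				  int size, uint32_t *val)
{
	const char *prop;
	uint32_t raw;

	assert(np != NULL && val != NULL);

	switch (where) {
	case CFG_VENDOR_ID:
		prop = "vendor-id";
		break;
	case CFG_DEVICE_ID:
		prop = "device-id";
		break;
	case CFG_REVISION_ID:
		prop = "revision-id";
		break;
	default:
		prop = NULL;
		break;
	}

	if (prop == NULL || of_stub_read_u32(np, prop, &raw) != 0) {
		*val = size_mask(size);
		return -1;
	}

	*val = raw & size_mask(size);
	return 0;
}
```

A real version would have to cover BARs, class code, interrupt routing and
capabilities as well, which is where most of the difficulty would be.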

Bjorn