Hello,
I tried to use SR-IOV virtualization for a Mellanox ConnectX-2 card with the
mlx4_core driver on kernel 3.5.0. I built firmware for the IB card with
sriov_en = true; lspci shows:
02:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
Subsystem: Super Micro Computer Inc Device 0048
Flags: bus master, fast devsel, latency 0, IRQ 24
Memory at fbd00000 (64-bit, non-prefetchable) [size=1M]
Memory at f8800000 (64-bit, prefetchable) [size=8M]
Capabilities: [40] Power Management version 3
Capabilities: [48] Vital Product Data
Capabilities: [9c] MSI-X: Enable+ Count=128 Masked-
Capabilities: [60] Express Endpoint, MSI 00
Capabilities: [100] Alternative Routing-ID Interpretation (ARI)
Capabilities: [148] Device Serial Number 00-25-90-ff-ff-28-09-08
Capabilities: [108] Single Root I/O Virtualization (SR-IOV)
Kernel driver in use: mlx4_core
However, the driver complains:
[ 3.558221] mlx4_core 0000:02:00.0: Enabling sriov with:4 vfs
[ 3.558296] mlx4_core 0000:02:00.0: not enough MMIO resources for SR-IOV (nres: 0, iov->nres: 1)
[ 3.558299] mlx4_core 0000:02:00.0: Failed to enable sriov,continuing without sriov enabled (err = -12).
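For context, the err = -12 is -ENOMEM coming back from pci_enable_sriov(). The mlx4 probe path looks roughly like this (a paraphrased sketch of the 3.5 drivers/net/ethernet/mellanox/mlx4/main.c, not verbatim):

	/* sketch: mlx4_core probe, with num_vfs=4 set as a module parameter */
	if (num_vfs) {
		mlx4_warn(dev, "Enabling sriov with:%d vfs\n", num_vfs);
		err = pci_enable_sriov(pdev, num_vfs);
		if (err) {
			mlx4_err(dev, "Failed to enable sriov,"
				 "continuing without sriov enabled (err = %d).\n",
				 err);
			err = 0;	/* fall back to running as a plain PF */
		}
	}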
Is there any workaround for this? Or is the bug in the BIOS, so that without a
proper fix this is never going to work?
Or perhaps there is someone more suitable for this kind of question?
--
Lukáš Hejtmánek
On Wed, Aug 1, 2012 at 6:38 AM, Lukas Hejtmanek <[email protected]> wrote:
> [ 3.558296] mlx4_core 0000:02:00.0: not enough MMIO resources for SR-IOV (nres: 0, iov->nres: 1)
This comes from the core sriov_enable() function, not anything in mlx4.
(although my kernel doesn't have the print of nres in that message)
Not sure what it means.
On Wed, Aug 1, 2012 at 10:37 AM, Roland Dreier <[email protected]> wrote:
> On Wed, Aug 1, 2012 at 6:38 AM, Lukas Hejtmanek <[email protected]> wrote:
>> [ 3.558296] mlx4_core 0000:02:00.0: not enough MMIO resources for SR-IOV (nres: 0, iov->nres: 1)
>
> This comes from the core sriov_enable() function, not anything in mlx4.
> (although my kernel doesn't have the print of nres in that message)
>
> Not sure what it means.
The IOV BAR is not assigned by the BIOS, and the kernel cannot find a range for it either.
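That matches the check in sriov_enable(): an IOV resource only counts if it was actually assigned under a parent window. Roughly (a paraphrased sketch of drivers/pci/iov.c, not verbatim):

	/* sketch of the failing check in sriov_enable() */
	nres = 0;
	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
		res = dev->resource + PCI_IOV_RESOURCES + i;
		if (res->parent)	/* counted only if placed in a window */
			nres++;
	}
	if (nres != iov->nres) {
		dev_err(&dev->dev, "not enough MMIO resources for SR-IOV\n");
		return -ENOMEM;		/* the -12 mlx4 reports */
	}

So nres: 0 means your SR-IOV BAR was sized but never placed anywhere.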
Lukas, can you post the whole boot log with PCI_DEBUG enabled? That will
tell us exactly why the kernel does not assign it.
Recent kernels (3.4 and later) should enable realloc when the SR-IOV BAR is not assigned.
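The auto-enable is roughly the following sketch, paraphrased from drivers/pci/setup-bus.c; exact names may differ slightly:

	/* sketch: if any IOV BAR was sized but never assigned an address,
	 * turn on automatic resource reallocation */
	static void __init pci_realloc_detect(void)
	{
	#ifdef CONFIG_PCI_IOV
		struct pci_dev *dev = NULL;
		int i;

		if (pci_realloc_enable != undefined)	/* user forced pci=realloc= */
			return;

		for_each_pci_dev(dev) {
			for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++) {
				struct resource *r = &dev->resource[i];

				if (r->flags && !r->start) {
					pci_realloc_enable = auto_enabled;
					return;
				}
			}
		}
	#endif
	}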
Thanks
Yinghai
On Wed, Aug 01, 2012 at 11:29:02AM -0700, Yinghai Lu wrote:
> On Wed, Aug 1, 2012 at 10:37 AM, Roland Dreier <[email protected]> wrote:
> > On Wed, Aug 1, 2012 at 6:38 AM, Lukas Hejtmanek <[email protected]> wrote:
> >> [ 3.558296] mlx4_core 0000:02:00.0: not enough MMIO resources for SR-IOV (nres: 0, iov->nres: 1)
> >
> > This comes from the core sriov_enable() function, not anything in mlx4.
> > (although my kernel doesn't have the print of nres in that message)
> >
> > Not sure what it means.
>
> The IOV BAR is not assigned by the BIOS, and the kernel cannot find a range for it either.
>
> Lukas, can you post the whole boot log with PCI_DEBUG enabled? That will
> tell us exactly why the kernel does not assign it.
>
> Recent kernels (3.4 and later) should enable realloc when the SR-IOV BAR is not assigned.
Here is the full boot log:
http://www.fi.muni.cz/~xhejtman/dmesg.log
Weird: with PCI_DEBUG it does not load the mlx driver at all.
--
Lukáš Hejtmánek
On Wed, Aug 01, 2012 at 11:29:02AM -0700, Yinghai Lu wrote:
> The IOV BAR is not assigned by the BIOS, and the kernel cannot find a range for it either.
>
> Lukas, can you post the whole boot log with PCI_DEBUG enabled? That will
> tell us exactly why the kernel does not assign it.
>
> Recent kernels (3.4 and later) should enable realloc when the SR-IOV BAR is not assigned.
Sorry for the confusion: PCI_DEBUG does not break the mlx driver; it is the
reallocation code that results in:
[ 3.555008] mlx4_core 0000:02:00.0: Missing UAR, aborting.
--
Lukáš Hejtmánek
On Wed, Aug 1, 2012 at 1:28 PM, Lukas Hejtmanek <[email protected]> wrote:
> On Wed, Aug 01, 2012 at 11:29:02AM -0700, Yinghai Lu wrote:
>> On Wed, Aug 1, 2012 at 10:37 AM, Roland Dreier <[email protected]> wrote:
>> > On Wed, Aug 1, 2012 at 6:38 AM, Lukas Hejtmanek <[email protected]> wrote:
>> >> [ 3.558296] mlx4_core 0000:02:00.0: not enough MMIO resources for SR-IOV (nres: 0, iov->nres: 1)
>> >
>> > This comes from the core sriov_enable() function, not anything in mlx4.
>> > (although my kernel doesn't have the print of nres in that message)
>> >
>> > Not sure what it means.
>>
>> The IOV BAR is not assigned by the BIOS, and the kernel cannot find a range for it either.
>>
>> Lukas, can you post the whole boot log with PCI_DEBUG enabled? That will
>> tell us exactly why the kernel does not assign it.
>>
>> Recent kernels (3.4 and later) should enable realloc when the SR-IOV BAR is not assigned.
>
> Here is the full boot log:
> http://www.fi.muni.cz/~xhejtman/dmesg.log
>
> Weird: with PCI_DEBUG it does not load the mlx driver at all.
[ 0.699280] pci 0000:02:00.0: [15b3:673c] type 00 class 0x0c0600
[ 0.699529] pci 0000:02:00.0: reg 10: [mem 0xfbd00000-0xfbdfffff 64bit]
[ 0.699726] pci 0000:02:00.0: reg 18: [mem 0xf8800000-0xf8ffffff 64bit pref]
[ 0.701577] pci 0000:02:00.0: reg 134: [mem 0x00000000-0x007fffff 64bit pref]
[ 0.710975] pci 0000:00:03.0: PCI bridge to [bus 02-02]
[ 0.711044] pci 0000:00:03.0: bridge window [mem 0xfbd00000-0xfbdfffff]
[ 0.711049] pci 0000:00:03.0: bridge window [mem 0xf8800000-0xf8ffffff 64bit pref]
...
The kernel tries to clear the bridge, but still cannot find the range:
[ 0.761807] PCI: No. 2 try to assign unassigned res
[ 0.761808] release child resource [mem 0xf8800000-0xf8ffffff 64bit pref]
[ 0.761811] pci 0000:00:03.0: resource 15 [mem 0xf8800000-0xf8ffffff 64bit pref] released
[ 0.761813] pci 0000:00:03.0: PCI bridge to [bus 02-02]
[ 0.761881] release child resource [mem 0xfbc1c000-0xfbc1ffff]
[ 0.761882] release child resource [mem 0xfbc20000-0xfbc3ffff pref]
[ 0.761883] release child resource [mem 0xfbc40000-0xfbc5ffff]
[ 0.761884] release child resource [mem 0xfbc60000-0xfbc7ffff]
[ 0.761885] release child resource [mem 0xfbc9c000-0xfbc9ffff]
[ 0.761886] release child resource [mem 0xfbca0000-0xfbcbffff pref]
[ 0.761887] release child resource [mem 0xfbcc0000-0xfbcdffff]
[ 0.761888] release child resource [mem 0xfbce0000-0xfbcfffff]
[ 0.761891] pci 0000:00:01.0: resource 14 [mem 0xfbc00000-0xfbcfffff] released
[ 0.761893] pci 0000:00:01.0: PCI bridge to [bus 01-01]
[ 0.761967] pci 0000:00:01.0: bridge window [mem 0x00100000-0x001fffff] to [bus 01-01] add_size 100000
[ 0.761974] pci 0000:00:03.0: bridge window [mem 0x00800000-0x00ffffff 64bit pref] to [bus 02-02] add_size 20000000
[ 0.761999] pci 0000:00:03.0: res[15]=[mem 0x00800000-0x00ffffff 64bit pref] get_res_add_size add_size 20000000
[ 0.762002] pci 0000:00:01.0: res[14]=[mem 0x00100000-0x001fffff] get_res_add_size add_size 100000
[ 0.762006] pci 0000:00:03.0: BAR 15: can't assign mem pref (size 0x20800000)
[ 0.762076] pci 0000:00:01.0: BAR 14: assigned [mem 0xc0000000-0xc01fffff]
[ 0.767124] pci 0000:00:01.0: BAR 15: assigned [mem 0xc0200000-0xc02fffff pref]
[ 0.767218] pci 0000:01:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
[ 0.767229] pci 0000:01:00.0: reg 190: [mem 0x00000000-0x00003fff 64bit]
[ 0.767240] pci 0000:01:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
[ 0.767252] pci 0000:01:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
[ 0.767263] pci 0000:01:00.0: reg 190: [mem 0x00000000-0x00003fff 64bit]
[ 0.767274] pci 0000:01:00.1: reg 184: [mem 0x00000000-0x00003fff 64bit]
[ 0.767285] pci 0000:01:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
[ 0.767296] pci 0000:01:00.0: reg 190: [mem 0x00000000-0x00003fff 64bit]
[ 0.767307] pci 0000:01:00.1: reg 190: [mem 0x00000000-0x00003fff 64bit]
[ 0.767318] pci 0000:01:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
[ 0.767329] pci 0000:01:00.0: reg 190: [mem 0x00000000-0x00003fff 64bit]
[ 0.767340] pci 0000:01:00.1: reg 184: [mem 0x00000000-0x00003fff 64bit]
[ 0.767347] pci 0000:01:00.0: res[7]=[mem 0x00000000-0xffffffffffffffff 64bit] get_res_add_size add_size 20000
[ 0.767349] pci 0000:01:00.0: res[10]=[mem 0x00000000-0xffffffffffffffff 64bit] get_res_add_size add_size 20000
[ 0.767351] pci 0000:01:00.1: res[7]=[mem 0x00000000-0xffffffffffffffff 64bit] get_res_add_size add_size 20000
[ 0.767354] pci 0000:01:00.1: res[10]=[mem 0x00000000-0xffffffffffffffff 64bit] get_res_add_size add_size 20000
[ 0.767356] pci 0000:01:00.0: BAR 0: assigned [mem 0xc0000000-0xc001ffff]
[ 0.767427] pci 0000:01:00.0: BAR 1: assigned [mem 0xc0020000-0xc003ffff]
[ 0.767497] pci 0000:01:00.0: BAR 6: assigned [mem 0xc0200000-0xc021ffff pref]
[ 0.767580] pci 0000:01:00.1: BAR 0: assigned [mem 0xc0040000-0xc005ffff]
[ 0.767651] pci 0000:01:00.1: BAR 1: assigned [mem 0xc0060000-0xc007ffff]
[ 0.767722] pci 0000:01:00.1: BAR 6: assigned [mem 0xc0220000-0xc023ffff pref]
[ 0.767804] pci 0000:01:00.0: BAR 3: assigned [mem 0xc0080000-0xc0083fff]
[ 0.767884] pci 0000:01:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
[ 0.767886] pci 0000:01:00.0: BAR 7: assigned [mem 0xc0084000-0xc00a3fff 64bit]
[ 0.767981] pci 0000:01:00.0: reg 190: [mem 0x00000000-0x00003fff 64bit]
[ 0.767983] pci 0000:01:00.0: BAR 10: assigned [mem 0xc00a4000-0xc00c3fff 64bit]
[ 0.768070] pci 0000:01:00.1: BAR 3: assigned [mem 0xc00c4000-0xc00c7fff]
[ 0.768149] pci 0000:01:00.1: reg 184: [mem 0x00000000-0x00003fff 64bit]
[ 0.768151] pci 0000:01:00.1: BAR 7: assigned [mem 0xc00c8000-0xc00e7fff 64bit]
[ 0.768246] pci 0000:01:00.1: reg 190: [mem 0x00000000-0x00003fff 64bit]
[ 0.768248] pci 0000:01:00.1: BAR 10: assigned [mem 0xc00e8000-0xc0107fff 64bit]
[ 0.768336] pci 0000:00:01.0: PCI bridge to [bus 01-01]
[ 0.768403] pci 0000:00:01.0: bridge window [io 0xe000-0xefff]
[ 0.768472] pci 0000:00:01.0: bridge window [mem 0xc0000000-0xc01fffff]
[ 0.768542] pci 0000:00:01.0: bridge window [mem 0xc0200000-0xc02fffff pref]
[ 0.768823] pci 0000:02:00.0: reg 134: [mem 0x00000000-0x007fffff 64bit pref]
[ 0.768826] pci 0000:02:00.0: res[9]=[mem 0x00000000-0xffffffffffffffff 64bit pref] get_res_add_size add_size 20000000
[ 0.768829] pci 0000:02:00.0: BAR 2: can't assign mem pref (size 0x800000)
[ 0.769094] pci 0000:02:00.0: reg 134: [mem 0x00000000-0x007fffff 64bit pref]
[ 0.769096] pci 0000:02:00.0: BAR 9: can't assign mem pref (size 0x20000000)
[ 0.769166] pci 0000:02:00.0: BAR 2: can't assign mem pref (size 0x800000)
[ 0.769430] pci 0000:02:00.0: reg 134: [mem 0x00000000-0x007fffff 64bit pref]
[ 0.769432] pci 0000:02:00.0: BAR 9: can't assign mem pref (size 0x20000000)
[ 0.769501] pci 0000:00:03.0: PCI bridge to [bus 02-02]
[ 0.769568] pci 0000:00:03.0: bridge window [mem 0xfbd00000-0xfbdfffff]
...
_CRS does not provide a 64-bit resource range. The largest window, [mem 0xc0000000-0xdfffffff], is only 512MB, so the 0x20000000 (512MB) IOV BAR cannot fit below 4G together with everything else, and the kernel has nowhere above 4G to put it:
[ 0.688670] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[ 0.688846] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
[ 0.689065] pci_root PNP0A08:00: host bridge window [io 0x0000-0x0cf7]
[ 0.689134] pci_root PNP0A08:00: host bridge window [io 0x0d00-0xffff]
[ 0.689202] pci_root PNP0A08:00: host bridge window [mem 0x000a0000-0x000bffff]
[ 0.689285] pci_root PNP0A08:00: host bridge window [mem 0x000d0000-0x000dffff]
[ 0.689368] pci_root PNP0A08:00: host bridge window [mem 0xc0000000-0xdfffffff]
[ 0.689451] pci_root PNP0A08:00: host bridge window [mem 0xf0000000-0xfed8ffff]
[ 0.689576] PCI host bridge to bus 0000:00
[ 0.689640] pci_bus 0000:00: root bus resource [io 0x0000-0x0cf7]
[ 0.689708] pci_bus 0000:00: root bus resource [io 0x0d00-0xffff]
[ 0.689775] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff]
[ 0.689844] pci_bus 0000:00: root bus resource [mem 0x000d0000-0x000dffff]
[ 0.689913] pci_bus 0000:00: root bus resource [mem 0xc0000000-0xdfffffff]
[ 0.689981] pci_bus 0000:00: root bus resource [mem 0xf0000000-0xfed8ffff]
You may try to boot with pci=nocrs.
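All pci=nocrs does is tell x86 not to trust the _CRS windows, so the kernel falls back to its own root bus windows and can place the 512MB IOV BAR outside what the BIOS advertised. Roughly (a paraphrased sketch; the flag and variable exist, the surrounding code is from memory):

	/* arch/x86/pci/common.c: parse the "pci=nocrs" option */
	} else if (!strcmp(str, "nocrs")) {
		pci_probe |= PCI_ROOT_NO_CRS;
		return NULL;

	/* arch/x86/pci/acpi.c: the flag then disables use of _CRS */
	if (pci_probe & PCI_ROOT_NO_CRS)
		pci_use_crs = false;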
Thanks
Yinghai
On Wed, Aug 1, 2012 at 1:56 PM, Lukas Hejtmanek <[email protected]> wrote:
> On Wed, Aug 01, 2012 at 11:29:02AM -0700, Yinghai Lu wrote:
>> The IOV BAR is not assigned by the BIOS, and the kernel cannot find a range for it either.
>>
>> Lukas, can you post the whole boot log with PCI_DEBUG enabled? That will
>> tell us exactly why the kernel does not assign it.
>>
>> Recent kernels (3.4 and later) should enable realloc when the SR-IOV BAR is not assigned.
>
> Sorry for the confusion: PCI_DEBUG does not break the mlx driver; it is the
> reallocation code that results in:
> [ 3.555008] mlx4_core 0000:02:00.0: Missing UAR, aborting.
Yes, I knew that.
One patch in my for-pci-next branch should address that:
http://git.kernel.org/?p=linux/kernel/git/yinghai/linux-yinghai.git;a=patch;h=fcce563f868e296f46a2eeaa88d6959bcee26a2d
Thanks
Yinghai
On Wed, Aug 01, 2012 at 02:27:34PM -0700, Yinghai Lu wrote:
> You may try to boot with pci=nocrs.
OK, with pci=nocrs I got:
02:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
02:00.1 InfiniBand: Mellanox Technologies MT25400 Family [ConnectX-2 Virtual Function] (rev b0)
02:00.2 InfiniBand: Mellanox Technologies MT25400 Family [ConnectX-2 Virtual Function] (rev b0)
02:00.3 InfiniBand: Mellanox Technologies MT25400 Family [ConnectX-2 Virtual Function] (rev b0)
02:00.4 InfiniBand: Mellanox Technologies MT25400 Family [ConnectX-2 Virtual Function] (rev b0)
So it works. Should I try your patch without pci=nocrs?
--
Lukáš Hejtmánek
On Wed, Aug 01, 2012 at 02:32:17PM -0700, Yinghai Lu wrote:
> Yes, I knew that.
>
> One patch in my for-pci-next branch should address that:
>
> http://git.kernel.org/?p=linux/kernel/git/yinghai/linux-yinghai.git;a=patch;h=fcce563f868e296f46a2eeaa88d6959bcee26a2d
This is probably only halfway there: the mlx driver loads, but it complains
about MMIO again:
[ 3.564844] mlx4_core: Mellanox ConnectX core driver v1.1 (Dec, 2011)
[ 3.564845] mlx4_core: Initializing 0000:02:00.0
[ 3.564967] mlx4_core 0000:02:00.0: Enabling sriov with:4 vfs
[ 3.565087] mlx4_core 0000:02:00.0: not enough MMIO resources for SR-IOV
[ 3.565402] mlx4_core 0000:02:00.0: Failed to enable sriov,continuing without sriov enabled (err = -12).
So it seems that pci=nocrs is a must now.
--
Lukáš Hejtmánek
On Wed, Aug 1, 2012 at 3:08 PM, Lukas Hejtmanek <[email protected]> wrote:
> On Wed, Aug 01, 2012 at 02:32:17PM -0700, Yinghai Lu wrote:
>> Yes, I knew that.
>>
>> One patch in my for-pci-next branch should address that:
>>
>> http://git.kernel.org/?p=linux/kernel/git/yinghai/linux-yinghai.git;a=patch;h=fcce563f868e296f46a2eeaa88d6959bcee26a2d
>
> This is probably only halfway there: the mlx driver loads, but it complains
> about MMIO again:
> [ 3.564844] mlx4_core: Mellanox ConnectX core driver v1.1 (Dec, 2011)
> [ 3.564845] mlx4_core: Initializing 0000:02:00.0
> [ 3.564967] mlx4_core 0000:02:00.0: Enabling sriov with:4 vfs
> [ 3.565087] mlx4_core 0000:02:00.0: not enough MMIO resources for SR-IOV
> [ 3.565402] mlx4_core 0000:02:00.0: Failed to enable sriov,continuing without sriov enabled (err = -12).
Yes, that is right: it will make BAR 2 get a fallback resource again.
>
> So it seems that pci=nocrs is a must now.
Yes. Or you need the BIOS to provide SR-IOV support, or a 64-bit resource range in _CRS.
Yinghai
On Wed, Aug 01, 2012 at 04:36:14PM -0700, Yinghai Lu wrote:
> > So it seems that pci=nocrs is a must now.
>
> Yes. Or you need the BIOS to provide SR-IOV support, or a 64-bit resource range in _CRS.
Well, I can use PCI passthrough in Xen now; however, it seems SR-IOV does not
work with the Mellanox mlx4 driver.
With the stock 3.5 kernel, I get these messages in the virtual domain:
[ 2.666623] mlx4_core: Mellanox ConnectX core driver v1.1 (Dec, 2011)
[ 2.666635] mlx4_core: Initializing 0000:00:00.1
[ 2.666717] mlx4_core 0000:00:00.1: enabling device (0000 -> 0002)
[ 2.666975] mlx4_core 0000:00:00.1: Xen PCI mapped GSI0 to IRQ168
[ 2.667040] mlx4_core 0000:00:00.1: enabling bus mastering
[ 2.667184] mlx4_core 0000:00:00.1: Detected virtual function - running in slave mode
[ 2.667214] mlx4_core 0000:00:00.1: Sending reset
[ 2.667319] mlx4_core 0000:00:00.1: Sending vhcr0
[ 2.667886] mlx4_core 0000:00:00.1: HCA minimum page size:1
[ 2.668067] mlx4_core 0000:00:00.1: The host doesn't support eth interface
[ 2.668074] mlx4_core 0000:00:00.1: QUERY_FUNC_CAP command failed, aborting.
[ 2.668079] mlx4_core 0000:00:00.1: Failed to obtain slave caps
[ 2.668305] mlx4_core: probe of 0000:00:00.1 failed with error -93
Not sure what it means.
I also tried the OFED package from Mellanox, which seems to have better SR-IOV
support (at least mlx4_ib does not complain that SR-IOV is not supported).
However, it does not work with SR-IOV enabled:
[13677.034266] mlx4_core 0000:02:00.0: Running in master mode
[13689.278238] mlx4_core 0000:02:00.0: command 0x31 timed out (go bit not cleared)
[13689.278324] mlx4_core 0000:02:00.0: NOP command failed to generate MSI-X interrupt IRQ 241).
[13689.278399] mlx4_core 0000:02:00.0: Trying again without MSI-X.
[13699.286473] mlx4_core 0000:02:00.0: command 0x31 timed out (go bit not cleared)
[13699.286557] mlx4_core 0000:02:00.0: NOP command failed to generate interrupt (IRQ 237), aborting.
[13699.286633] mlx4_core 0000:02:00.0: BIOS or ACPI interrupt routing problem?
[13701.406680] mlx4_core: probe of 0000:02:00.0 failed with error -16
If I disable SR-IOV mode for this driver, it works OK. Could the interrupt
problem be BIOS-related? I.e., will it not work until I get a BIOS that properly
supports SR-IOV with the Mellanox card?
--
Lukáš Hejtmánek
Sorry about top-posting; I am using a webmail client.
It looks like you are using PV PCI passthrough? If so, did you
remember to use 'iommu=soft' to enable the Xen-SWIOTLB in your guest?
And are you booting with more than 4GB? Or with less than 3GB (so that you have
a nice gap in the E820 map)?
----- Original Message -----
From: [email protected]
To: [email protected]
Cc: [email protected], [email protected], [email protected]
Sent: Friday, August 3, 2012 4:34:03 AM GMT -05:00 US/Canada Eastern
Subject: Re: mellanox mlx4_core and SR-IOV
On Wed, Aug 01, 2012 at 04:36:14PM -0700, Yinghai Lu wrote:
> > So it seems that pci=nocrs is a must now.
>
> Yes. Or you need the BIOS to provide SR-IOV support, or a 64-bit resource range in _CRS.
Well, I can use PCI passthrough in Xen now; however, it seems SR-IOV does not
work with the Mellanox mlx4 driver.
With the stock 3.5 kernel, I get these messages in the virtual domain:
[ 2.666623] mlx4_core: Mellanox ConnectX core driver v1.1 (Dec, 2011)
[ 2.666635] mlx4_core: Initializing 0000:00:00.1
[ 2.666717] mlx4_core 0000:00:00.1: enabling device (0000 -> 0002)
[ 2.666975] mlx4_core 0000:00:00.1: Xen PCI mapped GSI0 to IRQ168
[ 2.667040] mlx4_core 0000:00:00.1: enabling bus mastering
[ 2.667184] mlx4_core 0000:00:00.1: Detected virtual function - running in slave mode
[ 2.667214] mlx4_core 0000:00:00.1: Sending reset
[ 2.667319] mlx4_core 0000:00:00.1: Sending vhcr0
[ 2.667886] mlx4_core 0000:00:00.1: HCA minimum page size:1
[ 2.668067] mlx4_core 0000:00:00.1: The host doesn't support eth interface
[ 2.668074] mlx4_core 0000:00:00.1: QUERY_FUNC_CAP command failed, aborting.
[ 2.668079] mlx4_core 0000:00:00.1: Failed to obtain slave caps
[ 2.668305] mlx4_core: probe of 0000:00:00.1 failed with error -93
Not sure what it means.
I also tried the OFED package from Mellanox, which seems to have better SR-IOV
support (at least mlx4_ib does not complain that SR-IOV is not supported).
However, it does not work with SR-IOV enabled:
[13677.034266] mlx4_core 0000:02:00.0: Running in master mode
[13689.278238] mlx4_core 0000:02:00.0: command 0x31 timed out (go bit not cleared)
[13689.278324] mlx4_core 0000:02:00.0: NOP command failed to generate MSI-X interrupt IRQ 241).
[13689.278399] mlx4_core 0000:02:00.0: Trying again without MSI-X.
[13699.286473] mlx4_core 0000:02:00.0: command 0x31 timed out (go bit not cleared)
[13699.286557] mlx4_core 0000:02:00.0: NOP command failed to generate interrupt (IRQ 237), aborting.
[13699.286633] mlx4_core 0000:02:00.0: BIOS or ACPI interrupt routing problem?
[13701.406680] mlx4_core: probe of 0000:02:00.0 failed with error -16
If I disable SR-IOV mode for this driver, it works OK. Could the interrupt
problem be BIOS-related? I.e., will it not work until I get a BIOS that properly
supports SR-IOV with the Mellanox card?
--
Lukáš Hejtmánek
On Fri, Aug 3, 2012 at 1:33 AM, Lukas Hejtmanek <[email protected]> wrote:
> On Wed, Aug 01, 2012 at 04:36:14PM -0700, Yinghai Lu wrote:
>> > So it seems that pci=nocrs is a must now.
>>
>> Yes. Or you need the BIOS to provide SR-IOV support, or a 64-bit resource range in _CRS.
>
> Well, I can use PCI passthrough in Xen now; however, it seems SR-IOV does not
> work with the Mellanox mlx4 driver.
>
> With the stock 3.5 kernel, I get these messages in the virtual domain:
> [ 2.666623] mlx4_core: Mellanox ConnectX core driver v1.1 (Dec, 2011)
> [ 2.666635] mlx4_core: Initializing 0000:00:00.1
> [ 2.666717] mlx4_core 0000:00:00.1: enabling device (0000 -> 0002)
> [ 2.666975] mlx4_core 0000:00:00.1: Xen PCI mapped GSI0 to IRQ168
> [ 2.667040] mlx4_core 0000:00:00.1: enabling bus mastering
> [ 2.667184] mlx4_core 0000:00:00.1: Detected virtual function - running in slave mode
> [ 2.667214] mlx4_core 0000:00:00.1: Sending reset
> [ 2.667319] mlx4_core 0000:00:00.1: Sending vhcr0
> [ 2.667886] mlx4_core 0000:00:00.1: HCA minimum page size:1
> [ 2.668067] mlx4_core 0000:00:00.1: The host doesn't support eth interface
> [ 2.668074] mlx4_core 0000:00:00.1: QUERY_FUNC_CAP command failed, aborting.
> [ 2.668079] mlx4_core 0000:00:00.1: Failed to obtain slave caps
> [ 2.668305] mlx4_core: probe of 0000:00:00.1 failed with error -93
>
> Not sure what it means.
Did you check whether the SR-IOV BAR for that card is assigned in Dom0?
Can you try KVM with PCI passthrough?
I have only tried PCI passthrough with Intel igb and ixgbe SR-IOV devices with
KVM recently.
Please make sure you have intel_iommu=on ...
Thanks
Yinghai
Hi,
On Fri, Aug 03, 2012 at 06:49:59AM -0700, Konrad Wilk wrote:
> It looks like you are using PV PCI passthrough? If so, did you
> remember to use 'iommu=soft' to enable the Xen-SWIOTLB in your guest?
> And are you booting with more than 4GB? Or with less than 3GB (so that you have
> a nice gap in the E820 map)?
Good catch. I forgot to pass swiotlb=force for the DomU in Xen. So now it seems
that mlx4_core works, and mlx4_en (the Ethernet part) works as well. Unfortunately,
the IB part does not: the IB layer complains that SR-IOV is currently unsupported
(kernel 3.5.0). So no luck here so far.
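For reference, the check that refuses to load looks roughly like this in 3.5 (a from-memory sketch of drivers/infiniband/hw/mlx4/main.c, not verbatim):

	static void *mlx4_ib_add(struct mlx4_dev *dev)
	{
		...
		if (mlx4_is_mfunc(dev)) {	/* SR-IOV master or VF */
			pr_warn("IB not yet supported in SRIOV\n");
			return NULL;		/* no IB device gets registered */
		}
		...
	}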
There is an OFED stack directly from Mellanox that seems to support SR-IOV even
for the IB layer, but they have buildable sources only for RHEL/SLES kernels
(2.6.32), and even correcting the sources to get them to compile with 3.5.0 does
not make things work: the driver complains about interrupts not working, in Dom0
or even without the Xen hypervisor at all.
The only good point is that I managed to convince Supermicro (the board
manufacturer) that enabling SR-IOV in the BIOS leads to a BIOS lockup; they
confirmed it, and maybe they will provide a BIOS upgrade.
Thanks all.
--
Lukáš Hejtmánek
On Sun, Aug 05, 2012 at 10:05:00AM +0200, Lukas Hejtmanek wrote:
> Hi,
>
> On Fri, Aug 03, 2012 at 06:49:59AM -0700, Konrad Wilk wrote:
> > It looks like you are using PV PCI passthrough? If so, did you
> > remember to use 'iommu=soft' to enable the Xen-SWIOTLB in your guest?
> > And are you booting with more than 4GB? Or with less than 3GB (so that you have
> > a nice gap in the E820 map)?
>
> Good catch. I forgot to pass swiotlb=force for the DomU in Xen. So now it seems
> that mlx4_core works, and mlx4_en (the Ethernet part) works as well. Unfortunately,
> the IB part does not: the IB layer complains that SR-IOV is currently unsupported
> (kernel 3.5.0). So no luck here so far.
Don't use swiotlb=force; that is for the old-style kernels. Use iommu=soft.
>
> There is an OFED stack directly from Mellanox that seems to support SR-IOV even
> for the IB layer, but they have buildable sources only for RHEL/SLES kernels
> (2.6.32), and even correcting the sources to get them to compile with 3.5.0 does
> not make things work: the driver complains about interrupts not working, in Dom0
> or even without the Xen hypervisor at all.
So there is a bug that ... well, I thought I had fixed it in the IB layer, but
maybe not. It was about VM_IO having to be used on the vmaps being set up, but I
can't recall the details. Perhaps the InfiniBand mailing list archives have some
... ah, here it is:
http://old-list-archives.xen.org/archives/html/xen-devel/2011-01/msg00246.html
>
> The only good point is that I managed to convince Supermicro (the board
> manufacturer) that enabling SR-IOV in the BIOS leads to a BIOS lockup; they
> confirmed it, and maybe they will provide a BIOS upgrade.
>
> Thanks all.
>
> --
> Lukáš Hejtmánek
On Mon, Aug 06, 2012 at 10:07:06AM -0400, Konrad Rzeszutek Wilk wrote:
> > Good catch. I forgot to pass swiotlb=force for the DomU in Xen. So now it seems
> > that mlx4_core works, and mlx4_en (the Ethernet part) works as well. Unfortunately,
> > the IB part does not: the IB layer complains that SR-IOV is currently unsupported
> > (kernel 3.5.0). So no luck here so far.
>
> Don't use swiotlb=force; that is for the old-style kernels. Use iommu=soft.
OK.
> > There is an OFED stack directly from Mellanox that seems to support SR-IOV even
> > for the IB layer, but they have buildable sources only for RHEL/SLES kernels
> > (2.6.32), and even correcting the sources to get them to compile with 3.5.0 does
> > not make things work: the driver complains about interrupts not working, in Dom0
> > or even without the Xen hypervisor at all.
>
> So there is a bug that ... well, I thought I had fixed it in the IB layer, but
> maybe not. It was about VM_IO having to be used on the vmaps being set up, but I
> can't recall the details. Perhaps the InfiniBand mailing list archives have some
> ... ah, here it is:
> http://old-list-archives.xen.org/archives/html/xen-devel/2011-01/msg00246.html
Not sure what you mean. Is this fix meant to make the Mellanox OFED driver work, or the stock kernel?
The stock kernel contains an explicit check for SR-IOV and refuses to load.
This is the exact failure of the Mellanox OFED driver:
kernel: [ 6.568433] mlx4_core: Mellanox ConnectX core driver v1.0-mlnx_ofed1.5.3 (November 3, 2011)
kernel: [ 6.568526] mlx4_core: Initializing 0000:02:00.0
kernel: [ 7.071292] mlx4_core 0000:02:00.0: Enabling sriov with:1 vfs
kernel: [ 7.175587] mlx4_core 0000:02:00.0: Running in master mode
kernel: [ 18.613383] mlx4_core 0000:02:00.0: command 0x31 timed out (go bit not cleared)
kernel: [ 18.613475] mlx4_core 0000:02:00.0: NOP command failed to generate MSI-X interrupt IRQ 94).
kernel: [ 18.613564] mlx4_core 0000:02:00.0: Trying again without MSI-X.
kernel: [ 28.606086] mlx4_core 0000:02:00.0: command 0x31 timed out (go bit not cleared)
kernel: [ 30.615093] mlx4_core: probe of 0000:02:00.0 failed with error -16
--
Lukáš Hejtmánek
On 08/03/2012 02:33 AM, Lukas Hejtmanek wrote:
> I also tried the OFED package from Mellanox, which seems to have better SR-IOV
> support (at least mlx4_ib does not complain that SR-IOV is not supported).
> However, it does not work with SR-IOV enabled:
Last I heard they were not officially providing support for SR-IOV. Has
anyone heard otherwise from the Mellanox folks?
Chris
On Fri, Aug 10, 2012 at 12:51:53PM -0600, Chris Friesen wrote:
> On 08/03/2012 02:33 AM, Lukas Hejtmanek wrote:
> >I also tried the OFED package from Mellanox, which seems to have better SR-IOV
> >support (at least mlx4_ib does not complain that SR-IOV is not supported).
> >However, it does not work with SR-IOV enabled:
>
> Last I heard they were not officially providing support for SR-IOV.
> Has anyone heard otherwise from the Mellanox folks?
They have been talking about it for two years:
http://www.openfabrics.org/archives/spring2010sonoma/Monday/1.30%20Liran%20Liss%20I%3FO%20Virtualization/sriov_liss.ppt
These are modified OFED drivers that seem to contain SR-IOV code also for the IB layer:
http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=26&menu_section=34#tab-three
--
Lukáš Hejtmánek