2013-03-06 02:48:30

by Andrew Cooks

Subject: Re: Marvell 88NV9143 in mini-PCIe not working with intel_iommu=on

On Tue, Mar 5, 2013 at 11:03 PM, Gaudenz Steinlin <[email protected]> wrote:
>
> [ Sending this to the MVUMI driver authors and the IOMMU list as I can't
> tell which part is at fault. ]
>
> [ ... ]
> [ 4.342079] dmar: DRHD: handling fault status reg 2
> [ 4.342132] dmar: DMAR:[DMA Read] Request device [02:00.0] fault addr fffff000
> [ 4.342132] DMAR:[fault reason 02] Present bit in context entry is clear
> [ ... ]
> [ 34.344078] mvumi 0000:02:00.1: no handshake response at state 0x2.
> [ 34.344115] mvumi 0000:02:00.1: isr : global=0x0,status=0x0.
> [ 34.344146] mvumi 0000:02:00.1: handshake failed at state 0x2.
> [ 34.344266] mvumi: probe of 0000:02:00.1 failed with error -22
>

Looks like another Marvell DMA source tag issue.

> And the full lspci output for this device:
>
> gaudenz@meteor:~$ sudo lspci -vv -nnq -s 02:
> 02:00.0 Mass storage controller [01ff]: Marvell Technology Group Ltd. Device [1b4b:91f3]
> Subsystem: Marvell Technology Group Ltd. Device [1b4b:9143]
> ...
> Capabilities: [140 v1] Virtual Channel
> Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
> Arb: Fixed- WRR32- WRR64- WRR128-
> Ctrl: ArbSelect=Fixed
> Status: InProgress-
> VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
> Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
> Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
> Status: NegoPending- InProgress-
>
> 02:00.1 Mass storage controller [0180]: Marvell Technology Group Ltd. Device [1b4b:9143] (rev 10)
> Subsystem: Marvell Technology Group Ltd. Device [1b4b:9143]
> ...
> Kernel driver in use: mvumi
>
> 02:00.2 Non-Volatile memory controller [0108]: Marvell Technology Group Ltd. Device [1b4b:91e3] (prog-if 01)
> Subsystem: Marvell Technology Group Ltd. Device [1b4b:9143]
>...

In this case it seems like a multifunction device with 02:00.1 being
the only function that the mvumi driver cares about. So my guess is
that 02:00.1 is issuing DMA with the incorrect tag of 02:00.0.

Perhaps Alex Williamson can tell us about iommu device groups, whether
it would be possible to group the functions together automatically and
whether that would solve the problem. It should also be possible to
adapt the quirk patch I posted recently to handle this, but I'm still
waiting to hear if that patch has a future.

2013-03-06 03:20:56

by Alex Williamson

Subject: Re: Marvell 88NV9143 in mini-PCIe not working with intel_iommu=on

On Wed, 2013-03-06 at 10:48 +0800, Andrew Cooks wrote:
> On Tue, Mar 5, 2013 at 11:03 PM, Gaudenz Steinlin <[email protected]> wrote:
> >
> > [ Sending this to the MVUMI driver authors and the IOMMU list as I can't
> > tell which part is at fault. ]
> >
> > [ ... ]
> > [ 4.342079] dmar: DRHD: handling fault status reg 2
> > [ 4.342132] dmar: DMAR:[DMA Read] Request device [02:00.0] fault addr fffff000
> > [ 4.342132] DMAR:[fault reason 02] Present bit in context entry is clear
> > [ ... ]
> > [ 34.344078] mvumi 0000:02:00.1: no handshake response at state 0x2.
> > [ 34.344115] mvumi 0000:02:00.1: isr : global=0x0,status=0x0.
> > [ 34.344146] mvumi 0000:02:00.1: handshake failed at state 0x2.
> > [ 34.344266] mvumi: probe of 0000:02:00.1 failed with error -22
> >
>
> Looks like another Marvell DMA source tag issue.
>
> > And the full lspci output for this device:
> >
> > gaudenz@meteor:~$ sudo lspci -vv -nnq -s 02:
> > 02:00.0 Mass storage controller [01ff]: Marvell Technology Group Ltd. Device [1b4b:91f3]
> > Subsystem: Marvell Technology Group Ltd. Device [1b4b:9143]
> > ...
> > Capabilities: [140 v1] Virtual Channel
> > Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
> > Arb: Fixed- WRR32- WRR64- WRR128-
> > Ctrl: ArbSelect=Fixed
> > Status: InProgress-
> > VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
> > Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
> > Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
> > Status: NegoPending- InProgress-
> >
> > 02:00.1 Mass storage controller [0180]: Marvell Technology Group Ltd. Device [1b4b:9143] (rev 10)
> > Subsystem: Marvell Technology Group Ltd. Device [1b4b:9143]
> > ...
> > Kernel driver in use: mvumi
> >
> > 02:00.2 Non-Volatile memory controller [0108]: Marvell Technology Group Ltd. Device [1b4b:91e3] (prog-if 01)
> > Subsystem: Marvell Technology Group Ltd. Device [1b4b:9143]
> >...
>
> In this case it seems like a multifunction device with 02:00.1 being
> the only function that the mvumi driver cares about. So my guess is
> that 02:00.1 is issuing DMA with the incorrect tag of 02:00.0.
>
> Perhaps Alex Williamson can tell us about iommu device groups, whether
> it would be possible to group the functions together automatically and
> whether that would solve the problem. It should also be possible to
> adapt the quirk patch I posted recently to handle this, but I'm still
> waiting to hear if that patch has a future.

I don't see any ACS support, so the functions are likely placed into the
same IOMMU group. If your guess about the source ID tagging is correct,
then oddly enough you could probably use VFIO to assign these devices to
a QEMU guest and it would work, because all the functions get the same
IOMMU mapping. The host IOMMU drivers don't make use of IOMMU groups for
the DMA API, though, so it's no help there. Thanks,

Alex

2013-03-07 09:25:39

by Gaudenz Steinlin

Subject: Re: Marvell 88NV9143 in mini-PCIe not working with intel_iommu=on


Hi Andrew

Andrew Cooks <[email protected]> writes:

> On Tue, Mar 5, 2013 at 11:03 PM, Gaudenz Steinlin <[email protected]> wrote:
>>
>> [ Sending this to the MVUMI driver authors and the IOMMU list as I can't
>> tell which part is at fault. ]
>>
>> [ ... ]
>> [ 4.342079] dmar: DRHD: handling fault status reg 2
>> [ 4.342132] dmar: DMAR:[DMA Read] Request device [02:00.0] fault addr fffff000
>> [ 4.342132] DMAR:[fault reason 02] Present bit in context entry is clear
>> [ ... ]
>> [ 34.344078] mvumi 0000:02:00.1: no handshake response at state 0x2.
>> [ 34.344115] mvumi 0000:02:00.1: isr : global=0x0,status=0x0.
>> [ 34.344146] mvumi 0000:02:00.1: handshake failed at state 0x2.
>> [ 34.344266] mvumi: probe of 0000:02:00.1 failed with error -22
>>
>
> Looks like another Marvell DMA source tag issue.

You are probably right about this. See below.

>
>> And the full lspci output for this device:
>>
>> gaudenz@meteor:~$ sudo lspci -vv -nnq -s 02:
>> 02:00.0 Mass storage controller [01ff]: Marvell Technology Group Ltd. Device [1b4b:91f3]
>> Subsystem: Marvell Technology Group Ltd. Device [1b4b:9143]
>> ...
>> Capabilities: [140 v1] Virtual Channel
>> Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
>> Arb: Fixed- WRR32- WRR64- WRR128-
>> Ctrl: ArbSelect=Fixed
>> Status: InProgress-
>> VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
>> Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
>> Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
>> Status: NegoPending- InProgress-
>>
>> 02:00.1 Mass storage controller [0180]: Marvell Technology Group Ltd. Device [1b4b:9143] (rev 10)
>> Subsystem: Marvell Technology Group Ltd. Device [1b4b:9143]
>> ...
>> Kernel driver in use: mvumi
>>
>> 02:00.2 Non-Volatile memory controller [0108]: Marvell Technology Group Ltd. Device [1b4b:91e3] (prog-if 01)
>> Subsystem: Marvell Technology Group Ltd. Device [1b4b:9143]
>>...
>
> In this case it seems like a multifunction device with 02:00.1 being
> the only function that the mvumi driver cares about. So my guess is
> that 02:00.1 is issuing DMA with the incorrect tag of 02:00.0.
>
> Perhaps Alex Williamson can tell us about iommu device groups, whether
> it would be possible to group the functions together automatically and
> whether that would solve the problem. It should also be possible to
> adapt the quirk patch I posted recently to handle this, but I'm still
> waiting to hear if that patch has a future.

I adapted your quirk patch to my device and it works. As I'm very new to
this I don't know if my modifications are right or if there is a better
way to do this. Diff on top of the latest version of the quirk you
posted to the iommu list:

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 13323f2..0ba462a 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1684,7 +1684,7 @@ static int map_ghost_dma_fn(struct dmar_domain *domain,

 	fn_map = pci_get_dma_source_map(pdev);
 
-	for (fn = 1; fn < 8; fn++) {
+	for (fn = 0; fn < 8; fn++) {
 		if (fn_map & (1 << fn)) {
 			err = domain_context_mapping_one(domain,
 					pci_domain_nr(pdev->bus),
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index d311100..21b664b 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3314,6 +3314,7 @@ static const struct pci_dev_dma_multi_func_sources {
 	{ PCI_VENDOR_ID_MARVELL_2, 0x9128, (1<<0)|(1<<1)},
 	{ PCI_VENDOR_ID_MARVELL_2, 0x9130, (1<<0)|(1<<1)},
 	{ PCI_VENDOR_ID_MARVELL_2, 0x9172, (1<<0)|(1<<1)},
+	{ PCI_VENDOR_ID_MARVELL_2, 0x9143, (1<<0)},
 	{ 0 }
 };

Gaudenz

--
Ever tried. Ever failed. No matter.
Try again. Fail again. Fail better.
~ Samuel Beckett ~

2013-03-08 02:53:20

by Andrew Cooks

Subject: Re: Marvell 88NV9143 in mini-PCIe not working with intel_iommu=on

On Thu, Mar 7, 2013 at 5:25 PM, Gaudenz Steinlin <[email protected]> wrote:
>
> Hi Andrew
>
> Andrew Cooks <[email protected]> writes:
>
>> On Tue, Mar 5, 2013 at 11:03 PM, Gaudenz Steinlin <[email protected]> wrote:
>>>
>>> [ Sending this to the MVUMI driver authors and the IOMMU list as I can't
>>> tell which part is at fault. ]
>>>
>>> [ ... ]
>>> [ 4.342079] dmar: DRHD: handling fault status reg 2
>>> [ 4.342132] dmar: DMAR:[DMA Read] Request device [02:00.0] fault addr fffff000
>>> [ 4.342132] DMAR:[fault reason 02] Present bit in context entry is clear
>>> [ ... ]
>>> [ 34.344078] mvumi 0000:02:00.1: no handshake response at state 0x2.
>>> [ 34.344115] mvumi 0000:02:00.1: isr : global=0x0,status=0x0.
>>> [ 34.344146] mvumi 0000:02:00.1: handshake failed at state 0x2.
>>> [ 34.344266] mvumi: probe of 0000:02:00.1 failed with error -22
>>>
>>
>> Looks like another Marvell DMA source tag issue.
>
> You are probably right about this. See below.
>
>>
>>> And the full lspci output for this device:
>>>
>>> gaudenz@meteor:~$ sudo lspci -vv -nnq -s 02:
>>> 02:00.0 Mass storage controller [01ff]: Marvell Technology Group Ltd. Device [1b4b:91f3]
>>> Subsystem: Marvell Technology Group Ltd. Device [1b4b:9143]
>>> ...
>>> Capabilities: [140 v1] Virtual Channel
>>> Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
>>> Arb: Fixed- WRR32- WRR64- WRR128-
>>> Ctrl: ArbSelect=Fixed
>>> Status: InProgress-
>>> VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
>>> Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
>>> Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
>>> Status: NegoPending- InProgress-
>>>
>>> 02:00.1 Mass storage controller [0180]: Marvell Technology Group Ltd. Device [1b4b:9143] (rev 10)
>>> Subsystem: Marvell Technology Group Ltd. Device [1b4b:9143]
>>> ...
>>> Kernel driver in use: mvumi
>>>
>>> 02:00.2 Non-Volatile memory controller [0108]: Marvell Technology Group Ltd. Device [1b4b:91e3] (prog-if 01)
>>> Subsystem: Marvell Technology Group Ltd. Device [1b4b:9143]
>>>...
>>
>> In this case it seems like a multifunction device with 02:00.1 being
>> the only function that the mvumi driver cares about. So my guess is
>> that 02:00.1 is issuing DMA with the incorrect tag of 02:00.0.
>>
>> Perhaps Alex Williamson can tell us about iommu device groups, whether
>> it would be possible to group the functions together automatically and
>> whether that would solve the problem. It should also be possible to
>> adapt the quirk patch I posted recently to handle this, but I'm still
>> waiting to hear if that patch has a future.
>
> I adapted your quirk patch to my device and it works.

Thanks for testing it!

> As I'm very new to this I don't know if my modifications are right or if there is a better
> way to do this.

Don't worry, I'm also pretty new here.

For this device, I think the quirk used for Ricoh devices that's
already in the mainline kernel would also work, because both function
0 and function 1 are known and the device only seems to use function
0. However, since it is another Marvell device, it may look out of
place in that table.

I'm hoping one of the veteran developers can give some guidance on
the best approach to enabling more devices in the future.

> Diff on top of the latest version of the quirk you
> posted to the iommu list:
>
I included this in v4 of the patch I posted, before I saw your message.

Thanks.

2013-03-26 15:14:43

by Joerg Roedel

Subject: Re: Marvell 88NV9143 in mini-PCIe not working with intel_iommu=on

On Fri, Mar 08, 2013 at 10:53:16AM +0800, Andrew Cooks wrote:
> I'm hoping one of the veteran developers can give some guidance on
> the best approach to enabling more devices in the future.

In cases like this where a device just uses the request-id of another
device you can make use of the existing quirk-code in
drivers/pci/quirks.c. Look at the function pci_get_dma_source() and add
your device there. This should also make it work with VT-d and AMD-Vi.


Joerg