2014-09-06 04:01:48

by Alex Williamson

Subject: Re: ath9k and AMD IOMMU alias breakage on 3.16?

On Fri, 2014-09-05 at 20:00 -0700, Jason Newton wrote:
> Hi,
>
> I have an AR9462 connected over minipcie, it came with the ASRock
> FM2A88x-itx motherboard and I'm using an AMD A10-7850K cpu with it. When I
> have IOMMU enabled, and this is desirable for opencl related things, the
> AR9462 malfunctions with these two errors occurring:
>
> AMD-Vi: Event logged [INVALID_DEVICE_REQUEST device=00:00.1
> address=0x000000fdf8080020 flags=0x0a00]
>
> Followed by this looping error, which reoccurs shortly after associating
> and obtaining an IP. I've attached a large snippet from dmesg, due to its
> length.
>
> I'm not sure where the error is coming from but I did see that Alex
> Williamson posted a patch that dealt with aliased devices (like pci
> bridges) and AMD's IOMMU issues, see
> e028a9e6b8a637af09ac4114083280df4a7045f1 for reference.
>
> I then disable IOMMU in the bios, and immediately the ath9k/AR9462 pair is
> working without flaw or retries. Bios is up to date btw, updated it after
> building the machine last friday.
>
> So anyone have a clue what's going on here?

Please boot with "amd_iommu_dump" on the kernel boot line and send the
full dmesg log and 'sudo lspci -vvv' output. Thanks,

Alex



2014-09-08 10:32:06

by Joerg Roedel

Subject: Re: ath9k and AMD IOMMU alias breakage on 3.16?

On Fri, Sep 05, 2014 at 10:01:44PM -0600, Alex Williamson wrote:
> On Fri, 2014-09-05 at 20:00 -0700, Jason Newton wrote:
> > Hi,
> >
> > I have an AR9462 connected over minipcie, it came with the ASRock
> > FM2A88x-itx motherboard and I'm using an AMD A10-7850K cpu with it. When I
> > have IOMMU enabled, and this is desirable for opencl related things, the
> > AR9462 malfunctions with these two errors occurring:
> >
> > AMD-Vi: Event logged [INVALID_DEVICE_REQUEST device=00:00.1
> > address=0x000000fdf8080020 flags=0x0a00]

This means that the device 00:00.1 is sending to the interrupt/EOI
address-range while interrupt remapping is enabled. You can boot with
intremap=off on the kernel command line to work around this problem.

This looks either like another PCI aliasing issue or a 00:00.1 hidden
device is sending interrupt requests (which it is not allowed to do).


Joerg
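
To illustrate the diagnosis above, here is a minimal Python sketch that parses the AMD-Vi event line quoted in this thread and checks whether the reported address falls inside the interrupt/EOI window. The regular expression, the function name `parse_event`, and the window bounds (`0xFD_F800_0000` to `0xFD_F8FF_FFFF`, taken from the HyperTransport address map as I understand it) are assumptions for illustration, not something stated in the thread or taken from the kernel source.

```python
import re

# Pattern for the AMD-Vi event line quoted in this thread (joined onto one line).
EVENT_RE = re.compile(
    r"AMD-Vi: Event logged \[(?P<type>\w+)\s+"
    r"device=(?P<device>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+"
    r"address=(?P<address>0x[0-9a-f]+)\s+"
    r"flags=(?P<flags>0x[0-9a-f]+)\]"
)

# ASSUMPTION: bounds of the HyperTransport interrupt/EOI window; verify
# against the AMD IOMMU specification before relying on them.
INTR_EOI_START = 0xFD_F800_0000
INTR_EOI_END = 0xFD_F8FF_FFFF


def parse_event(line):
    """Return the fields of an AMD-Vi event line, or None if it does not match."""
    m = EVENT_RE.search(line)
    if m is None:
        return None
    address = int(m.group("address"), 16)
    return {
        "type": m.group("type"),
        "device": m.group("device"),
        "address": address,
        "flags": int(m.group("flags"), 16),
        "in_intr_eoi_range": INTR_EOI_START <= address <= INTR_EOI_END,
    }


sample = ("AMD-Vi: Event logged [INVALID_DEVICE_REQUEST device=00:00.1 "
          "address=0x000000fdf8080020 flags=0x0a00]")
event = parse_event(sample)
print(event["device"], hex(event["address"]), event["in_intr_eoi_range"])
```

Under the assumed bounds, the address from the report lands inside the window, which is consistent with the explanation that 00:00.1 is writing to the interrupt/EOI range.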


2014-09-12 16:17:55

by Joerg Roedel

Subject: Re: ath9k and AMD IOMMU alias breakage on 3.16?

Hi Jason,

On Fri, Sep 05, 2014 at 10:28:01PM -0700, Jason Newton wrote:
> [ 0.021820] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: b8 info 0000
> [ 0.021827] AMD-Vi: mmio-addr: 00000000feb80000
> [ 0.021844] AMD-Vi: DEV_SELECT_RANGE_START devid: 00:01.0 flags: 00
> [ 0.021848] AMD-Vi: DEV_RANGE_END devid: ff:1f.6
> [ 0.022730] AMD-Vi: DEV_ALIAS_RANGE devid: 02:00.0 flags: 00 devid_to: 00:14.4
> [ 0.022735] AMD-Vi: DEV_RANGE_END devid: 02:1f.7
> [ 0.022745] AMD-Vi: DEV_SPECIAL(HPET[0]) devid: 00:14.0
> [ 0.022749] AMD-Vi: DEV_SPECIAL(IOAPIC[5]) devid: 00:14.0
> [ 0.022753] AMD-Vi: DEV_SPECIAL(IOAPIC[6]) devid: 00:00.0

It is just a test, as I don't know how the hardware is actually wired in
your system, but can you try to boot with 'ivrs_ioapic[6]=00:00.1' on
the kernel command line and report if it makes any difference?

Thanks,

Joerg
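
The reasoning behind this test is not spelled out in the thread. One plausible reading is that the IVRS dump maps IOAPIC[6] to 00:00.0 while the event log names 00:00.1, so overriding the mapping checks whether the IOAPIC actually issues requests as 00:00.1. The Python sketch below extracts the `DEV_SPECIAL` entries from the quoted dump and compares them against the faulting device ID. The regular expression and the helper name `special_devices` are hypothetical and only cover the line format shown in this message.

```python
import re

# Matches lines such as: "AMD-Vi: DEV_SPECIAL(IOAPIC[6]) devid: 00:00.0"
SPECIAL_RE = re.compile(
    r"DEV_SPECIAL\((?P<kind>[A-Z]+)\[(?P<index>\d+)\]\)\s+"
    r"devid:\s+(?P<devid>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])"
)


def special_devices(dmesg_text):
    """Map (kind, index), e.g. ("IOAPIC", 6), to the device ID given in the dump."""
    mapping = {}
    for m in SPECIAL_RE.finditer(dmesg_text):
        mapping[(m.group("kind"), int(m.group("index")))] = m.group("devid")
    return mapping


# DEV_SPECIAL lines copied from the amd_iommu_dump output quoted above.
dump = """\
[ 0.022745] AMD-Vi: DEV_SPECIAL(HPET[0]) devid: 00:14.0
[ 0.022749] AMD-Vi: DEV_SPECIAL(IOAPIC[5]) devid: 00:14.0
[ 0.022753] AMD-Vi: DEV_SPECIAL(IOAPIC[6]) devid: 00:00.0
"""

table = special_devices(dump)
faulting = "00:00.1"  # device ID from the INVALID_DEVICE_REQUEST event in this thread

for (kind, index), devid in sorted(table.items()):
    note = "  <- matches faulting device" if devid == faulting else ""
    print(f"{kind}[{index}] -> {devid}{note}")

if faulting not in table.values():
    print(f"no DEV_SPECIAL entry uses {faulting}")
```

With the dump quoted here, none of the special devices is registered as 00:00.1, which is the mismatch the `ivrs_ioapic[6]=00:00.1` override is meant to probe.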


2014-09-12 21:19:25

by Jason Newton

Subject: Re: ath9k and AMD IOMMU alias breakage on 3.16?

Hi Joerg,

I've performed both experiments you've asked for.

intremap=off seems to have everything working with the IOMMU on
although I've only been running it for a few minutes and not tested
things exhaustively.

The ivrs_ioapic test however results in ath9k failing to load.

I've attached dmesg's of both test cases.

Also, I noticed SWIOTLB is in use - is that common with an IOMMU these days?

-Jason

On Fri, Sep 12, 2014 at 9:17 AM, Joerg Roedel <[email protected]> wrote:
> Hi Jason,
>
> On Fri, Sep 05, 2014 at 10:28:01PM -0700, Jason Newton wrote:
>> [ 0.021820] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: b8 info 0000
>> [ 0.021827] AMD-Vi: mmio-addr: 00000000feb80000
>> [ 0.021844] AMD-Vi: DEV_SELECT_RANGE_START devid: 00:01.0 flags: 00
>> [ 0.021848] AMD-Vi: DEV_RANGE_END devid: ff:1f.6
>> [ 0.022730] AMD-Vi: DEV_ALIAS_RANGE devid: 02:00.0 flags: 00 devid_to: 00:14.4
>> [ 0.022735] AMD-Vi: DEV_RANGE_END devid: 02:1f.7
>> [ 0.022745] AMD-Vi: DEV_SPECIAL(HPET[0]) devid: 00:14.0
>> [ 0.022749] AMD-Vi: DEV_SPECIAL(IOAPIC[5]) devid: 00:14.0
>> [ 0.022753] AMD-Vi: DEV_SPECIAL(IOAPIC[6]) devid: 00:00.0
>
> It is just a test, as I don't know how the hardware is actually wired in
> your system, but can you try to boot with 'ivrs_ioapic[6]=00:00.1' on
> the kernel command line and report if it makes any difference?
>
> Thanks,
>
> Joerg
>


Attachments:
dmesg_intremapoff.txt (60.31 kB)
dmesg_ivrs_ioapic.txt (60.23 kB)

2014-09-06 05:28:04

by Jason Newton

Subject: Re: ath9k and AMD IOMMU alias breakage on 3.16?

Hi Alex,

I've attached what you requested, captured after I re-enabled the IOMMU.

On Fri, Sep 5, 2014 at 9:01 PM, Alex Williamson
<[email protected]> wrote:
> On Fri, 2014-09-05 at 20:00 -0700, Jason Newton wrote:
>> Hi,
>>
>> I have an AR9462 connected over minipcie, it came with the ASRock
>> FM2A88x-itx motherboard and I'm using an AMD A10-7850K cpu with it. When I
>> have IOMMU enabled, and this is desirable for opencl related things, the
>> AR9462 malfunctions with these two errors occurring:
>>
>> AMD-Vi: Event logged [INVALID_DEVICE_REQUEST device=00:00.1
>> address=0x000000fdf8080020 flags=0x0a00]
>>
>> Followed by this looping error, which reoccurs shortly after associating
>> and obtaining an IP. I've attached a large snippet from dmesg, due to its
>> length.
>>
>> I'm not sure where the error is coming from but I did see that Alex
>> Williamson posted a patch that dealt with aliased devices (like pci
>> bridges) and AMD's IOMMU issues, see
>> e028a9e6b8a637af09ac4114083280df4a7045f1 for reference.
>>
>> I then disable IOMMU in the bios, and immediately the ath9k/AR9462 pair is
>> working without flaw or retries. Bios is up to date btw, updated it after
>> building the machine last friday.
>>
>> So anyone have a clue what's going on here?
>
> Please boot with "amd_iommu_dump" on the kernel boot line and send the
> full dmesg log and 'sudo lspci -vvv' output. Thanks,
>
> Alex
>


Attachments:
dmesg.txt (62.54 kB)
lspci.txt (30.74 kB)