2011-05-22 22:05:02

by Daniel Haid

Subject: Question about iommu on x86_64 and radeon driver.

Hello,

I have an x86_64 system with a VIA chipset and 4GB of RAM. The mainboard
is an ASUS M2V where the bios-setup has an option called "Map around
memory hole" and I have an "ATI Technologies Inc RV710 [Radeon HD 4350]"
graphics card according to lspci.

Now one of the following things happens to my system:

1) With the bios-option enabled and no kernel parameters I get the
following error:

[drm:r600_ring_test] *ERROR* radeon: ring test failed (scratch(0x8504)=0xCAFEDEAD)
radeon 0000:02:00.0: disabling GPU acceleration

and then I cannot use any 3D acceleration. I also get the message

Looks like a VIA chipset. Disabling IOMMU. Override with iommu=allowed

2) With the bios-option enabled and "mem=3072M" I can use only 3GB
of RAM, but the radeon card works.

3) With the bios-option enabled and "iommu=allowed" I get 4GB of RAM
and the radeon card works. But I wonder whether this can have any bad
effects?

4) Without the bios-option (and without any kernel parameters) I do not
get the "Looks like a VIA chipset. Disabling IOMMU. Override with
iommu=allowed" message, but strangely Linux shows only about 3GB of RAM.

I did not try other combinations. Now my questions are:

A) Is this a bug in the radeon driver? Or maybe not, since
Documentation/x86/x86_64/boot-options.txt seems to imply that for >3GB
an iommu is required?

B) Is it safe to use iommu=allowed in my case? If not, what problems
will I encounter and what options should I use instead? Will I be stuck
with 3GB of RAM?

Please cc me if you answer, since I am not subscribed.

Thank you.


2011-05-23 22:05:54

by Konrad Rzeszutek Wilk

Subject: Re: Question about iommu on x86_64 and radeon driver.

On Sun, May 22, 2011 at 10:56:27PM +0100, Daniel Haid wrote:
> Hello,
>
> I have an x86_64 system with a VIA chipset and 4GB of RAM. The
> mainboard
> is an ASUS M2V where the bios-setup has an option called "Map around
> memory hole" and I have an "ATI Technologies Inc RV710 [Radeon HD

There had to be more than 'Map around memory hole'? Was it called
GART or IOMMU?

> 4350]"
> graphics card according to lspci.
>
> Now one of the following things happen to my system:
>
> 1) With the bios-option enabled and no kernel parameters I get the
> following error:
>
> [drm:r600_ring_test] *ERROR* radeon: ring test failed
> (scratch(0x8504)=0xCAFEDEAD)
> radeon 0000:02:00.0: disabling GPU acceleration
>
> and then I can not use any 3d-acceleration. I also get the message

The problem you are hitting (I think) is that the AMD GART (the poor man's
IOMMU) is turned off and the SWIOTLB is used instead. If you would like
some technical details, take a look at:
http://lists.freedesktop.org/archives/dri-devel/2011-January/006885.html
(point #2 is what you are hitting).

>
> Looks like a VIA chipset. Disabling IOMMU. Override with iommu=allowed
>
> 2) With the bios-option enabled and "mem=3072M" I can use only
> 3GB of RAM,
> but the radeon card works.
>
> 3) With the bios-option enabled and "iommu=allowed" I get 4GB of RAM
> and
> the radeon card works. But I wonder whether this can have any bad
> effects?

Not sure why the AMD GART IOMMU gets disabled on VIA chipsets. You might
want to use 'git gui blame arch/x86/kernel/early-quirks.c' and look
at the code in question to figure that out.

>
> 4) Without the bios-option (and without any kernel parameters) I do not
> get the "Looks like a VIA chipset. Disabling IOMMU. Override with
> iommu=allowed"
> message, but strangely linux shows only about 3GB of RAM.
>
> I did not try other combinations. Now my questions are
>
> A) Is this a bug in the radeon driver? Or maybe not, since
> Documentation/x86/x86_64/boot-options.txt
> seems to imply that for >3GB an iommu is required?

Correct, and an IOMMU does get turned on. The SWIOTLB one.
>
> B) Is it safe to use iommu=allowed in my case ? If not, what
> problems will I encounter and what
> options should I use instead? Will I be stuck with 3GB of RAM?

Well, if everything works.... but you might just want to use
the git gui blame to take a look at the back-story of why the quirk
was added.

2011-05-23 23:45:50

by Daniel Haid

Subject: Re: Question about iommu on x86_64 and radeon driver.

> There had to be more than 'Map around memory hole'? Was it called
> GART or IOMMU?
I do not think that "IOMMU" or "GART" was written there,
and I do not think that the mainboard in question has an IOMMU
(am I correct that it would be a feature of the mainboard, while
the AMD GART is a feature of the CPU?).
I will look again as soon as I have physical access to the system.

> The problem you are hitting (I think) is that the AMD GART poor-man
> IOMMU is turned off
> and the SWIOTLB is used instead. If you would like some technical
> details, take a look at:
>
> http://lists.freedesktop.org/archives/dri-devel/2011-January/006885.html
> (the point #2 is what you are hitting).

You are correct. In all the cases where the radeon card does not work
I see that SWIOTLB has been enabled in the kernel log.

So this is a bug? I suppose that all hardware should be working with
SWIOTLB? Will a patch that fixes this be included at some point?
(The bug your link points to was closed as WONTFIX.)

> Not sure why the AMD GART IOMMU gets disabled on VIA chipsets. You
> might
> want to use 'git gui blame arch/x86/kernel/early-quirks.c' and look
> at the code in question to figure that out.
> Well, if everything works.... but you might just want to use
> the git gui blame to take a look at the back-story of why the quirk
> was added.

Unfortunately I am getting crashes with "iommu=allowed". I will look
at git blame.

Thank you for your answers.

2011-05-24 15:50:28

by Konrad Rzeszutek Wilk

Subject: Re: Question about iommu on x86_64 and radeon driver.

On Tue, May 24, 2011 at 12:45:47AM +0100, Daniel Haid wrote:
> >There had to be more than 'Map around memory hole'? Was it called
> >GART or IOMMU?
> I do not think that there was "IOMMU" or "GART" written there,
> but I do not think that the mainboard in question has an IOMMU
> (Am I correct that it would be a feature of the mainboard while
> the AMD GART is a feature of the CPU?).

So AMD GART is called poor-man IOMMU. And it is part of the
motherboard (northbridge mostly I think).

> I will look again as soon as I have physical access to the system.
>
> >The problem you are hitting (I think) is that the AMD GART poor-man
> >IOMMU is turned off
> >and the SWIOTLB is used instead. If you would like some technical
> >details, take a look at:
> >
> >http://lists.freedesktop.org/archives/dri-devel/2011-January/006885.html
> >(the point #2 is what you are hitting).
>
> You are correct. In all the cases where the radeon card does not work
> I see that SWIOTLB has been enabled in the kernel log.
>
> So this is a bug? I suppose that all hardware should be working with
> SWIOTLB? Will a patch that fixes this somewhen be included?
> (The bug where your link points to was closed with WONTFIX)

Not a bug per se. I've been working on making the TTM use the DMA API
so that those pages are allocated at startup and you don't end up
having to sync the pages, but I broke PowerPC and ARM during 2.6.39,
so I need to redo it.

>
> >Not sure why the AMD GART IOMMU gets disabled on VIA chipsets. You
> >might
> >want to use 'git gui blame arch/x86/kernel/early-quirks.c' and look
> >at the code in question to figure that out.
> >Well, if everything works.... but you might just want to use
> >the git gui blame to take a look at the back-story of why the quirk
> >was added.
>
> Unfortunately I am getting crashes with "iommu=allowed". I will look
> at git blame.
>
> Thank you for your answers.

2011-05-24 21:33:12

by Daniel Haid

Subject: Re: Question about iommu on x86_64 and radeon driver.

> Not a bug per se. I've been working on making the TTM use the DMA API
> so that those pages are allocated at startup and you don't end up
> with having to sync the pages .. but I broke PowerPC and ARM during
> 2.6.39
> so I need to redo it.

If this is not a bug, shouldn't there be a message like
"radeon currently does not work with SWIOTLB, disabling..."
instead of
"ring test failed, disabling..."?

2011-05-24 22:49:12

by Andi Kleen

Subject: Re: Question about iommu on x86_64 and radeon driver.

Daniel Haid writes:
>
> Unfortunately I am getting crashes with "iommu=allowed". I will look
> at git blame.

I added it originally. The VIA chipset seems to hang on most
DMAs through the IOMMU, so it was disabled.

-Andi

2011-05-25 10:00:37

by Daniel Haid

Subject: Re: Question about iommu on x86_64 and radeon driver.

> I added it originally. The VIA chipset seems to hang on most
> DMAs through the IOMMU, so it was disabled.

I see. Now, maybe this will not work for some reason, but
would it be possible to somehow disable both IOMMU and SWIOTLB
and tell all drivers only to do DMA with the lower 4GB of memory?

Unfortunately simply setting "iommu=off" does not work for me
as the sata and pata drivers stop working.

2011-05-25 13:03:12

by Konrad Rzeszutek Wilk

Subject: Re: Question about iommu on x86_64 and radeon driver.

On Wed, May 25, 2011 at 11:00:33AM +0100, Daniel Haid wrote:
> >I added it originally. The VIA chipset seems to hang on most
> >DMAs through the IOMMU, so it was disabled.
>
> I see. Now, maybe this will not work for some reason, but
> would it be possible to somehow disable both IOMMU and SWIOTLB
> and tell all drivers only to do DMA with the lower 4GB of memory?

Only if you allow 3GB or less in the machine. So you would have to do
mem=3G as well.

The reason (and you can see that yourself by looking at the
E820 map) is that 1GB of the RAM is actually _above_ the 4GB mark.
>
> Unfortunately simply turning "iommu=off" does not work for me
> as the sata and pata drivers stop working.

Right, no surprise there either - they are 32-bit and can't DMA to
memory above 4GB (2^32).
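
The arithmetic behind "1GB is actually above the 4GB" can be sketched in a few lines of user-space C. This is only an illustration under assumptions: the function names are made up for this note, and the 1GB hole size is inferred from the roughly 3GB that Linux reports when the BIOS option is off. The real layout is whatever the machine's E820 map says.

```c
#include <assert.h>
#include <stdint.h>

#define GIB (1ULL << 30)

/*
 * Hypothetical helper: how much RAM ends up above the 4GB boundary when
 * the firmware maps installed RAM around a hole of hole_size bytes that
 * sits just below 4GB.
 */
static uint64_t ram_above_4g(uint64_t installed, uint64_t hole_size)
{
    assert(hole_size <= 4 * GIB);
    uint64_t usable_below = 4 * GIB - hole_size;
    return installed > usable_below ? installed - usable_below : 0;
}

/* Hypothetical helper: one past the highest RAM address after remapping. */
static uint64_t top_of_ram(uint64_t installed, uint64_t hole_size)
{
    uint64_t above = ram_above_4g(installed, hole_size);
    return above ? 4 * GIB + above : installed;
}
```

With 4GB installed and an assumed 1GB hole, ram_above_4g() gives 1GB and top_of_ram() gives 5GB, which is beyond what a device limited to 32-bit DMA addresses can reach. With mem=3G nothing lies above 4GB, which matches the case where the radeon card works.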

2011-05-25 12:58:46

by Konrad Rzeszutek Wilk

Subject: Re: Question about iommu on x86_64 and radeon driver.

On Tue, May 24, 2011 at 10:33:08PM +0100, Daniel Haid wrote:
> >Not a bug per se. I've been working on making the TTM use the DMA API
> >so that those pages are allocated at startup and you don't end up
> >with having to sync the pages .. but I broke PowerPC and ARM
> >during 2.6.39
> >so I need to redo it.
>
> If this is not a bug, shouldn't there be a message like
> "radeon currently does not work with SWIOTLB, disabling..."
> instead of
> "ring test failed, disabling..."?

Or perhaps just fix the driver? That is what I am working on. Are you
interested in doing some beta-testing when the patches are ready?

2011-05-25 14:28:54

by Daniel Haid

Subject: Re: Question about iommu on x86_64 and radeon driver.

> Or perhaps just fix the driver? That is what I am working on - are
> you
> interested in doing some beta-testing when they are ready?

Of course I am. Anything to get rid of this problem.

2011-05-25 14:51:29

by Daniel Haid

Subject: Re: Question about iommu on x86_64 and radeon driver.

>> I see. Now, maybe this will not work for some reason, but
>> would it be possible to somehow disable both IOMMU and SWIOTLB
>> and tell all drivers only to do DMA with the lower 4GB of memory?
>
> Only if you allow 3GB or less in the machine. So you would have to do
> mem=3G as well.
>
> The reason is that (and you can see that yourself by looking at the
> E820), is that 1GB is actually _above_ the 4GB.

I see. Is it possible to somehow give "mem=3G" to the kernel and still
use the memory above 4G for something, like a ramdisk or something?

2011-05-25 20:21:24

by Daniel Haid

Subject: Re: Question about iommu on x86_64 and radeon driver.

> Only if you allow 3GB or less in the machine. So you would have to do
> mem=3G as well.
>
> The reason is that (and you can see that yourself by looking at the
> E820), is that 1GB is actually _above_ the 4GB.

Just another question on this one:

Why can a driver not simply ask for DMA-capable memory? Is this not
what memory zones are for?

So if a driver wants to do DMA it requests DMA memory and gets it from
the address space below 4GB.

But for some reason it is not done like this. Why?

2011-05-25 23:05:47

by Andi Kleen

Subject: Re: Question about iommu on x86_64 and radeon driver.

On Wed, May 25, 2011 at 09:21:21PM +0100, Daniel Haid wrote:
> >Only if you allow 3GB or less in the machine. So you would have to do
> >mem=3G as well.
> >
> >The reason is that (and you can see that yourself by looking at the
> >E820), is that 1GB is actually _above_ the 4GB.
>
> Just another question on this one:
>
> Why can a driver not simply ask for DMA-capable memory, is this not
> what memory zones are for?

In many cases the memory gets passed into the driver.

If it's not already in the right boundaries it would need to copy through
a bounce buffer. That is what swiotlb does, if there's no IOMMU.

-Andi

2011-05-27 15:48:44

by Daniel Haid

Subject: Re: Question about iommu on x86_64 and radeon driver.

I have looked again at the radeon driver and it seems that the
drm drivers use vmalloc_32 to allocate their dma memory and
that vmalloc_32 should allocate physical 32-bit addresses
(in 2.6.39)?

So why does it still not work?

2011-05-27 15:55:28

by Konrad Rzeszutek Wilk

Subject: Re: Question about iommu on x86_64 and radeon driver.

On Fri, May 27, 2011 at 04:48:39PM +0100, Daniel Haid wrote:
> I have looked again at the radeon driver and it seems that the
> drm drivers use vmalloc_32 to allocate their dma memory and
> that vmalloc_32 should allocate physical 32-bit addresses
> (in 2.6.39)?

The DRM code is not used anymore. It was used by XFree86 but nowadays
the TTM code is used instead.

>
> So why does it still not work?

Ah, this would be a lengthy writeup.. so did you look at the
link I pointed you to?

2011-05-27 22:20:13

by Daniel Haid

Subject: Re: Question about iommu on x86_64 and radeon driver.

> The DRM code is not used anymore. It was used by XFree86 but nowadays
> the TTM code is used instead.

I see

> Ah, this would be a lengthy writeup.. so did you look at the
> link I pointed you to?

Yes I have started, but do not understand everything yet.

What are MFNs? Is this something Xen-specific?

And does your post imply that on bare metal with "swiotlb=force
iommu=soft" alloc_page with GFP_DMA32 also does not allocate under 4GB?

2011-05-31 13:45:50

by Konrad Rzeszutek Wilk

Subject: Re: Question about iommu on x86_64 and radeon driver.

On Fri, May 27, 2011 at 11:20:09PM +0100, Daniel Haid wrote:
> >The DRM code is not used anymore. It was used by XFree86 but nowadays
> >the TTM code is used instead.
>
> I see
>
> >Ah, this would be a lengthy writeup.. so did you look at the
> >link I pointed you to?
>
> Yes I have started, but do not understand everything yet.
>
> What are MFNs? Is this something Xen-specific?
>

<nods>
>
> And does your post imply that on bare metal with "swiotlb=force
> iommu=soft"
> alloc_page with GFP_DMA32 does also not allocate under 4GB?

Noo.. It does, but the normal assumption of 'phys_to_virt' == 'phys_to_bus' is
not valid anymore. Think of a buffer (swiotlb) which has a pool
of pages, and when a PCI device wants a page, it hands one out. It also has
other functionality, such as 'mapping' of an already allocated page. If the
PCI device asks the IOMMU (swiotlb) to map a page (and you have 'swiotlb=force', or
the page provided has been allocated above 4GB and the device can only handle
up to 32-bit), then swiotlb gives out a page from its own pool. You now have
two addresses: the one from the PCI pool (swiotlb) and the one you already
allocated. You are supposed to program your PCI card to read/write data to the
page provided by the IOMMU (so the swiotlb), which means that it won't
write to the page you had allocated. Hence there are calls, such as 'sync_page',
which copy the contents from the swiotlb page to the one you had allocated
(or vice versa). This is called a 'bounce buffer'.

The radeon (and nouveau) drivers don't have the code to call 'sync_page', and
you wouldn't really want to do so, as it slows down the performance of the
machine. There exists another mechanism, which is to allocate the pages
at the start and not do the mapping later on.
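
The bounce-buffer behaviour described above can be mimicked in plain user-space C. This is a toy model under assumptions, not the swiotlb code: struct bounce_mapping, bounce_map() and bounce_sync_for_cpu() are names invented for this note, and a single static page stands in for the swiotlb pool.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BOUNCE_PAGE_SIZE 4096

/* Hypothetical stand-in for one mapping that went through the pool. */
struct bounce_mapping {
    uint8_t *orig;                    /* page the driver allocated itself */
    uint8_t  pool[BOUNCE_PAGE_SIZE];  /* page the device really reads/writes */
};

/* "Map": hand the device the pool page, pre-filled with the driver's data. */
static uint8_t *bounce_map(struct bounce_mapping *m, uint8_t *orig)
{
    assert(m != NULL && orig != NULL);
    m->orig = orig;
    memcpy(m->pool, orig, BOUNCE_PAGE_SIZE);
    return m->pool;
}

/* "Sync": copy what the device wrote back into the driver's own page. */
static void bounce_sync_for_cpu(struct bounce_mapping *m)
{
    assert(m != NULL && m->orig != NULL);
    memcpy(m->orig, m->pool, BOUNCE_PAGE_SIZE);
}
```

In this model, a write the "device" makes to the page returned by bounce_map() stays invisible in the driver's own page until bounce_sync_for_cpu() runs. A driver that never performs the sync step keeps reading stale data from the page it allocated.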

2011-05-31 15:34:37

by Daniel Haid

Subject: Re: Question about iommu on x86_64 and radeon driver.

> Noo.. It does, but the normal assumption of 'phys_to_virt' ==
> 'phys_to_bus' is
> not valid anymore. Think of a buffer (swiotlb) which has a pool
> of pages and when a PCI device wants a page, it hands one out. It
> also has
> other functionality such as 'mapping' of an already allocated page.
> If the
> PCI device asks the IOMMU (swiotlb) to map a page (and if you have
> 'swiotlb=force'
> the page provided has been allocated above 4GB and the device can
> only handle
> up to 32-bit),

Does the radeon driver allocate without DMA32, possibly above 4GB, ...

> then swiotlb gives out a page from its own pool. You now have
> two addresses: the one from the PCI pool (swiotlb) and the one you
> already
> allocated.

... or does it allocate under 4GB but nevertheless get a page from
the swiotlb pool?

> You are suppose to program your PCI card to read/write data to the
> page provided from the IOMMU (so the swiotlb), which means that it
> won't
> write to the page you had allocated. Hence there are a calls, such as
> 'sync_page'..
> which will copy the contents from the swiotlb page to the one you had
> allocated
> (or vice-versa). This is called 'bounce buffer'.
>
> The radeon (and nouveau) don't have the code to call the 'sync_page',
> and
> you wouldn't really want to do so - as it slows down the performance
> of the
> machine. There exists another mechanism which is to allocate the
> pages
> at the start, and not do mapping later one.

Why can the radeon not simply allocate addresses under 4GB and not
request addresses from the iommu at all?

I assume that if you request a page from the IOMMU, you are required to
do these sync_page calls (and that they get optimized away with a
hardware IOMMU?).

So if the radeon uses the IOMMU but does not call sync_page even when
required to, the code seems to be broken. If this is indeed the case,
would it not be possible to simply add the sync_page calls for now (and
thus fix the code), if it is not difficult, and implement the method
with more performance later?

2011-05-31 16:02:44

by Konrad Rzeszutek Wilk

Subject: Re: Question about iommu on x86_64 and radeon driver.

On Tue, May 31, 2011 at 04:34:33PM +0100, Daniel Haid wrote:
> >Noo.. It does, but the normal assumption of 'phys_to_virt' ==
> >'phys_to_bus' is
> >not valid anymore. Think of a buffer (swiotlb) which has a pool
> >of pages and when a PCI device wants a page, it hands one out. It
> >also has
> >other functionality such as 'mapping' of an already allocated
> >page. If the
> >PCI device asks the IOMMU (swiotlb) to map a page (and if you have
> >'swiotlb=force'
> >the page provided has been allocated above 4GB and the device can
> >only handle
> >up to 32-bit),
>
> Does the radeon driver allocate without DMA32, possibly above 4GB, ...

On some pages, yes.
>
> >then swiotlb gives out a page from its own pool. You now have
> >two addresses: the one from the PCI pool (swiotlb) and the one you
> >already
> >allocated.
>
> ... or does it allocate under 4GB but nevertheless get a page from
> the swiotlb pool?

Only if the swiotlb is turned on and if the dma_mask is 32-bit.
Or if swiotlb=force is set, then _every_ DMA operation goes
through the swiotlb page pool.

>
> >You are suppose to program your PCI card to read/write data to the
> >page provided from the IOMMU (so the swiotlb), which means that it
> >won't
> >write to the page you had allocated. Hence there are a calls, such as
> >'sync_page'..
> >which will copy the contents from the swiotlb page to the one you had
> >allocated
> >(or vice-versa). This is called 'bounce buffer'.
> >
> >The radeon (and nouveau) don't have the code to call the
> >'sync_page', and
> >you wouldn't really want to do so - as it slows down the
> >performance of the
> >machine. There exists another mechanism which is to allocate the
> >pages
> >at the start, and not do mapping later one.
>
> Why can the radeon not simply allocate addresses under 4GB and not
> request addresses from the iommu at all?

It does on some. But not on all. It needs to go through
the PCI API to work with the graphic card and you need to map the
pages to the IOMMU.

>
> I assume that if you request a page from the IOMMU, you are required
> to do
> these sync_page calls (and that they get optimized away with a
> hardware IOMMU?).

Sure.. but not all drivers do it.
>
> So if the radeon uses the IOMMU but does not call sync_page even if
> required to
> the code seems to be broken. If this is indeed the case would it not
> be possible to
> simply add the sync_page calls for now (and thus fix the code), if
> it is not
> difficult, and implement the method with more performance later?

Why not fix it right the first time?

2011-05-31 16:20:05

by Daniel Haid

Subject: Re: Question about iommu on x86_64 and radeon driver.

> Why not fix it right the first time?

I meant it only if the fix was trivial, like adding a few lines of
sync_page calls. In that case maybe it could go into 3.0.0
and those people with VIA chipsets or Xen could have a working system.

If there is no trivial fix then of course you are right.

2011-05-31 19:57:42

by Daniel Haid

Subject: Re: Question about iommu on x86_64 and radeon driver.

Just another question, the problem seems to be an issue with

swiotlb=force

and I assume that without "force" DMA with addresses that
already are below 4GB should be handled without any
bounce-buffer, and with "force" everything goes through
the bounce-buffer, is this correct?

Now I have a VIA chipset where the IOMMU does not work, so
swiotlb is activated, but is it activated in force mode or
in normal mode? How can I find out?

If it is activated in force mode, maybe the problem will
go away in normal mode?

2011-06-01 13:25:04

by Konrad Rzeszutek Wilk

Subject: Re: Question about iommu on x86_64 and radeon driver.

On Tue, May 31, 2011 at 08:57:39PM +0100, Daniel Haid wrote:
> Just another question, the problem seems to be an issue with
>
> swiotlb=force
>
> and I assume that without "force" DMA with addresses that
> already are below 4GB should be handled without any
> bounce-buffer, and with "force" everything goes through
> the bounce-buffer, is this correct?
>
> Now I have a VIA chipset where the IOMMU does not work, so
> swiotlb is activated, but is it activated in force mode or
> in normal mode? How can I find out?

Look in the source code. I somehow assumed it would activate it
but maybe it doesn't. And the problem you have is related to
something else?
>
> If it is activated in force mode, maybe the problem will
> go away in normal mode?

<nods> Should.

2011-06-01 16:10:47

by Daniel Haid

Subject: Re: Question about iommu on x86_64 and radeon driver.

> Look in the source code. I somehow assumed it would activate it
> but maybe it isn't. And the problem you have is related to
> something else?

If I have read the code correctly the only way to get
swiotlb_force=1 in lib/swiotlb.c is to add the "swiotlb=force"
argument to the kernel parameters.

Without it on VIA chipsets swiotlb gets activated, but with
swiotlb_force=0. In this case DMA to addresses below 4GB should be
direct.

But since I still get the error the radeon driver must (incorrectly)
allocate memory above 4GB somewhere, but where?

2011-06-01 21:51:44

by Andi Kleen

Subject: Re: Question about iommu on x86_64 and radeon driver.

On Wed, Jun 01, 2011 at 05:10:44PM +0100, Daniel Haid wrote:
> >Look in the source code. I somehow assumed it would activate it
> >but maybe it isn't. And the problem you have is related to
> >something else?
>
> If I have read the code correctly the only way to get
> swiotlb_force=1 in lib/swiotlb.c is to add the "swiotlb=force"
> argument to the kernel parameters.
>
> Without it on VIA chipsets swiotlb gets activated, but with
> swiotlb_force=0. In this case DMA to addresses below 4GB should be
> direct.
>
> But since I still get the error the radeon driver must (incorrectly)
> allocate memory above 4GB somewhere, but where?

Someone has to debug it. grep for the error message, read the
surrounding code, add printks, run it, until you figure out which allocation
has the problem.

Adding some radeon driver people to cc. The original problem
is described in http://choon.net/forum/read.php?21,106131,115940

-Andi

--
Speaking for myself only.

2011-06-01 21:58:12

by Daniel Haid

Subject: Re: Question about iommu on x86_64 and radeon driver.

I have also found out the following. In

line 741 of drivers/gpu/drm/radeon/radeon_device.c

there is a comment "PCIE - can handle 40-bits." -
I have a PCIE card - and then need_dma32 is not set.

So if I read it correctly the ttm allocation
routines will allocate memory over 4GB.

But if PCIE can handle 40 bits, why does swiotlb
give out a bounce buffer to the radeon driver at all?

Shouldn't

(dma_capable(dev, dev_addr, size) && !swiotlb_force)

on line 672 of lib/swiotlb.c be true?
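
The condition quoted above can be modelled in user-space C to see when a buffer would be bounced. This is a sketch under assumptions: choose_dma_path() and its parameters are hypothetical, and the mask comparison is only a stand-in for what dma_capable() checks. Only the shape of the condition itself comes from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum dma_path { DMA_PATH_DIRECT, DMA_PATH_BOUNCE };

/*
 * Hypothetical helper mirroring the quoted condition
 * (dma_capable(dev, dev_addr, size) && !swiotlb_force).
 */
static enum dma_path choose_dma_path(uint64_t dev_addr, uint64_t size,
                                     uint64_t dma_mask, bool swiotlb_force)
{
    assert(size > 0);
    /* Stand-in for dma_capable(): the whole buffer must fit under the mask. */
    bool capable = dev_addr + size - 1 <= dma_mask;
    return (capable && !swiotlb_force) ? DMA_PATH_DIRECT : DMA_PATH_BOUNCE;
}
```

In this model a page at 0x100000000 goes direct when the device's mask is 40 bits wide, but is bounced when the mask is only 32 bits wide. So the outcome depends on which mask is actually in effect for the device, not on what the hardware could handle in principle.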


2011-06-01 22:22:29

by Andi Kleen

Subject: Re: Question about iommu on x86_64 and radeon driver.

On Wed, Jun 01, 2011 at 10:58:08PM +0100, Daniel Haid wrote:
> I have also found out the following. In
>
> line 741 of drivers/gpu/drm/radeon/radeon_device.c
>
> there is a comment "PCIE - can handle 40-bits." -
> I have a PCIE card - and then need_dma32 is not set.
>
> So if I read it correctly the ttm allocation
> routines will allocate memory over 4GB.
>
> But if PCIE can handle 40 bits, why does swiotlb
> give out a bounce buffer to the radeon driver at all?

swiotlb only does what the driver tells it.

Also new drivers default to 4GB, so if a driver wants
something else it has to set it explicitly.

Also BTW the driver should work in any case, so likely
something else is wrong.

Really you need to discuss this with the radeon driver people.
Readded cc.



-Andi
--
Speaking for myself only.

2011-06-03 17:31:46

by Daniel Haid

Subject: [PATCH] tentative fix for radeon on systems >4GB without hardware iommu

On my x86_64 system with >4GB of ram and swiotlb instead of
a hardware iommu (because I have a VIA chipset), the call
to pci_set_dma_mask (see below) with 40 bits returns an error.

But it seems that the radeon driver is designed to have
need_dma32 = true exactly if pci_set_dma_mask is called
with 32 bits and false if it is called with 40 bits.

I have read somewhere that the default is 32 bits. So if the
call fails I suppose that need_dma32 should be set to true.

And indeed the patch fixes the problem I have had before
and which I had described here:
http://choon.net/forum/read.php?21,106131,115940

--- linux-2.6.39-gentoo/drivers/gpu/drm/radeon/radeon_device.c.old	2011-06-03 19:11:33.208891994 +0200
+++ linux-2.6.39-gentoo/drivers/gpu/drm/radeon/radeon_device.c	2011-06-03 19:21:10.240337986 +0200
@@ -752,6 +752,7 @@ int radeon_device_init(struct radeon_dev
 	dma_bits = rdev->need_dma32 ? 32 : 40;
 	r = pci_set_dma_mask(rdev->pdev, DMA_BIT_MASK(dma_bits));
 	if (r) {
+		rdev->need_dma32 = true;
 		printk(KERN_WARNING "radeon: No suitable DMA available.\n");
 	}
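
The effect of the one-line change can be tried outside the kernel with a small model. Everything here is a stand-in labelled as such: struct fake_rdev, fake_set_dma_mask() and the platform_limit parameter are invented for this note, and the rule that a mask wider than platform_limit is refused is only an assumption used to make the 40-bit call fail the way the message reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same shape as the kernel's DMA_BIT_MASK(n). */
#define MASK_BITS(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical miniature of the two fields that matter here. */
struct fake_rdev {
    bool     need_dma32;
    uint64_t dma_mask;
};

/* Hypothetical stand-in for pci_set_dma_mask(): 0 on success, nonzero on failure. */
static int fake_set_dma_mask(struct fake_rdev *rdev, uint64_t mask,
                             uint64_t platform_limit)
{
    if (mask > platform_limit)
        return -1;              /* assumed failure rule, for illustration only */
    rdev->dma_mask = mask;
    return 0;
}

static void fake_device_init(struct fake_rdev *rdev, uint64_t platform_limit)
{
    assert(rdev);
    int dma_bits = rdev->need_dma32 ? 32 : 40;

    rdev->dma_mask = MASK_BITS(32);   /* the 32-bit default stays if the call fails */
    if (fake_set_dma_mask(rdev, MASK_BITS(dma_bits), platform_limit)) {
        /* The fix: keep the allocation policy consistent with the mask in effect. */
        rdev->need_dma32 = true;
    }
}
```

With platform_limit set to a 32-bit mask, the 40-bit request fails and need_dma32 ends up true, so later allocations would be kept below 4GB. With a permissive limit the 40-bit mask is accepted and need_dma32 stays false.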

2011-06-03 20:45:03

by Alex Deucher

Subject: Re: [PATCH] tentative fix for radeon on systems >4GB without hardware iommu

On Fri, Jun 3, 2011 at 1:31 PM, Daniel Haid wrote:
> On my x86_64 system with >4GB of ram and swiotlb instead of
> a hardware iommu (because I have a VIA chipset), the call
> to pci_set_dma_mask (see below) with 40bits returns an error.
>
> But it seems that the radeon driver is designed to have
> need_dma32 = true exactly if pci_set_dma_mask is called
> with 32 bits and false if it is called with 40 bits.
>
> I have read somewhere that the default are 32 bits. So if the
> call fails I suppose that need_dma32 should be set to true.
>
> And indeed the patch fixes the problem I have had before
> and which I had described here:
> http://choon.net/forum/read.php?21,106131,115940

This looks like the correct fix. rdev->need_dma32 is used when we init
ttm for memory management later.

Alex

Acked-by: Alex Deucher

>
> --- linux-2.6.39-gentoo/drivers/gpu/drm/radeon/radeon_device.c.old 2011-06-03 19:11:33.208891994 +0200
> +++ linux-2.6.39-gentoo/drivers/gpu/drm/radeon/radeon_device.c 2011-06-03 19:21:10.240337986 +0200
> @@ -752,6 +752,7 @@ int radeon_device_init(struct radeon_dev
>        dma_bits = rdev->need_dma32 ? 32 : 40;
>        r = pci_set_dma_mask(rdev->pdev, DMA_BIT_MASK(dma_bits));
>        if (r) {
> +               rdev->need_dma32 = true;
>                printk(KERN_WARNING "radeon: No suitable DMA available.\n");
>        }
>
>

2011-06-03 23:14:26

by Konrad Rzeszutek Wilk

Subject: Re: [PATCH] tentative fix for radeon on systems >4GB without hardware iommu

On Fri, Jun 03, 2011 at 04:44:59PM -0400, Alex Deucher wrote:
> On Fri, Jun 3, 2011 at 1:31 PM, Daniel Haid wrote:
> > On my x86_64 system with >4GB of ram and swiotlb instead of
> > a hardware iommu (because I have a VIA chipset), the call
> > to pci_set_dma_mask (see below) with 40bits returns an error.
> >
> > But it seems that the radeon driver is designed to have
> > need_dma32 = true exactly if pci_set_dma_mask is called
> > with 32 bits and false if it is called with 40 bits.
> >
> > I have read somewhere that the default are 32 bits. So if the
> > call fails I suppose that need_dma32 should be set to true.
> >
> > And indeed the patch fixes the problem I have had before
> > and which I had described here:
> > http://choon.net/forum/read.php?21,106131,115940
>
> This looks like the correct fix. rdev->need_dma32 is used when we init
> ttm for memory management later.
>
> Alex

<nods>

Daniel, did you find other graphics drivers that forget to set need_dma32
after failing to set the dma_mask?

>
> Acked-by: Alex Deucher
>
> >
> > --- linux-2.6.39-gentoo/drivers/gpu/drm/radeon/radeon_device.c.old 2011-06-03 19:11:33.208891994 +0200
> > +++ linux-2.6.39-gentoo/drivers/gpu/drm/radeon/radeon_device.c 2011-06-03 19:21:10.240337986 +0200
> > @@ -752,6 +752,7 @@ int radeon_device_init(struct radeon_dev
> >        dma_bits = rdev->need_dma32 ? 32 : 40;
> >        r = pci_set_dma_mask(rdev->pdev, DMA_BIT_MASK(dma_bits));
> >        if (r) {
> > +               rdev->need_dma32 = true;
> >                printk(KERN_WARNING "radeon: No suitable DMA available.\n");
> >        }
> >
> >

2011-06-06 17:41:49

by Daniel Haid

Subject: Re: [PATCH] tentative fix for radeon on systems >4GB without hardware iommu

> This looks like the correct fix. rdev->need_dma32 is used when we
> init
> ttm for memory management later.

Is there anything I have to do now to have it included? Will it also
go into the stable releases?

2011-06-06 17:45:53

by Daniel Haid

Subject: Re: [PATCH] tentative fix for radeon on systems >4GB without hardware iommu

> Daniel, did you find other graphic drivers that forget to set
> need_dma32
> after failing to set the dma_mask ?

No. I have looked at nouveau and they seem to do it correctly, but I
cannot test it because I do not have the hardware.

2011-06-06 19:08:57

by Alex Deucher

Subject: Re: [PATCH] tentative fix for radeon on systems >4GB without hardware iommu

On Mon, Jun 6, 2011 at 1:41 PM, Daniel Haid wrote:
>> This looks like the correct fix. rdev->need_dma32 is used when we init
>> ttm for memory management later.
>
> Is there anything I have to do now to have it included? Will it also
> go into the stable releases?
>

I think Dave will pick it up and send it through the drm tree.

Alex