2008-02-06 04:41:21

by David Miller

[permalink] [raw]
Subject: more iommu sg merging fallout


The sparc64 change:

commit fde6a3c82d67f592eb587be4d12222b0ae6d4321
Author: FUJITA Tomonori <[email protected]>
Date: Mon Feb 4 22:28:02 2008 -0800

iommu sg merging: sparc64: make iommu respect the segment size limits

This patch makes iommu respect segment size limits when merging sg
lists.

Signed-off-by: FUJITA Tomonori <[email protected]>
Cc: Jeff Garzik <[email protected]>
Cc: James Bottomley <[email protected]>
Acked-by: Jens Axboe <[email protected]>
Cc: "David S. Miller" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

has significant errors and is going to eat people's disks, as it just
nearly did to mine.

Typically what you'll see are NULL pointer derefernces in
dma_4v_map_sg() and dma_4u_map_sg() and then the kernel usually craps
on your superblock very shortly thereafter.

The changeset above modified only prepare_sg() but that is only the
first pass of the SG mapping algorithm of the sparc64 IOMMU layer.

The second pass that fills in the entries depends upon how the first
pass does things. So if you change the first pass decision making you
have to update the second pass's as well.

That second pass is implemented in fill_sg() (there is a version in
both arch/sparc64/kernel/iommu.c and arch/sparc64/kernel/pci_sun4v.c),
which probably needs new logic as was added to prepare_sg() to handle
dma_get_max_seg_size().


2008-02-06 23:17:45

by FUJITA Tomonori

[permalink] [raw]
Subject: Re: more iommu sg merging fallout

On Tue, 05 Feb 2008 20:41:38 -0800 (PST)
David Miller <[email protected]> wrote:

>
> The sparc64 change:
>
> commit fde6a3c82d67f592eb587be4d12222b0ae6d4321
> Author: FUJITA Tomonori <[email protected]>
> Date: Mon Feb 4 22:28:02 2008 -0800
>
> iommu sg merging: sparc64: make iommu respect the segment size limits
>
> This patch makes iommu respect segment size limits when merging sg
> lists.
>
> Signed-off-by: FUJITA Tomonori <[email protected]>
> Cc: Jeff Garzik <[email protected]>
> Cc: James Bottomley <[email protected]>
> Acked-by: Jens Axboe <[email protected]>
> Cc: "David S. Miller" <[email protected]>
> Signed-off-by: Andrew Morton <[email protected]>
> Signed-off-by: Linus Torvalds <[email protected]>
>
> has significant errors and is going to eat people's disks, as it just
> nearly did to mine.

Really sorry about it.


> Typically what you'll see are NULL pointer derefernces in
> dma_4v_map_sg() and dma_4u_map_sg() and then the kernel usually craps
> on your superblock very shortly thereafter.
>
> The changeset above modified only prepare_sg() but that is only the
> first pass of the SG mapping algorithm of the sparc64 IOMMU layer.
>
> The second pass that fills in the entries depends upon how the first
> pass does things. So if you change the first pass decision making you
> have to update the second pass's as well.
>
> That second pass is implemented in fill_sg() (there is a version in
> both arch/sparc64/kernel/iommu.c and arch/sparc64/kernel/pci_sun4v.c),
> which probably needs new logic as was added to prepare_sg() to handle
> dma_get_max_seg_size().

PARISC, Alpha, and IA64 IOMMUs use the two-pass algorithm like SPARC
but their first pass decides how to merge sg entires (and stores that
information in the sg entries), then the second pass simpliy follows
it (Hopefully I understand these IOMMUs correctly, or else I break
them too).

I'll try this work again for 2.6.26. The SPARC IOMMUs are the most
complicated and seems that few people tests the -mm SPARC code. So I
guess that I need to find a SPARC box...

Thanks,

2008-02-06 23:19:42

by David Miller

[permalink] [raw]
Subject: Re: more iommu sg merging fallout

From: FUJITA Tomonori <[email protected]>
Date: Thu, 07 Feb 2008 08:12:36 +0900

> Really sorry about it.

I am happy to test patches you send to me in the future :-)

> PARISC, Alpha, and IA64 IOMMUs use the two-pass algorithm like SPARC
> but their first pass decides how to merge sg entires (and stores that
> information in the sg entries), then the second pass simpliy follows
> it (Hopefully I understand these IOMMUs correctly, or else I break
> them too).

For now I've removed all of the merging code from the sparc64 IOMMU
support so that other users do not get corrupt filesystems. It
basically mimicks how the intel-iommu code works, ie. no attempts to
merge anything.

I intend to put merging back in, perhaps something similar to
powerpc's merging logic but without the expensive (in my opinion)
IOMMU allocation every loop. I think it is better to determine the
segment breaks in one pass, allocate that many IOMMU entries in one
allocation, then fill them all in.

Ideally, we should have some generic code that does all of this.
Then you would only need to test one implementation.

It is definitely doable and increasingly necessary as we have so
many reimplementations of what is essentially identical core code.

2008-02-06 23:54:02

by FUJITA Tomonori

[permalink] [raw]
Subject: Re: more iommu sg merging fallout

On Wed, 06 Feb 2008 15:18:55 -0800 (PST)
David Miller <[email protected]> wrote:

> From: FUJITA Tomonori <[email protected]>
> Date: Thu, 07 Feb 2008 08:12:36 +0900
>
> > Really sorry about it.
>
> I am happy to test patches you send to me in the future :-)

Thanks a lot.


> > PARISC, Alpha, and IA64 IOMMUs use the two-pass algorithm like SPARC
> > but their first pass decides how to merge sg entires (and stores that
> > information in the sg entries), then the second pass simpliy follows
> > it (Hopefully I understand these IOMMUs correctly, or else I break
> > them too).
>
> For now I've removed all of the merging code from the sparc64 IOMMU
> support so that other users do not get corrupt filesystems. It
> basically mimicks how the intel-iommu code works, ie. no attempts to
> merge anything.

I've just saw it.


> I intend to put merging back in, perhaps something similar to
> powerpc's merging logic but without the expensive (in my opinion)
> IOMMU allocation every loop. I think it is better to determine the
> segment breaks in one pass, allocate that many IOMMU entries in one
> allocation, then fill them all in.

I thought about asking you if I can modify the SPARC IOMMUs to do
allocation in every loop.

The reason why I need the allocation in every loop is that I also need
to fix the problem that IOMMUs allocate memory areas without
considering a low level driver's segment boundary limits.

http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-11/msg07616.html
http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-12/msg02286.html

As far as I know, all the IOMMUs except for SPARC allocate a free area
in every loop but if it's too expensive for SPARC, then we need to
find a different way to handle segment boundary limits.


> Ideally, we should have some generic code that does all of this.
> Then you would only need to test one implementation.
>
> It is definitely doable and increasingly necessary as we have so
> many reimplementations of what is essentially identical core code.

Agreed though it's a very hard task.

2008-02-07 00:34:39

by David Miller

[permalink] [raw]
Subject: Re: more iommu sg merging fallout

From: FUJITA Tomonori <[email protected]>
Date: Thu, 07 Feb 2008 08:53:33 +0900

> On Wed, 06 Feb 2008 15:18:55 -0800 (PST)
> David Miller <[email protected]> wrote:
>
> > I intend to put merging back in, perhaps something similar to
> > powerpc's merging logic but without the expensive (in my opinion)
> > IOMMU allocation every loop. I think it is better to determine the
> > segment breaks in one pass, allocate that many IOMMU entries in one
> > allocation, then fill them all in.
>
> I thought about asking you if I can modify the SPARC IOMMUs to do
> allocation in every loop.
>
> The reason why I need the allocation in every loop is that I also need
> to fix the problem that IOMMUs allocate memory areas without
> considering a low level driver's segment boundary limits.
>
> http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-11/msg07616.html
> http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-12/msg02286.html
>
> As far as I know, all the IOMMUs except for SPARC allocate a free area
> in every loop but if it's too expensive for SPARC, then we need to
> find a different way to handle segment boundary limits.

Ok then what I'll do is adopt some version of powerpc's merging
allocator into the sparc64 code.

2008-02-07 01:38:27

by FUJITA Tomonori

[permalink] [raw]
Subject: Re: more iommu sg merging fallout

On Wed, 06 Feb 2008 16:01:47 -0800 (PST)
David Miller <[email protected]> wrote:

> From: FUJITA Tomonori <[email protected]>
> Date: Thu, 07 Feb 2008 08:53:33 +0900
>
> > On Wed, 06 Feb 2008 15:18:55 -0800 (PST)
> > David Miller <[email protected]> wrote:
> >
> > > I intend to put merging back in, perhaps something similar to
> > > powerpc's merging logic but without the expensive (in my opinion)
> > > IOMMU allocation every loop. I think it is better to determine the
> > > segment breaks in one pass, allocate that many IOMMU entries in one
> > > allocation, then fill them all in.
> >
> > I thought about asking you if I can modify the SPARC IOMMUs to do
> > allocation in every loop.
> >
> > The reason why I need the allocation in every loop is that I also need
> > to fix the problem that IOMMUs allocate memory areas without
> > considering a low level driver's segment boundary limits.
> >
> > http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-11/msg07616.html
> > http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-12/msg02286.html
> >
> > As far as I know, all the IOMMUs except for SPARC allocate a free area
> > in every loop but if it's too expensive for SPARC, then we need to
> > find a different way to handle segment boundary limits.
>
> Ok then what I'll do is adopt some version of powerpc's merging
> allocator into the sparc64 code.

Great, thanks! SPARC IOMMUs use bitmap for the free area management
like POWER IOMMUs so it could use lib/iommu-helper as POWER does.

2008-02-12 05:40:18

by David Miller

[permalink] [raw]
Subject: Re: more iommu sg merging fallout

From: FUJITA Tomonori <[email protected]>
Date: Thu, 07 Feb 2008 10:38:01 +0900

> Great, thanks! SPARC IOMMUs use bitmap for the free area management
> like POWER IOMMUs so it could use lib/iommu-helper as POWER does.

Please look at Linus's current tree, I believe I have things
in a working state, and the DMA mask portions should be easier
to add now.

2008-02-16 06:04:17

by FUJITA Tomonori

[permalink] [raw]
Subject: Re: more iommu sg merging fallout

Sorry for the late response,

On Mon, 11 Feb 2008 21:40:36 -0800 (PST)
David Miller <[email protected]> wrote:

> From: FUJITA Tomonori <[email protected]>
> Date: Thu, 07 Feb 2008 10:38:01 +0900
>
> > Great, thanks! SPARC IOMMUs use bitmap for the free area management
> > like POWER IOMMUs so it could use lib/iommu-helper as POWER does.
>
> Please look at Linus's current tree, I believe I have things
> in a working state, and the DMA mask portions should be easier
> to add now.

Thanks! It looks great.

iommu->page_table_map_base is on IO_PAGE_SIZE boundary? If so, the
following patch works, I think (sorry, it's only compile tested
again).

Thanks,

=
>From 91af83802c30d071f98444846e85310a8d58ab3d Mon Sep 17 00:00:00 2001
From: FUJITA Tomonori <[email protected]>
Date: Sat, 16 Feb 2008 14:29:02 +0900
Subject: [PATCH] sparc64: make IOMMU code respect the segment boundary limits

Signed-off-by: FUJITA Tomonori <[email protected]>
---
arch/sparc64/kernel/iommu.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/sparc64/kernel/iommu.c b/arch/sparc64/kernel/iommu.c
index d3276eb..ffefbf7 100644
--- a/arch/sparc64/kernel/iommu.c
+++ b/arch/sparc64/kernel/iommu.c
@@ -134,7 +134,8 @@ unsigned long iommu_range_alloc(struct device *dev,
else
boundary_size = ALIGN(1UL << 32, 1 << IO_PAGE_SHIFT);

- n = iommu_area_alloc(arena->map, limit, start, npages, 0,
+ n = iommu_area_alloc(arena->map, limit, start, npages,
+ iommu->page_table_map_base >> IO_PAGE_SHIFT,
boundary_size >> IO_PAGE_SHIFT, 0);
if (n == -1) {
if (likely(pass < 1)) {
--
1.5.3.4

2008-02-18 07:41:17

by David Miller

[permalink] [raw]
Subject: Re: more iommu sg merging fallout

From: FUJITA Tomonori <[email protected]>
Date: Sat, 16 Feb 2008 15:03:43 +0900

> [PATCH] sparc64: make IOMMU code respect the segment boundary limits
>
> Signed-off-by: FUJITA Tomonori <[email protected]>

Looks good, but I think it will break sound for some ALI chips.

Please see arch/sparc64/kernel/pci.c:ali_sound_dma_hack()
and it's caller pci_dma_supported().

2008-02-19 10:58:26

by FUJITA Tomonori

[permalink] [raw]
Subject: Re: more iommu sg merging fallout

On Sun, 17 Feb 2008 23:41:42 -0800 (PST)
David Miller <[email protected]> wrote:

> From: FUJITA Tomonori <[email protected]>
> Date: Sat, 16 Feb 2008 15:03:43 +0900
>
> > [PATCH] sparc64: make IOMMU code respect the segment boundary limits
> >
> > Signed-off-by: FUJITA Tomonori <[email protected]>
>
> Looks good, but I think it will break sound for some ALI chips.
>
> Please see arch/sparc64/kernel/pci.c:ali_sound_dma_hack()
> and it's caller pci_dma_supported().

Could you explain the problem a little more?

The shift argument is only used as an offset when iommu-helper decides
whether a memory area (index plus npages) spanning LLD's segment
boudnary size or not.

For example, if a device's segment boudary size is 64K, the helper see
the following value is larger than 64K or not:

((the offset + index of the IOMMU table) ((64K / 8K) - 1) + npages

2008-02-21 06:55:42

by David Miller

[permalink] [raw]
Subject: Re: more iommu sg merging fallout

From: FUJITA Tomonori <[email protected]>
Date: Tue, 19 Feb 2008 19:57:58 +0900

> On Sun, 17 Feb 2008 23:41:42 -0800 (PST)
> David Miller <[email protected]> wrote:
>
> > From: FUJITA Tomonori <[email protected]>
> > Date: Sat, 16 Feb 2008 15:03:43 +0900
> >
> > > [PATCH] sparc64: make IOMMU code respect the segment boundary limits
> > >
> > > Signed-off-by: FUJITA Tomonori <[email protected]>
> >
> > Looks good, but I think it will break sound for some ALI chips.
> >
> > Please see arch/sparc64/kernel/pci.c:ali_sound_dma_hack()
> > and it's caller pci_dma_supported().
>
> Could you explain the problem a little more?

Sorry, I was thinking of a different problem (for when I add
dma mask support to this code).

I will apply your patch, thanks!