2011-03-11 11:51:05

by Arnd Bergmann

Subject: Re: [PATCH 3/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver

On Friday 11 March 2011, Marek Szyprowski wrote:
>
> We followed the style of iommu API for other mainline ARM platforms (both OMAP and MSM
> also have custom API for their iommu modules). I've briefly checked include/linux/iommu.h
> API and I've noticed that it has been designed mainly for KVM support. There is also
> include/linux/intel-iommu.h interface, but I it is very specific to intel gfx chips.

The MSM code actually uses the generic iommu.h code, using register_iommu, so
the drivers can use the regular iommu_map.

I believe the omap code predates the iommu API, and should really be changed
to use that. At least it was added before I started reviewing the code.

The iommu API is not really meant to be KVM specific, it's just that the
in-tree users are basically limited to KVM at the moment. Another user that
is coming up soon is the vfio device driver that can be used to transparently
pass devices to user space. The idea behind the IOMMU API is that you can
map arbitrary bus addresses to physical memory addresses, but it does not
deal with allocating the bus addresses or providing buffer management such
as cache flushes.

> Is there any example how include/linux/dma-mapping.h interface can be used for iommu
> mappings?

The dma-mapping API is the normal interface that you should use for IOMMUs
that sit between DMA devices and kernel memory. The idea is that you
completely abstract the concept of an IOMMU so the device driver uses
the same code for talking to a device with an IOMMU and another device with
a linear mapping or an swiotlb bounce buffer.

This means that the user of the dma-mapping API does not get to choose the
bus addresses, but instead you use the API to get a bus address for a
chunk of memory, and then you can pass that address to a device.

See arch/powerpc/kernel/iommu.c and arch/x86/kernel/amd_iommu.c for common
examples of how this is implemented. The latter one actually implements
both the iommu_ops for iommu.h and dma_map_ops for dma-mapping.h.
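
As a rough illustration of the abstraction described above, here is a minimal user-space sketch in C. It is not kernel code: every name in it (toy_dma_ops, toy_device, toy_dma_map and the two backends) is hypothetical and only loosely modelled on the idea behind dma_map_ops. It assumes page-aligned physical addresses and uses a simple bump allocator where a real IOMMU backend would program page tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t toy_bus_addr_t;

/* Hypothetical analogue of dma_map_ops: one hook per backend. */
struct toy_dma_ops {
    toy_bus_addr_t (*map)(void *ctx, uint32_t phys, size_t len);
};

struct toy_device {
    const struct toy_dma_ops *ops;
    void *ctx;                      /* backend-private state */
};

/* Backend 1: linear mapping, bus address = physical address + offset. */
struct toy_linear {
    uint32_t offset;
};

static toy_bus_addr_t toy_linear_map(void *ctx, uint32_t phys, size_t len)
{
    struct toy_linear *lin = ctx;

    (void)len;
    return phys + lin->offset;
}

/* Backend 2: IOMMU, the backend hands out the next free bus address. */
struct toy_iommu {
    toy_bus_addr_t next;
};

static toy_bus_addr_t toy_iommu_backend_map(void *ctx, uint32_t phys, size_t len)
{
    struct toy_iommu *mmu = ctx;
    toy_bus_addr_t bus = mmu->next;

    (void)phys;                     /* a real backend would fill page tables */
    /* advance by the length rounded up to whole 4KB pages */
    mmu->next += (toy_bus_addr_t)((len + 0xfff) & ~(size_t)0xfff);
    return bus;
}

const struct toy_dma_ops toy_linear_ops = { .map = toy_linear_map };
const struct toy_dma_ops toy_iommu_ops = { .map = toy_iommu_backend_map };

/* What a driver calls: it never chooses the bus address itself. */
toy_bus_addr_t toy_dma_map(struct toy_device *dev, uint32_t phys, size_t len)
{
    return dev->ops->map(dev->ctx, phys, len);
}
```

The driver-side call is the same for both devices; only the ops pointer stored in the device differs, which is the point of hiding the IOMMU behind the mapping interface.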

Arnd


2011-03-11 12:35:44

by Marek Szyprowski

Subject: RE: [PATCH 3/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver

Hello,

On Friday, March 11, 2011 12:51 PM Arnd Bergmann wrote:

> On Friday 11 March 2011, Marek Szyprowski wrote:
> >
> > We followed the style of iommu API for other mainline ARM platforms (both OMAP and MSM
> > also have custom API for their iommu modules). I've briefly checked include/linux/iommu.h
> > API and I've noticed that it has been designed mainly for KVM support. There is also
> > include/linux/intel-iommu.h interface, but I it is very specific to intel gfx chips.
>
> The MSM code actually uses the generic iommu.h code, using register_iommu, so
> the drivers can use the regular iommu_map.
>
> I believe the omap code predates the iommu API, and should really be changed
> to use that. At least it was added before I started reviewing the code.
>
> The iommu API is not really meant to be KVM specific, it's just that the
> in-tree users are basically limited to KVM at the moment. Another user that
> is coming up soon is the vmio device driver that can be used to transparently
> pass devices to user space. The idea behind the IOMMU API is that you can
> map arbitrary bus addresses to physical memory addresses, but it does not
> deal with allocating the bus addresses or providing buffer management such
> as cache flushes.

Yes, I've noticed this, and it is basically what we expect from an iommu driver.
However, the iommu.h API requires a separate call to map each single memory page.
This is quite an inefficient approach, and imho the API needs to be extended to allow
mapping an arbitrary set of pages.

> > Is there any example how include/linux/dma-mapping.h interface can be used for iommu
> > mappings?
>
> The dma-mapping API is the normal interface that you should use for IOMMUs
> that sit between DMA devices and kernel memory. The idea is that you
> completely abstract the concept of an IOMMU so the device driver uses
> the same code for talking to a device with an IOMMU and another device with
> a linear mapping or an swiotlb bounce buffer.
>
> This means that the user of the dma-mapping API does not get to choose the
> bus addresses, but instead you use the API to get a bus address for a
> chunk of memory, and then you can pass that address to a device.
>
> See arch/powerpc/kernel/iommu.c and arch/x86/kernel/amd_iommu.c for common
> examples of how this is implemented. The latter one actually implements
> both the iommu_ops for iommu.h and dma_map_ops for dma-mapping.h.

Thanks for your comments! We will check how suitable it is for our case.

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center

2011-03-11 14:08:08

by Arnd Bergmann

Subject: Re: [PATCH 3/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver

On Friday 11 March 2011, Marek Szyprowski wrote:
> > The iommu API is not really meant to be KVM specific, it's just that the
> > in-tree users are basically limited to KVM at the moment. Another user that
> > is coming up soon is the vmio device driver that can be used to transparently
> > pass devices to user space. The idea behind the IOMMU API is that you can
> > map arbitrary bus addresses to physical memory addresses, but it does not
> > deal with allocating the bus addresses or providing buffer management such
> > as cache flushes.
>
> Yea, I've noticed this and this basically what we expect from iommu driver.
> However the iommu.h API requires a separate call to map each single memory page.
> This is quite ineffective approach and imho the API need to be extended to allow
> mapping of the arbitrary set of pages.

We can always discuss extensions to the existing infrastructure; adding
an interface for mapping an array of page pointers in the iommu API
sounds like a good idea.
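
To make the proposed extension concrete, the following is a small user-space toy model in C, not the kernel iommu API. All names (toy_domain, toy_map_page, toy_map_pages) are hypothetical, and the per-call flush counter is an assumption of this model, used only to make the per-call overhead visible. It contrasts mapping one page per call with a batch call that takes an array of page addresses and places them at consecutive bus addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_SIZE  (1u << TOY_PAGE_SHIFT)
#define TOY_NUM_PAGES  256u          /* toy bus address space: 1MB */

/* Hypothetical stand-in for an IOMMU domain: one slot per bus page. */
struct toy_domain {
    uint32_t phys[TOY_NUM_PAGES];    /* 0 means "not mapped" */
    unsigned int flushes;            /* flushes issued so far (model only) */
};

/* Per-page mapping: one call, and in this model one flush, per page. */
int toy_map_page(struct toy_domain *d, uint32_t iova, uint32_t phys)
{
    uint32_t idx = iova >> TOY_PAGE_SHIFT;

    if (phys == 0 || ((iova | phys) & (TOY_PAGE_SIZE - 1)))
        return -1;                   /* invalid or unaligned */
    if (idx >= TOY_NUM_PAGES || d->phys[idx] != 0)
        return -1;                   /* out of range or already mapped */
    d->phys[idx] = phys;
    d->flushes++;
    return 0;
}

/*
 * Batch mapping: place n scattered physical pages at consecutive bus
 * addresses starting at iova, with a single flush at the end.
 * On failure nothing is left mapped.
 */
int toy_map_pages(struct toy_domain *d, uint32_t iova,
                  const uint32_t *pages, size_t n)
{
    uint32_t first = iova >> TOY_PAGE_SHIFT;

    if (iova & (TOY_PAGE_SIZE - 1))
        return -1;
    if (first >= TOY_NUM_PAGES || n > TOY_NUM_PAGES - first)
        return -1;
    /* validate everything first so that a failure needs no unwinding */
    for (size_t i = 0; i < n; i++) {
        if (pages[i] == 0 || (pages[i] & (TOY_PAGE_SIZE - 1)) ||
            d->phys[first + i] != 0)
            return -1;
    }
    for (size_t i = 0; i < n; i++)
        d->phys[first + i] = pages[i];
    if (n > 0)
        d->flushes++;
    return 0;
}
```

Both paths end up with the same table contents; the batch variant simply lets the implementation amortise per-call work over the whole array.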

I also think that we should not really have separate iommu and dma-mapping
interfaces, but rather have a portable way to define an iommu so that it
can be used through the dma-mapping interfaces. I'm not asking you to
do that as a prerequisite to merging your driver, but it may be good to
keep in mind that the current situation is still lacking and that any
suggestion for improving this as part of your work to support the
samsung IOMMU is welcome.

Note that the ARM implementation of the dma-mapping.h interface currently
does not support IOMMUs, but that could be changed by wrapping it
using the include/asm-generic/dma-mapping-common.h infrastructure.

Arnd

2011-03-11 14:52:16

by Marek Szyprowski

Subject: RE: [PATCH 3/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver

Hello,

On Friday, March 11, 2011 3:08 PM Arnd Bergmann wrote:

> On Friday 11 March 2011, Marek Szyprowski wrote:
> > > The iommu API is not really meant to be KVM specific, it's just that the
> > > in-tree users are basically limited to KVM at the moment. Another user that
> > > is coming up soon is the vmio device driver that can be used to transparently
> > > pass devices to user space. The idea behind the IOMMU API is that you can
> > > map arbitrary bus addresses to physical memory addresses, but it does not
> > > deal with allocating the bus addresses or providing buffer management such
> > > as cache flushes.
> >
> > Yea, I've noticed this and this basically what we expect from iommu driver.
> > However the iommu.h API requires a separate call to map each single memory page.
> > This is quite ineffective approach and imho the API need to be extended to allow
> > mapping of the arbitrary set of pages.
>
> We can always discuss extensions to the existing infrastructure, adding
> an interface for mapping an array of page pointers in the iommu API
> sounds like a good idea.

We will investigate this API further. At first sight it looks like it won't take
much work to port/rewrite our driver to fit the iommu.h API.

> I also think that we should not really have separate iommu and dma-mapping
> interfaces, but rather have a portable way to define an iommu so that it
> can be used through the dma-mapping interfaces. I'm not asking you to
> do that as a prerequisite to merging your driver, but it may be good to
> keep in mind that the current situation is still lacking and that any
> suggestion for improving this as part of your work to support the
> samsung IOMMU is welcome.

Well, creating a portable iommu framework and merging it with the dma-mapping interface
looks like a much harder (and more time-consuming) task. There is definitely a need for
it. I hope that it can be developed incrementally, starting from the current iommu.h
and dma-mapping.h interfaces. Please note that there might be some subtle differences
in the hardware that such a framework must be aware of. The first obvious one is the
hardware design. Some platforms have a central iommu unit; others (like Samsung Exynos4)
have a separate iommu unit per device driver (this is still a simplification,
because a video codec device has 2 memory interfaces and 2 iommu units). Currently
I probably don't have enough knowledge to predict the other possible issues that need
to be taken into account in a portable and generic iommu/dma-mapping framework.

> Note that the ARM implementation of the dma-mapping.h interface currently
> does not support IOMMUs, but that could be changed by wrapping it
> using the include/asm-generic/dma-mapping-common.h infrastructure.

The ARM dma-mapping framework also requires some additional research for better DMA
support (there are still issues with multiple mappings to be resolved).

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center

2011-03-11 15:15:10

by Arnd Bergmann

Subject: Re: [PATCH 3/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver

On Friday 11 March 2011, Marek Szyprowski wrote:
> On Friday, March 11, 2011 3:08 PM Arnd Bergmann wrote:
>
> > On Friday 11 March 2011, Marek Szyprowski wrote:
> > > > The iommu API is not really meant to be KVM specific, it's just that the
> > > > in-tree users are basically limited to KVM at the moment. Another user that
> > > > is coming up soon is the vmio device driver that can be used to transparently
> > > > pass devices to user space. The idea behind the IOMMU API is that you can
> > > > map arbitrary bus addresses to physical memory addresses, but it does not
> > > > deal with allocating the bus addresses or providing buffer management such
> > > > as cache flushes.
> > >
> > > Yea, I've noticed this and this basically what we expect from iommu driver.
> > > However the iommu.h API requires a separate call to map each single memory page.
> > > This is quite ineffective approach and imho the API need to be extended to allow
> > > mapping of the arbitrary set of pages.
> >
> > We can always discuss extensions to the existing infrastructure, adding
> > an interface for mapping an array of page pointers in the iommu API
> > sounds like a good idea.
>
> We will investigate this API further. From the first sight it looks it won't take
> much work to port/rewrite our driver to fit into iommu.h API.

Ok, sounds good.

> > I also think that we should not really have separate iommu and dma-mapping
> > interfaces, but rather have a portable way to define an iommu so that it
> > can be used through the dma-mapping interfaces. I'm not asking you to
> > do that as a prerequisite to merging your driver, but it may be good to
> > keep in mind that the current situation is still lacking and that any
> > suggestion for improving this as part of your work to support the
> > samsung IOMMU is welcome.
>
> Well creating a portable iommu framework and merging it with dma-mapping interface
> looks like a much harder (and time consuming) task. There is definitely a need for
> it. I hope that it can be developed incrementally starting from the current iommu.h
> and dma-mapping.h interfaces.

Yes, that is the idea. Maybe we should add it to the list of things that the
Linaro kernel working group can target for the November release?

> Please note that there might be some subtle differences
> in the hardware that such framework must be aware. The first obvious one is the
> hardware design. Some platform has central iommu unit, other (like Samsung Exynos4)
> has a separate iommu unit per each device driver (this is still a simplification,
> because a video codec device has 2 memory interfaces and 2 iommu units). Currently
> I probably have not enough knowledge to predict the other possible issues that need
> to be taken into account in the portable and generic iommu/dma-mapping frame-work.

The dma-mapping API can deal well with one IOMMU per device, but would
need some tricks to work with one device that has two separate IOMMUs.

I'm not very familiar with the iommu API, but in the common KVM scenario,
you need one IOMMU per device, so it should handle that just fine as well.

> > Note that the ARM implementation of the dma-mapping.h interface currently
> > does not support IOMMUs, but that could be changed by wrapping it
> > using the include/asm-generic/dma-mapping-common.h infrastructure.
>
> ARM dma-mapping framework also requires some additional research for better DMA
> support (there are still issues with multiple mappings to be resolved).

You mean mapping the same memory into multiple devices, or a different problem?

Arnd

2011-03-11 15:39:21

by Marek Szyprowski

Subject: RE: [PATCH 3/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver

Hello,

On Friday, March 11, 2011 4:15 PM Arnd Bergmann wrote:

> On Friday 11 March 2011, Marek Szyprowski wrote:
> > On Friday, March 11, 2011 3:08 PM Arnd Bergmann wrote:
> >
> > > On Friday 11 March 2011, Marek Szyprowski wrote:
> > > > > The iommu API is not really meant to be KVM specific, it's just that the
> > > > > in-tree users are basically limited to KVM at the moment. Another user that
> > > > > is coming up soon is the vmio device driver that can be used to transparently
> > > > > pass devices to user space. The idea behind the IOMMU API is that you can
> > > > > map arbitrary bus addresses to physical memory addresses, but it does not
> > > > > deal with allocating the bus addresses or providing buffer management such
> > > > > as cache flushes.
> > > >
> > > > Yea, I've noticed this and this basically what we expect from iommu driver.
> > > > However the iommu.h API requires a separate call to map each single memory page.
> > > > This is quite ineffective approach and imho the API need to be extended to allow
> > > > mapping of the arbitrary set of pages.
> > >
> > > We can always discuss extensions to the existing infrastructure, adding
> > > an interface for mapping an array of page pointers in the iommu API
> > > sounds like a good idea.
> >
> > We will investigate this API further. From the first sight it looks it won't take
> > much work to port/rewrite our driver to fit into iommu.h API.
>
> Ok, sounds good.
>
> > > I also think that we should not really have separate iommu and dma-mapping
> > > interfaces, but rather have a portable way to define an iommu so that it
> > > can be used through the dma-mapping interfaces. I'm not asking you to
> > > do that as a prerequisite to merging your driver, but it may be good to
> > > keep in mind that the current situation is still lacking and that any
> > > suggestion for improving this as part of your work to support the
> > > samsung IOMMU is welcome.
> >
> > Well creating a portable iommu framework and merging it with dma-mapping interface
> > looks like a much harder (and time consuming) task. There is definitely a need for
> > it. I hope that it can be developed incrementally starting from the current iommu.h
> > and dma-mapping.h interfaces.
>
> Yes, that is the idea. Maybe we should add it to the list things that the
> Linaro kernel working group can target for the November release?
>
> > Please note that there might be some subtle differences
> > in the hardware that such framework must be aware. The first obvious one is the
> > hardware design. Some platform has central iommu unit, other (like Samsung Exynos4)
> > has a separate iommu unit per each device driver (this is still a simplification,
> > because a video codec device has 2 memory interfaces and 2 iommu units). Currently
> > I probably have not enough knowledge to predict the other possible issues that need
> > to be taken into account in the portable and generic iommu/dma-mapping frame-work.
>
> The dma-mapping API can deal well with one IOMMU per device, but would
> need some tricks to work with one device that has two separate IOMMUs.

We need to investigate the internals of the dma-mapping API first. Right now I know too
little about this area.

> I'm not very familar with the iommu API, but in the common KVM scenario,
> you need one IOMMU per device, so it should handle that just fine as well.

Well, afair there are also systems with one central iommu module, which is shared
between devices. I have no idea how such model will fit into the dma-mapping API.

> > > Note that the ARM implementation of the dma-mapping.h interface currently
> > > does not support IOMMUs, but that could be changed by wrapping it
> > > using the include/asm-generic/dma-mapping-common.h infrastructure.
> >
> > ARM dma-mapping framework also requires some additional research for better DMA
> > support (there are still issues with multiple mappings to be resolved).
>
> You mean mapping the same memory into multiple devices, or a different problem?

Mapping the same memory area multiple times with different cache settings is not
legal on ARMv7+ systems. Currently the problems might be caused by the low-memory
kernel linear mapping and a second mapping created, for example, by the
dma_alloc_coherent() function.

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center

2011-03-11 16:00:29

by Arnd Bergmann

Subject: Re: [PATCH 3/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver

On Friday 11 March 2011, Marek Szyprowski wrote:
> > > > does not support IOMMUs, but that could be changed by wrapping it
> > > > using the include/asm-generic/dma-mapping-common.h infrastructure.
> > >
> > > ARM dma-mapping framework also requires some additional research for better DMA
> > > support (there are still issues with multiple mappings to be resolved).
> >
> > You mean mapping the same memory into multiple devices, or a different problem?
>
> Mapping the same memory area multiple times with different cache settings is not
> legal on ARMv7+ systems. Currently the problems might caused by the low-memory
> kernel linear mapping and second mapping created for example by dma_alloc_coherent()
> function.

Yes, I know this problem, but I don't think the case you describe is a serious
limitation (there are more interesting cases, though): dma_map_single() etc
will create additional *bus* addresses for a physical address, not additional
virtual addresses.

dma_alloc_coherent should allocate memory that is not also mapped cached,
which is what I thought we do correctly.

Arnd

2011-03-14 12:37:54

by Cho KyongHo

Subject: Re: [PATCH 3/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver

2011/3/12 Arnd Bergmann:
> On Friday 11 March 2011, Marek Szyprowski wrote:
>> > > > does not support IOMMUs, but that could be changed by wrapping it
>> > > > using the include/asm-generic/dma-mapping-common.h infrastructure.
>> > >
>> > > ARM dma-mapping framework also requires some additional research for better DMA
>> > > support (there are still issues with multiple mappings to be resolved).
>> >
>> > You mean mapping the same memory into multiple devices, or a different problem?
>>
>> Mapping the same memory area multiple times with different cache settings is not
>> legal on ARMv7+ systems. Currently the problems might caused by the low-memory
>> kernel linear mapping and second mapping created for example by dma_alloc_coherent()
>> function.
>
> Yes, I know this problem, but I don't think the case you describe is a serious
> limitation (there are more interesting cases, though): dma_map_single() etc
> will create additional *bus* addresses for a physical address, not additional
> virtual addresses.
>
> dma_alloc_coherent should allocate memory that is not also mapped cached,
> which is what I thought we do correctly.

I have also noticed that dma_map_single/page/sg() can map physical
memory into an arbitrary device address region.
But that is not a sufficient solution for the various kinds of IOMMUs.
As Kukjin Kim mentioned, we need to support page sizes larger than 4KB,
because larger pages reduce the number of TLB misses.

Our IOMMU (system MMU) supports all page sizes of the ARM architecture:
16MB, 1MB, 64KB and 4KB.
Since the largest block the buddy system can allocate on a 32-bit
architecture is 4MB, our system supports all page sizes except 16MB.
We have found that larger page sizes improve DMA performance
significantly (by more than 10%, approximately).
Large pages are not a problem for peripheral devices,
because their address space does not suffer from external fragmentation.
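
The 4MB figure follows from simple arithmetic, sketched below in C. The constants are assumptions rather than values taken from this thread: 4KB pages (PAGE_SHIFT of 12) and the default MAX_ORDER of 11, under which the buddy allocator serves orders 0 to 10. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12    /* assumed: 4KB pages */
#define MAX_ORDER  11    /* assumed default: orders 0 .. MAX_ORDER - 1 */

/* Largest physically contiguous block one buddy allocation can return. */
uint32_t max_buddy_block(void)
{
    return (uint32_t)1 << (PAGE_SHIFT + MAX_ORDER - 1);
}

/* Nonzero if one IOMMU entry of this size fits in one buddy allocation. */
int fits_in_buddy_block(uint32_t entry_size)
{
    return entry_size <= max_buddy_block();
}
```

Under these assumptions the largest block is 2^(12 + 10) bytes = 4MB, so 4KB, 64KB and 1MB entries can each be backed by a single allocation, while a 16MB entry cannot.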

Thanks to Arnd; I had not known about include/linux/iommu.h.

However, like dma-mapping.h, it is not enough for our
requirements, even though it allows private data to be stored in
iommu_domain for platform-specific requirements.

I think we can consider another solution for the various requirements.
I think one of the most feasible solutions is VCMM.
Alternatively, we can enhance include/linux/iommu.h, using VCMM as a reference.

You can find the most recent VCMM submitted at
http://marc.info/?l=linux-kernel&m=129255948319341&w=2

It looks somewhat complex, but it includes most of the features required for
various IOMMUs, which cannot easily be provided by include/linux/iommu.h.

You can find VCMM core in
http://git.kernel.org/?p=linux/kernel/git/kki_ap/linux-2.6-samsung.git;a=blob;f=mm/vcm.c;h=9fff0106ec0078fad1488308305c8486adbed9c0;hb=refs/heads/2.6.36-samsung

and platform specific implementation of VCMM in
http://git.kernel.org/?p=linux/kernel/git/kki_ap/linux-2.6-samsung.git;a=blob;f=arch/arm/plat-s5p/s5p-vcm.c;h=7498c800aef8b01082e1b1c3ea0f66cefe3c85a1;hb=refs/heads/2.6.36-samsung

Cho KyongHo.

2011-03-14 12:47:22

by Russell King - ARM Linux

Subject: Re: [PATCH 3/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver

On Mon, Mar 14, 2011 at 09:37:51PM +0900, KyongHo Cho wrote:
> I have also noticed that dma_map_single/page/sg() can map physical
> memory into an arbitrary device address region.
> But it is not enough solution for various kinds of IOMMUs.
> As Kukjin Kim addressed, we need to support larger page size than 4KB
> because we can reduce TLB miss when we have larger page size.
>
> Our IOMMU(system mmu) supports all page size of ARM architecture
> including 16MB, 1MB, 64KB and 4KB.
> Since the largest size supported by buddy system of 32-bit architecture is 4MB,
> our system support all page sizes except 16MB.
> We proved that larger page size is helpful for DMA performance
> significantly (more than 10%, approximately).
> Big page size is not a problem for peripheral devices
> because their address space is not suffer from external fragmentation.

1. dma_map_single() et al. is used for mapping *system* *RAM* for devices
   using whatever is necessary. It must not be used for trying to set up
   arbitrary other mappings.

2. It doesn't matter where the memory for dma_map_single() et al. comes
   from, provided the virtual address is a valid system RAM address or
   the struct page * is a valid struct page in the memory map (iow, you
   can't create this yourself).

3. In the case of an IOMMU, the DMA API does not limit you to only using
   4K pages to set up the IOMMU mappings. You can use whatever you like,
   provided the hardware can cope with it. You can coalesce several
   existing entries together, provided you track what you're doing and can
   undo what's been done when the mapping is no longer required.

So really there's no reason not to use 64K, 1M and 16M IOMMU entries if
that's the size of buffer which has been passed to the DMA API.
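
Point 3 can be sketched as a small user-space C routine. This is an illustration under stated assumptions, not the DMA API or any driver's code: the function names are hypothetical, the entry sizes are the four mentioned earlier in this thread, and an entry is assumed to be usable only when both the bus address and the physical address are aligned to its size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Entry sizes mentioned in the thread, largest first. */
static const uint32_t entry_sizes[] = {
    16u << 20,   /* 16MB */
    1u << 20,    /* 1MB */
    64u << 10,   /* 64KB */
    4u << 10,    /* 4KB */
};
#define NUM_SIZES (sizeof(entry_sizes) / sizeof(entry_sizes[0]))

/*
 * Pick the largest entry usable at this point: both addresses must be
 * aligned to it and at least that many bytes must remain.
 * Returns 0 if not even a 4KB entry fits.
 */
uint32_t pick_entry_size(uint32_t bus, uint32_t phys, uint32_t remaining)
{
    for (size_t i = 0; i < NUM_SIZES; i++) {
        uint32_t sz = entry_sizes[i];

        if (((bus | phys) & (sz - 1)) == 0 && remaining >= sz)
            return sz;
    }
    return 0;
}

/*
 * Count the entries needed to map len bytes, greedily using the largest
 * entry at each step. Returns 0 if the request cannot be covered by
 * whole 4KB-aligned entries.
 */
unsigned int count_entries(uint32_t bus, uint32_t phys, uint32_t len)
{
    unsigned int n = 0;

    while (len > 0) {
        uint32_t sz = pick_entry_size(bus, phys, len);

        if (sz == 0)
            return 0;
        bus += sz;
        phys += sz;
        len -= sz;
        n++;
    }
    return n;
}
```

With both addresses 1MB-aligned, a 4MB buffer needs four 1MB entries instead of 1024 4KB entries. If the physical address is only 4KB-aligned relative to the bus address, the routine falls back to 4KB entries for the whole buffer, which is why alignment of the allocation matters as much as its size.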

2011-03-14 13:32:28

by Arnd Bergmann

Subject: Re: [PATCH 3/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver

On Monday 14 March 2011, KyongHo Cho wrote:
> I think we can consider another solution for the various requirements.
> I think one of the most possible solutions is VCMM.
> Or we can enhance include/linux/iommu.h with reference of VCMM.

I think extending or changing the existing interface would be much
preferred. It's always better to limit the number of interfaces
that do the same thing, and we already have more duplication than
we want with the two dma-mapping.h and iommu.h interfaces.

Note that any aspect of the existing interface can be changed if
necessary, as long as there is a way to migrate all the existing
users. Since the iommu API is not exported to user space, there
is no requirement to keep it stable.

Arnd

2011-03-15 01:45:56

by Inki Dae

Subject: Re: [PATCH 3/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver

2011/3/14 Russell King - ARM Linux:
> On Mon, Mar 14, 2011 at 09:37:51PM +0900, KyongHo Cho wrote:
>> I have also noticed that dma_map_single/page/sg() can map physical
>> memory into an arbitrary device address region.
>> But it is not enough solution for various kinds of IOMMUs.
>> As Kukjin Kim addressed, we need to support larger page size than 4KB
>> because we can reduce TLB miss when we have larger page size.
>>
>> Our IOMMU(system mmu) supports all page size of ARM architecture
>> including 16MB, 1MB, 64KB and 4KB.
>> Since the largest size supported by buddy system of 32-bit architecture is 4MB,
>> our system support all page sizes except 16MB.
>> We proved that larger page size is helpful for DMA performance
>> significantly (more than 10%, approximately).
>> Big page size is not a problem for peripheral devices
>> because their address space is not suffer from external fragmentation.
>
> 1. dma_map_single() et.al. is used for mapping *system* *RAM* for devices
>    using whatever is necessary. It must not be used for trying to setup
>    arbitary other mappings.
>
> 2. It doesn't matter where the memory for dma_map_single() et.al. comes
>    from provided the virtual address is a valid system RAM address or
>    the struct page * is a valid struct page in the memory map (iow, you
>    can't create this yourself.)

Do you mean that we cannot have arbitrary virtual address mappings for an
iommu-based device?
Actually, we map memory into an arbitrary device virtual address
space, not the kernel virtual address space.

>
> 3. In the case of an IOMMU, the DMA API does not limit you to only using
>    4K pages to setup the IOMMU mappings. You can use whatever you like
>    provided the hardware can cope with it. You can coalesce several
>    existing entries together provided you track what you're doing and can
>    undo what's been done when the mapping is no longer required.
>
> So really there's no reason not to use 64K, 1M and 16M IOMMU entries if
> that's the size of buffer which has been passed to the DMA API.
>
> _______________________________________________
> linux-arm-kernel mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>

2011-03-15 08:35:51

by Russell King - ARM Linux

Subject: Re: [PATCH 3/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver

On Tue, Mar 15, 2011 at 10:45:50AM +0900, InKi Dae wrote:
> 2011/3/14 Russell King - ARM Linux:
> > On Mon, Mar 14, 2011 at 09:37:51PM +0900, KyongHo Cho wrote:
> >> I have also noticed that dma_map_single/page/sg() can map physical
> >> memory into an arbitrary device address region.
> >> But it is not enough solution for various kinds of IOMMUs.
> >> As Kukjin Kim addressed, we need to support larger page size than 4KB
> >> because we can reduce TLB miss when we have larger page size.
> >>
> >> Our IOMMU(system mmu) supports all page size of ARM architecture
> >> including 16MB, 1MB, 64KB and 4KB.
> >> Since the largest size supported by buddy system of 32-bit architecture is 4MB,
> >> our system support all page sizes except 16MB.
> >> We proved that larger page size is helpful for DMA performance
> >> significantly (more than 10%, approximately).
> >> Big page size is not a problem for peripheral devices
> >> because their address space is not suffer from external fragmentation.
> >
> > 1. dma_map_single() et.al. is used for mapping *system* *RAM* for devices
> >    using whatever is necessary. It must not be used for trying to setup
> >    arbitary other mappings.
> >
> > 2. It doesn't matter where the memory for dma_map_single() et.al. comes
> >    from provided the virtual address is a valid system RAM address or
> >    the struct page * is a valid struct page in the memory map (iow, you
> >    can't create this yourself.)
>
> You mean that we cannot have arbitrary virtual address mapping for
> iommu based device?

No. I mean exactly what I said - I'm talking about the DMA API in the
above two points. The implication is that you cannot create arbitrary
mappings of non-system RAM with the DMA API.

> actually, we have memory mapping to arbitrary device virtual address
> space, not kernel virtual address space.
>
> >
> > 3. In the case of an IOMMU, the DMA API does not limit you to only using
> >    4K pages to setup the IOMMU mappings. You can use whatever you like
> >    provided the hardware can cope with it. You can coalesce several
> >    existing entries together provided you track what you're doing and can
> >    undo what's been done when the mapping is no longer required.
> >
> > So really there's no reason not to use 64K, 1M and 16M IOMMU entries if
> > that's the size of buffer which has been passed to the DMA API.

Subject: Re: [PATCH 3/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver

Russell King - ARM Linux wrote:
> On Tue, Mar 15, 2011 at 10:45:50AM +0900, InKi Dae wrote:
>> 2011/3/14 Russell King - ARM Linux:
>>> On Mon, Mar 14, 2011 at 09:37:51PM +0900, KyongHo Cho wrote:
>>>> I have also noticed that dma_map_single/page/sg() can map physical
>>>> memory into an arbitrary device address region.
>>>> But it is not enough solution for various kinds of IOMMUs.
>>>> As Kukjin Kim addressed, we need to support larger page size than 4KB
>>>> because we can reduce TLB miss when we have larger page size.
>>>>
>>>> Our IOMMU(system mmu) supports all page size of ARM architecture
>>>> including 16MB, 1MB, 64KB and 4KB.
>>>> Since the largest size supported by buddy system of 32-bit architecture is 4MB,
>>>> our system support all page sizes except 16MB.
>>>> We proved that larger page size is helpful for DMA performance
>>>> significantly (more than 10%, approximately).
>>>> Big page size is not a problem for peripheral devices
>>>> because their address space is not suffer from external fragmentation.
>>> 1. dma_map_single() et.al. is used for mapping *system* *RAM* for devices
>>> using whatever is necessary. It must not be used for trying to setup
>>> arbitary other mappings.
>>>
>>> 2. It doesn't matter where the memory for dma_map_single() et.al. comes
>>> from provided the virtual address is a valid system RAM address or
>>> the struct page * is a valid struct page in the memory map (iow, you
>>> can't create this yourself.)
>> You mean that we cannot have arbitrary virtual address mapping for
>> iommu based device?
>
> No. I mean exactly what I said - I'm talking about the DMA API in the
> above two points. The implication is that you cannot create arbitrary
> mappings of non-system RAM with the DMA API.
>
Sorry, but I couldn't understand exactly what you said. Could you explain
it once more?
Does non-system RAM mean reserved memory regions? If not, is it an
arbitrary virtual address space that is neither kernel nor user virtual
address space, i.e. the address space of an IOMMU-based device?


>> actually, we have memory mapping to arbitrary device virtual address
>> space, not kernel virtual address space.
>>
>>> 3. In the case of an IOMMU, the DMA API does not limit you to only using
>>> 4K pages to set up the IOMMU mappings. You can use whatever you like
>>> provided the hardware can cope with it. You can coalesce several
>>> existing entries together provided you track what you're doing and can
>>> undo what's been done when the mapping is no longer required.
>>>
>>> So really there's no reason not to use 64K, 1M and 16M IOMMU entries if
>>> that's the size of buffer which has been passed to the DMA API.
>>>
>
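
Point 3 in the quoted reply (using 64K, 1M and 16M IOMMU entries when the
buffer allows it) could be sketched roughly as below. This is only a
userspace illustration, not code from the patch: pick_page_size() and
count_entries() are hypothetical names, and the sketch assumes that the
device address, the physical address and the length are all 4KB aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Page sizes named in the thread, largest first. */
static const uint32_t sysmmu_page_sizes[] = {
	16u << 20,	/* 16MB */
	 1u << 20,	/* 1MB */
	64u << 10,	/* 64KB */
	 4u << 10,	/* 4KB */
};

/*
 * Hypothetical helper: return the largest page size that can map the
 * start of a region. A size qualifies only if both addresses are aligned
 * to it and at least that many bytes remain.
 */
uint32_t pick_page_size(uint32_t iova, uint32_t phys, uint32_t len)
{
	size_t i;

	/* Assumption of this sketch: everything is 4KB aligned. */
	assert(len != 0 && ((iova | phys | len) & 0xfffu) == 0);

	for (i = 0; i < sizeof(sysmmu_page_sizes) / sizeof(sysmmu_page_sizes[0]); i++) {
		uint32_t sz = sysmmu_page_sizes[i];

		if (((iova | phys) & (sz - 1)) == 0 && len >= sz)
			return sz;
	}
	return 0;	/* not reached for 4KB-aligned input */
}

/* Count how many IOMMU entries a physically contiguous buffer would need. */
unsigned int count_entries(uint32_t iova, uint32_t phys, uint32_t len)
{
	unsigned int n = 0;

	while (len) {
		uint32_t sz = pick_page_size(iova, phys, len);

		iova += sz;
		phys += sz;
		len -= sz;
		n++;
	}
	return n;
}
```

A real driver would also have to record which entry sizes it used so that
the mapping can be undone later, as the quoted text points out.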

2011-03-15 10:13:13

by Russell King - ARM Linux

[permalink] [raw]
Subject: Re: [PATCH 3/7] ARM: Samsung: update/rewrite Samsung SYSMMU (IOMMU) driver

On Tue, Mar 15, 2011 at 06:34:42PM +0900, daeinki wrote:
> Russell King - ARM Linux wrote:
>> On Tue, Mar 15, 2011 at 10:45:50AM +0900, InKi Dae wrote:
>>> 2011/3/14 Russell King - ARM Linux <[email protected]>:
>>>> On Mon, Mar 14, 2011 at 09:37:51PM +0900, KyongHo Cho wrote:
>>>>> I have also noticed that dma_map_single/page/sg() can map physical
>>>>> memory into an arbitrary device address region.
>>>>> But that is not a sufficient solution for the various kinds of IOMMUs.
>>>>> As Kukjin Kim mentioned, we need to support page sizes larger than 4KB
>>>>> because larger pages reduce TLB misses.
>>>>>
>>>>> Our IOMMU (System MMU) supports all page sizes of the ARM architecture:
>>>>> 16MB, 1MB, 64KB and 4KB.
>>>>> Since the largest block the buddy allocator can provide on 32-bit architectures is 4MB,
>>>>> our system supports all page sizes except 16MB.
>>>>> We have measured that larger page sizes improve DMA performance
>>>>> significantly (by more than 10%, approximately).
>>>>> Large page sizes are not a problem for peripheral devices
>>>>> because their address space does not suffer from external fragmentation.
>>>> 1. dma_map_single() et al. is used for mapping *system* *RAM* for devices
>>>> using whatever is necessary. It must not be used for trying to set up
>>>> arbitrary other mappings.
>>>>
>>>> 2. It doesn't matter where the memory for dma_map_single() et al. comes
>>>> from provided the virtual address is a valid system RAM address or
>>>> the struct page * is a valid struct page in the memory map (iow, you
>>>> can't create this yourself.)
>>> You mean that we cannot have arbitrary virtual address mapping for
>>> iommu based device?
>>
>> No. I mean exactly what I said - I'm talking about the DMA API in the
>> above two points. The implication is that you cannot create arbitrary
>> mappings of non-system RAM with the DMA API.
>>
> Sorry, but I couldn't understand exactly what you said. Could you explain
> it once more?
> Does non-system RAM mean reserved memory regions? If not, is it an
> arbitrary virtual address space that is neither kernel nor user virtual
> address space, i.e. the address space of an IOMMU-based device?

For dma_map_single(dev, addr, size, dir), basically:

	for (a = addr; a < addr + size; a += PAGE_SIZE)
		BUG_ON(!virt_addr_valid(a));

For dma_map_page(dev, page, offset, size, dir), 'page' must be something
obtained from one of the page-based kernel allocators (so it refers to a
page in the *existing* lowmem or highmem memory) _and_ you must not
use offset/size to then point at something outside that.

So, if you take something out of the kernel's knowledge of what is memory,
you can't then use the DMA API with it.
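
To illustrate the rule, here is a small userspace model of the page-by-page
check shown above. It is only a sketch and not kernel code: struct ram_bank,
addr_is_system_ram() and range_is_mappable() are hypothetical names, the
4096-byte page size is an assumption, and the bank table stands in for the
kernel's own knowledge of what is memory (which virt_addr_valid() consults
in the real thing).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size for this model */

/* Hypothetical description of one bank of RAM the kernel manages. */
struct ram_bank {
	uint32_t start;	/* first byte of the bank */
	uint32_t size;	/* length in bytes */
};

/* True if 'addr' falls inside one of the banks. */
bool addr_is_system_ram(const struct ram_bank *banks, size_t nbanks,
			uint32_t addr)
{
	size_t i;

	for (i = 0; i < nbanks; i++)
		if (addr >= banks[i].start &&
		    addr - banks[i].start < banks[i].size)
			return true;
	return false;
}

/*
 * Model of the loop above: step through the buffer one page at a time
 * and reject it as soon as one address is not in a known bank. Instead
 * of BUG_ON(), the model simply returns false.
 */
bool range_is_mappable(const struct ram_bank *banks, size_t nbanks,
		       uint32_t addr, uint32_t size)
{
	uint64_t off;	/* 64-bit so the offset cannot wrap around */

	assert(banks != NULL || nbanks == 0);

	for (off = 0; off < size; off += MODEL_PAGE_SIZE)
		if (!addr_is_system_ram(banks, nbanks, addr + (uint32_t)off))
			return false;
	return true;
}
```

With a single bank of 256MB at 0x40000000, a buffer that starts inside the
bank but runs past its end is rejected, just like a buffer in a region that
was carved out and never handed to the kernel.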