2005-04-13 10:43:10

by Rolf Offermanns

[permalink] [raw]
Subject: mmap + dma_alloc_coherent

Hi,

I would like to mmap a kernel buffer, allocated with pci_alloc_consistent()
for DMA, to userspace and came up with the following. Since there seem to be
some (unresolved) issues (see below) with this and I would like to do the
RightThing(TM), I would appreciate your comments about my stuff.

As for the unresolved issues, I found the following LKML threads to be very
helpful in understanding the problem:
1. (DMA API issues)
http://marc.theaimsgroup.com/?l=linux-kernel&m=108757847518687&w=2
2. (can device drivers return non-ram via vm_ops->nopage?)
http://marc.theaimsgroup.com/?l=linux-kernel&m=107978968703503&w=2

What I did (with comments on problematic / not fully clear to me parts):


pci_probe():
my_buffer = pci_alloc_consistent() (size: PAGE_SIZE << 4)

------------------------------------------------------------------------
my_mmap():
vma->vm_ops = &my_vm_ops;
vma->vm_flags |= (VM_RESERVED | VM_IO);
my_vma_open(vma);

my_vm_ops = {
.open = my_vm_open,
.close = my_vm_close,
.nopage = my_vm_nopage,
}

Q: Is VM_IO needed here? I took it from the sg.c driver.
Q: I choosed nopage because remap_page_range does not work on RAM pages.
Correct?
------------------------------------------------------------------------

my_vm_open():
increment vma usage count

my_vm_close():
decrement vma usage count

my_vm_nopage():
offset = (address - vma->vm_start) + (vma->vm_pgoff << PAGE_SHIFT);
pageptr = my_buffer + offset;
page = virt_to_page(pageptr);

/* got it, now increment the count */
get_page(page);

if (type)
{
*type = VM_FAULT_MINOR;
}
return page;

Q: As seen in the threads mentioned above I should not use virt_to_page() on
addresses I got from pci_alloc_consistent/dma_alloc_coherent. What is the
right way to handle this?
Q: get_page() increments the page refcount. Where is the correspondig
put_page() operation?
--------------------------------------------------------------------------

pci_remove():
pci_free_consistent(my_buffer, ...)

--------------------------------------------------------------------------


I first tried the above and failed. Somehow my driver seemed to screw up the
page tables. I noticed a function sg_correct4mmap() in the sg.c driver which
"fixes" refcount handling on pages allocated using __get_free_pages() with
order > 0. After implementing this things improved. Here are the changes:

my_mmap():
if (!mmap_called)
{
correct4mmap(my_buffer, 1);
mmap_called = 1;
}

release():
if (mmap_called)
{
if (vma_usage_count == 0)
{
correct4mmap(my_buffer, 0)
mmap_called = 0;
}
}

Q: Can someone please briefly explain, why (if at all) this is needed?
-----------------------------------------------------------------------------

Somehow the whole thing does not "feel" right so I would really like some
comments on this. I think this could be helpful to other, too.

Quoting Jeff Garzik from one of the older threads:
"My suggestion/request to the VM wizards would be to directly provide mmap
helpers for dma/mmio/pio, that Does The Right Thing. And require their
use in every driver. Don't give driver writers the opportunity to think
about this stuff and/or screw it up."

There are such helper functions on ARM and Russel tried to push them into
the generic DMA API (the last time in June 2004, I think). Can any progress
be expected regarding these helper functions?


Thanks for your time.

-Rolf

--
Rolf Offermanns <[email protected]>
SYSGO AG Tel.: +49-6136-9948-0
Am Pfaffenstein 14 Fax: +49-6136-9948-10
55270 Klein-Winternheim http://www.sysgo.com


2005-04-13 11:19:47

by Russell King

[permalink] [raw]
Subject: Re: mmap + dma_alloc_coherent

On Wed, Apr 13, 2005 at 12:43:47PM +0200, Rolf Offermanns wrote:
> I would like to mmap a kernel buffer, allocated with pci_alloc_consistent()
> for DMA, to userspace and came up with the following. Since there seem to be
> some (unresolved) issues (see below) with this and I would like to do the
> RightThing(TM), I would appreciate your comments about my stuff.

This has come up before. ARM implements dma_mmap_*() to allow this
to happen, but it never got propagated to the other architectures.

Here's the (untested) x86 version. There may be a problem with
x86 not marking the pages reserved, which is required for
remap_pfn_range() to work.

diff -up -x BitKeeper -x ChangeSet -x SCCS -x _xlk -x '*.orig' -x '*.rej' -r orig/arch/i386/kernel/pci-dma.c linux/arch/i386/kernel/pci-dma.c
--- orig/arch/i386/kernel/pci-dma.c Mon Apr 4 22:52:57 2005
+++ linux/arch/i386/kernel/pci-dma.c Mon Apr 4 22:44:45 2005
@@ -49,7 +49,7 @@ void *dma_alloc_coherent(struct device *
ret = (void *)__get_free_pages(gfp, order);

if (ret != NULL) {
- memset(ret, 0, size);
+ memset(ret, 0, PAGE_ALIGN(size));
*dma_handle = virt_to_phys(ret);
}
return ret;
diff -up -x BitKeeper -x ChangeSet -x SCCS -x _xlk -x '*.orig' -x '*.rej' -r orig/include/asm-i386/dma-mapping.h linux/include/asm-i386/dma-mapping.h
--- orig/include/asm-i386/dma-mapping.h Mon Apr 4 22:54:41 2005
+++ linux/include/asm-i386/dma-mapping.h Mon Apr 4 22:48:11 2005
@@ -16,6 +16,23 @@ void *dma_alloc_coherent(struct device *
void dma_free_coherent(struct device *dev, size_t size,
void *vaddr, dma_addr_t dma_handle);

+static inline int
+dma_mmap_coherent(struct device *dev, struct vm_area_struct *vma,
+ void *vaddr, dma_addr_t handle, size_t size)
+{
+ unsigned long offset = vma->vm_pgoff, usize;
+
+ size = PAGE_ALIGN(size) >> PAGE_SHIFT;
+ usize = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
+
+ if (offset >= size || usize > (size - offset))
+ return -ENXIO;
+
+ return remap_pfn_range(vma, vma->vm_start,
+ (__pa(vaddr) >> PAGE_SHIFT) + offset,
+ usize << PAGE_SHIFT, vma->vm_page_prot);
+}
+
static inline dma_addr_t
dma_map_single(struct device *dev, void *ptr, size_t size,
enum dma_data_direction direction)



--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core

2005-04-13 11:51:15

by Rolf Offermanns

[permalink] [raw]
Subject: Re: mmap + dma_alloc_coherent

On Wednesday 13 April 2005 13:19, Russell King wrote:
> This has come up before. ARM implements dma_mmap_*() to allow this
> to happen, but it never got propagated to the other architectures.
I know, this is why I referenced the other LKML threads. What keeps these
functions from being propagated to the other archs? Are there still
unresolved issues? (x86 not marking RAM pages reserved would be one I
assume)?
>
> Here's the (untested) x86 version. There may be a problem with
> x86 not marking the pages reserved, which is required for
> remap_pfn_range() to work.

So the fact that remap_pfn_range() does not work on pages allocated with
__get_free_pages() is an x86-only issue? Or is it by design?

-Rolf
--
Rolf Offermanns <[email protected]>
SYSGO AG Tel.: +49-6136-9948-0
Am Pfaffenstein 14 Fax: +49-6136-9948-10
55270 Klein-Winternheim http://www.sysgo.com