Hi,
I would like to mmap a kernel buffer, allocated with pci_alloc_consistent()
for DMA, to userspace and came up with the following. Since there seem to be
some (unresolved) issues (see below) with this and I would like to do the
RightThing(TM), I would appreciate your comments about my stuff.
As for the unresolved issues, I found the following LKML threads to be very
helpful in understanding the problem:
1. (DMA API issues)
http://marc.theaimsgroup.com/?l=linux-kernel&m=108757847518687&w=2
2. (can device drivers return non-ram via vm_ops->nopage?)
http://marc.theaimsgroup.com/?l=linux-kernel&m=107978968703503&w=2
What I did (with comments on the parts that are problematic or not fully clear to me):
pci_probe():
my_buffer = pci_alloc_consistent() (size: PAGE_SIZE << 4)
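(In full that is roughly the following; "pdev" is the struct pci_dev from probe and
"my_dma_handle" is just my name for the dma_addr_t I keep next to the buffer:)

        my_buffer = pci_alloc_consistent(pdev, PAGE_SIZE << 4, &my_dma_handle);
        if (!my_buffer)
                return -ENOMEM;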
------------------------------------------------------------------------
my_mmap():
vma->vm_ops = &my_vm_ops;
vma->vm_flags |= (VM_RESERVED | VM_IO);
my_vm_open(vma);
my_vm_ops = {
.open = my_vm_open,
.close = my_vm_close,
.nopage = my_vm_nopage,
};
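Spelled out, the handler is roughly the following (a sketch, with the obvious
boilerplate filled in by hand):

static struct vm_operations_struct my_vm_ops = {
        .open   = my_vm_open,
        .close  = my_vm_close,
        .nopage = my_vm_nopage,
};

static int my_mmap(struct file *filp, struct vm_area_struct *vma)
{
        vma->vm_ops = &my_vm_ops;
        vma->vm_flags |= (VM_RESERVED | VM_IO);
        /* ->open() is not called for the initial mmap(), so do it by hand */
        my_vm_open(vma);
        return 0;
}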
Q: Is VM_IO needed here? I took it from the sg.c driver.
Q: I chose nopage because remap_page_range() does not work on RAM pages. Correct?
------------------------------------------------------------------------
my_vm_open():
increment vma usage count
my_vm_close():
decrement vma usage count
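(Nothing fancy here; vma_usage_count is the counter I also check in release()
further down. Sketch, no locking shown:)

static int vma_usage_count;

static void my_vm_open(struct vm_area_struct *vma)
{
        vma_usage_count++;
}

static void my_vm_close(struct vm_area_struct *vma)
{
        vma_usage_count--;
}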
my_vm_nopage():
offset = (address - vma->vm_start) + (vma->vm_pgoff << PAGE_SHIFT);
pageptr = my_buffer + offset;
page = virt_to_page(pageptr);
/* got it, now increment the count */
get_page(page);
if (type)
{
*type = VM_FAULT_MINOR;
}
return page;
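Spelled out with the full 2.6 nopage prototype, that is roughly (again just a
sketch):

static struct page *my_vm_nopage(struct vm_area_struct *vma,
                                 unsigned long address, int *type)
{
        unsigned long offset;
        struct page *page;

        offset = (address - vma->vm_start) + (vma->vm_pgoff << PAGE_SHIFT);
        page = virt_to_page(my_buffer + offset);

        /* got it, now increment the count */
        get_page(page);
        if (type)
                *type = VM_FAULT_MINOR;
        return page;
}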
Q: As seen in the threads mentioned above, I should not use virt_to_page() on
addresses I got from pci_alloc_consistent()/dma_alloc_coherent(). What is the
right way to handle this?
Q: get_page() increments the page refcount. Where is the corresponding
put_page() operation?
--------------------------------------------------------------------------
pci_remove():
pci_free_consistent(my_buffer, ...)
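(Using the names from above, the full call is roughly:)

        pci_free_consistent(pdev, PAGE_SIZE << 4, my_buffer, my_dma_handle);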
--------------------------------------------------------------------------
I first tried the above and failed. Somehow my driver seemed to screw up the
page tables. I noticed a function sg_correct4mmap() in the sg.c driver which
"fixes" the refcount handling on pages allocated using __get_free_pages() with
order > 0. After implementing this, things improved. Here are the changes:
my_mmap():
    if (!mmap_called) {
        correct4mmap(my_buffer, 1);
        mmap_called = 1;
    }

release():
    if (mmap_called) {
        if (vma_usage_count == 0) {
            correct4mmap(my_buffer, 0);
            mmap_called = 0;
        }
    }
Q: Can someone please briefly explain why (if at all) this is needed?
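For reference, here is roughly what my correct4mmap() boils down to (a
simplified variant of what sg.c does; the details may well differ from the
real sg code):

static void correct4mmap(void *buf, int startfinish)
{
        struct page *page;
        int i;

        /* page 0 already carries the reference from __get_free_pages();
         * the remaining pages of the order-4 block start out with a zero
         * refcount, which is what seems to upset the VM */
        for (i = 1; i < (1 << 4); i++) {
                page = virt_to_page(buf + i * PAGE_SIZE);
                if (startfinish)
                        get_page(page);         /* before the first mmap */
                else if (page_count(page) > 0)
                        __put_page(page);       /* raw decrement, no freeing */
        }
}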
-----------------------------------------------------------------------------
Somehow the whole thing does not "feel" right, so I would really appreciate some
comments on this. I think this could be helpful to others, too.
Quoting Jeff Garzik from one of the older threads:
"My suggestion/request to the VM wizards would be to directly provide mmap
helpers for dma/mmio/pio, that Does The Right Thing. And require their
use in every driver. Don't give driver writers the opportunity to think
about this stuff and/or screw it up."
There are such helper functions on ARM, and Russell tried to push them into
the generic DMA API (the last time in June 2004, I think). Can any progress
be expected regarding these helper functions?
Thanks for your time.
-Rolf
--
Rolf Offermanns <[email protected]>
SYSGO AG Tel.: +49-6136-9948-0
Am Pfaffenstein 14 Fax: +49-6136-9948-10
55270 Klein-Winternheim http://www.sysgo.com
On Wed, Apr 13, 2005 at 12:43:47PM +0200, Rolf Offermanns wrote:
> I would like to mmap a kernel buffer, allocated with pci_alloc_consistent()
> for DMA, to userspace and came up with the following. Since there seem to be
> some (unresolved) issues (see below) with this and I would like to do the
> RightThing(TM), I would appreciate your comments about my stuff.
This has come up before. ARM implements dma_mmap_*() to allow this
to happen, but it never got propagated to the other architectures.
Here's the (untested) x86 version. There may be a problem with
x86 not marking the pages reserved, which is required for
remap_pfn_range() to work.
diff -up -x BitKeeper -x ChangeSet -x SCCS -x _xlk -x '*.orig' -x '*.rej' -r orig/arch/i386/kernel/pci-dma.c linux/arch/i386/kernel/pci-dma.c
--- orig/arch/i386/kernel/pci-dma.c Mon Apr 4 22:52:57 2005
+++ linux/arch/i386/kernel/pci-dma.c Mon Apr 4 22:44:45 2005
@@ -49,7 +49,7 @@ void *dma_alloc_coherent(struct device *
 	ret = (void *)__get_free_pages(gfp, order);
 
 	if (ret != NULL) {
-		memset(ret, 0, size);
+		memset(ret, 0, PAGE_ALIGN(size));
 		*dma_handle = virt_to_phys(ret);
 	}
 	return ret;
diff -up -x BitKeeper -x ChangeSet -x SCCS -x _xlk -x '*.orig' -x '*.rej' -r orig/include/asm-i386/dma-mapping.h linux/include/asm-i386/dma-mapping.h
--- orig/include/asm-i386/dma-mapping.h Mon Apr 4 22:54:41 2005
+++ linux/include/asm-i386/dma-mapping.h Mon Apr 4 22:48:11 2005
@@ -16,6 +16,23 @@ void *dma_alloc_coherent(struct device *
 void dma_free_coherent(struct device *dev, size_t size,
 			 void *vaddr, dma_addr_t dma_handle);
 
+static inline int
+dma_mmap_coherent(struct device *dev, struct vm_area_struct *vma,
+		  void *vaddr, dma_addr_t handle, size_t size)
+{
+	unsigned long offset = vma->vm_pgoff, usize;
+
+	size = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	usize = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
+
+	if (offset >= size || usize > (size - offset))
+		return -ENXIO;
+
+	return remap_pfn_range(vma, vma->vm_start,
+			       (__pa(vaddr) >> PAGE_SHIFT) + offset,
+			       usize << PAGE_SHIFT, vma->vm_page_prot);
+}
+
 static inline dma_addr_t
 dma_map_single(struct device *dev, void *ptr, size_t size,
 	       enum dma_data_direction direction)
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core
On Wednesday 13 April 2005 13:19, Russell King wrote:
> This has come up before. ARM implements dma_mmap_*() to allow this
> to happen, but it never got propagated to the other architectures.
I know; this is why I referenced the other LKML threads. What keeps these
functions from being propagated to the other archs? Are there still
unresolved issues (x86 not marking RAM pages reserved would be one, I
assume)?
>
> Here's the (untested) x86 version. There may be a problem with
> x86 not marking the pages reserved, which is required for
> remap_pfn_range() to work.
So the fact that remap_pfn_range() does not work on pages allocated with
__get_free_pages() is an x86-only issue? Or is it by design?
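Just to make sure I understand the intended use: with dma_mmap_coherent() my
mmap() handler would shrink to roughly the following, correct? (Untested;
my_dma_handle is simply the dma_addr_t I got back from the allocation.)

static int my_mmap(struct file *filp, struct vm_area_struct *vma)
{
        return dma_mmap_coherent(&pdev->dev, vma, my_buffer,
                                 my_dma_handle, PAGE_SIZE << 4);
}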
-Rolf
--
Rolf Offermanns <[email protected]>
SYSGO AG Tel.: +49-6136-9948-0
Am Pfaffenstein 14 Fax: +49-6136-9948-10
55270 Klein-Winternheim http://www.sysgo.com