2011-05-18 22:14:42

by Leon Woestenberg

[permalink] [raw]
Subject: mmap() implementation for pci_alloc_consistent() memory?

Hello,

I cannot get my driver's mmap() to work. I allocate 64 KiB ringbuffer
using pci_alloc_consistent(), then implement mmap() to allow programs
to map that memory into their user space.

My driver writes 0xDEADBEEF into the first 32-bit word of the memory
block. When I dump this word from my mmap.c program, it reads 0. It
seems a zero-page got mapped rather than the buffer.

This is the code, Ieft out all error checking but inserted comments to
show what I have verified.

int main(void)
{
int fd = open("/device_node", O_RDWR | O_SYNC);
uint32_t *addr = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
uint32_t data = *addr;
printf("address 0x%p reads data 0x%08x\n", addr32, (unsigned int)data);
munmap(addr, 4096);
close(fd);
}


void ringbuffer_vma_open(struct vm_area_struct *vma)
{
}

void ringbuffer_vma_close(struct vm_area_struct *vma)
{
}

int ringbuffer_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
{
/* the buffer allocated with pci_alloc_consistent() */
void *vaddr = ringbuffer_virt;
int ret;

/* find the struct page that describes vaddr, the buffer
* allocated with pci_alloc_consistent() */
struct page *page = virt_to_page(lro_char->engine->ringbuffer_virt);
vmf->page = page;

/*** I have verified that vaddr, page, and the pfn correspond
with vaddr = pci_alloc_consistent() ***/
ret = vm_insert_pfn(vma, address, page_to_pfn(page));
return ret;
}

static const struct vm_operations_struct ringbuffer_vm_ops = {
.fault = ringbuffer_vma_fault,
};

static int ringbuffer_mmap(struct file *file, struct vm_area_struct *vma)
{
<...extract private data...>

vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

vma->vm_flags |= VM_RESERVED | VM_MIXEDMAP;
vma->vm_private_data = file->private_data;
vma->vm_ops = &ringbuffer_vm_ops;
ringbuffer_vma_open(vma);
return 0;
}

What did I miss?

Regards,
--
Leon


2011-05-19 01:04:40

by Leon Woestenberg

[permalink] [raw]
Subject: Re: mmap() implementation for pci_alloc_consistent() memory?

Hello,

On Thu, May 19, 2011 at 12:14 AM, Leon Woestenberg
<[email protected]> wrote:
>
> int ringbuffer_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> {
> ? ? ? ?/* the buffer allocated with pci_alloc_consistent() */
> ? ? ? ?void *vaddr = ringbuffer_virt;
> ? ? ? ?int ret;
>
> ? ? ? ?/* find the struct page that describes vaddr, the buffer
> ? ? ? ? * allocated with pci_alloc_consistent() */
> ? ? ? ?struct page *page = virt_to_page(lro_char->engine->ringbuffer_virt);
> ? ? ? ?vmf->page = page;
>
> ? ? ? ?/*** I have verified that vaddr, page, and the pfn correspond
> with vaddr = pci_alloc_consistent() ***/
> ? ? ? ?ret = vm_insert_pfn(vma, address, page_to_pfn(page));
> ? ? ? ?return ret;
> }
>

Some further debugging insights:

I found that pfn_valid is 0 on page_to_pfn(page). Isn't
pci_alloc_consistent() memory backed by a real struct page?

I found that when I use the allocation/mapping below instead of
pci_alloc_consistent(), the fault handler does the mapping correctly.

vaddr = __get_free_pages(GFP_KERNEL, 4);
busaddr = dma_map_single(lro->pci_dev, vaddr,
psize, dir_to_dev? DMA_TO_DEVICE: DMA_FROM_DEVICE);

Still no clue why the mmap fails on pci_alloc_consistent() memory.

Regards,
--
Leon

2011-05-19 14:59:39

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: mmap() implementation for pci_alloc_consistent() memory?

On Thu, May 19, 2011 at 12:14:40AM +0200, Leon Woestenberg wrote:
> Hello,
>
> I cannot get my driver's mmap() to work. I allocate 64 KiB ringbuffer
> using pci_alloc_consistent(), then implement mmap() to allow programs
> to map that memory into their user space.
>
> My driver writes 0xDEADBEEF into the first 32-bit word of the memory
> block. When I dump this word from my mmap.c program, it reads 0. It
> seems a zero-page got mapped rather than the buffer.
>
> This is the code, Ieft out all error checking but inserted comments to
> show what I have verified.
>
> int main(void)
> {
> int fd = open("/device_node", O_RDWR | O_SYNC);
> uint32_t *addr = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
> uint32_t data = *addr;
> printf("address 0x%p reads data 0x%08x\n", addr32, (unsigned int)data);
> munmap(addr, 4096);
> close(fd);
> }
>
>
> void ringbuffer_vma_open(struct vm_area_struct *vma)
> {
> }
>
> void ringbuffer_vma_close(struct vm_area_struct *vma)
> {
> }
>
> int ringbuffer_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> {
> /* the buffer allocated with pci_alloc_consistent() */
> void *vaddr = ringbuffer_virt;
> int ret;
>
> /* find the struct page that describes vaddr, the buffer
> * allocated with pci_alloc_consistent() */
> struct page *page = virt_to_page(lro_char->engine->ringbuffer_virt);
> vmf->page = page;
>
> /*** I have verified that vaddr, page, and the pfn correspond
> with vaddr = pci_alloc_consistent() ***/
> ret = vm_insert_pfn(vma, address, page_to_pfn(page));

address is the vmf->virtual_address?

And is the page_to_pfn(page) value correct? As in:

int pfn = page_to_pfn(page);

WARN(pfn << PAGE_SIZE != vaddr,"Something fishy.");

Hm, I think I might have misled you now that I look at that WARN.

The pfn to be supplied has to be physical page frame number. Which in
this case should be your bus addr shifted by PAGE_SIZE. Duh! Try that
value.

I think a better example might be the 'hpet_mmap' code as it is simpler
and it also adds the VM_IO flag.

> return ret;
> }
>
> static const struct vm_operations_struct ringbuffer_vm_ops = {
> .fault = ringbuffer_vma_fault,
> };
>
> static int ringbuffer_mmap(struct file *file, struct vm_area_struct *vma)
> {
> <...extract private data...>
>
> vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
>
> vma->vm_flags |= VM_RESERVED | VM_MIXEDMAP;
> vma->vm_private_data = file->private_data;
> vma->vm_ops = &ringbuffer_vm_ops;
> ringbuffer_vma_open(vma);
> return 0;
> }
>
> What did I miss?

I gave you the wrong data :-(

2011-05-19 15:56:15

by Clemens Ladisch

[permalink] [raw]
Subject: Re: mmap() implementation for pci_alloc_consistent() memory?

Konrad Rzeszutek Wilk wrote:
> On Thu, May 19, 2011 at 12:14:40AM +0200, Leon Woestenberg wrote:
> > I cannot get my driver's mmap() to work. I allocate 64 KiB ringbuffer
> > using pci_alloc_consistent(), then implement mmap() to allow programs
> > to map that memory into their user space.
> > ...
> > int ringbuffer_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> > {
> > /* the buffer allocated with pci_alloc_consistent() */
> > void *vaddr = ringbuffer_virt;
> > int ret;
> >
> > /* find the struct page that describes vaddr, the buffer
> > * allocated with pci_alloc_consistent() */
> > struct page *page = virt_to_page(lro_char->engine->ringbuffer_virt);
> > vmf->page = page;
> >
> > /*** I have verified that vaddr, page, and the pfn correspond with vaddr = pci_alloc_consistent() ***/
> > ret = vm_insert_pfn(vma, address, page_to_pfn(page));
>
> address is the vmf->virtual_address?
>
> And is the page_to_pfn(page) value correct? As in:
>
> int pfn = page_to_pfn(page);
>
> WARN(pfn << PAGE_SIZE != vaddr,"Something fishy.");
>
> Hm, I think I might have misled you now that I look at that WARN.
>
> The pfn to be supplied has to be physical page frame number. Which in
> this case should be your bus addr shifted by PAGE_SIZE. Duh! Try that
> value.

There are wildly different implementations of pci_alloc_consistent
(actually dma_alloc_coherent) that can return somewhat different
virtual and/or physical addresses.

> I think a better example might be the 'hpet_mmap' code

Which is x86 and ia64 only.

> > static int ringbuffer_mmap(struct file *file, struct vm_area_struct *vma)
> > {
> > vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

So is this an architecture without coherent caches?
Or would you want to use pgprot_dmacoherent, if available?


I recently looked into this problem, and ended up with the code below.
I then decided that streaming DMA mappings might be a better idea.


Regards,
Clemens


/* returns the struct page from the result of dma_alloc_coherent() */
struct page *dma_coherent_page(struct device *device,
void *address, dma_addr_t bus)
{
#if defined(CONFIG_ALPHA) || \
defined(CONFIG_CRIS) || \
defined(CONFIG_IA64) || \
defined(CONFIG_MIPS) || \
(defined(CONFIG_PPC) && !defined(CONFIG_NOT_COHERENT_CACHE)) || \
defined(CONFIG_SPARC64) || \
defined(CONFIG_TILE) || \
defined(CONFIG_UNICORE32) || \
defined(CONFIG_X86)
#ifdef CONFIG_MIPS
if (!plat_device_is_coherent(device))
address = CAC_ADDR(address);
#endif
return virt_to_page(address);
#elif defined(CONFIG_ARM)
return pfn_to_page(dma_to_pfn(device, bus));
#elif defined(CONFIG_FRV) || \
defined(CONFIG_MN10300)
#ifdef CONFIG_MN10300
if (WARN(!IS_ALIGNED(bus, PAGE_SIZE)), "PCI SRAM allocator is broken\n")
return NULL;
#endif
return virt_to_page(bus_to_virt(bus));
#elif defined(CONFIG_M68K) || \
defined(CONFIG_SPARC32)
return virt_to_page(phys_to_virt(bus));
#elif defined(CONFIG_PARISC)
return virt_to_page(__va(bus));
#elif defined(CONFIG_SUPERH)
return pfn_to_page(bus >> PAGE_SHIFT);
#elif defined(CONFIG_MICROBLAZE) || \
(defined(CONFIG_PPC) && defined(CONFIG_NOT_COHERENT_CACHE))
unsigned long vaddr = (unsigned long)address;
pgd_t *pgd = pgd_offset_k(vaddr);
pud_t *pud = pud_offset(pgd, vaddr);
pmd_t *pmd = pmd_offset(pud, vaddr);
pte_t *pte = pte_offset_kernel(pmd, vaddr);
if (!pte_none(*pte) && pte_present(*pte)) {
unsigned long pfn = pte_pfn(*pte);
if (pfn_valid(pfn))
return pfn_to_page(pfn);
}
return NULL;
#elif defined(CONFIG_XTENSA)
#error non-cacheable remapping not implemented
#else
#error unknown architecture
#endif
}

2011-05-19 22:10:59

by Leon Woestenberg

[permalink] [raw]
Subject: Re: mmap() implementation for pci_alloc_consistent() memory?

Hello Clemens, Konrad, others,

On Thu, May 19, 2011 at 5:58 PM, Clemens Ladisch <[email protected]> wrote:
> Konrad Rzeszutek Wilk wrote:
>> On Thu, May 19, 2011 at 12:14:40AM +0200, Leon Woestenberg wrote:
>> > I cannot get my driver's mmap() to work. I allocate 64 KiB ringbuffer
>> > using pci_alloc_consistent(), then implement mmap() to allow programs
>> > to map that memory into their user space.
>> > ...
>> > int ringbuffer_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
>> > {
>> > ? ? ? ? /* the buffer allocated with pci_alloc_consistent() */
>> > ? ? void *vaddr = ringbuffer_virt;
>> > ? ? int ret;
>> >
>> > ? ? /* find the struct page that describes vaddr, the buffer
>> > ? ? ?* allocated with pci_alloc_consistent() */
>> > ? ? struct page *page = virt_to_page(lro_char->engine->ringbuffer_virt);
>> > ? ? vmf->page = page;
>> >
>> > ? ? ? ? /*** I have verified that vaddr, page, and the pfn correspond with vaddr = pci_alloc_consistent() ***/
>> > ? ? ret = vm_insert_pfn(vma, address, page_to_pfn(page));
>>
>> address is the vmf->virtual_address?
>>
yes, I missed that line when removing some noise:

unsigned long address = (unsigned long)vmf->virtual_address;

>> And is the page_to_pfn(page) value correct? As in:
>>
>> ? int pfn = page_to_pfn(page);
>>
>> ? WARN(pfn << PAGE_SIZE != vaddr,"Something fishy.");
>>
Yes, this holds true.

I also verified that the pfn corresponds with the pfn of the virtual
address returned by pci_alloc_consistent().

>> Hm, I think I might have misled you now that I look at that WARN.
>>
>> The pfn to be supplied has to be physical page frame number. Which in
>> this case should be your bus addr shifted by PAGE_SIZE. Duh! Try that
>> value.
>
The physical address? Why? (Just learning here, this is no objection
to your suggestion.)

Also, the bus address is not the physical address, or not in general.
For example on IOMMU systems this certainly doesn't hold true.

So how can I reliably find the out the physical memory address of
pci_alloc_consistent() memory?

> There are wildly different implementations of pci_alloc_consistent
> (actually dma_alloc_coherent) that can return somewhat different
> virtual and/or physical addresses.
>
>> I think a better example might be the 'hpet_mmap' code
>
> Which is x86 and ia64 only.
>
>> > static int ringbuffer_mmap(struct file *file, struct vm_area_struct *vma)
>> > {
>> > ? ? vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
>
> So is this an architecture without coherent caches?
>
My aim is to have an architecture independent driver.

My goal is to make sure the CPU can poll for data that the PCI(e) bus
master writes into host memory.

> Or would you want to use pgprot_dmacoherent, if available?
>
Hmm, let me check that.

> I recently looked into this problem, and ended up with the code below.
> I then decided that streaming DMA mappings might be a better idea.
>
I got streaming DMA mappings working already. I cannot use them in my
case as streaming mappings are not cache-coherent in general.

Thanks for the code snippet, it seems x86 only though?

Regards,
--
Leon

2011-05-20 06:48:45

by Clemens Ladisch

[permalink] [raw]
Subject: Re: mmap() implementation for pci_alloc_consistent() memory?

Leon Woestenberg wrote:
> On Thu, May 19, 2011 at 5:58 PM, Clemens Ladisch <[email protected]> wrote:
>>> On Thu, May 19, 2011 at 12:14:40AM +0200, Leon Woestenberg wrote:
>>> > vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
>>
>> So is this an architecture without coherent caches?
>
> My aim is to have an architecture independent driver.

Please note that most MMU architectures forbid mapping the same memory
with different attributes, so you must use pgprot_noncached if and only
if dma_alloc_coherent actually uses it. Something like the code below.

And I'm not sure if you have to do some additional cache flushes when
mapping on some architectures.

>> Or would you want to use pgprot_dmacoherent, if available?
>
> Hmm, let me check that.

It's available only on ARM and Unicore32.

There's also dma_mmap_coherent(), which does exactly what you want if
your buffer is physically contiguous, but it's ARM only.
Takashi tried to implement it for other architectures; I don't know
what came of it.


Regards,
Clemens


#ifndef pgprot_dmacoherent
/* determine whether coherent mappings need to be uncached */
#if defined(CONFIG_ALPHA) || \
defined(CONFIG_CRIS) || \
defined(CONFIG_IA64) || \
(defined(CONFIG_MIPS) && defined(CONFIG_DMA_COHERENT)) || \
(defined(CONFIG_PPC) && !defined(CONFIG_NOT_COHERENT_CACHE)) || \
defined(CONFIG_SPARC64) || \
defined(CONFIG_X86)
#define ARCH_HAS_DMA_COHERENT_CACHE
#endif
#endif

...
#ifdef pgprot_dmacoherent
vma->vm_page_prot = pgprot_dmacoherent(vma->vm_page_prot);
#elif !defined(ARCH_HAS_DMA_COHERENT_CACHE)
#ifdef CONFIG_MIPS
if (!plat_device_is_coherent(device))
#endif
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
#endif

2011-05-20 08:17:10

by Takashi Iwai

[permalink] [raw]
Subject: Re: mmap() implementation for pci_alloc_consistent() memory?

At Fri, 20 May 2011 08:51:03 +0200,
Clemens Ladisch wrote:
>
> Leon Woestenberg wrote:
> > On Thu, May 19, 2011 at 5:58 PM, Clemens Ladisch <[email protected]> wrote:
> >>> On Thu, May 19, 2011 at 12:14:40AM +0200, Leon Woestenberg wrote:
> >>> > vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> >>
> >> So is this an architecture without coherent caches?
> >
> > My aim is to have an architecture independent driver.
>
> Please note that most MMU architectures forbid mapping the same memory
> with different attributes, so you must use pgprot_noncached if and only
> if dma_alloc_coherent actually uses it. Something like the code below.
>
> And I'm not sure if you have to do some additional cache flushes when
> mapping on some architectures.
>
> >> Or would you want to use pgprot_dmacoherent, if available?
> >
> > Hmm, let me check that.
>
> It's available only on ARM and Unicore32.
>
> There's also dma_mmap_coherent(), which does exactly what you want if
> your buffer is physically contiguous, but it's ARM only.
> Takashi tried to implement it for other architectures; I don't know
> what came of it.

PPC got this recently (thanks to Ben), but still missing in other
areas.

There was little uncertain issue on MIPS, and it looks difficult to
achieve it on PA-RISC at all. The development was stuck due to lack
of time since then.


Takashi

2011-05-21 11:00:04

by Leon Woestenberg

[permalink] [raw]
Subject: Re: mmap() implementation for pci_alloc_consistent() memory?

Hello Clemens, Takashi,

On Fri, May 20, 2011 at 10:17 AM, Takashi Iwai <[email protected]> wrote:
> At Fri, 20 May 2011 08:51:03 +0200,
> Clemens Ladisch wrote:
>>
>> Leon Woestenberg wrote:
>> > On Thu, May 19, 2011 at 5:58 PM, Clemens Ladisch <[email protected]> wrote:
>> >>> On Thu, May 19, 2011 at 12:14:40AM +0200, Leon Woestenberg wrote:
>> >>> > ? ? vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
>> >>
>> >> So is this an architecture without coherent caches?
>> >
>> > My aim is to have an architecture independent driver.
>>
>> Please note that most MMU architectures forbid mapping the same memory
>> with different attributes, so you must use pgprot_noncached if and only
>> if dma_alloc_coherent actually uses it. ?Something like the code below.
>>
>> And I'm not sure if you have to do some additional cache flushes when
>> mapping on some architectures.
>>
>> >> Or would you want to use pgprot_dmacoherent, if available?
>> >
>> > Hmm, let me check that.
>>
>> It's available only on ARM and Unicore32.
>>
>> There's also dma_mmap_coherent(), which does exactly what you want if
>> your buffer is physically contiguous, but it's ARM only.
>> Takashi tried to implement it for other architectures; I don't know
>> what came of it.
>
> PPC got this recently (thanks to Ben), but still missing in other
> areas.
>
> There was little uncertain issue on MIPS, and it looks difficult to
> achieve it on PA-RISC at all. ?The development was stuck due to lack
> of time since then.
>
Thanks for all the insights, I wasn't aware there were arch-specific
calls that already solved the topic issue.

Having dma_mmap_coherent() there is good for one or two archs, but how
can we built portable drivers if the others arch's are still missing?

I assume this call is thus not officially DMA-API (yet)?

Clemens showed some pretty amazing preprocessor #if(def)s to cater for
the all the different arch's and their mapping/cache-coherency
behaviour, but that's not something I would like to put in a driver.

How would dma_mmap_coherent() look like on x86?


Regards,
--
Leon

2011-05-23 08:28:05

by Clemens Ladisch

[permalink] [raw]
Subject: Re: mmap() implementation for pci_alloc_consistent() memory?

Leon Woestenberg wrote:
> Having dma_mmap_coherent() there is good for one or two archs, but how
> can we built portable drivers if the others arch's are still missing?

Easy: Resolve all issues, implement it for all the other arches, and add
it to the official DMA API.

> How would dma_mmap_coherent() look like on x86?

X86 and some others are always coherent; just use vm_insert_page() or
remap_page_range().


Regards,
Clemens

2011-05-24 14:18:14

by Leon Woestenberg

[permalink] [raw]
Subject: Re: mmap() implementation for pci_alloc_consistent() memory?

Hello Clemens,

On Mon, May 23, 2011 at 10:30 AM, Clemens Ladisch <[email protected]> wrote:
> Leon Woestenberg wrote:
>> Having dma_mmap_coherent() there is good for one or two archs, but how
>> can we built portable drivers if the others arch's are still missing?
>
> Easy: Resolve all issues, implement it for all the other arches, and add
> it to the official DMA API.
>
>> How would dma_mmap_coherent() look like on x86?
>
> X86 and some others are always coherent; just use vm_insert_page() or
> remap_page_range().
>
Hello Clemens,

On Mon, May 23, 2011 at 10:30 AM, Clemens Ladisch <[email protected]> wrote:
> Leon Woestenberg wrote:
>> Having dma_mmap_coherent() there is good for one or two archs, but how
>> can we built portable drivers if the others arch's are still missing?
>
> Easy: Resolve all issues, implement it for all the other arches, and add
> it to the official DMA API.
>
I could send patches, but I would get bashed. Would be a learning
experience though (and a long email thread).

>> How would dma_mmap_coherent() look like on x86?
>
> X86 and some others are always coherent; just use vm_insert_page() or
> remap_page_range().
>

For x86 (my current test system) I tend to go with remap_page_range()
so that the mapping is done before the pages are actually touched.

With that I leave the work-in-progress (dma_mmap_coherent) aside for a
moment, I'll revisit that later on ARM.

However, I still feel I'm treading sandy waters, not familiar enough
with the VM internals.

Does memory allocated with pci_alloc_consistent() need a get_page() in
the driver before I remap_page_range() it?
Does memory allocated with __get_free_pages() need a get_page() in the
driver before I remap_page_range() it?

And how about set_bit(PG_reserved, ...) on those pages? Is that
something of the past?

#if 0
...
<vm_insert_page implementation here, disabled>
...
#else /* remap_pfn_range */
#warning Using remap_pfn_range
static int buffer_mmap(struct file *file, struct vm_area_struct *vma)
{
unsigned long vsize, vsize2;
void *vaddr, *vaddr2;
unsigned long start = vma->vm_start;

* VM_RESERVED: prevent the pages from being swapped out */
vma->vm_flags |= VM_RESERVED;
vma->vm_private_data = file->private_data;

/* allocated using __get_free_pages() or pci_alloc_consistent() */
vaddr = buffer_virt;
vsize = buffer_size;

/* iterate over pages */
vaddr2 = vaddr;
vsize2 = vsize;
while (vsize2 > 0) {
printk(KERN_DEBUG "get_page(page=%p)\n", virt_to_page(vaddr2));
get_page(virt_to_page(vaddr2));
//set_bit(PG_reserved, &(virt_to_page(vaddr2)->flags));
vaddr2 += PAGE_SIZE;
vsize2 -= PAGE_SIZE;
}
remap_pfn_range(vma, vma->vm_start,
page_to_pfn(virt_to_page(vaddr)), vsize, vma->vm_page_prot);
return 0;
}
#endif



Thanks for the explanation,

Regards,
--
Leon