2005-12-15 19:09:00

by Laurent Pinchart

Subject: VM_RESERVED and PG_reserved : Allocating memory for video buffers

Hi everybody,

I'm writing a Linux driver for a USB Video Class compliant USB device. I
managed to understand pretty much everything on my own until the point where I
had to allocate video buffers.

I read other drivers to understand how they proceed. Most of them use vmalloc
with SetPageReserved and remap_pfn_range to map the memory to user space. I
thought I understood that, until I noticed that vm_insert_page was added
in 2.6.15. I wasn't sure how to prevent pages from being swapped out, so I
read the excellent "Understanding the Linux Virtual Memory Manager", but I'm
still not sure I understand everything. This is where I ask for your help.

I need to allocate big buffers, so vmalloc is the way to go, as I don't need
contiguous memory. I need to map those buffers to user space, and I
understand that vm_insert_page will do the job nicely. My fears come from
pages being swapped out. I suppose I need to prevent that, as a page fault in
interrupt context is a Bad Thing(TM). I'm not sure how PG_reserved and
VM_RESERVED interact with each other. Can kernel pages be swapped out if they
are not mapped to user space? Or does kswapd only walk VMAs when it tries to
find pages to swap out? If the latter is true, is it enough to set
VM_RESERVED on the VMA in the mmap handler?

Memory management is quite complex in the Linux kernel, and I definitely need
some help to understand how all the magic is performed :-)

Best regards,

Laurent Pinchart


2005-12-16 01:27:45

by Nick Piggin

Subject: Re: VM_RESERVED and PG_reserved : Allocating memory for video buffers

Laurent Pinchart wrote:
> Hi everybody,
>
> I'm writing a Linux driver for a USB Video Class compliant USB device. I
> managed to understand pretty much everything on my own until the point where I
> had to allocate video buffers.
>
> I read other drivers to understand how they proceed. Most of them use vmalloc
> with SetPageReserved and remap_pfn_range to map the memory to user space. I
> thought I understood that, until I noticed that vm_insert_page was added
> in 2.6.15. I wasn't sure how to prevent pages from being swapped out, so I
> read the excellent "Understanding the Linux Virtual Memory Manager", but I'm
> still not sure I understand everything. This is where I ask for your help.
>
> I need to allocate big buffers, so vmalloc is the way to go, as I don't need
> contiguous memory. I need to map those buffers to user space, and I
> understand that vm_insert_page will do the job nicely. My fears come from
> pages being swapped out. I suppose I need to prevent that, as a page fault in
> interrupt context is a Bad Thing(TM). I'm not sure how PG_reserved and
> VM_RESERVED interact with each other. Can kernel pages be swapped out if they
> are not mapped to user space? Or does kswapd only walk VMAs when it tries to
> find pages to swap out? If the latter is true, is it enough to set
> VM_RESERVED on the VMA in the mmap handler?
>

PG_reserved no longer does anything (except catching bugs in old code).
If you are writing new code, you shouldn't use it. Don't copy rvmalloc,
you should be able to use vmalloc directly.

vm_insert_page is indeed the right interface for mapping these pages
into userspace.

You do not have to worry about pages being swapped out, and you shouldn't
need to set any unusual vma or page flags. kswapd only walks the lru lists,
and it won't even look at any other pages.

Good luck,
Nick

--
SUSE Labs, Novell Inc.

Send instant messages to your online friends http://au.messenger.yahoo.com

2005-12-16 11:32:56

by Laurent Pinchart

Subject: Re: VM_RESERVED and PG_reserved : Allocating memory for video buffers

On Friday 16 December 2005 02:27, Nick Piggin wrote:
> Laurent Pinchart wrote:
> > Hi everybody,
> >
> > I'm writing a Linux driver for a USB Video Class compliant USB device. I
> > managed to understand pretty much everything on my own until the point
> > where I had to allocate video buffers.
> >
> > I read other drivers to understand how they proceed. Most of them use
> > vmalloc with SetPageReserved and remap_pfn_range to map the memory to
> > user space. I thought I understood that, until I noticed that
> > vm_insert_page was added in 2.6.15. I wasn't sure how to prevent
> > pages from being swapped out, so I read the excellent "Understanding the
> > Linux Virtual Memory Manager", but I'm still not sure I understand
> > everything. This is where I ask for your help.
> >
> > I need to allocate big buffers, so vmalloc is the way to go, as I don't
> > need contiguous memory. I need to map those buffers to user space, and I
> > understand that vm_insert_page will do the job nicely. My fears come from
> > pages being swapped out. I suppose I need to prevent that, as a page
> > fault in interrupt context is a Bad Thing(TM). I'm not sure how
> > PG_reserved and VM_RESERVED interact with each other. Can kernel pages be
> > swapped out if they are not mapped to user space? Or does kswapd only
> > walk VMAs when it tries to find pages to swap out? If the latter is true,
> > is it enough to set VM_RESERVED on the VMA in the mmap handler?
>
> PG_reserved no longer does anything (except catching bugs in old code).
> If you are writing new code, you shouldn't use it. Don't copy rvmalloc,
> you should be able to use vmalloc directly.
>
> vm_insert_page is indeed the right interface for mapping these pages
> into userspace.

Ok, that's what I intended to do.

> You do not have to worry about pages being swapped out, and you shouldn't
> need to set any unusual vma or page flags. kswapd only walks the lru lists,
> and it won't even look at any other pages.

I'd still like to understand how things work (I'm one of those programmers who
don't like to code without understanding).

I think I understand how disk buffers or non-shared pages mapped by a regular
file can be reclaimed, but I have trouble with anonymous pages and shared
pages.

First of all, I haven't been able to find a definition of an anonymous page. I
understand it as a page of memory not backed by a file (pages allocated by
vmalloc for instance). If this is wrong, what I'm about to say is probably
very wrong as well.

Are anonymous pages ever added to the LRU active list ? I suppose they are
not, which is why they are not reclaimed.

How does the kernel handle shared pages, (if for instance two processes map a
regular file with MAP_SHARED) ? They can't be reclaimed before all processes
which map them have had their PTEs modified. Is this where reverse mapping
comes into play ?

Finally, how are devices which map anonymous kernel memory to user space
handled ? When a page is inserted in a process VMA using vm_insert_page, it
becomes shared between the kernel and user space. Does the kernel see the
page as a regular device backed page, and put it in the LRU active list ? You
said I shouldn't need to set any unusual VMA or page flags. What's the exact
purpose of VM_RESERVED and VM_IO then ? And when should they be set ?

Hope I'm not bothering you too much with all those questions. I don't feel at
ease when developing kernel code if I don't have at least a basic
understanding of what I'm doing.

Laurent Pinchart

2005-12-16 13:34:50

by Nick Piggin

Subject: Re: VM_RESERVED and PG_reserved : Allocating memory for video buffers

Laurent Pinchart wrote:
> On Friday 16 December 2005 02:27, Nick Piggin wrote:

>>vm_insert_page is indeed the right interface for mapping these pages
>>into userspace.
>
>
> Ok, that's what I intended to do.
>
>
>>You do not have to worry about pages being swapped out, and you shouldn't
>>need to set any unusual vma or page flags. kswapd only walks the lru lists,
>>and it won't even look at any other pages.
>
>
> I'd still like to understand how things work (I'm one of those programmers who
> don't like to code without understanding).
>
> I think I understand how disk buffers or non-shared pages mapped by a regular
> file can be reclaimed, but I have trouble with anonymous pages and shared
> pages.
>
> First of all, I haven't been able to find a definition of an anonymous page. I
> understand it as a page of memory not backed by a file (pages allocated by
> vmalloc for instance). If this is wrong, what I'm about to say is probably
> very wrong as well.
>

Anonymous memory is memory not backed by a file. However the userspace
program will gain access to your vmalloc memory by mmaping a /dev file,
right?

So the mm kind of treats these pages as file pages (not anonymous),
although it doesn't make much difference because it really has very little
to do with them aside from what you see in vm_insert_page.

> Are anonymous pages ever added to the LRU active list ? I suppose they are
> not, which is why they are not reclaimed.
>

They are, see the line

    lru_cache_add_active(page);

in mm/memory.c:do_anonymous_page()

Anonymous memory is basically memory allocated by malloc(), to put it simply.

Reclaiming anonymous memory involves writing it out to swap.

> How does the kernel handle shared pages, (if for instance two processes map a
> regular file with MAP_SHARED) ? They can't be reclaimed before all processes
> which map them have had their PTEs modified. Is this where reverse mapping
> comes into play ?
>

That's right. mm/rmap.c:try_to_unmap()

> Finally, how are devices which map anonymous kernel memory to user space
> handled ? When a page is inserted in a process VMA using vm_insert_page, it
> becomes shared between the kernel and user space. Does the kernel see the
> page as a regular device backed page, and put it in the LRU active list ? You

No, the kernel won't do anything with it after vm_insert_page (which does
not put it on the LRU) until the process unmaps that page, at which time
all the accounting done by vm_insert_page is undone.

> said I shouldn't need to set any unusual VMA or page flags. What's the exact
> purpose of VM_RESERVED and VM_IO then ? And when should they be set ?
>

VM_RESERVED, I think, was used in 2.4 to stop the swapout code from looking
at that vma. I don't think you should ever need to set it in 2.6.

VM_IO should be set on memory areas that can have side effects when
accessed, like memory-mapped IO regions.

> Hope I'm not bothering you too much with all those questions. I don't feel at
> ease when developing kernel code if I don't have at least a basic
> understanding of what I'm doing.
>

That's OK, hope this is of some help.

--
SUSE Labs, Novell Inc.


2005-12-16 14:10:10

by Markus Rechberger

Subject: Re: VM_RESERVED and PG_reserved : Allocating memory for video buffers

On 12/16/05, Nick Piggin <[email protected]> wrote:
> Laurent Pinchart wrote:
> > On Friday 16 December 2005 02:27, Nick Piggin wrote:
>
> >>vm_insert_page is indeed the right interface for mapping these pages
> >>into userspace.
> >
> >
> > Ok, that's what I intended to do.
> >
> >
> >>You do not have to worry about pages being swapped out, and you shouldn't
> >>need to set any unusual vma or page flags. kswapd only walks the lru lists,
> >>and it won't even look at any other pages.
> >
> >
> > I'd still like to understand how things work (I'm one of those programmers who
> > don't like to code without understanding).
> >
> > I think I understand how disk buffers or non-shared pages mapped by a regular
> > file can be reclaimed, but I have trouble with anonymous pages and shared
> > pages.
> >
> > First of all, I haven't been able to find a definition of an anonymous page. I
> > understand it as a page of memory not backed by a file (pages allocated by
> > vmalloc for instance). If this is wrong, what I'm about to say is probably
> > very wrong as well.
> >
>
> Anonymous memory is memory not backed by a file. However the userspace
> program will gain access to your vmalloc memory by mmaping a /dev file,
> right?
>
> So the mm kind of treats these pages as file pages (not anonymous),
> although it doesn't make much difference because it really has very little
> to do with them aside from what you see in vm_insert_page.
>
> > Are anonymous pages ever added to the LRU active list ? I suppose they are
> > not, which is why they are not reclaimed.
> >
>
> They are, see the line
>
>     lru_cache_add_active(page);
>
> in mm/memory.c:do_anonymous_page()
>
> Anonymous memory is basically memory allocated by malloc(), to put it simply.
>
> Reclaiming anonymous memory involves writing it out to swap.
>
> > How does the kernel handle shared pages, (if for instance two processes map a
> > regular file with MAP_SHARED) ? They can't be reclaimed before all processes
> > which map them have had their PTEs modified. Is this where reverse mapping
> > comes into play ?
> >
>
> That's right. mm/rmap.c:try_to_unmap()
>
> > Finally, how are devices which map anonymous kernel memory to user space
> > handled ? When a page is inserted in a process VMA using vm_insert_page, it
> > becomes shared between the kernel and user space. Does the kernel see the
> > page as a regular device backed page, and put it in the LRU active list ? You
>
> No, the kernel won't do anything with it after vm_insert_page (which does
> not put it on the LRU) until the process unmaps that page, at which time
> all the accounting done by vm_insert_page is undone.
>
> > said I shouldn't need to set any unusual VMA or page flags. What's the exact
> > purpose of VM_RESERVED and VM_IO then ? And when should they be set ?
> >
>
> VM_RESERVED, I think, was used in 2.4 to stop the swapout code from looking
> at that vma. I don't think you should ever need to set it in 2.6.
>
> VM_IO should be set on memory areas that can have side effects when
> accessed, like memory-mapped IO regions.
>

This is documented on page 421 of LDD3, in the chapter "Memory Mapping and
DMA":

http://lwn.net/Kernel/LDD3/

VM_IO also prevents the memory from being included in core dumps.

(LDD3 was written for 2.6.10 or so and some parts are already outdated, but
I think it's the most recent edition available.)

> > Hope I'm not bothering you too much with all those questions. I don't feel at
> > ease when developing kernel code if I don't have at least a basic
> > understanding of what I'm doing.
> >
>
> That's OK, hope this is of some help.
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>


--
Markus Rechberger
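
The point made earlier in the thread, that kswapd only walks the LRU lists and
never looks at pages that were not put there, can be illustrated with a small
userspace toy model. This is only a sketch and not kernel code: struct
toy_page, struct toy_lru, toy_lru_add(), toy_reclaim() and toy_demo() are
hypothetical names invented for this illustration, and the model assumes for
simplicity that every page on the list is reclaimable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the kernel's struct page (hypothetical, heavily simplified). */
struct toy_page {
    bool reclaimed;            /* set once the toy scan has "reclaimed" the page */
    struct toy_page *lru_next; /* next page on the toy LRU list, or NULL */
};

/* Toy LRU list: the only data structure the toy reclaim pass ever looks at. */
struct toy_lru {
    struct toy_page *head;
};

/* Make a page visible to reclaim, similar in spirit to lru_cache_add_active(). */
void toy_lru_add(struct toy_lru *lru, struct toy_page *page)
{
    page->lru_next = lru->head;
    lru->head = page;
}

/*
 * Toy reclaim pass. It walks the LRU list and nothing else, so a page that
 * was never added to the list cannot be reached from here at all.
 * Returns the number of pages taken off the list.
 */
size_t toy_reclaim(struct toy_lru *lru)
{
    size_t count = 0;

    while (lru->head != NULL) {
        struct toy_page *page = lru->head;

        lru->head = page->lru_next;
        page->lru_next = NULL;
        page->reclaimed = true;
        count++;
    }
    return count;
}

/*
 * One page is put on the LRU (like an anonymous page), the other is not
 * (like a driver buffer page mapped with vm_insert_page).
 * Returns true if only the LRU page was reclaimed.
 */
bool toy_demo(void)
{
    struct toy_lru lru = { NULL };
    struct toy_page anon_page = { false, NULL };
    struct toy_page driver_page = { false, NULL };

    toy_lru_add(&lru, &anon_page);
    /* driver_page is deliberately never added to the LRU. */

    size_t reclaimed = toy_reclaim(&lru);
    assert(reclaimed == 1);

    return anon_page.reclaimed && !driver_page.reclaimed;
}
```

Calling toy_demo() from a main() shows the driver page untouched by the scan.
The real reclaim code is far more involved (active and inactive lists, reverse
mapping, writeback), so this only models the "not on the LRU, not considered"
part of the discussion.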