2004-09-28 22:40:52

by Timur Tabi

Subject: get_user_pages() still broken in 2.6

I was hoping that this bug would be fixed in the 2.6 kernels, but
apparently it hasn't been.

Function get_user_pages() is supposed to lock user memory. However,
under extreme memory constraints, the kernel will swap out the "locked"
memory.

I have a test app which does this:

1) Calls our driver, which issues a get_user_pages() call for one page.
2) Calls our driver again to get the physical address of that page (the
driver uses pgd/pmd/pte_offset).
3) Tries to allocate 1GB of memory (this system has 1GB of physical RAM).
4) Tries to get the physical address again.

In step 4, the physical address is usually zero, which means either
pgd_offset or pmd_offset failed. This indicates the page was swapped out.

I don't understand how this bug can continue to exist after all this
time. get_user_pages() is supposed to lock the memory, because drivers
use it for DMA'ing directly into user memory.

--
Timur Tabi
Staff Software Engineer
[email protected]


2004-09-28 23:03:35

by Christoph Hellwig

Subject: Re: get_user_pages() still broken in 2.6

On Tue, Sep 28, 2004 at 05:40:26PM -0500, Timur Tabi wrote:
> I was hoping that this bug would be fixed in the 2.6 kernels, but
> apparently it hasn't been.
>
> Function get_user_pages() is supposed to lock user memory. However,
> under extreme memory constraints, the kernel will swap out the "locked"
> memory.
>
> I have a test app which does this:
>
> 1) Calls our driver, which issues a get_user_pages() call for one page.
> 2) Calls our driver again to get the physical address of that page (the
> driver uses pgd/pmd/pte_offset).
> 3) Tries to allocate 1GB of memory (this system has 1GB of physical RAM).
> 4) Tries to get the physical address again.
>
> In step 4, the physical address is usually zero, which means either
> pgd_offset or pmd_offset failed. This indicates the page was swapped out.
>
> I don't understand how this bug can continue to exist after all this
> time. get_user_pages() is supposed to lock the memory, because drivers
> use it for DMA'ing directly into user memory.

get_user_pages locks the page in memory. It doesn't do anything about ptes.
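
A toy model may help make the distinction concrete. The sketch below is plain user-space C, not kernel code: every name in it (toy_page, toy_pte, toy_pin and so on) is hypothetical, and the reference counting is a deliberate simplification of what the kernel does. It only illustrates how a page can stay allocated because a reference is held on it while the pte that used to map it is cleared, so that a later page-table walk finds nothing.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types mimic, but are not, the kernel's. */
struct toy_page {
    int refcount;   /* references held on the physical frame */
    bool in_use;    /* false once the frame has been freed */
};

struct toy_pte {
    struct toy_page *page;  /* frame this virtual address maps to */
    bool present;           /* false after reclaim unmaps the page */
};

/* Map a fresh page: the mapping itself holds one reference. */
void toy_map(struct toy_pte *pte, struct toy_page *page)
{
    page->refcount = 1;
    page->in_use = true;
    pte->page = page;
    pte->present = true;
}

/* Stand-in for get_user_pages(): take an extra reference on the page. */
struct toy_page *toy_pin(struct toy_pte *pte)
{
    if (!pte->present)
        return NULL;
    pte->page->refcount++;
    return pte->page;
}

/*
 * Stand-in for reclaim under memory pressure: clear the pte and drop the
 * mapping's reference. The frame is freed only if nobody else holds it.
 */
void toy_reclaim(struct toy_pte *pte)
{
    struct toy_page *page = pte->page;

    if (!pte->present)
        return;
    pte->present = false;
    pte->page = NULL;
    if (--page->refcount == 0)
        page->in_use = false;
}

/* Stand-in for a pgd/pmd/pte walk: finds nothing once the pte is gone. */
struct toy_page *toy_walk(const struct toy_pte *pte)
{
    return pte->present ? pte->page : NULL;
}

/* Drop the reference taken by toy_pin(). */
void toy_unpin(struct toy_page *page)
{
    if (--page->refcount == 0)
        page->in_use = false;
}
```

In this model, calling toy_map(), toy_pin() and then toy_reclaim() leaves toy_walk() returning NULL while the pinned toy_page is still in use. That mirrors the symptom in the report: the page-table walk fails, but the page itself has not gone away.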

2004-09-28 23:21:41

by Dave Hansen

Subject: Re: get_user_pages() still broken in 2.6

On Tue, 2004-09-28 at 16:03, Christoph Hellwig wrote:
> get_user_pages locks the page in memory. It doesn't do anything about ptes.

You probably want mlock(2) to keep the kernel from messing with the ptes
at all. But, you should probably really be thinking about why you're
accessing the page tables at all. I count *ONE* instance in drivers/
where page tables are accessed directly.

-- Dave
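
For reference, a minimal user-space sketch of the mlock(2) call mentioned above. The helper names alloc_locked() and free_locked() are made up for this illustration; only posix_memalign(), sysconf(), mlock() and munlock() are real library calls. Whether the lock is granted depends on the caller's privileges and on the RLIMIT_MEMLOCK resource limit, so the sketch reports a refusal instead of treating it as fatal.

```c
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Allocate a page-aligned buffer and try to lock it with mlock(2).
 * Returns the buffer, or NULL if the allocation failed. *locked is set
 * to 1 when mlock() succeeded and to 0 when the kernel refused the lock.
 */
void *alloc_locked(size_t len, int *locked)
{
    void *buf = NULL;
    long page_size = sysconf(_SC_PAGESIZE);

    *locked = 0;
    if (page_size <= 0)
        return NULL;
    if (posix_memalign(&buf, (size_t)page_size, len) != 0)
        return NULL;
    if (mlock(buf, len) == 0)
        *locked = 1;
    return buf;
}

/* Undo alloc_locked(): unlock the range if it was locked, then free it. */
void free_locked(void *buf, size_t len, int locked)
{
    if (locked)
        munlock(buf, len);
    free(buf);
}
```

A caller would check the locked flag before relying on the buffer staying resident, and fall back or report an error when it is 0.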

2004-09-29 15:06:12

by Christoph Hellwig

Subject: Re: get_user_pages() still broken in 2.6

On Wed, Sep 29, 2004 at 09:48:09AM -0500, Timur Tabi wrote:
> Christoph Hellwig wrote:
>
> > get_user_pages locks the page in memory. It doesn't do anything about ptes.
>
> I don't understand the difference. I thought a locked page is one that
> stays in memory (i.e. isn't swapped out) and whose physical address
> never changes. Is that wrong?

Yes. But if you're walking ptes you're looking at virtual addresses
somehow. Can you send me a pointer to your code please? I suspect
it's doing something terribly stupid.

2004-09-29 15:13:49

by Timur Tabi

Subject: Re: get_user_pages() still broken in 2.6

Christoph Hellwig wrote:

> get_user_pages locks the page in memory. It doesn't do anything about ptes.

I don't understand the difference. I thought a locked page is one that
stays in memory (i.e. isn't swapped out) and whose physical address
never changes. Is that wrong? All I need to do is keep a page in
memory at the same physical address until I'm done with it.

--
Timur Tabi
Staff Software Engineer
[email protected]

2004-09-29 15:13:49

by Timur Tabi

Subject: Re: get_user_pages() still broken in 2.6

Dave Hansen wrote:

> You probably want mlock(2) to keep the kernel from messing with the ptes
> at all.

mlock() can only be called via sys_mlock(), which is a user-space call.
Not only that, but only root can call sys_mlock(). This is not
compatible with our needs.

> But, you should probably really be thinking about why you're
> accessing the page tables at all. I count *ONE* instance in drivers/
> where page tables are accessed directly.

I access PTEs to get the physical addresses of a user-space buffer, so
that we can DMA to/from it directly.

--
Timur Tabi
Staff Software Engineer
[email protected]

2004-09-29 15:41:17

by Andi Kleen

Subject: Re: get_user_pages() still broken in 2.6

Timur Tabi <[email protected]> writes:

> Christoph Hellwig wrote:
>
>> get_user_pages locks the page in memory. It doesn't do anything about ptes.
>
> I don't understand the difference. I thought a locked page is one
> that stays in memory (i.e. isn't swapped out) and whose physical
> address never changes. Is that wrong? All I need to do is keep a
> page in memory at the same physical address until I'm done with it.

After get_user_pages you don't need the page tables anymore.
The struct page *s returned by it can be used for DMA.

-Andi
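
The address arithmetic a driver is left with once it holds the pages can be sketched in user-space C. This is an illustration under stated assumptions, not driver code: the toy_ names are hypothetical, a 4 KiB page size is assumed, and the page frame number passed to toy_phys() stands in for whatever the kernel would derive from a struct page (for example via page_to_pfn()). The sketch splits a user buffer into one segment per page and computes a physical address from a frame number plus the offset within the page.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12   /* assumption: 4 KiB pages */
#define TOY_PAGE_SIZE  ((uintptr_t)1 << TOY_PAGE_SHIFT)

/* One contiguous piece of the buffer that lies within a single page. */
struct toy_segment {
    size_t page_index;  /* index into the array of pinned pages */
    size_t offset;      /* byte offset within that page */
    size_t len;         /* number of bytes in this segment */
};

/* Number of pages the buffer [uaddr, uaddr + len) touches. */
size_t toy_pages_spanned(uintptr_t uaddr, size_t len)
{
    uintptr_t first, last;

    if (len == 0)
        return 0;
    first = uaddr >> TOY_PAGE_SHIFT;
    last = (uaddr + len - 1) >> TOY_PAGE_SHIFT;
    return (size_t)(last - first + 1);
}

/* Fill segs[] with one entry per page; returns the count (at most max). */
size_t toy_split(uintptr_t uaddr, size_t len,
                 struct toy_segment *segs, size_t max)
{
    size_t n = 0;
    size_t offset = (size_t)(uaddr & (TOY_PAGE_SIZE - 1));

    while (len > 0 && n < max) {
        size_t chunk = (size_t)TOY_PAGE_SIZE - offset;

        if (chunk > len)
            chunk = len;
        segs[n].page_index = n;
        segs[n].offset = offset;
        segs[n].len = chunk;
        len -= chunk;
        offset = 0;     /* every page after the first starts at 0 */
        n++;
    }
    return n;
}

/* Physical address of a byte, given its page's frame number and offset. */
uint64_t toy_phys(uint64_t pfn, size_t offset)
{
    return (pfn << TOY_PAGE_SHIFT) + offset;
}
```

For a 32-byte buffer starting at 0x1ff0, toy_split() yields two segments of 16 bytes each, the first at offset 0xff0 of page 0 and the second at offset 0 of page 1. No page-table walk is involved at any point, which is the thrust of the reply above.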