2002-11-07 10:04:49

by Ingo Oeser

[permalink] [raw]
Subject: get_user_pages rewrite (completed, updated for 2.4.46)

Hi Andrew,

now I have implemented the big get_user_pages rewrite. All the
patches apply to 2.4.46 so far.

Description of each patch (all called page_walk_api-2.5.46-XX):

01 - Implements the page walk api, staying as compatible, as
possible to the user users. Introduces get_one_user_page()
for the purpose, that the name suggests.

The page iterator is called walk_user_pages().
get_one_user_page() expects find_extend_vma() to be called first.
This is necessary or at least a nice optimization anyway in
all its callers.

Doesn't work with CONFIG_HUGETLB_PAGE and warns about that.
(02 removes that).

Introduces OBSOLETE_PAGE_WALKER and some mess to catch
callers of get_user_pages, which need to collect vmas.
(05 removes that).

Also demonstrate its benefits in make_pages_present().

02 - Implements a new follow_hugetlb_page() for i386,ia64,sparc64.
Caveat: spin_lock(&mm->page_table_lock) is taken in this
function, which wasn't needed before.
Some contention analysis is welcome!

03 - Use the new function from 02 and support huge tlb pages now, too.

04 - Sprinkle the get_one_user_page() where only one page is needed.

05 - This patch requires the aio fix, which I sent out earlier
today[1]. fs/aio.c called get_user_pages() with a wrong argument,
causing buffer overflow and scribble on kernel memory.

Here get_user_pages() is reduced by the last argument ("vmas"),
because no remaining callers used it.

I also sprinkled comments about optimizing with the new
page walker API, where apropriate.


All in all, I still need to implement the reason for all this:

A scatterlist page walker.

This will be done, if this stuff is accepted, because it
makes no sense without it.

The attached patchset supersedes the one I sent as page_walk_api[2].

Comments are very welcome.

Regards

Ingo Oeser

[1] Ben might post it together with his next AIO update, so
linux-kernel please stay tuned. On linux-mm it has been
posted.

[2] <[email protected]>
--
Science is what we can tell a computer. Art is everything else. --- D.E.Knuth


Attachments:
(No filename) (2.17 kB)
page_walk_api-2.5.46-01.patch (13.63 kB)
page_walk_api-2.5.46-02.patch (6.52 kB)
page_walk_api-2.5.46-03.patch (2.15 kB)
page_walk_api-2.5.46-04.patch (2.21 kB)
page_walk_api-2.5.46-05.patch (8.25 kB)
Download all attachments

2002-11-07 11:34:15

by William Lee Irwin III

[permalink] [raw]
Subject: Re: get_user_pages rewrite (completed, updated for 2.4.46)

On Thu, Nov 07, 2002 at 11:08:40AM +0100, Ingo Oeser wrote:
+#ifdef OBSOLETE_PAGE_WALKER
+ if (vmas) {
+ gup_u.pv.vmas = vmas;
+ gup_u.pv.max_vmas = len;
+ walker = gup_add_pv;
+ printk("Obsolete argument \"vmas\" used!"
+ " Please send this report to [email protected]"
+ " or fix the caller. Stack trace follows...\n");
+ WARN_ON(vmas);
+ }
+#else
+ /* FIXME: Or should we simply ignore it? -ioe */
+ BUG_ON(vmas);
+#endif

What on earth? Why not just remove the obsolete code?


Bill

2002-11-07 22:07:53

by Ingo Oeser

[permalink] [raw]
Subject: Re: get_user_pages rewrite (completed, updated for 2.4.46)

Hi William,

On Thu, Nov 07, 2002 at 04:59:55AM -0800, William Lee Irwin III wrote:
> I think just fixing the callers and killing the dead code is all
> that's needed (or wanted).

Ok, so I put it up as splitup patches and as one complete patch onto

<http://www.tu-chemnitz.de/~ioe/patches-page_walk/index.html>

Regards

Ingo Oeser
--
Science is what we can tell a computer. Art is everything else. --- D.E.Knuth

2002-11-08 22:37:40

by Andrew Morton

[permalink] [raw]
Subject: Re: get_user_pages rewrite (completed, updated for 2.4.46)

Ingo Oeser wrote:
>
> Hi Andrew,
>
> now I have implemented the big get_user_pages rewrite.


/* &custom_page_walker - A custom page walk handler for walk_user_pages().
* vma: The vma we walk pages of.
* page: The page we found or an %ERR_PTR() value
* virt_addr: The virtual address we are at while walking.
* customdata: Anything you would like to pass additionally.
*
* Returns:
* Negative values -> ERRNO values.
* 0 -> continue page walking.
* 1 -> abort page walking.
*
* If this functions gets a page, for which %IS_ERR(@page) is true, than it
* should do it's cleanup of customdata and return -PTR_ERR(@page).
*
* This function is called with @vma->vm_mm->page_table_lock held,
* if IS_ERR(@vma) is not true.
*
* But if IS_ERR(@vma) is true, IS_ERR(@page) is also true, since if we have no
* vma, then we also have no user space page.
*
* If it returns a negative value, then the page_table_lock must be dropped
* by this function, if it is held.
*/

This locking is rather awkward. Why is it necessary, and can it
be simplified??

wrt the removal of the vmas arg to get_user_pages(): I assume this
was because none of the multipage callers were using it?

The patches would be easier to follow if things were sequenced a
little differently: lose the intermediate steps. Or just roll
the whole thing into a single patch, really. I don't think there
are any intermediate steps in this one?