Hi Andrew,
now I have implemented the big get_user_pages rewrite. All the
patches apply to 2.4.46 so far.
Description of each patch (all called page_walk_api-2.5.46-XX):
01 - Implements the page walk api, staying as compatible, as
possible to the user users. Introduces get_one_user_page()
for the purpose, that the name suggests.
The page iterator is called walk_user_pages().
get_one_user_page() expects find_extend_vma() to be called first.
This is necessary or at least a nice optimization anyway in
all its callers.
Doesn't work with CONFIG_HUGETLB_PAGE and warns about that.
(02 removes that).
Introduces OBSOLETE_PAGE_WALKER and some mess to catch
callers of get_user_pages, which need to collect vmas.
(05 removes that).
Also demonstrate its benefits in make_pages_present().
02 - Implements a new follow_hugetlb_page() for i386,ia64,sparc64.
Caveat: spin_lock(&mm->page_table_lock) is taken in this
function, which wasn't needed before.
Some contention analysis is welcome!
03 - Use the new function from 02 and support huge tlb pages now, too.
04 - Sprinkle the get_one_user_page() where only one page is needed.
05 - This patch requires the aio fix, which I sent out earlier
today[1]. fs/aio.c called get_user_pages() with a wrong argument,
causing buffer overflow and scribble on kernel memory.
Here get_user_pages() is reduced by the last argument ("vmas"),
because no remaining callers used it.
I also sprinkled comments about optimizing with the new
page walker API, where apropriate.
All in all, I still need to implement the reason for all this:
A scatterlist page walker.
This will be done, if this stuff is accepted, because it
makes no sense without it.
The attached patchset supersedes the one I sent as page_walk_api[2].
Comments are very welcome.
Regards
Ingo Oeser
[1] Ben might post it together with his next AIO update, so
linux-kernel please stay tuned. On linux-mm it has been
posted.
[2] <[email protected]>
--
Science is what we can tell a computer. Art is everything else. --- D.E.Knuth
On Thu, Nov 07, 2002 at 11:08:40AM +0100, Ingo Oeser wrote:
+#ifdef OBSOLETE_PAGE_WALKER
+ if (vmas) {
+ gup_u.pv.vmas = vmas;
+ gup_u.pv.max_vmas = len;
+ walker = gup_add_pv;
+ printk("Obsolete argument \"vmas\" used!"
+ " Please send this report to [email protected]"
+ " or fix the caller. Stack trace follows...\n");
+ WARN_ON(vmas);
+ }
+#else
+ /* FIXME: Or should we simply ignore it? -ioe */
+ BUG_ON(vmas);
+#endif
What on earth? Why not just remove the obsolete code?
Bill
Hi William,
On Thu, Nov 07, 2002 at 04:59:55AM -0800, William Lee Irwin III wrote:
> I think just fixing the callers and killing the dead code is all
> that's needed (or wanted).
Ok, so I put it up as splitup patches and as one complete patch onto
<http://www.tu-chemnitz.de/~ioe/patches-page_walk/index.html>
Regards
Ingo Oeser
--
Science is what we can tell a computer. Art is everything else. --- D.E.Knuth
Ingo Oeser wrote:
>
> Hi Andrew,
>
> now I have implemented the big get_user_pages rewrite.
/* &custom_page_walker - A custom page walk handler for walk_user_pages().
* vma: The vma we walk pages of.
* page: The page we found or an %ERR_PTR() value
* virt_addr: The virtual address we are at while walking.
* customdata: Anything you would like to pass additionally.
*
* Returns:
* Negative values -> ERRNO values.
* 0 -> continue page walking.
* 1 -> abort page walking.
*
* If this functions gets a page, for which %IS_ERR(@page) is true, than it
* should do it's cleanup of customdata and return -PTR_ERR(@page).
*
* This function is called with @vma->vm_mm->page_table_lock held,
* if IS_ERR(@vma) is not true.
*
* But if IS_ERR(@vma) is true, IS_ERR(@page) is also true, since if we have no
* vma, then we also have no user space page.
*
* If it returns a negative value, then the page_table_lock must be dropped
* by this function, if it is held.
*/
This locking is rather awkward. Why is it necessary, and can it
be simplified??
wrt the removal of the vmas arg to get_user_pages(): I assume this
was because none of the multipage callers were using it?
The patches would be easier to follow if things were sequenced a
little differently: lose the intermediate steps. Or just roll
the whole thing into a single patch, really. I don't think there
are any intermediate steps in this one?