Hi,
I was debugging a lockup on sparc32 where we took a page_table_lock
in handle_mm_fault and then tried to take the same one again in
swap_out_mm. So it looks like we are scheduling somewhere inside
handle_mm_fault while holding the page_table_lock.
This machine might have highmem (Im awaiting more information). If
so copy_user_highpage and clear_user_highpage could sleep if we
run out of kmap entries. There are two problems:
do_anonymous_page calls clear_user_highpage with the page_table_lock held.
do_wp_page calls break_cow with the page_table_lock held.
Since I dont think we can drop the lock, do we need a kmap_atomic for
these?
Anton
On Thu, 26 Jul 2001, Anton Blanchard wrote:
> do_wp_page calls break_cow with the page_table_lock held.
>
> Since I dont think we can drop the lock, do we need a kmap_atomic for
> these?
calling kmap() with a spinlock held is indeed Very Bad, and break_cow()
uses kmap(). I dont know why this didnt get noticed earlier. Perhaps
because kmap() schedules very rarely.
the solution is to either use (per-CPU) atomic_kmap(), or to do the
clearing (and copying) speculatively, after allocating the page but before
locking the pagetable lock. This might lead to a bit more work in the
pagefault-race case, but we dont care about that window. It will on the
other hand reduce pagetable_lock contention (because the clearing/copying
is done outside the lock), so perhaps this solution is better.
Ingo
> [...] or to do the clearing (and copying) speculatively, after
> allocating the page but before locking the pagetable lock. This might
> lead to a bit more work in the pagefault-race case, but we dont care
> about that window. It will on the other hand reduce pagetable_lock
> contention (because the clearing/copying is done outside the lock), so
> perhaps this solution is better.
the attached highmem-2.4.7-A0 patch implements this method in both
affected functions. Comments?
Ingo
> > [...] or to do the clearing (and copying) speculatively, after
> > allocating the page but before locking the pagetable lock. This might
> > lead to a bit more work in the pagefault-race case, but we dont care
> > about that window. It will on the other hand reduce pagetable_lock
> > contention (because the clearing/copying is done outside the lock), so
> > perhaps this solution is better.
>
> the attached highmem-2.4.7-A0 patch implements this method in both
> affected functions. Comments?
It seems to me that the problem is more fundamental than that. Excuse my
ignorance, but what keeps the 'old_page' (and associated pte, checked two
lines down) from disappearing somewhere between the lock drop, alloc page
and the copy from the old page? Normally if this happens it appears the new
page gets dropped and the fault occurs again, and is resolved in a
potentially different way.
jlinton
On Thu, 26 Jul 2001, Jeremy Linton wrote:
> > > [...] or to do the clearing (and copying) speculatively, after
> > > allocating the page but before locking the pagetable lock. This might
> > > lead to a bit more work in the pagefault-race case, but we dont care
> > > about that window. It will on the other hand reduce pagetable_lock
> > > contention (because the clearing/copying is done outside the lock), so
> > > perhaps this solution is better.
> >
> > the attached highmem-2.4.7-A0 patch implements this method in both
> > affected functions. Comments?
> It seems to me that the problem is more fundamental than that. Excuse my
> ignorance, but what keeps the 'old_page' (and associated pte, checked two
> lines down) from disappearing somewhere between the lock drop, alloc page
> and the copy from the old page? Normally if this happens it appears the new
> page gets dropped and the fault occurs again, and is resolved in a
> potentially different way.
I was about to answer this by pointing out that, although the pte may
change and the old_page be reused for some other purpose while we drop
the lock, the old_page won't actually "disappear". It will remain
physically present, just containing irrelevant data: there won't be
any danger from copying the wrong data, we just notice further down
that the pte changed and discard this copy and fault again (or not).
But in writing, I realize (perhaps it's your very point, understated)
that it's conceivable (though *very* unlikely) that the old_page is
reused for some other purpose while we do the copy, then freed from
that use and reused for its original purpose by the time we regain the
lock: so that the pte_same() test succeeds yet the copied data is wrong.
Either do_wp_page() needs page_cache_get(old_page) before dropping
page_table_lock, page_cache_release(old_page) after reacquiring it;
or the kmap()s done while the lock is dropped, but copy_user_page()
and kunmap()s left until the lock has been reacquired. Ingo?
Hugh