In 2.4.16, vmscan.c::shrink_cache(), we have the following piece of code:
	/*
	 * The page is locked. IO in progress?
	 * Move it to the back of the list.
	 */
	if (unlikely(TryLockPage(page))) {
		if (PageLaunder(page) && (gfp_mask & __GFP_FS)) {
			page_cache_get(page);
			spin_unlock(&pagemap_lru_lock);
			wait_on_page(page);
			page_cache_release(page);
			spin_lock(&pagemap_lru_lock);
		}
		continue;
	}
1) Who is moving the page to the back of the list?
2) Is the locked page worth waiting for? I can understand that the page is
being laundered, so after the wait we may get a clean page, but from a
performance point of view this involves unnecessary context switches. Also,
during high memory pressure kswapd will sleep here when it could be getting
more clean pages from the inactive list. What are we losing if we don't wait
on the page and trust that we will free it on the next pass?
-- Amol
[email protected] wrote:
>
> In 2.4.16, vmscan.c::shrink_cache(), we have the following piece of code:
>
> ...
>
> 1) Who is moving the page to the back of the list?
Nobody. The comment is wrong.
Possibly the code is wrong, too. We don't want to keep scanning
the same page all the time.
> 2) Is the locked page worth waiting for? I can understand that the page is
> being laundered, so after the wait we may get a clean page, but from a
> performance point of view this involves unnecessary context switches. Also,
> during high memory pressure kswapd will sleep here when it could be getting
> more clean pages from the inactive list. What are we losing if we don't wait
> on the page and trust that we will free it on the next pass?
>
Well, we need to wait on I/O _somewhere_ in there. Otherwise everyone
just ends up busywaiting on I/O completion. The idea is that on the
first pass through the inactive list, we start I/O, mark the page as
PG_launder and don't wait on the I/O. On the second pass through the
list, when we find a PG_launder page, we wait on it. This has the
effect of slowing memory-requesters down to the speed of the I/O
system. All this is for mmapped pages. The same behaviour is
implemented for write() pages via the BH_Launder bits on their buffers
over in sync_page_buffers().
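(For reference, the first-pass side of that protocol looks roughly like
this in 2.4.16's shrink_cache(), condensed, with comments and error
handling trimmed:)

	if (PageDirty(page) && is_page_cache_freeable(page) && page->mapping) {
		int (*writepage)(struct page *);

		writepage = page->mapping->a_ops->writepage;
		if ((gfp_mask & __GFP_FS) && writepage) {
			ClearPageDirty(page);
			SetPageLaunder(page);	/* tag for the second pass */
			page_cache_get(page);
			spin_unlock(&pagemap_lru_lock);

			writepage(page);	/* start the I/O; don't wait */
			page_cache_release(page);

			spin_lock(&pagemap_lru_lock);
			continue;
		}
	}

(The TryLockPage() branch quoted at the top of the thread is then the
second pass: the only locked pages worth sleeping on are the ones this
code tagged.)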
-
Andrew Morton wrote:
>
> [email protected] wrote:
> >
> > ...
> > 1) Who is moving the page to the back of the list?
>
> Nobody. The comment is wrong.
>
Sorry. The page is *already* at the back of the list. It's the very
first thing that happens:

	list_del(entry);
	list_add(entry, &inactive_list);

So the comment should read "Leave it at the back of the inactive list".
-
On December 31, 2001 08:37 pm, Andrew Morton wrote:
> [email protected] wrote:
> >
> > ...
> >
>
> Well, we need to wait on I/O _somewhere_ in there. Otherwise everyone
> just ends up busywaiting on I/O completion. The idea is that on the
> first pass through the inactive list, we start I/O, mark the page as
> PG_launder and don't wait on the I/O. On the second pass through the
> list, when we find a PG_launder page, we wait on it. This has the
> effect of slowing memory-requesters down to the speed of the I/O
> system. All this is for mmapped pages. The same behaviour is
> implemented for write() pages via the BH_Launder bits on their buffers
> over in sync_page_buffers().
I think we want the pages in the process of being written to live on a
separate list. Pages can be pulled off that list by a separate thread, or
perhaps in the I/O completion interrupt (opportunistically, if the list lock
is available), meaning kswapd would block less and waste less time examining
locked pages.
--
Daniel
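(As a rough sketch of the scheme Daniel describes: all names below are
invented for illustration, and a real version would also have to
coordinate with pagemap_lru_lock and use IRQ-safe locking.)

	static LIST_HEAD(launder_list);
	static spinlock_t launder_lock = SPIN_LOCK_UNLOCKED;

	/* First pass: once writeout has been started, park the page here
	 * instead of leaving it on the inactive list for the scan to keep
	 * tripping over. */
	static void park_laundering_page(struct page *page)
	{
		spin_lock(&launder_lock);
		list_del(&page->lru);
		list_add(&page->lru, &launder_list);
		spin_unlock(&launder_lock);
	}

	/* I/O completion: take the lock only if it is uncontended, so we
	 * never spin in interrupt context.  list_add_tail() puts the now
	 * clean page at the tail of the inactive list, which is where the
	 * shrink_cache() scan looks first; a separate thread sweeps up
	 * anything this opportunistic path misses. */
	static void launder_io_done(struct page *page)
	{
		if (spin_trylock(&launder_lock)) {
			list_del(&page->lru);
			list_add_tail(&page->lru, &inactive_list);
			spin_unlock(&launder_lock);
		}
	}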
Daniel Phillips wrote:
>
> I think we want the pages in the process of being written to live on a
> separate list. Pages can be pulled off that list by a separate thread, or
> perhaps in the I/O completion interrupt (opportunistically, if the list lock
> is available), meaning kswapd would block less and waste less time examining
> locked pages.
Yes, possibly. Also the unlocked pages which have locked buffers,
which tend to be 99% of the pages...
But then again:
- I've never seen this code disgrace itself in profiler output unless
it's in already-hopelessly-confused mode.
- Personally, I wouldn't recommend anything like that without having
previously done a deep analysis of the existing implementation's
dynamics and behaviour. Something which would take a week (or two,
given the way the elevator analysis is shaping up).
This activity is something which I have never countenanced because
the code has been under continual futzing for a year.
-
Andrew Morton <[email protected]> on 01/01/2002 01:19:05 AM
To: Amol Lad/HSS@HSS, [email protected]
cc:
Subject: Re: locked page handling
>
> Sorry. The page is *already* at the back of the list. It's the very
> first thing that happens:
>
> 	list_del(entry);
> 	list_add(entry, &inactive_list);
>
> So the comment should read "Leave it at the back of the inactive list".
>
Well... list_add() moves the page to the *beginning* of the list...
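(For what it's worth, both statements hold: list_add() inserts at the
head, and shrink_cache() scans from the tail, so a page moved to the
head is the last one the scan will reach. Roughly, from the 2.4 tree:)

	/* include/linux/list.h: insert right after the head, i.e. at the
	 * front of the list. */
	static __inline__ void list_add(struct list_head *new,
					struct list_head *head)
	{
		__list_add(new, head, head->next);
	}

	/* mm/vmscan.c::shrink_cache(): the scan pulls entries from
	 * inactive_list.prev, i.e. the tail.  A page list_add()ed to the
	 * head is therefore "at the back" from the scanner's point of
	 * view, even though it sits at the front of the list. */
	while (--max_scan >= 0 && (entry = inactive_list.prev) != &inactive_list)
		...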
-