Add look-ahead support to __do_page_cache_readahead(),
which is needed by the adaptive read-ahead logic.
Signed-off-by: Wu Fengguang <[email protected]>
---
mm/readahead.c | 15 +++++++++------
1 files changed, 9 insertions(+), 6 deletions(-)
--- linux-2.6.17-rc4-mm3.orig/mm/readahead.c
+++ linux-2.6.17-rc4-mm3/mm/readahead.c
@@ -266,7 +266,8 @@ out:
*/
static int
__do_page_cache_readahead(struct address_space *mapping, struct file *filp,
- pgoff_t offset, unsigned long nr_to_read)
+ pgoff_t offset, unsigned long nr_to_read,
+ unsigned long lookahead_size)
{
struct inode *inode = mapping->host;
struct page *page;
@@ -279,7 +280,7 @@ __do_page_cache_readahead(struct address
if (isize == 0)
goto out;
- end_index = ((isize - 1) >> PAGE_CACHE_SHIFT);
+ end_index = ((isize - 1) >> PAGE_CACHE_SHIFT);
/*
* Preallocate as many pages as we will need.
@@ -302,6 +303,8 @@ __do_page_cache_readahead(struct address
break;
page->index = page_offset;
list_add(&page->lru, &page_pool);
+ if (page_idx == nr_to_read - lookahead_size)
+ __SetPageReadahead(page);
ret++;
}
read_unlock_irq(&mapping->tree_lock);
@@ -338,7 +341,7 @@ int force_page_cache_readahead(struct ad
if (this_chunk > nr_to_read)
this_chunk = nr_to_read;
err = __do_page_cache_readahead(mapping, filp,
- offset, this_chunk);
+ offset, this_chunk, 0);
if (err < 0) {
ret = err;
break;
@@ -385,7 +388,7 @@ int do_page_cache_readahead(struct addre
if (bdi_read_congested(mapping->backing_dev_info))
return -1;
- return __do_page_cache_readahead(mapping, filp, offset, nr_to_read);
+ return __do_page_cache_readahead(mapping, filp, offset, nr_to_read, 0);
}
/*
@@ -405,7 +408,7 @@ blockable_page_cache_readahead(struct ad
if (!block && bdi_read_congested(mapping->backing_dev_info))
return 0;
- actual = __do_page_cache_readahead(mapping, filp, offset, nr_to_read);
+ actual = __do_page_cache_readahead(mapping, filp, offset, nr_to_read, 0);
return check_ra_success(ra, nr_to_read, actual);
}
@@ -450,7 +453,7 @@ static int make_ahead_window(struct addr
* @req_size: hint: total size of the read which the caller is performing in
* PAGE_CACHE_SIZE units
*
- * page_cache_readahead() is the main function. If performs the adaptive
+ * page_cache_readahead() is the main function. It performs the adaptive
* readahead window size management and submits the readahead I/O.
*
* Note that @filp is purely used for passing on to the ->readpage[s]()
--
Wu Fengguang <[email protected]> wrote:
>
> Add look-ahead support to __do_page_cache_readahead(),
> which is needed by the adaptive read-ahead logic.
You'd need to define "look-ahead support" before telling us you've added it ;)
> @@ -302,6 +303,8 @@ __do_page_cache_readahead(struct address
> break;
> page->index = page_offset;
> list_add(&page->lru, &page_pool);
> + if (page_idx == nr_to_read - lookahead_size)
> + __SetPageReadahead(page);
> ret++;
> }
OK. But the __SetPageFoo() things still give me the creeps.
OT: look:
read_unlock_irq(&mapping->tree_lock);
page = page_cache_alloc_cold(mapping);
read_lock_irq(&mapping->tree_lock);
we should have a page allocation function which just allocates a page from
this CPU's per-cpu-pages magazine, and fails if the magazine is empty:
page = alloc_pages_local(mapping_gfp_mask(x)|__GFP_COLD);
if (!page) {
read_unlock_irq(&mapping->tree_lock);
/*
* This will refill the per-cpu-pages magazine
*/
page = page_cache_alloc_cold(mapping);
read_lock_irq(&mapping->tree_lock);
}
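(alloc_pages_local() doesn't exist anywhere yet - just to make the idea
concrete, here is a rough sketch of what it might look like on top of the
current per-cpu-pages code; the zone selection is simplistic and a real
version would still have to prep the returned page:)

	struct page *alloc_pages_local(gfp_t gfp_mask)
	{
		/* preferred zone of the local node for this gfp mask */
		struct zonelist *zl = NODE_DATA(numa_node_id())->node_zonelists
						+ gfp_zone(gfp_mask);
		struct zone *zone = zl->zones[0];
		struct per_cpu_pages *pcp;
		struct page *page = NULL;
		unsigned long flags;

		local_irq_save(flags);
		/* pcp[0] is the hot list, pcp[1] the cold list */
		pcp = &zone_pcp(zone, smp_processor_id())->pcp[!!(gfp_mask & __GFP_COLD)];
		if (pcp->count) {
			page = list_entry(pcp->list.next, struct page, lru);
			list_del(&page->lru);
			pcp->count--;
		}
		local_irq_restore(flags);

		/*
		 * A real version would still run prep_new_page() on success,
		 * the way buffered_rmqueue() does.
		 */
		return page;
	}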
Andrew Morton writes:
> Wu Fengguang <[email protected]> wrote:
> > @@ -302,6 +303,8 @@ __do_page_cache_readahead(struct address
> > break;
> > page->index = page_offset;
> > list_add(&page->lru, &page_pool);
> > + if (page_idx == nr_to_read - lookahead_size)
> > + __SetPageReadahead(page);
> > ret++;
> > }
>
> OK. But the __SetPageFoo() things still give me the creeps.
I just hope that Wu Fengguang, or whoever is making these patches,
realizes that on some architectures, doing __set_bit on one CPU
concurrently with another CPU doing set_bit on a different bit in the
same word can result in the second CPU's update getting lost...
Paul.
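(If anyone wants to see it happen, here is a quick userspace illustration -
not kernel code, bit numbers and loop counts are arbitrary - of a plain
read-modify-write like __set_bit() racing with an atomic set_bit() on a
different bit of the same word:)

	/* build: gcc -O2 -pthread lost-bit.c */
	#include <pthread.h>
	#include <stdio.h>

	static volatile unsigned long word;	/* the shared flags word    */
	static unsigned long lost;		/* how often bit 1 vanished */

	static void *nonatomic_setter(void *arg)	/* plays __set_bit(0, &word) */
	{
		int i;

		for (i = 0; i < 10000000; i++) {
			unsigned long old = word;	/* load             */
			word = old | 1UL;		/* store back old|1 */
		}
		return NULL;
	}

	static void *atomic_setter(void *arg)		/* plays set_bit(1, &word) */
	{
		int i;

		for (i = 0; i < 10000000; i++) {
			__sync_fetch_and_or(&word, 2UL);
			if (!(word & 2UL))	/* our atomic update was wiped out */
				lost++;
			__sync_fetch_and_and(&word, ~2UL);
		}
		return NULL;
	}

	int main(void)
	{
		pthread_t a, b;

		pthread_create(&a, NULL, nonatomic_setter, NULL);
		pthread_create(&b, NULL, atomic_setter, NULL);
		pthread_join(a, NULL);
		pthread_join(b, NULL);
		printf("atomically set bit lost %lu times\n", lost);
		return 0;
	}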
Paul Mackerras <[email protected]> wrote:
>
> Andrew Morton writes:
>
> > Wu Fengguang <[email protected]> wrote:
> > > @@ -302,6 +303,8 @@ __do_page_cache_readahead(struct address
> > > break;
> > > page->index = page_offset;
> > > list_add(&page->lru, &page_pool);
> > > + if (page_idx == nr_to_read - lookahead_size)
> > > + __SetPageReadahead(page);
> > > ret++;
> > > }
> >
> > OK. But the __SetPageFoo() things still give me the creeps.
>
> I just hope that Wu Fengguang, or whoever is making these patches,
> realizes that on some architectures, doing __set_bit on one CPU
> concurrently with another CPU doing set_bit on a different bit in the
> same word can result in the second CPU's update getting lost...
>
That's true even on x86.
Yes, this is understood - in this case he's following Nick's dubious lead
in leveraging our knowledge that no other code path will be attempting to
modify this page's flags at this time. It's just been taken off the
freelist, it's not yet on the LRU and we own the only ref to it.
The only hole I was able to shoot in this is swsusp, which walks mem_map[]
fiddling with page flags. But when it does this, only one CPU is running.
But I'm itching for an excuse to extirpate it all ;)
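(For reference, the window being relied upon in __do_page_cache_readahead(),
with comments added for illustration - the page is fresh from the allocator
and only reachable through the local page_pool list, so nothing else can be
touching page->flags until read_pages() inserts it into the page cache:)

	read_unlock_irq(&mapping->tree_lock);
	page = page_cache_alloc_cold(mapping);	/* fresh page, we hold the only ref */
	read_lock_irq(&mapping->tree_lock);
	if (!page)
		break;
	page->index = page_offset;
	list_add(&page->lru, &page_pool);	/* private list, not the LRU */
	if (page_idx == nr_to_read - lookahead_size)
		__SetPageReadahead(page);	/* non-atomic is OK: nothing else
						 * can see ->flags yet */
	ret++;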
On Thu, May 25, 2006 at 09:30:39AM -0700, Andrew Morton wrote:
> Wu Fengguang <[email protected]> wrote:
> >
> > Add look-ahead support to __do_page_cache_readahead(),
> > which is needed by the adaptive read-ahead logic.
>
> You'd need to define "look-ahead support" before telling us you've added it ;)
>
> > @@ -302,6 +303,8 @@ __do_page_cache_readahead(struct address
> > break;
> > page->index = page_offset;
> > list_add(&page->lru, &page_pool);
> > + if (page_idx == nr_to_read - lookahead_size)
> > + __SetPageReadahead(page);
> > ret++;
> > }
>
> OK. But the __SetPageFoo() things still give me the creeps.
Hehe, updated to SetPageReadahead().
> OT: look:
>
> read_unlock_irq(&mapping->tree_lock);
> page = page_cache_alloc_cold(mapping);
> read_lock_irq(&mapping->tree_lock);
>
> we should have a page allocation function which just allocates a page from
> this CPU's per-cpu-pages magazine, and fails if the magazine is empty:
>
> page = alloc_pages_local(mapping_gfp_mask(x)|__GFP_COLD);
> if (!page) {
> read_unlock_irq(&mapping->tree_lock);
> /*
> * This will refill the per-cpu-pages magazine
> */
> page = page_cache_alloc_cold(mapping);
> read_lock_irq(&mapping->tree_lock);
> }
Seems good, except that alloc_pages_local() would not be able to
spread memory among nodes the way page_cache_alloc_cold() does.
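(For context, the NUMA version of page_cache_alloc_cold() in mm/filemap.c
does roughly the following - the cpuset memory-spreading path is exactly
what a purely CPU-local allocator would bypass:)

	#ifdef CONFIG_NUMA
	struct page *page_cache_alloc_cold(struct address_space *x)
	{
		if (cpuset_do_page_mem_spread()) {
			/* rotate allocations over the cpuset's allowed nodes */
			int n = cpuset_mem_spread_node();
			return alloc_pages_node(n, mapping_gfp_mask(x)|__GFP_COLD, 0);
		}
		return alloc_pages(mapping_gfp_mask(x)|__GFP_COLD, 0);
	}
	#endif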
Wu