When lifting the default readahead size from 128KB to 512KB,
make sure it won't add memory pressure to small memory systems.
For read-ahead, the memory pressure is mainly readahead buffers consumed
by too many concurrent streams. The context readahead can adapt
readahead size to thrashing threshold well. So in principle we don't
need to adapt the default _max_ read-ahead size to memory pressure.
For read-around, the memory pressure is mainly read-around misses on
executables/libraries. Which could be reduced by scaling down
read-around size on fast "reclaim passes".
This patch presents a straightforward solution: to limit default
readahead size proportional to available system memory, ie.
512MB mem => 512KB readahead size
128MB mem => 128KB readahead size
32MB mem => 32KB readahead size (minimal)
Strictly speaking, only read-around size has to be limited. However we
don't bother to separate read-around size from read-ahead size for now.
CC: Matt Mackall <[email protected]>
Signed-off-by: Wu Fengguang <[email protected]>
---
mm/readahead.c | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
--- linux.orig/mm/readahead.c 2010-02-24 10:44:42.000000000 +0800
+++ linux/mm/readahead.c 2010-02-24 10:44:42.000000000 +0800
@@ -19,6 +19,10 @@
#include <linux/pagevec.h>
#include <linux/pagemap.h>
+#define MIN_READAHEAD_PAGES DIV_ROUND_UP(VM_MIN_READAHEAD*1024, PAGE_CACHE_SIZE)
+
+static int __initdata user_defined_readahead_size;
+
static int __init config_readahead_size(char *str)
{
unsigned long bytes;
@@ -36,11 +40,33 @@ static int __init config_readahead_size(
bytes = 128 << 20;
}
+ user_defined_readahead_size = 1;
default_backing_dev_info.ra_pages = bytes / PAGE_CACHE_SIZE;
return 0;
}
early_param("readahead", config_readahead_size);
+static int __init check_readahead_size(void)
+{
+	/*
+	 * Scale down default readahead size for small memory systems.
+	 * For example, a 64MB box will do 64KB read-ahead/read-around
+	 * instead of the default 512KB.
+	 *
+	 * Note that the default readahead size will also be scaled down
+	 * for small devices in add_disk().
+	 */
+	if (!user_defined_readahead_size) {
+		unsigned long max = roundup_pow_of_two(totalram_pages / 1024);
+		if (default_backing_dev_info.ra_pages > max)
+			default_backing_dev_info.ra_pages = max;
+		if (default_backing_dev_info.ra_pages < MIN_READAHEAD_PAGES)
+			default_backing_dev_info.ra_pages = MIN_READAHEAD_PAGES;
+	}
+	return 0;
+}
+fs_initcall(check_readahead_size);
+
/*
* Initialise a struct file's readahead state. Assumes that the caller has
* memset *ra to zero.
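
The scaling rule in check_readahead_size() can be modeled in user space to check the table from the commit message. This is only a sketch, not kernel code: it assumes 4KB pages, takes the 512KB default and the 32KB minimum from the message above, and the names scaled_readahead_kb() and roundup_pow_of_two_ul() are hypothetical stand-ins for the kernel helpers.

```c
/* Assumptions for illustration only: 4KB pages, 512KB default, 32KB minimum. */
#define PAGE_SIZE_KB   4UL
#define DEFAULT_RA_KB  512UL
#define MIN_RA_KB      32UL

/*
 * Smallest power of two >= n. Returns 1 for n == 0, which is a choice
 * made for this sketch; the kernel helper leaves that case undefined.
 */
static unsigned long roundup_pow_of_two_ul(unsigned long n)
{
	unsigned long p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Default readahead size in KB for a box with mem_mb megabytes of RAM. */
static unsigned long scaled_readahead_kb(unsigned long mem_mb)
{
	unsigned long totalram_pages = mem_mb * 1024 / PAGE_SIZE_KB;
	unsigned long ra_pages = DEFAULT_RA_KB / PAGE_SIZE_KB;
	unsigned long min_pages = (MIN_RA_KB + PAGE_SIZE_KB - 1) / PAGE_SIZE_KB;
	unsigned long max = roundup_pow_of_two_ul(totalram_pages / 1024);

	/* Same two clamps as in the patch: cap by memory, then apply the floor. */
	if (ra_pages > max)
		ra_pages = max;
	if (ra_pages < min_pages)
		ra_pages = min_pages;
	return ra_pages * PAGE_SIZE_KB;
}
```

With these assumptions, scaled_readahead_kb(64) gives 64 and scaled_readahead_kb(16) gives the 32KB floor. On a real system totalram_pages is somewhat below the installed memory, and the round-up to a power of two is what keeps a nominal 512MB box at 512KB.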
On 02/23/2010 10:10 PM, Wu Fengguang wrote:
> When lifting the default readahead size from 128KB to 512KB,
> make sure it won't add memory pressure to small memory systems.
>
> For read-ahead, the memory pressure is mainly readahead buffers consumed
> by too many concurrent streams. The context readahead can adapt
> readahead size to thrashing threshold well. So in principle we don't
> need to adapt the default _max_ read-ahead size to memory pressure.
>
> For read-around, the memory pressure is mainly read-around misses on
> executables/libraries. Which could be reduced by scaling down
> read-around size on fast "reclaim passes".
>
> This patch presents a straightforward solution: to limit default
> readahead size proportional to available system memory, ie.
> 512MB mem => 512KB readahead size
> 128MB mem => 128KB readahead size
> 32MB mem => 32KB readahead size (minimal)
>
> Strictly speaking, only read-around size has to be limited. However we
> don't bother to separate read-around size from read-ahead size for now.
>
> CC: Matt Mackall<[email protected]>
> Signed-off-by: Wu Fengguang<[email protected]>
Acked-by: Rik van Riel <[email protected]>
Wu Fengguang wrote:
> When lifting the default readahead size from 128KB to 512KB,
> make sure it won't add memory pressure to small memory systems.
>
> For read-ahead, the memory pressure is mainly readahead buffers consumed
> by too many concurrent streams. The context readahead can adapt
> readahead size to thrashing threshold well. So in principle we don't
> need to adapt the default _max_ read-ahead size to memory pressure.
>
> For read-around, the memory pressure is mainly read-around misses on
> executables/libraries. Which could be reduced by scaling down
> read-around size on fast "reclaim passes".
>
> This patch presents a straightforward solution: to limit default
> readahead size proportional to available system memory, ie.
> 512MB mem => 512KB readahead size
> 128MB mem => 128KB readahead size
> 32MB mem => 32KB readahead size (minimal)
>
> Strictly speaking, only read-around size has to be limited. However we
> don't bother to separate read-around size from read-ahead size for now.
>
> CC: Matt Mackall <[email protected]>
> Signed-off-by: Wu Fengguang <[email protected]>
What I state here is for read ahead in a "multi iozone sequential"
setup, I can't speak for real "read around" workloads.
So probably your table is fine to cover read-around+read-ahead in one
number.
I have tested 256MB mem systems with 512kb readahead quite a lot.
On those 512kb is still by far superior to smaller readaheads and I
didn't see major thrashing or memory pressure impact.
Therefore I would recommend a table like:
>=256MB mem => 512KB readahead size
128MB mem => 128KB readahead size
32MB mem => 32KB readahead size (minimal)
--
Grüsse / regards, Christian Ehrhardt
IBM Linux Technology Center, System z Linux Performance
On Thu, Feb 25, 2010 at 11:25:54PM +0800, Christian Ehrhardt wrote:
>
>
> Wu Fengguang wrote:
> > When lifting the default readahead size from 128KB to 512KB,
> > make sure it won't add memory pressure to small memory systems.
> >
> > For read-ahead, the memory pressure is mainly readahead buffers consumed
> > by too many concurrent streams. The context readahead can adapt
> > readahead size to thrashing threshold well. So in principle we don't
> > need to adapt the default _max_ read-ahead size to memory pressure.
> >
> > For read-around, the memory pressure is mainly read-around misses on
> > executables/libraries. Which could be reduced by scaling down
> > read-around size on fast "reclaim passes".
> >
> > This patch presents a straightforward solution: to limit default
> > readahead size proportional to available system memory, ie.
> > 512MB mem => 512KB readahead size
> > 128MB mem => 128KB readahead size
> > 32MB mem => 32KB readahead size (minimal)
> >
> > Strictly speaking, only read-around size has to be limited. However we
> > don't bother to separate read-around size from read-ahead size for now.
> >
> > CC: Matt Mackall <[email protected]>
> > Signed-off-by: Wu Fengguang <[email protected]>
>
> What I state here is for read ahead in a "multi iozone sequential"
> setup, I can't speak for real "read around" workloads.
> So probably your table is fine to cover read-around+read-ahead in one
> number.
OK.
> I have tested 256MB mem systems with 512kb readahead quite a lot.
> On those 512kb is still by far superior to smaller readaheads and I
> didn't see major thrashing or memory pressure impact.
In fact I'd expect a 64MB box to also benefit from 512kb readahead :)
> Therefore I would recommend a table like:
> >=256MB mem => 512KB readahead size
> 128MB mem => 128KB readahead size
> 32MB mem => 32KB readahead size (minimal)
So, I'm fed up with compromising the read-ahead size with read-around
size.
There is no point in introducing a read-around size to confuse the user
though. Instead, I'll introduce a read-around size limit _on top of_
the readahead size. This will allow power users to adjust
read-ahead/read-around size at the same time, while saving the low end
from unnecessary memory pressure :) I made the assumption that low end
users have no need to request a large read-around size.
Thanks,
Fengguang
---
readahead: limit read-ahead size for small memory systems
When lifting the default readahead size from 128KB to 512KB,
make sure it won't add memory pressure to small memory systems.
For read-ahead, the memory pressure is mainly readahead buffers consumed
by too many concurrent streams. The context readahead can adapt
readahead size to thrashing threshold well. So in principle we don't
need to adapt the default _max_ read-ahead size to memory pressure.
For read-around, the memory pressure is mainly read-around misses on
executables/libraries. Which could be reduced by scaling down
read-around size on fast "reclaim passes".
This patch presents a straightforward solution: to limit default
read-ahead size proportional to available system memory, ie.
512MB mem => 512KB readahead size
128MB mem => 128KB readahead size
32MB mem => 32KB readahead size
CC: Matt Mackall <[email protected]>
CC: Christian Ehrhardt <[email protected]>
Signed-off-by: Wu Fengguang <[email protected]>
---
mm/filemap.c | 2 +-
mm/readahead.c | 22 ++++++++++++++++++++++
2 files changed, 23 insertions(+), 1 deletion(-)
--- linux.orig/mm/filemap.c 2010-02-26 10:04:28.000000000 +0800
+++ linux/mm/filemap.c 2010-02-26 10:08:33.000000000 +0800
@@ -1431,7 +1431,7 @@ static void do_sync_mmap_readahead(struc
 	/*
 	 * mmap read-around
 	 */
-	ra_pages = max_sane_readahead(ra->ra_pages);
+	ra_pages = min(ra->ra_pages, roundup_pow_of_two(totalram_pages / 1024));
 	if (ra_pages) {
 		ra->start = max_t(long, 0, offset - ra_pages/2);
 		ra->size = ra_pages;
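
The effect of the changed line on the mmap read-around window can be tried out in user space. The sketch below is not the kernel code: struct ra_window, mmap_read_around() and roundup_pow_of_two_ul() are hypothetical names, and all sizes are in pages, so the KB figures quoted afterwards assume 4KB pages.

```c
/* Hypothetical stand-in for the start/size fields of the readahead state. */
struct ra_window {
	unsigned long start;	/* first page index of the window */
	unsigned long size;	/* window length in pages, 0 = no read-around */
};

/* Smallest power of two >= n; returns 1 for n == 0 (a choice for this sketch). */
static unsigned long roundup_pow_of_two_ul(unsigned long n)
{
	unsigned long p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Cap the per-file readahead size by the memory-proportional limit, then
 * center the read-around window on the faulting page offset.
 */
static struct ra_window mmap_read_around(unsigned long ra_pages,
					 unsigned long totalram_pages,
					 unsigned long offset)
{
	unsigned long limit = roundup_pow_of_two_ul(totalram_pages / 1024);
	unsigned long pages = ra_pages < limit ? ra_pages : limit;
	struct ra_window w = { 0, 0 };

	if (pages) {
		/* Same intent as max_t(long, 0, offset - pages/2): clamp at page 0. */
		w.start = offset > pages / 2 ? offset - pages / 2 : 0;
		w.size = pages;
	}
	return w;
}
```

On a 64MB box (16384 pages) with a 128-page (512KB) readahead setting, a fault at page 100 yields a 16-page (64KB) window starting at page 92. On a 1GB box the limit is 256 pages, so the full 128-page setting is used.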
> readahead: limit read-ahead size for small memory systems
>
> When lifting the default readahead size from 128KB to 512KB,
> make sure it won't add memory pressure to small memory systems.
btw, I wrote some comments to summarize the now complex readahead size
rules..
==
readahead: add notes on readahead size
Basically, currently the default max readahead size
- is 512k
- is boot time configurable with "readahead="
and is auto scaled down:
- for small devices
- for small memory systems (read-around size alone)
CC: Matt Mackall <[email protected]>
CC: Christian Ehrhardt <[email protected]>
Signed-off-by: Wu Fengguang <[email protected]>
---
mm/readahead.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
--- linux.orig/mm/readahead.c 2010-02-26 10:11:41.000000000 +0800
+++ linux/mm/readahead.c 2010-02-26 10:11:55.000000000 +0800
@@ -7,6 +7,28 @@
* Initial version.
*/
+/*
+ * Notes on readahead size.
+ *
+ * The default max readahead size is VM_MAX_READAHEAD=512k,
+ * which can be changed by user with boot time parameter "readahead="
+ * or runtime interface "/sys/devices/virtual/bdi/default/read_ahead_kb".
+ * The latter normally only takes effect in future for hot added devices.
+ *
+ * The effective max readahead size for each block device can be accessed with
+ * 1) the `blockdev` command
+ * 2) /sys/block/sda/queue/read_ahead_kb
+ * 3) /sys/devices/virtual/bdi/$(env stat -c '%t:%T' /dev/sda)/read_ahead_kb
+ *
+ * They are typically initialized with the global default size, however may be
+ * auto scaled down for small devices in add_disk(). NFS, software RAID, btrfs
+ * etc. have special rules to setup their default readahead size.
+ *
+ * The mmap read-around size typically equals the readahead size, with an
+ * extra limit proportional to system memory size. For example, a 64MB box
+ * will have a 64KB read-around size limit, 128MB mem => 128KB limit, etc.
+ */
+
#include <linux/kernel.h>
#include <linux/fs.h>
#include <linux/memcontrol.h>
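
The read_ahead_kb files listed in the notes each hold a single decimal number, so the effective value can be read with a few lines of C. This is a rough sketch: read_ahead_kb_from() is a hypothetical helper name, and the -1 error convention is a choice made here, not part of any kernel interface.

```c
#include <stdio.h>

/*
 * Read a readahead size in KB from a sysfs-style file that contains one
 * decimal number. Returns the value, or -1 if the file cannot be opened
 * or does not start with a number.
 */
static long read_ahead_kb_from(const char *path)
{
	FILE *f = fopen(path, "r");
	long kb = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &kb) != 1)
		kb = -1;
	fclose(f);
	return kb;
}
```

For example, read_ahead_kb_from("/sys/block/sda/queue/read_ahead_kb") would report the per-device value, provided a device named sda exists on the machine.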
Unfortunately without a chance to measure this atm, this patch now looks
really good to me.
Thanks for adapting it to a read-ahead only per mem limit.
Acked-by: Christian Ehrhardt <[email protected]>
--
Grüsse / regards, Christian Ehrhardt
IBM Linux Technology Center, System z Linux Performance
Christian,
On Fri, Feb 26, 2010 at 03:23:40PM +0800, Christian Ehrhardt wrote:
> Unfortunately without a chance to measure this atm, this patch now looks
> really good to me.
> Thanks for adapting it to a read-ahead only per mem limit.
> Acked-by: Christian Ehrhardt <[email protected]>
Thank you. Effective measurement is hard because it really depends on
how the user wants to stress his small memory system ;) So I think
a simple-to-understand yet reasonable limit scheme would be OK.
Thanks,
Fengguang
---
readahead: limit read-ahead size for small memory systems
When lifting the default readahead size from 128KB to 512KB,
make sure it won't add memory pressure to small memory systems.
For read-ahead, the memory pressure is mainly readahead buffers consumed
by too many concurrent streams. The context readahead can adapt
readahead size to thrashing threshold well. So in principle we don't
need to adapt the default _max_ read-ahead size to memory pressure.
For read-around, the memory pressure is mainly read-around misses on
executables/libraries. Which could be reduced by scaling down
read-around size on fast "reclaim passes".
This patch presents a straightforward solution: to limit default
read-around size proportional to available system memory, i.e.
512MB mem => 512KB read-around size
128MB mem => 128KB read-around size
32MB mem => 32KB read-around size
This will allow power users to adjust read-ahead/read-around size at
once, while saving the low end from unnecessary memory pressure, under
the assumption that low end users have no need to request a large
read-around size.
CC: Matt Mackall <[email protected]>
Acked-by: Christian Ehrhardt <[email protected]>
Signed-off-by: Wu Fengguang <[email protected]>
---
mm/filemap.c | 2 +-
mm/readahead.c | 22 ++++++++++++++++++++++
2 files changed, 23 insertions(+), 1 deletion(-)
--- linux.orig/mm/filemap.c 2010-02-26 10:04:28.000000000 +0800
+++ linux/mm/filemap.c 2010-02-26 10:08:33.000000000 +0800
@@ -1431,7 +1431,7 @@ static void do_sync_mmap_readahead(struc
 	/*
 	 * mmap read-around
 	 */
-	ra_pages = max_sane_readahead(ra->ra_pages);
+	ra_pages = min(ra->ra_pages, roundup_pow_of_two(totalram_pages / 1024));
 	if (ra_pages) {
 		ra->start = max_t(long, 0, offset - ra_pages/2);
 		ra->size = ra_pages;
On Fri, Feb 26, 2010 at 10:48:37AM +0800, Wu Fengguang wrote:
> > readahead: limit read-ahead size for small memory systems
> >
> > When lifting the default readahead size from 128KB to 512KB,
> > make sure it won't add memory pressure to small memory systems.
>
> btw, I wrote some comments to summarize the now complex readahead size
> rules..
>
> ==
> readahead: add notes on readahead size
>
> Basically, currently the default max readahead size
> - is 512k
> - is boot time configurable with "readahead="
> and is auto scaled down:
> - for small devices
> - for small memory systems (read-around size alone)
>
> CC: Matt Mackall <[email protected]>
> CC: Christian Ehrhardt <[email protected]>
> Signed-off-by: Wu Fengguang <[email protected]>
> ---
> mm/readahead.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> --- linux.orig/mm/readahead.c 2010-02-26 10:11:41.000000000 +0800
> +++ linux/mm/readahead.c 2010-02-26 10:11:55.000000000 +0800
> @@ -7,6 +7,28 @@
> * Initial version.
> */
>
> +/*
> + * Notes on readahead size.
> + *
> + * The default max readahead size is VM_MAX_READAHEAD=512k,
> + * which can be changed by user with boot time parameter "readahead="
> + * or runtime interface "/sys/devices/virtual/bdi/default/read_ahead_kb".
> + * The latter normally only takes effect in future for hot added devices.
> + *
> + * The effective max readahead size for each block device can be accessed with
> + * 1) the `blockdev` command
> + * 2) /sys/block/sda/queue/read_ahead_kb
> + * 3) /sys/devices/virtual/bdi/$(env stat -c '%t:%T' /dev/sda)/read_ahead_kb
> + *
> + * They are typically initialized with the global default size, however may be
> + * auto scaled down for small devices in add_disk(). NFS, software RAID, btrfs
> + * etc. have special rules to setup their default readahead size.
> + *
> + * The mmap read-around size typically equals the readahead size, with an
> + * extra limit proportional to system memory size. For example, a 64MB box
> + * will have a 64KB read-around size limit, 128MB mem => 128KB limit, etc.
> + */
> +
Great. I was confused among so many ways to control read ahead size. This
documentation helps a lot.
Vivek
> #include <linux/kernel.h>
> #include <linux/fs.h>
> #include <linux/memcontrol.h>