2012-11-08 23:25:56

by Sonny Rao

Subject: [PATCH] mm: Fix calculation of dirtyable memory

The system uses global_dirtyable_memory() to calculate
number of dirtyable pages/pages that can be allocated
to the page cache. A bug causes an underflow thus making
the page count look like a big unsigned number. This in turn
confuses the dirty writeback throttling to aggressively write
back pages as they become dirty (usually 1 page at a time).

Fix is to ensure there is no underflow while doing the math.

Signed-off-by: Sonny Rao <[email protected]>
Signed-off-by: Puneet Kumar <[email protected]>
---
mm/page-writeback.c | 17 +++++++++++++----
1 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 830893b..2a6356c 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -194,11 +194,19 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
unsigned long x = 0;

for_each_node_state(node, N_HIGH_MEMORY) {
+ unsigned long nr_pages;
struct zone *z =
&NODE_DATA(node)->node_zones[ZONE_HIGHMEM];

- x += zone_page_state(z, NR_FREE_PAGES) +
- zone_reclaimable_pages(z) - z->dirty_balance_reserve;
+ nr_pages = zone_page_state(z, NR_FREE_PAGES) +
+ zone_reclaimable_pages(z);
+ /*
+ * Unreclaimable memory (kernel memory or anonymous memory
+ * without swap) can bring down the dirtyable pages below
+ * the zone's dirty balance reserve.
+ */
+ if (nr_pages >= z->dirty_balance_reserve)
+ x += nr_pages - z->dirty_balance_reserve;
}
/*
* Make sure that the number of highmem pages is never larger
@@ -222,8 +230,9 @@ static unsigned long global_dirtyable_memory(void)
{
unsigned long x;

- x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages() -
- dirty_balance_reserve;
+ x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages();
+ if (x >= dirty_balance_reserve)
+ x -= dirty_balance_reserve;

if (!vm_highmem_is_dirtyable)
x -= highmem_dirtyable_memory(x);
--
1.7.7.3
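
The underflow described in the changelog is plain unsigned arithmetic and can be reproduced in isolation. Below is a minimal userspace sketch with invented page counts; it is not kernel code, only an illustration of the wraparound and of the guarded subtraction used in the patch above:

#include <stdio.h>

int main(void)
{
	/* Invented counts: fewer free + reclaimable pages than the reserve. */
	unsigned long free_pages = 1000;
	unsigned long reclaimable_pages = 200;
	unsigned long dirty_balance_reserve = 1500;

	/* Unguarded subtraction, as in the pre-patch code: wraps around. */
	unsigned long broken = free_pages + reclaimable_pages - dirty_balance_reserve;

	/* Guarded subtraction, as in the patch above: never wraps. */
	unsigned long fixed = free_pages + reclaimable_pages;
	if (fixed >= dirty_balance_reserve)
		fixed -= dirty_balance_reserve;

	printf("broken: %lu\n", broken);	/* 2^64 - 300 on a 64-bit system */
	printf("fixed:  %lu\n", fixed);		/* 1200: subtraction skipped */
	return 0;
}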


2012-11-08 23:37:59

by Andrew Morton

Subject: Re: [PATCH] mm: Fix calculation of dirtyable memory

On Thu, 8 Nov 2012 15:25:35 -0800
Sonny Rao <[email protected]> wrote:

> The system uses global_dirtyable_memory() to calculate
> number of dirtyable pages/pages that can be allocated
> to the page cache. A bug causes an underflow thus making
> the page count look like a big unsigned number. This in turn
> confuses the dirty writeback throttling to aggressively write
> back pages as they become dirty (usually 1 page at a time).
>
> Fix is to ensure there is no underflow while doing the math.
>
> Signed-off-by: Sonny Rao <[email protected]>
> Signed-off-by: Puneet Kumar <[email protected]>
> ---
> mm/page-writeback.c | 17 +++++++++++++----
> 1 files changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 830893b..2a6356c 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -194,11 +194,19 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
> unsigned long x = 0;
>
> for_each_node_state(node, N_HIGH_MEMORY) {
> + unsigned long nr_pages;
> struct zone *z =
> &NODE_DATA(node)->node_zones[ZONE_HIGHMEM];
>
> - x += zone_page_state(z, NR_FREE_PAGES) +
> - zone_reclaimable_pages(z) - z->dirty_balance_reserve;
> + nr_pages = zone_page_state(z, NR_FREE_PAGES) +
> + zone_reclaimable_pages(z);
> + /*
> + * Unreclaimable memory (kernel memory or anonymous memory
> + * without swap) can bring down the dirtyable pages below
> + * the zone's dirty balance reserve.
> + */
> + if (nr_pages >= z->dirty_balance_reserve)
> + x += nr_pages - z->dirty_balance_reserve;

If the system has two nodes and one is below its dirty_balance_reserve,
we could end up with something like:

x = 0;
...
x += 1000;
...
x += -100;

In this case, your fix would cause highmem_dirtyable_memory() to return
1000. Would it be better to instead return 900?

IOW, we instead add logic along the lines of

if ((long)x < 0)
x = 0;
return x;
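
Worked through with invented per-node numbers (a standalone illustration, not kernel code), the difference between clamping each node's contribution and clamping only the accumulated total looks like this:

#include <stdio.h>

int main(void)
{
	/* Invented per-node (free + reclaimable - reserve) contributions. */
	long node_delta[] = { 1000, -100 };	/* second node is below its reserve */
	int nnodes = 2;
	int i;

	unsigned long per_node_clamp = 0;	/* v1: drop negative contributions */
	long signed_total = 0;			/* suggested: clamp only the total */

	for (i = 0; i < nnodes; i++) {
		if (node_delta[i] > 0)
			per_node_clamp += node_delta[i];
		signed_total += node_delta[i];
	}
	if (signed_total < 0)
		signed_total = 0;

	printf("per-node clamp: %lu\n", per_node_clamp);	/* 1000 */
	printf("total clamp:    %ld\n", signed_total);		/* 900 */
	return 0;
}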

2012-11-09 00:42:20

by Sonny Rao

Subject: [PATCHv2] mm: Fix calculation of dirtyable memory

add apkm's suggestion

2012-11-09 00:45:55

by Sonny Rao

Subject: Re: [PATCHv2] mm: Fix calculation of dirtyable memory

On Thu, Nov 8, 2012 at 4:42 PM, Sonny Rao <[email protected]> wrote:
> add apkm's suggestion
>

Oops, sorry, will add akpm's suggestion and re-post

2012-11-09 00:52:43

by Sonny Rao

Subject: [PATCHv2] mm: Fix calculation of dirtyable memory

The system uses global_dirtyable_memory() to calculate
number of dirtyable pages/pages that can be allocated
to the page cache. A bug causes an underflow thus making
the page count look like a big unsigned number. This in turn
confuses the dirty writeback throttling to aggressively write
back pages as they become dirty (usually 1 page at a time).

Fix is to ensure there is no underflow while doing the math.

Signed-off-by: Sonny Rao <[email protected]>
Signed-off-by: Puneet Kumar <[email protected]>
---
v2: added akpm's suggestion to make the highmem calculation better
mm/page-writeback.c | 17 +++++++++++++++--
1 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 830893b..ce62442 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -201,6 +201,18 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
zone_reclaimable_pages(z) - z->dirty_balance_reserve;
}
/*
+ * Unreclaimable memory (kernel memory or anonymous memory
+ * without swap) can bring down the dirtyable pages below
+ * the zone's dirty balance reserve and the above calculation
+ * will underflow. However we still want to add in nodes
+ * which are below threshold (negative values) to get a more
+ * accurate calculation but make sure that the total never
+ * underflows.
+ */
+ if ((long)x < 0)
+ x = 0;
+
+ /*
* Make sure that the number of highmem pages is never larger
* than the number of the total dirtyable memory. This can only
* occur in very strange VM situations but we want to make sure
@@ -222,8 +234,9 @@ static unsigned long global_dirtyable_memory(void)
{
unsigned long x;

- x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages() -
- dirty_balance_reserve;
+ x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages();
+ if (x >= dirty_balance_reserve)
+ x -= dirty_balance_reserve;

if (!vm_highmem_is_dirtyable)
x -= highmem_dirtyable_memory(x);
--
1.7.7.3

2012-11-09 02:36:44

by Fengguang Wu

Subject: Re: [PATCHv2] mm: Fix calculation of dirtyable memory

On Thu, Nov 08, 2012 at 04:52:33PM -0800, Sonny Rao wrote:
> The system uses global_dirtyable_memory() to calculate
> number of dirtyable pages/pages that can be allocated
> to the page cache. A bug causes an underflow thus making
> the page count look like a big unsigned number. This in turn
> confuses the dirty writeback throttling to aggressively write
> back pages as they become dirty (usually 1 page at a time).
>
> Fix is to ensure there is no underflow while doing the math.

Good catch, thanks!

> Signed-off-by: Sonny Rao <[email protected]>
> Signed-off-by: Puneet Kumar <[email protected]>
> ---
> v2: added akpm's suggestion to make the highmem calculation better
> mm/page-writeback.c | 17 +++++++++++++++--
> 1 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 830893b..ce62442 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -201,6 +201,18 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
> zone_reclaimable_pages(z) - z->dirty_balance_reserve;
> }
> /*
> + * Unreclaimable memory (kernel memory or anonymous memory
> + * without swap) can bring down the dirtyable pages below
> + * the zone's dirty balance reserve and the above calculation
> + * will underflow. However we still want to add in nodes
> + * which are below threshold (negative values) to get a more
> + * accurate calculation but make sure that the total never
> + * underflows.
> + */
> + if ((long)x < 0)
> + x = 0;
> +
> + /*
> * Make sure that the number of highmem pages is never larger
> * than the number of the total dirtyable memory. This can only
> * occur in very strange VM situations but we want to make sure
> @@ -222,8 +234,9 @@ static unsigned long global_dirtyable_memory(void)
> {
> unsigned long x;
>
> - x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages() -
> - dirty_balance_reserve;
> + x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages();
> + if (x >= dirty_balance_reserve)
> + x -= dirty_balance_reserve;

That can be converted to "if ((long)x < 0) x = 0;", too.

And I suspect zone_dirtyable_memory() needs a similar fix, too.

Thanks,
Fengguang

2012-11-12 19:43:25

by Sonny Rao

Subject: [PATCH] mm: Fix calculation of dirtyable memory

The system uses global_dirtyable_memory() to calculate
number of dirtyable pages/pages that can be allocated
to the page cache. A bug causes an underflow thus making
the page count look like a big unsigned number. This in turn
confuses the dirty writeback throttling to aggressively write
back pages as they become dirty (usually 1 page at a time).

Fix is to ensure there is no underflow while doing the math.

Signed-off-by: Sonny Rao <[email protected]>
Signed-off-by: Puneet Kumar <[email protected]>
---
v2: added akpm's suggestion to make the highmem calculation better
v3: added Fengguang Wu's suggestions to fix zone_dirtyable_memory() and
(offlist mail) to use max() in global_dirtyable_memory()
mm/page-writeback.c | 25 ++++++++++++++++++++-----
1 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 830893b..f9efbe8 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -201,6 +201,18 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
zone_reclaimable_pages(z) - z->dirty_balance_reserve;
}
/*
+ * Unreclaimable memory (kernel memory or anonymous memory
+ * without swap) can bring down the dirtyable pages below
+ * the zone's dirty balance reserve and the above calculation
+ * will underflow. However we still want to add in nodes
+ * which are below threshold (negative values) to get a more
+ * accurate calculation but make sure that the total never
+ * underflows.
+ */
+ if ((long)x < 0)
+ x = 0;
+
+ /*
* Make sure that the number of highmem pages is never larger
* than the number of the total dirtyable memory. This can only
* occur in very strange VM situations but we want to make sure
@@ -222,8 +234,8 @@ static unsigned long global_dirtyable_memory(void)
{
unsigned long x;

- x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages() -
- dirty_balance_reserve;
+ x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages();
+ x -= max(x, dirty_balance_reserve);

if (!vm_highmem_is_dirtyable)
x -= highmem_dirtyable_memory(x);
@@ -290,9 +302,12 @@ static unsigned long zone_dirtyable_memory(struct zone *zone)
* highmem zone can hold its share of dirty pages, so we don't
* care about vm_highmem_is_dirtyable here.
*/
- return zone_page_state(zone, NR_FREE_PAGES) +
- zone_reclaimable_pages(zone) -
- zone->dirty_balance_reserve;
+ unsigned long nr_pages = zone_page_state(zone, NR_FREE_PAGES) +
+ zone_reclaimable_pages(zone);
+
+ /* don't allow this to underflow */
+ nr_pages -= max(nr_pages, zone->dirty_balance_reserve);
+ return nr_pages;
}

/**
--
1.7.7.3

2012-11-12 20:32:36

by Johannes Weiner

Subject: Re: [PATCH] mm: Fix calculation of dirtyable memory

On Mon, Nov 12, 2012 at 11:35:28AM -0800, Sonny Rao wrote:
> The system uses global_dirtyable_memory() to calculate
> number of dirtyable pages/pages that can be allocated
> to the page cache. A bug causes an underflow thus making
> the page count look like a big unsigned number. This in turn
> confuses the dirty writeback throttling to aggressively write
> back pages as they become dirty (usually 1 page at a time).
>
> Fix is to ensure there is no underflow while doing the math.
>
> Signed-off-by: Sonny Rao <[email protected]>
> Signed-off-by: Puneet Kumar <[email protected]>

Thanks for debugging and sending in the patch.

It might be useful to note in the changelog that the crawling
writeback problem only affects highmem systems because of the way the
underflowed count of high memory is subtracted from the overall amount
of dirtyable memory.

And that the problem was introduced with v3.2-4896-gab8fabd (which
means that we should include Cc: [email protected] for 3.3+).

The diff itself looks good to me, thanks again:

Acked-by: Johannes Weiner <[email protected]>
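
Why the crawl shows up only on highmem systems can be seen with invented numbers in a standalone sketch. This is not kernel code, and it assumes highmem_dirtyable_memory() clamps its result to the passed-in total, as the comment in the diff describes; MIN() stands in for the kernel's min():

#include <stdio.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

int main(void)
{
	/* All page counts are invented for illustration. */
	unsigned long total = 200000;		/* global free + reclaimable */
	unsigned long highmem_pages = 100;	/* highmem free + reclaimable */
	unsigned long highmem_reserve = 1500;	/* larger than highmem_pages */

	/* Pre-fix highmem accounting underflows to a huge number... */
	unsigned long highmem = highmem_pages - highmem_reserve;
	/* ...which the "never larger than total" clamp turns into total itself. */
	highmem = MIN(highmem, total);

	/* With vm_highmem_is_dirtyable == 0, the highmem share is subtracted. */
	unsigned long dirtyable = total - highmem;

	printf("highmem share: %lu\n", highmem);	/* == total */
	printf("dirtyable:     %lu\n", dirtyable);	/* 0: dirty limits collapse */
	return 0;
}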

2012-11-12 21:36:08

by Sonny Rao

Subject: [PATCHv4] mm: Fix calculation of dirtyable memory

The system uses global_dirtyable_memory() to calculate
number of dirtyable pages/pages that can be allocated
to the page cache. A bug causes an underflow thus making
the page count look like a big unsigned number. This in turn
confuses the dirty writeback throttling to aggressively write
back pages as they become dirty (usually 1 page at a time).
This generally only affects systems with highmem because the
underflowed count gets subtracted from the global count of
dirtyable memory.

The problem was introduced with v3.2-4896-gab8fabd

Fix is to ensure we don't get an underflowed total of either highmem
or global dirtyable memory.

Signed-off-by: Sonny Rao <[email protected]>
Signed-off-by: Puneet Kumar <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
CC: [email protected]
---
v2: added akpm's suggestion to make the highmem calculation better
v3: added Fengguang Wu's suggestions to fix zone_dirtyable_memory() and
(offlist mail) to use max() in global_dirtyable_memory()
v4: Added suggestions to description clarifying the role of highmem
and the commit which originally caused the problem
mm/page-writeback.c | 25 ++++++++++++++++++++-----
1 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 830893b..f9efbe8 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -201,6 +201,18 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
zone_reclaimable_pages(z) - z->dirty_balance_reserve;
}
/*
+ * Unreclaimable memory (kernel memory or anonymous memory
+ * without swap) can bring down the dirtyable pages below
+ * the zone's dirty balance reserve and the above calculation
+ * will underflow. However we still want to add in nodes
+ * which are below threshold (negative values) to get a more
+ * accurate calculation but make sure that the total never
+ * underflows.
+ */
+ if ((long)x < 0)
+ x = 0;
+
+ /*
* Make sure that the number of highmem pages is never larger
* than the number of the total dirtyable memory. This can only
* occur in very strange VM situations but we want to make sure
@@ -222,8 +234,8 @@ static unsigned long global_dirtyable_memory(void)
{
unsigned long x;

- x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages() -
- dirty_balance_reserve;
+ x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages();
+ x -= max(x, dirty_balance_reserve);

if (!vm_highmem_is_dirtyable)
x -= highmem_dirtyable_memory(x);
@@ -290,9 +302,12 @@ static unsigned long zone_dirtyable_memory(struct zone *zone)
* highmem zone can hold its share of dirty pages, so we don't
* care about vm_highmem_is_dirtyable here.
*/
- return zone_page_state(zone, NR_FREE_PAGES) +
- zone_reclaimable_pages(zone) -
- zone->dirty_balance_reserve;
+ unsigned long nr_pages = zone_page_state(zone, NR_FREE_PAGES) +
+ zone_reclaimable_pages(zone);
+
+ /* don't allow this to underflow */
+ nr_pages -= max(nr_pages, zone->dirty_balance_reserve);
+ return nr_pages;
}

/**
--
1.7.7.3

2012-11-17 20:41:54

by Damien Wyart

Subject: Re: [PATCHv4] mm: Fix calculation of dirtyable memory

Hi,

> Fix is to ensure we don't get an underflowed total of either highmem
> or global dirtyable memory.
>
> Signed-off-by: Sonny Rao <[email protected]>
> Signed-off-by: Puneet Kumar <[email protected]>
> Acked-by: Johannes Weiner <[email protected]>
> CC: [email protected]
> ---
> v2: added akpm's suggestion to make the highmem calculation better
> v3: added Fengguang Wu's suggestions to fix zone_dirtyable_memory() and
> (offlist mail) to use max() in global_dirtyable_memory()
> v4: Added suggestions to description clarifying the role of highmem
> and the commit which originally caused the problem
> mm/page-writeback.c | 25 ++++++++++++++++++++-----
> 1 files changed, 20 insertions(+), 5 deletions(-)

I am seeing a big performance regression with this patch applied on
top of 3.7-rc6: some actions generate a *lot* of disk activity
(the corresponding processes are shown in D state) and take a very
long time to complete. The problem happens mainly when:
- running apt-get update (this is a Debian system): the last phase,
"reading package lists", takes *minutes* with the disk LED blinking,
instead of a few seconds.
- installing a kernel generated with kernel-package also takes
minutes on the "updating grub" step.

I am attaching the corresponding config. This is a 64-bit system with
6 GB of memory, and the problems arise even when memory is not
heavily used (several gigs of free memory). I guess this should be
quite easy to reproduce...

Thanks,

Damien


Attachments:
config.37 (60.23 kB)

2012-11-19 13:40:12

by Johannes Weiner

Subject: Re: [PATCHv4] mm: Fix calculation of dirtyable memory

On Mon, Nov 12, 2012 at 01:35:46PM -0800, Sonny Rao wrote:
> The system uses global_dirtyable_memory() to calculate
> number of dirtyable pages/pages that can be allocated
> to the page cache. A bug causes an underflow thus making
> the page count look like a big unsigned number. This in turn
> confuses the dirty writeback throttling to aggressively write
> back pages as they become dirty (usually 1 page at a time).
> This generally only affects systems with highmem because the
> underflowed count gets subtracted from the global count of
> dirtyable memory.
>
> The problem was introduced with v3.2-4896-gab8fabd
>
> Fix is to ensure we don't get an underflowed total of either highmem
> or global dirtyable memory.
>
> Signed-off-by: Sonny Rao <[email protected]>
> Signed-off-by: Puneet Kumar <[email protected]>
> Acked-by: Johannes Weiner <[email protected]>
> CC: [email protected]
> ---
> v2: added akpm's suggestion to make the highmem calculation better
> v3: added Fengguang Wu's suggestions to fix zone_dirtyable_memory() and
> (offlist mail) to use max() in global_dirtyable_memory()
> v4: Added suggestions to description clarifying the role of highmem
> and the commit which originally caused the problem
> mm/page-writeback.c | 25 ++++++++++++++++++++-----
> 1 files changed, 20 insertions(+), 5 deletions(-)
>
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 830893b..f9efbe8 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -201,6 +201,18 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
> zone_reclaimable_pages(z) - z->dirty_balance_reserve;
> }
> /*
> + * Unreclaimable memory (kernel memory or anonymous memory
> + * without swap) can bring down the dirtyable pages below
> + * the zone's dirty balance reserve and the above calculation
> + * will underflow. However we still want to add in nodes
> + * which are below threshold (negative values) to get a more
> + * accurate calculation but make sure that the total never
> + * underflows.
> + */
> + if ((long)x < 0)
> + x = 0;
> +
> + /*
> * Make sure that the number of highmem pages is never larger
> * than the number of the total dirtyable memory. This can only
> * occur in very strange VM situations but we want to make sure
> @@ -222,8 +234,8 @@ static unsigned long global_dirtyable_memory(void)
> {
> unsigned long x;
>
> - x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages() -
> - dirty_balance_reserve;
> + x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages();
> + x -= max(x, dirty_balance_reserve);

This unconditionally zeroes out x, except when it underflows it...

min().

> if (!vm_highmem_is_dirtyable)
> x -= highmem_dirtyable_memory(x);
> @@ -290,9 +302,12 @@ static unsigned long zone_dirtyable_memory(struct zone *zone)
> * highmem zone can hold its share of dirty pages, so we don't
> * care about vm_highmem_is_dirtyable here.
> */
> - return zone_page_state(zone, NR_FREE_PAGES) +
> - zone_reclaimable_pages(zone) -
> - zone->dirty_balance_reserve;
> + unsigned long nr_pages = zone_page_state(zone, NR_FREE_PAGES) +
> + zone_reclaimable_pages(zone);
> +
> + /* don't allow this to underflow */
> + nr_pages -= max(nr_pages, zone->dirty_balance_reserve);
> + return nr_pages;

min().
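
The difference Johannes points out, spelled out with made-up numbers in a standalone sketch (not kernel code; MIN()/MAX() stand in for the kernel's min()/max()): subtracting max(x, reserve) removes at least x itself, so the result is zero when x >= reserve and wraps when x < reserve, while min() gives the intended clamp.

#include <stdio.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define MAX(a, b) ((a) > (b) ? (a) : (b))

int main(void)
{
	unsigned long reserve = 1500;
	unsigned long values[] = { 5000, 1000 };	/* above and below the reserve */
	int i;

	for (i = 0; i < 2; i++) {
		unsigned long with_max = values[i];
		unsigned long with_min = values[i];

		with_max -= MAX(with_max, reserve);	/* v3/v4: 0, or wraps when x < reserve */
		with_min -= MIN(with_min, reserve);	/* v5: the intended clamp */

		printf("x = %lu  with max(): %lu  with min(): %lu\n",
		       values[i], with_max, with_min);
	}
	return 0;
}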

2012-11-19 18:42:14

by Sonny Rao

Subject: [PATCHv5] mm: Fix calculation of dirtyable memory

The system uses global_dirtyable_memory() to calculate
number of dirtyable pages/pages that can be allocated
to the page cache. A bug causes an underflow thus making
the page count look like a big unsigned number. This in turn
confuses the dirty writeback throttling to aggressively write
back pages as they become dirty (usually 1 page at a time).
This generally only affects systems with highmem because the
underflowed count gets subtracted from the global count of
dirtyable memory.

The problem was introduced with v3.2-4896-gab8fabd

Fix is to ensure we don't get an underflowed total of either highmem
or global dirtyable memory.

Signed-off-by: Sonny Rao <[email protected]>
Signed-off-by: Puneet Kumar <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
CC: [email protected]
---
v2: added akpm's suggestion to make the highmem calculation better
v3: added Fengguang Wu's suggestions to fix zone_dirtyable_memory() and
(offlist mail) to use max() in global_dirtyable_memory()
v4: Added suggestions to description clarifying the role of highmem
and the commit which originally caused the problem
v5: Fix bug where max() was used instead of min()
mm/page-writeback.c | 25 ++++++++++++++++++++-----
1 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 830893b..f9efbe8 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -201,6 +201,18 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
zone_reclaimable_pages(z) - z->dirty_balance_reserve;
}
/*
+ * Unreclaimable memory (kernel memory or anonymous memory
+ * without swap) can bring down the dirtyable pages below
+ * the zone's dirty balance reserve and the above calculation
+ * will underflow. However we still want to add in nodes
+ * which are below threshold (negative values) to get a more
+ * accurate calculation but make sure that the total never
+ * underflows.
+ */
+ if ((long)x < 0)
+ x = 0;
+
+ /*
* Make sure that the number of highmem pages is never larger
* than the number of the total dirtyable memory. This can only
* occur in very strange VM situations but we want to make sure
@@ -222,8 +234,8 @@ static unsigned long global_dirtyable_memory(void)
{
unsigned long x;

- x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages() -
- dirty_balance_reserve;
+ x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages();
+ x -= min(x, dirty_balance_reserve);

if (!vm_highmem_is_dirtyable)
x -= highmem_dirtyable_memory(x);
@@ -290,9 +302,12 @@ static unsigned long zone_dirtyable_memory(struct zone *zone)
* highmem zone can hold its share of dirty pages, so we don't
* care about vm_highmem_is_dirtyable here.
*/
- return zone_page_state(zone, NR_FREE_PAGES) +
- zone_reclaimable_pages(zone) -
- zone->dirty_balance_reserve;
+ unsigned long nr_pages = zone_page_state(zone, NR_FREE_PAGES) +
+ zone_reclaimable_pages(zone);
+
+ /* don't allow this to underflow */
+ nr_pages -= min(nr_pages, zone->dirty_balance_reserve);
+ return nr_pages;
}

/**
--
1.7.7.3

2012-11-19 18:44:43

by Sonny Rao

Subject: Re: [PATCHv5] mm: Fix calculation of dirtyable memory

On Mon, Nov 19, 2012 at 10:41 AM, Sonny Rao <[email protected]> wrote:
> The system uses global_dirtyable_memory() to calculate
> number of dirtyable pages/pages that can be allocated
> to the page cache. A bug causes an underflow thus making
> the page count look like a big unsigned number. This in turn
> confuses the dirty writeback throttling to aggressively write
> back pages as they become dirty (usually 1 page at a time).
> This generally only affects systems with highmem because the
> underflowed count gets subtracted from the global count of
> dirtyable memory.
>
> The problem was introduced with v3.2-4896-gab8fabd
>
> Fix is to ensure we don't get an underflowed total of either highmem
> or global dirtyable memory.
>
> Signed-off-by: Sonny Rao <[email protected]>
> Signed-off-by: Puneet Kumar <[email protected]>
> Acked-by: Johannes Weiner <[email protected]>
> CC: [email protected]
> ---
> v2: added akpm's suggestion to make the highmem calculation better
> v3: added Fengguang Wu's suggestions to fix zone_dirtyable_memory() and
> (offlist mail) to use max() in global_dirtyable_memory()
> v4: Added suggestions to description clarifying the role of highmem
> and the commit which originally caused the problem
> v5: Fix bug where max() was used instead of min()
> mm/page-writeback.c | 25 ++++++++++++++++++++-----
> 1 files changed, 20 insertions(+), 5 deletions(-)
>
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 830893b..f9efbe8 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -201,6 +201,18 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
> zone_reclaimable_pages(z) - z->dirty_balance_reserve;
> }
> /*
> + * Unreclaimable memory (kernel memory or anonymous memory
> + * without swap) can bring down the dirtyable pages below
> + * the zone's dirty balance reserve and the above calculation
> + * will underflow. However we still want to add in nodes
> + * which are below threshold (negative values) to get a more
> + * accurate calculation but make sure that the total never
> + * underflows.
> + */
> + if ((long)x < 0)
> + x = 0;
> +
> + /*
> * Make sure that the number of highmem pages is never larger
> * than the number of the total dirtyable memory. This can only
> * occur in very strange VM situations but we want to make sure
> @@ -222,8 +234,8 @@ static unsigned long global_dirtyable_memory(void)
> {
> unsigned long x;
>
> - x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages() -
> - dirty_balance_reserve;
> + x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages();
> + x -= min(x, dirty_balance_reserve);
>
> if (!vm_highmem_is_dirtyable)
> x -= highmem_dirtyable_memory(x);
> @@ -290,9 +302,12 @@ static unsigned long zone_dirtyable_memory(struct zone *zone)
> * highmem zone can hold its share of dirty pages, so we don't
> * care about vm_highmem_is_dirtyable here.
> */
> - return zone_page_state(zone, NR_FREE_PAGES) +
> - zone_reclaimable_pages(zone) -
> - zone->dirty_balance_reserve;
> + unsigned long nr_pages = zone_page_state(zone, NR_FREE_PAGES) +
> + zone_reclaimable_pages(zone);
> +
> + /* don't allow this to underflow */
> + nr_pages -= min(nr_pages, zone->dirty_balance_reserve);
> + return nr_pages;
> }
>
> /**
> --
> 1.7.7.3
>

Damien, thanks for testing and finding that bug. If you could, please
give this version a try, thanks.

2012-11-19 19:30:10

by Damien Wyart

Subject: Re: [PATCHv5] mm: Fix calculation of dirtyable memory

* Sonny Rao <[email protected]> [2012-11-19 10:44]:
> Damien, thanks for testing and finding that bug. If you could, please
> give this version a try, thanks.

Tried it for a few hours (as soon as the min/max problem was suggested
on the list) and the previous problems disappeared.

--
Damien