2010-08-27 00:11:44

by Ying Han

Subject: [PATCH] vmscan: fix missing place to check nr_swap_pages.

Fix a missed place where nr_swap_pages should be checked before calling
shrink_active_list(). Move the check into the common function
inactive_anon_is_low().

Signed-off-by: Ying Han <[email protected]>
---
mm/vmscan.c | 5 ++++-
1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 3109ff7..c7923e7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1605,6 +1605,9 @@ static int inactive_anon_is_low(struct zone *zone, struct scan_control *sc)
 {
 	int low;

+	if (nr_swap_pages <= 0)
+		return 0;
+
 	if (scanning_global_lru(sc))
 		low = inactive_anon_is_low_global(zone);
 	else
@@ -1856,7 +1859,7 @@ static void shrink_zone(int priority, struct zone *zone,
 	 * Even if we did not try to evict anon pages at all, we want to
 	 * rebalance the anon lru active/inactive ratio.
 	 */
-	if (inactive_anon_is_low(zone, sc) && nr_swap_pages > 0)
+	if (inactive_anon_is_low(zone, sc))
 		shrink_active_list(SWAP_CLUSTER_MAX, zone, sc, priority, 0);

 	throttle_vm_writeout(sc->gfp_mask);
--
1.7.1


2010-08-27 01:03:09

by Minchan Kim

Subject: Re: [PATCH] vmscan: fix missing place to check nr_swap_pages.

Hello.

On Fri, Aug 27, 2010 at 9:11 AM, Ying Han <[email protected]> wrote:
> Fix a missed place where nr_swap_pages should be checked before calling
> shrink_active_list(). Move the check into the common function
> inactive_anon_is_low().
>

Hmm.. AFAIR, we discussed this at the time and concluded it was not a
good idea. That's because nr_swap_pages <= 0 means both "no swap" and
"not enough swap space right now". If we have a swap device or file but
no free space at the moment, we still need to age anon pages to keep the
inactive list large enough. Otherwise, working-set pages would be
swapped out too quickly, before they can be promoted.

That aging is done by kswapd, so I don't think it does much harm to the
system. But if you want to remove aging completely on a swapless system,
we need to distinguish "no swap" from "not enough swap space". I thought
we needed it for embedded systems.

Thanks.


--
Kind regards,
Minchan Kim

2010-08-27 03:31:06

by Ying Han

Subject: Re: [PATCH] vmscan: fix missing place to check nr_swap_pages.

On Thu, Aug 26, 2010 at 6:03 PM, Minchan Kim <[email protected]> wrote:
> Hmm.. AFAIR, we discussed this at the time and concluded it was not a
> good idea. That's because nr_swap_pages <= 0 means both "no swap" and
> "not enough swap space right now". If we have a swap device or file but
> no free space at the moment, we still need to age anon pages to keep the
> inactive list large enough. Otherwise, working-set pages would be
> swapped out too quickly, before they can be promoted.

We found the problem on one of our workloads, where more TLB flushes
happen without the change. Kswapd seems to be calling
shrink_active_list(), which eventually clears the accessed bit of those
PTEs and flushes the TLB with ptep_clear_flush_young(). This system does
not have swap configured, so why age the anon LRU in that case?

> That aging is done by kswapd, so I don't think it does much harm to the
> system. But if you want to remove aging completely on a swapless system,
> we need to distinguish "no swap" from "not enough swap space". I thought
> we needed it for embedded systems.

Lots of TLB flushes hurt performance, especially on large SMP systems.
So would it make sense to change it to:

+	if (nr_swap_pages == 0)
+		return 0;

--Ying



2010-08-27 05:00:45

by Minchan Kim

Subject: Re: [PATCH] vmscan: fix missing place to check nr_swap_pages.

On Fri, Aug 27, 2010 at 12:31 PM, Ying Han <[email protected]> wrote:
> We found the problem on one of our workloads, where more TLB flushes
> happen without the change. Kswapd seems to be calling
> shrink_active_list(), which eventually clears the accessed bit of those
> PTEs and flushes the TLB with ptep_clear_flush_young(). This system does
> not have swap configured, so why age the anon LRU in that case?

True. I wanted that too, but at that time we had to take care of systems
where swap is configured but not yet enabled, as well as systems with no
swap configured.

If your system has no swap configured, how about this?
(It's not a formal, proper patch, just a quick one to show the concept.)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 3109ff7..641c6a6 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1580,6 +1580,11 @@ static void shrink_active_list(unsigned long nr_pages, struct zone *zone,
 	spin_unlock_irq(&zone->lru_lock);
 }

+/*
+ * If system doesn't have a swap configuration,
+ * it doesn't need to age anon pages in kswapd.
+ */
+#ifdef CONFIG_SWAP
 static int inactive_anon_is_low_global(struct zone *zone)
 {
 	unsigned long active, inactive;
@@ -1611,6 +1616,12 @@ static int inactive_anon_is_low(struct zone *zone, struct scan_control *sc)
 		low = mem_cgroup_inactive_anon_is_low(sc->mem_cgroup);
 	return low;
 }
+#else
+static inline int inactive_anon_is_low(struct zone *zone, struct scan_control *sc)
+{
+	return 0;
+}
+#endif

 static int inactive_file_is_low_global(struct zone *zone)
 {


--
Kind regards,
Minchan Kim

2010-08-27 16:36:02

by Ying Han

Subject: Re: [PATCH] vmscan: fix missing place to check nr_swap_pages.

On Thu, Aug 26, 2010 at 10:00 PM, Minchan Kim <[email protected]> wrote:
>
> True. I wanted that too, but at that time we had to take care of systems
> where swap is configured but not yet enabled, as well as systems with no
> swap configured.

Agreed. In our case, we care about the case where swap is configured
but not enabled.
>
> If your system has no swap configured, how about this?
> (It's not a formal, proper patch, just a quick one to show the concept.)

In our system, we do have swap configured. In vmscan.c, there are a
couple of places where we skip scanning and shrinking the anon LRU when
the condition if (nr_swap_pages <= 0) is true. It still makes sense to
me to add it to the shrink_active_list() condition, as in the initial
patch.

Also, we found that the inactive_anon_is_low condition is hit quite
often on machines with a small NUMA node size, since
zone->inactive_ratio is set based on zone->present_pages.

--Ying
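
To make the small-node effect concrete, here is a minimal userspace
sketch of how the inactive ratio relates to zone size. It is modeled on
my recollection of kernels from this period (roughly int_sqrt(10 * GB),
with a ratio of 1 for zones under 1 GB), so treat it as an assumption
rather than a copy of the kernel source. The names isqrt_floor(),
inactive_ratio_for() and anon_is_low(), and the 4 KB page size, are
hypothetical and chosen only for illustration.

```c
#define PAGE_SHIFT 12	/* assumption: 4 KB pages */

/* Floor of the integer square root, standing in for the kernel's int_sqrt(). */
unsigned long isqrt_floor(unsigned long x)
{
	unsigned long r = 0;

	while ((r + 1) * (r + 1) <= x)
		r++;
	return r;
}

/* Ratio derived from the zone size in GB; zones under 1 GB get ratio 1. */
unsigned long inactive_ratio_for(unsigned long present_pages)
{
	unsigned long gb = present_pages >> (30 - PAGE_SHIFT);

	return gb ? isqrt_floor(10 * gb) : 1;
}

/* The anon list counts as "low" when inactive * ratio < active. */
int anon_is_low(unsigned long active, unsigned long inactive,
		unsigned long present_pages)
{
	return inactive * inactive_ratio_for(present_pages) < active;
}
```

Under these assumptions a 512 MB zone (1UL << 17 pages) gets ratio 1,
so the list is reported low as soon as inactive drops below active,
while a 16 GB zone (1UL << 22 pages) gets ratio 12 and tolerates a much
smaller inactive list before the condition fires.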


2010-08-28 01:31:05

by Venkatesh Pallipadi

Subject: Re: [PATCH] vmscan: fix missing place to check nr_swap_pages.

On Fri, Aug 27, 2010 at 9:35 AM, Ying Han <[email protected]> wrote:
> In our system, we do have swap configured. In vmscan.c, there are a
> couple of places where we skip scanning and shrinking the anon LRU when
> the condition if (nr_swap_pages <= 0) is true. It still makes sense to
> me to add it to the shrink_active_list() condition, as in the initial
> patch.
>
> Also, we found that the inactive_anon_is_low condition is hit quite
> often on machines with a small NUMA node size, since
> zone->inactive_ratio is set based on zone->present_pages.
>

Does "total_swap_pages" help?

Thanks,
Venki
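
One way to read this suggestion: total_swap_pages stays at zero while no
swap device or file is enabled, whereas nr_swap_pages also reaches zero
when enabled swap is merely full. A minimal userspace sketch of that
distinction follows. The struct swap_state, the enum values and both
function names are hypothetical stand-ins for the kernel globals, not a
proposed patch.

```c
/* Hypothetical userspace model of the two kernel counters. */
struct swap_state {
	long total_swap_pages;	/* pages in all enabled swap areas */
	long nr_swap_pages;	/* swap pages currently free */
};

enum swap_status { SWAP_NONE, SWAP_FULL, SWAP_AVAILABLE };

/* Classify using both counters; nr_swap_pages alone cannot do this. */
enum swap_status classify_swap(const struct swap_state *s)
{
	if (s->total_swap_pages == 0)
		return SWAP_NONE;
	if (s->nr_swap_pages <= 0)
		return SWAP_FULL;
	return SWAP_AVAILABLE;
}

/*
 * Skip anon aging only when no swap is enabled at all. When swap is
 * enabled but full, aging continues so the inactive list keeps its size.
 */
int should_age_anon(const struct swap_state *s, int inactive_is_low)
{
	if (classify_swap(s) == SWAP_NONE)
		return 0;
	return inactive_is_low;
}
```

In this model a swapless machine skips the aging pass and its TLB
flushes, while a machine whose swap is temporarily full behaves as
before.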

2010-08-29 15:40:49

by Minchan Kim

Subject: Re: [PATCH] vmscan: fix missing place to check nr_swap_pages.

On Fri, Aug 27, 2010 at 09:35:58AM -0700, Ying Han wrote:

> Also, we found that the inactive_anon_is_low condition is hit quite
> often on machines with a small NUMA node size, since
> zone->inactive_ratio is set based on zone->present_pages.

What's your memory configuration and memory size?

Now we have a zoned page allocator and a zoned page reclaimer,
so it makes sense to me. :)

Anyway, I will resend a new version. Thanks, Ying.
--
Kind regards,
Minchan Kim

2010-08-29 15:42:30

by Minchan Kim

Subject: Re: [PATCH] vmscan: fix missing place to check nr_swap_pages.

On Fri, Aug 27, 2010 at 06:30:58PM -0700, Venkatesh Pallipadi wrote:
> On Fri, Aug 27, 2010 at 9:35 AM, Ying Han <[email protected]> wrote:
> > In our system, we do have swap configured. In vmscan.c, there are a
> > couple of places where we skip scanning and shrinking the anon LRU when
> > the condition if (nr_swap_pages <= 0) is true. It still makes sense to
> > me to add it to the shrink_active_list() condition, as in the initial
> > patch.
> >
> > Also, we found that the inactive_anon_is_low condition is hit quite
> > often on machines with a small NUMA node size, since
> > zone->inactive_ratio is set based on zone->present_pages.
> >
>
> Does "total_swap_pages" help?

Yes. Thanks for the advice.


--
Kind regards,
Minchan Kim