2010-11-11 20:33:40

by Steven J. Magnani

Subject: [PATCH][RESEND] nommu: yield CPU periodically while disposing large VM

Depending on processor speed, page size, and the amount of memory a process
is allowed to amass, cleanup of a large VM may freeze the system for many
seconds. This can result in a watchdog timeout.

Make sure other tasks receive some service when cleaning up large VMs.

Signed-off-by: Steven J. Magnani <[email protected]>
---
diff -uprN a/mm/nommu.c b/mm/nommu.c
--- a/mm/nommu.c 2010-10-21 07:42:23.000000000 -0500
+++ b/mm/nommu.c 2010-10-21 07:46:50.000000000 -0500
@@ -1656,6 +1656,7 @@ SYSCALL_DEFINE2(munmap, unsigned long, a
 void exit_mmap(struct mm_struct *mm)
 {
 	struct vm_area_struct *vma;
+	unsigned long next_yield = jiffies + HZ;

 	if (!mm)
 		return;
@@ -1668,6 +1669,11 @@ void exit_mmap(struct mm_struct *mm)
 		mm->mmap = vma->vm_next;
 		delete_vma_from_mm(vma);
 		delete_vma(mm, vma);
+		/* Yield periodically to prevent watchdog timeout */
+		if (time_after(jiffies, next_yield)) {
+			cond_resched();
+			next_yield = jiffies + HZ;
+		}
 	}

 	kleave("");


2010-11-12 02:44:37

by Andrew Morton

Subject: Re: [PATCH][RESEND] nommu: yield CPU periodically while disposing large VM

On Thu, 11 Nov 2010 14:33:16 -0600 "Steven J. Magnani" <[email protected]> wrote:

> Depending on processor speed, page size, and the amount of memory a process
> is allowed to amass, cleanup of a large VM may freeze the system for many
> seconds. This can result in a watchdog timeout.

hm, that's no good.

> Make sure other tasks receive some service when cleaning up large VMs.
>
> Signed-off-by: Steven J. Magnani <[email protected]>
> ---
> diff -uprN a/mm/nommu.c b/mm/nommu.c
> --- a/mm/nommu.c 2010-10-21 07:42:23.000000000 -0500
> +++ b/mm/nommu.c 2010-10-21 07:46:50.000000000 -0500
> @@ -1656,6 +1656,7 @@ SYSCALL_DEFINE2(munmap, unsigned long, a
> void exit_mmap(struct mm_struct *mm)
> {
> struct vm_area_struct *vma;
> + unsigned long next_yield = jiffies + HZ;
>
> if (!mm)
> return;
> @@ -1668,6 +1669,11 @@ void exit_mmap(struct mm_struct *mm)
> mm->mmap = vma->vm_next;
> delete_vma_from_mm(vma);
> delete_vma(mm, vma);
> + /* Yield periodically to prevent watchdog timeout */
> + if (time_after(jiffies, next_yield)) {
> + cond_resched();
> + next_yield = jiffies + HZ;
> + }
> }
>
> kleave("");

You might be able to do this a bit more neatly with __ratelimit:

	DEFINE_RATELIMIT_STATE(rl, HZ, 1);

	...

	if (___ratelimit(&rl, NULL))
		cond_resched();

but ___ratelimit() isn't really ready for that - it still has (easily
fixed) assumptions that it's being used for printk ratelimiting.


But anyway. cond_resched() is pretty efficient and one second is still
a very long time. I suspect you don't need the ratelimiting at all?

2010-11-14 05:07:14

by KOSAKI Motohiro

Subject: Re: [PATCH][RESEND] nommu: yield CPU periodically while disposing large VM

> Depending on processor speed, page size, and the amount of memory a process
> is allowed to amass, cleanup of a large VM may freeze the system for many
> seconds. This can result in a watchdog timeout.
>
> Make sure other tasks receive some service when cleaning up large VMs.
>
> Signed-off-by: Steven J. Magnani <[email protected]>
> ---
> diff -uprN a/mm/nommu.c b/mm/nommu.c
> --- a/mm/nommu.c 2010-10-21 07:42:23.000000000 -0500
> +++ b/mm/nommu.c 2010-10-21 07:46:50.000000000 -0500
> @@ -1656,6 +1656,7 @@ SYSCALL_DEFINE2(munmap, unsigned long, a
> void exit_mmap(struct mm_struct *mm)
> {
> struct vm_area_struct *vma;
> + unsigned long next_yield = jiffies + HZ;
>
> if (!mm)
> return;
> @@ -1668,6 +1669,11 @@ void exit_mmap(struct mm_struct *mm)
> mm->mmap = vma->vm_next;
> delete_vma_from_mm(vma);
> delete_vma(mm, vma);
> + /* Yield periodically to prevent watchdog timeout */
> + if (time_after(jiffies, next_yield)) {
> + cond_resched();
> + next_yield = jiffies + HZ;
> + }

If the watchdog timer interval is shorter than HZ jiffies (one second), this
logic doesn't work, right? If so, I would suggest simply removing the
time_after() check and calling cond_resched() every time, because
cond_resched() is a no-op if TIF_NEED_RESCHED is not set.
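
The interval concern can be illustrated with a small user-space model. This is
only a sketch, not kernel code: every name in it (simulate_teardown,
ticks_per_vma, yield_interval, watchdog_timeout) is hypothetical, and it
assumes the watchdog is serviced only when the teardown loop yields.

```c
#include <stdbool.h>

/*
 * User-space model, NOT kernel code. All names are hypothetical.
 * Assumption: the watchdog is serviced only when the loop yields.
 *
 * Returns true if the watchdog would have expired during teardown.
 */
bool simulate_teardown(unsigned long nr_vmas,
                       unsigned long ticks_per_vma,
                       unsigned long yield_interval, /* 0 = yield every iteration */
                       unsigned long watchdog_timeout)
{
    unsigned long now = 0;          /* stand-in for jiffies */
    unsigned long last_service = 0; /* last time the watchdog was serviced */
    unsigned long next_yield = yield_interval;

    for (unsigned long i = 0; i < nr_vmas; i++) {
        now += ticks_per_vma;       /* modelled cost of freeing one VMA */

        if (yield_interval == 0 || now > next_yield) {
            /* Yield point: check whether the watchdog expired before it. */
            if (now - last_service > watchdog_timeout)
                return true;
            last_service = now;
            next_yield = now + yield_interval;
        }
    }
    /* The loop may also end long after the last yield. */
    return now - last_service > watchdog_timeout;
}
```

With one tick per VMA, simulate_teardown(1000, 1, 100, 50) reports an expiry
because the first yield comes after the 50-tick timeout, while
simulate_teardown(1000, 1, 0, 50) does not, since the model yields on every
iteration.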


2010-11-15 14:29:19

by Steven J. Magnani

Subject: Re: [PATCH][RESEND] nommu: yield CPU periodically while disposing large VM

On Thu, 2010-11-11 at 18:40 -0800, Andrew Morton wrote:
> On Thu, 11 Nov 2010 14:33:16 -0600 "Steven J. Magnani" <[email protected]> wrote:
>
> > --- a/mm/nommu.c 2010-10-21 07:42:23.000000000 -0500
> > +++ b/mm/nommu.c 2010-10-21 07:46:50.000000000 -0500
> > @@ -1656,6 +1656,7 @@ SYSCALL_DEFINE2(munmap, unsigned long, a
> > void exit_mmap(struct mm_struct *mm)
> > {
> > struct vm_area_struct *vma;
> > + unsigned long next_yield = jiffies + HZ;
> >
> > if (!mm)
> > return;
> > @@ -1668,6 +1669,11 @@ void exit_mmap(struct mm_struct *mm)
> > mm->mmap = vma->vm_next;
> > delete_vma_from_mm(vma);
> > delete_vma(mm, vma);
> > + /* Yield periodically to prevent watchdog timeout */
> > + if (time_after(jiffies, next_yield)) {
> > + cond_resched();
> > + next_yield = jiffies + HZ;
> > + }
> > }
> >
> > kleave("");
>
[snip]
> cond_resched() is pretty efficient and one second is still
> a very long time. I suspect you don't need the ratelimiting at all?

Probably not, but the issue was that disposal of "large" VMs can starve
the system. Since these are not the norm (otherwise this would have been
fixed long ago) I was attempting to limit the impact on more
"normal"-sized VMs. Responsiveness is not great with a one-second
ratelimit, and as KOSAKI Motohiro points out this fix won't work on
systems with short watchdog intervals. I assumed that these were not
common.

As efficient as schedule() may be, it still scares me to call it on
reclaim of every block of memory allocated by a terminating process,
particularly on the relatively slow processors that inhabit NOMMU land.
It wasn't obvious to me that it has a quick exit. But since we are
talking about sharing the CPU with other processes perhaps this is only
an issue in an OOM scenario, when fast reclaim might be more important.

I can certainly respin the patch to call cond_resched() unconditionally
if that's the consensus.

Regards,
------------------------------------------------------------------------
Steven J. Magnani "I claim this network for MARS!
http://www.digidescorp.com Earthling, return my space modulator!"

#include <standard.disclaimer>

2010-11-16 04:51:15

by Andrew Morton

Subject: Re: [PATCH][RESEND] nommu: yield CPU periodically while disposing large VM

On Mon, 15 Nov 2010 08:29:11 -0600 "Steven J. Magnani" <[email protected]> wrote:

> On Thu, 2010-11-11 at 18:40 -0800, Andrew Morton wrote:
> > On Thu, 11 Nov 2010 14:33:16 -0600 "Steven J. Magnani" <[email protected]> wrote:
> >
> > > --- a/mm/nommu.c 2010-10-21 07:42:23.000000000 -0500
> > > +++ b/mm/nommu.c 2010-10-21 07:46:50.000000000 -0500
> > > @@ -1656,6 +1656,7 @@ SYSCALL_DEFINE2(munmap, unsigned long, a
> > > void exit_mmap(struct mm_struct *mm)
> > > {
> > > struct vm_area_struct *vma;
> > > + unsigned long next_yield = jiffies + HZ;
> > >
> > > if (!mm)
> > > return;
> > > @@ -1668,6 +1669,11 @@ void exit_mmap(struct mm_struct *mm)
> > > mm->mmap = vma->vm_next;
> > > delete_vma_from_mm(vma);
> > > delete_vma(mm, vma);
> > > + /* Yield periodically to prevent watchdog timeout */
> > > + if (time_after(jiffies, next_yield)) {
> > > + cond_resched();
> > > + next_yield = jiffies + HZ;
> > > + }
> > > }
> > >
> > > kleave("");
> >
> [snip]
> > cond_resched() is pretty efficient and one second is still
> > a very long time. I suspect you don't need the ratelimiting at all?
>
> Probably not, but the issue was that disposal of "large" VMs can starve
> the system. Since these are not the norm (otherwise this would have been
> fixed long ago) I was attempting to limit the impact on more
> "normal"-sized VMs. Responsiveness is not great with a one-second
> ratelimit, and as KOSAKI Motohiro points out this fix won't work on
> systems with short watchdog intervals. I assumed that these were not
> common.
>
> As efficient as schedule() may be, it still scares me to call it on
> reclaim of every block of memory allocated by a terminating process,
> particularly on the relatively slow processors that inhabit NOMMU land.

This is cond_resched(), not schedule()! cond_resched() is just a few
instructions, except for the super-rare case where it calls schedule().
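
As a rough illustration of that quick exit, here is a user-space model. It is
a sketch under stated assumptions, not the kernel implementation: the names
task_model, schedule_model, cond_resched_model and teardown_model are all
hypothetical, a plain boolean stands in for TIF_NEED_RESCHED, and the real
function performs additional checks that are left out here.

```c
#include <stdbool.h>

/* User-space model, NOT the kernel implementation; names are hypothetical. */
struct task_model {
    bool need_resched;          /* stand-in for TIF_NEED_RESCHED */
    unsigned long nr_schedules; /* how many times the slow path ran */
};

/* Slow path: stand-in for schedule(). */
static void schedule_model(struct task_model *t)
{
    t->nr_schedules++;
    t->need_resched = false;
}

/* Returns 1 if it rescheduled, 0 if it took the quick exit. */
int cond_resched_model(struct task_model *t)
{
    if (!t->need_resched)
        return 0;               /* fast path: a single flag test */
    schedule_model(t);
    return 1;
}

/*
 * Teardown loop that calls the model once per VMA. A reschedule is
 * requested on every `request_every`-th iteration (0 = never).
 * Returns the number of slow-path calls.
 */
unsigned long teardown_model(struct task_model *t,
                             unsigned long nr_vmas,
                             unsigned long request_every)
{
    for (unsigned long i = 0; i < nr_vmas; i++) {
        if (request_every && (i % request_every) == request_every - 1)
            t->need_resched = true; /* e.g. another task became runnable */
        cond_resched_model(t);
    }
    return t->nr_schedules;
}
```

In this model, 1000 iterations with a reschedule requested every 100th
iteration reach the slow path 10 times; the other 990 calls cost one flag
test each.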

> It wasn't obvious to me that it has a quick exit. But since we are
> talking about sharing the CPU with other processes perhaps this is only
> an issue in an OOM scenario, when fast reclaim might be more important.
>
> I can certainly respin the patch to call cond_resched() unconditionally
> if that's the consensus.

You have a consensus of 1 so far :)

2010-11-16 13:04:04

by Steven J. Magnani

Subject: Re: [PATCH][RESEND] nommu: yield CPU periodically while disposing large VM

On Mon, 2010-11-15 at 20:47 -0800, Andrew Morton wrote:
> On Mon, 15 Nov 2010 08:29:11 -0600 "Steven J. Magnani" <[email protected]> wrote:
>
> > As efficient as schedule() may be, it still scares me to call it on
> > reclaim of every block of memory allocated by a terminating process,
> > particularly on the relatively slow processors that inhabit NOMMU land.
>
> This is cond_resched(), not schedule()! cond_resched() is just a few
> instructions, except for the super-rare case where it calls schedule().

The light comes on... _cond_resched() is overloaded. I was looking at the
static version, which calls schedule(). The extern version is much more
lightweight.

I'll respin the patch.

Thanks,
------------------------------------------------------------------------
Steven J. Magnani "I claim this network for MARS!
http://www.digidescorp.com Earthling, return my space modulator!"

#include <standard.disclaimer>