2005-02-23 18:08:59

by Lee Revell

Subject: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

Ingo,

Did something change recently in the VM that made copy_pte_range and
clear_page_range a lot more expensive? I noticed a reference in the
"Page Table Iterators" thread to excessive overhead introduced by
aggressive page freeing. That sure looks like what is going on in
trace2. trace1 and trace3 look like big fork latencies associated with
copy_pte_range.

This is all with PREEMPT_DESKTOP.

Lee


Attachments:
trace1.txt (3.75 kB)
trace2.txt (48.58 kB)
trace3.txt (3.75 kB)

2005-02-23 19:18:10

by Hugh Dickins

Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

On Wed, 23 Feb 2005, Lee Revell wrote:
>
> Did something change recently in the VM that made copy_pte_range and
> clear_page_range a lot more expensive? I noticed a reference in the
> "Page Table Iterators" thread to excessive overhead introduced by
> aggressive page freeing. That sure looks like what is going on in
> trace2. trace1 and trace3 look like big fork latencies associated with
> copy_pte_range.

I'm just about to test this patch below: please give it a try: thanks...

Ingo's patch to reduce scheduling latencies, by checking for lockbreak
in copy_page_range, was in the -VP and -mm patchsets some months ago;
but got preempted by the 4level rework, and not reinstated since.
Restore it now in copy_pte_range - which mercifully makes it easier.

Signed-off-by: Hugh Dickins <[email protected]>

--- 2.6.11-rc4-bk9/mm/memory.c 2005-02-21 11:32:19.000000000 +0000
+++ linux/mm/memory.c 2005-02-23 18:35:28.000000000 +0000
@@ -328,6 +328,7 @@ static int copy_pte_range(struct mm_stru
pte_t *s, *d;
unsigned long vm_flags = vma->vm_flags;

+again:
d = dst_pte = pte_alloc_map(dst_mm, dst_pmd, addr);
if (!dst_pte)
return -ENOMEM;
@@ -338,11 +339,22 @@ static int copy_pte_range(struct mm_stru
if (pte_none(*s))
continue;
copy_one_pte(dst_mm, src_mm, d, s, vm_flags, addr);
+ /*
+ * We are holding two locks at this point - either of them
+ * could generate latencies in another task on another CPU.
+ */
+ if (need_resched() ||
+ need_lockbreak(&src_mm->page_table_lock) ||
+ need_lockbreak(&dst_mm->page_table_lock))
+ break;
}
pte_unmap_nested(src_pte);
pte_unmap(dst_pte);
spin_unlock(&src_mm->page_table_lock);
+
cond_resched_lock(&dst_mm->page_table_lock);
+ if (addr < end)
+ goto again;
return 0;
}

2005-02-23 19:37:52

by Lee Revell

Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

On Wed, 2005-02-23 at 19:16 +0000, Hugh Dickins wrote:
> On Wed, 23 Feb 2005, Lee Revell wrote:
> >
> > Did something change recently in the VM that made copy_pte_range and
> > clear_page_range a lot more expensive? I noticed a reference in the
> > "Page Table Iterators" thread to excessive overhead introduced by
> > aggressive page freeing. That sure looks like what is going on in
> > trace2. trace1 and trace3 look like big fork latencies associated with
> > copy_pte_range.
>
> I'm just about to test this patch below: please give it a try: thanks...
>
> Ingo's patch to reduce scheduling latencies, by checking for lockbreak
> in copy_page_range, was in the -VP and -mm patchsets some months ago;
> but got preempted by the 4level rework, and not reinstated since.
> Restore it now in copy_pte_range - which mercifully makes it easier.

Aha, that explains why all the latency regressions involve the VM
subsystem.

Thanks, your patch fixes the copy_pte_range latency. Now zap_pte_range,
which Ingo also fixed a few months ago, is the worst offender. Can this
fix be easily ported too?

Lee


Attachments:
trace5.txt (12.49 kB)

2005-02-23 19:53:26

by Lee Revell

Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

On Wed, 2005-02-23 at 13:07 -0500, Lee Revell wrote:
> This is all with PREEMPT_DESKTOP.
>

Here is another new one, this time in the ext3 reservation code.

Lee


Attachments:
trace6.txt (62.16 kB)

2005-02-23 20:08:18

by Hugh Dickins

Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

On Wed, 23 Feb 2005, Lee Revell wrote:
> On Wed, 2005-02-23 at 19:16 +0000, Hugh Dickins wrote:
> >
> > I'm just about to test this patch below: please give it a try: thanks...

I'm very sorry, there's two things wrong with that version: _must_
increment addr before breaking out, and better to check after pte_none
too (we can question whether it might be checking too often, but this
replicates what Ingo was doing). Please replace by new patch below,
which I'm now running through lmbench.
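
To spell out the failure mode of that first version, schematically rather
than as the literal diff: the break skips the for loop's increment clause,
so addr still covers the entry just copied, and the "goto again" retry
copies it a second time, taking an extra page reference each pass.

	/* version 1 of the loop, schematic -- the bug annotated */
	for (; addr < end; addr += PAGE_SIZE, s++, d++) {
		if (pte_none(*s))
			continue;
		copy_one_pte(dst_mm, src_mm, d, s, vm_flags, addr);
		if (need_resched() ||
		    need_lockbreak(&src_mm->page_table_lock) ||
		    need_lockbreak(&dst_mm->page_table_lock))
			break;	/* addr not yet advanced... */
	}
	/* ...so the "goto again" retry recopies the same pte: leaked pages */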

> Aha, that explains why all the latency regressions involve the VM
> subsystem.
>
> Thanks, your patch fixes the copy_pte_range latency.

Great, if the previous patch fixed that latency then this new one
will too, no need to report on that; but please get rid of the old
patch before it leaks too many of your pages.

> Now zap_pte_range,
> which Ingo also fixed a few months ago, is the worst offender. Can this
> fix be easily ported too?

That surprises me: all the zap_pte_range latency fixes I know of
are in 2.6.11-rc, perhaps Ingo knows of something missing there?

Hugh

Ingo's patch to reduce scheduling latencies, by checking for lockbreak
in copy_page_range, was in the -VP and -mm patchsets some months ago;
but got preempted by the 4level rework, and not reinstated since.
Restore it now in copy_pte_range - which mercifully makes it easier.

Signed-off-by: Hugh Dickins <[email protected]>

--- 2.6.11-rc4-bk/mm/memory.c 2005-02-21 11:32:19.000000000 +0000
+++ linux/mm/memory.c 2005-02-23 19:46:40.000000000 +0000
@@ -328,21 +328,33 @@ static int copy_pte_range(struct mm_stru
pte_t *s, *d;
unsigned long vm_flags = vma->vm_flags;

+again:
d = dst_pte = pte_alloc_map(dst_mm, dst_pmd, addr);
if (!dst_pte)
return -ENOMEM;

spin_lock(&src_mm->page_table_lock);
s = src_pte = pte_offset_map_nested(src_pmd, addr);
- for (; addr < end; addr += PAGE_SIZE, s++, d++) {
- if (pte_none(*s))
- continue;
- copy_one_pte(dst_mm, src_mm, d, s, vm_flags, addr);
+ for (; addr < end; s++, d++) {
+ if (!pte_none(*s))
+ copy_one_pte(dst_mm, src_mm, d, s, vm_flags, addr);
+ addr += PAGE_SIZE;
+ /*
+ * We are holding two locks at this point - either of them
+ * could generate latencies in another task on another CPU.
+ */
+ if (need_resched() ||
+ need_lockbreak(&src_mm->page_table_lock) ||
+ need_lockbreak(&dst_mm->page_table_lock))
+ break;
}
pte_unmap_nested(src_pte);
pte_unmap(dst_pte);
spin_unlock(&src_mm->page_table_lock);
+
cond_resched_lock(&dst_mm->page_table_lock);
+ if (addr < end)
+ goto again;
return 0;
}

2005-02-23 20:11:04

by Lee Revell

Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

On Wed, 2005-02-23 at 20:06 +0000, Hugh Dickins wrote:
> On Wed, 23 Feb 2005, Lee Revell wrote:
> > On Wed, 2005-02-23 at 19:16 +0000, Hugh Dickins wrote:
> > >
> > > I'm just about to test this patch below: please give it a try: thanks...
>
> I'm very sorry, there's two things wrong with that version: _must_
> increment addr before breaking out, and better to check after pte_none
> too (we can question whether it might be checking too often, but this
> replicates what Ingo was doing). Please replace by new patch below,
> which I'm now running through lmbench.

OK, I will report any interesting results with the new patch.

Lee

2005-02-23 20:31:21

by Lee Revell

Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

On Wed, 2005-02-23 at 20:06 +0000, Hugh Dickins wrote:
> >
> > Thanks, your patch fixes the copy_pte_range latency.
>
> Great, if the previous patch fixed that latency then this new one
> will too, no need to report on that; but please get rid of the old
> patch before it leaks too many of your pages.

clear_page_range is also problematic.

Lee


Attachments:
trace7.txt (48.77 kB)

2005-02-23 20:53:55

by Hugh Dickins

Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

On Wed, 23 Feb 2005, Hugh Dickins wrote:
> Please replace by new patch below, which I'm now running through lmbench.

That second patch seems fine, and I see no lmbench regression from it.

Hugh

2005-02-23 21:04:50

by Hugh Dickins

Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

On Wed, 23 Feb 2005, Lee Revell wrote:
> > >
> > > Thanks, your patch fixes the copy_pte_range latency.
>
> clear_page_range is also problematic.

Yes, I saw that from your other traces too. I know there are plans
to improve clear_page_range during 2.6.12, but I didn't realize that
it had become very much worse than its antecedent clear_page_tables,
and I don't see missing latency fixes for that. Nick's the expert.

Hugh

2005-02-23 22:18:51

by Lee Revell

Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

On Wed, 2005-02-23 at 20:53 +0000, Hugh Dickins wrote:
> On Wed, 23 Feb 2005, Hugh Dickins wrote:
> > Please replace by new patch below, which I'm now running through lmbench.
>
> That second patch seems fine, and I see no lmbench regression from it.

Should go into 2.6.11, right?

Lee

2005-02-23 22:18:45

by Lee Revell

Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

On Wed, 2005-02-23 at 21:03 +0000, Hugh Dickins wrote:
> On Wed, 23 Feb 2005, Lee Revell wrote:
> > > >
> > > > Thanks, your patch fixes the copy_pte_range latency.
> >
> > clear_page_range is also problematic.
>
> Yes, I saw that from your other traces too.

Heh, sorry, that one was a dupe... I should know to give the files
better names.

Lee

2005-02-23 23:32:48

by Nick Piggin

Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

Hugh Dickins wrote:
> On Wed, 23 Feb 2005, Lee Revell wrote:
>
>>>>Thanks, your patch fixes the copy_pte_range latency.
>>
>>clear_page_range is also problematic.
>
>
> Yes, I saw that from your other traces too. I know there are plans
> to improve clear_page_range during 2.6.12, but I didn't realize that
> it had become very much worse than its antecedent clear_page_tables,
> and I don't see missing latency fixes for that. Nick's the expert.
>

I wouldn't have thought it should have become worse, latency
wise. What is actually happening is that the lower level freeing
functions are being called more often. But this should result in
the work being spread out more, if anything, whereas in the old
system things tended to be batched up into bigger chunks
(typically at exit() time).

If you are using i386 with 2-level page tables (no highmem), then
the behaviour should be more or less identical. Odd.

Nick

2005-02-24 00:44:18

by john cooper

Subject: PPC RT Patch..

--- ./arch/ppc/syslib/i8259.c.ORG 2005-02-01 16:26:40.000000000 -0500
+++ ./arch/ppc/syslib/i8259.c 2005-02-10 12:57:45.000000000 -0500
@@ -10,7 +10,7 @@ unsigned char cached_8259[2] = { 0xff, 0
#define cached_A1 (cached_8259[0])
#define cached_21 (cached_8259[1])

-static DEFINE_SPINLOCK(i8259_lock);
+static DEFINE_RAW_SPINLOCK(i8259_lock);

int i8259_pic_irq_offset;

=================================================================
--- ./arch/ppc/syslib/ocp.c.ORG 2004-12-24 16:35:23.000000000 -0500
+++ ./arch/ppc/syslib/ocp.c 2005-02-23 16:51:04.009548560 -0500
@@ -49,7 +49,6 @@
#include <asm/io.h>
#include <asm/ocp.h>
#include <asm/errno.h>
-#include <asm/rwsem.h>
#include <asm/semaphore.h>

//#define DBG(x) printk x
=================================================================
--- ./arch/ppc/kernel/Makefile.ORG 2005-02-01 16:26:40.000000000 -0500
+++ ./arch/ppc/kernel/Makefile 2005-02-18 20:34:07.000000000 -0500
@@ -13,7 +13,7 @@ extra-y += vmlinux.lds

obj-y := entry.o traps.o irq.o idle.o time.o misc.o \
process.o signal.o ptrace.o align.o \
- semaphore.o syscalls.o setup.o \
+ syscalls.o setup.o \
cputable.o ppc_htab.o perfmon.o
obj-$(CONFIG_6xx) += l2cr.o cpu_setup_6xx.o
obj-$(CONFIG_POWER4) += cpu_setup_power4.o
@@ -26,6 +26,9 @@ obj-$(CONFIG_TAU) += temp.o
obj-$(CONFIG_ALTIVEC) += vecemu.o vector.o
obj-$(CONFIG_FSL_BOOKE) += perfmon_fsl_booke.o

+obj-$(CONFIG_ASM_SEMAPHORES) += semaphore.o
+obj-$(CONFIG_MCOUNT) += mcount.o
+
ifndef CONFIG_MATH_EMULATION
obj-$(CONFIG_8xx) += softemu8xx.o
endif
=================================================================
--- ./arch/ppc/kernel/signal.c.ORG 2005-02-01 16:26:40.000000000 -0500
+++ ./arch/ppc/kernel/signal.c 2005-02-09 19:16:25.000000000 -0500
@@ -704,6 +704,13 @@ int do_signal(sigset_t *oldset, struct p
unsigned long frame, newsp;
int signr, ret;

+#ifdef CONFIG_PREEMPT_RT
+ /*
+ * Fully-preemptible kernel does not need interrupts disabled:
+ */
+ local_irq_enable();
+ preempt_check_resched();
+#endif
if (!oldset)
oldset = &current->blocked;

=================================================================
--- ./arch/ppc/kernel/time.c.ORG 2005-02-01 16:26:40.000000000 -0500
+++ ./arch/ppc/kernel/time.c 2005-02-19 10:46:17.000000000 -0500
@@ -91,7 +91,7 @@ extern unsigned long wall_jiffies;

static long time_offset;

-DEFINE_SPINLOCK(rtc_lock);
+DEFINE_RAW_SPINLOCK(rtc_lock);

EXPORT_SYMBOL(rtc_lock);

@@ -109,7 +109,7 @@ static inline int tb_delta(unsigned *jif
}

#ifdef CONFIG_SMP
-unsigned long profile_pc(struct pt_regs *regs)
+unsigned long notrace profile_pc(struct pt_regs *regs)
{
unsigned long pc = instruction_pointer(regs);

@@ -126,7 +126,7 @@ EXPORT_SYMBOL(profile_pc);
* with interrupts disabled.
* We set it up to overflow again in 1/HZ seconds.
*/
-void timer_interrupt(struct pt_regs * regs)
+void notrace timer_interrupt(struct pt_regs * regs)
{
int next_dec;
unsigned long cpu = smp_processor_id();
=================================================================
--- ./arch/ppc/kernel/traps.c.ORG 2005-02-01 16:26:40.000000000 -0500
+++ ./arch/ppc/kernel/traps.c 2005-02-09 18:59:40.000000000 -0500
@@ -72,7 +72,7 @@ void (*debugger_fault_handler)(struct pt
* Trap & Exception support
*/

-DEFINE_SPINLOCK(die_lock);
+DEFINE_RAW_SPINLOCK(die_lock);

void die(const char * str, struct pt_regs * fp, long err)
{
@@ -111,6 +111,10 @@ void _exception(int signr, struct pt_reg
debugger(regs);
die("Exception in kernel mode", regs, signr);
}
+#ifdef CONFIG_PREEMPT_RT
+ local_irq_enable();
+ preempt_check_resched();
+#endif
info.si_signo = signr;
info.si_errno = 0;
info.si_code = code;
=================================================================
--- ./arch/ppc/kernel/ppc_ksyms.c.ORG 2004-12-24 16:35:28.000000000 -0500
+++ ./arch/ppc/kernel/ppc_ksyms.c 2005-02-08 12:00:56.000000000 -0500
@@ -294,9 +294,11 @@ EXPORT_SYMBOL(console_drivers);
EXPORT_SYMBOL(xmon);
EXPORT_SYMBOL(xmon_printf);
#endif
+#ifdef CONFIG_ASM_SEMAPHORES
EXPORT_SYMBOL(__up);
EXPORT_SYMBOL(__down);
EXPORT_SYMBOL(__down_interruptible);
+#endif

#if defined(CONFIG_KGDB) || defined(CONFIG_XMON)
extern void (*debugger)(struct pt_regs *regs);
=================================================================
--- ./arch/ppc/kernel/entry.S.ORG 2005-02-01 16:26:40.000000000 -0500
+++ ./arch/ppc/kernel/entry.S 2005-02-23 16:30:55.924205448 -0500
@@ -8,6 +8,7 @@
* rewritten by Paul Mackerras.
* Copyright (C) 1996 Paul Mackerras.
* MPC8xx modifications Copyright (C) 1997 Dan Malek ([email protected]).
+ * RT_PREEMPT support (C) Timesys Corp. <[email protected]>
*
* This file contains the system call entry code, context switch
* code, and exception/interrupt return code for PowerPC.
@@ -44,6 +45,48 @@
#define LOAD_MSR_KERNEL(r, x) li r,(x)
#endif

+#if defined(CONFIG_LATENCY_TRACE) || defined(CONFIG_CRITICAL_IRQSOFF_TIMING) \
+ || defined(CONFIG_CRITICAL_TIMING)
+#define TFSIZE 64 /* frame SHDB multiple of 16 bytes */
+#define TFR12 48 /* TODO: prune this down -- safe but overkill */
+#define TFR11 44
+#define TFR10 40
+#define TFR9 36
+#define TFR8 32
+#define TFR7 28
+#define TFR6 24
+#define TFR5 20
+#define TFR4 16
+#define TFR3 12
+#define TFR0 8
+#define PUSHFRAME() \
+ stwu r1, -TFSIZE(r1); \
+ stw r12, TFR12(r1); \
+ stw r11, TFR11(r1); \
+ stw r10, TFR10(r1); \
+ stw r9, TFR9(r1); \
+ stw r8, TFR8(r1); \
+ stw r7, TFR7(r1); \
+ stw r6, TFR6(r1); \
+ stw r5, TFR5(r1); \
+ stw r4, TFR4(r1); \
+ stw r3, TFR3(r1); \
+ stw r0, TFR0(r1)
+#define POPFRAME() \
+ lwz r12, TFR12(r1); \
+ lwz r11, TFR11(r1); \
+ lwz r10, TFR10(r1); \
+ lwz r9, TFR9(r1); \
+ lwz r8, TFR8(r1); \
+ lwz r7, TFR7(r1); \
+ lwz r6, TFR6(r1); \
+ lwz r5, TFR5(r1); \
+ lwz r4, TFR4(r1); \
+ lwz r3, TFR3(r1); \
+ lwz r0, TFR0(r1); \
+ addi r1, r1, TFSIZE
+#endif
+
#ifdef CONFIG_BOOKE
#define COR r8 /* Critical Offset Register (COR) */
#define BOOKE_LOAD_COR lis COR,crit_save@ha
@@ -197,6 +240,20 @@ _GLOBAL(DoSyscall)
lwz r11,_CCR(r1) /* Clear SO bit in CR */
rlwinm r11,r11,0,4,2
stw r11,_CCR(r1)
+#ifdef CONFIG_LATENCY_TRACE
+ lwz r3,GPR0(r1)
+ lwz r4,GPR3(r1)
+ lwz r5,GPR4(r1)
+ lwz r6,GPR5(r1)
+ bl sys_call
+ lwz r0,GPR0(r1)
+ lwz r3,GPR3(r1)
+ lwz r4,GPR4(r1)
+ lwz r5,GPR5(r1)
+ lwz r6,GPR6(r1)
+ lwz r7,GPR7(r1)
+ lwz r8,GPR8(r1)
+#endif
#ifdef SHOW_SYSCALLS
bl do_show_syscall
#endif /* SHOW_SYSCALLS */
@@ -250,6 +307,21 @@ syscall_exit_cont:
andis. r10,r0,DBCR0_IC@h
bnel- load_dbcr0
#endif
+#if defined(CONFIG_LATENCY_TRACE) || defined(CONFIG_CRITICAL_IRQSOFF_TIMING) \
+ || defined(CONFIG_CRITICAL_TIMING)
+ PUSHFRAME();
+#ifdef CONFIG_CRITICAL_TIMING
+ bl touch_critical_timing
+#endif
+#ifdef CONFIG_CRITICAL_IRQSOFF_TIMING
+ bl trace_irqs_on
+#endif
+#ifdef CONFIG_LATENCY_TRACE
+ lwz r3, RESULT+TFSIZE(r1)
+ bl sys_ret
+#endif
+ POPFRAME();
+#endif
stwcx. r0,0,r1 /* to clear the reservation */
lwz r4,_LINK(r1)
lwz r5,_CCR(r1)
@@ -614,32 +686,38 @@ restore_user:

/* N.B. the only way to get here is from the beq following ret_from_except. */
resume_kernel:
+ lis r9,kernel_preemption@ha
+ lwz r9,kernel_preemption@l(r9)
+ cmpwi 0,r9,0
+ beq restore
/* check current_thread_info->preempt_count */
rlwinm r9,r1,0,0,18
lwz r0,TI_PREEMPT(r9)
cmpwi 0,r0,0 /* if non-zero, just restore regs and return */
bne restore
+check_resched:
lwz r0,TI_FLAGS(r9)
andi. r0,r0,_TIF_NEED_RESCHED
beq+ restore
andi. r0,r3,MSR_EE /* interrupts off? */
beq restore /* don't schedule if so */
-1: lis r0,PREEMPT_ACTIVE@h
- stw r0,TI_PREEMPT(r9)
+1:
ori r10,r10,MSR_EE
- SYNC
- MTMSRD(r10) /* hard-enable interrupts */
- bl schedule
LOAD_MSR_KERNEL(r10,MSR_KERNEL)
SYNC
MTMSRD(r10) /* disable interrupts */
+ bl preempt_schedule_irq
rlwinm r9,r1,0,0,18
li r0,0
stw r0,TI_PREEMPT(r9)
+#if 0
lwz r3,TI_FLAGS(r9)
andi. r0,r3,_TIF_NEED_RESCHED
bne- 1b
#else
+ b check_resched
+#endif
+#else
resume_kernel:
#endif /* CONFIG_PREEMPT */

=================================================================
--- ./arch/ppc/kernel/process.c.ORG 2004-12-24 16:34:45.000000000 -0500
+++ ./arch/ppc/kernel/process.c 2005-02-09 18:43:14.000000000 -0500
@@ -360,6 +360,7 @@ void show_regs(struct pt_regs * regs)
print_symbol("%s\n", regs->nip);
printk("LR [%08lx] ", regs->link);
print_symbol("%s\n", regs->link);
+ printk("preempt: %08x\n", preempt_count());
#endif
show_stack(current, (unsigned long *) regs->gpr[1]);
}
=================================================================
--- ./arch/ppc/kernel/mcount.S.ORG 2005-02-18 19:51:33.000000000 -0500
+++ ./arch/ppc/kernel/mcount.S 2005-02-23 14:25:18.780025528 -0500
@@ -0,0 +1,86 @@
+/*
+ * linux/arch/ppc/kernel/mcount.S
+ *
+ * Copyright (C) 2005 TimeSys Corporation, [email protected]
+ */
+
+#include <asm/ppc_asm.h>
+
+/*
+ * stack frame in effect when calling __mcount():
+ *
+ * 52: RA to caller
+ * 48: caller chain
+ * 44: [alignment pad]
+ * 40: saved LR to prolog/target function
+ * 36: r10
+ * 32: r9
+ * 28: r8
+ * 24: r7
+ * 20: r6
+ * 16: r5
+ * 12: r4
+ * 8: r3
+ * 4: LR save for __mcount() use
+ * 0: frame chain pointer / current r1
+ */
+
+/* preamble present in each traced function:
+ *
+ * .data
+ * .align 2
+ * 0:
+ * .long 0
+ * .previous
+ * mflr r0
+ * lis r11, 0b@ha
+ * stw r0, 4(r1)
+ * addi r0, r11, 0b@l
+ * bl _mcount
+ */
+
+ .text
+.globl _mcount
+_mcount:
+ lis r11,mcount_enabled@ha
+ lwz r11,mcount_enabled@l(r11)
+ cmpwi 0,r11,0
+ beq disabled
+ stwu r1,-48(r1) /* local frame */
+ stw r3, 8(r1)
+ stw r4, 12(r1)
+ stw r5, 16(r1)
+ stw r6, 20(r1)
+ mflr r4 /* RA to function prolog */
+ stw r7, 24(r1)
+ stw r8, 28(r1)
+ stw r9, 32(r1)
+ stw r10,36(r1)
+ stw r4, 40(r1)
+ bl __mcount /* void __mcount(void) */
+ nop
+ lwz r0, 40(r1)
+ lwz r3, 8(r1)
+ mtctr r0
+ lwz r4, 12(r1)
+ lwz r5, 16(r1)
+ lwz r6, 20(r1)
+ lwz r0, 52(r1) /* RA to function caller */
+ lwz r7, 24(r1)
+ lwz r8, 28(r1)
+ mtlr r0
+ lwz r9, 32(r1)
+ lwz r10,36(r1)
+ addi r1,r1,48 /* toss frame */
+ bctr /* return to target function */
+
+/* the function preamble modified LR in getting here so we need
+ * to restore its LR and return to the preamble otherwise
+ */
+disabled:
+ mflr r12 /* return address to preamble */
+ lwz r11, 4(r1)
+ mtlr r11 /* restore LR modified by preamble */
+ mtctr r12
+ bctr
+
=================================================================
--- ./arch/ppc/kernel/smp.c.ORG 2005-02-01 16:26:40.000000000 -0500
+++ ./arch/ppc/kernel/smp.c 2005-02-22 10:21:39.000000000 -0500
@@ -163,7 +163,7 @@ void smp_send_stop(void)
* static memory requirements. It also looks cleaner.
* Stolen from the i386 version.
*/
-static DEFINE_SPINLOCK(call_lock);
+static DEFINE_RAW_SPINLOCK(call_lock);

static struct call_data_struct {
void (*func) (void *info);
@@ -397,5 +397,15 @@ int __cpu_up(unsigned int cpu)

void smp_cpus_done(unsigned int max_cpus)
{
- smp_ops->setup_cpu(0);
+ if (smp_ops)
+ smp_ops->setup_cpu(0);
+}
+
+/* this function sends a 'reschedule' IPI to all other CPUs.
+ * This is used when RT tasks are starving and other CPUs
+ * might be able to run them
+ */
+void smp_send_reschedule_allbutself(void)
+{
+ smp_message_pass(MSG_ALL_BUT_SELF, PPC_MSG_RESCHEDULE, 0, 0);
}
=================================================================
--- ./arch/ppc/kernel/irq.c.ORG 2004-12-24 16:35:24.000000000 -0500
+++ ./arch/ppc/kernel/irq.c 2005-02-19 21:43:53.000000000 -0500
@@ -135,10 +135,11 @@ skip:
return 0;
}

-void do_IRQ(struct pt_regs *regs)
+void notrace do_IRQ(struct pt_regs *regs)
{
int irq, first = 1;
irq_enter();
+ trace_special(regs->nip, irq, 0);

/*
* Every platform is required to implement ppc_md.get_irq.
=================================================================
--- ./arch/ppc/kernel/idle.c.ORG 2005-02-01 16:26:40.000000000 -0500
+++ ./arch/ppc/kernel/idle.c 2005-02-23 13:02:34.000000000 -0500
@@ -39,6 +39,7 @@ void default_idle(void)
powersave = ppc_md.power_save;

if (!need_resched()) {
+ stop_critical_timing();
if (powersave != NULL)
powersave();
else {
=================================================================
--- ./arch/ppc/mm/fault.c.ORG 2004-12-24 16:34:29.000000000 -0500
+++ ./arch/ppc/mm/fault.c 2005-02-19 21:32:02.000000000 -0500
@@ -92,7 +92,7 @@ static int store_updates_sp(struct pt_re
* the error_code parameter is ESR for a data fault, 0 for an instruction
* fault.
*/
-int do_page_fault(struct pt_regs *regs, unsigned long address,
+int notrace do_page_fault(struct pt_regs *regs, unsigned long address,
unsigned long error_code)
{
struct vm_area_struct * vma;
@@ -104,6 +104,7 @@ int do_page_fault(struct pt_regs *regs,
#else
int is_write = 0;

+ trace_special(regs->nip, error_code, address);
/*
* Fortunately the bit assignments in SRR1 for an instruction
* fault and DSISR for a data fault are mostly the same for the
=================================================================
--- ./arch/ppc/mm/init.c.ORG 2005-02-01 16:26:40.000000000 -0500
+++ ./arch/ppc/mm/init.c 2005-02-21 13:26:10.000000000 -0500
@@ -56,7 +56,7 @@
#endif
#define MAX_LOW_MEM CONFIG_LOWMEM_SIZE

-DEFINE_PER_CPU(struct mmu_gather, mmu_gathers);
+DEFINE_PER_CPU_LOCKED(struct mmu_gather, mmu_gathers);

unsigned long total_memory;
unsigned long total_lowmem;
=================================================================
--- ./arch/ppc/boot/common/util.S.ORG 2004-12-24 16:35:49.000000000 -0500
+++ ./arch/ppc/boot/common/util.S 2005-02-23 14:27:14.577421648 -0500
@@ -289,5 +289,15 @@ _GLOBAL(flush_data_cache)
bdnz 00b
10: blr

- .previous
+#ifdef CONFIG_MCOUNT
+/* innocuous _mcount for boot header
+ */
+_GLOBAL(_mcount)
+ mflr r12 /* return address to preamble */
+ lwz r11, 4(r1)
+ mtlr r11 /* restore LR modified by preamble */
+ mtctr r12
+ bctr
+#endif

+ .previous
=================================================================
--- ./arch/ppc/platforms/encpp1_time.c.ORG 2005-02-02 20:42:55.000000000 -0500
+++ ./arch/ppc/platforms/encpp1_time.c 2005-02-09 16:35:10.000000000 -0500
@@ -155,9 +155,9 @@ int pp1_set_rtc_time (unsigned long nowt
{
unsigned char save_control, save_freq_select;
struct rtc_time tm;
- extern spinlock_t rtc_lock;
+ extern raw_spinlock_t rtc_lock;

- spin_lock (&rtc_lock);
+ __raw_spin_lock (&rtc_lock);
to_tm (nowtime, &tm);

called ();
@@ -202,7 +202,7 @@ int pp1_set_rtc_time (unsigned long nowt

if ((time_state == TIME_ERROR) || (time_state == TIME_BAD))
time_state = TIME_OK;
- spin_unlock (&rtc_lock);
+ __raw_spin_unlock (&rtc_lock);
return 0;
}

=================================================================
--- ./arch/ppc/lib/locks.c.ORG 2004-12-24 16:34:26.000000000 -0500
+++ ./arch/ppc/lib/locks.c 2005-02-07 19:21:14.000000000 -0500
@@ -8,6 +8,7 @@
#include <linux/sched.h>
#include <linux/spinlock.h>
#include <linux/module.h>
+#include <linux/rt_lock.h>
#include <asm/ppc_asm.h>
#include <asm/smp.h>

@@ -43,7 +44,7 @@ static inline unsigned long __spin_trylo
return ret;
}

-void _raw_spin_lock(spinlock_t *lock)
+void __raw_spin_lock(raw_spinlock_t *lock)
{
int cpu = smp_processor_id();
unsigned int stuck = INIT_STUCK;
@@ -63,9 +64,9 @@ void _raw_spin_lock(spinlock_t *lock)
lock->owner_pc = (unsigned long)__builtin_return_address(0);
lock->owner_cpu = cpu;
}
-EXPORT_SYMBOL(_raw_spin_lock);
+EXPORT_SYMBOL(__raw_spin_lock);

-int _raw_spin_trylock(spinlock_t *lock)
+int __raw_spin_trylock(raw_spinlock_t *lock)
{
if (__spin_trylock(&lock->lock))
return 0;
@@ -73,9 +74,9 @@ int _raw_spin_trylock(spinlock_t *lock)
lock->owner_pc = (unsigned long)__builtin_return_address(0);
return 1;
}
-EXPORT_SYMBOL(_raw_spin_trylock);
+EXPORT_SYMBOL(__raw_spin_trylock);

-void _raw_spin_unlock(spinlock_t *lp)
+void __raw_spin_unlock(raw_spinlock_t *lp)
{
if ( !lp->lock )
printk("_spin_unlock(%p): no lock cpu %d curr PC %p %s/%d\n",
@@ -89,7 +90,7 @@ void _raw_spin_unlock(spinlock_t *lp)
wmb();
lp->lock = 0;
}
-EXPORT_SYMBOL(_raw_spin_unlock);
+EXPORT_SYMBOL(__raw_spin_unlock);


/*
@@ -97,7 +98,7 @@ EXPORT_SYMBOL(_raw_spin_unlock);
* with the high bit (sign) being the "write" bit.
* -- Cort
*/
-void _raw_read_lock(rwlock_t *rw)
+void __raw_read_lock(raw_rwlock_t *rw)
{
unsigned long stuck = INIT_STUCK;
int cpu = smp_processor_id();
@@ -123,9 +124,9 @@ again:
}
wmb();
}
-EXPORT_SYMBOL(_raw_read_lock);
+EXPORT_SYMBOL(__raw_read_lock);

-void _raw_read_unlock(rwlock_t *rw)
+void __raw_read_unlock(raw_rwlock_t *rw)
{
if ( rw->lock == 0 )
printk("_read_unlock(): %s/%d (nip %08lX) lock %lx\n",
@@ -134,9 +135,9 @@ void _raw_read_unlock(rwlock_t *rw)
wmb();
atomic_dec((atomic_t *) &(rw)->lock);
}
-EXPORT_SYMBOL(_raw_read_unlock);
+EXPORT_SYMBOL(__raw_read_unlock);

-void _raw_write_lock(rwlock_t *rw)
+void __raw_write_lock(raw_rwlock_t *rw)
{
unsigned long stuck = INIT_STUCK;
int cpu = smp_processor_id();
@@ -175,9 +176,9 @@ again:
}
wmb();
}
-EXPORT_SYMBOL(_raw_write_lock);
+EXPORT_SYMBOL(__raw_write_lock);

-int _raw_write_trylock(rwlock_t *rw)
+int __raw_write_trylock(raw_rwlock_t *rw)
{
if (test_and_set_bit(31, &(rw)->lock)) /* someone has a write lock */
return 0;
@@ -190,9 +191,9 @@ int _raw_write_trylock(rwlock_t *rw)
wmb();
return 1;
}
-EXPORT_SYMBOL(_raw_write_trylock);
+EXPORT_SYMBOL(__raw_write_trylock);

-void _raw_write_unlock(rwlock_t *rw)
+void __raw_write_unlock(raw_rwlock_t *rw)
{
if ( !(rw->lock & (1<<31)) )
printk("_write_lock(): %s/%d (nip %08lX) lock %lx\n",
@@ -201,6 +202,6 @@ void _raw_write_unlock(rwlock_t *rw)
wmb();
clear_bit(31,&(rw)->lock);
}
-EXPORT_SYMBOL(_raw_write_unlock);
+EXPORT_SYMBOL(__raw_write_unlock);

#endif
=================================================================
--- ./arch/ppc/lib/dec_and_lock.c.ORG 2004-12-24 16:35:27.000000000 -0500
+++ ./arch/ppc/lib/dec_and_lock.c 2005-02-07 21:25:52.000000000 -0500
@@ -19,7 +19,7 @@
*/

#ifndef ATOMIC_DEC_AND_LOCK
-int _atomic_dec_and_lock(atomic_t *atomic, spinlock_t *lock)
+int _atomic_dec_and_raw_spin_lock(atomic_t *atomic, raw_spinlock_t *lock)
{
int counter;
int newcount;
@@ -35,12 +35,12 @@ int _atomic_dec_and_lock(atomic_t *atomi
return 0;
}

- spin_lock(lock);
+ _raw_spin_lock(lock);
if (atomic_dec_and_test(atomic))
return 1;
- spin_unlock(lock);
+ _raw_spin_unlock(lock);
return 0;
}

-EXPORT_SYMBOL(_atomic_dec_and_lock);
+EXPORT_SYMBOL(_atomic_dec_and_raw_spin_lock);
#endif /* ATOMIC_DEC_AND_LOCK */
=================================================================
--- ./arch/ppc/Kconfig.ORG 2005-02-02 20:42:55.000000000 -0500
+++ ./arch/ppc/Kconfig 2005-02-08 19:54:51.000000000 -0500
@@ -15,13 +15,6 @@ config GENERIC_HARDIRQS
bool
default y

-config RWSEM_GENERIC_SPINLOCK
- bool
-
-config RWSEM_XCHGADD_ALGORITHM
- bool
- default y
-
config GENERIC_CALIBRATE_DELAY
bool
default y
@@ -866,15 +859,21 @@ config NR_CPUS
depends on SMP
default "4"

-config PREEMPT
- bool "Preemptible Kernel"
- help
- This option reduces the latency of the kernel when reacting to
- real-time or interactive events by allowing a low priority process to
- be preempted even if it is in kernel mode executing a system call.
+source "lib/Kconfig.RT"

- Say Y here if you are building a kernel for a desktop, embedded
- or real-time system. Say N if you are unsure.
+config RWSEM_GENERIC_SPINLOCK
+ bool
+ depends on !PREEMPT_RT
+ default y
+
+config ASM_SEMAPHORES
+ bool
+ depends on !PREEMPT_RT
+ default y
+
+config RWSEM_XCHGADD_ALGORITHM
+ bool
+ depends on !RWSEM_GENERIC_SPINLOCK && !PREEMPT_RT

config HIGHMEM
bool "High memory support"
=================================================================
--- ./include/asm-generic/tlb.h.ORG 2005-02-01 16:26:51.000000000 -0500
+++ ./include/asm-generic/tlb.h 2005-02-21 13:43:21.000000000 -0500
@@ -50,7 +50,8 @@ struct mmu_gather {
#define tlb_mm(tlb) ((tlb)->mm)

/* Users of the generic TLB shootdown code must declare this storage space. */
-DECLARE_PER_CPU(struct mmu_gather, mmu_gathers);
+
+DECLARE_PER_CPU_LOCKED(struct mmu_gather, mmu_gathers);

/* tlb_gather_mmu
* Return a pointer to an initialized struct mmu_gather.
@@ -58,7 +59,8 @@ DECLARE_PER_CPU(struct mmu_gather, mmu_g
static inline struct mmu_gather *
tlb_gather_mmu(struct mm_struct *mm, unsigned int full_mm_flush)
{
- struct mmu_gather *tlb = &get_cpu_var(mmu_gathers);
+ struct mmu_gather *tlb = &get_cpu_var_locked(mmu_gathers,
+ _smp_processor_id());

tlb->mm = mm;

@@ -99,7 +101,7 @@ tlb_finish_mmu(struct mmu_gather *tlb, u
freed = rss;
mm->rss = rss - freed;
tlb_flush_mmu(tlb, start, end);
- put_cpu_var(mmu_gathers);
+ put_cpu_var_locked(mmu_gathers, _smp_processor_id());

/* keep the page table cache within bounds */
check_pgt_cache();
=================================================================
--- ./include/asm-generic/percpu.h.ORG 2005-02-01 16:26:51.000000000 -0500
+++ ./include/asm-generic/percpu.h 2005-02-21 13:11:53.000000000 -0500
@@ -53,6 +53,9 @@ do { \
#endif /* SMP */

#define DECLARE_PER_CPU(type, name) extern __typeof__(type) per_cpu__##name
+#define DECLARE_PER_CPU_LOCKED(type, name) \
+ extern __typeof__(type) per_cpu__##name##_locked; \
+ extern spinlock_t per_cpu_lock__##name##_locked

#define EXPORT_PER_CPU_SYMBOL(var) EXPORT_SYMBOL(per_cpu__##var)
#define EXPORT_PER_CPU_SYMBOL_GPL(var) EXPORT_SYMBOL_GPL(per_cpu__##var)
=================================================================
--- ./include/asm-ppc/rtc.h.ORG 2004-12-24 16:33:49.000000000 -0500
+++ ./include/asm-ppc/rtc.h 2005-02-23 14:30:06.705254240 -0500
@@ -24,6 +24,11 @@
#ifndef __ASM_RTC_H__
#define __ASM_RTC_H__

+#ifdef CONFIG_ENCPP1 /* Ampro work-around. Ugh. */
+#define cpu_mhz 300
+#define cpu_khz (cpu_mhz * 1000)
+#endif
+
#ifdef __KERNEL__

#include <linux/rtc.h>
=================================================================
--- ./include/asm-ppc/semaphore.h.ORG 2004-12-24 16:34:57.000000000 -0500
+++ ./include/asm-ppc/semaphore.h 2005-02-08 11:39:18.000000000 -0500
@@ -18,6 +18,13 @@

#include <asm/atomic.h>
#include <asm/system.h>
+
+#ifdef CONFIG_PREEMPT_RT
+
+#include <linux/rt_lock.h>
+
+#else
+
#include <linux/wait.h>
#include <linux/rwsem.h>

@@ -108,4 +115,8 @@ extern inline void up(struct semaphore *

#endif /* __KERNEL__ */

+extern int FASTCALL(sem_is_locked(struct semaphore *sem));
+
+#endif /* CONFIG_PREEMPT_RT */
+
#endif /* !(_PPC_SEMAPHORE_H) */
=================================================================
--- ./include/asm-ppc/spinlock.h.ORG 2005-02-01 16:26:45.000000000 -0500
+++ ./include/asm-ppc/spinlock.h 2005-02-22 09:50:38.000000000 -0500
@@ -7,17 +7,6 @@
* Simple spin lock operations.
*/

-typedef struct {
- volatile unsigned long lock;
-#ifdef CONFIG_DEBUG_SPINLOCK
- volatile unsigned long owner_pc;
- volatile unsigned long owner_cpu;
-#endif
-#ifdef CONFIG_PREEMPT
- unsigned int break_lock;
-#endif
-} spinlock_t;
-
#ifdef __KERNEL__
#ifdef CONFIG_DEBUG_SPINLOCK
#define SPINLOCK_DEBUG_INIT , 0, 0
@@ -25,16 +14,19 @@ typedef struct {
#define SPINLOCK_DEBUG_INIT /* */
#endif

-#define SPIN_LOCK_UNLOCKED (spinlock_t) { 0 SPINLOCK_DEBUG_INIT }
+#define __RAW_SPIN_LOCK_UNLOCKED { 0 SPINLOCK_DEBUG_INIT }
+#define RAW_SPIN_LOCK_UNLOCKED (raw_spinlock_t) __RAW_SPIN_LOCK_UNLOCKED

-#define spin_lock_init(x) do { *(x) = SPIN_LOCK_UNLOCKED; } while(0)
-#define spin_is_locked(x) ((x)->lock != 0)
-#define spin_unlock_wait(x) do { barrier(); } while(spin_is_locked(x))
-#define _raw_spin_lock_flags(lock, flags) _raw_spin_lock(lock)
+#define __raw_spin_lock_init(x) \
+ do { *(x) = RAW_SPIN_LOCK_UNLOCKED; } while(0)
+#define __raw_spin_is_locked(x) ((x)->lock != 0)
+#define __raw_spin_unlock_wait(x) \
+ do { barrier(); } while(__raw_spin_is_locked(x))
+#define __raw_spin_lock_flags(lock, flags) __raw_spin_lock(lock)

#ifndef CONFIG_DEBUG_SPINLOCK

-static inline void _raw_spin_lock(spinlock_t *lock)
+static inline void __raw_spin_lock(raw_spinlock_t *lock)
{
unsigned long tmp;

@@ -55,54 +47,37 @@ static inline void _raw_spin_lock(spinlo
: "cr0", "memory");
}

-static inline void _raw_spin_unlock(spinlock_t *lock)
+static inline void __raw_spin_unlock(raw_spinlock_t *lock)
{
__asm__ __volatile__("eieio # spin_unlock": : :"memory");
lock->lock = 0;
}

-#define _raw_spin_trylock(l) (!test_and_set_bit(0,&(l)->lock))
+#define __raw_spin_trylock(l) (!test_and_set_bit(0,&(l)->lock))

#else

-extern void _raw_spin_lock(spinlock_t *lock);
-extern void _raw_spin_unlock(spinlock_t *lock);
-extern int _raw_spin_trylock(spinlock_t *lock);
-
-#endif
+extern void __raw_spin_lock(raw_spinlock_t *lock);
+extern void __raw_spin_unlock(raw_spinlock_t *lock);
+extern int __raw_spin_trylock(raw_spinlock_t *lock);

-/*
- * Read-write spinlocks, allowing multiple readers
- * but only one writer.
- *
- * NOTE! it is quite common to have readers in interrupts
- * but no interrupt writers. For those circumstances we
- * can "mix" irq-safe locks - any writer needs to get a
- * irq-safe write-lock, but readers can get non-irqsafe
- * read-locks.
- */
-typedef struct {
- volatile unsigned long lock;
-#ifdef CONFIG_DEBUG_SPINLOCK
- volatile unsigned long owner_pc;
-#endif
-#ifdef CONFIG_PREEMPT
- unsigned int break_lock;
-#endif
-} rwlock_t;
+#endif /* CONFIG_DEBUG_SPINLOCK */

#ifdef CONFIG_DEBUG_SPINLOCK
-#define RWLOCK_DEBUG_INIT , 0
+#define RAW_RWLOCK_DEBUG_INIT , 0
#else
-#define RWLOCK_DEBUG_INIT /* */
+#define RAW_RWLOCK_DEBUG_INIT /* */
#endif

-#define RW_LOCK_UNLOCKED (rwlock_t) { 0 RWLOCK_DEBUG_INIT }
-#define rwlock_init(lp) do { *(lp) = RW_LOCK_UNLOCKED; } while(0)
+#define __RAW_RW_LOCK_UNLOCKED { 0 RAW_RWLOCK_DEBUG_INIT }
+#define RAW_RW_LOCK_UNLOCKED (raw_rwlock_t) __RAW_RW_LOCK_UNLOCKED
+#define __raw_rwlock_init(lp) do { *(lp) = RAW_RW_LOCK_UNLOCKED; } while(0)
+#define __raw_read_can_lock(lp) (0 <= (lp)->lock)
+#define __raw_write_can_lock(lp) (!(lp)->lock)

#ifndef CONFIG_DEBUG_SPINLOCK

-static __inline__ void _raw_read_lock(rwlock_t *rw)
+static __inline__ void __raw_read_lock(raw_rwlock_t *rw)
{
unsigned int tmp;

@@ -123,7 +98,7 @@ static __inline__ void _raw_read_lock(rw
: "cr0", "memory");
}

-static __inline__ void _raw_read_unlock(rwlock_t *rw)
+static __inline__ void __raw_read_unlock(raw_rwlock_t *rw)
{
unsigned int tmp;

@@ -139,7 +114,7 @@ static __inline__ void _raw_read_unlock(
: "cr0", "memory");
}

-static __inline__ int _raw_write_trylock(rwlock_t *rw)
+static __inline__ int __raw_write_trylock(raw_rwlock_t *rw)
{
unsigned int tmp;

@@ -159,7 +134,7 @@ static __inline__ int _raw_write_trylock
return tmp == 0;
}

-static __inline__ void _raw_write_lock(rwlock_t *rw)
+static __inline__ void __raw_write_lock(raw_rwlock_t *rw)
{
unsigned int tmp;

@@ -180,7 +155,7 @@ static __inline__ void _raw_write_lock(r
: "cr0", "memory");
}

-static __inline__ void _raw_write_unlock(rwlock_t *rw)
+static __inline__ void __raw_write_unlock(raw_rwlock_t *rw)
{
__asm__ __volatile__("eieio # write_unlock": : :"memory");
rw->lock = 0;
@@ -188,15 +163,15 @@ static __inline__ void _raw_write_unlock

#else

-extern void _raw_read_lock(rwlock_t *rw);
-extern void _raw_read_unlock(rwlock_t *rw);
-extern void _raw_write_lock(rwlock_t *rw);
-extern void _raw_write_unlock(rwlock_t *rw);
-extern int _raw_write_trylock(rwlock_t *rw);
+extern void __raw_read_lock(raw_rwlock_t *rw);
+extern void __raw_read_unlock(raw_rwlock_t *rw);
+extern void __raw_write_lock(raw_rwlock_t *rw);
+extern void __raw_write_unlock(raw_rwlock_t *rw);
+extern int __raw_write_trylock(raw_rwlock_t *rw);

#endif

-#define _raw_read_trylock(lock) generic_raw_read_trylock(lock)
+#define __raw_read_trylock(lock) generic_raw_read_trylock(lock)

#endif /* __ASM_SPINLOCK_H */
#endif /* __KERNEL__ */
=================================================================
--- ./include/asm-ppc/hw_irq.h.ORG 2004-12-24 16:35:15.000000000 -0500
+++ ./include/asm-ppc/hw_irq.h 2005-02-23 15:55:45.653015432 -0500
@@ -13,8 +13,17 @@ extern void timer_interrupt(struct pt_re
#define INLINE_IRQS

#define irqs_disabled() ((mfmsr() & MSR_EE) == 0)
+#define irqs_disabled_flags(f) (!((f) & MSR_EE))

-#ifdef INLINE_IRQS
+#ifdef CONFIG_CRITICAL_IRQSOFF_TIMING
+ extern void notrace trace_irqs_off(void);
+ extern void notrace trace_irqs_on(void);
+#else
+# define trace_irqs_off() do { } while (0)
+# define trace_irqs_on() do { } while (0)
+#endif
+
+#if defined(INLINE_IRQS) || defined(CONFIG_CRITICAL_IRQSOFF_TIMING)

static inline void local_irq_disable(void)
{
@@ -22,11 +31,14 @@ static inline void local_irq_disable(voi
msr = mfmsr();
mtmsr(msr & ~MSR_EE);
__asm__ __volatile__("": : :"memory");
+ trace_irqs_off();
}

static inline void local_irq_enable(void)
{
unsigned long msr;
+
+ trace_irqs_on();
__asm__ __volatile__("": : :"memory");
msr = mfmsr();
mtmsr(msr | MSR_EE);
@@ -39,11 +51,19 @@ static inline void local_irq_save_ptr(un
*flags = msr;
mtmsr(msr & ~MSR_EE);
__asm__ __volatile__("": : :"memory");
+ trace_irqs_off();
}

#define local_save_flags(flags) ((flags) = mfmsr())
#define local_irq_save(flags) local_irq_save_ptr(&flags)
-#define local_irq_restore(flags) mtmsr(flags)
+#define local_irq_restore(flags) \
+ do { \
+ if (irqs_disabled_flags(flags)) \
+ trace_irqs_off(); \
+ else \
+ trace_irqs_on(); \
+ mtmsr(flags); \
+ } while (0)

#else

=================================================================
--- ./include/asm-ppc/tlb.h.ORG 2004-12-24 16:34:58.000000000 -0500
+++ ./include/asm-ppc/tlb.h 2005-02-09 19:12:21.000000000 -0500
@@ -50,7 +50,11 @@ static inline void __tlb_remove_tlb_entr
#define tlb_flush(tlb) flush_tlb_mm((tlb)->mm)

/* Get the generic bits... */
+#ifdef CONFIG_PREEMPT_RT
+#include <asm-generic/tlb-simple.h>
+#else
#include <asm-generic/tlb.h>
+#endif

#endif /* CONFIG_PPC_STD_MMU */

=================================================================
--- ./include/asm-ppc/ocp.h.ORG 2004-12-24 16:34:26.000000000 -0500
+++ ./include/asm-ppc/ocp.h 2005-02-23 16:50:53.514144104 -0500
@@ -32,7 +32,6 @@

#include <asm/mmu.h>
#include <asm/ocp_ids.h>
-#include <asm/rwsem.h>
#include <asm/semaphore.h>

#ifdef CONFIG_PPC_OCP
=================================================================
--- ./include/linux/sched.h.ORG 2005-02-01 16:26:51.000000000 -0500
+++ ./include/linux/sched.h 2005-02-20 18:24:02.000000000 -0500
@@ -74,9 +74,18 @@ extern int debug_direct_keyboard;
#endif

#ifdef CONFIG_FRAME_POINTER
+#ifdef CONFIG_PPC
+# define __valid_ra(l) ((__builtin_frame_address(l) && \
+ *(unsigned long *)__builtin_frame_address(l)) ? \
+ (unsigned long)__builtin_return_address(l) : 0UL)
+# define CALLER_ADDR0 ((unsigned long)__builtin_return_address(0))
+# define CALLER_ADDR1 __valid_ra(1)
+# define CALLER_ADDR2 (__valid_ra(1) ? __valid_ra(2) : 0UL)
+#else
# define CALLER_ADDR0 ((unsigned long)__builtin_return_address(0))
# define CALLER_ADDR1 ((unsigned long)__builtin_return_address(1))
# define CALLER_ADDR2 ((unsigned long)__builtin_return_address(2))
+#endif
#else
# define CALLER_ADDR0 ((unsigned long)__builtin_return_address(0))
# define CALLER_ADDR1 0UL
@@ -84,9 +93,14 @@ extern int debug_direct_keyboard;
#endif

#ifdef CONFIG_MCOUNT
- extern void notrace mcount(void);
+#ifdef CONFIG_PPC
+# define ARCH_MCOUNT _mcount
+#else
+# define ARCH_MCOUNT mcount
+#endif
+ extern void notrace ARCH_MCOUNT(void);
#else
-# define mcount() do { } while (0)
+# define ARCH_MCOUNT() do { } while (0)
#endif

#ifdef CONFIG_LATENCY_TRACE
=================================================================
--- ./drivers/char/blocker.c.ORG 2005-02-01 16:26:48.000000000 -0500
+++ ./drivers/char/blocker.c 2005-02-02 17:16:24.000000000 -0500
@@ -4,6 +4,7 @@

#include <linux/fs.h>
#include <linux/miscdevice.h>
+#include <asm/time.h>

#define BLOCKER_MINOR 221

@@ -17,8 +18,18 @@ u64 notrace get_cpu_tick(void)
u64 tsc;
#ifdef ARCHARM
tsc = *oscr;
-#else
+#elif defined(CONFIG_X86)
__asm__ __volatile__("rdtsc" : "=A" (tsc));
+#elif defined(CONFIG_PPC)
+ unsigned long hi, lo;
+
+ do {
+ hi = get_tbu();
+ lo = get_tbl();
+ } while (get_tbu() != hi);
+ tsc = (u64)hi << 32 | lo;
+#else
+ #error Implement get_cpu_tick()
#endif
return tsc;
}
=================================================================
--- ./kernel/sched.c.ORG 2005-02-01 16:26:46.000000000 -0500
+++ ./kernel/sched.c 2005-02-23 14:37:51.325621208 -0500
@@ -1237,7 +1237,7 @@ int fastcall wake_up_process(task_t * p)
int ret = try_to_wake_up(p, TASK_STOPPED | TASK_TRACED |
TASK_RUNNING_MUTEX | TASK_INTERRUPTIBLE |
TASK_UNINTERRUPTIBLE, 0, 0);
- mcount();
+ ARCH_MCOUNT();
return ret;
}

@@ -1248,7 +1248,7 @@ int fastcall wake_up_process_mutex(task_
int ret = try_to_wake_up(p, TASK_STOPPED | TASK_TRACED |
TASK_RUNNING_MUTEX | TASK_INTERRUPTIBLE |
TASK_UNINTERRUPTIBLE, 0, 1);
- mcount();
+ ARCH_MCOUNT();
return ret;
}

@@ -1257,7 +1257,7 @@ EXPORT_SYMBOL(wake_up_process_mutex);
int fastcall wake_up_state(task_t *p, unsigned int state)
{
int ret = try_to_wake_up(p, state | TASK_RUNNING_MUTEX, 0, 0);
- mcount();
+ ARCH_MCOUNT();
return ret;
}

@@ -1502,7 +1502,11 @@ static void finish_task_switch(task_t *p
* schedule_tail - first thing a freshly forked thread must call.
* @prev: the thread we just switched away from.
*/
+#ifdef CONFIG_PPC
+asmlinkage notrace void schedule_tail(task_t *prev)
+#else
asmlinkage void schedule_tail(task_t *prev)
+#endif
__releases(rq->lock)
{
preempt_disable(); // TODO: move this to fork setup
=================================================================
--- ./kernel/latency.c.ORG 2005-02-01 16:26:46.000000000 -0500
+++ ./kernel/latency.c 2005-02-23 08:32:38.000000000 -0500
@@ -50,6 +50,12 @@ static __cacheline_aligned_in_smp struct
int wakeup_timing = 1;
#endif

+#ifdef NONASCII
+#define MU 'µ'
+#else
+#define MU 'u'
+#endif
+
#ifdef CONFIG_LATENCY_TIMING

/*
@@ -357,9 +363,9 @@ void notrace __trace(unsigned long eip,
___trace(TRACE_FN, eip, parent_eip, 0, 0, 0);
}

-extern void mcount(void);
+extern void ARCH_MCOUNT(void);

-EXPORT_SYMBOL(mcount);
+EXPORT_SYMBOL(ARCH_MCOUNT);

void notrace __mcount(void)
{
@@ -631,8 +637,8 @@ static void * notrace l_start(struct seq
if (!n) {
seq_printf(m, "preemption latency trace v1.1.4 on %s\n", UTS_RELEASE);
seq_puts(m, "--------------------------------------------------------------------\n");
- seq_printf(m, " latency: %lu ?s, #%lu/%lu, CPU#%d | (M:%s VP:%d, KP:%d, SP:%d HP:%d #P:%d)\n",
- cycles_to_usecs(tr->saved_latency),
+ seq_printf(m, " latency: %lu %cs, #%lu/%lu, CPU#%d | (M:%s VP:%d, KP:%d, SP:%d HP:%d #P:%d)\n",
+ cycles_to_usecs(tr->saved_latency), MU,
entries, tr->trace_idx, out_tr.cpu,
#if defined(CONFIG_PREEMPT_NONE)
"server",
@@ -698,7 +704,7 @@ static void notrace l_stop(struct seq_fi
static void print_timestamp(struct seq_file *m, unsigned long abs_usecs,
unsigned long rel_usecs)
{
- seq_printf(m, " %4ld?s", abs_usecs);
+ seq_printf(m, " %4ld%cs", abs_usecs, MU);
if (rel_usecs > 100)
seq_puts(m, "!: ");
else if (rel_usecs > 1)
@@ -711,7 +717,7 @@ static void
print_timestamp_short(struct seq_file *m, unsigned long abs_usecs,
unsigned long rel_usecs)
{
- seq_printf(m, " %4ld?s", abs_usecs);
+ seq_printf(m, " %4ld%cs", abs_usecs, MU);
if (rel_usecs > 100)
seq_putc(m, '!');
else if (rel_usecs > 1)
@@ -1043,7 +1049,7 @@ static int setup_preempt_thresh(char *s)
get_option(&s, &thresh);
if (thresh > 0) {
preempt_thresh = usecs_to_cycles(thresh);
- printk("Preemption threshold = %u ?s\n", thresh);
+ printk("Preemption threshold = %u %cs\n", thresh, MU);
}
return 1;
}
@@ -1091,18 +1097,18 @@ check_critical_timing(struct cpu_trace *
update_max_tr(tr);

if (preempt_thresh)
- printk("(%16s-%-5d|#%d): %lu ?s critical section "
- "violates %lu ?s threshold.\n"
+ printk("(%16s-%-5d|#%d): %lu %cs critical section "
+ "violates %lu %cs threshold.\n"
" => started at timestamp %lu: ",
current->comm, current->pid,
- _smp_processor_id(),
- latency, cycles_to_usecs(preempt_thresh), t0);
+ _smp_processor_id(), latency,
+ MU, cycles_to_usecs(preempt_thresh), MU, t0);
else
- printk("(%16s-%-5d|#%d): new %lu ?s maximum-latency "
+ printk("(%16s-%-5d|#%d): new %lu %cs maximum-latency "
"critical section.\n => started at timestamp %lu: ",
current->comm, current->pid,
_smp_processor_id(),
- latency, t0);
+ latency, MU, t0);

print_symbol("<%s>\n", tr->critical_start);
printk(" => ended at timestamp %lu: ", t1);
@@ -1345,15 +1351,15 @@ check_wakeup_timing(struct cpu_trace *tr
update_max_tr(tr);

if (preempt_thresh)
- printk("(%16s-%-5d|#%d): %lu ?s wakeup latency "
- "violates %lu ?s threshold.\n",
+ printk("(%16s-%-5d|#%d): %lu %cs wakeup latency "
+ "violates %lu %cs threshold.\n",
current->comm, current->pid,
_smp_processor_id(), latency,
- cycles_to_usecs(preempt_thresh));
+ MU, cycles_to_usecs(preempt_thresh), MU);
else
- printk("(%16s-%-5d|#%d): new %lu ?s maximum-latency "
+ printk("(%16s-%-5d|#%d): new %lu %cs maximum-latency "
"wakeup.\n", current->comm, current->pid,
- _smp_processor_id(), latency);
+ _smp_processor_id(), latency, MU);

max_sequence++;

@@ -1399,7 +1405,7 @@ void __trace_start_sched_wakeup(struct t
tr->preempt_timestamp = cycles();
tr->critical_start = CALLER_ADDR0;
trace_cmdline();
- mcount();
+ ARCH_MCOUNT();
out_unlock:
spin_unlock(&sch.trace_lock);
}
@@ -1489,7 +1495,7 @@ long user_trace_start(void)
tr->critical_sequence = max_sequence;
tr->preempt_timestamp = cycles();
trace_cmdline();
- mcount();
+ ARCH_MCOUNT();
preempt_enable();

up(&max_mutex);
@@ -1507,7 +1513,7 @@ long user_trace_stop(void)
return -EINVAL;

preempt_disable();
- mcount();
+ ARCH_MCOUNT();

if (wakeup_timing) {
spin_lock_irqsave(&sch.trace_lock, flags);
@@ -1538,15 +1544,15 @@ long user_trace_stop(void)
latency = cycles_to_usecs(delta);

if (preempt_thresh)
- printk("(%16s-%-5d|#%d): %lu ?s user-latency "
- "violates %lu ?s threshold.\n",
+ printk("(%16s-%-5d|#%d): %lu %cs user-latency "
+ "violates %lu %cs threshold.\n",
current->comm, current->pid,
- _smp_processor_id(), latency,
- cycles_to_usecs(preempt_thresh));
+ _smp_processor_id(), latency, MU,
+ cycles_to_usecs(preempt_thresh), MU);
else
- printk("(%16s-%-5d|#%d): new %lu ?s user-latency.\n",
+ printk("(%16s-%-5d|#%d): new %lu %cs user-latency.\n",
current->comm, current->pid,
- _smp_processor_id(), latency);
+ _smp_processor_id(), latency, MU);

max_sequence++;
up(&max_mutex);


Attachments:
realtime-preempt-2.6.11-rc2-V0.7.37-02-ppc (38.23 kB)

2005-02-24 01:04:13

by Lee Revell

Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

On Thu, 2005-02-24 at 10:27 +1100, Nick Piggin wrote:
> Hugh Dickins wrote:
> > On Wed, 23 Feb 2005, Lee Revell wrote:
> >
> >>>>Thanks, your patch fixes the copy_pte_range latency.
> >>
> >>clear_page_range is also problematic.
> >
> >
> > Yes, I saw that from your other traces too. I know there are plans
> > to improve clear_page_range during 2.6.12, but I didn't realize that
> > it had become very much worse than its antecedent clear_page_tables,
> > and I don't see missing latency fixes for that. Nick's the expert.
> >
>
> I wouldn't have thought it should have become worse, latency
> wise. What is actually happening is that the lower level freeing
> functions are being called more often. But this should result in
> the work being spread out more, if anything, whereas in the old
> system things tended to be batched up into bigger chunks
> (typically at exit() time).
>
> If you are using i386 with 2-level page tables (no highmem), then
> the behaviour should be more or less identical. Odd.

IIRC last time I really tested this a few months ago, the worst case
latency on that machine was about 150us. Currently it's 422us from the
same clear_page_range code path.

On my Athlon XP the clear_page_range latency is not showing up at all,
and the worst delay so far is only 35us, most of which is the timer
interrupt; IOW that machine is showing the best achievable latency (with
PREEMPT_DESKTOP). The machine seeing 422us latencies in
clear_page_range is a 600MHz C3, which is known to be an FSB-limited
architecture.

Lee

2005-02-24 01:29:36

by Nick Piggin

Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

Lee Revell wrote:
> On Thu, 2005-02-24 at 10:27 +1100, Nick Piggin wrote:

>>If you are using i386 with 2-level page tables (no highmem), then
>>the behaviour should be more or less identical. Odd.
>
>
> IIRC last time I really tested this a few months ago, the worst case
> latency on that machine was about 150us. Currently it's 422us from the
> same clear_page_range code path.
>
> On my Athlon XP the clear_page_range latency is not showing up at all,
> and the worst delay so far is only 35us, most of which is the timer
> interrupt; IOW that machine is showing the best achievable latency (with
> PREEMPT_DESKTOP). The machine seeing 422us latencies in
> clear_page_range is a 600MHz C3, which is known to be an FSB-limited
> architecture.
>

Well it should be pretty trivial to add a break in there.
I don't think it can get into 2.6.11 at this point though,
so we'll revisit this for 2.6.12 if the clear_page_range
optimisations don't get anywhere.
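
A hypothetical outline of such a break, modeled on the copy_pte_range fix
earlier in this thread; clear_one_level() is a made-up stand-in for the
real per-level freeing work, and the real locking context may differ:

	/* hypothetical sketch: restartable clearing with a lockbreak */
again:
	spin_lock(&mm->page_table_lock);
	while (addr < end) {
		addr = clear_one_level(mm, addr, end);	/* made-up helper */
		if (need_resched() ||
		    need_lockbreak(&mm->page_table_lock))
			break;
	}
	spin_unlock(&mm->page_table_lock);
	cond_resched();
	if (addr < end)
		goto again;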

Nick

2005-02-24 02:25:44

by Lee Revell

Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

On Thu, 2005-02-24 at 12:29 +1100, Nick Piggin wrote:
> Lee Revell wrote:
> >
> > IIRC last time I really tested this a few months ago, the worst case
> > latency on that machine was about 150us. Currently it's 422us from the
> > same clear_page_range code path.
> >
> Well it should be pretty trivial to add a break in there.
> I don't think it can get into 2.6.11 at this point though,
> so we'll revisit this for 2.6.12 if the clear_page_range
> optimisations don't get anywhere.
>

Agreed, it would be much better to optimize this away than just add a
scheduling point. It seems like we could do this lazily.

IMHO it's not critical that these latency fixes be merged until the VP
feature gets merged; until then people will be using Ingo's patches
anyway.

Lee

2005-02-24 02:43:18

by Nick Piggin

Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

Lee Revell wrote:
> On Thu, 2005-02-24 at 12:29 +1100, Nick Piggin wrote:
>
>>Lee Revell wrote:
>>
>>>IIRC last time I really tested this a few months ago, the worst case
>>>latency on that machine was about 150us. Currently it's 422us from the
>>>same clear_page_range code path.
>>>
>>
>>Well it should be pretty trivial to add a break in there.
>>I don't think it can get into 2.6.11 at this point though,
>>so we'll revisit this for 2.6.12 if the clear_page_range
>>optimisations don't get anywhere.
>>
>
>
> Agreed, it would be much better to optimize this away than just add a
> scheduling point. It seems like we could do this lazily.
>

Oh? What do you mean by lazy? IMO it is sort of implemented lazily now.
That is, we are too lazy to refcount page table pages in fastpaths, so
that pushes a lot of work to unmap time. Not necessarily a bad trade-off,
mind you. Just something I'm looking into.

2005-02-24 03:04:08

by Lee Revell

Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

On Thu, 2005-02-24 at 13:41 +1100, Nick Piggin wrote:
> Lee Revell wrote:
> >
> > Agreed, it would be much better to optimize this away than just add a
> > scheduling point. It seems like we could do this lazily.
> >
>
> Oh? What do you mean by lazy? IMO it is sort of implemented lazily now.
> That is, we are too lazy to refcount page table pages in fastpaths, so
> that pushes a lot of work to unmap time. Not necessarily a bad trade-off,
> mind you. Just something I'm looking into.
>

I guess I was thinking we could be even more lazy, and somehow defer it
until after unmap time (absent memory pressure, that is). Actually
that's kind of what a lock break would do.

Lee

2005-02-24 04:34:19

by Frank Rowand

Subject: Re: PPC RT Patch..

john cooper wrote:
> Ingo,
> We've had a PPC port of your RT work underway with
> a focus on trace instrumentation. This is based upon
> realtime-preempt-2.6.11-rc2-V0.7.37-02. A diff is
> attached.
>
> To the extent possible the tracing facilities are the
> same as your x86 work. In the process a few PPC/gcc
> issues needed to be resolved. There is also a bug fix
> contained for tlb_gather_mmu() which was causing debug
> assertions to be generated in a path which attempted to
> sleep with a non-zero preempt count.

Manish Lachwani mentioned to me that he faced the same issue
with the MIPS RT support, and that when he discussed
it with Ingo, the solution was for include/asm-ppc/tlb.h
to include include/asm-generic/tlb-simple.h when PREEMPT_RT is turned on.
The patch does this for the #ifdef CONFIG_PPC_STD_MMU case,
but not for the #else case. I don't know which case is used
for the Ampro board.
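
Concretely, the hunk in John's patch only switches the include inside the
CONFIG_PPC_STD_MMU branch. Assuming the #else branch likewise ends up
including asm-generic/tlb.h, the same selection there would be an untested
sketch along these lines:

	#ifdef CONFIG_PREEMPT_RT
	#include <asm-generic/tlb-simple.h>
	#else
	#include <asm-generic/tlb.h>
	#endif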


>
> This does build and function when SMP is configured,
> though we have not yet verified it on other than a
> uniprocessor. As a simplifying assumption, testing has
> thus far concentrated on the following modes:
>
> PREEMPT_NONE
> - verify baseline regression
>
> PREEMPT_RT && !PREEMPT_SMP
> - typical for an embedded RT PPC application
>
> PREEMPT_RT && PREEMPT_SMP
> - kicks in live locking code which otherwise receives no
> coverage. This is functionally equivalent to the above
> config on a single CPU target thus no MP dynamic testing
> is achieved. Still quite useful IMHO.
>
> The target used for development/testing is an Ampro EnCore PP1
> which sports a 300MHz MPC8245. For testing this boots with NFS
> as root. An mp3 decode at nice --20 is launched which requires
> just under 20% of the CPU to maintain an uninterrupted audio
> decode and output. To this a series of "du -s /" are launched
> to soak up excess CPU bandwidth. Perhaps not rigorous but a
> fair sanity check and load for the purpose at hand.
>
> Under these conditions maximum scheduling latencies are seen in
> the 120-150us range. Note no attempt has yet been made to
> optimize arch specific paths and full trace instrumentation has
> been enabled.
>
> I've written some logging code to help find problems such as
> the tlb issue above. As it has not been made general I've
> removed it from this patch. At some point I'll likely revisit
> this.
>
> Comments/suggestions welcome.

I am glad to see the instrumentation and measurement-related code
in your patch. (My patch of last week ("Frank's patch") is lacking
that code.)

Other differences between the two patches are:

arch/ppc/syslib/i8259.c
Frank neglected to convert i8259_lock to a raw spinlock (a sketch of
this conversion follows the list of differences below).

arch/ppc/kernel/signal.c
John added an enable of irqs in do_signal() #ifdef CONFIG_PREEMPT_RT

arch/ppc/kernel/traps.c
John added an enable of irqs and preempt_check_resched() in _exception().

various files
Frank added the intrusive variable tb_to_us for use by cycles_to_usec()
and added an ugly #ifdef in cycles_to_usec().
John hard-coded cpu_khz for one specific board so that no change would
be needed in cycles_to_usec().

various files
John has the mmu_gather fix that is described above.
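
For the i8259 item, the conversion is of this general shape (a sketch
of the change, not the actual hunk from either patch):

        --- a/arch/ppc/syslib/i8259.c
        +++ b/arch/ppc/syslib/i8259.c
        -spinlock_t i8259_lock = SPIN_LOCK_UNLOCKED;
        +raw_spinlock_t i8259_lock = RAW_SPIN_LOCK_UNLOCKED;

Under PREEMPT_RT a plain spinlock_t becomes a sleeping lock, so a lock
taken from hard interrupt context, like the PIC lock, has to stay raw.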

John's patch and Frank's patch are otherwise mostly the same, except for
the differences that result from being based on different kernel
versions. I am glad to see that because it means that two sets of
eyes have agreed.

Frank's patch may have missed some EXPORT_SYMBOL()s in arch/ppc/lib/locks.c.
I'll check those over again tomorrow.


> -john


-Frank
--
Frank Rowand <[email protected]>
MontaVista Software, Inc

2005-02-24 05:00:02

by Hugh Dickins

[permalink] [raw]
Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

On Wed, 23 Feb 2005, Lee Revell wrote:
> On Wed, 2005-02-23 at 20:53 +0000, Hugh Dickins wrote:
> > On Wed, 23 Feb 2005, Hugh Dickins wrote:
> > > Please replace by new patch below, which I'm now running through lmbench.
> >
> > That second patch seems fine, and I see no lmbench regression from it.
>
> Should go into 2.6.11, right?

That's up to Andrew (and Linus).

I was thinking that way when I rushed you the patch. But given that
you have remaining unresolved latency issues nearby (zap_pte_range,
clear_page_range), and given the warning shot that I screwed up my
first attempt, I'd be inclined to say hold off.

It's a pity: for a while we were thinking 2.6.11 would be a big step
forward for mainline latency; but it now looks to me like these tests
have come too late in the cycle to be dealt with safely.

In other mail, you do expect people still to be using Ingo's patches,
so probably this patch should stick there (and in -mm) for now.

Hugh

2005-02-24 06:38:41

by Lee Revell

[permalink] [raw]
Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

On Thu, 2005-02-24 at 04:56 +0000, Hugh Dickins wrote:
> On Wed, 23 Feb 2005, Lee Revell wrote:
> > On Wed, 2005-02-23 at 20:53 +0000, Hugh Dickins wrote:
> > > On Wed, 23 Feb 2005, Hugh Dickins wrote:
> > > > Please replace by new patch below, which I'm now running through lmbench.
> > >
> > > That second patch seems fine, and I see no lmbench regression from it.
> >
> > Should go into 2.6.11, right?
>
> That's up to Andrew (and Linus).
>
> I was thinking that way when I rushed you the patch. But given that
> you have remaining unresolved latency issues nearby (zap_pte_range,
> clear_page_range), and given the warning shot that I screwed up my
> first attempt, I'd be inclined to say hold off.
>
> It's a pity: for a while we were thinking 2.6.11 would be a big step
> forward for mainline latency; but it now looks to me like these tests
> have come too late in the cycle to be dealt with safely.
>
> In other mail, you do expect people still to be using Ingo's patches,
> so probably this patch should stick there (and in -mm) for now.

Well, all of these were fixed in the past, so it may not be
unreasonable to fix them again for 2.6.11.

Lee

2005-02-24 08:27:04

by Hugh Dickins

[permalink] [raw]
Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

On Thu, 24 Feb 2005, Lee Revell wrote:
> On Thu, 2005-02-24 at 04:56 +0000, Hugh Dickins wrote:
> >
> > In other mail, you do expect people still to be using Ingo's patches,
> > so probably this patch should stick there (and in -mm) for now.
>
> Well all of these were fixed in the past so it may not be unreasonable
> to fix them for 2.6.11.

If we'd got to it earlier, yes. But 2.6.11 looks to be just a day or
two away, and we've no idea why zap_pte_range or clear_page_range
would have reverted. Nor have we heard from Ingo yet.

Hugh

2005-02-24 14:01:08

by john cooper

[permalink] [raw]
Subject: Re: PPC RT Patch..

Frank Rowand wrote:
> john cooper wrote:
>> ... There is also a bug fix
>> contained for tlb_gather_mmu() which was causing debug
>> assertions to be generated in a path which attempted to
>> sleep with a non-zero preempt count.
>
>
> Manish Lachwani mentioned to me that he hit the same issue
> with the MIPS RT support, and that when he discussed it with
> Ingo the solution was for include/asm-ppc/tlb.h to include
> include/asm-generic/tlb-simple.h when PREEMPT_RT is turned on.
> The patch does this for the #ifdef CONFIG_PPC_STD_MMU case,
> but not for the #else case. I don't know which case is used
> for the Ampro board.

It appeared to me to be a generic issue, though I believe a number
of solutions are possible. asm-generic/tlb.h:tlb_gather_mmu()
expands to linux/percpu.h:get_cpu_var(), which does a
preempt_disable() followed by __get_cpu_var(). This caused the
debug assertion to kick in when __page_cache_release(), and to a
lesser extent activate_page(), attempted to block on a mutex
(though other paths may well exist). My approach was to replace
the outer layer preempt_disable/enable calls with a mutex-style
spinlock.
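
To spell out the chain (roughly the generic code, from memory rather
than verbatim):

        /* asm-generic/tlb.h */
        static inline struct mmu_gather *
        tlb_gather_mmu(struct mm_struct *mm, unsigned int full_mm_flush)
        {
                /* get_cpu_var() = preempt_disable() + __get_cpu_var() */
                struct mmu_gather *tlb = &get_cpu_var(mmu_gathers);
                /* ... initialize tlb ... */
                return tlb;
        }

Anything that can block between tlb_gather_mmu() and tlb_finish_mmu()
then does so with preemption disabled, which is exactly what the
assertion catches under PREEMPT_RT.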

The fix was fairly easy once it was known where the gratuitous
preempt_disable() call was coming from. I cobbled together a logging
mechanism which detected the problem; as it wasn't very general I
removed it from the patch. I didn't see an alternative means of
diagnosing such a scenario, so I'll likely get around to generalizing
the code.
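
The core of such a check is small; something along these lines (a
sketch of the idea, not the code I removed):

        /* Complain if we are about to block with preemption disabled. */
        static void check_atomic_sleep(void)
        {
                if (preempt_count())
                        printk(KERN_ERR
                               "blocking with preempt_count=%08x at %p\n",
                               preempt_count(),
                               __builtin_return_address(0));
        }

Hooked into the mutex slow path, this points straight at the offending
preempt_disable() caller.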

-john


--
[email protected]

2005-02-25 03:31:02

by Lee Revell

[permalink] [raw]
Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

On Thu, 2005-02-24 at 08:26 +0000, Hugh Dickins wrote:
> On Thu, 24 Feb 2005, Lee Revell wrote:
> > On Thu, 2005-02-24 at 04:56 +0000, Hugh Dickins wrote:
> > >
> > > In other mail, you do expect people still to be using Ingo's patches,
> > > so probably this patch should stick there (and in -mm) for now.
> >
> > Well all of these were fixed in the past so it may not be unreasonable
> > to fix them for 2.6.11.
>
> If we'd got to it earlier, yes. But 2.6.11 looks to be just a day or
> two away, and we've no idea why zap_pte_range or clear_page_range
> would have reverted. Nor have we heard from Ingo yet.
>

It's also not clear that the patch completely fixes the copy_pte_range
latency. This trace is from the Athlon XP.

Lee

preemption latency trace v1.1.4 on 2.6.11-rc4-RT-V0.7.39-02
--------------------------------------------------------------------
latency: 284 µs, #25/25, CPU#0 | (M:preempt VP:0, KP:1, SP:1 HP:1 #P:1)
-----------------
| task: ksoftirqd/0-2 (uid:0 nice:-10 policy:0 rt_prio:0)
-----------------

_------=> CPU#
/ _-----=> irqs-off
| / _----=> need-resched
|| / _---=> hardirq/softirq
||| / _--=> preempt-depth
|||| /
||||| delay
cmd pid ||||| time | caller
\ / ||||| \ | /
(T1/#0) dpkg 9299 0 3 00000005 00000000 [0001017457529380] 0.000ms (+3633922.612ms): <676b7064> (<00746500>)
(T1/#2) dpkg 9299 0 3 00000005 00000002 [0001017457529620] 0.000ms (+0.000ms): __trace_start_sched_wakeup+0x9a/0xd0 <c012eaca> (try_to_wake_up+0x90/0x160 <c0110350>)
(T1/#3) dpkg 9299 0 3 00000004 00000003 [0001017457529825] 0.000ms (+0.000ms): preempt_schedule+0x11/0x80 <c02879d1> (try_to_wake_up+0x90/0x160 <c0110350>)
(T3/#4) dpkg-9299 0dn.4 0µs : try_to_wake_up+0x118/0x160 <c01103d8> <<...>-2> (69 74):
(T1/#5) dpkg 9299 0 3 00000003 00000005 [0001017457530633] 0.000ms (+0.000ms): preempt_schedule+0x11/0x80 <c02879d1> (try_to_wake_up+0xf2/0x160 <c01103b2>)
(T1/#6) dpkg 9299 0 3 00000003 00000006 [0001017457530809] 0.001ms (+0.000ms): wake_up_process+0x35/0x40 <c0110455> (do_softirq+0x3f/0x50 <c011aedf>)
(T6/#7) dpkg-9299 0dn.2 1µs!< (1)
(T1/#8) dpkg 9299 0 2 00000001 00000008 [0001017457898984] 0.276ms (+0.000ms): preempt_schedule+0x11/0x80 <c02879d1> (copy_pte_range+0xbc/0x1b0 <c014573c>)
(T1/#9) dpkg 9299 0 2 00000001 00000009 [0001017457899172] 0.276ms (+0.000ms): __cond_resched_raw_spinlock+0xb/0x50 <c0111f9b> (copy_pte_range+0xad/0x1b0 <c014572d>)
(T1/#10) dpkg 9299 0 2 00000000 0000000a [0001017457899575] 0.277ms (+0.000ms): __cond_resched+0xe/0x70 <c0111f2e> (__cond_resched_raw_spinlock+0x35/0x50 <c0111fc5>)
(T1/#11) dpkg 9299 0 3 00000000 0000000b [0001017457900063] 0.277ms (+0.000ms): __schedule+0xe/0x680 <c028720e> (__cond_resched+0x4a/0x70 <c0111f6a>)
(T1/#12) dpkg 9299 0 3 00000000 0000000c [0001017457900379] 0.277ms (+0.000ms): profile_hit+0x9/0x50 <c0116449> (__schedule+0x43/0x680 <c0287243>)
(T1/#13) dpkg 9299 0 3 00000001 0000000d [0001017457900602] 0.277ms (+0.001ms): sched_clock+0x14/0x80 <c010cbb4> (__schedule+0x73/0x680 <c0287273>)
(T1/#14) dpkg 9299 0 3 00000002 0000000e [0001017457902490] 0.279ms (+0.000ms): dequeue_task+0x12/0x60 <c010ff32> (__schedule+0x1e0/0x680 <c02873e0>)
(T1/#15) dpkg 9299 0 3 00000002 0000000f [0001017457902687] 0.279ms (+0.000ms): recalc_task_prio+0xe/0x140 <c01100be> (__schedule+0x202/0x680 <c0287402>)
(T1/#16) dpkg 9299 0 3 00000002 00000010 [0001017457902848] 0.279ms (+0.000ms): effective_prio+0x8/0x60 <c0110058> (recalc_task_prio+0x88/0x140 <c0110138>)
(T1/#17) dpkg 9299 0 3 00000002 00000011 [0001017457902995] 0.279ms (+0.000ms): enqueue_task+0x11/0x80 <c010ff91> (__schedule+0x20e/0x680 <c028740e>)
(T4/#18) [ => dpkg ] 0.280ms (+0.000ms)
(T1/#19) <...> 2 0 1 00000002 00000013 [0001017457905091] 0.281ms (+0.000ms): __switch_to+0xe/0x190 <c010110e> (__schedule+0x306/0x680 <c0287506>)
(T3/#20) <...>-2 0d..2 281µs : __schedule+0x337/0x680 <c0287537> <dpkg-9299> (74 69):
(T1/#21) <...> 2 0 1 00000002 00000015 [0001017457906484] 0.282ms (+0.000ms): finish_task_switch+0x14/0xa0 <c0110844> (__schedule+0x33f/0x680 <c028753f>)
(T1/#22) <...> 2 0 1 00000001 00000016 [0001017457906713] 0.282ms (+0.000ms): trace_stop_sched_switched+0x11/0x180 <c012eb11> (finish_task_switch+0x51/0xa0 <c0110881>)
(T3/#23) <...>-2 0d..1 282µs : trace_stop_sched_switched+0x4c/0x180 <c012eb4c> <<...>-2> (69 0):
(T1/#24) <...> 2 0 1 00000001 00000018 [0001017457908107] 0.283ms (+0.000ms): trace_stop_sched_switched+0x11c/0x180 <c012ec1c> (finish_task_switch+0x51/0xa0 <c0110881>)


2005-02-25 05:59:14

by Hugh Dickins

[permalink] [raw]
Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

On Thu, 24 Feb 2005, Lee Revell wrote:
> On Thu, 2005-02-24 at 08:26 +0000, Hugh Dickins wrote:
> >
> > If we'd got to it earlier, yes. But 2.6.11 looks to be just a day or
> > two away, and we've no idea why zap_pte_range or clear_page_range
> > would have reverted. Nor have we heard from Ingo yet.
>
> It's also not clear that the patch completely fixes the copy_pte_range
> latency. This trace is from the Athlon XP.

Then we need Ingo to investigate and explain all these reversions.
I'm not _blaming_ Ingo for them, but I'm not familiar with his patches
nor with deciphering latency traces - he's the magician around here.

Hugh

2005-02-25 15:04:19

by Lee Revell

[permalink] [raw]
Subject: Re: More latency regressions with 2.6.11-rc4-RT-V0.7.39-02

On Fri, 2005-02-25 at 05:58 +0000, Hugh Dickins wrote:
> On Thu, 24 Feb 2005, Lee Revell wrote:
> > On Thu, 2005-02-24 at 08:26 +0000, Hugh Dickins wrote:
> > >
> > > If we'd got to it earlier, yes. But 2.6.11 looks to be just a day or
> > > two away, and we've no idea why zap_pte_range or clear_page_range
> > > would have reverted. Nor have we heard from Ingo yet.
> >
> > It's also not clear that the patch completely fixes the copy_pte_range
> > latency. This trace is from the Athlon XP.
>
> Then we need Ingo to investigate and explain all these reversions.
> I'm not _blaming_ Ingo for them, but I'm not familiar with his patches
> nor with deciphering latency traces - he's the magician around here.
>

Yup. Oh well.

I'll try to compile a comprehensive list of these so we can fix them for
2.6.12.

Lee