2007-12-25 05:16:14

by Yanmin Zhang

Subject: volanoMark 24% regression in 2.6.24-rc6: why a simple patch makes it

With kernel 2.6.24-rc6, volanoMark regresses significantly:

1) On 8-core stoakley: 17%;
2) On 16-core tigerton: 24%.

I bisected it down to patch fbdcf18df73758b2e187ab94678b30cd5f6ff9f9, which is
meant to fix the bad cpu number in /proc/cpuinfo. As a matter of fact, that
issue is already fixed by 2 other patches:
699d934d5f958d7944d195c03c334f28cc0b3669
and
c0c52d28e05e8bdaa2126570c02ecb1a7358cecc.

At first glance the patch looks good; at least it does not conflict with the
other 2 patches. After double-checking it, I found the problem in the call
chain below:

smp_store_cpu_info => identify_cpu => init_intel => init_intel_cacheinfo.

When CONFIG_X86_HT=y, init_intel_cacheinfo uses cpuinfo_x86->cpu_index, which
is initialized by smp_store_cpu_info. With patch fbdcf18df73758b2e187ab94678b30cd5f6ff9f9,
cpuinfo_x86->cpu_index is initialized only after identify_cpu has been called, so
init_intel_cacheinfo always sets per_cpu(cpu_llc_id, 0) = l2_id or l3_id. Then
set_cpu_sibling_map builds a bad llc_shared_map, so the core domain is not built.
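
The ordering problem can be shown with a small user-space model. This is only
a sketch, not kernel code: struct toy_cpuinfo, toy_init_cacheinfo,
toy_store_cpu_info, toy_boot, TOY_BAD_ID and the "two CPUs share one L2" layout
are all made up for illustration, and it assumes cpu_index still reads 0 at the
time the cache info is recorded.

```c
#include <assert.h>

#define NR_CPUS    4
#define TOY_BAD_ID (-1)

/* Hypothetical stand-in for cpuinfo_x86; only the two fields we need. */
struct toy_cpuinfo {
	int cpu_index;
	int l2_id;
};

/* Models per_cpu(cpu_llc_id, cpu) as a plain array. */
int cpu_llc_id[NR_CPUS];

/* Models init_intel_cacheinfo(): the slot written depends on cpu_index. */
static void toy_init_cacheinfo(const struct toy_cpuinfo *c)
{
	assert(c->cpu_index >= 0 && c->cpu_index < NR_CPUS);
	cpu_llc_id[c->cpu_index] = c->l2_id;
}

/*
 * Models smp_store_cpu_info(). index_first != 0 sets cpu_index before the
 * identify step; index_first == 0 sets it afterwards, so the identify step
 * still sees cpu_index == 0 (an assumption of this model).
 */
static void toy_store_cpu_info(int cpu, int l2_id, int index_first)
{
	struct toy_cpuinfo c = { .cpu_index = 0, .l2_id = l2_id };

	if (index_first)
		c.cpu_index = cpu;
	toy_init_cacheinfo(&c);
	if (!index_first)
		c.cpu_index = cpu;	/* too late: the wrong slot was already written */
}

/*
 * Brings up all toy CPUs, with CPUs 2n and 2n+1 sharing L2 number n.
 * Returns how many per-cpu slots ended up holding the expected L2 id.
 */
int toy_boot(int index_first)
{
	int cpu, ok = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		cpu_llc_id[cpu] = TOY_BAD_ID;
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		toy_store_cpu_info(cpu, cpu / 2, index_first);
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_llc_id[cpu] == cpu / 2)
			ok++;
	return ok;
}
```

In this model toy_boot(1) leaves every slot with the right id, while
toy_boot(0) keeps overwriting slot 0 and never fills the other slots, so
anything that compares the ids to find CPUs sharing a cache gets wrong answers.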

Checking the domain info in dmesg confirms this conclusion.

From this case, I found that the core domain really can improve performance, at
least when testing with volanoMark. :)

The solution is just to revert patch fbdcf18df73758b2e187ab94678b30cd5f6ff9f9,
because the other 2 patches, which fix the same issue, are already in 2.6.24-rc5.

-yanmin


2007-12-25 08:26:27

by Ingo Molnar

Subject: Re: volanoMark 24% regression in 2.6.24-rc6: why a simple patch makes it


* Zhang, Yanmin <[email protected]> wrote:

> The solution is just to revert patch
> fbdcf18df73758b2e187ab94678b30cd5f6ff9f9, because the other 2 patches,
> which fix the same issue, are already in 2.6.24-rc5.

Linus, please revert fbdcf18df73758, as requested by Yanmin. It was
noticed before (by Yinghai Lu) that this commit was not needed, but it
looked like a harmless duplication and we incorrectly thought it was only
a NOP and wanted to fix it in v2.6.25 - but as Yanmin has analyzed it now,
it creates a sub-optimal sched-domains hierarchy (not setting up the domain
belonging to the core) when CONFIG_X86_HT=y.

Acked-by: Ingo Molnar <[email protected]>

nice work Yanmin!

Ingo