2014-07-02 06:49:36

by Yasuaki Ishimatsu

[permalink] [raw]
Subject: [PATCH] x86,cpu-hotplug: clear llc_shared_mask at CPU hotplug

llc_shared_mask is not cleared even if cpu is offline or hot removed.
So when hot-plugging CPU, the mask has wrong value. The mask is used
by CSF schduler. So it breaks CFS scheduler.

Here is a example on my system.
My system has 4 sockets and each socket has 15 cores and HT is enabled.
In this case, each core of sockes is numbered as follows:

| CPU#
Socket#0 | 0-14 , 60-74
Socket#1 | 15-29, 75-89
Socket#2 | 30-44, 90-104
Socket#3 | 45-59, 105-119

Then llc_shared_mask of CPU#30 has 0x3fff80000001fffc0000000.
It means that cache of Socket#2 is shared with CPU#30-44 and 90-104.

When hot-removing socket#2 and #3, each core of sockets is numbered
as follows:

| CPU#
Socket#0 | 0-14 , 60-74
Socket#1 | 15-29, 75-89

But llc_shared_mask is not cleared. So llc_shared_mask of CPU#30 remains
having 0x3fff80000001fffc0000000.

After that, when hot-adding socket#2 and #3, each core of sockets is
numbered as follows:

| CPU#
Socket#0 | 0-14 , 60-74
Socket#1 | 15-29, 75-89
Socket#2 | 30-59
Socket#3 | 90-119

Then llc_shared_mask of CPU#30 becomes 0x3fff8000fffffffc0000000.
It means that cache of Socket#2 is shared with CPU#30-59 and 90-104.
So the mask has wrong value.

This patch fixes above problem by clearing llc_shared_mask bit of
offlined cpu.

Signed-off-by: Yasuaki Ishimatsu <[email protected]>
---
arch/x86/kernel/smpboot.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 5492798..893cd2b 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1279,6 +1279,7 @@ __init void prefill_possible_map(void)
static void remove_siblinginfo(int cpu)
{
int sibling;
+ int llc_shared;
struct cpuinfo_x86 *c = &cpu_data(cpu);

for_each_cpu(sibling, cpu_core_mask(cpu)) {
@@ -1290,9 +1291,12 @@ static void remove_siblinginfo(int cpu)
cpu_data(sibling).booted_cores--;
}

+ for_each_cpu(llc_shared, cpu_llc_shared_mask(cpu))
+ cpumask_clear_cpu(cpu, cpu_llc_shared_mask(llc_shared));
for_each_cpu(sibling, cpu_sibling_mask(cpu))
cpumask_clear_cpu(cpu, cpu_sibling_mask(sibling));
cpumask_clear(cpu_sibling_mask(cpu));
+ cpumask_clear(cpu_llc_shared_mask(cpu));
cpumask_clear(cpu_core_mask(cpu));
c->phys_proc_id = 0;
c->cpu_core_id = 0;


2014-07-02 11:33:03

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] x86,cpu-hotplug: clear llc_shared_mask at CPU hotplug

On Wed, Jul 02, 2014 at 03:41:21PM +0900, Yasuaki Ishimatsu wrote:
> llc_shared_mask is not cleared even if cpu is offline or hot removed.
> So when hot-plugging CPU, the mask has wrong value. The mask is used
> by CSF schduler. So it breaks CFS scheduler.
>
> Here is a example on my system.
> My system has 4 sockets and each socket has 15 cores and HT is enabled.
> In this case, each core of sockes is numbered as follows:
>
> | CPU#
> Socket#0 | 0-14 , 60-74
> Socket#1 | 15-29, 75-89
> Socket#2 | 30-44, 90-104
> Socket#3 | 45-59, 105-119
>
> Then llc_shared_mask of CPU#30 has 0x3fff80000001fffc0000000.
> It means that cache of Socket#2 is shared with CPU#30-44 and 90-104.
>
> When hot-removing socket#2 and #3, each core of sockets is numbered
> as follows:
>
> | CPU#
> Socket#0 | 0-14 , 60-74
> Socket#1 | 15-29, 75-89
>
> But llc_shared_mask is not cleared. So llc_shared_mask of CPU#30 remains
> having 0x3fff80000001fffc0000000.
>
> After that, when hot-adding socket#2 and #3, each core of sockets is
> numbered as follows:
>
> | CPU#
> Socket#0 | 0-14 , 60-74
> Socket#1 | 15-29, 75-89
> Socket#2 | 30-59
> Socket#3 | 90-119

Ok, this doesn't make too much sense to me. Why would the readded cores
have new numbers?

Because if they kept their old numbers, you wouldn't have to correct the
LLC mask.

Shouldn't the hotplug code keep stable core ids based on APIC id and
node and whatever across physical hotplug operations?

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-07-02 11:45:54

by Mike Galbraith

[permalink] [raw]
Subject: Re: [PATCH] x86,cpu-hotplug: clear llc_shared_mask at CPU hotplug

(CC Peter)

On Wed, 2014-07-02 at 15:41 +0900, Yasuaki Ishimatsu wrote:
> llc_shared_mask is not cleared even if cpu is offline or hot removed.
> So when hot-plugging CPU, the mask has wrong value. The mask is used
> by CSF schduler. So it breaks CFS scheduler.
>
> Here is a example on my system.
> My system has 4 sockets and each socket has 15 cores and HT is enabled.
> In this case, each core of sockes is numbered as follows:
>
> | CPU#
> Socket#0 | 0-14 , 60-74
> Socket#1 | 15-29, 75-89
> Socket#2 | 30-44, 90-104
> Socket#3 | 45-59, 105-119
>
> Then llc_shared_mask of CPU#30 has 0x3fff80000001fffc0000000.
> It means that cache of Socket#2 is shared with CPU#30-44 and 90-104.
>
> When hot-removing socket#2 and #3, each core of sockets is numbered
> as follows:
>
> | CPU#
> Socket#0 | 0-14 , 60-74
> Socket#1 | 15-29, 75-89
>
> But llc_shared_mask is not cleared. So llc_shared_mask of CPU#30 remains
> having 0x3fff80000001fffc0000000.
>
> After that, when hot-adding socket#2 and #3, each core of sockets is
> numbered as follows:
>
> | CPU#
> Socket#0 | 0-14 , 60-74
> Socket#1 | 15-29, 75-89
> Socket#2 | 30-59
> Socket#3 | 90-119
>
> Then llc_shared_mask of CPU#30 becomes 0x3fff8000fffffffc0000000.
> It means that cache of Socket#2 is shared with CPU#30-59 and 90-104.
> So the mask has wrong value.
>
> This patch fixes above problem by clearing llc_shared_mask bit of
> offlined cpu.
>
> Signed-off-by: Yasuaki Ishimatsu <[email protected]>
> ---
> arch/x86/kernel/smpboot.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 5492798..893cd2b 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1279,6 +1279,7 @@ __init void prefill_possible_map(void)
> static void remove_siblinginfo(int cpu)
> {
> int sibling;
> + int llc_shared;
> struct cpuinfo_x86 *c = &cpu_data(cpu);
>
> for_each_cpu(sibling, cpu_core_mask(cpu)) {
> @@ -1290,9 +1291,12 @@ static void remove_siblinginfo(int cpu)
> cpu_data(sibling).booted_cores--;
> }
>
> + for_each_cpu(llc_shared, cpu_llc_shared_mask(cpu))
> + cpumask_clear_cpu(cpu, cpu_llc_shared_mask(llc_shared));
> for_each_cpu(sibling, cpu_sibling_mask(cpu))
> cpumask_clear_cpu(cpu, cpu_sibling_mask(sibling));
> cpumask_clear(cpu_sibling_mask(cpu));
> + cpumask_clear(cpu_llc_shared_mask(cpu));
> cpumask_clear(cpu_core_mask(cpu));
> c->phys_proc_id = 0;
> c->cpu_core_id = 0;
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2014-07-03 04:53:56

by Yasuaki Ishimatsu

[permalink] [raw]
Subject: Re: [PATCH] x86,cpu-hotplug: clear llc_shared_mask at CPU hotplug

(2014/07/02 20:32), Borislav Petkov wrote:
> On Wed, Jul 02, 2014 at 03:41:21PM +0900, Yasuaki Ishimatsu wrote:
>> llc_shared_mask is not cleared even if cpu is offline or hot removed.
>> So when hot-plugging CPU, the mask has wrong value. The mask is used
>> by CSF schduler. So it breaks CFS scheduler.
>>
>> Here is a example on my system.
>> My system has 4 sockets and each socket has 15 cores and HT is enabled.
>> In this case, each core of sockes is numbered as follows:
>>
>> | CPU#
>> Socket#0 | 0-14 , 60-74
>> Socket#1 | 15-29, 75-89
>> Socket#2 | 30-44, 90-104
>> Socket#3 | 45-59, 105-119
>>
>> Then llc_shared_mask of CPU#30 has 0x3fff80000001fffc0000000.
>> It means that cache of Socket#2 is shared with CPU#30-44 and 90-104.
>>
>> When hot-removing socket#2 and #3, each core of sockets is numbered
>> as follows:
>>
>> | CPU#
>> Socket#0 | 0-14 , 60-74
>> Socket#1 | 15-29, 75-89
>>
>> But llc_shared_mask is not cleared. So llc_shared_mask of CPU#30 remains
>> having 0x3fff80000001fffc0000000.
>>
>> After that, when hot-adding socket#2 and #3, each core of sockets is
>> numbered as follows:
>>
>> | CPU#
>> Socket#0 | 0-14 , 60-74
>> Socket#1 | 15-29, 75-89
>> Socket#2 | 30-59
>> Socket#3 | 90-119
>
> Ok, this doesn't make too much sense to me. Why would the readded cores
> have new numbers?

I think that the reason to apply CPU number to ACPI ID is that CPU is
used for the application without considering physical CPU. So even if
CPU number is changed, it is no matter. Thus the readded cores is
numbered to unused CPU number.

>
> Because if they kept their old numbers, you wouldn't have to correct the
> LLC mask.
>

I think the mask has 2 meanings as follows:
- representing CPUs that share same CPU cache.
- representing onlined CPUs
So even if we keep their old numbers, we should clear the mask when
offlinig CPU.

> Shouldn't the hotplug code keep stable core ids based on APIC id and
> node and whatever across physical hotplug operations?
>

Thanks,
Yasuaki Ishimatsu

2014-07-03 09:51:47

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] x86,cpu-hotplug: clear llc_shared_mask at CPU hotplug

On Thu, Jul 03, 2014 at 01:52:52PM +0900, Yasuaki Ishimatsu wrote:
> I think that the reason to apply CPU number to ACPI ID is that CPU is
> used for the application without considering physical CPU. So even if
> CPU number is changed, it is no matter.

I don't think I understand what you're saying here...

> Thus the readded cores is numbered to unused CPU number.

Well, maybe we should use some method to number cores in a stable manner
so that they don't get new numbers when they reappear.

> I think the mask has 2 meanings as follows:
> - representing CPUs that share same CPU cache.

... that share the last level cache.

> - representing onlined CPUs

no, for that we have cpu_online_mask.

> So even if we keep their old numbers, we should clear the mask when
> offlinig CPU.

No, cpu_online_mask is for onlined cores. I think the mask which shows
which cores share a last level cache should not be changed *IF* the core
numbers remain stable, that is.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-07-04 00:16:07

by Yasuaki Ishimatsu

[permalink] [raw]
Subject: Re: [PATCH] x86,cpu-hotplug: clear llc_shared_mask at CPU hotplug

(2014/07/03 18:51), Borislav Petkov wrote:
> On Thu, Jul 03, 2014 at 01:52:52PM +0900, Yasuaki Ishimatsu wrote:
>> I think that the reason to apply CPU number to ACPI ID is that CPU is
>> used for the application without considering physical CPU. So even if
>> CPU number is changed, it is no matter.
>
> I don't think I understand what you're saying here...
>
>> Thus the readded cores is numbered to unused CPU number.
>
> Well, maybe we should use some method to number cores in a stable manner
> so that they don't get new numbers when they reappear.
>
>> I think the mask has 2 meanings as follows:
>> - representing CPUs that share same CPU cache.
>
> ... that share the last level cache.
>
>> - representing onlined CPUs
>
> no, for that we have cpu_online_mask.
>

>> So even if we keep their old numbers, we should clear the mask when
>> offlinig CPU.
>
> No, cpu_online_mask is for onlined cores. I think the mask which shows
> which cores share a last level cache should not be changed *IF* the core
> numbers remain stable, that is.

If so, why following maps are cleared by CPU offline?
- cpu_sigling_map
- cpu_core_map

I think the masks are used by cpumask_weight(). So they must be cleared
at CPU offline. And currently llc_shared_map is not used by cpumask_weight().
So even if we don't clear the mask, there is no problem. But llc_shared_map
will be used by cpumask_weight(). In this case, some problem will occur and
we will spent time the investigation of the problem.

So even if we keep CPU number, the mask should be cleared at CPU offline.

Thanks,
Yasuaki Ishimatsu

2014-07-04 10:59:25

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] x86,cpu-hotplug: clear llc_shared_mask at CPU hotplug

On Fri, Jul 04, 2014 at 09:14:57AM +0900, Yasuaki Ishimatsu wrote:
> If so, why following maps are cleared by CPU offline?
> - cpu_sigling_map
> - cpu_core_map

I'll let you figure that out on your own by doing some quality code
staring. Hint: search for usages outside of the CPU hotplug path.

In any case, let me try to explain it to you one more time: if the core
numbers are static and don't change across physical hotplug, the cores
which share last level cache also remain the same. This is a static
property which doesn't simply change.

For example, if cores 6-11 and 34-41 have been sharing the last level
cache, then if the node containing them gets unplugged and replugged
back again - then the same cores share that exact same cache and nothing
has changed.

Except the core numbers, as you've shown. Which would be not only a
problem for the llc_shared_mask but also a big annoyance for sysadmins
and users having to realize that the topology enumeration has changed
and trying to make sense of what node went where and why isn't it the
same as when the machine booted.

So what I'm trying to say is, we should keep the core numbering stable
across hotplug to avoid unnecessary confusion; the llc_shared_mask is
just a small issue which results from not doing that.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-07-09 06:47:58

by Yasuaki Ishimatsu

[permalink] [raw]
Subject: Re: [PATCH] x86,cpu-hotplug: clear llc_shared_mask at CPU hotplug

Hi Borislav,

Sorry for late reply and thank you for your explanation.
I'm implementing your suggestion. After finishing test, I'll post a patch again.
So when I post a patch, please review it.

Thanks,
Yasuaki Ishimatsu

(2014/07/04 19:59), Borislav Petkov wrote:
> On Fri, Jul 04, 2014 at 09:14:57AM +0900, Yasuaki Ishimatsu wrote:
>> If so, why following maps are cleared by CPU offline?
>> - cpu_sigling_map
>> - cpu_core_map
>
> I'll let you figure that out on your own by doing some quality code
> staring. Hint: search for usages outside of the CPU hotplug path.
>
> In any case, let me try to explain it to you one more time: if the core
> numbers are static and don't change across physical hotplug, the cores
> which share last level cache also remain the same. This is a static
> property which doesn't simply change.
>
> For example, if cores 6-11 and 34-41 have been sharing the last level
> cache, then if the node containing them gets unplugged and replugged
> back again - then the same cores share that exact same cache and nothing
> has changed.
>
> Except the core numbers, as you've shown. Which would be not only a
> problem for the llc_shared_mask but also a big annoyance for sysadmins
> and users having to realize that the topology enumeration has changed
> and trying to make sense of what node went where and why isn't it the
> same as when the machine booted.
>
> So what I'm trying to say is, we should keep the core numbering stable
> across hotplug to avoid unnecessary confusion; the llc_shared_mask is
> just a small issue which results from not doing that.
>