2020-07-03 15:59:15

by Uladzislau Rezki

Subject: nr_cpu_ids vs AMD 3970x(32 physical CPUs)

Hello, folk.

I have a system based on an AMD 3970x CPU. It has 32 physical cores
and 64 threads. It seems that the "nr_cpu_ids" variable is not set
correctly on the latest 5.8-rc3 kernel. Please have a look at the dmesg output below:

<snip>
urezki@pc638:~$ sudo dmesg | grep 128
[ 0.000000] IOAPIC[0]: apic_id 128, version 33, address 0xfec00000, GSI 0-23
[ 0.000000] smpboot: Allowing 128 CPUs, 64 hotplug CPUs
[ 0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:128 nr_node_ids:1
...
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=128, Nodes=1
[ 0.000000] rcu: RCU restricting CPUs from NR_CPUS=512 to nr_cpu_ids=128.
[ 0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=128
urezki@pc638:~$
<snip>

For example, SLUB thinks that it is dealing with 128 CPUs in the system, which
is wrong unless I am missing something. Since nr_cpu_ids is broken(?), the
"cpu_possible_mask" does not correspond to reality either.

Any thoughts?

Thanks!

--
Vlad Rezki


2020-07-03 16:57:22

by Peter Zijlstra

Subject: Re: nr_cpu_ids vs AMD 3970x(32 physical CPUs)

On Fri, Jul 03, 2020 at 05:57:49PM +0200, Uladzislau Rezki wrote:
> Hello, folk.
>
> I have a system based on AMD 3970x CPUs. It has 32 physical cores
> and 64 threads. It seems that "nr_cpu_ids" variable is not correctly
> set on latest 5.8-rc3 kernel. Please have a look below on dmesg output:
>
> <snip>
> urezki@pc638:~$ sudo dmesg | grep 128
> [ 0.000000] IOAPIC[0]: apic_id 128, version 33, address 0xfec00000, GSI 0-23
> [ 0.000000] smpboot: Allowing 128 CPUs, 64 hotplug CPUs

This is your BIOS saying it needs 128 ids, 64 of which are 'empty'.

I have a box like that as well, if it bothers you boot with:
"possible_cpus=64" or something.
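
A quick way to see the gap between "possible" and physically present CPUs is to compare the CPU lists the kernel exports in sysfs. The following is a minimal sketch, assuming the standard /sys/devices/system/cpu/possible and /sys/devices/system/cpu/present files; the helper names parse_cpu_list and read_cpu_list are made up for this illustration.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a kernel CPU list such as "0-3,8" into a sorted list of ints."""
    cpus = set()
    for chunk in text.strip().split(","):
        if not chunk:
            continue  # empty list, e.g. an empty sysfs file
        if "-" in chunk:
            first, last = chunk.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(chunk))
    return sorted(cpus)


def read_cpu_list(name, root="/sys/devices/system/cpu"):
    """Return the parsed CPU list from sysfs, or None if the file is absent."""
    path = Path(root) / name
    return parse_cpu_list(path.read_text()) if path.exists() else None


if __name__ == "__main__":
    possible = read_cpu_list("possible")
    present = read_cpu_list("present")
    if possible is not None and present is not None:
        print(f"possible={len(possible)} present={len(present)} "
              f"hotplug-only={len(possible) - len(present)}")
```

On a box like the one reported here, the "hotplug-only" figure would be expected to match the "64 hotplug CPUs" from the smpboot line, and to drop to 0 after booting with possible_cpus=64.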

2020-07-03 17:08:33

by Gabriel C

Subject: Re: nr_cpu_ids vs AMD 3970x(32 physical CPUs)

On Fri, 3 Jul 2020 at 17:58, Uladzislau Rezki <[email protected]> wrote:
>
> Hello, folk.

Hello,

>
> I have a system based on AMD 3970x CPUs. It has 32 physical cores
> and 64 threads. It seems that "nr_cpu_ids" variable is not correctly
> set on latest 5.8-rc3 kernel. Please have a look below on dmesg output:
>
> <snip>
> urezki@pc638:~$ sudo dmesg | grep 128
> [ 0.000000] IOAPIC[0]: apic_id 128, version 33, address 0xfec00000, GSI 0-23
> [ 0.000000] smpboot: Allowing 128 CPUs, 64 hotplug CPUs
> [ 0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:128 nr_node_ids:1
> ...
> [ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=128, Nodes=1
> [ 0.000000] rcu: RCU restricting CPUs from NR_CPUS=512 to nr_cpu_ids=128.
> [ 0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=128
> urezki@pc638:~$
> <snip>
>
> For example SLUB thinks that it deals with 128 CPUs in the system what is
> wrong if i do not miss something. Since nr_cpu_ids is broken(?), thus the
> "cpu_possible_mask" does not correspond to reality as well.
>
> Any thoughts?

This is not a 5.8-rc3 problem. Almost all AMD CPUs and APUs look like this.
The only machine I own that gets this right is a dual EPYC box; everything
else is broken regarding the right C/T & socket(s) count, and that is
probably because it is using NUMA code to get the info.

I reported that a while back and no-one ever cared.

There is even a comment in the hotplug code saying setting the wrong CPU count
is a waste of resources.

I have a 2200G that is reporting 48 cores.

An AMD Ryzen 7 3750H is reporting twice the cores and twice the sockets:

...

[ 0.040578] smpboot: Allowing 16 CPUs, 8 hotplug CPUs
...
[ 0.382122] smpboot: Max logical packages: 2
..

I boot all the boxes restricting the cores to the correct count on the
command line.

Wasted resource or not, this is still a bug IMO.

Best Regards,

Gabriel C

2020-07-03 17:12:13

by Uladzislau Rezki

Subject: Re: nr_cpu_ids vs AMD 3970x(32 physical CPUs)

On Fri, Jul 03, 2020 at 06:56:27PM +0200, Peter Zijlstra wrote:
> On Fri, Jul 03, 2020 at 05:57:49PM +0200, Uladzislau Rezki wrote:
> > Hello, folk.
> >
> > I have a system based on AMD 3970x CPUs. It has 32 physical cores
> > and 64 threads. It seems that "nr_cpu_ids" variable is not correctly
> > set on latest 5.8-rc3 kernel. Please have a look below on dmesg output:
> >
> > <snip>
> > urezki@pc638:~$ sudo dmesg | grep 128
> > [ 0.000000] IOAPIC[0]: apic_id 128, version 33, address 0xfec00000, GSI 0-23
> > [ 0.000000] smpboot: Allowing 128 CPUs, 64 hotplug CPUs
>
> This is your BIOS saying it needs 128 ids, 64 of which are 'empty'.
>
> I have a box like that as well, if it bothers you boot with:
> "possible_cpus=64" or something.
>
OK, i got it. I thought that "cpu_possible_mask" strictly follows
the rule: the number of CPUs in a system that physically are present.

Thanks, Peter!

--
Vlad Rezki

2020-07-03 17:25:14

by Uladzislau Rezki

Subject: Re: nr_cpu_ids vs AMD 3970x(32 physical CPUs)

> >
> > I have a system based on AMD 3970x CPUs. It has 32 physical cores
> > and 64 threads. It seems that "nr_cpu_ids" variable is not correctly
> > set on latest 5.8-rc3 kernel. Please have a look below on dmesg output:
> >
> > <snip>
> > urezki@pc638:~$ sudo dmesg | grep 128
> > [ 0.000000] IOAPIC[0]: apic_id 128, version 33, address 0xfec00000, GSI 0-23
> > [ 0.000000] smpboot: Allowing 128 CPUs, 64 hotplug CPUs
> > [ 0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:128 nr_node_ids:1
> > ...
> > [ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=128, Nodes=1
> > [ 0.000000] rcu: RCU restricting CPUs from NR_CPUS=512 to nr_cpu_ids=128.
> > [ 0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=128
> > urezki@pc638:~$
> > <snip>
> >
> > For example SLUB thinks that it deals with 128 CPUs in the system what is
> > wrong if i do not miss something. Since nr_cpu_ids is broken(?), thus the
> > "cpu_possible_mask" does not correspond to reality as well.
> >
> > Any thoughts?
>
> This is not a 5.8-rc3 problem. Almost all AMD CPUs and APUs look like this.
> The only machine I own that gets this right is a dual EPYC box; everything
> else is broken regarding the right C/T & socket(s) count, and that is
> probably because it is using NUMA code to get the info.
>
> I reported that a while back and no-one ever cared.
>
> There is even a comment in the hotplug code saying setting the wrong CPU count
> is a waste of resources.
>
> I have a 2200G that is reporting 48 cores.
>
> An AMD Ryzen 7 3750H is reporting twice the cores and twice the sockets:
>
> ...
>
> [ 0.040578] smpboot: Allowing 16 CPUs, 8 hotplug CPUs
> ...
> [ 0.382122] smpboot: Max logical packages: 2
> ..
>
> I boot all the boxes restricting the cores to the correct count on the
> command line.
>
> Wasted resource or not, this is still a bug IMO.
>
I suspect that the memory used by DEFINE_PER_CPU variables can be twice
as big, but I have not actually checked it. So, if the code needs to
identify the real number of CPUs it can be a challenge :)

Thanks.

--
Vlad Rezki
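
The suspicion above follows from per-CPU areas being set up for every possible CPU rather than only the present ones. The arithmetic can be sketched as below; the 256 KiB per-CPU unit size is a made-up placeholder for illustration, not a value measured on this system, and percpu_overhead is a hypothetical helper.

```python
def percpu_overhead(possible, present, unit_bytes):
    """Return (total, unused) bytes of per-CPU area for the given CPU counts.

    unit_bytes is the size of one CPU's per-CPU area; the caller supplies it.
    """
    total = possible * unit_bytes
    unused = (possible - present) * unit_bytes
    return total, unused


# Hypothetical unit size of 256 KiB, with the CPU counts from the dmesg output.
total, unused = percpu_overhead(possible=128, present=64, unit_bytes=256 * 1024)
print(f"total={total // 1024} KiB, never used={unused // 1024} KiB")
```

Whatever the real unit size is, half of the per-CPU area would go unused when 64 of the 128 possible CPUs can never come online.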

2020-07-03 17:39:24

by Peter Zijlstra

Subject: Re: nr_cpu_ids vs AMD 3970x(32 physical CPUs)

On Fri, Jul 03, 2020 at 07:09:41PM +0200, Uladzislau Rezki wrote:
> On Fri, Jul 03, 2020 at 06:56:27PM +0200, Peter Zijlstra wrote:
> > On Fri, Jul 03, 2020 at 05:57:49PM +0200, Uladzislau Rezki wrote:
> > > Hello, folk.
> > >
> > > I have a system based on AMD 3970x CPUs. It has 32 physical cores
> > > and 64 threads. It seems that "nr_cpu_ids" variable is not correctly
> > > set on latest 5.8-rc3 kernel. Please have a look below on dmesg output:
> > >
> > > <snip>
> > > urezki@pc638:~$ sudo dmesg | grep 128
> > > [ 0.000000] IOAPIC[0]: apic_id 128, version 33, address 0xfec00000, GSI 0-23
> > > [ 0.000000] smpboot: Allowing 128 CPUs, 64 hotplug CPUs
> >
> > This is your BIOS saying it needs 128 ids, 64 of which are 'empty'.
> >
> > I have a box like that as well, if it bothers you boot with:
> > "possible_cpus=64" or something.
> >
> OK, i got it. I thought that "cpu_possible_mask" strictly follows
> the rule: the number of CPUs in a system that physically are present.

Nah, it's based on ACPI (SRAT IIRC) tables. The case of
over-provisioning is useful for systems that support physical hotplug,
but I've seen boards without that capability do it too.

Just chalk it up to the foibles of BIOS.

2020-07-03 17:46:19

by Peter Zijlstra

Subject: Re: nr_cpu_ids vs AMD 3970x(32 physical CPUs)

On Fri, Jul 03, 2020 at 07:07:39PM +0200, Gabriel C wrote:

> I boot all the boxes restricting the cores to the correct count on the
> command line.

This, because you're right about the wasted memory.

> Wasted resource or not, this is still a bug IMO.

Yeah, but not one we can do much about I think. It is the BIOS saying it
wants more because it expects someone to come along and stick another
CPU in.

Possibly we could say that for single socket machines overprovisioning
is 'silly', but then, I've no idea how to detect that. You'll need to
find an ACPI person.

2020-07-03 18:39:48

by Gabriel C

Subject: Re: nr_cpu_ids vs AMD 3970x(32 physical CPUs)

On Fri, 3 Jul 2020 at 19:45, Peter Zijlstra <[email protected]> wrote:
>
> On Fri, Jul 03, 2020 at 07:07:39PM +0200, Gabriel C wrote:
>
> > I boot all the boxes restricting the cores to the correct count on the
> > command line.
>
> This, because you're right about the wasted memory.
>
> > Wasted resource or not, this is still a bug IMO.
>
> Yeah, but not one we can do much about I think. It is the BIOS saying it
> wants more because it expects someone to come along and stick another
> CPU in.
>
> Possibly we could say that for single socket machines overprovisioning
> is 'silly', but then, I've no idea how to detect that. You'll need to
> find an ACPI person.

I know the EPYC box had that problem too initially: it reported twice
the cores and twice the sockets, but that got fixed in some kernel versions.

https://lkml.org/lkml/2018/5/20/102

I never really looked at how this is calculated but I still believe
there is a bug somewhere.

2020-07-03 19:08:50

by peter enderborg

Subject: Re: nr_cpu_ids vs AMD 3970x(32 physical CPUs)

On 7/3/20 5:57 PM, Uladzislau Rezki wrote:
> Hello, folk.
>
> I have a system based on AMD 3970x CPUs. It has 32 physical cores
> and 64 threads. It seems that "nr_cpu_ids" variable is not correctly
> set on latest 5.8-rc3 kernel. Please have a look below on dmesg output:
>
> <snip>
> urezki@pc638:~$ sudo dmesg | grep 128
> [ 0.000000] IOAPIC[0]: apic_id 128, version 33, address 0xfec00000, GSI 0-23
> [ 0.000000] smpboot: Allowing 128 CPUs, 64 hotplug CPUs
> [ 0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:128 nr_node_ids:1

My 3950 does:


[    0.005271] ACPI: SSDT 0x00000000CA1F5000 0003F1 (v02 ALASKA CPUSSDT  01072009 AMI  01072009)
[    0.108266] smpboot: Allowing 32 CPUs, 0 hotplug CPUs
[    0.111384] setup_percpu: NR_CPUS:8192 nr_cpumask_bits:32 nr_cpu_ids:32 nr_node_ids:1

(Fedora F32  5.6.18-300) What motherboard and BIOS do you use?


> ...
> [ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=128, Nodes=1
> [ 0.000000] rcu: RCU restricting CPUs from NR_CPUS=512 to nr_cpu_ids=128.
> [ 0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=128
> urezki@pc638:~$
> <snip>
>
> For example SLUB thinks that it deals with 128 CPUs in the system what is
> wrong if i do not miss something. Since nr_cpu_ids is broken(?), thus the
> "cpu_possible_mask" does not correspond to reality as well.
>
> Any thoughts?
>
> Thanks!
>
> --
> Vlad Rezki


2020-07-03 19:29:48

by Uladzislau Rezki

Subject: Re: nr_cpu_ids vs AMD 3970x(32 physical CPUs)

On Fri, Jul 03, 2020 at 07:38:14PM +0200, Peter Zijlstra wrote:
> On Fri, Jul 03, 2020 at 07:09:41PM +0200, Uladzislau Rezki wrote:
> > On Fri, Jul 03, 2020 at 06:56:27PM +0200, Peter Zijlstra wrote:
> > > On Fri, Jul 03, 2020 at 05:57:49PM +0200, Uladzislau Rezki wrote:
> > > > Hello, folk.
> > > >
> > > > I have a system based on AMD 3970x CPUs. It has 32 physical cores
> > > > and 64 threads. It seems that "nr_cpu_ids" variable is not correctly
> > > > set on latest 5.8-rc3 kernel. Please have a look below on dmesg output:
> > > >
> > > > <snip>
> > > > urezki@pc638:~$ sudo dmesg | grep 128
> > > > [ 0.000000] IOAPIC[0]: apic_id 128, version 33, address 0xfec00000, GSI 0-23
> > > > [ 0.000000] smpboot: Allowing 128 CPUs, 64 hotplug CPUs
> > >
> > > This is your BIOS saying it needs 128 ids, 64 of which are 'empty'.
> > >
> > > I have a box like that as well, if it bothers you boot with:
> > > "possible_cpus=64" or something.
> > >
> > OK, i got it. I thought that "cpu_possible_mask" strictly follows
> > the rule: the number of CPUs in a system that physically are present.
>
> Nah, it's based on ACPI (SRAT IIRC) tables. The case of
> over-provisioning is useful for systems that support physical hotplug,
> but I've seen boards without that capability do it too.
>
> Just chalk it up to the foibles of BIOS.
>
Yes, I see that such information is propagated by the BIOS to
the kernel, at least for x86 systems. That said, if I have a single
socket system then we do not have any physical hotplug ability,
thus there is no need for over-provisioning.

Agreed that it can be hard to fix, since it all depends on the ACPI
interface, as you stated in another mail.

--
Vlad Rezki

2020-07-03 19:30:56

by Uladzislau Rezki

Subject: Re: nr_cpu_ids vs AMD 3970x(32 physical CPUs)

> > Hello, folk.
> >
> > I have a system based on AMD 3970x CPUs. It has 32 physical cores
> > and 64 threads. It seems that "nr_cpu_ids" variable is not correctly
> > set on latest 5.8-rc3 kernel. Please have a look below on dmesg output:
> >
> > <snip>
> > urezki@pc638:~$ sudo dmesg | grep 128
> > [ 0.000000] IOAPIC[0]: apic_id 128, version 33, address 0xfec00000, GSI 0-23
> > [ 0.000000] smpboot: Allowing 128 CPUs, 64 hotplug CPUs
> > [ 0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:128 nr_node_ids:1
>
> My 3950 does:
>
>
> [    0.005271] ACPI: SSDT 0x00000000CA1F5000 0003F1 (v02 ALASKA CPUSSDT  01072009 AMI  01072009)
> [    0.108266] smpboot: Allowing 32 CPUs, 0 hotplug CPUs
> [    0.111384] setup_percpu: NR_CPUS:8192 nr_cpumask_bits:32 nr_cpu_ids:32 nr_node_ids:1
>
> (Fedora F32  5.6.18-300) What motherboard and BIOS do you use?
>
I have MSI TRX40 with latest BIOS.

--
Vlad Rezki

2020-07-03 20:10:16

by Linus Torvalds

Subject: Re: nr_cpu_ids vs AMD 3970x(32 physical CPUs)

On Fri, Jul 3, 2020 at 12:28 PM Uladzislau Rezki <[email protected]> wrote:
>
> I have MSI TRX40 with latest BIOS.

I think it's just that the BIOS is set for the max possible, in case
you'd have a 3990X.

I compile my kernel with CONFIG_NR_CPUS's set to 64. That works around
the issue.

Lots of distros seem to set CONFIG_MAXSMP to true, which I guess is
the most generic thing to do, but the problem with that is not just
the silly problem with the BIOS, but it also means that the kernel
does dynamic allocation for cpumasks even if you _don't_ have that
problem, because at compile-time you don't know how big the cpumask
will be.

With CONFIG_NR_CPUS's set to 64, the kernel will just use a "unsigned
long" on the stack (and in various data structures) and be done with
it, and not do unnecessary dynamic allocations.

Linus
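
The size argument above can be made concrete with a little arithmetic. This is a rough sketch, assuming a 64-bit machine and a cpumask stored as an array of unsigned longs with one bit per possible CPU id; the helper names merely mirror the kernel's BITS_TO_LONGS idea and are not kernel code.

```python
def bits_to_longs(nr_bits, bits_per_long=64):
    """Number of unsigned longs needed to hold nr_bits bits (rounded up)."""
    return (nr_bits + bits_per_long - 1) // bits_per_long


def cpumask_bytes(nr_cpus, bits_per_long=64):
    """Storage in bytes for a bitmap with one bit per CPU id."""
    return bits_to_longs(nr_cpus, bits_per_long) * (bits_per_long // 8)


# NR_CPUS values seen in this thread: 64 (suggested), 512 and 8192 (from dmesg).
for nr_cpus in (64, 512, 8192):
    print(f"NR_CPUS={nr_cpus}: {bits_to_longs(nr_cpus)} long(s), "
          f"{cpumask_bytes(nr_cpus)} bytes")
```

With NR_CPUS=64 the mask is a single 8-byte word, cheap to keep on the stack; with NR_CPUS=8192 it would be 1024 bytes, which helps explain why such configurations allocate cpumasks dynamically instead.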

2020-07-03 21:22:23

by Uladzislau Rezki

Subject: Re: nr_cpu_ids vs AMD 3970x(32 physical CPUs)

> On Fri, Jul 3, 2020 at 12:28 PM Uladzislau Rezki <[email protected]> wrote:
> >
> > I have MSI TRX40 with latest BIOS.
>
> I think it's just that the BIOS is set for the max possible, in case
> you'd have a 3990X.
>
3990x is the top one in this series, so indeed that can be the
explanation for why nr_cpu_ids is set to 128.

>
> I compile my kernel with CONFIG_NR_CPUS's set to 64. That works around
> the issue.
>
> Lots of distros seem to set CONFIG_MAXSMP to true, which I guess is
> the most generic thing to do, but the problem with that is not just
> the silly problem with the BIOS, but it also means that the kernel
> does dynamic allocation for cpumasks even if you _don't_ have that
> problem, because at compile-time you don't know how big the cpumask
> will be.
>
> With CONFIG_NR_CPUS's set to 64, the kernel will just use a "unsigned
> long" on the stack (and in various data structures) and be done with
> it, and not do unnecessary dynamic allocations.
>
Thanks for the proposed workaround! I will update CONFIG_NR_CPUS with the
proper value in my .config.

Some background:
Actually I have been thinking about making the vmalloc address space
per-CPU, i.e. dividing it into per-CPU address spaces to make allocation
lock-less. That would eliminate the high lock contention. While working on
a prototype I noticed that there is a silly issue with nr_cpu_ids on
some systems.

That is why I reported it.

Thanks, Linus!

--
Vlad Rezki

2020-07-03 21:52:52

by Matthew Wilcox

Subject: Re: nr_cpu_ids vs AMD 3970x(32 physical CPUs)

On Fri, Jul 03, 2020 at 11:20:47PM +0200, Uladzislau Rezki wrote:
> Some background:
> Actually i have been thinking about making vmalloc address space to
> be per-CPU, i.e. divide it to per-CPU address space making an allocation
> lock-less. It will eliminate a high lock contention. When i have done
> a prototype i noticed and realized that there is a silly issue with
> nr_cpu_ids on some systems.

vfree() may happen on a different CPU from the one which called vmalloc(),
so I'm not sure you're going to get as large a win as you think you will.

2020-07-03 22:24:55

by Uladzislau Rezki

Subject: Re: nr_cpu_ids vs AMD 3970x(32 physical CPUs)

On Fri, Jul 03, 2020 at 10:51:57PM +0100, Matthew Wilcox wrote:
> On Fri, Jul 03, 2020 at 11:20:47PM +0200, Uladzislau Rezki wrote:
> > Some background:
> > Actually i have been thinking about making vmalloc address space to
> > be per-CPU, i.e. divide it to per-CPU address space making an allocation
> > lock-less. It will eliminate a high lock contention. When i have done
> > a prototype i noticed and realized that there is a silly issue with
> > nr_cpu_ids on some systems.
>
> vfree() may happen on a different CPU from the one which called vmalloc(),
> so I'm not sure you're going to get as large a win as you think you will.
>
Hmm.. According to my tests the difference is approximately 7x/10x, but
I should also say that as of now those tests are a draft. Indeed vfree() can
be done on a different CPU, but I do not think it is a big issue. The main
goal is to make vmalloc() scale with the number of CPUs in a system,
because the more CPUs there are, the more contended an allocation becomes.

Doing vfree() on another CPU would be a kind of noise (the critical section
is short), whereas the other CPUs will be able to make progress because they
have their own locks.

--
Vlad Rezki
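
The idea discussed in the last two mails can be illustrated with a toy user-space model. This is only a sketch of the general technique (one address range split into per-CPU partitions, each with its own lock), not the prototype mentioned above; the class PartitionedRange and all of its methods are hypothetical names invented for this example.

```python
import threading


class PartitionedRange:
    """Toy model: one address range split into equal per-CPU partitions."""

    def __init__(self, base, size, nr_cpus):
        self.base = base
        self.part_size = size // nr_cpus
        self.locks = [threading.Lock() for _ in range(nr_cpus)]
        # Per-partition bump pointer and list of freed (addr, size) holes.
        self.next_free = [base + i * self.part_size for i in range(nr_cpus)]
        self.holes = [[] for _ in range(nr_cpus)]

    def owner_of(self, addr):
        """Partition (CPU index) that an address belongs to."""
        return (addr - self.base) // self.part_size

    def alloc(self, cpu, size):
        """Allocate from the calling CPU's partition; None if it is full."""
        with self.locks[cpu]:
            # Reuse a freed hole that is large enough (the remainder of a
            # larger hole is simply lost in this toy model).
            for i, (addr, hole_size) in enumerate(self.holes[cpu]):
                if hole_size >= size:
                    del self.holes[cpu][i]
                    return addr
            limit = self.base + (cpu + 1) * self.part_size
            addr = self.next_free[cpu]
            if addr + size > limit:
                return None
            self.next_free[cpu] = addr + size
            return addr

    def free(self, addr, size):
        """Free from any CPU: only the owning partition's lock is taken."""
        owner = self.owner_of(addr)
        with self.locks[owner]:
            self.holes[owner].append((addr, size))
        return owner
```

In this model a free issued from another CPU briefly takes the owner's lock, which is the short critical section described above, while allocations on all other partitions proceed undisturbed. It also shows why nr_cpu_ids matters for such a scheme: splitting the range 128 ways instead of 64 halves each partition even though half of them can never be used.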