2020-10-21 16:14:23

by Vanshidhar Konda

Subject: [PATCH] arm64: NUMA: Kconfig: Increase max number of nodes

The current arm64 max NUMA nodes default to 4. Today's arm64 systems can
reach or exceed 16. Increase the number to 64 (matching x86_64).

Signed-off-by: Vanshidhar Konda <[email protected]>
---
arch/arm64/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 893130ce1626..3e69d3c981be 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -980,7 +980,7 @@ config NUMA
config NODES_SHIFT
int "Maximum NUMA Nodes (as a power of 2)"
range 1 10
- default "2"
+ default "6"
depends on NEED_MULTIPLE_NODES
help
Specify the maximum number of NUMA Nodes available on the target
--
2.28.0
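
Note on the numbers in this patch: NODES_SHIFT is an exponent, so the node limit the kernel is built for is 2^NODES_SHIFT (MAX_NUMNODES). The sketch below only illustrates that arithmetic; the helper name max_numa_nodes is made up for this example and is not a kernel symbol.

```python
def max_numa_nodes(nodes_shift: int) -> int:
    """Node limit implied by a NODES_SHIFT value (1 << NODES_SHIFT)."""
    # Mirrors the "range 1 10" constraint in the Kconfig entry above.
    if not 1 <= nodes_shift <= 10:
        raise ValueError("NODES_SHIFT must be between 1 and 10")
    return 1 << nodes_shift


if __name__ == "__main__":
    # Old default, proposed default, and the top of the allowed range.
    for shift in (2, 6, 10):
        print(f"NODES_SHIFT={shift} -> up to {max_numa_nodes(shift)} nodes")
```

So the change from "2" to "6" moves the default limit from 4 to 64 nodes, while the upper bound of the range stays where it was.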


2020-10-22 07:53:53

by Vanshidhar Konda

Subject: Re: [PATCH] arm64: NUMA: Kconfig: Increase max number of nodes

On Thu, Oct 22, 2020 at 12:44:15AM +0100, Robin Murphy wrote:
>On 2020-10-21 12:02, Jonathan Cameron wrote:
>>On Wed, 21 Oct 2020 09:43:21 +0530
>>Anshuman Khandual <[email protected]> wrote:
>>
>>>On 10/20/2020 11:39 PM, Valentin Schneider wrote:
>>>>
>>>>Hi,
>>>>
>>>>Nit on the subject: this only increases the default, the max is still 2^10.
>>>
>>>Agreed.
>>>
>>>>
>>>>On 20/10/20 18:34, Vanshidhar Konda wrote:
>>>>>The current arm64 max NUMA nodes default to 4. Today's arm64 systems can
>>>>>reach or exceed 16. Increase the number to 64 (matching x86_64).
>>>>>
>>>>>Signed-off-by: Vanshidhar Konda <[email protected]>
>>>>>---
>>>>> arch/arm64/Kconfig | 2 +-
>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>>diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>>>>index 893130ce1626..3e69d3c981be 100644
>>>>>--- a/arch/arm64/Kconfig
>>>>>+++ b/arch/arm64/Kconfig
>>>>>@@ -980,7 +980,7 @@ config NUMA
>>>>> config NODES_SHIFT
>>>>> int "Maximum NUMA Nodes (as a power of 2)"
>>>>> range 1 10
>>>>>- default "2"
>>>>>+ default "6"
>>>>
>>>>This leads to more statically allocated memory for things like node to CPU
>>>>maps (see uses of MAX_NUMNODES), but that shouldn't be too much of an
>>>>issue.
>>>
>>>The smaller systems should not be required to waste those memory in
>>>a default case, unless there is a real and available larger system
>>>with those increased nodes.
>>>
>>>>
>>>>AIUI this also directly correlates to how many more page->flags bits are
>>>>required: are we sure the max 10 works on any aarch64 platform? I'm
>>>
>>>We will have to test that. Besides 256 (2 ^ 8) is the first threshold
>>>to be crossed here.
>>>
>>>>genuinely asking here, given that I'm mostly a stranger to the mm
>>>>world. The default should be something we're somewhat confident works
>>>>everywhere.
>>>
>>>Agreed. Do we really need to match X86 right now ? Do we really have
>>>systems that has 64 nodes ? We should not increase the default node
>>>value and then try to solve some new problems, when there might not
>>>be any system which could even use that. I would suggest increase
>>>NODES_SHIFT value upto as required by a real and available system.
>>
>>I'm not going to give precise numbers on near future systems but it is public
>>that we ship 8 NUMA node ARM64 systems today. Things will get more
>>interesting as CXL and CCIX enter the market on ARM systems,
>>given chances are every CXL device will look like another NUMA
>>node (CXL spec says they should be presented as such) and you
>>may be able to rack up lots of them.
>>
>>So I'd argue minimum that makes sense today is 16 nodes, but looking forward
>>even a little and 64 is not a great stretch.
>>I'd make the jump to 64 so we can forget about this again for a year or two.
>>People will want to run today's distros on these new machines and we'd
>>rather not have to go around all the distros asking them to carry a patch
>>increasing this count (I assume they are already carrying such a patch
>>due to those 8 node systems)

To echo Jonathan's statement above, we are looking at systems that will
need approximately 64 NUMA nodes over the next 5-6 years - the time for
which an LTS kernel would be maintained. Some of the reasons for
increasing NUMA nodes during this time period include CXL, CCIX and
NVDIMM (as Jonathan pointed out).

The main argument against increasing the NODES_SHIFT seems to be a
concern that it negatively impacts other ARM64 systems. Could anyone
share what kind of systems we are talking about? For a system that has
NEED_MULTIPLE_NODES set, would the impact be noticeable?

Vanshi

>
>Nit: I doubt any sane distro is going to carry a patch to adjust the
>*default* value of a Kconfig option. They might tune the actual value
>in their config, but, well, isn't that the whole point of configs? ;)
>
>Robin.
>
>>
>>Jonathan
>>
>>>
>>>>> depends on NEED_MULTIPLE_NODES
>>>>> help
>>>>> Specify the maximum number of NUMA Nodes available on the target
>>>
>>>_______________________________________________
>>>linux-arm-kernel mailing list
>>>[email protected]
>>>http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

2020-10-22 18:08:47

by Vanshidhar Konda

Subject: Re: [PATCH] arm64: NUMA: Kconfig: Increase max number of nodes

On Thu, Oct 22, 2020 at 12:21:27PM +0100, Robin Murphy wrote:
>On 2020-10-22 02:07, Vanshi Konda wrote:
>>On Thu, Oct 22, 2020 at 12:44:15AM +0100, Robin Murphy wrote:
>>>On 2020-10-21 12:02, Jonathan Cameron wrote:
>>>>On Wed, 21 Oct 2020 09:43:21 +0530
>>>>Anshuman Khandual <[email protected]> wrote:
>>>>
>>>>>On 10/20/2020 11:39 PM, Valentin Schneider wrote:
>>>>>>
>>>>>>Hi,
>>>>>>
>>>>>>Nit on the subject: this only increases the default, the max
>>>>>>is still 2^10.
>>>>>
>>>>>Agreed.
>>>>>
>>>>>>
>>>>>>On 20/10/20 18:34, Vanshidhar Konda wrote:
>>>>>>>The current arm64 max NUMA nodes default to 4. Today's
>>>>>>>arm64 systems can
>>>>>>>reach or exceed 16. Increase the number to 64 (matching x86_64).
>>>>>>>
>>>>>>>Signed-off-by: Vanshidhar Konda <[email protected]>
>>>>>>>---
>>>>>>> arch/arm64/Kconfig | 2 +-
>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>
>>>>>>>diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>>>>>>index 893130ce1626..3e69d3c981be 100644
>>>>>>>--- a/arch/arm64/Kconfig
>>>>>>>+++ b/arch/arm64/Kconfig
>>>>>>>@@ -980,7 +980,7 @@ config NUMA
>>>>>>> config NODES_SHIFT
>>>>>>>      int "Maximum NUMA Nodes (as a power of 2)"
>>>>>>>      range 1 10
>>>>>>>-     default "2"
>>>>>>>+     default "6"
>>>>>>
>>>>>>This leads to more statically allocated memory for things
>>>>>>like node to CPU
>>>>>>maps (see uses of MAX_NUMNODES), but that shouldn't be too much of an
>>>>>>issue.
>>>>>
>>>>>The smaller systems should not be required to waste those memory in
>>>>>a default case, unless there is a real and available larger system
>>>>>with those increased nodes.
>>>>>
>>>>>>
>>>>>>AIUI this also directly correlates to how many more
>>>>>>page->flags bits are
>>>>>>required: are we sure the max 10 works on any aarch64 platform? I'm
>>>>>
>>>>>We will have to test that. Besides 256 (2 ^ 8) is the first threshold
>>>>>to be crossed here.
>>>>>
>>>>>>genuinely asking here, given that I'm mostly a stranger to the mm
>>>>>>world. The default should be something we're somewhat confident works
>>>>>>everywhere.
>>>>>
>>>>>Agreed. Do we really need to match X86 right now ? Do we really have
>>>>>systems that has 64 nodes ? We should not increase the default node
>>>>>value and then try to solve some new problems, when there might not
>>>>>be any system which could even use that. I would suggest increase
>>>>>NODES_SHIFT value upto as required by a real and available system.
>>>>
>>>>I'm not going to give precise numbers on near future systems but
>>>>it is public
>>>>that we ship 8 NUMA node ARM64 systems today. Things will get more
>>>>interesting as CXL and CCIX enter the market on ARM systems,
>>>>given chances are every CXL device will look like another NUMA
>>>>node (CXL spec says they should be presented as such) and you
>>>>may be able to rack up lots of them.
>>>>
>>>>So I'd argue minimum that makes sense today is 16 nodes, but
>>>>looking forward
>>>>even a little and 64 is not a great stretch.
>>>>I'd make the jump to 64 so we can forget about this again for a
>>>>year or two.
>>>>People will want to run today's distros on these new machines and we'd
>>>>rather not have to go around all the distros asking them to
>>>>carry a patch
>>>>increasing this count (I assume they are already carrying such a patch
>>>>due to those 8 node systems)
>>
>>To echo Jonathan's statement above we are looking at systems that will
>>need approximately 64 NUMA nodes over the next 5-6 years - the time for
>>which an LTS kernel would be maintained. Some of the reason's for
>>increasing NUMA nodes during this time period include CXL, CCIX and
>>NVDIMM (like Jonathan pointed out).
>>
>>The main argument against increasing the NODES_SHIFT seems to be a
>>concern that it negatively impacts other ARM64 systems. Could anyone
>>share what kind of systems we are talking about? For a system that has
>>NEED_MULTIPLE_NODES set, would the impact be noticeable?
>
>Systems like the ESPRESSObin - sure, sane people aren't trying to run
>desktops or development environments in 1GB of RAM, but it's not
>uncommon for them to use a minimal headless install of their favourite
>generic arm64 distro rather than something more "embedded" like

If someone is running a generic arm64 distro, at least some of them are
already paying the extra cost. NODES_SHIFT for Ubuntu and SuSE kernels
is already 6. CentOS/Redhat and Oracle Linux set it to 3. I've only seen
Debian set it to 2.
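
To make the comparison concrete, here is a small sketch that turns the NODES_SHIFT values quoted above into node limits and checks which of them would cover a given machine. The values are only the ones reported in this message, not a survey of current distro configs, and the names DISTRO_NODES_SHIFT and distros_supporting are made up for illustration.

```python
# NODES_SHIFT values as reported in this thread (not independently verified).
DISTRO_NODES_SHIFT = {
    "Ubuntu": 6,
    "SuSE": 6,
    "CentOS/Redhat": 3,
    "Oracle Linux": 3,
    "Debian": 2,
}


def distros_supporting(node_count: int) -> list[str]:
    """Distros whose reported NODES_SHIFT allows at least node_count nodes."""
    return sorted(
        name
        for name, shift in DISTRO_NODES_SHIFT.items()
        if node_count <= (1 << shift)
    )


if __name__ == "__main__":
    for nodes in (4, 8, 16, 64):
        print(nodes, "nodes:", ", ".join(distros_supporting(nodes)))
```

With these numbers, the 8-node systems mentioned earlier fit within a shift of 3, but anything beyond 8 nodes is only covered by the kernels that already use 6.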

Vanshi

>OpenWrt or Armbian. Increasing a generic kernel's memory footprint
>(and perhaps more importantly, cache footprint) more than necessary is
>going to have *some* impact.
>
>Robin.
>
>>
>>Vanshi
>>
>>>
>>>Nit: I doubt any sane distro is going to carry a patch to adjust
>>>the *default* value of a Kconfig option. They might tune the
>>>actual value in their config, but, well, isn't that the whole
>>>point of configs? ;)
>>>
>>>Robin.
>>>
>>>>
>>>>Jonathan
>>>>
>>>>>
>>>>>>>      depends on NEED_MULTIPLE_NODES
>>>>>>>      help
>>>>>>>        Specify the maximum number of NUMA Nodes
>>>>>>>available on the target

2020-10-22 20:07:39

by Robin Murphy

Subject: Re: [PATCH] arm64: NUMA: Kconfig: Increase max number of nodes

On 2020-10-22 02:07, Vanshi Konda wrote:
> On Thu, Oct 22, 2020 at 12:44:15AM +0100, Robin Murphy wrote:
>> On 2020-10-21 12:02, Jonathan Cameron wrote:
>>> On Wed, 21 Oct 2020 09:43:21 +0530
>>> Anshuman Khandual <[email protected]> wrote:
>>>
>>>> On 10/20/2020 11:39 PM, Valentin Schneider wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Nit on the subject: this only increases the default, the max is
>>>>> still 2^10.
>>>>
>>>> Agreed.
>>>>
>>>>>
>>>>> On 20/10/20 18:34, Vanshidhar Konda wrote:
>>>>>> The current arm64 max NUMA nodes default to 4. Today's arm64
>>>>>> systems can
>>>>>> reach or exceed 16. Increase the number to 64 (matching x86_64).
>>>>>>
>>>>>> Signed-off-by: Vanshidhar Konda <[email protected]>
>>>>>> ---
>>>>>>  arch/arm64/Kconfig | 2 +-
>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>>>>> index 893130ce1626..3e69d3c981be 100644
>>>>>> --- a/arch/arm64/Kconfig
>>>>>> +++ b/arch/arm64/Kconfig
>>>>>> @@ -980,7 +980,7 @@ config NUMA
>>>>>>  config NODES_SHIFT
>>>>>>       int "Maximum NUMA Nodes (as a power of 2)"
>>>>>>       range 1 10
>>>>>> -    default "2"
>>>>>> +    default "6"
>>>>>
>>>>> This leads to more statically allocated memory for things like node
>>>>> to CPU
>>>>> maps (see uses of MAX_NUMNODES), but that shouldn't be too much of an
>>>>> issue.
>>>>
>>>> The smaller systems should not be required to waste those memory in
>>>> a default case, unless there is a real and available larger system
>>>> with those increased nodes.
>>>>
>>>>>
>>>>> AIUI this also directly correlates to how many more page->flags
>>>>> bits are
>>>>> required: are we sure the max 10 works on any aarch64 platform? I'm
>>>>
>>>> We will have to test that. Besides 256 (2 ^ 8) is the first threshold
>>>> to be crossed here.
>>>>
>>>>> genuinely asking here, given that I'm mostly a stranger to the mm
>>>>> world. The default should be something we're somewhat confident works
>>>>> everywhere.
>>>>
>>>> Agreed. Do we really need to match X86 right now ? Do we really have
>>>> systems that has 64 nodes ? We should not increase the default node
>>>> value and then try to solve some new problems, when there might not
>>>> be any system which could even use that. I would suggest increase
>>>> NODES_SHIFT value upto as required by a real and available system.
>>>
>>> I'm not going to give precise numbers on near future systems but it
>>> is public
>>> that we ship 8 NUMA node ARM64 systems today.  Things will get more
>>> interesting as CXL and CCIX enter the market on ARM systems,
>>> given chances are every CXL device will look like another NUMA
>>> node (CXL spec says they should be presented as such) and you
>>> may be able to rack up lots of them.
>>>
>>> So I'd argue minimum that makes sense today is 16 nodes, but looking
>>> forward
>>> even a little and 64 is not a great stretch.
>>> I'd make the jump to 64 so we can forget about this again for a year
>>> or two.
>>> People will want to run today's distros on these new machines and we'd
>>> rather not have to go around all the distros asking them to carry a
>>> patch
>>> increasing this count (I assume they are already carrying such a patch
>>> due to those 8 node systems)
>
> To echo Jonathan's statement above we are looking at systems that will
> need approximately 64 NUMA nodes over the next 5-6 years - the time for
> which an LTS kernel would be maintained. Some of the reason's for
> increasing NUMA nodes during this time period include CXL, CCIX and
> NVDIMM (like Jonathan pointed out).
>
> The main argument against increasing the NODES_SHIFT seems to be a
> concern that it negatively impacts other ARM64 systems. Could anyone
> share what kind of systems we are talking about? For a system that has
> NEED_MULTIPLE_NODES set, would the impact be noticeable?

Systems like the ESPRESSObin - sure, sane people aren't trying to run
desktops or development environments in 1GB of RAM, but it's not
uncommon for them to use a minimal headless install of their favourite
generic arm64 distro rather than something more "embedded" like OpenWrt
or Armbian. Increasing a generic kernel's memory footprint (and perhaps
more importantly, cache footprint) more than necessary is going to have
*some* impact.
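
As a rough feel for where the extra footprint comes from, the sketch below sizes two kinds of objects that scale with MAX_NUMNODES: a node bitmap (nodemask_t is a bitmap with MAX_NUMNODES bits) and a hypothetical per-node array of pointers. It assumes 64-bit longs and 8-byte pointers, the function names are invented for this example, and it is per-object arithmetic only, not a measurement of any kernel build.

```python
BITS_PER_LONG = 64   # assumption: 64-bit arm64 kernel
POINTER_BYTES = 8    # assumption: 8-byte pointers


def nodemask_bytes(nodes_shift: int) -> int:
    """Size of a bitmap with one bit per possible node, rounded up to whole longs."""
    max_nodes = 1 << nodes_shift
    longs = (max_nodes + BITS_PER_LONG - 1) // BITS_PER_LONG
    return longs * (BITS_PER_LONG // 8)


def per_node_pointer_array_bytes(nodes_shift: int) -> int:
    """Size of a hypothetical static array holding one pointer per possible node."""
    return (1 << nodes_shift) * POINTER_BYTES


if __name__ == "__main__":
    for shift in (2, 6, 10):
        print(
            f"NODES_SHIFT={shift}: nodemask {nodemask_bytes(shift)} bytes, "
            f"pointer array {per_node_pointer_array_bytes(shift)} bytes"
        )
```

Under these assumptions a single node bitmap stays at one long for both 4 and 64 nodes, while each per-node array grows 16x; the total cost then depends on how many such arrays a given kernel configuration instantiates, which this sketch does not attempt to count.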

Robin.

>
> Vanshi
>
>>
>> Nit: I doubt any sane distro is going to carry a patch to adjust the
>> *default* value of a Kconfig option. They might tune the actual value
>> in their config, but, well, isn't that the whole point of configs? ;)
>>
>> Robin.
>>
>>>
>>> Jonathan
>>>
>>>>
>>>>>>       depends on NEED_MULTIPLE_NODES
>>>>>>       help
>>>>>>         Specify the maximum number of NUMA Nodes available on the
>>>>>> target

2020-10-28 21:40:16

by Dave Kleikamp

Subject: Re: [PATCH] arm64: NUMA: Kconfig: Increase max number of nodes

On 10/22/20 11:25 AM, Vanshi Konda wrote:
> On Thu, Oct 22, 2020 at 12:21:27PM +0100, Robin Murphy wrote:
>> On 2020-10-22 02:07, Vanshi Konda wrote:
>>> On Thu, Oct 22, 2020 at 12:44:15AM +0100, Robin Murphy wrote:
>>>> On 2020-10-21 12:02, Jonathan Cameron wrote:
>>>>> On Wed, 21 Oct 2020 09:43:21 +0530
>>>>> Anshuman Khandual <[email protected]> wrote:
>>>>>
>>>>>> On 10/20/2020 11:39 PM, Valentin Schneider wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Nit on the subject: this only increases the default, the max is
>>>>>>> still 2^10.
>>>>>>
>>>>>> Agreed.
>>>>>>
>>>>>>>
>>>>>>> On 20/10/20 18:34, Vanshidhar Konda wrote:
>>>>>>>> The current arm64 max NUMA nodes default to 4. Today's arm64
>>>>>>>> systems can
>>>>>>>> reach or exceed 16. Increase the number to 64 (matching x86_64).
>>>>>>>>
>>>>>>>> Signed-off-by: Vanshidhar Konda
>>>>>>>> <[email protected]>
>>>>>>>> ---
>>>>>>>>  arch/arm64/Kconfig | 2 +-
>>>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>>>>>>> index 893130ce1626..3e69d3c981be 100644
>>>>>>>> --- a/arch/arm64/Kconfig
>>>>>>>> +++ b/arch/arm64/Kconfig
>>>>>>>> @@ -980,7 +980,7 @@ config NUMA
>>>>>>>>  config NODES_SHIFT
>>>>>>>>       int "Maximum NUMA Nodes (as a power of 2)"
>>>>>>>>       range 1 10
>>>>>>>> -    default "2"
>>>>>>>> +    default "6"
>>>>>>>
>>>>>>> This leads to more statically allocated memory for things like
>>>>>>> node to CPU
>>>>>>> maps (see uses of MAX_NUMNODES), but that shouldn't be too much
>>>>>>> of an
>>>>>>> issue.
>>>>>>
>>>>>> The smaller systems should not be required to waste those memory in
>>>>>> a default case, unless there is a real and available larger system
>>>>>> with those increased nodes.
>>>>>>
>>>>>>>
>>>>>>> AIUI this also directly correlates to how many more page->flags
>>>>>>> bits are
>>>>>>> required: are we sure the max 10 works on any aarch64 platform? I'm
>>>>>>
>>>>>> We will have to test that. Besides 256 (2 ^ 8) is the first threshold
>>>>>> to be crossed here.
>>>>>>
>>>>>>> genuinely asking here, given that I'm mostly a stranger to the mm
>>>>>>> world. The default should be something we're somewhat confident
>>>>>>> works
>>>>>>> everywhere.
>>>>>>
>>>>>> Agreed. Do we really need to match X86 right now ? Do we really have
>>>>>> systems that has 64 nodes ? We should not increase the default node
>>>>>> value and then try to solve some new problems, when there might not
>>>>>> be any system which could even use that. I would suggest increase
>>>>>> NODES_SHIFT value upto as required by a real and available system.
>>>>>
>>>>> I'm not going to give precise numbers on near future systems but it
>>>>> is public
>>>>> that we ship 8 NUMA node ARM64 systems today. Things will get more
>>>>> interesting as CXL and CCIX enter the market on ARM systems,
>>>>> given chances are every CXL device will look like another NUMA
>>>>> node (CXL spec says they should be presented as such) and you
>>>>> may be able to rack up lots of them.
>>>>>
>>>>> So I'd argue minimum that makes sense today is 16 nodes, but
>>>>> looking forward
>>>>> even a little and 64 is not a great stretch.
>>>>> I'd make the jump to 64 so we can forget about this again for a
>>>>> year or two.
>>>>> People will want to run today's distros on these new machines and we'd
>>>>> rather not have to go around all the distros asking them to carry a
>>>>> patch
>>>>> increasing this count (I assume they are already carrying such a patch
>>>>> due to those 8 node systems)
>>>
>>> To echo Jonathan's statement above we are looking at systems that will
>>> need approximately 64 NUMA nodes over the next 5-6 years - the time for
>>> which an LTS kernel would be maintained. Some of the reason's for
>>> increasing NUMA nodes during this time period include CXL, CCIX and
>>> NVDIMM (like Jonathan pointed out).

This is a very good point. It won't be long before systems push the
number of NUMA nodes higher, and increasing NODES_SHIFT only slightly
now would leave the default configuration unable to recognize all the
nodes. CONFIG_NODES_SHIFT=6 seems a reasonable step up for a generic
kernel that should run well on small to very large systems for a few
years to come.

>>> The main argument against increasing the NODES_SHIFT seems to be a
>>> concern that it negatively impacts other ARM64 systems. Could anyone
>>> share what kind of systems we are talking about? For a system that has
>>> NEED_MULTIPLE_NODES set, would the impact be noticeable?
>>
>> Systems like the ESPRESSObin - sure, sane people aren't trying to run
>> desktops or development environments in 1GB of RAM, but it's not
>> uncommon for them to use a minimal headless install of their favourite
>> generic arm64 distro rather than something more "embedded" like
>
> If someone is running a generic arm64 distro, at least some of them are
> already paying the extra cost. NODES_SHIFT for Ubuntu and SuSE kernels
> is already 6. CentOS/Redhat and Oracle Linux set it to 3. I've only seen
> Debian set it to 2.

Right. The distros may not agree or even care what the default is, but
it doesn't make sense for the mainline default to lag too far behind
what the major distros use.

Shaggy

>
> Vanshi
>
>> OpenWrt or Armbian. Increasing a generic kernel's memory footprint
>> (and perhaps more importantly, cache footprint) more than necessary is
>> going to have *some* impact.
>>
>> Robin.
>>
>>>
>>> Vanshi
>>>
>>>>
>>>> Nit: I doubt any sane distro is going to carry a patch to adjust the
>>>> *default* value of a Kconfig option. They might tune the actual
>>>> value in their config, but, well, isn't that the whole point of
>>>> configs? ;)
>>>>
>>>> Robin.
>>>>
>>>>>
>>>>> Jonathan
>>>>>
>>>>>>
>>>>>>>>       depends on NEED_MULTIPLE_NODES
>>>>>>>>       help
>>>>>>>>         Specify the maximum number of NUMA Nodes
>>>>>>>> available on the target

2020-10-28 21:41:20

by Vanshidhar Konda

Subject: Re: [PATCH] arm64: NUMA: Kconfig: Increase max number of nodes

On Thu, Oct 22, 2020 at 12:21:27PM +0100, Robin Murphy wrote:
>On 2020-10-22 02:07, Vanshi Konda wrote:
>>On Thu, Oct 22, 2020 at 12:44:15AM +0100, Robin Murphy wrote:
>>>On 2020-10-21 12:02, Jonathan Cameron wrote:
>>>>On Wed, 21 Oct 2020 09:43:21 +0530
>>>>Anshuman Khandual <[email protected]> wrote:
>>>>
>>>>>On 10/20/2020 11:39 PM, Valentin Schneider wrote:
>>>>>>
>>>>>>Hi,
>>>>>>
>>>>>>Nit on the subject: this only increases the default, the max
>>>>>>is still 2^10.
>>>>>
>>>>>Agreed.
>>>>>
>>>>>>
>>>>>>On 20/10/20 18:34, Vanshidhar Konda wrote:
>>>>>>>The current arm64 max NUMA nodes default to 4. Today's
>>>>>>>arm64 systems can
>>>>>>>reach or exceed 16. Increase the number to 64 (matching x86_64).
>>>>>>>
>>>>>>>Signed-off-by: Vanshidhar Konda <[email protected]>
>>>>>>>---
>>>>>>> arch/arm64/Kconfig | 2 +-
>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>
>>>>>>>diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>>>>>>index 893130ce1626..3e69d3c981be 100644
>>>>>>>--- a/arch/arm64/Kconfig
>>>>>>>+++ b/arch/arm64/Kconfig
>>>>>>>@@ -980,7 +980,7 @@ config NUMA
>>>>>>> config NODES_SHIFT
>>>>>>>      int "Maximum NUMA Nodes (as a power of 2)"
>>>>>>>      range 1 10
>>>>>>>-     default "2"
>>>>>>>+     default "6"
>>>>>>
>>>>>>This leads to more statically allocated memory for things
>>>>>>like node to CPU
>>>>>>maps (see uses of MAX_NUMNODES), but that shouldn't be too much of an
>>>>>>issue.
>>>>>
>>>>>The smaller systems should not be required to waste those memory in
>>>>>a default case, unless there is a real and available larger system
>>>>>with those increased nodes.
>>>>>
>>>>>>
>>>>>>AIUI this also directly correlates to how many more
>>>>>>page->flags bits are
>>>>>>required: are we sure the max 10 works on any aarch64 platform? I'm
>>>>>
>>>>>We will have to test that. Besides 256 (2 ^ 8) is the first threshold
>>>>>to be crossed here.
>>>>>
>>>>>>genuinely asking here, given that I'm mostly a stranger to the mm
>>>>>>world. The default should be something we're somewhat confident works
>>>>>>everywhere.
>>>>>
>>>>>Agreed. Do we really need to match X86 right now ? Do we really have
>>>>>systems that has 64 nodes ? We should not increase the default node
>>>>>value and then try to solve some new problems, when there might not
>>>>>be any system which could even use that. I would suggest increase
>>>>>NODES_SHIFT value upto as required by a real and available system.
>>>>
>>>>I'm not going to give precise numbers on near future systems but
>>>>it is public
>>>>that we ship 8 NUMA node ARM64 systems today. Things will get more
>>>>interesting as CXL and CCIX enter the market on ARM systems,
>>>>given chances are every CXL device will look like another NUMA
>>>>node (CXL spec says they should be presented as such) and you
>>>>may be able to rack up lots of them.
>>>>
>>>>So I'd argue minimum that makes sense today is 16 nodes, but
>>>>looking forward
>>>>even a little and 64 is not a great stretch.
>>>>I'd make the jump to 64 so we can forget about this again for a
>>>>year or two.
>>>>People will want to run today's distros on these new machines and we'd
>>>>rather not have to go around all the distros asking them to
>>>>carry a patch
>>>>increasing this count (I assume they are already carrying such a patch
>>>>due to those 8 node systems)
>>
>>To echo Jonathan's statement above we are looking at systems that will
>>need approximately 64 NUMA nodes over the next 5-6 years - the time for
>>which an LTS kernel would be maintained. Some of the reason's for
>>increasing NUMA nodes during this time period include CXL, CCIX and
>>NVDIMM (like Jonathan pointed out).
>>
>>The main argument against increasing the NODES_SHIFT seems to be a
>>concern that it negatively impacts other ARM64 systems. Could anyone
>>share what kind of systems we are talking about? For a system that has
>>NEED_MULTIPLE_NODES set, would the impact be noticeable?
>
>Systems like the ESPRESSObin - sure, sane people aren't trying to run
>desktops or development environments in 1GB of RAM, but it's not
>uncommon for them to use a minimal headless install of their favourite
>generic arm64 distro rather than something more "embedded" like
>OpenWrt or Armbian. Increasing a generic kernel's memory footprint
>(and perhaps more importantly, cache footprint) more than necessary is
>going to have *some* impact.
>

Ampere’s platforms support multiple NUMA configuration options to meet
different customer requirements. Several of these configurations have
more than 4 NUMA nodes (the current default limit). They fail to
initialize NUMA with the following errors in dmesg:

[ 0.000000] ACPI: SRAT: Too many proximity domains.
[ 0.000000] ACPI: SRAT: SRAT not used.

[ 0.000000] SRAT: Invalid NUMA node -1 in ITS affinity
[ 0.000000] SRAT: Invalid NUMA node -1 in ITS affinity

If we look at the forecast for the next LTS kernel lifetime, the number
of NUMA nodes will increase significantly due to SoCs with much higher
core counts, more memory channels, and new devices such as
CCIX-attached memory. Supporting these platforms with a default kernel
config will require a NODES_SHIFT value of at least 6.
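
The sizing argument can be sketched as follows: given the number of NUMA nodes (proximity domains) a platform exposes, find the smallest NODES_SHIFT whose limit of 2^NODES_SHIFT covers it. This is only the capacity arithmetic; it does not model how the kernel actually parses SRAT, and min_nodes_shift is a name made up for this example.

```python
def min_nodes_shift(node_count: int) -> int:
    """Smallest NODES_SHIFT (within the Kconfig range 1..10) covering node_count nodes."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # (n - 1).bit_length() is ceil(log2(n)) for n >= 1; the Kconfig range starts at 1.
    shift = max(1, (node_count - 1).bit_length())
    if shift > 10:
        raise ValueError("more nodes than the Kconfig range (1..10) can express")
    return shift


if __name__ == "__main__":
    for nodes in (4, 5, 8, 16, 64):
        print(f"{nodes} nodes need NODES_SHIFT >= {min_nodes_shift(nodes)}")
```

Any configuration with more than 4 nodes needs a shift of at least 3, which is why such systems cannot use all their nodes with the current default of 2.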

Vanshi

>Robin.
>
>>
>>Vanshi
>>
>>>
>>>Nit: I doubt any sane distro is going to carry a patch to adjust
>>>the *default* value of a Kconfig option. They might tune the
>>>actual value in their config, but, well, isn't that the whole
>>>point of configs? ;)
>>>
>>>Robin.
>>>
>>>>
>>>>Jonathan
>>>>
>>>>>
>>>>>>>      depends on NEED_MULTIPLE_NODES
>>>>>>>      help
>>>>>>>        Specify the maximum number of NUMA Nodes
>>>>>>>available on the target