2024-02-04 07:26:11

by Gang Li

[permalink] [raw]
Subject: [PATCH 1/1] hugetlb: fix CONFIG_PADATA dependency for non-SMP system

Randy Dunlap and kernel test robot reported a warning:

```
WARNING: unmet direct dependencies detected for PADATA
Depends on [n]: SMP [=n]
Selected by [y]:
- HUGETLBFS [=y] && (X86 [=y] || SPARC64 || ARCH_SUPPORTS_HUGETLBFS [=n] || BROKEN [=n]) && (SYSFS [=y] || SYSCTL [=n])
```

hugetlb parallelization depends on PADATA, and PADATA depends on SMP, so
when the SMP config is disabled, the dependency of hugetlb on padata
should be downgraded to single thread.

Fixes: f2f635264b98 ("hugetlb: have CONFIG_HUGETLBFS select CONFIG_PADATA")
Reported-by: Randy Dunlap <[email protected]>
Closes: https://lore.kernel.org/lkml/[email protected]/
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Signed-off-by: Gang Li <[email protected]>
```
Hi Andrew, this fix patch is based on mm/mm-unstable.
Thanks!
```
---
fs/Kconfig | 2 +-
mm/hugetlb.c | 9 ++++++++-
2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index 3abc107ab2fbd..f2bc73fc0417e 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -261,7 +261,7 @@ menuconfig HUGETLBFS
depends on X86 || SPARC64 || ARCH_SUPPORTS_HUGETLBFS || BROKEN
depends on (SYSFS || SYSCTL)
select MEMFD_CREATE
- select PADATA
+ select PADATA if SMP
help
hugetlbfs is a filesystem backing for HugeTLB pages, based on
ramfs. For architectures that support it, say Y here and read
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index bf3d5dfb921e6..1b01b244fb50b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3457,6 +3457,7 @@ static void __init gather_bootmem_prealloc_node(unsigned long start, unsigned lo

static void __init gather_bootmem_prealloc(void)
{
+#ifdef CONFIG_PADATA
struct padata_mt_job job = {
.thread_fn = gather_bootmem_prealloc_node,
.fn_arg = NULL,
@@ -3469,6 +3470,9 @@ static void __init gather_bootmem_prealloc(void)
};

padata_do_multithreaded(&job);
+#else
+ gather_bootmem_prealloc_node(0, 0, NULL);
+#endif
}

static void __init hugetlb_hstate_alloc_pages_onenode(struct hstate *h, int nid)
@@ -3568,6 +3572,7 @@ static unsigned long __init hugetlb_gigantic_pages_alloc_boot(struct hstate *h)

static unsigned long __init hugetlb_pages_alloc_boot(struct hstate *h)
{
+#ifdef CONFIG_PADATA
struct padata_mt_job job = {
.fn_arg = h,
.align = 1,
@@ -3600,7 +3605,9 @@ static unsigned long __init hugetlb_pages_alloc_boot(struct hstate *h)
job.max_threads = num_node_state(N_MEMORY) * 2;
job.min_chunk = h->max_huge_pages / num_node_state(N_MEMORY) / 2;
padata_do_multithreaded(&job);
-
+#else
+ hugetlb_pages_alloc_boot_node(0, h->max_huge_pages, h);
+#endif
return h->nr_huge_pages;
}

--
2.20.1



2024-02-04 07:45:11

by Muchun Song

[permalink] [raw]
Subject: Re: [PATCH 1/1] hugetlb: fix CONFIG_PADATA dependency for non-SMP system



On 2024/2/4 15:25, Gang Li wrote:
> Randy Dunlap and kernel test robot reported a warning:
>
> ```
> WARNING: unmet direct dependencies detected for PADATA
> Depends on [n]: SMP [=n]
> Selected by [y]:
> - HUGETLBFS [=y] && (X86 [=y] || SPARC64 || ARCH_SUPPORTS_HUGETLBFS [=n] || BROKEN [=n]) && (SYSFS [=y] || SYSCTL [=n])
> ```
>
> hugetlb parallelization depends on PADATA, and PADATA depends on SMP, so
> when the SMP config is disabled, the dependency of hugetlb on padata
> should be downgraded to single thread.
>
> Fixes: f2f635264b98 ("hugetlb: have CONFIG_HUGETLBFS select CONFIG_PADATA")
> Reported-by: Randy Dunlap <[email protected]>
> Closes: https://lore.kernel.org/lkml/[email protected]/
> Reported-by: kernel test robot <[email protected]>
> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
> Signed-off-by: Gang Li <[email protected]>
> ```
> Hi Andrew, this fix patch is based on mm/mm-unstable.
> Thanks!
> ```
> ---
> fs/Kconfig | 2 +-
> mm/hugetlb.c | 9 ++++++++-
> 2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/fs/Kconfig b/fs/Kconfig
> index 3abc107ab2fbd..f2bc73fc0417e 100644
> --- a/fs/Kconfig
> +++ b/fs/Kconfig
> @@ -261,7 +261,7 @@ menuconfig HUGETLBFS
> depends on X86 || SPARC64 || ARCH_SUPPORTS_HUGETLBFS || BROKEN
> depends on (SYSFS || SYSCTL)
> select MEMFD_CREATE
> - select PADATA
> + select PADATA if SMP

I don't think it is a clear way to fix this. If someone want to
use PADATA in a non-SMP system, he should be carefully to handle
the non-SMP case himself. I think the better way is to make PADATA
handle the non-SMP case, I think it should be easy for it, which
could just call ->thread_fn() many times instead of creating many
threads in the non-SMP case.

Thanks.

> help
> hugetlbfs is a filesystem backing for HugeTLB pages, based on
> ramfs. For architectures that support it, say Y here and read
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index bf3d5dfb921e6..1b01b244fb50b 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3457,6 +3457,7 @@ static void __init gather_bootmem_prealloc_node(unsigned long start, unsigned lo
>
> static void __init gather_bootmem_prealloc(void)
> {
> +#ifdef CONFIG_PADATA
> struct padata_mt_job job = {
> .thread_fn = gather_bootmem_prealloc_node,
> .fn_arg = NULL,
> @@ -3469,6 +3470,9 @@ static void __init gather_bootmem_prealloc(void)
> };
>
> padata_do_multithreaded(&job);
> +#else
> + gather_bootmem_prealloc_node(0, 0, NULL);
> +#endif
> }
>
> static void __init hugetlb_hstate_alloc_pages_onenode(struct hstate *h, int nid)
> @@ -3568,6 +3572,7 @@ static unsigned long __init hugetlb_gigantic_pages_alloc_boot(struct hstate *h)
>
> static unsigned long __init hugetlb_pages_alloc_boot(struct hstate *h)
> {
> +#ifdef CONFIG_PADATA
> struct padata_mt_job job = {
> .fn_arg = h,
> .align = 1,
> @@ -3600,7 +3605,9 @@ static unsigned long __init hugetlb_pages_alloc_boot(struct hstate *h)
> job.max_threads = num_node_state(N_MEMORY) * 2;
> job.min_chunk = h->max_huge_pages / num_node_state(N_MEMORY) / 2;
> padata_do_multithreaded(&job);
> -
> +#else
> + hugetlb_pages_alloc_boot_node(0, h->max_huge_pages, h);
> +#endif
> return h->nr_huge_pages;
> }
>


2024-02-04 07:48:38

by Gang Li

[permalink] [raw]
Subject: Re: [PATCH 1/1] hugetlb: fix CONFIG_PADATA dependency for non-SMP system

On 2024/2/4 15:44, Muchun Song wrote:
> I don't think it is a clear way to fix this. If someone want to
> use PADATA in a non-SMP system, he should be carefully to handle
> the non-SMP case himself. I think the better way is to make PADATA
> handle the non-SMP case, I think it should be easy for it, which
> could just call ->thread_fn() many times instead of creating many
> threads in the non-SMP case.
>
> Thanks.
>

Sounds good, I'll take a look at padata and send a new patch.

2024-02-05 06:55:48

by Gang Li

[permalink] [raw]
Subject: Re: [PATCH 1/1] hugetlb: fix CONFIG_PADATA dependency for non-SMP system


On 2024/2/4 15:48, Gang Li wrote:
> On 2024/2/4 15:44, Muchun Song wrote:
>> I don't think it is a clear way to fix this. If someone want to
>> use PADATA in a non-SMP system, he should be carefully to handle
>> the non-SMP case himself. I think the better way is to make PADATA
>> handle the non-SMP case, I think it should be easy for it, which
>> could just call ->thread_fn() many times instead of creating many
>> threads in the non-SMP case.
>>
>> Thanks.
>>
>
> Sounds good, I'll take a look at padata and send a new patch.

1. delete the dependency on SMP

PADATA only depends on workqueue and completion. It works well with !SMP
currently but has no performance benefits. What we can do is make PADATA
handle the non-SMP case more elegantly.

PADATA has two parts: "Running Multithreaded Jobs" and "Running
Serialized Jobs".

"Running Multithreaded Jobs", which hugetlb parallelization relies on
can be easily deparallelize through this patch:

```
@@ -495,7 +495,7 @@ void __init padata_do_multithreaded(struct
padata_mt_job *job)
nworks = max(job->size / max(job->min_chunk, job->align), 1ul);
nworks = min(nworks, job->max_threads);

- if (nworks == 1) {
+ if (nworks == 1 || !IS_ENABLED(CONFIG_SMP)) {
/* Single thread, no coordination needed, cut to the
chase. */
job->thread_fn(job->start, job->start + job->size,
job->fn_arg);
return;
```

However, "Running Serialized Jobs" is more challenging due to its
various workers queuing each other, making it more complex than "Running
Multithreaded Jobs." I am currently in the process of deciphering the
code.

To eliminate kconfig warnings, other methods could be considered:

2. Split hugetlb parallelization into a separate kconfig.
3. Wrap hugetlb parallelization with SMP or PADATA macros (already ruled
out).
4. Split PADATA into PADATA_SERIALIZED and PADATA_MULTITHREADED (too heavy).

Anyway, this is only FYI. I will continue exploring how to deparallelize
"Running Serialized Jobs."

2024-02-05 07:38:43

by Muchun Song

[permalink] [raw]
Subject: Re: [PATCH 1/1] hugetlb: fix CONFIG_PADATA dependency for non-SMP system



> On Feb 5, 2024, at 14:55, Gang Li <[email protected]> wrote:
>
>
> On 2024/2/4 15:48, Gang Li wrote:
>> On 2024/2/4 15:44, Muchun Song wrote:
>>> I don't think it is a clear way to fix this. If someone want to
>>> use PADATA in a non-SMP system, he should be carefully to handle
>>> the non-SMP case himself. I think the better way is to make PADATA
>>> handle the non-SMP case, I think it should be easy for it, which
>>> could just call ->thread_fn() many times instead of creating many
>>> threads in the non-SMP case.
>>>
>>> Thanks.
>>>
>> Sounds good, I'll take a look at padata and send a new patch.
>
> 1. delete the dependency on SMP
>
> PADATA only depends on workqueue and completion. It works well with !SMP
> currently but has no performance benefits. What we can do is make PADATA
> handle the non-SMP case more elegantly.
>
> PADATA has two parts: "Running Multithreaded Jobs" and "Running
> Serialized Jobs".
>
> "Running Multithreaded Jobs", which hugetlb parallelization relies on
> can be easily deparallelize through this patch:
>
> ```
> @@ -495,7 +495,7 @@ void __init padata_do_multithreaded(struct padata_mt_job *job)
> nworks = max(job->size / max(job->min_chunk, job->align), 1ul);
> nworks = min(nworks, job->max_threads);
>
> - if (nworks == 1) {
> + if (nworks == 1 || !IS_ENABLED(CONFIG_SMP)) {
> /* Single thread, no coordination needed, cut to the chase. */
> job->thread_fn(job->start, job->start + job->size, job->fn_arg);
> return;
> ```
>
> However, "Running Serialized Jobs" is more challenging due to its
> various workers queuing each other, making it more complex than "Running
> Multithreaded Jobs." I am currently in the process of deciphering the
> code.

Actually, I did not get it. Why the above code cannot work? The above
code already make it serialized in one call, right? What do I miss here?

Thanks.

>
> To eliminate kconfig warnings, other methods could be considered:
>
> 2. Split hugetlb parallelization into a separate kconfig.
> 3. Wrap hugetlb parallelization with SMP or PADATA macros (already ruled out).
> 4. Split PADATA into PADATA_SERIALIZED and PADATA_MULTITHREADED (too heavy).
>
> Anyway, this is only FYI. I will continue exploring how to deparallelize
> "Running Serialized Jobs."


2024-02-05 08:09:00

by Gang Li

[permalink] [raw]
Subject: Re: [PATCH 1/1] hugetlb: fix CONFIG_PADATA dependency for non-SMP system

On 2024/2/5 15:37, Muchun Song wrote:
> Actually, I did not get it. Why the above code cannot work? The above
> code already make it serialized in one call, right? What do I miss here?
>
> Thanks.
>

PADATA consists of two distinct functionality:

One part is `padata_do_multithreaded`, which disregards
order and simply divides tasks into several groups for parallel
execution. My patch use `padata_do_multithreaded`.

The other part is composed of a set of APIs that, while handling data in
an out-of-order parallel manner, can eventually return the data with
ordered sequence. Only `crypto/pcrypt.c` use them. I guess these APIs
are designed specifically for `crypto/pcrypt.c`.
```
padata_alloc
padata_alloc_shell
padata_do_parallel
padata_do_serial
padata_free_shell
padata_free
```

2024-02-08 02:40:02

by Muchun Song

[permalink] [raw]
Subject: Re: [PATCH 1/1] hugetlb: fix CONFIG_PADATA dependency for non-SMP system



> On Feb 5, 2024, at 16:08, Gang Li <[email protected]> wrote:
>
> On 2024/2/5 15:37, Muchun Song wrote:
>> Actually, I did not get it. Why the above code cannot work? The above
>> code already make it serialized in one call, right? What do I miss here?
>> Thanks.
>
> PADATA consists of two distinct functionality:
>
> One part is `padata_do_multithreaded`, which disregards
> order and simply divides tasks into several groups for parallel
> execution. My patch use `padata_do_multithreaded`.

OK. Since all users of PADATA of non-SMP case currently only use padata_do_multithreaded, it’s possible and easy to implement a variant of that in non-SMP case. It is not necessary to implement another functionality unless the only user of crypto/pcrypt.c does not depend on SMP in the future, at least now it’s dependent on SMP.

Thanks.

>
> The other part is composed of a set of APIs that, while handling data in
> an out-of-order parallel manner, can eventually return the data with
> ordered sequence. Only `crypto/pcrypt.c` use them. I guess these APIs
> are designed specifically for `crypto/pcrypt.c`.
> ```
> padata_alloc
> padata_alloc_shell
> padata_do_parallel
> padata_do_serial
> padata_free_shell
> padata_free
> ```

2024-02-10 02:46:31

by Muchun Song

[permalink] [raw]
Subject: Re: [PATCH 1/1] hugetlb: fix CONFIG_PADATA dependency for non-SMP system


CC Jane, Daniel Jordan, Steffen as well in this thread.

> On Feb 8, 2024, at 10:39, Muchun Song <[email protected]> wrote:
> 
>
>> On Feb 5, 2024, at 16:08, Gang Li <[email protected]> wrote:
>>
>> On 2024/2/5 15:37, Muchun Song wrote:
>>> Actually, I did not get it. Why the above code cannot work? The above
>>> code already make it serialized in one call, right? What do I miss here?
>>> Thanks.
>>
>> PADATA consists of two distinct functionality:
>>
>> One part is `padata_do_multithreaded`, which disregards
>> order and simply divides tasks into several groups for parallel
>> execution. My patch use `padata_do_multithreaded`.
>
> OK. Since all users of PADATA of non-SMP case currently only use padata_do_multithreaded, it’s possible and easy to implement a variant of that in non-SMP case. It is not necessary to implement another functionality unless the only user of crypto/pcrypt.c does not depend on SMP in the future, at least now it’s dependent on SMP.
>
> Thanks.
>
>>
>> The other part is composed of a set of APIs that, while handling data in
>> an out-of-order parallel manner, can eventually return the data with
>> ordered sequence. Only `crypto/pcrypt.c` use them. I guess these APIs
>> are designed specifically for `crypto/pcrypt.c`.
>> ```
>> padata_alloc
>> padata_alloc_shell
>> padata_do_parallel
>> padata_do_serial
>> padata_free_shell
>> padata_free
>> ```