2022-08-15 12:20:59

by lizhe.67

[permalink] [raw]
Subject: [PATCH] page_ext: move up page_ext_init() to catch early page allocation if DEFERRED_STRUCT_PAGE_INIT is n

From: Li Zhe <[email protected]>

In 'commit 2f1ee0913ce5 ("Revert "mm: use early_pfn_to_nid in page_ext_init"")',
we call page_ext_init() after page_alloc_init_late() to avoid some panic
problem. It seems that we cannot track early page allocations in current
kernel even if page structure has been initialized early.

This patch move up page_ext_init() to catch early page allocations when
DEFERRED_STRUCT_PAGE_INIT is n. After this patch, we only need to turn
DEFERRED_STRUCT_PAGE_INIT to n then we are able to analyze the early page
allocations. This is useful especially when we find that the free memory
value is not the same right after different kernel booting.

Signed-off-by: Li Zhe <[email protected]>
---
include/linux/page_ext.h | 30 +++++++++++++++++++++++++++---
init/main.c | 7 +++++--
mm/page_ext.c | 2 +-
3 files changed, 33 insertions(+), 6 deletions(-)

diff --git a/include/linux/page_ext.h b/include/linux/page_ext.h
index fabb2e1e087f..b77a13689e00 100644
--- a/include/linux/page_ext.h
+++ b/include/linux/page_ext.h
@@ -43,14 +43,34 @@ extern void pgdat_page_ext_init(struct pglist_data *pgdat);
static inline void page_ext_init_flatmem(void)
{
}
-extern void page_ext_init(void);
static inline void page_ext_init_flatmem_late(void)
{
}
+extern void _page_ext_init(void);
+#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
+static inline void page_ext_init_early(void)
+{
+}
+static inline void page_ext_init_late(void)
+{
+ _page_ext_init();
+}
+#else
+static inline void page_ext_init_early(void)
+{
+ _page_ext_init();
+}
+static inline void page_ext_init_late(void)
+{
+}
+#endif /* CONFIG_DEFERRED_STRUCT_PAGE_INIT */
#else
extern void page_ext_init_flatmem(void);
extern void page_ext_init_flatmem_late(void);
-static inline void page_ext_init(void)
+static inline void page_ext_init_early(void)
+{
+}
+static inline void page_ext_init_late(void)
{
}
#endif
@@ -76,7 +96,11 @@ static inline struct page_ext *lookup_page_ext(const struct page *page)
return NULL;
}

-static inline void page_ext_init(void)
+static inline void page_ext_init_early(void)
+{
+}
+
+static inline void page_ext_init_late(void)
{
}

diff --git a/init/main.c b/init/main.c
index 91642a4e69be..7f9533ba527d 100644
--- a/init/main.c
+++ b/init/main.c
@@ -845,6 +845,7 @@ static void __init mm_init(void)
* slab is ready so that stack_depot_init() works properly
*/
page_ext_init_flatmem_late();
+ page_ext_init_early();
kmemleak_init();
pgtable_init();
debug_objects_mem_init();
@@ -1605,8 +1606,10 @@ static noinline void __init kernel_init_freeable(void)

padata_init();
page_alloc_init_late();
- /* Initialize page ext after all struct pages are initialized. */
- page_ext_init();
+ /* Initialize page ext after all struct pages are initialized if
+ * CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled
+ */
+ page_ext_init_late();

do_basic_setup();

diff --git a/mm/page_ext.c b/mm/page_ext.c
index 3dc715d7ac29..50419e7349cb 100644
--- a/mm/page_ext.c
+++ b/mm/page_ext.c
@@ -378,7 +378,7 @@ static int __meminit page_ext_callback(struct notifier_block *self,
return notifier_from_errno(ret);
}

-void __init page_ext_init(void)
+void __init _page_ext_init(void)
{
unsigned long pfn;
int nid;
--
2.20.1


2022-08-18 08:07:57

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] page_ext: move up page_ext_init() to catch early page allocation if DEFERRED_STRUCT_PAGE_INIT is n

On Mon 15-08-22 20:09:54, [email protected] wrote:
> From: Li Zhe <[email protected]>
>
> In 'commit 2f1ee0913ce5 ("Revert "mm: use early_pfn_to_nid in page_ext_init"")',
> we call page_ext_init() after page_alloc_init_late() to avoid some panic
> problem. It seems that we cannot track early page allocations in current
> kernel even if page structure has been initialized early.
>
> This patch move up page_ext_init() to catch early page allocations when
> DEFERRED_STRUCT_PAGE_INIT is n. After this patch, we only need to turn
> DEFERRED_STRUCT_PAGE_INIT to n then we are able to analyze the early page
> allocations. This is useful especially when we find that the free memory
> value is not the same right after different kernel booting.

is this actually useful in practice? I mean who is going to disable
DEFERRED_STRUCT_PAGE_INIT and recompile the kernel for debugging early
allocations?

I do see how debugging those early allocations might be useful but that
would require a boot time option to be practical IMHO. Would it make
sense to add a early_page_ext parameter which would essentially disable
the deferred ipage initialization. That should be quite trivial to
achieve (just hook into defer_init AFAICS).
--
Michal Hocko
SUSE Labs

2022-08-20 01:29:05

by lizhe.67

[permalink] [raw]
Subject: Re: [PATCH] page_ext: move up page_ext_init() to catch early page allocation if DEFERRED_STRUCT_PAGE_INIT is n

On 2022-08-18 7:36 UTC, [email protected] wrote:
>> From: Li Zhe <[email protected]>
>>
>> In 'commit 2f1ee0913ce5 ("Revert "mm: use early_pfn_to_nid in page_ext_init"")',
>> we call page_ext_init() after page_alloc_init_late() to avoid some panic
>> problem. It seems that we cannot track early page allocations in current
>> kernel even if page structure has been initialized early.
>>
>> This patch move up page_ext_init() to catch early page allocations when
>> DEFERRED_STRUCT_PAGE_INIT is n. After this patch, we only need to turn
>> DEFERRED_STRUCT_PAGE_INIT to n then we are able to analyze the early page
>> allocations. This is useful especially when we find that the free memory
>> value is not the same right after different kernel booting.
>
>is this actually useful in practice? I mean who is going to disable
>DEFERRED_STRUCT_PAGE_INIT and recompile the kernel for debugging early
>allocations?

Yes it is useful. We use this method to catch the difference of early
page allocations between two kernel.

> I do see how debugging those early allocations might be useful but that
> would require a boot time option to be practical IMHO. Would it make
> sense to add a early_page_ext parameter which would essentially disable
> the deferred ipage initialization. That should be quite trivial to
> achieve (just hook into defer_init AFAICS).

It is a good idea. A cmdline parameter is a flexible and dynamic method for
us to decide whether to defer page's and page_ext's initilization. For
comparison, this patch provides a static method to decide whether to defer
page's and page_ext's initilization. They are not conflicting. My next
work is trying to achieve your idea.
--
Li Zhe

2022-08-22 07:25:42

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH] page_ext: move up page_ext_init() to catch early page allocation if DEFERRED_STRUCT_PAGE_INIT is n

On 8/20/22 03:02, [email protected] wrote:
> On 2022-08-18 7:36 UTC, [email protected] wrote:
>>> From: Li Zhe <[email protected]>
>>>
>>> In 'commit 2f1ee0913ce5 ("Revert "mm: use early_pfn_to_nid in page_ext_init"")',
>>> we call page_ext_init() after page_alloc_init_late() to avoid some panic
>>> problem. It seems that we cannot track early page allocations in current
>>> kernel even if page structure has been initialized early.
>>>
>>> This patch move up page_ext_init() to catch early page allocations when
>>> DEFERRED_STRUCT_PAGE_INIT is n. After this patch, we only need to turn
>>> DEFERRED_STRUCT_PAGE_INIT to n then we are able to analyze the early page
>>> allocations. This is useful especially when we find that the free memory
>>> value is not the same right after different kernel booting.
>>
>>is this actually useful in practice? I mean who is going to disable
>>DEFERRED_STRUCT_PAGE_INIT and recompile the kernel for debugging early
>>allocations?
>
> Yes it is useful. We use this method to catch the difference of early
> page allocations between two kernel.
>
>> I do see how debugging those early allocations might be useful but that
>> would require a boot time option to be practical IMHO. Would it make
>> sense to add a early_page_ext parameter which would essentially disable
>> the deferred ipage initialization. That should be quite trivial to
>> achieve (just hook into defer_init AFAICS).
>
> It is a good idea. A cmdline parameter is a flexible and dynamic method for
> us to decide whether to defer page's and page_ext's initilization. For
> comparison, this patch provides a static method to decide whether to defer
> page's and page_ext's initilization. They are not conflicting. My next
> work is trying to achieve your idea.

As we already have to pass page_owner=on parameter to enable the page
allocation tracking in the first place, maybe that alone could also disable
deffered init, and no need for another parameter?

> --
> Li Zhe

2022-08-22 07:28:07

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] page_ext: move up page_ext_init() to catch early page allocation if DEFERRED_STRUCT_PAGE_INIT is n

On Sat 20-08-22 09:02:57, [email protected] wrote:
> On 2022-08-18 7:36 UTC, [email protected] wrote:
> >> From: Li Zhe <[email protected]>
> >>
> >> In 'commit 2f1ee0913ce5 ("Revert "mm: use early_pfn_to_nid in page_ext_init"")',
> >> we call page_ext_init() after page_alloc_init_late() to avoid some panic
> >> problem. It seems that we cannot track early page allocations in current
> >> kernel even if page structure has been initialized early.
> >>
> >> This patch move up page_ext_init() to catch early page allocations when
> >> DEFERRED_STRUCT_PAGE_INIT is n. After this patch, we only need to turn
> >> DEFERRED_STRUCT_PAGE_INIT to n then we are able to analyze the early page
> >> allocations. This is useful especially when we find that the free memory
> >> value is not the same right after different kernel booting.
> >
> >is this actually useful in practice? I mean who is going to disable
> >DEFERRED_STRUCT_PAGE_INIT and recompile the kernel for debugging early
> >allocations?
>
> Yes it is useful. We use this method to catch the difference of early
> page allocations between two kernel.

I was not questioning the functionality itself but the way how it is
achieved. Recompiling the kernel to achieve debuggability has proven to
be really a bad approach historically. Most people are using
pre-compiled kernels these days.

> > I do see how debugging those early allocations might be useful but that
> > would require a boot time option to be practical IMHO. Would it make
> > sense to add a early_page_ext parameter which would essentially disable
> > the deferred ipage initialization. That should be quite trivial to
> > achieve (just hook into defer_init AFAICS).
>
> It is a good idea. A cmdline parameter is a flexible and dynamic method for
> us to decide whether to defer page's and page_ext's initilization. For
> comparison, this patch provides a static method to decide whether to defer
> page's and page_ext's initilization. They are not conflicting. My next
> work is trying to achieve your idea.

They are not conflicting but this patch adds ifdefs and additional code
that needs compile time testing with different options set. I.e. it adds
maintenance burden for something that can be achieved by better means.
So if you are ok to work on the runtime knob then I would propose to
drop this patch from the mm tree and replace it by a trivial patch to
allow early boot debugging by a cmd line parameter.
--
Michal Hocko
SUSE Labs

2022-08-24 03:57:20

by lizhe.67

[permalink] [raw]
Subject: Re: [PATCH] page_ext: move up page_ext_init() to catch early page allocation if DEFERRED_STRUCT_PAGE_INIT is n

On 2022-08-22 7:08 UTC, [email protected] wrote:
>On Sat 20-08-22 09:02:57, [email protected] wrote:
>> On 2022-08-18 7:36 UTC, [email protected] wrote:
>> >> From: Li Zhe <[email protected]>
>> >>
>> >> In 'commit 2f1ee0913ce5 ("Revert "mm: use early_pfn_to_nid in page_ext_init"")',
>> >> we call page_ext_init() after page_alloc_init_late() to avoid some panic
>> >> problem. It seems that we cannot track early page allocations in current
>> >> kernel even if page structure has been initialized early.
>> >>
>> >> This patch move up page_ext_init() to catch early page allocations when
>> >> DEFERRED_STRUCT_PAGE_INIT is n. After this patch, we only need to turn
>> >> DEFERRED_STRUCT_PAGE_INIT to n then we are able to analyze the early page
>> >> allocations. This is useful especially when we find that the free memory
>> >> value is not the same right after different kernel booting.
>> >
>> >is this actually useful in practice? I mean who is going to disable
>> >DEFERRED_STRUCT_PAGE_INIT and recompile the kernel for debugging early
>> >allocations?
>>
>> Yes it is useful. We use this method to catch the difference of early
>> page allocations between two kernel.
>
>I was not questioning the functionality itself but the way how it is
>achieved. Recompiling the kernel to achieve debuggability has proven to
>be really a bad approach historically. Most people are using
>pre-compiled kernels these days.
>
>> > I do see how debugging those early allocations might be useful but that
>> > would require a boot time option to be practical IMHO. Would it make
>> > sense to add a early_page_ext parameter which would essentially disable
>> > the deferred ipage initialization. That should be quite trivial to
>> > achieve (just hook into defer_init AFAICS).
>>
>> It is a good idea. A cmdline parameter is a flexible and dynamic method for
>> us to decide whether to defer page's and page_ext's initilization. For
>> comparison, this patch provides a static method to decide whether to defer
>> page's and page_ext's initilization. They are not conflicting. My next
>> work is trying to achieve your idea.
>
>They are not conflicting but this patch adds ifdefs and additional code
>that needs compile time testing with different options set. I.e. it adds
>maintenance burden for something that can be achieved by better means.
>So if you are ok to work on the runtime knob then I would propose to
>drop this patch from the mm tree and replace it by a trivial patch to
>allow early boot debugging by a cmd line parameter.

Yes you are right. Recompiling the kernel is not a clever method. I will send
another patch with a cmd line parameter 'early_page_ext' to achieve this idea.
Thanks for your advice.

2022-08-24 03:59:02

by lizhe.67

[permalink] [raw]
Subject: Re: [PATCH] page_ext: move up page_ext_init() to catch early page allocation if DEFERRED_STRUCT_PAGE_INIT is n'

On 2022-08-22 7:00 UTC, [email protected] wrote:
>> On 2022-08-18 7:36 UTC, [email protected] wrote:
>>>> From: Li Zhe <[email protected]>
>>>>
>>>> In 'commit 2f1ee0913ce5 ("Revert "mm: use early_pfn_to_nid in page_ext_init"")',
>>>> we call page_ext_init() after page_alloc_init_late() to avoid some panic
>>>> problem. It seems that we cannot track early page allocations in current
>>>> kernel even if page structure has been initialized early.
>>>>
>>>> This patch move up page_ext_init() to catch early page allocations when
>>>> DEFERRED_STRUCT_PAGE_INIT is n. After this patch, we only need to turn
>>>> DEFERRED_STRUCT_PAGE_INIT to n then we are able to analyze the early page
>>>> allocations. This is useful especially when we find that the free memory
>>>> value is not the same right after different kernel booting.
>>>
>>>is this actually useful in practice? I mean who is going to disable
>>>DEFERRED_STRUCT_PAGE_INIT and recompile the kernel for debugging early
>>>allocations?
>>
>> Yes it is useful. We use this method to catch the difference of early
>> page allocations between two kernel.
>>
>>> I do see how debugging those early allocations might be useful but that
>>> would require a boot time option to be practical IMHO. Would it make
>>> sense to add a early_page_ext parameter which would essentially disable
>>> the deferred ipage initialization. That should be quite trivial to
>>> achieve (just hook into defer_init AFAICS).
>>
>> It is a good idea. A cmdline parameter is a flexible and dynamic method for
>> us to decide whether to defer page's and page_ext's initilization. For
>> comparison, this patch provides a static method to decide whether to defer
>> page's and page_ext's initilization. They are not conflicting. My next
>> work is trying to achieve your idea.
>
>As we already have to pass page_owner=on parameter to enable the page
>allocation tracking in the first place, maybe that alone could also disable
>deffered init, and no need for another parameter?

In my opinion, adding a new parameter is better. Page owner is not the only
feature attached to page_ext. For scalability reasons, adding a new parameter
is a more flexible method. Thanks for your advice.