2019-01-28 14:46:21

by Michal Hocko

[permalink] [raw]
Subject: [PATCH 0/2] mm, memory_hotplug: fix uninitialized pages fallouts.

Hi,
Mikhail has posted fixes for the two bugs quite some time ago [1]. I
have pushed back on those fixes because I believed that it is much
better to plug the problem at the initialization time rather than play
whack-a-mole all over the hotplug code and find all the places which
expect the full memory section to be initialized. We have ended up with
2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full
memory section") merged and cause a regression [2][3]. The reason is
that there might be memory layouts when two NUMA nodes share the same
memory section so the merged fix is simply incorrect.

In order to plug this hole we really have to be zone range aware in
those handlers. I have split up the original patch into two. One is
unchanged (patch 2) and I took a different approach for `removable'
crash. It would be great if Mikhail could test it still works for his
memory layout.

[1] http://lkml.kernel.org/r/[email protected]
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
[3] http://lkml.kernel.org/r/[email protected]




2019-01-28 14:45:52

by Michal Hocko

[permalink] [raw]
Subject: [PATCH 1/2] mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone

From: Michal Hocko <[email protected]>

Mikhail has reported the following VM_BUG_ON triggered when reading
sysfs removable state of a memory block:
page:000003d082008000 is uninitialized and poisoned
page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
Call Trace:
([<0000000000385b26>] test_pages_in_a_zone+0xde/0x160)
[<00000000008f15c4>] show_valid_zones+0x5c/0x190
[<00000000008cf9c4>] dev_attr_show+0x34/0x70
[<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
[<00000000003e4194>] seq_read+0x204/0x480
[<00000000003b53ea>] __vfs_read+0x32/0x178
[<00000000003b55b2>] vfs_read+0x82/0x138
[<00000000003b5be2>] ksys_read+0x5a/0xb0
[<0000000000b86ba0>] system_call+0xdc/0x2d8
Last Breaking-Event-Address:
[<0000000000385b26>] test_pages_in_a_zone+0xde/0x160
Kernel panic - not syncing: Fatal exception: panic_on_oops

The reason is that the memory block spans the zone boundary and we are
stumbling over an unitialized struct page. Fix this by enforcing zone
range in is_mem_section_removable so that we never run away from a
zone.

Reported-by: Mikhail Zaslonko <[email protected]>
Debugged-by: Mikhail Zaslonko <[email protected]>
Signed-off-by: Michal Hocko <[email protected]>
---
mm/memory_hotplug.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index b9a667d36c55..07872789d778 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1233,7 +1233,8 @@ static bool is_pageblock_removable_nolock(struct page *page)
bool is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
{
struct page *page = pfn_to_page(start_pfn);
- struct page *end_page = page + nr_pages;
+ unsigned long end_pfn = min(start_pfn + nr_pages, zone_end_pfn(page_zone(page)));
+ struct page *end_page = pfn_to_page(end_pfn);

/* Check the starting page of each pageblock within the range */
for (; page < end_page; page = next_active_pageblock(page)) {
--
2.20.1


2019-01-28 14:46:39

by Michal Hocko

[permalink] [raw]
Subject: [PATCH 2/2] mm, memory_hotplug: test_pages_in_a_zone do not pass the end of zone

From: Mikhail Zaslonko <[email protected]>

If memory end is not aligned with the sparse memory section boundary, the
mapping of such a section is only partly initialized. This may lead to
VM_BUG_ON due to uninitialized struct pages access from test_pages_in_a_zone()
function triggered by memory_hotplug sysfs handlers.

Here are the the panic examples:
CONFIG_DEBUG_VM_PGFLAGS=y
kernel parameter mem=2050M
--------------------------
page:000003d082008000 is uninitialized and poisoned
page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
Call Trace:
([<0000000000385b26>] test_pages_in_a_zone+0xde/0x160)
[<00000000008f15c4>] show_valid_zones+0x5c/0x190
[<00000000008cf9c4>] dev_attr_show+0x34/0x70
[<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
[<00000000003e4194>] seq_read+0x204/0x480
[<00000000003b53ea>] __vfs_read+0x32/0x178
[<00000000003b55b2>] vfs_read+0x82/0x138
[<00000000003b5be2>] ksys_read+0x5a/0xb0
[<0000000000b86ba0>] system_call+0xdc/0x2d8
Last Breaking-Event-Address:
[<0000000000385b26>] test_pages_in_a_zone+0xde/0x160
Kernel panic - not syncing: Fatal exception: panic_on_oops

Fix this by checking whether the pfn to check is within the zone.

[[email protected]: separated this change from
http://lkml.kernel.org/r/[email protected]]
Signed-off-by: Mikhail Zaslonko <[email protected]>
Signed-off-by: Michal Hocko <[email protected]>
---
mm/memory_hotplug.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 07872789d778..7711d0e327b6 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1274,6 +1274,9 @@ int test_pages_in_a_zone(unsigned long start_pfn, unsigned long end_pfn,
i++;
if (i == MAX_ORDER_NR_PAGES || pfn + i >= end_pfn)
continue;
+ /* Check if we got outside of the zone */
+ if (zone && !zone_spans_pfn(zone, pfn + i))
+ return 0;
page = pfn_to_page(pfn + i);
if (zone && page_zone(page) != zone)
return 0;
--
2.20.1


2019-01-28 17:51:19

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 0/2] mm, memory_hotplug: fix uninitialized pages fallouts.

On Mon, 28 Jan 2019 15:45:04 +0100 Michal Hocko <[email protected]> wrote:

> Mikhail has posted fixes for the two bugs quite some time ago [1]. I
> have pushed back on those fixes because I believed that it is much
> better to plug the problem at the initialization time rather than play
> whack-a-mole all over the hotplug code and find all the places which
> expect the full memory section to be initialized. We have ended up with
> 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full
> memory section") merged and cause a regression [2][3]. The reason is
> that there might be memory layouts when two NUMA nodes share the same
> memory section so the merged fix is simply incorrect.
>
> In order to plug this hole we really have to be zone range aware in
> those handlers. I have split up the original patch into two. One is
> unchanged (patch 2) and I took a different approach for `removable'
> crash. It would be great if Mikhail could test it still works for his
> memory layout.
>
> [1] http://lkml.kernel.org/r/[email protected]
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
> [3] http://lkml.kernel.org/r/[email protected]

Any thoughts on which kernel version(s) need these patches?

2019-01-28 18:42:16

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH 0/2] mm, memory_hotplug: fix uninitialized pages fallouts.

On Mon 28-01-19 09:50:54, Andrew Morton wrote:
> On Mon, 28 Jan 2019 15:45:04 +0100 Michal Hocko <[email protected]> wrote:
>
> > Mikhail has posted fixes for the two bugs quite some time ago [1]. I
> > have pushed back on those fixes because I believed that it is much
> > better to plug the problem at the initialization time rather than play
> > whack-a-mole all over the hotplug code and find all the places which
> > expect the full memory section to be initialized. We have ended up with
> > 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full
> > memory section") merged and cause a regression [2][3]. The reason is
> > that there might be memory layouts when two NUMA nodes share the same
> > memory section so the merged fix is simply incorrect.
> >
> > In order to plug this hole we really have to be zone range aware in
> > those handlers. I have split up the original patch into two. One is
> > unchanged (patch 2) and I took a different approach for `removable'
> > crash. It would be great if Mikhail could test it still works for his
> > memory layout.
> >
> > [1] http://lkml.kernel.org/r/[email protected]
> > [2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
> > [3] http://lkml.kernel.org/r/[email protected]
>
> Any thoughts on which kernel version(s) need these patches?

My remark in 2830bf6f05fb still holds
: This has alwways been problem AFAIU. It just went unnoticed because we
: have zeroed memmaps during allocation before f7f99100d8d9 ("mm: stop
: zeroing memory during allocation in vmemmap") and so the above test
: would simply skip these ranges as belonging to zone 0 or provided a
: garbage.
:
: So I guess we do care for post f7f99100d8d9 kernels mostly and
: therefore Fixes: f7f99100d8d9 ("mm: stop zeroing memory during
: allocation in vmemmap")

But, please let's wait for the patch 1 to be confirmed to fix the issue.
--
Michal Hocko
SUSE Labs

2019-01-28 18:47:27

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH 0/2] mm, memory_hotplug: fix uninitialized pages fallouts.

On Mon 28-01-19 19:41:39, Michal Hocko wrote:
> On Mon 28-01-19 09:50:54, Andrew Morton wrote:
> > On Mon, 28 Jan 2019 15:45:04 +0100 Michal Hocko <[email protected]> wrote:
> >
> > > Mikhail has posted fixes for the two bugs quite some time ago [1]. I
> > > have pushed back on those fixes because I believed that it is much
> > > better to plug the problem at the initialization time rather than play
> > > whack-a-mole all over the hotplug code and find all the places which
> > > expect the full memory section to be initialized. We have ended up with
> > > 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full
> > > memory section") merged and cause a regression [2][3]. The reason is
> > > that there might be memory layouts when two NUMA nodes share the same
> > > memory section so the merged fix is simply incorrect.
> > >
> > > In order to plug this hole we really have to be zone range aware in
> > > those handlers. I have split up the original patch into two. One is
> > > unchanged (patch 2) and I took a different approach for `removable'
> > > crash. It would be great if Mikhail could test it still works for his
> > > memory layout.
> > >
> > > [1] http://lkml.kernel.org/r/[email protected]
> > > [2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
> > > [3] http://lkml.kernel.org/r/[email protected]
> >
> > Any thoughts on which kernel version(s) need these patches?
>
> My remark in 2830bf6f05fb still holds
> : This has alwways been problem AFAIU. It just went unnoticed because we
> : have zeroed memmaps during allocation before f7f99100d8d9 ("mm: stop
> : zeroing memory during allocation in vmemmap") and so the above test
> : would simply skip these ranges as belonging to zone 0 or provided a
> : garbage.
> :
> : So I guess we do care for post f7f99100d8d9 kernels mostly and
> : therefore Fixes: f7f99100d8d9 ("mm: stop zeroing memory during
> : allocation in vmemmap")
>
> But, please let's wait for the patch 1 to be confirmed to fix the issue.

Also the revert [1] should be applied first. I thought Linus would pick
it up but he hasn't done so yet.

[1] http://lkml.kernel.org/r/[email protected]
--
Michal Hocko
SUSE Labs

2019-01-29 09:08:13

by Oscar Salvador

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone

On Mon, Jan 28, 2019 at 03:45:05PM +0100, Michal Hocko wrote:
> From: Michal Hocko <[email protected]>
>
> Mikhail has reported the following VM_BUG_ON triggered when reading
> sysfs removable state of a memory block:
> page:000003d082008000 is uninitialized and poisoned
> page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> Call Trace:
> ([<0000000000385b26>] test_pages_in_a_zone+0xde/0x160)
> [<00000000008f15c4>] show_valid_zones+0x5c/0x190
> [<00000000008cf9c4>] dev_attr_show+0x34/0x70
> [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
> [<00000000003e4194>] seq_read+0x204/0x480
> [<00000000003b53ea>] __vfs_read+0x32/0x178
> [<00000000003b55b2>] vfs_read+0x82/0x138
> [<00000000003b5be2>] ksys_read+0x5a/0xb0
> [<0000000000b86ba0>] system_call+0xdc/0x2d8
> Last Breaking-Event-Address:
> [<0000000000385b26>] test_pages_in_a_zone+0xde/0x160
> Kernel panic - not syncing: Fatal exception: panic_on_oops
>
> The reason is that the memory block spans the zone boundary and we are
> stumbling over an unitialized struct page. Fix this by enforcing zone
> range in is_mem_section_removable so that we never run away from a
> zone.

Does that mean that the remaining pages(escaping from the current zone) are not tied to
any other zone? Why? Are these pages "holes" or how that came to be?

>
> Reported-by: Mikhail Zaslonko <[email protected]>
> Debugged-by: Mikhail Zaslonko <[email protected]>
> Signed-off-by: Michal Hocko <[email protected]>
> ---
> mm/memory_hotplug.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index b9a667d36c55..07872789d778 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1233,7 +1233,8 @@ static bool is_pageblock_removable_nolock(struct page *page)
> bool is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
> {
> struct page *page = pfn_to_page(start_pfn);
> - struct page *end_page = page + nr_pages;
> + unsigned long end_pfn = min(start_pfn + nr_pages, zone_end_pfn(page_zone(page)));
> + struct page *end_page = pfn_to_page(end_pfn);
>
> /* Check the starting page of each pageblock within the range */
> for (; page < end_page; page = next_active_pageblock(page)) {
> --
> 2.20.1
>

--
Oscar Salvador
SUSE L3

2019-01-29 09:10:02

by Oscar Salvador

[permalink] [raw]
Subject: Re: [PATCH 2/2] mm, memory_hotplug: test_pages_in_a_zone do not pass the end of zone

On Mon, Jan 28, 2019 at 03:45:06PM +0100, Michal Hocko wrote:
> From: Mikhail Zaslonko <[email protected]>
>
> If memory end is not aligned with the sparse memory section boundary, the
> mapping of such a section is only partly initialized. This may lead to
> VM_BUG_ON due to uninitialized struct pages access from test_pages_in_a_zone()
> function triggered by memory_hotplug sysfs handlers.
>
> Here are the the panic examples:
> CONFIG_DEBUG_VM_PGFLAGS=y
> kernel parameter mem=2050M
> --------------------------
> page:000003d082008000 is uninitialized and poisoned
> page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> Call Trace:
> ([<0000000000385b26>] test_pages_in_a_zone+0xde/0x160)
> [<00000000008f15c4>] show_valid_zones+0x5c/0x190
> [<00000000008cf9c4>] dev_attr_show+0x34/0x70
> [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
> [<00000000003e4194>] seq_read+0x204/0x480
> [<00000000003b53ea>] __vfs_read+0x32/0x178
> [<00000000003b55b2>] vfs_read+0x82/0x138
> [<00000000003b5be2>] ksys_read+0x5a/0xb0
> [<0000000000b86ba0>] system_call+0xdc/0x2d8
> Last Breaking-Event-Address:
> [<0000000000385b26>] test_pages_in_a_zone+0xde/0x160
> Kernel panic - not syncing: Fatal exception: panic_on_oops
>
> Fix this by checking whether the pfn to check is within the zone.
>
> [[email protected]: separated this change from
> http://lkml.kernel.org/r/[email protected]]
> Signed-off-by: Mikhail Zaslonko <[email protected]>
> Signed-off-by: Michal Hocko <[email protected]>

Looks good to me:

Reviewed-by: Oscar Salvador <[email protected]>

> ---
> mm/memory_hotplug.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 07872789d778..7711d0e327b6 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1274,6 +1274,9 @@ int test_pages_in_a_zone(unsigned long start_pfn, unsigned long end_pfn,
> i++;
> if (i == MAX_ORDER_NR_PAGES || pfn + i >= end_pfn)
> continue;
> + /* Check if we got outside of the zone */
> + if (zone && !zone_spans_pfn(zone, pfn + i))
> + return 0;
> page = pfn_to_page(pfn + i);

Since we are already checking if the zone spans that pfn, is it safe to get
rid of the below check? Or maybe not because we might have intersected zones?

> if (zone && page_zone(page) != zone)
> return 0;

--
Oscar Salvador
SUSE L3

2019-01-29 09:12:51

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone

On Tue 29-01-19 10:06:05, Oscar Salvador wrote:
> On Mon, Jan 28, 2019 at 03:45:05PM +0100, Michal Hocko wrote:
> > From: Michal Hocko <[email protected]>
> >
> > Mikhail has reported the following VM_BUG_ON triggered when reading
> > sysfs removable state of a memory block:
> > page:000003d082008000 is uninitialized and poisoned
> > page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> > Call Trace:
> > ([<0000000000385b26>] test_pages_in_a_zone+0xde/0x160)
> > [<00000000008f15c4>] show_valid_zones+0x5c/0x190
> > [<00000000008cf9c4>] dev_attr_show+0x34/0x70
> > [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
> > [<00000000003e4194>] seq_read+0x204/0x480
> > [<00000000003b53ea>] __vfs_read+0x32/0x178
> > [<00000000003b55b2>] vfs_read+0x82/0x138
> > [<00000000003b5be2>] ksys_read+0x5a/0xb0
> > [<0000000000b86ba0>] system_call+0xdc/0x2d8
> > Last Breaking-Event-Address:
> > [<0000000000385b26>] test_pages_in_a_zone+0xde/0x160
> > Kernel panic - not syncing: Fatal exception: panic_on_oops
> >
> > The reason is that the memory block spans the zone boundary and we are
> > stumbling over an unitialized struct page. Fix this by enforcing zone
> > range in is_mem_section_removable so that we never run away from a
> > zone.
>
> Does that mean that the remaining pages(escaping from the current zone) are not tied to
> any other zone? Why? Are these pages "holes" or how that came to be?

Yes, those pages should be unreachable because they are out of the zone.
Reasons might be various. The memory range is not mem section aligned,
or cut due to mem parameter etc.

--
Michal Hocko
SUSE Labs

2019-01-29 09:13:47

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH 2/2] mm, memory_hotplug: test_pages_in_a_zone do not pass the end of zone

On Tue 29-01-19 10:09:08, Oscar Salvador wrote:
> On Mon, Jan 28, 2019 at 03:45:06PM +0100, Michal Hocko wrote:
> > From: Mikhail Zaslonko <[email protected]>
> >
> > If memory end is not aligned with the sparse memory section boundary, the
> > mapping of such a section is only partly initialized. This may lead to
> > VM_BUG_ON due to uninitialized struct pages access from test_pages_in_a_zone()
> > function triggered by memory_hotplug sysfs handlers.
> >
> > Here are the the panic examples:
> > CONFIG_DEBUG_VM_PGFLAGS=y
> > kernel parameter mem=2050M
> > --------------------------
> > page:000003d082008000 is uninitialized and poisoned
> > page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> > Call Trace:
> > ([<0000000000385b26>] test_pages_in_a_zone+0xde/0x160)
> > [<00000000008f15c4>] show_valid_zones+0x5c/0x190
> > [<00000000008cf9c4>] dev_attr_show+0x34/0x70
> > [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
> > [<00000000003e4194>] seq_read+0x204/0x480
> > [<00000000003b53ea>] __vfs_read+0x32/0x178
> > [<00000000003b55b2>] vfs_read+0x82/0x138
> > [<00000000003b5be2>] ksys_read+0x5a/0xb0
> > [<0000000000b86ba0>] system_call+0xdc/0x2d8
> > Last Breaking-Event-Address:
> > [<0000000000385b26>] test_pages_in_a_zone+0xde/0x160
> > Kernel panic - not syncing: Fatal exception: panic_on_oops
> >
> > Fix this by checking whether the pfn to check is within the zone.
> >
> > [[email protected]: separated this change from
> > http://lkml.kernel.org/r/[email protected]]
> > Signed-off-by: Mikhail Zaslonko <[email protected]>
> > Signed-off-by: Michal Hocko <[email protected]>
>
> Looks good to me:
>
> Reviewed-by: Oscar Salvador <[email protected]>
>
> > ---
> > mm/memory_hotplug.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index 07872789d778..7711d0e327b6 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -1274,6 +1274,9 @@ int test_pages_in_a_zone(unsigned long start_pfn, unsigned long end_pfn,
> > i++;
> > if (i == MAX_ORDER_NR_PAGES || pfn + i >= end_pfn)
> > continue;
> > + /* Check if we got outside of the zone */
> > + if (zone && !zone_spans_pfn(zone, pfn + i))
> > + return 0;
> > page = pfn_to_page(pfn + i);
>
> Since we are already checking if the zone spans that pfn, is it safe to get
> rid of the below check? Or maybe not because we might have intersected zones?

Exactly. The new zone might start at the next pfn. Look for the cover
leter for such an example.

> > if (zone && page_zone(page) != zone)
> > return 0;
>
> --
> Oscar Salvador
> SUSE L3

--
Michal Hocko
SUSE Labs

2019-01-29 13:15:31

by Gerald Schaefer

[permalink] [raw]
Subject: Re: [PATCH 0/2] mm, memory_hotplug: fix uninitialized pages fallouts.

On Mon, 28 Jan 2019 15:45:04 +0100
Michal Hocko <[email protected]> wrote:

> Hi,
> Mikhail has posted fixes for the two bugs quite some time ago [1]. I
> have pushed back on those fixes because I believed that it is much
> better to plug the problem at the initialization time rather than play
> whack-a-mole all over the hotplug code and find all the places which
> expect the full memory section to be initialized. We have ended up with
> 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full
> memory section") merged and cause a regression [2][3]. The reason is
> that there might be memory layouts when two NUMA nodes share the same
> memory section so the merged fix is simply incorrect.
>
> In order to plug this hole we really have to be zone range aware in
> those handlers. I have split up the original patch into two. One is
> unchanged (patch 2) and I took a different approach for `removable'
> crash. It would be great if Mikhail could test it still works for his
> memory layout.
>
> [1] http://lkml.kernel.org/r/[email protected]
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
> [3] http://lkml.kernel.org/r/[email protected]

I verified that both patches fix the issues we had with valid_zones
(with mem=2050M) and removable (with mem=3075M).

However, the call trace in the description of your patch 1 is wrong.
You basically have the same call trace for test_pages_in_a_zone in
both patches. The "removable" patch should have the call trace for
is_mem_section_removable from Mikhails original patches:

CONFIG_DEBUG_VM_PGFLAGS=y
kernel parameter mem=3075M
--------------------------
page:000003d08300c000 is uninitialized and poisoned
page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
Call Trace:
([<000000000038596c>] is_mem_section_removable+0xb4/0x190)
[<00000000008f12fa>] show_mem_removable+0x9a/0xd8
[<00000000008cf9c4>] dev_attr_show+0x34/0x70
[<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
[<00000000003e4194>] seq_read+0x204/0x480
[<00000000003b53ea>] __vfs_read+0x32/0x178
[<00000000003b55b2>] vfs_read+0x82/0x138
[<00000000003b5be2>] ksys_read+0x5a/0xb0
[<0000000000b86ba0>] system_call+0xdc/0x2d8
Last Breaking-Event-Address:
[<000000000038596c>] is_mem_section_removable+0xb4/0x190
Kernel panic - not syncing: Fatal exception: panic_on_oops


2019-01-29 13:50:12

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH 0/2] mm, memory_hotplug: fix uninitialized pages fallouts.

On Tue 29-01-19 14:14:47, Gerald Schaefer wrote:
> On Mon, 28 Jan 2019 15:45:04 +0100
> Michal Hocko <[email protected]> wrote:
>
> > Hi,
> > Mikhail has posted fixes for the two bugs quite some time ago [1]. I
> > have pushed back on those fixes because I believed that it is much
> > better to plug the problem at the initialization time rather than play
> > whack-a-mole all over the hotplug code and find all the places which
> > expect the full memory section to be initialized. We have ended up with
> > 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full
> > memory section") merged and cause a regression [2][3]. The reason is
> > that there might be memory layouts when two NUMA nodes share the same
> > memory section so the merged fix is simply incorrect.
> >
> > In order to plug this hole we really have to be zone range aware in
> > those handlers. I have split up the original patch into two. One is
> > unchanged (patch 2) and I took a different approach for `removable'
> > crash. It would be great if Mikhail could test it still works for his
> > memory layout.
> >
> > [1] http://lkml.kernel.org/r/[email protected]
> > [2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
> > [3] http://lkml.kernel.org/r/[email protected]
>
> I verified that both patches fix the issues we had with valid_zones
> (with mem=2050M) and removable (with mem=3075M).
>
> However, the call trace in the description of your patch 1 is wrong.
> You basically have the same call trace for test_pages_in_a_zone in
> both patches. The "removable" patch should have the call trace for
> is_mem_section_removable from Mikhails original patches:

Thanks for testing. Can I use you Tested-by?

> CONFIG_DEBUG_VM_PGFLAGS=y
> kernel parameter mem=3075M
> --------------------------
> page:000003d08300c000 is uninitialized and poisoned
> page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> Call Trace:
> ([<000000000038596c>] is_mem_section_removable+0xb4/0x190)
> [<00000000008f12fa>] show_mem_removable+0x9a/0xd8
> [<00000000008cf9c4>] dev_attr_show+0x34/0x70
> [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
> [<00000000003e4194>] seq_read+0x204/0x480
> [<00000000003b53ea>] __vfs_read+0x32/0x178
> [<00000000003b55b2>] vfs_read+0x82/0x138
> [<00000000003b5be2>] ksys_read+0x5a/0xb0
> [<0000000000b86ba0>] system_call+0xdc/0x2d8
> Last Breaking-Event-Address:
> [<000000000038596c>] is_mem_section_removable+0xb4/0x190
> Kernel panic - not syncing: Fatal exception: panic_on_oops

Yeah, this is c&p mistake on my end. I will use this trace instead.
Thanks for spotting.
--
Michal Hocko
SUSE Labs

2019-01-29 14:10:46

by Gerald Schaefer

[permalink] [raw]
Subject: Re: [PATCH 0/2] mm, memory_hotplug: fix uninitialized pages fallouts.

On Tue, 29 Jan 2019 14:49:20 +0100
Michal Hocko <[email protected]> wrote:

> On Tue 29-01-19 14:14:47, Gerald Schaefer wrote:
> > On Mon, 28 Jan 2019 15:45:04 +0100
> > Michal Hocko <[email protected]> wrote:
> >
> > > Hi,
> > > Mikhail has posted fixes for the two bugs quite some time ago [1]. I
> > > have pushed back on those fixes because I believed that it is much
> > > better to plug the problem at the initialization time rather than play
> > > whack-a-mole all over the hotplug code and find all the places which
> > > expect the full memory section to be initialized. We have ended up with
> > > 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full
> > > memory section") merged and cause a regression [2][3]. The reason is
> > > that there might be memory layouts when two NUMA nodes share the same
> > > memory section so the merged fix is simply incorrect.
> > >
> > > In order to plug this hole we really have to be zone range aware in
> > > those handlers. I have split up the original patch into two. One is
> > > unchanged (patch 2) and I took a different approach for `removable'
> > > crash. It would be great if Mikhail could test it still works for his
> > > memory layout.
> > >
> > > [1] http://lkml.kernel.org/r/[email protected]
> > > [2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
> > > [3] http://lkml.kernel.org/r/[email protected]
> >
> > I verified that both patches fix the issues we had with valid_zones
> > (with mem=2050M) and removable (with mem=3075M).
> >
> > However, the call trace in the description of your patch 1 is wrong.
> > You basically have the same call trace for test_pages_in_a_zone in
> > both patches. The "removable" patch should have the call trace for
> > is_mem_section_removable from Mikhails original patches:
>
> Thanks for testing. Can I use you Tested-by?

Sure, forgot to add this:
Tested-by: Gerald Schaefer <[email protected]>

>
> > CONFIG_DEBUG_VM_PGFLAGS=y
> > kernel parameter mem=3075M
> > --------------------------
> > page:000003d08300c000 is uninitialized and poisoned
> > page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> > Call Trace:
> > ([<000000000038596c>] is_mem_section_removable+0xb4/0x190)
> > [<00000000008f12fa>] show_mem_removable+0x9a/0xd8
> > [<00000000008cf9c4>] dev_attr_show+0x34/0x70
> > [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
> > [<00000000003e4194>] seq_read+0x204/0x480
> > [<00000000003b53ea>] __vfs_read+0x32/0x178
> > [<00000000003b55b2>] vfs_read+0x82/0x138
> > [<00000000003b5be2>] ksys_read+0x5a/0xb0
> > [<0000000000b86ba0>] system_call+0xdc/0x2d8
> > Last Breaking-Event-Address:
> > [<000000000038596c>] is_mem_section_removable+0xb4/0x190
> > Kernel panic - not syncing: Fatal exception: panic_on_oops
>
> Yeah, this is c&p mistake on my end. I will use this trace instead.
> Thanks for spotting.


2019-01-29 17:41:17

by Mikhail Gavrilov

[permalink] [raw]
Subject: Re: [PATCH 0/2] mm, memory_hotplug: fix uninitialized pages fallouts.

On Tue, 29 Jan 2019 at 18:49, Michal Hocko <[email protected]> wrote:
>
> On Tue 29-01-19 14:14:47, Gerald Schaefer wrote:
> > On Mon, 28 Jan 2019 15:45:04 +0100
> > Michal Hocko <[email protected]> wrote:
> >
> > > Hi,
> > > Mikhail has posted fixes for the two bugs quite some time ago [1]. I
> > > have pushed back on those fixes because I believed that it is much
> > > better to plug the problem at the initialization time rather than play
> > > whack-a-mole all over the hotplug code and find all the places which
> > > expect the full memory section to be initialized. We have ended up with
> > > 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full
> > > memory section") merged and cause a regression [2][3]. The reason is
> > > that there might be memory layouts when two NUMA nodes share the same
> > > memory section so the merged fix is simply incorrect.
> > >
> > > In order to plug this hole we really have to be zone range aware in
> > > those handlers. I have split up the original patch into two. One is
> > > unchanged (patch 2) and I took a different approach for `removable'
> > > crash. It would be great if Mikhail could test it still works for his
> > > memory layout.
> > >
> > > [1] http://lkml.kernel.org/r/[email protected]
> > > [2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
> > > [3] http://lkml.kernel.org/r/[email protected]
> >
> > I verified that both patches fix the issues we had with valid_zones
> > (with mem=2050M) and removable (with mem=3075M).
> >
> > However, the call trace in the description of your patch 1 is wrong.
> > You basically have the same call trace for test_pages_in_a_zone in
> > both patches. The "removable" patch should have the call trace for
> > is_mem_section_removable from Mikhails original patches:
>
> Thanks for testing. Can I use you Tested-by?
>
> > CONFIG_DEBUG_VM_PGFLAGS=y
> > kernel parameter mem=3075M
> > --------------------------
> > page:000003d08300c000 is uninitialized and poisoned
> > page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> > Call Trace:
> > ([<000000000038596c>] is_mem_section_removable+0xb4/0x190)
> > [<00000000008f12fa>] show_mem_removable+0x9a/0xd8
> > [<00000000008cf9c4>] dev_attr_show+0x34/0x70
> > [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
> > [<00000000003e4194>] seq_read+0x204/0x480
> > [<00000000003b53ea>] __vfs_read+0x32/0x178
> > [<00000000003b55b2>] vfs_read+0x82/0x138
> > [<00000000003b5be2>] ksys_read+0x5a/0xb0
> > [<0000000000b86ba0>] system_call+0xdc/0x2d8
> > Last Breaking-Event-Address:
> > [<000000000038596c>] is_mem_section_removable+0xb4/0x190
> > Kernel panic - not syncing: Fatal exception: panic_on_oops
>
> Yeah, this is c&p mistake on my end. I will use this trace instead.
> Thanks for spotting.


Michal, I am late?
I am also tested these patches and can confirm that issue fixed again
with new approach.
I also attach two dmesg first when issue was reproduced and second
with applied patch (problem not reproduced).

--
Best Regards,
Mike Gavrilov.


Attachments:
dmesg_git_4aa9fc2a435a_with_bug.txt (135.76 kB)
dmesg_git_4aa9fc2a435a_after_patch.txt (128.12 kB)
Download all attachments

2019-01-29 20:25:19

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH 0/2] mm, memory_hotplug: fix uninitialized pages fallouts.

On Tue 29-01-19 22:38:19, Mikhail Gavrilov wrote:
> On Tue, 29 Jan 2019 at 18:49, Michal Hocko <[email protected]> wrote:
> >
> > On Tue 29-01-19 14:14:47, Gerald Schaefer wrote:
> > > On Mon, 28 Jan 2019 15:45:04 +0100
> > > Michal Hocko <[email protected]> wrote:
> > >
> > > > Hi,
> > > > Mikhail has posted fixes for the two bugs quite some time ago [1]. I
> > > > have pushed back on those fixes because I believed that it is much
> > > > better to plug the problem at the initialization time rather than play
> > > > whack-a-mole all over the hotplug code and find all the places which
> > > > expect the full memory section to be initialized. We have ended up with
> > > > 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full
> > > > memory section") merged and cause a regression [2][3]. The reason is
> > > > that there might be memory layouts when two NUMA nodes share the same
> > > > memory section so the merged fix is simply incorrect.
> > > >
> > > > In order to plug this hole we really have to be zone range aware in
> > > > those handlers. I have split up the original patch into two. One is
> > > > unchanged (patch 2) and I took a different approach for `removable'
> > > > crash. It would be great if Mikhail could test it still works for his
> > > > memory layout.
> > > >
> > > > [1] http://lkml.kernel.org/r/[email protected]
> > > > [2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
> > > > [3] http://lkml.kernel.org/r/[email protected]
> > >
> > > I verified that both patches fix the issues we had with valid_zones
> > > (with mem=2050M) and removable (with mem=3075M).
> > >
> > > However, the call trace in the description of your patch 1 is wrong.
> > > You basically have the same call trace for test_pages_in_a_zone in
> > > both patches. The "removable" patch should have the call trace for
> > > is_mem_section_removable from Mikhails original patches:
> >
> > Thanks for testing. Can I use you Tested-by?
> >
> > > CONFIG_DEBUG_VM_PGFLAGS=y
> > > kernel parameter mem=3075M
> > > --------------------------
> > > page:000003d08300c000 is uninitialized and poisoned
> > > page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> > > Call Trace:
> > > ([<000000000038596c>] is_mem_section_removable+0xb4/0x190)
> > > [<00000000008f12fa>] show_mem_removable+0x9a/0xd8
> > > [<00000000008cf9c4>] dev_attr_show+0x34/0x70
> > > [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
> > > [<00000000003e4194>] seq_read+0x204/0x480
> > > [<00000000003b53ea>] __vfs_read+0x32/0x178
> > > [<00000000003b55b2>] vfs_read+0x82/0x138
> > > [<00000000003b5be2>] ksys_read+0x5a/0xb0
> > > [<0000000000b86ba0>] system_call+0xdc/0x2d8
> > > Last Breaking-Event-Address:
> > > [<000000000038596c>] is_mem_section_removable+0xb4/0x190
> > > Kernel panic - not syncing: Fatal exception: panic_on_oops
> >
> > Yeah, this is c&p mistake on my end. I will use this trace instead.
> > Thanks for spotting.
>
>
> Michal, I am late?

I do not think so. I plan to repost tomorrow with the updated changelog
and gathered review and tested-by tags. Can I assume yours as well?

> I am also tested these patches and can confirm that issue fixed again
> with new approach.
> I also attach two dmesg first when issue was reproduced and second
> with applied patch (problem not reproduced).

Thanks!
--
Michal Hocko
SUSE Labs

2019-01-29 20:58:01

by Mikhail Gavrilov

[permalink] [raw]
Subject: Re: [PATCH 0/2] mm, memory_hotplug: fix uninitialized pages fallouts.

On Wed, 30 Jan 2019 at 01:24, Michal Hocko <[email protected]> wrote:
> I do not think so. I plan to repost tomorrow with the updated changelog
> and gathered review and tested-by tags. Can I assume yours as well?

Sure

--
Best Regards,
Mike Gavrilov.

2019-01-30 07:55:29

by Oscar Salvador

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone

On Tue, Jan 29, 2019 at 10:12:24AM +0100, Michal Hocko wrote:
> Yes, those pages should be unreachable because they are out of the zone.
> Reasons might be various. The memory range is not mem section aligned,
> or cut due to mem parameter etc.

I see, thanks.

Reviewed-by: Oscar Salvador <[email protected]>

--
Oscar Salvador
SUSE L3