When start_isolate_page_range() returns -EBUSY in __offline_pages(), the
error path calls memory_notify(MEM_CANCEL_OFFLINE, &arg) with an
uninitialized "arg", which triggers the warning below. Also, it is only
necessary to notify MEM_CANCEL_OFFLINE after MEM_GOING_OFFLINE.
page:ffffea0001200000 count:1 mapcount:0 mapping:0000000000000000
index:0x0
flags: 0x3fffe000001000(reserved)
raw: 003fffe000001000 ffffea0001200008 ffffea0001200008 0000000000000000
raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: unmovable page
WARNING: CPU: 25 PID: 1665 at mm/kasan/common.c:665
kasan_mem_notifier+0x34/0x23b
CPU: 25 PID: 1665 Comm: bash Tainted: G W 5.0.0+ #94
Hardware name: HP ProLiant DL180 Gen9/ProLiant DL180 Gen9, BIOS U20
10/25/2017
RIP: 0010:kasan_mem_notifier+0x34/0x23b
RSP: 0018:ffff8883ec737890 EFLAGS: 00010206
RAX: 0000000000000246 RBX: ff10f0f4435f1000 RCX: f887a7a21af88000
RDX: dffffc0000000000 RSI: 0000000000000020 RDI: ffff8881f221af88
RBP: ffff8883ec737898 R08: ffff888000000000 R09: ffffffffb0bddcd0
R10: ffffed103e857088 R11: ffff8881f42b8443 R12: dffffc0000000000
R13: 00000000fffffff9 R14: dffffc0000000000 R15: 0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000560fbd31d730 CR3: 00000004049c6003 CR4: 00000000001606a0
Call Trace:
notifier_call_chain+0xbf/0x130
__blocking_notifier_call_chain+0x76/0xc0
blocking_notifier_call_chain+0x16/0x20
memory_notify+0x1b/0x20
__offline_pages+0x3e2/0x1210
offline_pages+0x11/0x20
memory_block_action+0x144/0x300
memory_subsys_offline+0xe5/0x170
device_offline+0x13f/0x1e0
state_store+0xeb/0x110
dev_attr_store+0x3f/0x70
sysfs_kf_write+0x104/0x150
kernfs_fop_write+0x25c/0x410
__vfs_write+0x66/0x120
vfs_write+0x15a/0x4f0
ksys_write+0xd2/0x1b0
__x64_sys_write+0x73/0xb0
do_syscall_64+0xeb/0xb78
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f14f75cc3b8
RSP: 002b:00007ffe84d01d68 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007f14f75cc3b8
RDX: 0000000000000008 RSI: 0000563f8e433d70 RDI: 0000000000000001
RBP: 0000563f8e433d70 R08: 000000000000000a R09: 00007ffe84d018f0
R10: 000000000000000a R11: 0000000000000246 R12: 00007f14f789e780
R13: 0000000000000008 R14: 00007f14f7899740 R15: 0000000000000008
Fixes: 7960509329c2 ("mm, memory_hotplug: print reason for the offlining failure")
CC: [email protected] # 5.0.x
Reviewed-by: Oscar Salvador <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Signed-off-by: Qian Cai <[email protected]>
---
mm/memory_hotplug.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 0e0a16021fd5..0082d699be94 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1699,12 +1699,12 @@ static int __ref __offline_pages(unsigned long start_pfn,
 failed_removal_isolated:
 	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
+	memory_notify(MEM_CANCEL_OFFLINE, &arg);
 failed_removal:
 	pr_debug("memory offlining [mem %#010llx-%#010llx] failed due to %s\n",
 		 (unsigned long long) start_pfn << PAGE_SHIFT,
 		 ((unsigned long long) end_pfn << PAGE_SHIFT) - 1,
 		 reason);
-	memory_notify(MEM_CANCEL_OFFLINE, &arg);
 	/* pushback to free area */
 	mem_hotplug_done();
 	return ret;
--
2.17.2 (Apple Git-113)
On Thu, Mar 21, 2019 at 2:13 AM Qian Cai <[email protected]> wrote:
>
> When start_isolate_page_range() returned -EBUSY in __offline_pages(), it
> calls memory_notify(MEM_CANCEL_OFFLINE, &arg) with an uninitialized
> "arg". As the result, it triggers warnings below. Also, it is only
> necessary to notify MEM_CANCEL_OFFLINE after MEM_GOING_OFFLINE.
For my own clarification: if test_pages_in_a_zone() fails in __offline_pages(),
we have a similar scenario as well. If so, do we need to capture it
in the changelog?
On Fri 22-03-19 12:20:12, Souptick Joarder wrote:
> On Thu, Mar 21, 2019 at 2:13 AM Qian Cai <[email protected]> wrote:
> >
> > When start_isolate_page_range() returned -EBUSY in __offline_pages(), it
> > calls memory_notify(MEM_CANCEL_OFFLINE, &arg) with an uninitialized
> > "arg". As the result, it triggers warnings below. Also, it is only
> > necessary to notify MEM_CANCEL_OFFLINE after MEM_GOING_OFFLINE.
>
> For my own clarification: if test_pages_in_a_zone() fails in __offline_pages(),
> we have a similar scenario as well. If so, do we need to capture it
> in the changelog?
Yes, this is the same situation. We can add a note that the same applies
to the test_pages_in_a_zone() failure path, but I do not think it is
strictly necessary. Thanks for the note anyway.
--
Michal Hocko
SUSE Labs
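
For illustration only, the label ordering discussed in this thread can be
modelled in user space. The sketch below is not the kernel code: the names
offline_model(), memory_notify_model(), struct notify_arg, enum fail_at and
the counters are hypothetical, and the error code used for the notifier
failure is an assumption. It only shows that, with the notification placed
under failed_removal_isolated, both early failures (test_pages_in_a_zone()
and start_isolate_page_range()) skip MEM_CANCEL_OFFLINE, while a failure
after MEM_GOING_OFFLINE still sends it with a filled-in "arg".

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-ins for the two notifier events named in the patch. */
enum mem_event { MEM_GOING_OFFLINE, MEM_CANCEL_OFFLINE };

/* Hypothetical selector for the step that fails in the model. */
enum fail_at { FAIL_NONE, FAIL_ZONE_TEST, FAIL_ISOLATE, FAIL_NOTIFIER };

/* Simplified model of the notifier argument; "initialized" is an
 * illustration-only flag, not a field of the kernel structure. */
struct notify_arg {
	unsigned long start_pfn;
	unsigned long nr_pages;
	int initialized;
};

/* Counters recording what the last offline_model() call sent. */
int going_sent, cancel_sent, cancel_uninit;

void memory_notify_model(enum mem_event ev, const struct notify_arg *arg)
{
	assert(arg != NULL);
	if (ev == MEM_GOING_OFFLINE) {
		going_sent++;
	} else {
		cancel_sent++;
		if (!arg->initialized)
			cancel_uninit++;	/* the situation the patch avoids */
	}
}

/* Mirrors the goto structure after the patch: cancel is sent only from
 * failed_removal_isolated, never from failed_removal. */
int offline_model(enum fail_at fail)
{
	struct notify_arg arg;
	int ret = 0;

	memset(&arg, 0, sizeof(arg));	/* initialized == 0: not filled in yet */
	going_sent = cancel_sent = cancel_uninit = 0;

	if (fail == FAIL_ZONE_TEST) {	/* models test_pages_in_a_zone() failing */
		ret = -EINVAL;
		goto failed_removal;
	}
	if (fail == FAIL_ISOLATE) {	/* models start_isolate_page_range() failing */
		ret = -EBUSY;
		goto failed_removal;
	}

	/* Only now is "arg" filled in and MEM_GOING_OFFLINE sent. */
	arg.start_pfn = 0x48000;
	arg.nr_pages = 512;
	arg.initialized = 1;
	memory_notify_model(MEM_GOING_OFFLINE, &arg);

	if (fail == FAIL_NOTIFIER) {	/* assumed error code for this model */
		ret = -ENOMEM;
		goto failed_removal_isolated;
	}
	return 0;

failed_removal_isolated:
	memory_notify_model(MEM_CANCEL_OFFLINE, &arg);
failed_removal:
	return ret;
}
```

Calling offline_model(FAIL_ISOLATE) or offline_model(FAIL_ZONE_TEST) leaves
cancel_sent at 0, whereas offline_model(FAIL_NOTIFIER) leaves going_sent and
cancel_sent at 1 with cancel_uninit at 0.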