2015-11-30 14:37:42

by Sasha Levin

Subject: mm: kernel BUG at mm/huge_memory.c:3272!

Hi Kirill,

I've hit the following while fuzzing with trinity on the latest -next kernel:

[ 321.348184] page:ffffea0011a20080 count:1 mapcount:1 mapping:ffff8802d745f601 index:0x1802
[ 321.350607] flags: 0x320035c00040078(uptodate|dirty|lru|active|swapbacked)
[ 321.453706] page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
[ 321.455353] page->mem_cgroup:ffff880286620000
[ 321.456482] ------------[ cut here ]------------
[ 321.457158] kernel BUG at mm/huge_memory.c:3272!
[ 321.457811] invalid opcode: 0000 [#1] PREEMPT SMP KASAN
[ 321.458598] Modules linked in:
[ 321.459057] CPU: 18 PID: 24106 Comm: trinity-c129 Not tainted 4.4.0-rc2-next-20151127-sasha-00012-gf0498ca-dirty #2661
[ 321.460516] task: ffff880042fd2000 ti: ffff8800428c0000 task.ti: ffff8800428c0000
[ 321.461732] RIP: split_huge_page_to_list (mm/huge_memory.c:3272 (discriminator 1))
[ 321.464004] RSP: 0000:ffff8800428c71d0 EFLAGS: 00010246
[ 321.464733] RAX: ffff880042fd2000 RBX: ffffea0011a20080 RCX: 0000000000000000
[ 321.465735] RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffffed0008518e1f
[ 321.466719] RBP: ffff8800428c72b0 R08: fffffbfff4f9eaf1 R09: ffffffffa7cf578f
[ 321.467704] R10: ffffed0105fe6293 R11: 1ffffffff4f9eaed R12: ffffea0011a20060
[ 321.468702] R13: ffffea0011a200a0 R14: ffffea0011a20080 R15: ffff8800428c7300
[ 321.469718] FS: 00007f9d611bb700(0000) GS:ffff880686800000(0000) knlGS:0000000000000000
[ 321.470807] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 321.471608] CR2: 0000000001b54fe8 CR3: 0000000042869000 CR4: 00000000000006a0
[ 321.472633] DR0: 00007f9d5cb76000 DR1: 0000000000000000 DR2: 0000000000000000
[ 321.473612] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 321.474619] Stack:
[ 321.474935] ffff8800428c7300 ffff8800428c72b0 ffffffff9950f2c8 dffffc0000000000
[ 321.476071] 0000000041b58ab3 ffffffffa4871335 ffffffff9950f0c0 dffffc0000000000
[ 321.477184] ffffea0011a20000 0000000000000000 0000000000000000 0000000000000001
[ 321.478297] Call Trace:
[ 321.481234] deferred_split_scan (mm/huge_memory.c:3392)
[ 321.484688] shrink_slab (mm/vmscan.c:354 mm/vmscan.c:446)
[ 321.488008] shrink_zone (mm/vmscan.c:2449)
[ 321.493105] do_try_to_free_pages (mm/vmscan.c:2600 mm/vmscan.c:2650)
[ 321.496657] try_to_free_pages (mm/vmscan.c:2858)
[ 321.498346] __alloc_pages_nodemask (mm/page_alloc.c:2878 mm/page_alloc.c:2896 mm/page_alloc.c:3149 mm/page_alloc.c:3260)
[ 321.508819] alloc_pages_vma (mm/mempolicy.c:2042)
[ 321.509629] wp_page_copy.isra.41 (mm/memory.c:2064)
[ 321.512347] do_wp_page (mm/memory.c:2339)
[ 321.518569] handle_mm_fault (mm/memory.c:3302 mm/memory.c:3396 mm/memory.c:3425)
[ 321.527500] __do_page_fault (arch/x86/mm/fault.c:1239)
[ 321.528411] do_page_fault (arch/x86/mm/fault.c:1301 include/linux/context_tracking_state.h:30 include/linux/context_tracking.h:50 arch/x86/mm/fault.c:1302)
[ 321.530053] do_async_page_fault (./arch/x86/include/asm/traps.h:82 arch/x86/kernel/kvm.c:264)
[ 321.532125] async_page_fault (arch/x86/entry/entry_64.S:989)
[ 321.533057] Code: ea 03 80 3c 02 00 74 08 48 89 df e8 58 4d fe ff 48 8b 03 a8 01 75 16 e8 7c 51 fe ff 48 c7 c6 80 3c 4f a2 4c 89 f7 e8 2d 84 f5 ff <0f> 0b e8 66 51 fe ff 48 8b 55 c8 48 b8 00 00 00 00 00 fc ff df
All code
========
0: ea (bad)
1: 03 80 3c 02 00 74 add 0x7400023c(%rax),%eax
7: 08 48 89 or %cl,-0x77(%rax)
a: df e8 fucomip %st(0),%st
c: 58 pop %rax
d: 4d fe rex.WRB (bad)
f: ff 48 8b decl -0x75(%rax)
12: 03 a8 01 75 16 e8 add -0x17e98aff(%rax),%ebp
18: 7c 51 jl 0x6b
1a: fe (bad)
1b: ff 48 c7 decl -0x39(%rax)
1e: c6 80 3c 4f a2 4c 89 movb $0x89,0x4ca24f3c(%rax)
25: f7 e8 imul %eax
27: 2d 84 f5 ff 0f sub $0xffff584,%eax
2c: 0b e8 or %eax,%ebp
2e: 66 51 push %cx
30: fe (bad)
31: ff 48 8b decl -0x75(%rax)
34: 55 push %rbp
35:* c8 48 b8 00 enterq $0xb848,$0x0 <-- trapping instruction
39: 00 00 add %al,(%rax)
3b: 00 00 add %al,(%rax)
3d: fc cld
3e: ff df lcallq *<internal disassembler error>
...

Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: e8 66 51 fe ff callq 0xfffffffffffe516d
7: 48 8b 55 c8 mov -0x38(%rbp),%rdx
b: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
12: fc ff df
...
[ 321.537072] RIP split_huge_page_to_list (mm/huge_memory.c:3272 (discriminator 1))
[ 321.537942] RSP <ffff8800428c71d0>


2015-12-01 21:26:59

by Kirill A. Shutemov

Subject: Re: mm: kernel BUG at mm/huge_memory.c:3272!

On Mon, Nov 30, 2015 at 09:37:33AM -0500, Sasha Levin wrote:
> Hi Kirill,
>
> I've hit the following while fuzzing with trinity on the latest -next kernel:
>
> [ 321.348184] page:ffffea0011a20080 count:1 mapcount:1 mapping:ffff8802d745f601 index:0x1802
> [ 321.350607] flags: 0x320035c00040078(uptodate|dirty|lru|active|swapbacked)
> [ 321.453706] page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
> [ 321.455353] page->mem_cgroup:ffff880286620000

I think this should help:

From aadc911f047b094c68b350550556dafabf05af13 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <[email protected]>
Date: Fri, 20 Nov 2015 12:20:00 +0200
Subject: [PATCH] thp: fix split_huge_page vs. deferred_split_scan race

Minchan[1] and Sasha[2] reported a crash in split_huge_page_to_list(),
called from deferred_split_scan(), due to VM_BUG_ON_PAGE(!PageLocked(page)).

This can happen because of a race between deferred_split_scan() and
split_huge_page(). As a result of the race, the page can be split
under deferred_split_scan().

The patch prevents this by taking split_queue_lock in
split_huge_page_to_list() when we check if the page can be split.
If the page is suitable for splitting, we remove the page from the split
queue under the same lock, before splitting starts.

[1] http://lkml.kernel.org/g/20151117073539.GB32578@bbox
[2] http://lkml.kernel.org/g/[email protected]

Signed-off-by: Kirill A. Shutemov <[email protected]>
Reported-by: Minchan Kim <[email protected]>
Reported-by: Sasha Levin <[email protected]>
---
mm/huge_memory.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index dc2b947d4f85..7c0ad4d9110b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3186,13 +3186,6 @@ static void __split_huge_page(struct page *page, struct list_head *list)
 	spin_lock_irq(&zone->lru_lock);
 	lruvec = mem_cgroup_page_lruvec(head, zone);
 
-	spin_lock(&split_queue_lock);
-	if (!list_empty(page_deferred_list(head))) {
-		split_queue_len--;
-		list_del(page_deferred_list(head));
-	}
-	spin_unlock(&split_queue_lock);
-
 	/* complete memcg works before add pages to LRU */
 	mem_cgroup_split_huge_fixup(head);
 
@@ -3299,12 +3292,20 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 	freeze_page(anon_vma, head);
 	VM_BUG_ON_PAGE(compound_mapcount(head), head);
 
+	/* Prevent deferred_split_scan() touching ->_count */
+	spin_lock(&split_queue_lock);
 	count = page_count(head);
 	mapcount = total_mapcount(head);
 	if (mapcount == count - 1) {
+		if (!list_empty(page_deferred_list(head))) {
+			split_queue_len--;
+			list_del(page_deferred_list(head));
+		}
+		spin_unlock(&split_queue_lock);
 		__split_huge_page(page, list);
 		ret = 0;
 	} else if (IS_ENABLED(CONFIG_DEBUG_VM) && mapcount > count - 1) {
+		spin_unlock(&split_queue_lock);
 		pr_alert("total_mapcount: %u, page_count(): %u\n",
 				mapcount, count);
 		if (PageTail(page))
@@ -3312,6 +3313,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 		dump_page(page, "total_mapcount(head) > page_count(head) - 1");
 		BUG();
 	} else {
+		spin_unlock(&split_queue_lock);
 		unfreeze_page(anon_vma, head);
 		ret = -EBUSY;
 	}
--
2.6.2

--
Kirill A. Shutemov
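
For readers following the locking change, here is a minimal userspace
sketch of the idea stated in the patch description: the "can this page
be split?" check and the removal from the split queue happen under one
lock, so a concurrent scanner cannot take a reference in between. This
is not kernel code. struct fake_page, enqueue_page(), scanner_pin(),
scanner_unpin() and try_claim_for_split() are hypothetical names, a
pthread mutex stands in for the spinlock, and the scanner behaviour
(taking a reference on queued pages while holding the lock) is an
assumption inferred from the patch comment about ->_count.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a huge page; only what the sketch needs. */
struct fake_page {
	int count;	/* models page_count(head) */
	int mapcount;	/* models total_mapcount(head) */
	bool queued;	/* models !list_empty(page_deferred_list(head)) */
};

/* A mutex stands in for the kernel spinlock of the same name. */
static pthread_mutex_t split_queue_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long split_queue_len;

/* Put a page on the (modelled) deferred split queue. */
void enqueue_page(struct fake_page *p)
{
	pthread_mutex_lock(&split_queue_lock);
	if (!p->queued) {
		p->queued = true;
		split_queue_len++;
	}
	pthread_mutex_unlock(&split_queue_lock);
}

/* Assumed scanner side: pin a page only while it is still queued. */
bool scanner_pin(struct fake_page *p)
{
	bool pinned = false;

	pthread_mutex_lock(&split_queue_lock);
	if (p->queued) {
		p->count++;
		pinned = true;
	}
	pthread_mutex_unlock(&split_queue_lock);
	return pinned;
}

/* Drop the reference taken by scanner_pin(). */
void scanner_unpin(struct fake_page *p)
{
	pthread_mutex_lock(&split_queue_lock);
	p->count--;
	pthread_mutex_unlock(&split_queue_lock);
}

/*
 * Splitter side: check the counts and dequeue in one critical section.
 * Returns 0 if the split may proceed, -EBUSY otherwise.
 */
int try_claim_for_split(struct fake_page *p)
{
	int ret = -EBUSY;

	pthread_mutex_lock(&split_queue_lock);
	if (p->mapcount == p->count - 1) {
		if (p->queued) {
			assert(split_queue_len > 0);
			p->queued = false;
			split_queue_len--;
		}
		ret = 0;
	}
	pthread_mutex_unlock(&split_queue_lock);
	return ret;
}
```

In this model, once try_claim_for_split() returns 0 the page is no
longer queued, so a later scanner_pin() finds nothing to pin. If the
scanner got there first, the extra reference makes the count check fail
and the splitter backs off with -EBUSY.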

2015-12-01 23:41:33

by Minchan Kim

Subject: Re: mm: kernel BUG at mm/huge_memory.c:3272!

On Tue, Dec 01, 2015 at 11:26:36PM +0200, Kirill A. Shutemov wrote:
> On Mon, Nov 30, 2015 at 09:37:33AM -0500, Sasha Levin wrote:
> > Hi Kirill,
> >
> > I've hit the following while fuzzing with trinity on the latest -next kernel:
> >
> > [ 321.348184] page:ffffea0011a20080 count:1 mapcount:1 mapping:ffff8802d745f601 index:0x1802
> > [ 321.350607] flags: 0x320035c00040078(uptodate|dirty|lru|active|swapbacked)
> > [ 321.453706] page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
> > [ 321.455353] page->mem_cgroup:ffff880286620000
>
> I think this should help:
>
> From aadc911f047b094c68b350550556dafabf05af13 Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <[email protected]>
> Date: Fri, 20 Nov 2015 12:20:00 +0200
> Subject: [PATCH] thp: fix split_huge_page vs. deferred_split_scan race
>
> Minchan[1] and Sasha[2] reported a crash in split_huge_page_to_list(),
> called from deferred_split_scan(), due to VM_BUG_ON_PAGE(!PageLocked(page)).
>
> This can happen because of a race between deferred_split_scan() and
> split_huge_page(). As a result of the race, the page can be split
> under deferred_split_scan().
>
> The patch prevents this by taking split_queue_lock in
> split_huge_page_to_list() when we check if the page can be split.
> If the page is suitable for splitting, we remove the page from the split
> queue under the same lock, before splitting starts.
>
> [1] http://lkml.kernel.org/g/20151117073539.GB32578@bbox
> [2] http://lkml.kernel.org/g/[email protected]
>
> Signed-off-by: Kirill A. Shutemov <[email protected]>
> Reported-by: Minchan Kim <[email protected]>
> Reported-by: Sasha Levin <[email protected]>

With this, I cannot reproduce the error.