2014-06-03 04:21:34

by Dave Jones

Subject: 3.15-rc8 mm/filemap.c:202 BUG

I'm still seeing this one from time to time, though it takes me quite a while to hit it,
despite my attempts to narrow down the set of syscalls that cause it.

kernel BUG at mm/filemap.c:202!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
CPU: 3 PID: 3013 Comm: trinity-c361 Not tainted 3.15.0-rc8+ #225
task: ffff88006c610000 ti: ffff880055960000 task.ti: ffff880055960000
RIP: 0010:[<ffffffffac158e28>] [<ffffffffac158e28>] __delete_from_page_cache+0x318/0x360
RSP: 0018:ffff880055963b90 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000003 RCX: ffff880146f68388
RDX: 000000000000022a RSI: ffffffffaca8db38 RDI: ffffffffaca62b17
RBP: ffff880055963be0 R08: 0000000000000002 R09: ffff88000613d530
R10: ffff880055963ba8 R11: ffff880007f49a40 R12: ffffea0006795880
R13: ffff880143232ad0 R14: 0000000000000000 R15: ffff880143232ad8
FS: 00007f1e40673700(0000) GS:ffff88024d180000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1e404e6000 CR3: 00000000603eb000 CR4: 00000000001407e0
DR0: 0000000001bb1000 DR1: 0000000002537000 DR2: 00000000016a5000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
Stack:
ffff880143232ae8 0000000000000000 ffff88000613d530 ffff88000613d568
0000000008828259 ffffea0006795880 ffff880143232ae8 0000000000000000
0000000000000002 0000000000000002 ffff880055963c08 ffffffffac158eae
Call Trace:
[<ffffffffac158eae>] delete_from_page_cache+0x3e/0x70
[<ffffffffac16921b>] truncate_inode_page+0x5b/0x90
[<ffffffffac174493>] shmem_undo_range+0x363/0x790
[<ffffffffac1748d4>] shmem_truncate_range+0x14/0x30
[<ffffffffac174bcf>] shmem_fallocate+0x9f/0x340
[<ffffffffac324d40>] ? timerqueue_add+0x60/0xb0
[<ffffffffac1c5ff6>] do_fallocate+0x116/0x1a0
[<ffffffffac182260>] SyS_madvise+0x3c0/0x870
[<ffffffffac346b33>] ? __this_cpu_preempt_check+0x13/0x20
[<ffffffffac74c41f>] tracesys+0xdd/0xe2
Code: ff ff 01 41 f6 c6 01 48 8b 45 c8 75 16 4c 89 30 e9 70 fe ff ff 66 0f 1f 44 00 00 0f 0b 66 0f 1f 44 00 00 0f 0b 66 0f 1f 44 00 00 <0f> 0b 66 0f 1f 44 00 00 41 54 9d e8 78 9e fd ff e9 8c fe ff ff
RIP [<ffffffffac158e28>] __delete_from_page_cache+0x318/0x360

There was also another variant of the same BUG with a slightly different stack trace.

kernel BUG at mm/filemap.c:202!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
CPU: 2 PID: 6928 Comm: trinity-c45 Not tainted 3.15.0-rc5+ #208
task: ffff88023669d0a0 ti: ffff880186146000 task.ti: ffff880186146000
RIP: 0010:[<ffffffff8415ba05>] [<ffffffff8415ba05>] __delete_from_page_cache+0x315/0x320
RSP: 0018:ffff880186147b18 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000002
RDX: 000000000000012a RSI: ffffffff84a9a83c RDI: ffffffff84a6e0c0
RBP: ffff880186147b68 R08: 0000000000000002 R09: ffff88002669e668
R10: ffff880186147b30 R11: 0000000000000000 R12: ffffea0008b067c0
R13: ffff880025355670 R14: 0000000000000000 R15: ffff880025355678
FS: 00007fc10026f740(0000) GS:ffff880244400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00002ab350f5c004 CR3: 000000018566c000 CR4: 00000000001407e0
DR0: 0000000001989000 DR1: 0000000000944000 DR2: 0000000002494000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
Stack:
ffff880025355688 ffff8800253556a0 ffff88002669e668 ffff88002669e6a0
000000008ea099ef ffffea0008b067c0 ffff880025355688 0000000000000000
0000000000000000 0000000000000002 ffff880186147b90 ffffffff8415ba4d
Call Trace:
[<ffffffff8415ba4d>] delete_from_page_cache+0x3d/0x70
[<ffffffff8416b0ab>] truncate_inode_page+0x5b/0x90
[<ffffffff84175f0b>] shmem_undo_range+0x30b/0x780
[<ffffffff84176394>] shmem_truncate_range+0x14/0x30
[<ffffffff8417647d>] shmem_evict_inode+0xcd/0x150
[<ffffffff841e4b17>] evict+0xa7/0x170
[<ffffffff841e5435>] iput+0xf5/0x180
[<ffffffff841df8a0>] dentry_kill+0x260/0x2d0
[<ffffffff841df97c>] dput+0x6c/0x110
[<ffffffff841c92a9>] __fput+0x189/0x200
[<ffffffff841c936e>] ____fput+0xe/0x10
[<ffffffff84090484>] task_work_run+0xb4/0xe0
[<ffffffff8406ee42>] do_exit+0x302/0xb80
[<ffffffff84349e13>] ? __this_cpu_preempt_check+0x13/0x20
[<ffffffff8407073c>] do_group_exit+0x4c/0xc0
[<ffffffff840707c4>] SyS_exit_group+0x14/0x20
[<ffffffff8475bf64>] tracesys+0xdd/0xe2
Code: 4c 89 30 e9 80 fe ff ff 48 8b 75 c0 4c 89 ff e8 82 8f 1c 00 84 c0 0f 85 6c fe ff ff e9 4f fe ff ff 0f 1f 44 00 00 e8 ae 95 5e 00 <0f> 0b e8 04 1c f1 ff 0f 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41


2014-06-03 09:11:12

by Konstantin Khlebnikov

Subject: Re: 3.15-rc8 mm/filemap.c:202 BUG

On Tue, Jun 3, 2014 at 8:21 AM, Dave Jones wrote:
> I'm still seeing this one from time to time, though it takes me quite a while to hit it,
> despite my attempts at trying to narrow down the set of syscalls that cause it.

This might shed some light; CONFIG_DEBUG_VM should be =y.

--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -199,7 +199,7 @@ void __delete_from_page_cache(struct page *page, void *shadow)
 	__dec_zone_page_state(page, NR_FILE_PAGES);
 	if (PageSwapBacked(page))
 		__dec_zone_page_state(page, NR_SHMEM);
-	BUG_ON(page_mapped(page));
+	VM_BUG_ON_PAGE(page_mapped(page), page);

 	/*
 	 * Some filesystems seem to re-dirty the page even after
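
For readers following along, here is a rough user-space sketch of why the one-line change above can help: BUG_ON() only traps, while VM_BUG_ON_PAGE() also dumps the state of the offending page first, but only when CONFIG_DEBUG_VM is set (otherwise the check compiles away). Everything here is a simplified mock and an assumption for illustration: the struct page fields, the one-argument dump_page(), and the BUG() that bumps a counter instead of trapping are not the kernel's definitions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: mirrors building with CONFIG_DEBUG_VM=y. Remove this
 * line and VM_BUG_ON_PAGE() below turns into a no-op. */
#define CONFIG_DEBUG_VM 1

/* Hypothetical, heavily simplified stand-in for the kernel's struct page. */
struct page {
	unsigned long flags;
	int count;	/* reference count */
	int mapcount;	/* number of page-table mappings (0 = unmapped) */
};

static int bug_hits;		/* how many times the mock BUG() fired */
static char last_dump[128];	/* last page state recorded by dump_page() */

static int page_mapped(const struct page *page)
{
	return page->mapcount > 0;
}

/* Mock of dump_page(): the kernel prints to the log, we record a string. */
static void dump_page(const struct page *page)
{
	snprintf(last_dump, sizeof(last_dump),
		 "count:%d mapcount:%d flags:%#lx",
		 page->count, page->mapcount, page->flags);
}

/* Mock: the real BUG() executes ud2 and never returns. */
#define BUG()		(bug_hits++)
#define BUG_ON(cond)	do { if (cond) BUG(); } while (0)

#ifdef CONFIG_DEBUG_VM
#define VM_BUG_ON_PAGE(cond, page) \
	do { if (cond) { dump_page(page); BUG(); } } while (0)
#else
#define VM_BUG_ON_PAGE(cond, page)	do { } while (0)
#endif
```

With a mapped page, BUG_ON(page_mapped(&p)) in this mock only increments bug_hits, whereas VM_BUG_ON_PAGE(page_mapped(&p), &p) also leaves the page's counts and flags in last_dump, which is the extra information the patch is after.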



Hugh, as far as I can see, shmem truncate/punch hole might race with
shmem_getpage_gfp() (when it converts swap entries into normal pages)
and leave pages in the truncated area. Am I right?
Currently I don't see exactly how this could lead to this problem, but
it looks suspicious.
I don't like the way truncate silently skips page entries
when they are changing under it.
A completely untested patch follows.

--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -495,8 +495,9 @@ static void shmem_undo_range(struct inode *inode, loff_t lstart, loff_t lend,
 			if (radix_tree_exceptional_entry(page)) {
 				if (unfalloc)
 					continue;
-				nr_swaps_freed += !shmem_free_swap(mapping,
-								index, page);
+				if (shmem_free_swap(mapping, index, page))
+					goto retry;
+				nr_swaps_freed++;
 				continue;
 			}

@@ -509,10 +510,11 @@ static void shmem_undo_range(struct inode *inode, loff_t lstart, loff_t lend,
 			}
 			unlock_page(page);
 		}
+		index++;
+retry:
 		pagevec_remove_exceptionals(&pvec);
 		pagevec_release(&pvec);
 		mem_cgroup_uncharge_end();
-		index++;
 	}

 	spin_lock(&info->lock);

2014-06-03 23:13:09

by Hugh Dickins

Subject: Re: 3.15-rc8 mm/filemap.c:202 BUG

On Tue, 3 Jun 2014, Konstantin Khlebnikov wrote:
> On Tue, Jun 3, 2014 at 8:21 AM, Dave Jones wrote:
> > I'm still seeing this one from time to time, though it takes me quite a while to hit it,
> > despite my attempts at trying to narrow down the set of syscalls that cause it.
>
> This might shine some light, CONFIG_DEBUG_VM should be =y.
>
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -199,7 +199,7 @@ void __delete_from_page_cache(struct page *page,
> void *shadow)
> __dec_zone_page_state(page, NR_FILE_PAGES);
> if (PageSwapBacked(page))
> __dec_zone_page_state(page, NR_SHMEM);
> - BUG_ON(page_mapped(page));
> + VM_BUG_ON_PAGE(page_mapped(page), page);
>
> /*
> * Some filesystems seem to re-dirty the page even after

Yes, there's a chance that will tell us more (but I don't have high
hopes of it). I'm still stumped by this issue, just as before.

Sasha (or Dave), any update on whether you see this without THP?
and whether you see the remove_migration_pte oops without THP?

(I do have my own idea of the answers to those questions,
but trying not to prejudice your answers.)

> Hugh, As I see shmem truncate/punch hole might race with
> shmem_getpage_gfp() (when it converts swap-entries into normal pages)
> and leave pages in truncated area. Am I right?

Certainly there can be that race, but (correct me if I'm wrong, of
course) shmem_undo_range() already allows for it, and does not need
your "goto retry" patch below.

Observe how that second loop is a "for ( ; ; )" loop (like the one
in truncate_inode_pages_range() that it was cloned from). The loop
continues around and around until it has squeezed everything out of
the hole or EOF, page or swap entry.

So, yes, what looked like a swap entry just before we called
shmem_free_swap(), may be a page pointer by the time we get to try
to delete it from the radix_tree; but then a later find_get_entries()
(after index is reset to start) will pick up the page (or swap entry
if it's gone back) and deal with it. The retry is already built in.
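
The built-in retry described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the shmem code: the "radix tree" is a plain array, find_entries() and free_swap() are hypothetical stand-ins for find_get_entries() and shmem_free_swap(), BATCH stands in for PAGEVEC_SIZE, and the race_index parameter injects a single concurrent swap-in right before the swap entry would be freed.

```c
#include <assert.h>
#include <stddef.h>

enum entry { EMPTY, SWAP, PAGE };

#define BATCH 4	/* stand-in for PAGEVEC_SIZE */

/* Snapshot up to BATCH non-empty slots in [index, end), like a lookup
 * that can go stale as soon as it returns. */
static int find_entries(const enum entry *tree, size_t index, size_t end,
			enum entry *snap, size_t *indices)
{
	int n = 0;

	for (; index < end && n < BATCH; index++) {
		if (tree[index] != EMPTY) {
			snap[n] = tree[index];
			indices[n] = index;
			n++;
		}
	}
	return n;
}

/* Fails (nonzero) if the slot no longer holds the swap entry we saw. */
static int free_swap(enum entry *tree, size_t index)
{
	if (tree[index] != SWAP)
		return -1;
	tree[index] = EMPTY;
	return 0;
}

/* race_index: slot that a simulated concurrent swap-in turns from SWAP
 * into PAGE just before we try to free it (once); (size_t)-1 for none. */
static void undo_range(enum entry *tree, size_t start, size_t end,
		       size_t race_index, int *swaps_freed, int *pages_removed)
{
	enum entry snap[BATCH];
	size_t indices[BATCH];
	size_t index = start;

	assert(start <= end);
	*swaps_freed = *pages_removed = 0;
	for ( ; ; ) {
		int n = find_entries(tree, index, end, snap, indices);

		if (n == 0) {
			if (index == start)
				break;		/* a full pass found nothing */
			index = start;		/* rescan from the beginning */
			continue;
		}
		for (int i = 0; i < n; i++) {
			index = indices[i];
			if (snap[i] == SWAP) {
				if (index == race_index) {
					tree[index] = PAGE;	/* swap-in wins */
					race_index = (size_t)-1;
				}
				/* On failure the entry is skipped for now. */
				*swaps_freed += !free_swap(tree, index);
				continue;
			}
			if (tree[index] == PAGE) {
				tree[index] = EMPTY;	/* truncate the page */
				(*pages_removed)++;
			}
		}
		index++;
	}
}
```

In this model a slot that flips from swap entry to page mid-pass is left behind by the current pass, but the loop only exits after a scan from start finds the range empty, so a later pass removes it without any explicit goto.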

> Currently I don't see how exactly this could lead to this problem, but
> this looks suspicious.

page<->swap is definitely fertile ground for you to ponder on; though
I think myself that the answer to the page_mapped() BUG will lie in a
different direction.

Mind you, I've probably given too little weight to the fact that every
stacktrace shown has been a shmem one: originally I assumed that just
reflected trinity running its tests on a tmpfs, now I wonder: Dave,
Sasha, are you running similar tests on tmpfs and other filesystems,
and find this only in the tmpfs case?

Hugh

> I don't like the way in which truncate silently skips page entries
> when they are changing under it.
> Completely untested patch follows.
>
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -495,8 +495,9 @@ static void shmem_undo_range(struct inode *inode,
> loff_t lstart, loff_t lend,
> if (radix_tree_exceptional_entry(page)) {
> if (unfalloc)
> continue;
> - nr_swaps_freed += !shmem_free_swap(mapping,
> - index, page);
> + if (shmem_free_swap(mapping, index, page))
> + goto retry;
> + nr_swaps_freed++;
> continue;
> }
>
> @@ -509,10 +510,11 @@ static void shmem_undo_range(struct inode
> *inode, loff_t lstart, loff_t lend,
> }
> unlock_page(page);
> }
> + index++;
> +retry:
> pagevec_remove_exceptionals(&pvec);
> pagevec_release(&pvec);
> mem_cgroup_uncharge_end();
> - index++;
> }
>
> spin_lock(&info->lock);
>

2014-06-03 23:41:26

by Dave Jones

Subject: Re: 3.15-rc8 mm/filemap.c:202 BUG

On Tue, Jun 03, 2014 at 04:11:43PM -0700, Hugh Dickins wrote:

> > - BUG_ON(page_mapped(page));
> > + VM_BUG_ON_PAGE(page_mapped(page), page);
> >
> > /*
> > * Some filesystems seem to re-dirty the page even after
>
> Yes, there's a chance that will tell us more (but I don't have high
> hopes of it). I'm still stumped by this issue, just as before.

Running with that applied now.

> Sasha (or Dave), any update on whether you see this without THP?
> and whether you see the remove_migration_pte oops without THP?

Haven't tried yet. I wish I had a better reproducer, because it can
take up to a day to show up, and if disabling THP makes it go away,
it's hard to judge whether that's really the case or whether I just
haven't been running long enough. Sort of a Schrödinger's BUG_ON.

After I get a trace with the above patch applied, I'll give it a shot
though, just to see what happens.

> Mind you, I've probably given too little weight to the fact that every
> stacktrace shown has been a shmem one: originally I assumed that just
> reflected trinity running its tests on a tmpfs, now I wonder: Dave,
> Sasha, are you running similar tests on tmpfs and other filesystems,
> and find this only in the tmpfs case?

In my case, there's a tmpfs mounted, but it's extremely unlikely that
trinity walked into it. Perhaps I should try that, to see if it happens
faster.


> > I don't like the way in which truncate silently skips page entries
> > when they are changing under it.
> > Completely untested patch follows.
> >
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -495,8 +495,9 @@ static void shmem_undo_range(struct inode *inode,
> > loff_t lstart, loff_t lend,
> > if (radix_tree_exceptional_entry(page)) {
> > if (unfalloc)
> > continue;
> > - nr_swaps_freed += !shmem_free_swap(mapping,
> > - index, page);
> > + if (shmem_free_swap(mapping, index, page))
> > + goto retry;
> > + nr_swaps_freed++;
> > continue;
> > }
> >
> > @@ -509,10 +510,11 @@ static void shmem_undo_range(struct inode
> > *inode, loff_t lstart, loff_t lend,
> > }
> > unlock_page(page);
> > }
> > + index++;
> > +retry:
> > pagevec_remove_exceptionals(&pvec);
> > pagevec_release(&pvec);
> > mem_cgroup_uncharge_end();
> > - index++;
> > }

I'll add this to the queue of things to test, but that queue is now
about two days deep already :)

Dave

2014-06-04 12:33:49

by Sasha Levin

Subject: Re: 3.15-rc8 mm/filemap.c:202 BUG

On 06/03/2014 07:11 PM, Hugh Dickins wrote:
> On Tue, 3 Jun 2014, Konstantin Khlebnikov wrote:
>> > On Tue, Jun 3, 2014 at 8:21 AM, Dave Jones wrote:
>>> > > I'm still seeing this one from time to time, though it takes me quite a while to hit it,
>>> > > despite my attempts at trying to narrow down the set of syscalls that cause it.
>> >
>> > This might shine some light, CONFIG_DEBUG_VM should be =y.
>> >
>> > --- a/mm/filemap.c
>> > +++ b/mm/filemap.c
>> > @@ -199,7 +199,7 @@ void __delete_from_page_cache(struct page *page,
>> > void *shadow)
>> > __dec_zone_page_state(page, NR_FILE_PAGES);
>> > if (PageSwapBacked(page))
>> > __dec_zone_page_state(page, NR_SHMEM);
>> > - BUG_ON(page_mapped(page));
>> > + VM_BUG_ON_PAGE(page_mapped(page), page);
>> >
>> > /*
>> > * Some filesystems seem to re-dirty the page even after
> Yes, there's a chance that will tell us more (but I don't have high
> hopes of it). I'm still stumped by this issue, just as before.
>
> Sasha (or Dave), any update on whether you see this without THP?
> and whether you see the remove_migration_pte oops without THP?

I'm pretty sure at this point that I only see both with THP enabled.

I've started seeing far fewer of them during fuzzing. Timing changes?


Thanks,
Sasha

2014-06-06 23:06:52

by Hugh Dickins

Subject: Re: 3.15-rc8 mm/filemap.c:202 BUG

On Wed, 4 Jun 2014, Sasha Levin wrote:
> On 06/03/2014 07:11 PM, Hugh Dickins wrote:
> > On Tue, 3 Jun 2014, Konstantin Khlebnikov wrote:
> >> > On Tue, Jun 3, 2014 at 8:21 AM, Dave Jones wrote:
> >>> > > I'm still seeing this one from time to time, though it takes me quite a while to hit it,
> >>> > > despite my attempts at trying to narrow down the set of syscalls that cause it.
> >> >
> >> > This might shine some light, CONFIG_DEBUG_VM should be =y.
> >> >
> >> > --- a/mm/filemap.c
> >> > +++ b/mm/filemap.c
> >> > @@ -199,7 +199,7 @@ void __delete_from_page_cache(struct page *page,
> >> > void *shadow)
> >> > __dec_zone_page_state(page, NR_FILE_PAGES);
> >> > if (PageSwapBacked(page))
> >> > __dec_zone_page_state(page, NR_SHMEM);
> >> > - BUG_ON(page_mapped(page));
> >> > + VM_BUG_ON_PAGE(page_mapped(page), page);
> >> >
> >> > /*
> >> > * Some filesystems seem to re-dirty the page even after
> > Yes, there's a chance that will tell us more (but I don't have high
> > hopes of it). I'm still stumped by this issue, just as before.
> >
> > Sasha (or Dave), any update on whether you see this without THP?
> > and whether you see the remove_migration_pte oops without THP?
>
> I'm pretty sure at this point that I only see both with THP enabled.

Thanks for the hint, though I've made nothing of it. Of course, some
problems come from THP itself, and some from its pressure for migration.

Though I'd wanted to see the remove_migration_pte oops as a key to the
page_mapped bug, my guess is that they're actually independent.

I might have a potential fix for the remove_migration_pte one, I just
want to go back and look at my logic again, will reply in that thread
if I'm still convinced.

But a couple of hours ago I had a thought on the page_mapped() bug,
and would like to propose a patch which could be the answer to that -
though frankly I remain pessimistic. See below.

>
> I've started seeing much less of them during fuzzing. Timing changes?

Strange; I've no idea. Anyway, here's today's thought...


[PATCH] mm: entry = ACCESS_ONCE(*pte) in handle_pte_fault

Use ACCESS_ONCE() in handle_pte_fault() when getting the entry or orig_pte
upon which all subsequent decisions and pte_same() tests will be made.

I have no evidence that its lack is responsible for the mm/filemap.c:202
BUG_ON(page_mapped(page)) in __delete_from_page_cache() found by trinity,
and I am not optimistic that it will fix it. But I have found no other
explanation, and ACCESS_ONCE() here will surely not hurt.

If gcc does re-access the pte before passing it down, then that would be
disastrous for correct page fault handling, and certainly could explain
the page_mapped() BUGs seen (concurrent fault causing page to be mapped
in a second time on top of itself: mapcount 2 for a single pte).

Signed-off-by: Hugh Dickins <[email protected]>
---

mm/memory.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- 3.15-rc8/mm/memory.c 2014-04-27 23:55:53.608801152 -0700
+++ linux/mm/memory.c 2014-06-06 15:02:35.044320183 -0700
@@ -3810,7 +3810,7 @@ static int handle_pte_fault(struct mm_st
 	pte_t entry;
 	spinlock_t *ptl;
 
-	entry = *pte;
+	entry = ACCESS_ONCE(*pte);
 	if (!pte_present(entry)) {
 		if (pte_none(entry)) {
 			if (vma->vm_ops) {

2014-06-06 23:16:57

by Linus Torvalds

Subject: Re: 3.15-rc8 mm/filemap.c:202 BUG

On Fri, Jun 6, 2014 at 4:05 PM, Hugh Dickins <[email protected]> wrote:
>
> [PATCH] mm: entry = ACCESS_ONCE(*pte) in handle_pte_fault
>
> Use ACCESS_ONCE() in handle_pte_fault() when getting the entry or orig_pte
> upon which all subsequent decisions and pte_same() tests will be made.
>
> I have no evidence that its lack is responsible for the mm/filemap.c:202
> BUG_ON(page_mapped(page)) in __delete_from_page_cache() found by trinity,
> and I am not optimistic that it will fix it. But I have found no other
> explanation, and ACCESS_ONCE() here will surely not hurt.

The patch looks obviously correct to me, although like you, I have no
real reason to believe it really fixes anything. But we definitely
should just load it once, since it's very much an optimistic load done
before we take the real lock and re-compare.

I'm somewhat dubious whether it actually would change code generation
- it doesn't change anything with the test-configuration I tried with
- but it's unquestionably a good patch. And hey, maybe some
configurations have sufficiently different code generation that gcc
actually _can_ sometimes do reloads, perhaps explaining why some
people see problems. So it's certainly worth testing even if it
doesn't make any change to code generation with *my* compiler and
config..

Linus
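
For readers following along, the pattern discussed above (one optimistic
load, then take the lock and re-compare) can be sketched in userspace C.
This is only an illustration under stated assumptions, not the kernel code:
fake_pte_t, fake_ptl and fake_handle_fault() are hypothetical stand-ins for
pte_t, the page table lock and handle_pte_fault(), and a pthread mutex
replaces the spinlock. The ACCESS_ONCE() macro has the same shape as the
kernel's definition of that era (a volatile cast).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Volatile cast: the compiler must perform exactly one load for this
 * expression and may not silently re-read the location later. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for pte_t; 0 plays the role of pte_none(). */
typedef unsigned long fake_pte_t;

/* Hypothetical stand-in for the page table lock. */
static pthread_mutex_t fake_ptl = PTHREAD_MUTEX_INITIALIZER;

/* Returns true if new_val was installed, false if the slot was already
 * populated or changed between the snapshot and taking the lock. */
static bool fake_handle_fault(fake_pte_t *slot, fake_pte_t new_val)
{
    assert(slot != NULL);

    /* Optimistic, unlocked snapshot. Every later decision and the
     * re-comparison below use this one value. */
    fake_pte_t entry = ACCESS_ONCE(*slot);

    if (entry != 0)
        return false;   /* already populated: nothing to do here */

    pthread_mutex_lock(&fake_ptl);
    /* Re-compare under the lock, in the spirit of pte_same(). */
    if (*slot != entry) {
        pthread_mutex_unlock(&fake_ptl);
        return false;   /* someone else got there first */
    }
    *slot = new_val;
    pthread_mutex_unlock(&fake_ptl);
    return true;
}
```

Without the volatile cast, a compiler would in principle be allowed to
re-read *slot instead of reusing the snapshot, so the value tested before
the lock and the value compared under the lock could differ. Whether gcc
does that for any given configuration is exactly the open question in
this thread.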

2014-06-06 23:22:02

by Sasha Levin

Subject: Re: 3.15-rc8 mm/filemap.c:202 BUG

On 06/06/2014 07:16 PM, Linus Torvalds wrote:
> > I have no evidence that its lack is responsible for the mm/filemap.c:202
> > BUG_ON(page_mapped(page)) in __delete_from_page_cache() found by trinity,
> > and I am not optimistic that it will fix it. But I have found no other
> > explanation, and ACCESS_ONCE() here will surely not hurt.
> The patch looks obviously correct to me, although like you, I have no
> real reason to believe it really fixes anything. But we definitely
> should just load it once, since it's very much an optimistic load done
> before we take the real lock and re-compare.
>
> I'm somewhat dubious whether it actually would change code generation
> - it doesn't change anything with the test-configuration I tried with
> - but it's unquestionably a good patch. And hey, maybe some
> configurations have sufficiently different code generation that gcc
> actually _can_ sometimes do reloads, perhaps explaining why some
> people see problems. So it's certainly worth testing even if it
> doesn't make any change to code generation with *my* compiler and
> config..

I'm seeing the same code generated here as well. I won't carry the
patch unless Andrew/Linus take it so it won't hide possible bugs that
trinity might stumble on.


Thanks,
Sasha