2018-07-21 02:24:48

by Chen, Rong A

Subject: [lkp-robot] [confidence: ] 7757d607c6 [ 56.996267] BUG: Bad page map in process trinity-c2 pte:0d755065 pmd:0d55b067


Greetings,

The 0day kernel testing robot got the dmesg below, and the first bad commit is

https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/pti

commit 7757d607c6b31867777de42e1fb0210b9c5d8b70
Author: Joerg Roedel <[email protected]>
AuthorDate: Wed Jul 18 11:41:14 2018 +0200
Commit: Thomas Gleixner <[email protected]>
CommitDate: Fri Jul 20 01:11:48 2018 +0200

x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32

Allow PTI to be compiled on x86_32.

Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Pavel Machek <[email protected]>
Cc: "H . Peter Anvin" <[email protected]>
Cc: [email protected]
Cc: Linus Torvalds <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: David Laight <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: Eduardo Valentin <[email protected]>
Cc: Greg KH <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Andrea Arcangeli <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: "David H . Gutteridge" <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]

6df934b92a x86/ldt: Enable LDT user-mapping for PAE
7757d607c6 x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32
8c934e01a7 x86/pti: Check the return value of pti_user_pagetable_walk_pmd()
4edd5aa38c Merge branch 'linus'
+-----------------------------------------------+------------+------------+------------+------------+
|                                               | 6df934b92a | 7757d607c6 | 8c934e01a7 | 4edd5aa38c |
+-----------------------------------------------+------------+------------+------------+------------+
| boot_successes                                | 1286       | 307        | 282        | 622        |
| boot_failures                                 | 5          | 3          | 4          | 3          |
| Mem-Info                                      | 5          | 1          | 2          | 1          |
| invoked_oom-killer:gfp_mask=0x                | 3          |            |            |            |
| BUG:Bad_page_map_in_process                   | 0          | 2          | 2          | 1          |
| BUG:Bad_page_state_in_process                 | 0          | 2          | 2          | 1          |
| BUG:Bad_rss-counter_state_mm:(ptrval)idx:#val | 0          | 1          | 2          | 1          |
| kernel_BUG_at_mm/filemap.c                    | 0          | 0          | 1          |            |
| invalid_opcode:#[##]                          | 0          | 0          | 1          |            |
| EIP:unaccount_page_cache_page                 | 0          | 0          | 1          |            |
| Kernel_panic-not_syncing:Fatal_exception      | 0          | 0          | 1          | 1          |
| BUG:unable_to_handle_kernel                   | 0          | 0          | 0          | 1          |
| Oops:#[##]                                    | 0          | 0          | 0          | 1          |
| EIP:put_pid                                   | 0          | 0          | 0          | 1          |
+-----------------------------------------------+------------+------------+------------+------------+

[child3:1823] fdatasync (148) returned ENOSYS, marking as inactive.
[main] 12067 iterations. [F:7570 S:4509 HI:2917]
[ 28.550832] warning: process `trinity-c0' used the obsolete bdflush system call
[ 28.558277] Fix your initscripts?
[ 50.480036] trinity-c0 (1412) used greatest stack depth: 5416 bytes left
[ 56.996267] BUG: Bad page map in process trinity-c2 pte:0d755065 pmd:0d55b067
[ 56.997421] page:bfa00aa0 count:1 mapcount:-1 mapping:00000000 index:0x0
[ 56.998417] flags: 0xc000014(referenced|dirty)
[ 56.999087] raw: 0c000014 00000100 00000200 00000000 00000000 00000000 fffffffe 00000001
[ 57.000311] page dumped because: bad pte
[ 57.000909] addr:21632421 vm_flags:00100873 anon_vma:d90b494d mapping:7c8e0e7b index:3ec
[ 57.002102] file:trinity fault:filemap_fault mmap:generic_file_mmap readpage:simple_readpage
[ 57.003344] CPU: 0 PID: 1670 Comm: trinity-c2 Not tainted 4.18.0-rc4-00209-g7757d60 #1
[ 57.004505] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 57.005724] Call Trace:
[ 57.006112] dump_stack+0x75/0xa9
[ 57.006612] ? dcache_readdir+0x15a/0x15a
[ 57.007221] print_bad_pte+0x166/0x180
[ 57.007789] ? __lock_page_or_retry+0xa1/0xa1
[ 57.008436] ? read_cache_page_gfp+0x1c/0x1c
[ 57.009072] ? dcache_readdir+0x15a/0x15a
[ 57.009676] unmap_page_range+0x3e1/0x596
[ 57.010296] unmap_single_vma+0x8b/0x95
[ 57.010882] unmap_vmas+0x27/0x36
[ 57.011380] exit_mmap+0x93/0x121
[ 57.011905] mmput+0x44/0xbf
[ 57.012342] do_exit+0x31d/0x802
[ 57.012836] ? _raw_spin_unlock_irq+0x22/0x44
[ 57.013489] do_group_exit+0x30/0x86
[ 57.014032] get_signal+0x5e0/0x605
[ 57.014560] do_signal+0x24/0x4c1
[ 57.015071] ? _raw_spin_unlock_irqrestore+0x3a/0x5f
[ 57.015809] ? trace_hardirqs_on_caller+0x14b/0x166
[ 57.016549] ? trace_hardirqs_on_caller+0x14b/0x166
[ 57.017276] exit_to_usermode_loop+0x37/0x69
[ 57.017919] do_fast_syscall_32+0x217/0x249
[ 57.018542] entry_SYSENTER_32+0x70/0xc8
[ 57.019127] EIP: 0xa7fc8bf9
[ 57.019546] Code: ff 85 d2 74 02 89 02 5d c3 8b 04 24 c3 8b 1c 24 c3 8b 34 24 c3 90 90 90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90
[ 57.022438] EAX: fffffe00 EBX: a6d55000 ECX: 00085000 EDX: 00000002
[ 57.023356] ESI: 00000008 EDI: 00400000 EBP: fffffffc ESP: afe06e5c
[ 57.024269] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000296
[ 57.025347] Disabling lock debugging due to kernel taint
[main] trace_fd was -1
[main] kernel became tainted! (32/0) Last seed was 2560220593
trinity: Detected kernel tainting. Last seed was 2560220593
[main] exit_reason=7, but 3 children still running.
[child1:2134] trace_fd was -1
[main] trace_fd was -1
[main] kernel became tainted! (32/0) Last seed was 1446110596
trinity: Detected kernel tainting. Last seed was 1446110596
[ 57.096675] BUG: Bad page state in process trinity-c2 pfn:0d755
[ 57.097586] page:bfa00aa0 count:0 mapcount:-1 mapping:00000000 index:0x0
[ 57.098572] flags: 0xc000014(referenced|dirty)
[ 57.099238] raw: 0c000014 bdf45ce4 bdf45ce4 00000000 00000000 00000000 fffffffe 00000000
[ 57.104018] page dumped because: nonzero mapcount
[ 57.104732] Modules linked in:
[ 57.105200] CPU: 0 PID: 1670 Comm: trinity-c2 Tainted: G B 4.18.0-rc4-00209-g7757d60 #1
[ 57.106552] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 57.107782] Call Trace:
[ 57.108160] dump_stack+0x75/0xa9
[ 57.108658] bad_page+0xec/0x109
[ 57.109146] free_pages_check_bad+0x40/0x42
[ 57.109768] free_unref_page_prepare+0x4d/0xef
[ 57.110421] free_unref_page_list+0x3e/0x115
[ 57.111059] release_pages+0xd9/0x28f
[ 57.111604] free_pages_and_swap_cache+0x72/0x78
[ 57.112292] tlb_flush_mmu_free+0x20/0x33
[ 57.112886] unmap_page_range+0x56f/0x596
[ 57.113486] unmap_single_vma+0x8b/0x95
[ 57.114058] unmap_vmas+0x27/0x36
[ 57.114553] exit_mmap+0x93/0x121
[ 57.115056] mmput+0x44/0xbf
[ 57.115488] do_exit+0x31d/0x802
[ 57.115976] ? _raw_spin_unlock_irq+0x22/0x44
[ 57.116617] do_group_exit+0x30/0x86
[ 57.117154] get_signal+0x5e0/0x605
[ 57.117676] do_signal+0x24/0x4c1
[ 57.118175] ? _raw_spin_unlock_irqrestore+0x3a/0x5f
[ 57.118912] ? trace_hardirqs_on_caller+0x14b/0x166
[ 57.119631] ? trace_hardirqs_on_caller+0x14b/0x166
[ 57.120347] exit_to_usermode_loop+0x37/0x69
[ 57.120981] do_fast_syscall_32+0x217/0x249
[ 57.121596] entry_SYSENTER_32+0x70/0xc8
[ 57.122178] EIP: 0xa7fc8bf9
[ 57.122592] Code: Bad RIP value.
[ 57.123085] EAX: fffffe00 EBX: a6d55000 ECX: 00085000 EDX: 00000002
[ 57.124006] ESI: 00000008 EDI: 00400000 EBP: fffffffc ESP: afe06e5c
[ 57.124921] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000296
[main] exit_reason=7, but 2 children still running.
[main] trace_fd was -1
[main] kernel became tainted! (32/0) Last seed was 669750733
trinity: Detected kernel tainting. Last seed was 669750733
Failed to write post mortem log (Permission denied)
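
As a reading aid for the first BUG line in the dmesg above, the pte value can be split into a page frame number and flag bits. The sketch below is illustrative only: it uses the architectural x86 flag layout for the low 12 bits and assumes the value is complete as printed (no high PAE bits set). The helper name decode_pte is made up for this example and is not a kernel function.

```python
# Architectural x86 page-table entry flag bits (low 12 bits of the entry).
PTE_FLAGS = {
    0x001: "PRESENT",
    0x002: "RW",
    0x004: "USER",
    0x008: "PWT",
    0x010: "PCD",
    0x020: "ACCESSED",
    0x040: "DIRTY",
    0x080: "PAT",
    0x100: "GLOBAL",
}
PAGE_SHIFT = 12  # 4 KiB pages


def decode_pte(pte):
    """Hypothetical helper: split a raw pte value into (pfn, flag names)."""
    pfn = pte >> PAGE_SHIFT
    flags = [name for bit, name in PTE_FLAGS.items() if pte & bit]
    return pfn, flags


# Value taken from "BUG: Bad page map in process trinity-c2 pte:0d755065".
pfn, flags = decode_pte(0x0D755065)
print(f"pfn={pfn:05x} flags={'|'.join(flags)}")
```

The decoded frame number is the same pfn:0d755 that the later "Bad page state" message reports, so both messages concern the same physical page.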

# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start 8c934e01a7ce685d98e970880f5941d79272c654 37b5dca2898d1471729194f45e281c2443eb9d6c --
git bisect good 8372d66865deb45ee3ec21401a9c80f231b728c8 # 23:36 G 304 0 4 4 x86/pgtable: Move pgdp kernel/user conversion functions to pgtable.h
git bisect good b976690f5db26fbc7c2be413bfa0fbd270547a94 # 00:10 G 305 0 5 5 x86/mm/pti: Introduce pti_finalize()
git bisect good 9bae3197e15dd5e03ce8e237db6fe4486b08a775 # 00:49 G 308 0 3 4 x86/ldt: Split out sanity check in map_ldt_struct()
git bisect bad 5e8105950a8b3e03e805299b4d05020ee4eda31a # 01:12 B 131 1 4 4 x86/mm/pti: Add Warning when booting on a PCID capable CPU
git bisect good 6df934b92a549cb3badb6d576f71aeb133e2f110 # 01:47 G 310 0 7 10 x86/ldt: Enable LDT user-mapping for PAE
git bisect bad 7757d607c6b31867777de42e1fb0210b9c5d8b70 # 02:08 B 134 1 2 2 x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32
# first bad commit: [7757d607c6b31867777de42e1fb0210b9c5d8b70] x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32
git bisect good 6df934b92a549cb3badb6d576f71aeb133e2f110 # 03:14 G 900 0 9 19 x86/ldt: Enable LDT user-mapping for PAE
# extra tests on HEAD of tip/x86/pti
git bisect bad 8c934e01a7ce685d98e970880f5941d79272c654 # 03:15 B 279 2 0 6 x86/pti: Check the return value of pti_user_pagetable_walk_pmd()
# extra tests on tree/branch tip/x86/pti
git bisect bad 8c934e01a7ce685d98e970880f5941d79272c654 # 03:20 B 279 2 0 6 x86/pti: Check the return value of pti_user_pagetable_walk_pmd()
# extra tests with first bad commit reverted
git bisect good 90d0ce801fac8115d424e40a4a258aeed0e409dd # 03:58 G 302 0 7 7 Revert "x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32"
# extra tests on tree/branch tip/master
git bisect bad 4edd5aa38cec47346e0d0a85fa43964828b982d0 # 04:37 B 257 1 5 6 Merge branch 'linus'

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation


Attachments:
(No filename) (11.08 kB)
dmesg-yocto-ivb41-118:20180721023515:i386-randconfig-s1-201828:4.18.0-rc4-00209-g7757d60:1.gz (22.10 kB)
dmesg-yocto-ivb41-105:20180721030145:i386-randconfig-s1-201828:4.18.0-rc4-00208-g6df934b:20.gz (31.47 kB)
reproduce-yocto-ivb41-118:20180721023515:i386-randconfig-s1-201828:4.18.0-rc4-00209-g7757d60:1 (975.00 B)
config-4.18.0-rc4-00209-g7757d60 (118.65 kB)

2018-07-27 13:26:49

by Jörg Rödel

Subject: Re: [lkp-robot] [confidence: ] 7757d607c6 [ 56.996267] BUG: Bad page map in process trinity-c2 pte:0d755065 pmd:0d55b067

Hey,

thanks for the report! I did a lot of testing, and the issue is now fixed
with this patch:

https://lore.kernel.org/lkml/[email protected]/

I did 2150 runs with the reproducer attached to this report, and the
issue triggered three times.

With the patch above applied, the issue has not triggered again in 2170
runs of the reproducer so far.
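
To put the run counts above in perspective, here is a rough back-of-the-envelope sketch. It assumes each run is independent and that the observed 3-in-2150 rate is the true per-run failure rate without the patch. Both are simplifications, and the function name is made up for this illustration.

```python
def prob_no_failures(failures_seen, runs_seen, new_runs):
    """Chance of zero failures in new_runs if the failure rate were unchanged."""
    p = failures_seen / runs_seen  # observed per-run failure rate
    return (1.0 - p) ** new_runs


# Numbers from the mail above: 3 failures in 2150 runs, then 2170 clean runs.
chance = prob_no_failures(3, 2150, 2170)
print(f"chance of 2170 clean runs if nothing had changed: {chance:.3f}")
```

Under these assumptions the chance of seeing 2170 clean runs by luck alone comes out at roughly 5%, so the clean series is reasonable, though not conclusive, evidence that the patch helps.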

On Sat, Jul 21, 2018 at 10:23:25AM +0800, kernel test robot wrote:
> git bisect start 8c934e01a7ce685d98e970880f5941d79272c654 37b5dca2898d1471729194f45e281c2443eb9d6c --
> git bisect good 8372d66865deb45ee3ec21401a9c80f231b728c8 # 23:36 G 304 0 4 4 x86/pgtable: Move pgdp kernel/user conversion functions to pgtable.h
> git bisect good b976690f5db26fbc7c2be413bfa0fbd270547a94 # 00:10 G 305 0 5 5 x86/mm/pti: Introduce pti_finalize()
> git bisect good 9bae3197e15dd5e03ce8e237db6fe4486b08a775 # 00:49 G 308 0 3 4 x86/ldt: Split out sanity check in map_ldt_struct()
> git bisect bad 5e8105950a8b3e03e805299b4d05020ee4eda31a # 01:12 B 131 1 4 4 x86/mm/pti: Add Warning when booting on a PCID capable CPU
> git bisect good 6df934b92a549cb3badb6d576f71aeb133e2f110 # 01:47 G 310 0 7 10 x86/ldt: Enable LDT user-mapping for PAE
> git bisect bad 7757d607c6b31867777de42e1fb0210b9c5d8b70 # 02:08 B 134 1 2 2 x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32
> # first bad commit: [7757d607c6b31867777de42e1fb0210b9c5d8b70] x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32
> git bisect good 6df934b92a549cb3badb6d576f71aeb133e2f110 # 03:14 G 900 0 9 19 x86/ldt: Enable LDT user-mapping for PAE
> # extra tests on HEAD of tip/x86/pti
> git bisect bad 8c934e01a7ce685d98e970880f5941d79272c654 # 03:15 B 279 2 0 6 x86/pti: Check the return value of pti_user_pagetable_walk_pmd()
> # extra tests on tree/branch tip/x86/pti
> git bisect bad 8c934e01a7ce685d98e970880f5941d79272c654 # 03:20 B 279 2 0 6 x86/pti: Check the return value of pti_user_pagetable_walk_pmd()
> # extra tests with first bad commit reverted
> git bisect good 90d0ce801fac8115d424e40a4a258aeed0e409dd # 03:58 G 302 0 7 7 Revert "x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32"
> # extra tests on tree/branch tip/master
> git bisect bad 4edd5aa38cec47346e0d0a85fa43964828b982d0 # 04:37 B 257 1 5 6 Merge branch 'linus'

May I ask how this was bisected? I found it very hard to reproduce; it
triggered less than two times a day with the attached reproducer. Given
that the report came pretty fast after the patches landed in
tip/x86/pti, I guess it triggered a lot faster in your testing.

Regards,

Joerg


2018-07-31 02:06:15

by kernel test robot

Subject: Re: [LKP] [lkp-robot] [confidence: ] 7757d607c6 [ 56.996267] BUG: Bad page map in process trinity-c2 pte:0d755065 pmd:0d55b067

Hi,

On 07/27, Joerg Roedel wrote:
>Hey,
>
>thanks for the report! I did a lot of testing, and the issue is now fixed
>with this patch:
>
> https://lore.kernel.org/lkml/[email protected]/
>
>I did 2150 runs with the reproducer attached to this report, and the
>issue triggered three times.
>
>With the patch above applied, the issue has not triggered again in 2170
>runs of the reproducer so far.
>
>On Sat, Jul 21, 2018 at 10:23:25AM +0800, kernel test robot wrote:
>> git bisect start 8c934e01a7ce685d98e970880f5941d79272c654 37b5dca2898d1471729194f45e281c2443eb9d6c --
>> git bisect good 8372d66865deb45ee3ec21401a9c80f231b728c8 # 23:36 G 304 0 4 4 x86/pgtable: Move pgdp kernel/user conversion functions to pgtable.h
>> git bisect good b976690f5db26fbc7c2be413bfa0fbd270547a94 # 00:10 G 305 0 5 5 x86/mm/pti: Introduce pti_finalize()
>> git bisect good 9bae3197e15dd5e03ce8e237db6fe4486b08a775 # 00:49 G 308 0 3 4 x86/ldt: Split out sanity check in map_ldt_struct()
>> git bisect bad 5e8105950a8b3e03e805299b4d05020ee4eda31a # 01:12 B 131 1 4 4 x86/mm/pti: Add Warning when booting on a PCID capable CPU
>> git bisect good 6df934b92a549cb3badb6d576f71aeb133e2f110 # 01:47 G 310 0 7 10 x86/ldt: Enable LDT user-mapping for PAE
>> git bisect bad 7757d607c6b31867777de42e1fb0210b9c5d8b70 # 02:08 B 134 1 2 2 x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32
>> # first bad commit: [7757d607c6b31867777de42e1fb0210b9c5d8b70] x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32
>> git bisect good 6df934b92a549cb3badb6d576f71aeb133e2f110 # 03:14 G 900 0 9 19 x86/ldt: Enable LDT user-mapping for PAE
>> # extra tests on HEAD of tip/x86/pti
>> git bisect bad 8c934e01a7ce685d98e970880f5941d79272c654 # 03:15 B 279 2 0 6 x86/pti: Check the return value of pti_user_pagetable_walk_pmd()
>> # extra tests on tree/branch tip/x86/pti
>> git bisect bad 8c934e01a7ce685d98e970880f5941d79272c654 # 03:20 B 279 2 0 6 x86/pti: Check the return value of pti_user_pagetable_walk_pmd()
>> # extra tests with first bad commit reverted
>> git bisect good 90d0ce801fac8115d424e40a4a258aeed0e409dd # 03:58 G 302 0 7 7 Revert "x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32"
>> # extra tests on tree/branch tip/master
>> git bisect bad 4edd5aa38cec47346e0d0a85fa43964828b982d0 # 04:37 B 257 1 5 6 Merge branch 'linus'
>
>May I ask how this was bisected? I found it very hard to reproduce; it
>triggered less than two times a day with the attached reproducer. Given
>that the report came pretty fast after the patches landed in
>tip/x86/pti, I guess it triggered a lot faster in your testing.
>

The reproduction rate is higher (2/310) in the 0day environment, and we run
dozens of QEMU instances simultaneously for bisection, so we can get the
bisect result promptly.
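
As a rough illustration of why parallel boots matter at this failure rate, the sketch below estimates how many boots are needed before a bad commit is likely to show the bug at least once. It assumes independent boots and takes 2/310 as the true per-boot rate. The VM count of 30 is a hypothetical stand-in for "dozens", and the function names are invented for this example.

```python
import math


def miss_probability(rate, boots):
    """Chance that a bad commit shows no failure at all in `boots` boots."""
    return (1.0 - rate) ** boots


def boots_needed(rate, confidence):
    """Smallest boot count that hits the bug at least once with `confidence`."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - rate))


rate = 2 / 310      # reproduction rate quoted in the mail above
PARALLEL_VMS = 30   # hypothetical value for "dozens of qemu"

for boots in (300, 900):
    print(f"{boots} clean boots: miss probability {miss_probability(rate, boots):.3f}")

n = boots_needed(rate, 0.95)
print(f"boots for 95% detection: {n}, "
      f"i.e. {math.ceil(n / PARALLEL_VMS)} rounds on {PARALLEL_VMS} VMs")
```

Under these assumptions, a few hundred boots per bisect step still leave a noticeable chance of mislabeling a bad commit as good, which may be why the bisect log re-tests the parent commit with about 900 boots.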

Thanks,
Xiaolong

>Regards,
>
> Joerg
>
>_______________________________________________
>LKP mailing list
>[email protected]
>https://lists.01.org/mailman/listinfo/lkp

2018-07-31 07:28:41

by Jörg Rödel

Subject: Re: [LKP] [lkp-robot] [confidence: ] 7757d607c6 [ 56.996267] BUG: Bad page map in process trinity-c2 pte:0d755065 pmd:0d55b067

On Tue, Jul 31, 2018 at 10:00:36AM +0800, Ye Xiaolong wrote:
> On 07/27, Joerg Roedel wrote:
> >May I ask how this was bisected? I found it very hard to reproduce; it
> >triggered less than two times a day with the attached reproducer. Given
> >that the report came pretty fast after the patches landed in
> >tip/x86/pti, I guess it triggered a lot faster in your testing.
> >
>
> The reproduction rate is higher (2/310) in the 0day environment, and we run
> dozens of QEMU instances simultaneously for bisection, so we can get the
> bisect result promptly.

Ah okay, that makes sense. Thanks for your answer and thanks for the
report again.


Regards,

Joerg