2014-12-02 04:33:17

by Joonsoo Kim

[permalink] [raw]
Subject: Re: [mm] BUG: unable to handle kernel paging request at c2446ffc

On Fri, Nov 28, 2014 at 02:10:00AM -0800, Fengguang Wu wrote:
> Greetings,
>
> 0day kernel testing robot got the below dmesg and the first bad commit is
>
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
>
> commit 1e491e9be4c97229a3a88763aada9582e37c7eaf
> Author: Joonsoo Kim <[email protected]>
> AuthorDate: Thu Nov 27 11:09:34 2014 +1100
> Commit: Stephen Rothwell <[email protected]>
> CommitDate: Thu Nov 27 11:09:34 2014 +1100
>
> mm/debug-pagealloc: prepare boottime configurable on/off
>
> Until now, debug-pagealloc needs extra flags in struct page, so we need to
> recompile whole source code when we decide to use it. This is really
> painful, because it takes some time to recompile and sometimes rebuild is
> not possible due to third party module depending on struct page. So, we
> can't use this good feature in many cases.
>
> Now, we have the page extension feature that allows us to insert extra
> flags to outside of struct page. This gets rid of third party module
> issue mentioned above. And, this allows us to determine if we need extra
> memory for this page extension in boottime. With these property, we can
> avoid using debug-pagealloc in boottime with low computational overhead in
> the kernel built with CONFIG_DEBUG_PAGEALLOC. This will help our
> development process greatly.
>
> This patch is the preparation step to achive above goal. debug-pagealloc
> originally uses extra field of struct page, but, after this patch, it will
> use field of struct page_ext. Because memory for page_ext is allocated
> later than initialization of page allocator in CONFIG_SPARSEMEM, we should
> disable debug-pagealloc feature temporarily until initialization of
> page_ext. This patch implements this.
>
> Signed-off-by: Joonsoo Kim <[email protected]>
> Cc: Mel Gorman <[email protected]>
> Cc: Johannes Weiner <[email protected]>
> Cc: Minchan Kim <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Michal Nazarewicz <[email protected]>
> Cc: Jungsoo Son <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Joonsoo Kim <[email protected]>
> Signed-off-by: Andrew Morton <[email protected]>
>
> Attached dmesg for the parent commit, too, to help confirm whether it is a noise error.
>
> +-------------------------------------------------+------------+------------+---------------+
> | | 34bf7903e1 | 1e491e9be4 | next-20141127 |
> +-------------------------------------------------+------------+------------+---------------+
> | boot_successes | 95 | 26 | 11 |
> | boot_failures | 10 | 9 | 3 |
> | BUG:kernel_early_hang_without_any_printk_output | 10 | 8 | |
> | BUG:unable_to_handle_kernel | 0 | 1 | 3 |
> | Oops | 0 | 1 | 3 |
> | EIP_is_at__free_pages_ok | 0 | 1 | 3 |
> | Kernel_panic-not_syncing:Fatal_exception | 0 | 1 | 3 |
> | backtrace:put_tty_driver | 0 | 1 | 3 |
> | backtrace:rp_init | 0 | 1 | 3 |
> | backtrace:kernel_init_freeable | 0 | 1 | 3 |
> +-------------------------------------------------+------------+------------+---------------+
>
> [ 13.206984] RocketPort device driver module, version 2.09, 12-June-2003
> [ 13.208641] No rocketport ports found; unloading driver
> [ 13.208641] No rocketport ports found; unloading driver
> [ 13.213422] BUG: unable to handle kernel
> [ 13.213422] BUG: unable to handle kernel paging requestpaging request at c2446ffc
> at c2446ffc
> [ 13.214380] IP:
> [ 13.214380] IP: [<b11ab6fe>] __free_pages_ok+0x376/0x62c
> [<b11ab6fe>] __free_pages_ok+0x376/0x62c
> [ 13.214380] *pde = 123ca067
> [ 13.214380] *pde = 123ca067 *pte = 12446060 *pte = 12446060
>
> [ 13.214380] Oops: 0000 [#1]
> [ 13.214380] Oops: 0000 [#1] SMP SMP DEBUG_PAGEALLOCDEBUG_PAGEALLOC
>
> [ 13.214380] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc6-00201-g1e491e9 #14
> [ 13.214380] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc6-00201-g1e491e9 #14
> [ 13.214380] task: c1c40000 ti: c1c48000 task.ti: c1c48000
> [ 13.214380] task: c1c40000 ti: c1c48000 task.ti: c1c48000
> [ 13.214380] EIP: 0060:[<b11ab6fe>] EFLAGS: 00010097 CPU: 0
> [ 13.214380] EIP: 0060:[<b11ab6fe>] EFLAGS: 00010097 CPU: 0
> [ 13.214380] EIP is at __free_pages_ok+0x376/0x62c
> [ 13.214380] EIP is at __free_pages_ok+0x376/0x62c
> [ 13.214380] EAX: c2446ffc EBX: c2513200 ECX: 00000004 EDX: c2447000
> [ 13.214380] EAX: c2446ffc EBX: c2513200 ECX: 00000004 EDX: c2447000
> [ 13.214380] ESI: c2513300 EDI: 00000004 EBP: c1c49e94 ESP: c1c49e64
> [ 13.214380] ESI: c2513300 EDI: 00000004 EBP: c1c49e94 ESP: c1c49e64
> [ 13.214380] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> [ 13.214380] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> [ 13.214380] CR0: 8005003b CR2: c2446ffc CR3: 02729000 CR4: 00000690
> [ 13.214380] CR0: 8005003b CR2: c2446ffc CR3: 02729000 CR4: 00000690
> [ 13.214380] Stack:
> [ 13.214380] Stack:
> [ 13.214380] 00000008
> [ 13.214380] 00000008 b26054ec b26054ec b2605440 b2605440 00000246 00000246 00000010 00000010 00000000 00000000 c2513000 c2513000 00000000 00000000
>
> [ 13.214380] 00000003
> [ 13.214380] 00000003 00000000 00000000 c2513300 c2513300 00000000 00000000 c1c49e9c c1c49e9c b11aba32 b11aba32 c1c49ea4 c1c49ea4 b11abc52 b11abc52
>
> [ 13.214380] c1c49ec4
> [ 13.214380] c1c49ec4 b1204fb5 b1204fb5 c1c49ebc c1c49ebc b10deca7 b10deca7 b0018000 b0018000 bbf20984 bbf20984 00000100 00000100 bbf20980 bbf20980
>
> [ 13.214380] Call Trace:
> [ 13.214380] Call Trace:
> [ 13.214380] [<b11aba32>] __free_pages+0x7e/0x8e
> [ 13.214380] [<b11aba32>] __free_pages+0x7e/0x8e
> [ 13.214380] [<b11abc52>] __free_kmem_pages+0x16/0x26
> [ 13.214380] [<b11abc52>] __free_kmem_pages+0x16/0x26
> [ 13.214380] [<b1204fb5>] kfree+0x292/0x4e1
> [ 13.214380] [<b1204fb5>] kfree+0x292/0x4e1
> [ 13.214380] [<b10deca7>] ? debug_mutex_unlock+0x2f3/0x398
> [ 13.214380] [<b10deca7>] ? debug_mutex_unlock+0x2f3/0x398
> [ 13.214380] [<b14f5c36>] destruct_tty_driver+0xee/0x158
> [ 13.214380] [<b14f5c36>] destruct_tty_driver+0xee/0x158
> [ 13.214380] [<b14f5fe5>] tty_driver_kref_put+0xb4/0xc6
> [ 13.214380] [<b14f5fe5>] tty_driver_kref_put+0xb4/0xc6
> [ 13.214380] [<b14f6121>] put_tty_driver+0x16/0x26
> [ 13.214380] [<b14f6121>] put_tty_driver+0x16/0x26
> [ 13.214380] [<b26aa177>] rp_init+0xc04/0xc23
> [ 13.214380] [<b26aa177>] rp_init+0xc04/0xc23
> [ 13.214380] [<b12050e9>] ? kfree+0x3c6/0x4e1
> [ 13.214380] [<b12050e9>] ? kfree+0x3c6/0x4e1
> [ 13.214380] [<b26a9573>] ? register_PCI+0x1091/0x1091
> [ 13.214380] [<b26a9573>] ? register_PCI+0x1091/0x1091
> [ 13.214380] [<b26469d2>] do_one_initcall+0x1ed/0x356
> [ 13.214380] [<b26469d2>] do_one_initcall+0x1ed/0x356
> [ 13.214380] [<b2646d8b>] kernel_init_freeable+0x250/0x3ab
> [ 13.214380] [<b2646d8b>] kernel_init_freeable+0x250/0x3ab
> [ 13.214380] [<b1d95dd0>] kernel_init+0x16/0x1e7
> [ 13.214380] [<b1d95dd0>] kernel_init+0x16/0x1e7
> [ 13.214380] [<b1dcdd01>] ret_from_kernel_thread+0x21/0x30
> [ 13.214380] [<b1dcdd01>] ret_from_kernel_thread+0x21/0x30
> [ 13.214380] [<b1d95dba>] ? rest_init+0x15b/0x15b
> [ 13.214380] [<b1d95dba>] ? rest_init+0x15b/0x15b
> [ 13.214380] Code:
> [ 13.214380] Code: 31 31 d0 d0 29 29 d0 d0 c1 c1 e0 e0 05 05 80 80 3d 3d 00 00 30 30 64 64 b2 b2 00 00 8d 8d 14 14 03 03 74 74 71 71 83 83 05 05 28 28 90 90 da da b2 b2 01 01 89 89 d0 d0 89 89 55 55 e8 e8 83 83 15 15 2c 2c 90 90 da da b2 b2 00 00 e8 e8 67 67 d9 d9 05 05 00 00 <8b> <8b> 00 00 83 83 05 05 30 30 90 90 da da b2 b2 01 01 83 83 15 15 34 34 90 90 da da b2 b2 00 00 a8 a8 02 02 8b 8b 55 55 e8 e8
>
> [ 13.214380] EIP: [<b11ab6fe>]
> [ 13.214380] EIP: [<b11ab6fe>] __free_pages_ok+0x376/0x62c__free_pages_ok+0x376/0x62c SS:ESP 0068:c1c49e64
> SS:ESP 0068:c1c49e64
> [ 13.214380] CR2: 00000000c2446ffc
> [ 13.214380] CR2: 00000000c2446ffc
> [ 13.214380] ---[ end trace fe261d43ae421f43 ]---
> [ 13.214380] ---[ end trace fe261d43ae421f43 ]---
>
> git bisect start 3bcf494d225fd193d02e8cb2e2c3fe3cc476ff3f 5d01410fe4d92081f349b013a2e7a95429e4f2c9 --
> git bisect good 14692f2c9f01c7f21f83d41a8cb99fea1e4f803f # 10:20 35+ 0 Merge remote-tracking branch 'dlm/next'
> git bisect good 17623427488fe306376e18e0ee63c2c1bcbf5612 # 10:42 35+ 0 Merge remote-tracking branch 'edac-amd/for-next'
> git bisect good 6acfd0c5752274ad5099152d9a00c99f81c273b5 # 10:54 35+ 0 Merge remote-tracking branch 'char-misc/char-misc-next'
> git bisect good 574733068e280900745b7241a51f26815f25ca64 # 11:24 35+ 5 Merge remote-tracking branch 'userns/for-next'
> git bisect good d3d6c2b2574a1700a33c3f40a8adcd11db728926 # 11:36 35+ 11 Merge remote-tracking branch 'llvmlinux/for-next'
> git bisect good 749230afd0fa54770f95063071b1bdfb6dee9bc2 # 11:45 35+ 13 Merge remote-tracking branch 'y2038/y2038'
> git bisect bad 35cc8c3f978f75a04ac96b3cb72b8f7630ea04f4 # 11:50 0- 1 Merge branch 'akpm-current/current'
> git bisect bad 6aab9099af555bf5a464f318d312ba5baa5cf516 # 11:59 0- 1 stacktrace: introduce snprint_stack_trace for buffer output
> git bisect good 15c2416b0e6f21f17152e0ba32202bb1354394e3 # 12:10 35+ 18 mm-compaction-more-focused-lru-and-pcplists-draining-fix
> git bisect good c5c825302103a196aa94efa121c011121ffff14b # 12:17 35+ 2 uprobes: share the i_mmap_rwsem
> git bisect good b225ec73923a04a6d00dd28c6372c167780921b8 # 12:24 35+ 0 hugetlb: hugetlb_register_all_nodes(): add __init marker
> git bisect good 4fb10ba778d4c4ccefee3ce833e487a6695068b1 # 12:32 35+ 1 mm: support madvise(MADV_FREE)
> git bisect good 0aba43a2670028ec26cfeb59d3c2610ab0ee140b # 12:42 35+ 4 arm64: add pmd_[dirty|mkclean] for THP
> git bisect bad 1e491e9be4c97229a3a88763aada9582e37c7eaf # 12:51 0- 1 mm/debug-pagealloc: prepare boottime configurable on/off
> git bisect good 34bf7903e195347898a225220357f3a49dd65e7e # 12:57 35+ 0 mm/page_ext: resurrect struct page extending code for debugging
> # first bad commit: [1e491e9be4c97229a3a88763aada9582e37c7eaf] mm/debug-pagealloc: prepare boottime configurable on/off
> git bisect good 34bf7903e195347898a225220357f3a49dd65e7e # 13:01 105+ 10 mm/page_ext: resurrect struct page extending code for debugging
> # extra tests on HEAD of next/master
> git bisect bad 3bcf494d225fd193d02e8cb2e2c3fe3cc476ff3f # 13:01 0- 3 Add linux-next specific files for 20141127
> # extra tests on tree/branch next/master
> git bisect bad 3bcf494d225fd193d02e8cb2e2c3fe3cc476ff3f # 13:01 0- 3 Add linux-next specific files for 20141127
> # extra tests on tree/branch linus/master
> git bisect good 98e8d2e094de67315f786cd81b1dccb4ac040cc2 # 13:11 105+ 21 Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
> # extra tests on tree/branch next/master
> git bisect bad 3bcf494d225fd193d02e8cb2e2c3fe3cc476ff3f # 13:11 0- 3 Add linux-next specific files for 20141127

Hello, Fengguang.

First of all, thanks for reporting!
It always helps me a lot.

But, in this time, I can't reproduce this failure with your attached
configuration. Instead of this failure, sometimes, OOM happens in my
testing with your configuration. I don't know why it happens. :)

Could you have another configuration to trigger this bug?

Anyway, this calltrace is really suspicious, because it looks like
failure of tty driver initialization. Why it is failed in this early
phase? Maybe, it is also related to OOM mentioned above.

Thanks.

2014-12-02 05:04:43

by Fengguang Wu

[permalink] [raw]
Subject: Re: [mm] BUG: unable to handle kernel paging request at c2446ffc

Hi Joonsoo,

On Tue, Dec 02, 2014 at 01:36:38PM +0900, Joonsoo Kim wrote:
> On Fri, Nov 28, 2014 at 02:10:00AM -0800, Fengguang Wu wrote:
> > Greetings,
> >
> > 0day kernel testing robot got the below dmesg and the first bad commit is
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> >
> > commit 1e491e9be4c97229a3a88763aada9582e37c7eaf
> > Author: Joonsoo Kim <[email protected]>
> > AuthorDate: Thu Nov 27 11:09:34 2014 +1100
> > Commit: Stephen Rothwell <[email protected]>
> > CommitDate: Thu Nov 27 11:09:34 2014 +1100
> >
> > mm/debug-pagealloc: prepare boottime configurable on/off
> >
> > Until now, debug-pagealloc needs extra flags in struct page, so we need to
> > recompile whole source code when we decide to use it. This is really
> > painful, because it takes some time to recompile and sometimes rebuild is
> > not possible due to third party module depending on struct page. So, we
> > can't use this good feature in many cases.
> >
> > Now, we have the page extension feature that allows us to insert extra
> > flags to outside of struct page. This gets rid of third party module
> > issue mentioned above. And, this allows us to determine if we need extra
> > memory for this page extension in boottime. With these property, we can
> > avoid using debug-pagealloc in boottime with low computational overhead in
> > the kernel built with CONFIG_DEBUG_PAGEALLOC. This will help our
> > development process greatly.
> >
> > This patch is the preparation step to achive above goal. debug-pagealloc
> > originally uses extra field of struct page, but, after this patch, it will
> > use field of struct page_ext. Because memory for page_ext is allocated
> > later than initialization of page allocator in CONFIG_SPARSEMEM, we should
> > disable debug-pagealloc feature temporarily until initialization of
> > page_ext. This patch implements this.
> >
> > Signed-off-by: Joonsoo Kim <[email protected]>
> > Cc: Mel Gorman <[email protected]>
> > Cc: Johannes Weiner <[email protected]>
> > Cc: Minchan Kim <[email protected]>
> > Cc: Dave Hansen <[email protected]>
> > Cc: Michal Nazarewicz <[email protected]>
> > Cc: Jungsoo Son <[email protected]>
> > Cc: Ingo Molnar <[email protected]>
> > Cc: Joonsoo Kim <[email protected]>
> > Signed-off-by: Andrew Morton <[email protected]>
> >
> > Attached dmesg for the parent commit, too, to help confirm whether it is a noise error.
> >
> > +-------------------------------------------------+------------+------------+---------------+
> > | | 34bf7903e1 | 1e491e9be4 | next-20141127 |
> > +-------------------------------------------------+------------+------------+---------------+
> > | boot_successes | 95 | 26 | 11 |
> > | boot_failures | 10 | 9 | 3 |
> > | BUG:kernel_early_hang_without_any_printk_output | 10 | 8 | |
> > | BUG:unable_to_handle_kernel | 0 | 1 | 3 |
> > | Oops | 0 | 1 | 3 |
> > | EIP_is_at__free_pages_ok | 0 | 1 | 3 |
> > | Kernel_panic-not_syncing:Fatal_exception | 0 | 1 | 3 |
> > | backtrace:put_tty_driver | 0 | 1 | 3 |
> > | backtrace:rp_init | 0 | 1 | 3 |
> > | backtrace:kernel_init_freeable | 0 | 1 | 3 |
> > +-------------------------------------------------+------------+------------+---------------+
> >
> > [ 13.206984] RocketPort device driver module, version 2.09, 12-June-2003
> > [ 13.208641] No rocketport ports found; unloading driver
> > [ 13.208641] No rocketport ports found; unloading driver
> > [ 13.213422] BUG: unable to handle kernel
> > [ 13.213422] BUG: unable to handle kernel paging requestpaging request at c2446ffc
> > at c2446ffc
> > [ 13.214380] IP:
> > [ 13.214380] IP: [<b11ab6fe>] __free_pages_ok+0x376/0x62c
> > [<b11ab6fe>] __free_pages_ok+0x376/0x62c
> > [ 13.214380] *pde = 123ca067
> > [ 13.214380] *pde = 123ca067 *pte = 12446060 *pte = 12446060
> >
> > [ 13.214380] Oops: 0000 [#1]
> > [ 13.214380] Oops: 0000 [#1] SMP SMP DEBUG_PAGEALLOCDEBUG_PAGEALLOC
> >
> > [ 13.214380] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc6-00201-g1e491e9 #14
> > [ 13.214380] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc6-00201-g1e491e9 #14
> > [ 13.214380] task: c1c40000 ti: c1c48000 task.ti: c1c48000
> > [ 13.214380] task: c1c40000 ti: c1c48000 task.ti: c1c48000
> > [ 13.214380] EIP: 0060:[<b11ab6fe>] EFLAGS: 00010097 CPU: 0
> > [ 13.214380] EIP: 0060:[<b11ab6fe>] EFLAGS: 00010097 CPU: 0
> > [ 13.214380] EIP is at __free_pages_ok+0x376/0x62c
> > [ 13.214380] EIP is at __free_pages_ok+0x376/0x62c
> > [ 13.214380] EAX: c2446ffc EBX: c2513200 ECX: 00000004 EDX: c2447000
> > [ 13.214380] EAX: c2446ffc EBX: c2513200 ECX: 00000004 EDX: c2447000
> > [ 13.214380] ESI: c2513300 EDI: 00000004 EBP: c1c49e94 ESP: c1c49e64
> > [ 13.214380] ESI: c2513300 EDI: 00000004 EBP: c1c49e94 ESP: c1c49e64
> > [ 13.214380] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> > [ 13.214380] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> > [ 13.214380] CR0: 8005003b CR2: c2446ffc CR3: 02729000 CR4: 00000690
> > [ 13.214380] CR0: 8005003b CR2: c2446ffc CR3: 02729000 CR4: 00000690
> > [ 13.214380] Stack:
> > [ 13.214380] Stack:
> > [ 13.214380] 00000008
> > [ 13.214380] 00000008 b26054ec b26054ec b2605440 b2605440 00000246 00000246 00000010 00000010 00000000 00000000 c2513000 c2513000 00000000 00000000
> >
> > [ 13.214380] 00000003
> > [ 13.214380] 00000003 00000000 00000000 c2513300 c2513300 00000000 00000000 c1c49e9c c1c49e9c b11aba32 b11aba32 c1c49ea4 c1c49ea4 b11abc52 b11abc52
> >
> > [ 13.214380] c1c49ec4
> > [ 13.214380] c1c49ec4 b1204fb5 b1204fb5 c1c49ebc c1c49ebc b10deca7 b10deca7 b0018000 b0018000 bbf20984 bbf20984 00000100 00000100 bbf20980 bbf20980
> >
> > [ 13.214380] Call Trace:
> > [ 13.214380] Call Trace:
> > [ 13.214380] [<b11aba32>] __free_pages+0x7e/0x8e
> > [ 13.214380] [<b11aba32>] __free_pages+0x7e/0x8e
> > [ 13.214380] [<b11abc52>] __free_kmem_pages+0x16/0x26
> > [ 13.214380] [<b11abc52>] __free_kmem_pages+0x16/0x26
> > [ 13.214380] [<b1204fb5>] kfree+0x292/0x4e1
> > [ 13.214380] [<b1204fb5>] kfree+0x292/0x4e1
> > [ 13.214380] [<b10deca7>] ? debug_mutex_unlock+0x2f3/0x398
> > [ 13.214380] [<b10deca7>] ? debug_mutex_unlock+0x2f3/0x398
> > [ 13.214380] [<b14f5c36>] destruct_tty_driver+0xee/0x158
> > [ 13.214380] [<b14f5c36>] destruct_tty_driver+0xee/0x158
> > [ 13.214380] [<b14f5fe5>] tty_driver_kref_put+0xb4/0xc6
> > [ 13.214380] [<b14f5fe5>] tty_driver_kref_put+0xb4/0xc6
> > [ 13.214380] [<b14f6121>] put_tty_driver+0x16/0x26
> > [ 13.214380] [<b14f6121>] put_tty_driver+0x16/0x26
> > [ 13.214380] [<b26aa177>] rp_init+0xc04/0xc23
> > [ 13.214380] [<b26aa177>] rp_init+0xc04/0xc23
> > [ 13.214380] [<b12050e9>] ? kfree+0x3c6/0x4e1
> > [ 13.214380] [<b12050e9>] ? kfree+0x3c6/0x4e1
> > [ 13.214380] [<b26a9573>] ? register_PCI+0x1091/0x1091
> > [ 13.214380] [<b26a9573>] ? register_PCI+0x1091/0x1091
> > [ 13.214380] [<b26469d2>] do_one_initcall+0x1ed/0x356
> > [ 13.214380] [<b26469d2>] do_one_initcall+0x1ed/0x356
> > [ 13.214380] [<b2646d8b>] kernel_init_freeable+0x250/0x3ab
> > [ 13.214380] [<b2646d8b>] kernel_init_freeable+0x250/0x3ab
> > [ 13.214380] [<b1d95dd0>] kernel_init+0x16/0x1e7
> > [ 13.214380] [<b1d95dd0>] kernel_init+0x16/0x1e7
> > [ 13.214380] [<b1dcdd01>] ret_from_kernel_thread+0x21/0x30
> > [ 13.214380] [<b1dcdd01>] ret_from_kernel_thread+0x21/0x30
> > [ 13.214380] [<b1d95dba>] ? rest_init+0x15b/0x15b
> > [ 13.214380] [<b1d95dba>] ? rest_init+0x15b/0x15b
> > [ 13.214380] Code:
> > [ 13.214380] Code: 31 31 d0 d0 29 29 d0 d0 c1 c1 e0 e0 05 05 80 80 3d 3d 00 00 30 30 64 64 b2 b2 00 00 8d 8d 14 14 03 03 74 74 71 71 83 83 05 05 28 28 90 90 da da b2 b2 01 01 89 89 d0 d0 89 89 55 55 e8 e8 83 83 15 15 2c 2c 90 90 da da b2 b2 00 00 e8 e8 67 67 d9 d9 05 05 00 00 <8b> <8b> 00 00 83 83 05 05 30 30 90 90 da da b2 b2 01 01 83 83 15 15 34 34 90 90 da da b2 b2 00 00 a8 a8 02 02 8b 8b 55 55 e8 e8
> >
> > [ 13.214380] EIP: [<b11ab6fe>]
> > [ 13.214380] EIP: [<b11ab6fe>] __free_pages_ok+0x376/0x62c__free_pages_ok+0x376/0x62c SS:ESP 0068:c1c49e64
> > SS:ESP 0068:c1c49e64
> > [ 13.214380] CR2: 00000000c2446ffc
> > [ 13.214380] CR2: 00000000c2446ffc
> > [ 13.214380] ---[ end trace fe261d43ae421f43 ]---
> > [ 13.214380] ---[ end trace fe261d43ae421f43 ]---
> >
> > git bisect start 3bcf494d225fd193d02e8cb2e2c3fe3cc476ff3f 5d01410fe4d92081f349b013a2e7a95429e4f2c9 --
> > git bisect good 14692f2c9f01c7f21f83d41a8cb99fea1e4f803f # 10:20 35+ 0 Merge remote-tracking branch 'dlm/next'
> > git bisect good 17623427488fe306376e18e0ee63c2c1bcbf5612 # 10:42 35+ 0 Merge remote-tracking branch 'edac-amd/for-next'
> > git bisect good 6acfd0c5752274ad5099152d9a00c99f81c273b5 # 10:54 35+ 0 Merge remote-tracking branch 'char-misc/char-misc-next'
> > git bisect good 574733068e280900745b7241a51f26815f25ca64 # 11:24 35+ 5 Merge remote-tracking branch 'userns/for-next'
> > git bisect good d3d6c2b2574a1700a33c3f40a8adcd11db728926 # 11:36 35+ 11 Merge remote-tracking branch 'llvmlinux/for-next'
> > git bisect good 749230afd0fa54770f95063071b1bdfb6dee9bc2 # 11:45 35+ 13 Merge remote-tracking branch 'y2038/y2038'
> > git bisect bad 35cc8c3f978f75a04ac96b3cb72b8f7630ea04f4 # 11:50 0- 1 Merge branch 'akpm-current/current'
> > git bisect bad 6aab9099af555bf5a464f318d312ba5baa5cf516 # 11:59 0- 1 stacktrace: introduce snprint_stack_trace for buffer output
> > git bisect good 15c2416b0e6f21f17152e0ba32202bb1354394e3 # 12:10 35+ 18 mm-compaction-more-focused-lru-and-pcplists-draining-fix
> > git bisect good c5c825302103a196aa94efa121c011121ffff14b # 12:17 35+ 2 uprobes: share the i_mmap_rwsem
> > git bisect good b225ec73923a04a6d00dd28c6372c167780921b8 # 12:24 35+ 0 hugetlb: hugetlb_register_all_nodes(): add __init marker
> > git bisect good 4fb10ba778d4c4ccefee3ce833e487a6695068b1 # 12:32 35+ 1 mm: support madvise(MADV_FREE)
> > git bisect good 0aba43a2670028ec26cfeb59d3c2610ab0ee140b # 12:42 35+ 4 arm64: add pmd_[dirty|mkclean] for THP
> > git bisect bad 1e491e9be4c97229a3a88763aada9582e37c7eaf # 12:51 0- 1 mm/debug-pagealloc: prepare boottime configurable on/off
> > git bisect good 34bf7903e195347898a225220357f3a49dd65e7e # 12:57 35+ 0 mm/page_ext: resurrect struct page extending code for debugging
> > # first bad commit: [1e491e9be4c97229a3a88763aada9582e37c7eaf] mm/debug-pagealloc: prepare boottime configurable on/off
> > git bisect good 34bf7903e195347898a225220357f3a49dd65e7e # 13:01 105+ 10 mm/page_ext: resurrect struct page extending code for debugging
> > # extra tests on HEAD of next/master
> > git bisect bad 3bcf494d225fd193d02e8cb2e2c3fe3cc476ff3f # 13:01 0- 3 Add linux-next specific files for 20141127
> > # extra tests on tree/branch next/master
> > git bisect bad 3bcf494d225fd193d02e8cb2e2c3fe3cc476ff3f # 13:01 0- 3 Add linux-next specific files for 20141127
> > # extra tests on tree/branch linus/master
> > git bisect good 98e8d2e094de67315f786cd81b1dccb4ac040cc2 # 13:11 105+ 21 Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
> > # extra tests on tree/branch next/master
> > git bisect bad 3bcf494d225fd193d02e8cb2e2c3fe3cc476ff3f # 13:11 0- 3 Add linux-next specific files for 20141127
>
> Hello, Fengguang.
>
> First of all, thanks for reporting!
> It always helps me a lot.
>
> But, in this time, I can't reproduce this failure with your attached

Judging from the bisect log, it showed up once per 26 boots, so may
not be easy to reproduce. The other bisect below looks easier to
reproduce.

> configuration. Instead of this failure, sometimes, OOM happens in my
> testing with your configuration. I don't know why it happens. :)

Yes, it's not really obvious how the change will be related to this BUG.

> Could you have another configuration to trigger this bug?

Sure. Below is another bisect result.

> Anyway, this calltrace is really suspicious, because it looks like
> failure of tty driver initialization. Why it is failed in this early
> phase? Maybe, it is also related to OOM mentioned above.
>
> Thanks.

Thanks,
Fengguang

PS. one more bisect report.

+-------------------------------------------------+------------+------------+---------------+
| | 34bf7903e1 | 1e491e9be4 | next-20141127 |
+-------------------------------------------------+------------+------------+---------------+
| boot_successes | 194 | 0 | 0 |
| boot_failures | 46 | 80 | 12 |
| BUG:kernel_early_hang_without_any_printk_output | 46 | 31 | |
| BUG:unable_to_handle_kernel | 0 | 49 | 12 |
| Oops | 0 | 49 | 12 |
| EIP_is_at_free_one_page | 0 | 49 | 12 |
| Kernel_panic-not_syncing:Fatal_exception | 0 | 49 | 12 |
| backtrace:rhashtable_expand | 0 | 49 | 6 |
| backtrace:test_rht_init | 0 | 49 | 12 |
| backtrace:kernel_init_freeable | 0 | 49 | 12 |
| backtrace:rhashtable_shrink | 0 | 0 | 6 |
+-------------------------------------------------+------------+------------+---------------+

[ 0.466754] Verifying lookups...
[ 0.467836] Table expansion iteration 3...
[ 0.467836] Table expansion iteration 3...
[ 0.480155] BUG: unable to handle kernel
[ 0.480155] BUG: unable to handle kernel paging requestpaging request at d26bdffc
at d26bdffc
[ 0.481566] IP:
[ 0.481566] IP: [<c110bc7a>] free_one_page+0x31a/0x3e0
[<c110bc7a>] free_one_page+0x31a/0x3e0
[ 0.482801] *pdpt = 0000000001866001
[ 0.482801] *pdpt = 0000000001866001 *pde = 0000000012584067 *pde = 0000000012584067 *pte = 80000000126bd060 *pte = 80000000126bd060

[ 0.483333] Oops: 0000 [#1]
[ 0.483333] Oops: 0000 [#1] SMP SMP DEBUG_PAGEALLOCDEBUG_PAGEALLOC

[ 0.483333] Modules linked in:
[ 0.483333] Modules linked in:

[ 0.483333] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc6-00201-g1e491e9 #11
[ 0.483333] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc6-00201-g1e491e9 #11
[ 0.483333] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 0.483333] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 0.483333] task: c0016430 ti: c0040000 task.ti: c0040000
[ 0.483333] task: c0016430 ti: c0040000 task.ti: c0040000
[ 0.483333] EIP: 0060:[<c110bc7a>] EFLAGS: 00010086 CPU: 0
[ 0.483333] EIP: 0060:[<c110bc7a>] EFLAGS: 00010086 CPU: 0
[ 0.483333] EIP is at free_one_page+0x31a/0x3e0
[ 0.483333] EIP is at free_one_page+0x31a/0x3e0
[ 0.483333] EAX: d26bdffc EBX: d22f8000 ECX: 00000008 EDX: d26be000
[ 0.483333] EAX: d26bdffc EBX: d22f8000 ECX: 00000008 EDX: d26be000
[ 0.483333] ESI: 00000007 EDI: d22fb000 EBP: c0041e08 ESP: c0041de0
[ 0.483333] ESI: 00000007 EDI: d22fb000 EBP: c0041e08 ESP: c0041de0
[ 0.483333] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 0.483333] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 0.483333] CR0: 8005003b CR2: d26bdffc CR3: 01869000 CR4: 000006b0
[ 0.483333] CR0: 8005003b CR2: d26bdffc CR3: 01869000 CR4: 000006b0
[ 0.483333] Stack:
[ 0.483333] Stack:
[ 0.483333] 00000040
[ 0.483333] 00000040 d3fdb060 d3fdb060 d22fa000 d22fa000 d3fdb000 d3fdb000 00000080 00000080 00000008 00000008 d22fa000 d22fa000 d22fb800 d22fb800

[ 0.483333] 00000000
[ 0.483333] 00000000 00000006 00000006 c0041e2c c0041e2c c110bdec c110bdec 00000006 00000006 00000000 00000000 000001c0 000001c0 00000246 00000246

[ 0.483333] c014acb8
[ 0.483333] c014acb8 c01c0000 c01c0000 c01c0000 c01c0000 c0041e34 c0041e34 c110c769 c110c769 c0041e4c c0041e4c c1144ca5 c1144ca5 c111b595 c111b595

[ 0.483333] Call Trace:
[ 0.483333] Call Trace:
[ 0.483333] [<c110bdec>] __free_pages_ok+0xac/0xf0
[ 0.483333] [<c110bdec>] __free_pages_ok+0xac/0xf0
[ 0.483333] [<c110c769>] __free_pages+0x19/0x30
[ 0.483333] [<c110c769>] __free_pages+0x19/0x30
[ 0.483333] [<c1144ca5>] kfree+0x75/0xf0
[ 0.483333] [<c1144ca5>] kfree+0x75/0xf0
[ 0.483333] [<c111b595>] ? kvfree+0x45/0x50
[ 0.483333] [<c111b595>] ? kvfree+0x45/0x50
[ 0.483333] [<c111b595>] kvfree+0x45/0x50
[ 0.483333] [<c111b595>] kvfree+0x45/0x50
[ 0.483333] [<c134bb73>] rhashtable_expand+0x1b3/0x1e0
[ 0.483333] [<c134bb73>] rhashtable_expand+0x1b3/0x1e0
[ 0.483333] [<c17fc9f9>] test_rht_init+0x173/0x2e8
[ 0.483333] [<c17fc9f9>] test_rht_init+0x173/0x2e8
[ 0.483333] [<c134b750>] ? jhash2+0xe0/0xe0
[ 0.483333] [<c134b750>] ? jhash2+0xe0/0xe0
[ 0.483333] [<c134b790>] ? rhashtable_hashfn+0x20/0x20
[ 0.483333] [<c134b790>] ? rhashtable_hashfn+0x20/0x20
[ 0.483333] [<c134b7b0>] ? rht_grow_above_75+0x20/0x20
[ 0.483333] [<c134b7b0>] ? rht_grow_above_75+0x20/0x20
[ 0.483333] [<c134b7d0>] ? rht_shrink_below_30+0x20/0x20
[ 0.483333] [<c134b7d0>] ? rht_shrink_below_30+0x20/0x20
[ 0.483333] [<c134b750>] ? jhash2+0xe0/0xe0
[ 0.483333] [<c134b750>] ? jhash2+0xe0/0xe0
[ 0.483333] [<c134b790>] ? rhashtable_hashfn+0x20/0x20
[ 0.483333] [<c134b790>] ? rhashtable_hashfn+0x20/0x20
[ 0.483333] [<c134b7b0>] ? rht_grow_above_75+0x20/0x20
[ 0.483333] [<c134b7b0>] ? rht_grow_above_75+0x20/0x20
[ 0.483333] [<c134b7d0>] ? rht_shrink_below_30+0x20/0x20
[ 0.483333] [<c134b7d0>] ? rht_shrink_below_30+0x20/0x20
[ 0.483333] [<c17fc886>] ? test_rht_lookup+0x8f/0x8f
[ 0.483333] [<c17fc886>] ? test_rht_lookup+0x8f/0x8f
[ 0.483333] [<c1000486>] do_one_initcall+0xc6/0x210
[ 0.483333] [<c1000486>] do_one_initcall+0xc6/0x210
[ 0.483333] [<c17fc886>] ? test_rht_lookup+0x8f/0x8f
[ 0.483333] [<c17fc886>] ? test_rht_lookup+0x8f/0x8f
[ 0.483333] [<c17d0505>] ? repair_env_string+0x12/0x54
[ 0.483333] [<c17d0505>] ? repair_env_string+0x12/0x54
[ 0.483333] [<c17d0cf3>] kernel_init_freeable+0x193/0x213
[ 0.483333] [<c17d0cf3>] kernel_init_freeable+0x193/0x213
[ 0.483333] [<c1512500>] kernel_init+0x10/0xf0
[ 0.483333] [<c1512500>] kernel_init+0x10/0xf0
[ 0.483333] [<c151c5c1>] ret_from_kernel_thread+0x21/0x30
[ 0.483333] [<c151c5c1>] ret_from_kernel_thread+0x21/0x30
[ 0.483333] [<c15124f0>] ? rest_init+0xb0/0xb0
[ 0.483333] [<c15124f0>] ? rest_init+0xb0/0xb0
[ 0.483333] Code:
[ 0.483333] Code: 85 85 b1 b1 00 00 00 00 00 00 83 83 7d 7d 0c 0c 05 05 0f 0f 85 85 5c 5c fd fd ff ff ff ff e9 e9 79 79 fd fd ff ff ff ff 8d 8d b6 b6 00 00 00 00 00 00 00 00 8d 8d bc bc 27 27 00 00 00 00 00 00 00 00 89 89 4d 4d ec ec 89 89 d8 d8 e8 e8 16 16 4d 4d 04 04 00 00 <8b> <8b> 00 00 a8 a8 02 02 8b 8b 4d 4d ec ec 0f 0f 84 84 93 93 fe fe ff ff ff ff 3b 3b 4b 4b 1c 1c 74 74 11 11 8b 8b 43 43 0c 0c

[ 0.483333] EIP: [<c110bc7a>]
[ 0.483333] EIP: [<c110bc7a>] free_one_page+0x31a/0x3e0free_one_page+0x31a/0x3e0 SS:ESP 0068:c0041de0
SS:ESP 0068:c0041de0
[ 0.483333] CR2: 00000000d26bdffc
[ 0.483333] CR2: 00000000d26bdffc
[ 0.483333] ---[ end trace 7648e12f817ef2ad ]---
[ 0.483333] ---[ end trace 7648e12f817ef2ad ]---

git bisect start 3bcf494d225fd193d02e8cb2e2c3fe3cc476ff3f 5d01410fe4d92081f349b013a2e7a95429e4f2c9 --
git bisect good 14692f2c9f01c7f21f83d41a8cb99fea1e4f803f # 16:17 60+ 10 Merge remote-tracking branch 'dlm/next'
git bisect good 17623427488fe306376e18e0ee63c2c1bcbf5612 # 16:19 60+ 80 Merge remote-tracking branch 'edac-amd/for-next'
git bisect good 6acfd0c5752274ad5099152d9a00c99f81c273b5 # 16:22 60+ 80 Merge remote-tracking branch 'char-misc/char-misc-next'
git bisect good 574733068e280900745b7241a51f26815f25ca64 # 16:25 60+ 80 Merge remote-tracking branch 'userns/for-next'
git bisect good d3d6c2b2574a1700a33c3f40a8adcd11db728926 # 16:28 60+ 80 Merge remote-tracking branch 'llvmlinux/for-next'
git bisect good 749230afd0fa54770f95063071b1bdfb6dee9bc2 # 16:31 60+ 80 Merge remote-tracking branch 'y2038/y2038'
git bisect bad 35cc8c3f978f75a04ac96b3cb72b8f7630ea04f4 # 16:31 0- 20 Merge branch 'akpm-current/current'
git bisect bad 6aab9099af555bf5a464f318d312ba5baa5cf516 # 16:31 0- 20 stacktrace: introduce snprint_stack_trace for buffer output
git bisect good 15c2416b0e6f21f17152e0ba32202bb1354394e3 # 16:33 60+ 21 mm-compaction-more-focused-lru-and-pcplists-draining-fix
git bisect good c5c825302103a196aa94efa121c011121ffff14b # 16:35 60+ 19 uprobes: share the i_mmap_rwsem
git bisect good b225ec73923a04a6d00dd28c6372c167780921b8 # 16:37 60+ 21 hugetlb: hugetlb_register_all_nodes(): add __init marker
git bisect good 4fb10ba778d4c4ccefee3ce833e487a6695068b1 # 16:39 60+ 21 mm: support madvise(MADV_FREE)
git bisect good 0aba43a2670028ec26cfeb59d3c2610ab0ee140b # 16:43 60+ 24 arm64: add pmd_[dirty|mkclean] for THP
git bisect bad 1e491e9be4c97229a3a88763aada9582e37c7eaf # 16:43 0- 80 mm/debug-pagealloc: prepare boottime configurable on/off
git bisect good 34bf7903e195347898a225220357f3a49dd65e7e # 16:50 60+ 15 mm/page_ext: resurrect struct page extending code for debugging
# first bad commit: [1e491e9be4c97229a3a88763aada9582e37c7eaf] mm/debug-pagealloc: prepare boottime configurable on/off
git bisect good 34bf7903e195347898a225220357f3a49dd65e7e # 16:55 180+ 46 mm/page_ext: resurrect struct page extending code for debugging
# extra tests on HEAD of next/master
git bisect bad 3bcf494d225fd193d02e8cb2e2c3fe3cc476ff3f # 16:55 0- 12 Add linux-next specific files for 20141127
# extra tests on tree/branch next/master
git bisect bad 29340b8ee0b4def844aa8d6b09babcede80993d2 # 17:01 0- 61 Add linux-next specific files for 20141128
# extra tests on tree/branch linus/master
git bisect good 98e8d2e094de67315f786cd81b1dccb4ac040cc2 # 17:07 180+ 26 Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
# extra tests on tree/branch next/master
git bisect bad 29340b8ee0b4def844aa8d6b09babcede80993d2 # 17:07 0- 180 Add linux-next specific files for 20141128


This script may reproduce the error.

----------------------------------------------------------------------------
#!/bin/bash

kernel=$1
initrd=quantal-core-i386.cgz

wget --no-clobber https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd

kvm=(
qemu-system-x86_64
-cpu kvm64
-enable-kvm
-kernel $kernel
-initrd $initrd
-m 320
-smp 2
-net nic,vlan=1,model=e1000
-net user,vlan=1
-boot order=nc
-no-reboot
-watchdog i6300esb
-rtc base=localtime
-serial stdio
-display none
-monitor null
)

append=(
hung_task_panic=1
earlyprintk=ttyS0,115200
debug
apic=debug
sysrq_always_enabled
rcupdate.rcu_cpu_stall_timeout=100
panic=-1
softlockup_panic=1
nmi_watchdog=panic
oops=panic
load_ramdisk=2
prompt_ramdisk=0
console=ttyS0,115200
console=tty0
vga=normal
root=/dev/ram0
rw
drbd.minor_count=8
)

"${kvm[@]}" --append "${append[*]}"
----------------------------------------------------------------------------

Thanks,
Fengguang


Attachments:
(No filename) (23.96 kB)
dmesg-quantal-client7-9:20141128161237:i386-randconfig-ib0-11281331:3.18.0-rc6-00201-g1e491e9:11 (41.44 kB)
config-3.18.0-rc6-00201-g1e491e9 (70.41 kB)
Download all attachments

2014-12-03 01:10:29

by Joonsoo Kim

[permalink] [raw]
Subject: Re: [mm] BUG: unable to handle kernel paging request at c2446ffc

On Mon, Dec 01, 2014 at 09:04:17PM -0800, Fengguang Wu wrote:
> > Hello, Fengguang.
> >
> > First of all, thanks for reporting!
> > It always helps me a lot.
> >
> > But, in this time, I can't reproduce this failure with your attached
>
> Judging from the bisect log, it showed up once per 26 boots, so may
> not be easy to reproduce. The other bisect below looks easier to
> reproduce.
>
> > configuration. Instead of this failure, sometimes, OOM happens in my
> > testing with your configuration. I don't know why it happens. :)
>
> Yes, it's not really obvious how the change will be related to this BUG.
>
> > Could you have another configuration to trigger this bug?
>
> Sure. Below is another bisect result.

Hello, Fengguang.

Great! I can reproduce the problem with you another bisect result.
Thanks a lot!
Will send the fix to Andrew.

Thanks.