2018-04-19 05:51:31

by Fengguang Wu

Subject: WARNING: stack going in the wrong direction? ip=__schedule+0x489/0x830

Hello,

FYI this warning dates back to v4.16-rc5. It's rather rare and often
happens together with other errors. For example,

[ 168.976238] perf: interrupt took too long (10016 > 9973), lowering kernel.perf_event_max_sample_rate to 19000
[ 171.793224] WARNING: stack going in the wrong direction? ip=__schedule+0x489/0x830
[ 225.573912] BUG: Bad page map in process sort pte:00000002 pmd:1b6303067
[ 225.574302] addr:00000000efb51519 vm_flags:00000070 anon_vma: (null) mapping:00000000c7c7d07a index:12f

[ 171.556542] perf: interrupt took too long (9849 > 9811), lowering kernel.perf_event_max_sample_rate to 20000
[ 172.667037] WARNING: stack going in the wrong direction? ip=sched_slice+0x51/0xa0
[ 350.325279] BUG: Bad page map in process wc pte:00000002 pmd:167b93067
[ 350.325595] addr:0000000006ece489 vm_flags:00000070 anon_vma: (null) mapping:00000000e8941173 index:1cd

[ 133.751073] WARNING: stack going in the wrong direction? ip=__schedule+0x489/0x830
[ 134.048965] perf: interrupt took too long (9682 > 9626), lowering kernel.perf_event_max_sample_rate to 20000
[ 134.472390] perf: interrupt took too long (12178 > 12102), lowering kernel.perf_event_max_sample_rate to 16000
[ 234.324541] 2018-04-17 16:08:50 umount /fs/pmem0
[ 234.324546]
[ 240.185400] WARNING: CPU: 0 PID: 6954 at kernel/workqueue.c:4142 destroy_workqueue+0x64/0x1e0

[ 174.376074] perf: interrupt took too long (7722 > 7721), lowering kernel.perf_event_max_sample_rate to 25000
[ 178.761072] WARNING: stack going in the wrong direction? ip=__schedule+0x489/0x830
[ 304.683193] usemem invoked oom-killer: gfp_mask=0x15080c0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0

[ 43.869050] perf: interrupt took too long (6180 > 6147), lowering kernel.perf_event_max_sample_rate to 32000
[ 48.272805] perf: interrupt took too long (7733 > 7725), lowering kernel.perf_event_max_sample_rate to 25000
[ 49.568211] WARNING: stack going in the wrong direction? ip=__slab_free+0x14b/0x2c0
[ 53.576116] perf: page allocation failure: order:2, mode:0x108c020(GFP_ATOMIC|__GFP_COMP|__GFP_ZERO), nodemask=(null)

[ 168.465169] perf: interrupt took too long (5016 > 4992), lowering kernel.perf_event_max_sample_rate to 39000
[ 168.529886] perf: interrupt took too long (6301 > 6270), lowering kernel.perf_event_max_sample_rate to 31000
[ 168.657802] perf: interrupt took too long (7979 > 7876), lowering kernel.perf_event_max_sample_rate to 25000
[ 168.976238] perf: interrupt took too long (10016 > 9973), lowering kernel.perf_event_max_sample_rate to 19000
[ 171.793224] WARNING: stack going in the wrong direction? ip=__schedule+0x489/0x830:
perf_sw_event_sched at include/linux/perf_event.h:1062
(inlined by) perf_event_task_sched_out at include/linux/perf_event.h:1100
(inlined by) prepare_task_switch at kernel/sched/core.c:2636
(inlined by) context_switch at kernel/sched/core.c:2813
(inlined by) __schedule at kernel/sched/core.c:3490
[ 225.573912] BUG: Bad page map in process sort pte:00000002 pmd:1b6303067
[ 225.574302] addr:00000000efb51519 vm_flags:00000070 anon_vma: (null) mapping:00000000c7c7d07a index:12f
[ 225.574820] file:libpthread-2.23.so fault:filemap_fault mmap:generic_file_mmap readpage:simple_readpage
[ 225.575327] CPU: 5 PID: 29228 Comm: sort Not tainted 4.17.0-rc1 #1
[ 225.575643] Hardware name: Dell Inc. Studio XPS 8000/0X231R, BIOS A01 08/11/2009
[ 225.576038] Call Trace:
[ 225.576207] dump_stack+0x5c/0x7b:
dump_stack at lib/dump_stack.c:115
[ 225.576409] print_bad_pte+0x1de/0x290:
print_bad_pte at mm/memory.c:776 (discriminator 12)
[ 225.576628] unmap_page_range+0x803/0xa20:
zap_pte_range at mm/memory.c:1384
(inlined by) zap_pmd_range at mm/memory.c:1441
(inlined by) zap_pud_range at mm/memory.c:1470
(inlined by) zap_p4d_range at mm/memory.c:1491
(inlined by) unmap_page_range at mm/memory.c:1512
[ 225.576855] unmap_vmas+0x4c/0xa0:
unmap_vmas at mm/memory.c:1586 (discriminator 3)
[ 225.577060] exit_mmap+0x82/0x150:
constant_test_bit at arch/x86/include/asm/bitops.h:328
(inlined by) mm_is_oom_victim at include/linux/oom.h:75
(inlined by) exit_mmap at mm/mmap.c:3040
[ 225.577263] mmput+0x67/0x160:
__mmput at kernel/fork.c:963
(inlined by) mmput at kernel/fork.c:983
[ 225.577453] do_exit+0x2a5/0xb80:
constant_test_bit at arch/x86/include/asm/bitops.h:328
(inlined by) test_ti_thread_flag at include/linux/thread_info.h:79
(inlined by) exit_mm at kernel/exit.c:545
(inlined by) do_exit at kernel/exit.c:852
[ 225.577652] ? __fput+0x18d/0x220:
__fput at fs/file_table.c:229
[ 225.577856] ? _cond_resched+0x19/0x30:
_cond_resched at kernel/sched/core.c:4982
[ 225.578078] do_group_exit+0x3a/0xa0:
__read_once_size at include/linux/compiler.h:188
(inlined by) list_empty at include/linux/list.h:203
(inlined by) thread_group_empty at include/linux/sched/signal.h:594
(inlined by) do_group_exit at kernel/exit.c:953
[ 225.578291] __x64_sys_exit_group+0x14/0x20:
__x64_sys_exit_group at kernel/exit.c:979
[ 225.578528] do_syscall_64+0x5b/0x180:
do_syscall_64 at arch/x86/entry/common.c:287
[ 225.578745] entry_SYSCALL_64_after_hwframe+0x44/0xa9:
entry_SYSCALL_64_after_hwframe at arch/x86/entry/entry_64.S:247
[ 225.579017] RIP: 0033:0x7f36a8f501c8
[ 225.579230] RSP: 002b:00007ffcf2d32458 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
[ 225.579635] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f36a8f501c8
[ 225.579988] RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
[ 225.580343] RBP: 00007f36a92308e0 R08: 00000000000000e7 R09: ffffffffffffff98
[ 225.580697] R10: 00007f36a9452250 R11: 0000000000000246 R12: 00007f36a92308e0
[ 225.581046] R13: 00007f36a9235c40 R14: 0000000000000000 R15: 0000000000000000
[ 225.581429] Disabling lock debugging due to kernel taint
[ 225.581791] BUG: Bad rss-counter state mm:00000000997e66fa idx:2 val:-1
[ 225.602808] general protection fault: 0000 [#1] SMP PTI
[ 225.603092] Modules linked in: netconsole sr_mod cdrom sd_mod sg snd_hda_codec_realtek intel_powerclamp snd_hda_codec_generic snd_hda_codec_hdmi coretemp uas snd_hda_intel kvm_intel ata_generic dcdbas pata_acpi snd_hda_codec dell_smm_hwmon snd_hda_core kvm snd_hwdep snd_pcm firewire_ohci irqbypass crc32c_intel usb_storage pcspkr snd_timer serio_raw firewire_core ata_piix crc_itu_t snd i7core_edac soundcore libata shpchp acpi_cpufreq ip_tables broadcom bcm_phy_lib
[ 225.604945] CPU: 3 PID: 29384 Comm: tee Tainted: G B 4.17.0-rc1 #1
[ 225.605370] Hardware name: Dell Inc. Studio XPS 8000/0X231R, BIOS A01 08/11/2009
[ 225.606395] RIP: 0010:kmem_cache_alloc+0xa0/0x1e0:
prefetch_freepointer at mm/slub.c:275
(inlined by) slab_alloc_node at mm/slub.c:2734
(inlined by) slab_alloc at mm/slub.c:2749
(inlined by) kmem_cache_alloc at mm/slub.c:2754
[ 225.606648] RSP: 0018:ffffc90003f3bc68 EFLAGS: 00010202
[ 225.606927] RAX: 0000000000000000 RBX: 0003ffff88018f94 RCX: 00000000004d265b
[ 225.607312] RDX: 00000000004d265a RSI: 00000000014080c0 RDI: 0000000000027360
[ 225.607665] RBP: ffff88018f94dff2 R08: ffff8801bfce7360 R09: ffff8801b7c44100
[ 225.608018] R10: ffffc90003f3bef0 R11: 8080808080808080 R12: 00000000014080c0
[ 225.608372] R13: ffffffff813b13f2 R14: ffff8801bf15ac00 R15: ffff8801bf15ac00
[ 225.608726] FS: 00007fcfb09a3700(0000) GS:ffff8801bfcc0000(0000) knlGS:0000000000000000
[ 225.609151] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 225.609450] CR2: 00007fcfb082a000 CR3: 00000001b7cf6000 CR4: 00000000000006e0
[ 225.609803] Call Trace:
[ 225.609975] selinux_file_alloc_security+0x32/0x50:
file_alloc_security at security/selinux/hooks.c:392
(inlined by) selinux_file_alloc_security at security/selinux/hooks.c:3549
[ 225.610239] security_file_alloc+0x22/0x40:
security_file_alloc at security/security.c:879 (discriminator 19)
[ 225.610474] get_empty_filp+0x8d/0x1b0:
get_empty_filp at fs/file_table.c:129
[ 225.610694] path_openat+0x2d/0x1710:
path_openat at fs/namei.c:3478
[ 225.610907] ? page_add_file_rmap+0x13/0x200:
page_add_file_rmap at mm/rmap.c:1184
[ 225.611149] ? alloc_set_pte+0x25e/0x520:
set_pte_at at arch/x86/include/asm/paravirt.h:458
(inlined by) alloc_set_pte at mm/memory.c:3449
[ 225.611376] ? filemap_map_pages+0x30a/0x320:
filemap_map_pages at mm/filemap.c:2681
[ 225.611616] do_filp_open+0x8c/0xf0:
do_filp_open at fs/namei.c:3536
[ 225.611825] ? __handle_mm_fault+0xd69/0x10a0:
do_fault_around at mm/memory.c:3611
(inlined by) do_read_fault at mm/memory.c:3627
(inlined by) do_fault at mm/memory.c:3732
(inlined by) handle_pte_fault at mm/memory.c:3963
(inlined by) __handle_mm_fault at mm/memory.c:4087
[ 225.612112] ? _cond_resched+0x19/0x30:
_cond_resched at kernel/sched/core.c:4982
[ 225.612331] ? __alloc_fd+0x44/0x180:
__alloc_fd at fs/file.c:505
[ 225.612542] ? do_sys_open+0x1a6/0x230:
do_sys_open at fs/open.c:1094
[ 225.612758] do_sys_open+0x1a6/0x230:
do_sys_open at fs/open.c:1094
[ 225.612971] do_syscall_64+0x5b/0x180:
do_syscall_64 at arch/x86/entry/common.c:287
[ 225.613187] entry_SYSCALL_64_after_hwframe+0x44/0xa9:
entry_SYSCALL_64_after_hwframe at arch/x86/entry/entry_64.S:247
[ 225.613454] RIP: 0033:0x7fcfb041191c
[ 225.613662] RSP: 002b:00007fff8fb09f40 EFLAGS: 00000202 ORIG_RAX: 0000000000000002
[ 225.614060] RAX: ffffffffffffffda RBX: 0000000000aca200 RCX: 00007fcfb041191c
[ 225.614407] RDX: 0000000000000001 RSI: 0000000000080000 RDI: 0000000000aca1d0
[ 225.614756] RBP: 00007fff8fb0a020 R08: 0000000000aca160 R09: 0000000000000300
[ 225.615102] R10: 00007fcfb04645b0 R11: 0000000000000202 R12: 0000000000000000
[ 225.615449] R13: 0000000000000000 R14: 00007fcfb05512a0 R15: 0000000000000002
[ 225.615797] Code: 01 00 00 41 8b 46 20 49 8b 3e 48 8d 4a 01 48 8b 5c 05 00 48 89 e8 65 48 0f c7 0f 0f 94 c0 84 c0 74 ba 48 85 db 74 0b 41 8b 46 20 <48> 8b 04 03 0f 18 08 41 f7 c4 00 80 00 00 0f 85 08 01 00 00 66
[ 225.616765] RIP: kmem_cache_alloc+0xa0/0x1e0:
prefetch_freepointer at mm/slub.c:275
(inlined by) slab_alloc_node at mm/slub.c:2734
(inlined by) slab_alloc at mm/slub.c:2749
(inlined by) kmem_cache_alloc at mm/slub.c:2754 RSP: ffffc90003f3bc68
[ 225.617102] ---[ end trace 8032327fef00e4ff ]---
[ 225.617353] Kernel panic - not syncing: Fatal exception

Attached are the full dmesg and kconfig.

Thanks,
Fengguang


Attachments:
(No filename) (10.64 kB)
dmesg-nhm-white:20180416174615:x86_64-rhel-7.2:gcc-7:4.17.0-rc1:1 (79.55 kB)
.config (166.89 kB)

2018-04-19 05:59:36

by Fengguang Wu

Subject: Re: WARNING: stack going in the wrong direction? ip=__schedule+0x489/0x830

On Thu, Apr 19, 2018 at 01:49:41PM +0800, Fengguang Wu wrote:
>Hello,
>
>FYI this warning dates back to v4.16-rc5 .

>It's rather rare and often happen together with other errors.

Sorry, that should be: 0day didn't catch this particular WARNING,
so it just occasionally shows up in the context of other errors.

I just added that WARNING pattern to 0day and hope we can get more
information about it.
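
For illustration only, a pattern along these lines could pick the warning
out of a dmesg. This is a minimal Python sketch, not 0day's actual pattern
syntax (which is not shown in this thread); the names WRONG_DIRECTION and
find_wrong_direction_warnings are hypothetical.

```python
import re

# Hypothetical pattern; 0day's real pattern format is not shown in this thread.
# It matches the warning text and captures the symbol+offset/size that follows "ip=".
WRONG_DIRECTION = re.compile(
    r"WARNING: stack going in the wrong direction\? "
    r"ip=(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)


def find_wrong_direction_warnings(dmesg_text):
    """Return a (symbol, offset, size) tuple for each matching dmesg line."""
    hits = []
    for line in dmesg_text.splitlines():
        match = WRONG_DIRECTION.search(line)
        if match:
            hits.append((
                match.group("symbol"),
                int(match.group("offset"), 16),  # offset into the function
                int(match.group("size"), 16),    # total size of the function
            ))
    return hits
```

Run over the excerpts quoted earlier in the thread, this would report
__schedule, sched_slice and __slab_free as the ip symbols seen so far.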

Thanks,
Fengguang