2023-01-08 16:44:07

by kernel test robot

Subject: [linus:master] [mm, slub] fa9b88e459: kernel_BUG_at_include/linux/page-flags.h

Greetings,

FYI, we noticed kernel_BUG_at_include/linux/page-flags.h due to commit (built with gcc-11):

commit: fa9b88e459d710cadf3b01e8a64eda00cc91cdd6 ("mm, slub: refactor free debug processing")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[test failed on linux-next/master 469a89fd3bb73bb2eea628da2b3e0f695f80b7ce]

in testcase: boot

on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):


[ 14.864365][ T181] page:6b5e897b refcount:1 mapcount:0 mapping:00000000 index:0xedc20780 pfn:0x2dc20
[ 14.865191][ T181] head:e9959bfb order:1 compound_mapcount:0 compound_pincount:0
[ 14.865805][ T181] flags: 0x10200(slab|head|zone=0)
[ 14.866254][ T181] raw: 00010200 ead174b1 c0100a70 00000400 edc20780 00080002 ffffffff 00000001
[ 14.867009][ T181] raw: 00000000 00000000
[ 14.867363][ T181] head: 00010200 00000000 00000122 c01a0200 00000000 00010001 ffffffff 00000001
[ 14.868096][ T181] head: 00000000 00000000
[ 14.868449][ T181] page dumped because: VM_BUG_ON_PAGE(PageTail(page))
[ 14.869002][ T181] page_owner tracks the page as allocated
[ 14.869497][ T181] page last allocated via order 1, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 194, tgid 194 (lkp-setup-rootf), ts 14489802849, free_ts 14483904809
[ 14.871285][ T181] post_alloc_hook+0x1fa/0x280
[ 14.871703][ T181] get_page_from_freelist+0x226/0x310
[ 14.872149][ T181] __alloc_pages+0xdd/0x360
[ 14.872516][ T181] alloc_slab_page+0x12d/0x200
[ 14.872917][ T181] allocate_slab+0x6a/0x350
[ 14.873288][ T181] new_slab+0x48/0xc0
[ 14.873616][ T181] ___slab_alloc+0x8ff/0x11c0
[ 14.874084][ T181] __slab_alloc+0x47/0x80
[ 14.874559][ T181] kmem_cache_alloc+0x47a/0x610
[ 14.875048][ T181] mm_alloc+0x20/0x90
[ 14.875407][ T181] bprm_mm_init+0x24/0x180
[ 14.875773][ T181] alloc_bprm+0xc9/0x1a0
[ 14.876138][ T181] do_execveat_common+0x55/0x330
[ 14.876537][ T181] __ia32_sys_execve+0x64/0xa0
[ 14.876935][ T181] __do_fast_syscall_32+0x72/0xd0
[ 14.877320][ T181] do_fast_syscall_32+0x32/0x70
[ 14.877711][ T181] page last free stack trace:
[ 14.878105][ T181] free_pcp_prepare+0x34f/0x940
[ 14.878501][ T181] free_unref_page_prepare+0x29/0x210
[ 14.878969][ T181] free_unref_page+0x3a/0x3b0
[ 14.879348][ T181] __free_pages+0x187/0x1f0
[ 14.879751][ T181] __free_slab+0x1fd/0x350
[ 14.880238][ T181] free_slab+0x22/0x70
[ 14.880620][ T181] __slab_free+0x2ab/0x340
[ 14.881070][ T181] kmem_cache_free+0x288/0x2d0
[ 14.881457][ T181] free_task+0x6d/0xd0
[ 14.881790][ T181] __put_task_struct+0x10d/0x2d0
[ 14.882202][ T181] put_task_struct+0x9d/0x100
[ 14.882668][ T181] delayed_put_task_struct+0x5e/0x70
[ 14.883126][ T181] rcu_do_batch+0x267/0xab0
[ 14.883519][ T181] rcu_core+0x21b/0x460
[ 14.883879][ T181] rcu_core_si+0x16/0x30
[ 14.884231][ T181] __do_softirq+0x178/0x53b
[ 14.884634][ T181] ------------[ cut here ]------------
[ 14.885088][ T181] kernel BUG at include/linux/page-flags.h:319!
[ 14.885586][ T181] invalid opcode: 0000 [#1] SMP
[ 14.885973][ T181] CPU: 1 PID: 181 Comm: mountall Tainted: G S 6.1.0-rc2-00012-gfa9b88e459d7 #1 8a586a2648e09836656ae144dee8b8278b48383e
[ 14.887074][ T181] EIP: folio_flags+0x31/0x70
[ 14.887532][ T181] Code: 48 83 05 28 19 bf c5 01 ba 18 47 2f c4 55 89 e5 83 15 2c 19 bf c5 00 e8 2d 2e f6 ff 83 05 38 19 bf c5 01 83 15 3c 19 bf c5 00 <0f> 0b 83 05 40 19 bf c5 01 b8 14 32 a5 c4 83 15 44 19 bf c5 00 e8
[ 14.889269][ T181] EAX: 00000000 EBX: edc21680 ECX: 00000000 EDX: ffffffff
[ 14.889816][ T181] ESI: ead17500 EDI: c01a0480 EBP: edcdfd0c ESP: edcdfd0c
[ 14.890364][ T181] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010246
[ 14.890976][ T181] CR0: 80050033 CR2: b7d1e03c CR3: 2dcfb000 CR4: 000406d0
[ 14.891528][ T181] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 14.892094][ T181] DR6: fffe0ff0 DR7: 00000400
[ 14.892483][ T181] Call Trace:
[ 14.892754][ T181] kmem_cache_free+0xc3/0x2d0
[ 14.893139][ T181] __mmdrop+0xe4/0x210
[ 14.893478][ T181] ? lockdep_hardirqs_on_prepare+0x242/0x400
[ 14.893960][ T181] ? do_raw_spin_unlock+0x1c/0xa0
[ 14.894358][ T181] finish_task_switch+0x23e/0x370
[ 14.894760][ T181] ? finish_task_switch+0x42/0x370
[ 14.895168][ T181] __schedule+0x472/0x790
[ 14.895516][ T181] schedule+0x4a/0x130
[ 14.895830][ T181] pipe_read+0x417/0x780
[ 14.896159][ T181] ? prepare_to_wait_exclusive+0x160/0x160
[ 14.896603][ T181] vfs_read+0x35e/0x3f0
[ 14.896933][ T181] ksys_read+0x12e/0x1d0
[ 14.897286][ T181] __ia32_sys_read+0x1e/0x30
[ 14.897655][ T181] __do_fast_syscall_32+0x72/0xd0
[ 14.898059][ T181] ? kvm_clock_read+0x3f/0x70
[ 14.898452][ T181] ? kvm_sched_clock_read+0x16/0x40
[ 14.898877][ T181] ? sched_clock+0x16/0x30
[ 14.899222][ T181] ? sched_clock_cpu+0x12b/0x160
[ 14.899605][ T181] ? __lock_release+0x3bc/0x410
[ 14.899981][ T181] ? do_user_addr_fault+0x326/0xcb0
[ 14.900389][ T181] ? lockdep_hardirqs_on_prepare+0x242/0x400
[ 14.900849][ T181] ? irqentry_exit_to_user_mode+0x23/0x30
[ 14.901287][ T181] ? irqentry_exit+0x7f/0xc0
[ 14.901641][ T181] do_fast_syscall_32+0x32/0x70
[ 14.902018][ T181] do_SYSENTER_32+0x15/0x20
[ 14.902413][ T181] entry_SYSENTER_32+0xa2/0xfb
[ 14.902836][ T181] EIP: 0xb7efd549
[ 14.903129][ T181] Code: 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
[ 14.904668][ T181] EAX: ffffffda EBX: 0000000d ECX: bfc4fecf EDX: 00000001
[ 14.905396][ T181] ESI: 00522a78 EDI: 00526b28 EBP: bfc4fee8 ESP: bfc4fe08
[ 14.905942][ T181] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000246
[ 14.906593][ T181] Modules linked in:
[ 14.907010][ T181] ---[ end trace 0000000000000000 ]---
[ 14.907474][ T181] EIP: folio_flags+0x31/0x70
[ 14.907934][ T181] Code: 48 83 05 28 19 bf c5 01 ba 18 47 2f c4 55 89 e5 83 15 2c 19 bf c5 00 e8 2d 2e f6 ff 83 05 38 19 bf c5 01 83 15 3c 19 bf c5 00 <0f> 0b 83 05 40 19 bf c5 01 b8 14 32 a5 c4 83 15 44 19 bf c5 00 e8
[ 14.909475][ T181] EAX: 00000000 EBX: edc21680 ECX: 00000000 EDX: ffffffff
[ 14.910036][ T181] ESI: ead17500 EDI: c01a0480 EBP: edcdfd0c ESP: edcdfd0c
[ 14.910675][ T181] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010246
[ 14.911310][ T181] CR0: 80050033 CR2: b7d1e03c CR3: 2dcfb000 CR4: 000406d0
[ 14.911835][ T181] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 14.912399][ T181] DR6: fffe0ff0 DR7: 00000400
[ 14.912769][ T181] Kernel panic - not syncing: Fatal exception
[ 14.913288][ T181] Kernel Offset: disabled


If you fix the issue, kindly add the following tags:
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/oe-lkp/[email protected]


To reproduce:

# build kernel
cd linux
cp config-6.1.0-rc2-00012-gfa9b88e459d7 .config
make HOSTCC=gcc-11 CC=gcc-11 ARCH=i386 olddefconfig prepare modules_prepare bzImage modules
make HOSTCC=gcc-11 CC=gcc-11 ARCH=i386 INSTALL_MOD_PATH=<mod-install-dir> modules_install
cd <mod-install-dir>
find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz


git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email

# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests


Attachments:
(No filename) (7.91 kB)
config-6.1.0-rc2-00012-gfa9b88e459d7 (149.80 kB)
job-script (4.83 kB)
dmesg.xz (54.46 kB)

2023-01-10 00:07:56

by Vlastimil Babka

Subject: Re: [linus:master] [mm, slub] fa9b88e459: kernel_BUG_at_include/linux/page-flags.h

On 1/8/23 17:28, kernel test robot wrote:
> Greeting,
>
> FYI, we noticed kernel_BUG_at_include/linux/page-flags.h due to commit (built with gcc-11):
>
> commit: fa9b88e459d710cadf3b01e8a64eda00cc91cdd6 ("mm, slub: refactor free debug processing")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

Thanks for trying to bisect the root cause to an earlier commit than
0af8489b0216f, as we discussed. I strongly suspect the real cause is also
earlier than this commit, because the code changed here is only used with
SLUB_DEBUG, which is not enabled in your config. Only later, in 0af8489b0216f,
does it also become used by SLUB_TINY.
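
One way to confirm which of these options a given build actually enables is
to check the attached config file directly. The following is only a minimal
sketch: the helper names config_enabled and report_slub_options are
hypothetical, and it assumes the usual .config convention where an enabled
boolean option appears as a line of the form CONFIG_<NAME>=y.

```shell
#!/bin/sh
# Hypothetical helpers (not part of lkp-tests or the kernel tree).

# Succeeds if CONFIG_<symbol>=y appears in the given config file.
# Usage: config_enabled <config-file> <SYMBOL-without-CONFIG_-prefix>
config_enabled() {
    grep -q "^CONFIG_$2=y\$" "$1"
}

# Prints the state of the two SLUB options discussed in this thread.
# An option missing from the file is reported the same way as one
# that is explicitly "not set".
report_slub_options() {
    for sym in SLUB_DEBUG SLUB_TINY; do
        if config_enabled "$1" "$sym"; then
            echo "CONFIG_$sym=y"
        else
            echo "CONFIG_$sym is not set"
        fi
    done
}
```

For example, after sourcing the file, running
report_slub_options config-6.1.0-rc2-00012-gfa9b88e459d7 prints one line
per option.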

> [test failed on linux-next/master 469a89fd3bb73bb2eea628da2b3e0f695f80b7ce]
>
> in testcase: boot
>
> on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G
>
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>

2023-01-11 02:01:21

by kernel test robot

Subject: Re: [linus:master] [mm, slub] fa9b88e459: kernel_BUG_at_include/linux/page-flags.h

On Mon, Jan 09, 2023 at 11:57:12PM +0100, Vlastimil Babka wrote:
> On 1/8/23 17:28, kernel test robot wrote:
> > Greeting,
> >
> > FYI, we noticed kernel_BUG_at_include/linux/page-flags.h due to commit (built with gcc-11):
> >
> > commit: fa9b88e459d710cadf3b01e8a64eda00cc91cdd6 ("mm, slub: refactor free debug processing")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> Thanks for trying to bisect the root cause to earlier commit than
> 0af8489b0216f, as we discussed. Here I strongly suspect it's also earlier
> than this commit. Because the code changed here is only used with
> SLUB_DEBUG, which is not enabled in your config. Only later in 0af8489b0216f
> it becomes used also by SLUB_TINY.

Sorry for this false report. We don't have much confidence in the bisection
of this issue: the reproduction rate is very low, so it is difficult to mark
commits as good or bad during bisection.
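
For a failure that reproduces only rarely, one common approach is to repeat
the test several times per commit and call the commit bad on the first
failure. The sketch below is an illustration under that assumption, not a
description of how the 0-day bot bisects. The function name repeat_test is
hypothetical, and its return values follow the `git bisect run` convention
(0 means good, 1 means bad).

```shell
#!/bin/sh
# Hypothetical helper: run a test command up to N times.
# Returns 1 ("bad") as soon as one run fails, 0 ("good") if all runs pass.
# Usage: repeat_test <runs> <command> [args...]
repeat_test() {
    runs=$1
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        if ! "$@"; then
            echo "run $i/$runs failed: treating commit as bad" >&2
            return 1
        fi
        i=$((i + 1))
    done
    return 0
}
```

A passing result is only probabilistic: if a bad commit fails each run with
probability p, all N runs still pass with probability (1 - p)^N, so N has to
be large when p is small. That is why a low reproduction rate makes
individual good/bad marks unreliable.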

Thanks for the latest update at:
https://lore.kernel.org/all/[email protected]/
The test results so far have been summarized by Oliver in the previous
report. Will keep you updated if we find other clues. Thanks.

--
Best Regards,
Yujie

> > [test failed on linux-next/master 469a89fd3bb73bb2eea628da2b3e0f695f80b7ce]
> >
> > in testcase: boot
> >
> > on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G
> >
> > caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> >
>