2023-03-01 07:21:57

by kernel test robot

Subject: [linus:master] [mm/mmap] 04241ffe3f: Kernel_panic-not_syncing:Fatal_exception


Greeting,

FYI, we noticed Kernel_panic-not_syncing:Fatal_exception due to commit (built with gcc-11):

commit: 04241ffe3f0458d54c61cf6c9d58d703efda4dd5 ("mm/mmap: introduce dup_vma_anon() helper")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[test failed on linus/master f3a2439f20d918930cc4ae8f76fe1c1afd26958f]
[test failed on linux-next/master 7f7a8831520f12a3cf894b0627641fad33971221]

in testcase: trinity
version: trinity-static-i386-x86_64-1c734c75-1_2020-01-06
with following parameters:

runtime: 300s
group: group-04

test-description: Trinity is a Linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/


on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):


If you fix the issue, kindly add the following tags
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/oe-lkp/[email protected]


===========================================================================
Sorry, the traces below still contain unresolved (??:?) source locations.
We are investigating and will update after we fix this issue.
===========================================================================


[ 27.110980][ T4917] can: raw protocol
[ 27.111989][ T4917] initcall raw_module_init+0x0/0x1000 [can_raw] returned 0 after 1009 usecs
[ 27.122540][ T4918] calling bcm_module_init+0x0/0x1000 [can_bcm] @ 4918
[ 27.124271][ T4918] can: broadcast manager protocol
[ 27.125560][ T4918] initcall bcm_module_init+0x0/0x1000 [can_bcm] returned 0 after 1289 usecs
[ 27.437568][ T4968] general protection fault, probably for non-canonical address 0x63f7d3f61dcd64f8: 0000 [#1] SMP PTI
[ 27.440138][ T4968] CPU: 0 PID: 4968 Comm: trinity-c6 Not tainted 6.2.0-rc4-00441-g04241ffe3f04 #1
[ 27.442229][ T4968] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-5 04/01/2014
[ 27.444764][ T4968] RIP: 0010:anon_vma_clone (??:?)
[ 27.446920][ T4968] Code: 0c 00 00 e8 4c e1 ff ff 49 89 c4 48 85 c0 75 17 48 c7 45 58 00 00 00 00 48 89 ef e8 44 fe ff ff b8 f4 ff ff ff eb 67 45 31 ff <4c> 8b 73 08 4c 89 ff 49 8b 36 e8 07 e5 ff ff 4c 89 f2 4c 89 e6 48
All code
========
0: 0c 00 or $0x0,%al
2: 00 e8 add %ch,%al
4: 4c e1 ff rex.WR loope 0x6
7: ff 49 89 decl -0x77(%rcx)
a: c4 (bad)
b: 48 85 c0 test %rax,%rax
e: 75 17 jne 0x27
10: 48 c7 45 58 00 00 00 movq $0x0,0x58(%rbp)
17: 00
18: 48 89 ef mov %rbp,%rdi
1b: e8 44 fe ff ff callq 0xfffffffffffffe64
20: b8 f4 ff ff ff mov $0xfffffff4,%eax
25: eb 67 jmp 0x8e
27: 45 31 ff xor %r15d,%r15d
2a:* 4c 8b 73 08 mov 0x8(%rbx),%r14 <-- trapping instruction
2e: 4c 89 ff mov %r15,%rdi
31: 49 8b 36 mov (%r14),%rsi
34: e8 07 e5 ff ff callq 0xffffffffffffe540
39: 4c 89 f2 mov %r14,%rdx
3c: 4c 89 e6 mov %r12,%rsi
3f: 48 rex.W

Code starting with the faulting instruction
===========================================
0: 4c 8b 73 08 mov 0x8(%rbx),%r14
4: 4c 89 ff mov %r15,%rdi
7: 49 8b 36 mov (%r14),%rsi
a: e8 07 e5 ff ff callq 0xffffffffffffe516
f: 4c 89 f2 mov %r14,%rdx
12: 4c 89 e6 mov %r12,%rsi
15: 48 rex.W
[ 27.450788][ T4968] RSP: 0000:ffffc90000323cc8 EFLAGS: 00010286
[ 27.452201][ T4968] RAX: ffff88810cb6b2c0 RBX: 63f7d3f61dcd64f0 RCX: 0000000000002800
[ 27.454159][ T4968] RDX: ffff88810c843c00 RSI: ffff88810cb6b2c0 RDI: ffffffff811eb537
[ 27.456050][ T4968] RBP: ffff88817ceb5990 R08: 00000000ffffffff R09: 0000000000000000
[ 27.458086][ T4968] R10: 0000000000000001 R11: ffff888119a4480c R12: ffff88810cb6b2c0
[ 27.460061][ T4968] R13: 0000000000000000 R14: ffff88817ceb5990 R15: 0000000000000000
[ 27.461953][ T4968] FS: 0000000000000000(0000) GS:ffff88842fc00000(0063) knlGS:0000000008acb840
[ 27.464014][ T4968] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
[ 27.465582][ T4968] CR2: 0000000004000000 CR3: 0000000118cbe000 CR4: 00000000000406f0
[ 27.467580][ T4968] DR0: fffffffff6939000 DR1: 0000000000000000 DR2: 0000000000000000
[ 27.469573][ T4968] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 27.471579][ T4968] Call Trace:
[ 27.474129][ T4968] <TASK>
[ 27.474972][ T4968] __vma_adjust (??:?)
[ 27.476078][ T4968] ? mt_find (??:?)
[ 27.477120][ T4968] vma_merge (??:?)
[ 27.478184][ T4968] madvise_vma_behavior (madvise.c:?)
[ 27.479398][ T4968] do_madvise (??:?)
[ 27.480478][ T4968] __ia32_sys_madvise (??:?)
[ 27.481632][ T4968] do_int80_syscall_32 (??:?)
[ 27.482793][ T4968] entry_INT80_compat (??:?)
[ 27.483910][ T4968] RIP: 0023:0x80a3392
[ 27.484931][ T4968] Code: 89 c8 c3 90 8d 74 26 00 85 c0 c7 01 01 00 00 00 75 d8 a1 c8 a9 ac 08 eb d1 66 90 66 90 66 90 66 90 66 90 66 90 66 90 90 cd 80 <c3> 8d b6 00 00 00 00 8d bc 27 00 00 00 00 8b 10 a3 f0 a9 ac 08 85
All code
========
0: 89 c8 mov %ecx,%eax
2: c3 retq
3: 90 nop
4: 8d 74 26 00 lea 0x0(%rsi,%riz,1),%esi
8: 85 c0 test %eax,%eax
a: c7 01 01 00 00 00 movl $0x1,(%rcx)
10: 75 d8 jne 0xffffffffffffffea
12: a1 c8 a9 ac 08 eb d1 movabs 0x9066d1eb08aca9c8,%eax
19: 66 90
1b: 66 90 xchg %ax,%ax
1d: 66 90 xchg %ax,%ax
1f: 66 90 xchg %ax,%ax
21: 66 90 xchg %ax,%ax
23: 66 90 xchg %ax,%ax
25: 66 90 xchg %ax,%ax
27: 90 nop
28: cd 80 int $0x80
2a:* c3 retq <-- trapping instruction
2b: 8d b6 00 00 00 00 lea 0x0(%rsi),%esi
31: 8d bc 27 00 00 00 00 lea 0x0(%rdi,%riz,1),%edi
38: 8b 10 mov (%rax),%edx
3a: a3 .byte 0xa3
3b: f0 lock
3c: a9 .byte 0xa9
3d: ac lods %ds:(%rsi),%al
3e: 08 .byte 0x8
3f: 85 .byte 0x85

Code starting with the faulting instruction
===========================================
0: c3 retq
1: 8d b6 00 00 00 00 lea 0x0(%rsi),%esi
7: 8d bc 27 00 00 00 00 lea 0x0(%rdi,%riz,1),%edi
e: 8b 10 mov (%rax),%edx
10: a3 .byte 0xa3
11: f0 lock
12: a9 .byte 0xa9
13: ac lods %ds:(%rsi),%al
14: 08 .byte 0x8
15: 85 .byte 0x85
[ 27.493398][ T4968] RSP: 002b:00000000fffe31a8 EFLAGS: 00000292 ORIG_RAX: 00000000000000db
[ 27.495648][ T4968] RAX: ffffffffffffffda RBX: 00000000f703e000 RCX: 00000000000ec000
[ 27.497643][ T4968] RDX: 0000000000000000 RSI: 00000000dfffffff RDI: 000000007509d669
[ 27.499549][ T4968] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
[ 27.501564][ T4968] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 27.503516][ T4968] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 27.505185][ T4968] </TASK>
[ 27.506063][ T4968] Modules linked in: can_bcm can_raw can cn scsi_transport_iscsi sr_mod cdrom ata_generic
[ 27.508517][ T4968] ---[ end trace 0000000000000000 ]---
[ 27.509861][ T4968] RIP: 0010:anon_vma_clone (??:?)
[ 27.511207][ T4968] Code: 0c 00 00 e8 4c e1 ff ff 49 89 c4 48 85 c0 75 17 48 c7 45 58 00 00 00 00 48 89 ef e8 44 fe ff ff b8 f4 ff ff ff eb 67 45 31 ff <4c> 8b 73 08 4c 89 ff 49 8b 36 e8 07 e5 ff ff 4c 89 f2 4c 89 e6 48
All code
========
0: 0c 00 or $0x0,%al
2: 00 e8 add %ch,%al
4: 4c e1 ff rex.WR loope 0x6
7: ff 49 89 decl -0x77(%rcx)
a: c4 (bad)
b: 48 85 c0 test %rax,%rax
e: 75 17 jne 0x27
10: 48 c7 45 58 00 00 00 movq $0x0,0x58(%rbp)
17: 00
18: 48 89 ef mov %rbp,%rdi
1b: e8 44 fe ff ff callq 0xfffffffffffffe64
20: b8 f4 ff ff ff mov $0xfffffff4,%eax
25: eb 67 jmp 0x8e
27: 45 31 ff xor %r15d,%r15d
2a:* 4c 8b 73 08 mov 0x8(%rbx),%r14 <-- trapping instruction
2e: 4c 89 ff mov %r15,%rdi
31: 49 8b 36 mov (%r14),%rsi
34: e8 07 e5 ff ff callq 0xffffffffffffe540
39: 4c 89 f2 mov %r14,%rdx
3c: 4c 89 e6 mov %r12,%rsi
3f: 48 rex.W

Code starting with the faulting instruction
===========================================
0: 4c 8b 73 08 mov 0x8(%rbx),%r14
4: 4c 89 ff mov %r15,%rdi
7: 49 8b 36 mov (%r14),%rsi
a: e8 07 e5 ff ff callq 0xffffffffffffe516
f: 4c 89 f2 mov %r14,%rdx
12: 4c 89 e6 mov %r12,%rsi
15: 48 rex.W
[ 27.515465][ T4968] RSP: 0000:ffffc90000323cc8 EFLAGS: 00010286
[ 27.516952][ T4968] RAX: ffff88810cb6b2c0 RBX: 63f7d3f61dcd64f0 RCX: 0000000000002800
[ 27.518954][ T4968] RDX: ffff88810c843c00 RSI: ffff88810cb6b2c0 RDI: ffffffff811eb537
[ 27.520850][ T4968] RBP: ffff88817ceb5990 R08: 00000000ffffffff R09: 0000000000000000
[ 27.522647][ T4968] R10: 0000000000000001 R11: ffff888119a4480c R12: ffff88810cb6b2c0
[ 27.524607][ T4968] R13: 0000000000000000 R14: ffff88817ceb5990 R15: 0000000000000000
[ 27.527653][ T4968] FS: 0000000000000000(0000) GS:ffff88842fc00000(0063) knlGS:0000000008acb840
[ 27.532675][ T4968] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
[ 27.536282][ T4968] CR2: 0000000004000000 CR3: 0000000118cbe000 CR4: 00000000000406f0
[ 27.542599][ T4968] DR0: fffffffff6939000 DR1: 0000000000000000 DR2: 0000000000000000
[ 27.547699][ T4968] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 27.646310][ T4968] Kernel panic - not syncing: Fatal exception
[ 27.651934][ T4968] Kernel Offset: disabled

Kboot worker: lkp-worker46
Elapsed time: 60



To reproduce:

# build kernel
cd linux
cp config-6.2.0-rc4-00441-g04241ffe3f04 .config
make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
cd <mod-install-dir>
find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz


git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email

# if you come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.



--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests



Attachments:
(No filename) (10.99 kB)
config-6.2.0-rc4-00441-g04241ffe3f04 (127.89 kB)
job-script (4.37 kB)
dmesg.xz (27.55 kB)

2023-03-01 16:51:28

by Liam R. Howlett

Subject: Re: [linus:master] [mm/mmap] 04241ffe3f: Kernel_panic-not_syncing:Fatal_exception

* kernel test robot <[email protected]> [230301 02:21]:
>
> Greeting,
>
> FYI, we noticed Kernel_panic-not_syncing:Fatal_exception due to commit (built with gcc-11):
>
> commit: 04241ffe3f0458d54c61cf6c9d58d703efda4dd5 ("mm/mmap: introduce dup_vma_anon() helper")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> [test failed on linus/master f3a2439f20d918930cc4ae8f76fe1c1afd26958f]
> [test failed on linux-next/master 7f7a8831520f12a3cf894b0627641fad33971221]

I tracked the problem down in that commit. The fix is simple enough:

-----------------
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -802,10 +802,13 @@ int __vma_adjust(struct vma_iterator *vmi, struct vm_area_struct *vma,
* If next doesn't have anon_vma, import from vma after
* next, if the vma overlaps with it.
*/
- if (remove != NULL && !next->anon_vma)
+ if (remove2 != NULL && !next->anon_vma)
----------------

However, that will not fix the problem in linux-next or linus/master
since this code is completely changed shortly after.

You need the fix from Vlastimil (Cc'ed). After cherry-picking
07dc4b186203 ("mm/mremap: fix dup_anon_vma() in vma_merge() case 4") on
top of linus/master, I don't get this particular failure anymore.

I do get the "kernel BUG at mm/filemap.c:155!", so it might be masking
another problem. (Added Matthew to Cc)

I think the right thing to do is to include Vlastimil's fix.

Thanks,
Liam

...


2023-03-01 19:10:33

by Vlastimil Babka

Subject: Re: [linus:master] [mm/mmap] 04241ffe3f: Kernel_panic-not_syncing:Fatal_exception

On 3/1/23 17:50, Liam R. Howlett wrote:
> * kernel test robot <[email protected]> [230301 02:21]:
>>
>> Greeting,
>>
>> FYI, we noticed Kernel_panic-not_syncing:Fatal_exception due to commit (built with gcc-11):
>>
>> commit: 04241ffe3f0458d54c61cf6c9d58d703efda4dd5 ("mm/mmap: introduce dup_vma_anon() helper")
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>
>> [test failed on linus/master f3a2439f20d918930cc4ae8f76fe1c1afd26958f]
>> [test failed on linux-next/master 7f7a8831520f12a3cf894b0627641fad33971221]
>
> I tracked the problem down in that commit. The fix is simple enough:
>
> -----------------
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -802,10 +802,13 @@ int __vma_adjust(struct vma_iterator *vmi, struct vm_area_struct *vma,
> * If next doesn't have anon_vma, import from vma after
> * next, if the vma overlaps with it.
> */
> - if (remove != NULL && !next->anon_vma)
> + if (remove2 != NULL && !next->anon_vma)

Oh I actually did notice that one too, but as it was only temporary within
the series and already baked into git, thought there's no benefit in
pointing it out. A problem for bisect obviously as was just confirmed.

> ----------------
>
> However, that will not fix the problem in linux-next or linus/master
> since this code is completely changed shortly after.
>
> You need the fix from Vlastimil (Cc'ed). After cherry-picking

> 07dc4b186203 ("mm/mremap: fix dup_anon_vma() in vma_merge() case 4") on

Which tree is that?
The mm-hotfixes-stable commit is here:
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-hotfixes-stable&id=4c6759967826b87f56c73e0f1deb7b76379ccd23

> top of linus/master, I don't get this particular failure anymore.
>
> I do get the "kernel BUG at mm/filemap.c:155!", so it might be masking
> another problem. (Added Matthew to Cc)
>
> I think the right thing to do is to include Vlastimil's fix.

Great, thanks.

> Thanks,
> Liam
>
> ...
>


2023-03-01 20:35:51

by Liam R. Howlett

Subject: Re: [linus:master] [mm/mmap] 04241ffe3f: Kernel_panic-not_syncing:Fatal_exception

* Vlastimil Babka <[email protected]> [230301 14:10]:
> On 3/1/23 17:50, Liam R. Howlett wrote:
> > * kernel test robot <[email protected]> [230301 02:21]:
> >>
> >> Greeting,
> >>
> >> FYI, we noticed Kernel_panic-not_syncing:Fatal_exception due to commit (built with gcc-11):
> >>
> >> commit: 04241ffe3f0458d54c61cf6c9d58d703efda4dd5 ("mm/mmap: introduce dup_vma_anon() helper")
> >> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> >>
> >> [test failed on linus/master f3a2439f20d918930cc4ae8f76fe1c1afd26958f]
> >> [test failed on linux-next/master 7f7a8831520f12a3cf894b0627641fad33971221]
> >
> > I tracked the problem down in that commit. The fix is simple enough:
> >
> > -----------------
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -802,10 +802,13 @@ int __vma_adjust(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > * If next doesn't have anon_vma, import from vma after
> > * next, if the vma overlaps with it.
> > */
> > - if (remove != NULL && !next->anon_vma)
> > + if (remove2 != NULL && !next->anon_vma)
>
> Oh I actually did notice that one too, but as it was only temporary within
> the series and already baked into git, thought there's no benefit in
> pointing it out. A problem for bisect obviously as was just confirmed.
>
> > ----------------
> >
> > However, that will not fix the problem in linux-next or linus/master
> > since this code is completely changed shortly after.
> >
> > You need the fix from Vlastimil (Cc'ed). After cherry-picking
>
> > 07dc4b186203 ("mm/mremap: fix dup_anon_vma() in vma_merge() case 4") on
>
> Which tree is that?
> The mm-hotfixes-stable commit is here:
> https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-hotfixes-stable&id=4c6759967826b87f56c73e0f1deb7b76379ccd23

I got that git id from the build bot's previous report of the failure.
It appears that it isn't in any of my trees, so it must have come from a
linux-next daily tree at some point.

>
> > top of linus/master, I don't get this particular failure anymore.
> >
> > I do get the "kernel BUG at mm/filemap.c:155!", so it might be masking
> > another problem. (Added Matthew to Cc)
> >
> > I think the right thing to do is to include Vlastimil's fix.
>
> Great, thanks.
>
> > Thanks,
> > Liam
> >
> > ...
> >
>