2024-02-07 06:43:10

by Oliver Sang

Subject: [linus:master] [kobject] 1b28cb81da: canonical_address#:#[##]



Hello,

kernel test robot noticed "canonical_address#:#[##]" on:

commit: 1b28cb81dab7c1eedc6034206f4e8d644046ad31 ("kobject: Remove redundant checks for whether ktype is NULL")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[test failed on linus/master 54be6c6c5ae8e0d93a6c4641cb7528eb0b6ba478 (v6.8-rc3)]
[test failed on linux-next/master 076d56d74f17e625b3d63cf4743b3d7d02180379]

in testcase: boot

compiler: gcc-11
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)



we noticed this issue is intermittent: as shown below, it was observed in 4 out
of 68 runs, but we didn't see it on the parent commit.

meanwhile, we noticed another intermittent issue on the parent; its dmesg is
attached as dmesg-4d0fe8c52b.xz. we didn't see that issue on 1b28cb81da. FYI
(the dmesg for 1b28cb81da is uploaded to [1])

4d0fe8c52bb3029d 1b28cb81dab7c1eedc6034206f4
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :68           6%           4:68     dmesg.Kernel_panic-not_syncing:Fatal_exception
           :68           6%           4:68     dmesg.RIP:__kobject_del
          5:68          -7%            :68     dmesg.RIP:kobject_put
          5:68          -7%            :68     dmesg.RIP:refcount_warn_saturate
          5:68          -7%            :68     dmesg.WARNING:at_lib/kobject.c:#kobject_put
          5:68          -7%            :68     dmesg.WARNING:at_lib/refcount.c:#refcount_warn_saturate
           :68           6%           4:68     dmesg.canonical_address#:#[##]



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-lkp/[email protected]



[ 75.238491][ T1] IRQ13 -> 0:13
[ 75.238931][ T1] IRQ14 -> 0:14
[ 75.239377][ T1] IRQ15 -> 0:15
[ 75.239833][ T1] .................................... done.
[ 75.342389][ T1] sched_clock: Marking stable (75300013917, 40623244)->(75406217264, -65580103)
[ 84.458843][ T6] general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN
[ 84.459548][ T1] kmemleak: Kernel memory leak detector initialized (mem pool available: 13821)
[ 84.460356][ T6] KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
[ 84.460364][ T6] CPU: 0 PID: 6 Comm: kworker/0:0 Tainted: G N 6.5.0-rc4-00030-g1b28cb81dab7 #1
[ 84.463835][ T6] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 84.464399][ T61] kmemleak: Automatic memory scanning thread started
[ 84.465113][ T6] Workqueue: events slab_caches_to_rcu_destroy_workfn
[ 84.465127][ T6] RIP: 0010:__kobject_del (kbuild/src/consumer/lib/kobject.c:592)
[ 84.465137][ T6] Code: 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 23 02 00 00 48 b8 00 00 00 00 00 fc ff df 48 8b 6b 28 48 8d 7d 10 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 f6 01 00 00 48 8b 75 10 48 89 df 48 8d 6b 3c e8
All code
========
0: 48 89 fa mov %rdi,%rdx
3: 48 c1 ea 03 shr $0x3,%rdx
7: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1)
b: 0f 85 23 02 00 00 jne 0x234
11: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
18: fc ff df
1b: 48 8b 6b 28 mov 0x28(%rbx),%rbp
1f: 48 8d 7d 10 lea 0x10(%rbp),%rdi
23: 48 89 fa mov %rdi,%rdx
26: 48 c1 ea 03 shr $0x3,%rdx
2a:* 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) <-- trapping instruction
2e: 0f 85 f6 01 00 00 jne 0x22a
34: 48 8b 75 10 mov 0x10(%rbp),%rsi
38: 48 89 df mov %rbx,%rdi
3b: 48 8d 6b 3c lea 0x3c(%rbx),%rbp
3f: e8 .byte 0xe8

Code starting with the faulting instruction
===========================================
0: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1)
4: 0f 85 f6 01 00 00 jne 0x200
a: 48 8b 75 10 mov 0x10(%rbp),%rsi
e: 48 89 df mov %rbx,%rdi
11: 48 8d 6b 3c lea 0x3c(%rbx),%rbp
15: e8 .byte 0xe8
[ 84.469942][ T6] RSP: 0000:ffffc9000006fcb0 EFLAGS: 00010202
[ 84.470698][ T6] RAX: dffffc0000000000 RBX: ffff888129354838 RCX: 0000000000000000
[ 84.471694][ T6] RDX: 0000000000000002 RSI: 0000000000000000 RDI: 0000000000000010
[ 84.472675][ T6] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 84.473661][ T6] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 84.474651][ T6] R13: ffffc9000006fdb8 R14: ffff8881002c2ea8 R15: ffffffff83ac74c0
[ 84.475621][ T6] FS: 0000000000000000(0000) GS:ffff8883aee00000(0000) knlGS:0000000000000000
[ 84.476712][ T6] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 84.477516][ T6] CR2: ffff88843ffff000 CR3: 0000000003435000 CR4: 00000000000406b0
[ 84.478471][ T6] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 84.479449][ T6] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 84.480436][ T6] Call Trace:
[ 84.480849][ T6] <TASK>
[ 84.481216][ T6] ? die_addr (kbuild/src/consumer/arch/x86/kernel/dumpstack.c:421 kbuild/src/consumer/arch/x86/kernel/dumpstack.c:460)
[ 84.481738][ T6] ? exc_general_protection (kbuild/src/consumer/arch/x86/kernel/traps.c:787 kbuild/src/consumer/arch/x86/kernel/traps.c:729)
[ 84.482409][ T6] ? asm_exc_general_protection (kbuild/src/consumer/arch/x86/include/asm/idtentry.h:564)
[ 84.483127][ T6] ? __kobject_del (kbuild/src/consumer/lib/kobject.c:592)
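
The fault address in the oops above can be checked by hand. The sketch below redoes the shadow-address arithmetic visible in the disassembly (`shr $0x3` followed by an access at `(%rdx,%rax,1)` with RAX = 0xdffffc0000000000). The helper names are hypothetical, and the canonical-address test assumes 48-bit virtual addresses (4-level paging). RBP is 0 after `mov 0x28(%rbx),%rbp`, which is consistent with a NULL pointer field (plausibly kobj->ktype, given the commit under test), though the log alone does not prove which field it is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Constants read off the oops: the movabs immediate and the "shr $0x3". */
#define SHADOW_OFFSET      0xdffffc0000000000ULL
#define SHADOW_SCALE_SHIFT 3

/* Hypothetical helper: map a memory address to its shadow byte address. */
uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}

/* Hypothetical helper: first memory address covered by a shadow byte. */
uint64_t shadow_to_mem(uint64_t shadow)
{
	return (shadow - SHADOW_OFFSET) << SHADOW_SCALE_SHIFT;
}

/* Assumption: 48-bit virtual addresses, so bits 63..47 must all match. */
bool is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;

	return top == 0 || top == 0x1ffffULL;
}

/* Replays the register values from the report. */
void check_against_oops(void)
{
	uint64_t rbp = 0;            /* RBP: the pointer that was loaded */
	uint64_t rdi = rbp + 0x10;   /* lea 0x10(%rbp),%rdi */
	uint64_t shadow = mem_to_shadow(rdi);

	/* Matches "non-canonical address 0xdffffc0000000002". */
	assert(shadow == 0xdffffc0000000002ULL);
	assert(!is_canonical_48(shadow));

	/* Matches "null-ptr-deref in range [0x10-0x17]". */
	assert(shadow_to_mem(shadow) == 0x10);
	assert(shadow_to_mem(shadow) + 7 == 0x17);
}
```

Calling check_against_oops() from a small main() should pass silently: the general protection fault is raised by the KASAN shadow check for address 0x10, before the real load at `mov 0x10(%rbp),%rsi` is reached.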


[1]
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240207/[email protected]



--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


Attachments:
(No filename) (6.06 kB)
dmesg-4d0fe8c52b.xz (13.93 kB)

2024-02-08 15:50:05

by Greg KH

Subject: Re: [linus:master] [kobject] 1b28cb81da: canonical_address#:#[##]

On Wed, Feb 07, 2024 at 02:42:43PM +0800, kernel test robot wrote:
>
>
> Hello,
>
> kernel test robot noticed "canonical_address#:#[##]" on:
>
> commit: 1b28cb81dab7c1eedc6034206f4e8d644046ad31 ("kobject: Remove redundant checks for whether ktype is NULL")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> [test failed on linus/master 54be6c6c5ae8e0d93a6c4641cb7528eb0b6ba478 (v6.8-rc3)]
> [test failed on linux-next/master 076d56d74f17e625b3d63cf4743b3d7d02180379]
>
> in testcase: boot
>
> compiler: gcc-11
> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
>
> (please refer to attached dmesg/kmsg for entire log/backtrace)
>
>
>
> we noticed this issue is very random, as below, observed 4 times out of 68 runs.
> but we didn't see it on parent.

Ok, this is odd, but a good enough reason to revert this for now. I was
worried about it, and this confirms my worry that there's some codepath
we aren't taking into account here that those checks were protecting us
from doing bad things.

thanks for the report, and Zhen, if you want to dig into this and see if
you can figure out what is happening so that you can submit your change
again, that would be great.

greg k-h

2024-02-18 02:01:17

by Leizhen (ThunderTown)

Subject: Re: [linus:master] [kobject] 1b28cb81da: canonical_address#:#[##]



On 2024/2/8 23:48, Greg Kroah-Hartman wrote:
> On Wed, Feb 07, 2024 at 02:42:43PM +0800, kernel test robot wrote:
>>
>>
>> Hello,
>>
>> kernel test robot noticed "canonical_address#:#[##]" on:
>>
>> commit: 1b28cb81dab7c1eedc6034206f4e8d644046ad31 ("kobject: Remove redundant checks for whether ktype is NULL")
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>
>> [test failed on linus/master 54be6c6c5ae8e0d93a6c4641cb7528eb0b6ba478 (v6.8-rc3)]
>> [test failed on linux-next/master 076d56d74f17e625b3d63cf4743b3d7d02180379]
>>
>> in testcase: boot
>>
>> compiler: gcc-11
>> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
>>
>> (please refer to attached dmesg/kmsg for entire log/backtrace)
>>
>>
>>
>> we noticed this issue is very random, as below, observed 4 times out of 68 runs.
>> but we didn't see it on parent.
>
> Ok, this is odd, but a good enough reason to revert this for now. I was
> worried about it, and this confirms my worry that there's some codepath
> we aren't taking into account here that those checks were protecting us
> from doing bad things.

Yes, there may be some non-standard usage. Perhaps kobj->ktype was detached first?

>
> thanks for the report, and Zhen, if you want to dig into this and see if
> you can figure out what is happening so that you can submit your change
> again, that would be great.

I'm trying to reproduce it. However, it may take me a lot of time to
prepare the environment. If kobj->name were printed when the error is
detected, we might be able to solve it directly by reviewing the code.
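
A minimal userspace sketch of the kind of diagnostic suggested above: report the object's name when ktype turns out to be NULL, instead of dereferencing it. The struct layouts and the helper name mock_check_ktype are simplified stand-ins invented for illustration, not the kernel definitions; an in-kernel version would presumably use WARN() or pr_err() together with kobject_name(), but that is an assumption and not something proposed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-ins for illustration only, NOT the kernel structs. */
struct mock_kobj_type {
	void (*release)(void *kobj);
	const void *sysfs_ops;
	const void **default_groups;
};

struct mock_kobject {
	const char *name;
	const struct mock_kobj_type *ktype;
};

/*
 * Hypothetical helper: returns 0 when ktype is set. Otherwise it logs
 * the kobject name and address and returns -1, so the caller can skip
 * the ktype->default_groups access instead of faulting.
 */
int mock_check_ktype(const struct mock_kobject *kobj, FILE *log)
{
	if (kobj->ktype)
		return 0;

	fprintf(log, "kobject: '%s' (%p): ktype is NULL in delete path\n",
		kobj->name ? kobj->name : "(null)", (const void *)kobj);
	return -1;
}
```

With a check like this in the delete path, each intermittent failure would name the offending kobject, which is what would let the caller be found by code review.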

>
> greg k-h
> .
>

--
Regards,
Zhen Lei