2023-04-26 09:25:27

by Pengfei Xu

[permalink] [raw]
Subject: [Syzkaller & bisect] There is WARNING in tpm_chip_unregister in upstream patch "tpm: st33zp24: Mark ACPI and OF related data as maybe unused"

Hi Jarkko and Krzysztof Kozlowski,

Greeting!

Platform: x86 platforms

There is WARNING in tpm_chip_unregister in upstream patch "tpm: st33zp24: Mark
ACPI and OF related data as maybe unused":
https://lore.kernel.org/lkml/[email protected]/
-> https://lore.kernel.org/lkml/[email protected]/

We tested Intel internal kernel and found that, the above patch caused below
WARNING and then kernel BUG dmesg info. After reverted above commit on top
of Intel internal kernel, this issue was gone.
I checked that internal commit:"c3985d8b9c22 tpm: st33zp24: Mark ACPI and OF
related data as maybe unused" was same as above link patch.
This issue could be reproduced in 155s in VM.

All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/230426_132902_tpm_chip_unregister_warning
Syzkaller reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/repro.c
Syzkaller syscall reproduced steps: https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/repro.prog
Syzkaller analysis report: https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/repro.report
Kconfig: https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/kconfig_origin
Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/bisect_info.log

"
[ 24.638052] memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=329 'systemd'
[ 28.731375] cgroup: Unknown subsys name 'net'
[ 28.741433] cgroup: Unknown subsys name 'rlimit'
[ 35.900833] tpm tpm0: Operation Canceled
[ 35.901377] ------------[ cut here ]------------
[ 35.901648] refcount_t: addition on 0; use-after-free.
[ 35.901986] WARNING: CPU: 0 PID: 4095 at lib/refcount.c:25 refcount_warn_saturate+0xe6/0x1c0
[ 35.902444] Modules linked in:
[ 35.902627] CPU: 0 PID: 4095 Comm: repro Not tainted 6.3.0-2023-04-24-intel-next-591f7c2026cb+ #1
[ 35.903093] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 35.903679] RIP: 0010:refcount_warn_saturate+0xe6/0x1c0
[ 35.903968] Code: 1d 99 79 26 02 31 ff 89 de e8 86 b1 55 ff 84 db 75 97 e8 1d b0 55 ff 48 c7 c7 78 a8 9e 83 c6 05 79 79 26 02 01 e8 3a a9 39 ff <0f> 0b e9 78 ff ff ff e8 fe af 55 ff 0f b6 1d 63 79 26 02 31 ff 89
[ 35.904941] RSP: 0000:ffffc900015efd88 EFLAGS: 00010286
[ 35.905218] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff8112384b
[ 35.905600] RDX: 0000000000000000 RSI: ffff8880147f8000 RDI: 0000000000000002
[ 35.905981] RBP: ffffc900015efd98 R08: 0000000000000000 R09: 0000000000000001
[ 35.906358] R10: 0000000000000001 R11: ffffffff83d638d8 R12: ffff8880146e0028
[ 35.906736] R13: ffff8880146e0028 R14: ffff8880180cbae0 R15: ffff888007095fb8
[ 35.907104] FS: 00007fac3687f800(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
[ 35.907521] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 35.907825] CR2: 00007f843e8b4720 CR3: 0000000007538005 CR4: 0000000000770ef0
[ 35.908197] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 35.908564] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[ 35.908949] PKRU: 55555554
[ 35.909097] Call Trace:
[ 35.909233] <TASK>
[ 35.909357] kthread_stop+0x349/0x360
[ 35.909577] hwrng_unregister+0x182/0x210
[ 35.909816] tpm_chip_unregister+0x1cc/0x1f0
[ 35.910060] ? __pfx_vtpm_proxy_fops_release+0x10/0x10
[ 35.910342] vtpm_proxy_fops_release+0x8f/0xa0
[ 35.910589] __fput+0x11f/0x450
[ 35.910628] tpm tpm1: Operation Canceled
[ 35.910780] ____fput+0x1e/0x30
[ 35.911178] task_work_run+0xb6/0x120
[ 35.911393] exit_to_user_mode_prepare+0x200/0x210
[ 35.911660] syscall_exit_to_user_mode+0x2d/0x60
[ 35.911926] do_syscall_64+0x4a/0x90
[ 35.912136] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 35.912414] RIP: 0033:0x7fac36a8fe37
[ 35.912613] Code: 12 b8 03 00 00 00 0f 05 48 3d 00 f0 ff ff 77 3b c3 66 90 53 89 fb 48 83 ec 10 e8 d4 fb ff ff 89 c2 89 df b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2b 89 d7 89 44 24 0c e8 16 fc ff ff 8b 44 24
[ 35.913548] RSP: 002b:00007ffc4fdbe320 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
[ 35.913878] ------------[ cut here ]------------
[ 35.913944] RAX: 0000000000000000 RBX: 0000000000000004 RCX: 00007fac36a8fe37
[ 35.914212] refcount_t: saturated; leaking memory.
[ 35.914564] RDX: 0000000000000000 RSI: 0000000000000080 RDI: 0000000000000004
[ 35.914869] WARNING: CPU: 1 PID: 4103 at lib/refcount.c:22 refcount_warn_saturate+0xb1/0x1c0
[ 35.915203] RBP: 00007ffc4fdbe350 R08: 0000000000000000 R09: 0000000000000032
[ 35.915630] Modules linked in:
[ 35.915993] R10: 00007ffc4fdbe320 R11: 0000000000000293 R12: 0000000000402400
[ 35.916164] CPU: 1 PID: 4103 Comm: repro Not tainted 6.3.0-2023-04-24-intel-next-591f7c2026cb+ #1
[ 35.916525] R13: 00007ffc4fdbe4e0 R14: 0000000000000000 R15: 0000000000000000
[ 35.917000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 35.917363] </TASK>
[ 35.917934] RIP: 0010:refcount_warn_saturate+0xb1/0x1c0
[ 35.918061] irq event stamp: 1787
[ 35.918332] Code: 1d cf 79 26 02 31 ff 89 de e8 bb b1 55 ff 84 db 75 cc e8 52 b0 55 ff 48 c7 c7 50 a8 9e 83 c6 05 af 79 26 02 01 e8 6f a9 39 ff <0f> 0b eb b0 e8 36 b0 55 ff 0f b6 1d 99 79 26 02 31 ff 89 de e8 86
[ 35.918509] hardirqs last enabled at (1795): [<ffffffff811f2d2e>] console_flush_all+0x3ee/0x790
[ 35.919437] RSP: 0018:ffffc900015ffd88 EFLAGS: 00010286
[ 35.919874] hardirqs last disabled at (1802): [<ffffffff811f2efa>] console_flush_all+0x5ba/0x790
[ 35.920161] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff8112384b
[ 35.920605] softirqs last enabled at (1484): [<ffffffff82fda6a9>] __do_softirq+0x2d9/0x3c3
[ 35.920974] RDX: 0000000000000000 RSI: ffff888018168000 RDI: 0000000000000002
[ 35.921397] softirqs last disabled at (1471): [<ffffffff81132b14>] irq_exit_rcu+0xc4/0x100
[ 35.921765] RBP: ffffc900015ffd98 R08: 0000000000000001 R09: 0000000000000001
[ 35.922196] ---[ end trace 0000000000000000 ]---
[ 35.922562] R10: 0000000000000001 R11: ffffffff83c830d8 R12: ffff8880146e0028
[ 35.922809] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 35.923171] R13: ffff8880146e0028 R14: ffff8880180c9ae0 R15: ffff888007095fb8
[ 35.923525] #PF: supervisor write access in kernel mode
[ 35.923888] FS: 00007fac3687f800(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
[ 35.924169] #PF: error_code(0x0002) - not-present page
[ 35.924594] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 35.924859] PGD 12df2067 P4D 12df2067 PUD c390067 PMD 0
[ 35.925167] CR2: 00007fe31371f000 CR3: 00000000144c4003 CR4: 0000000000770ee0
[ 35.925441]
[ 35.925443] Oops: 0002 [#1] PREEMPT SMP NOPTI
[ 35.925817] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 35.925904] CPU: 0 PID: 4095 Comm: repro Tainted: G W 6.3.0-2023-04-24-intel-next-591f7c2026cb+ #1
[ 35.926134] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[ 35.926501] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 35.927027] PKRU: 55555554
[ 35.927388] RIP: 0010:kthread_stop+0xd9/0x360
[ 35.927963] Call Trace:
[ 35.927966] <TASK>
[ 35.928108] Code: 44 8b 63 2c 31 ff 41 81 e4 00 00 20 00 44 89 e6 e8 1c 08 17 00 45 85 e4 0f 84 81 02 00 00 e8 2e 06 17 00 4c 8b a3 40 0a 00 00 <f0> 41 80 0c 24 02 48 89 df e8 f9 f1 ff ff f0 80 4b 02 02 48 89 df
[ 35.928339] kthread_stop+0x31b/0x360
[ 35.928470] RSP: 0000:ffffc900015efda8 EFLAGS: 00010246
[ 35.928592] hwrng_unregister+0x182/0x210
[ 35.929518]
[ 35.929520] RAX: 0000000000000000 RBX: ffff8880146e0000 RCX: ffffffff81173814
[ 35.929725] tpm_chip_unregister+0x1cc/0x1f0
[ 35.929995] RDX: 0000000000000000 RSI: ffff8880147f8000 RDI: 0000000000000002
[ 35.930216] ? __pfx_vtpm_proxy_fops_release+0x10/0x10
[ 35.930303] RBP: ffffc900015efdc8 R08: 0000000000000000 R09: 0000000000000001
[ 35.930669] vtpm_proxy_fops_release+0x8f/0xa0
[ 35.930890] R10: 0000000000000001 R11: ffffffff83d638d8 R12: 0000000000000000
[ 35.931252] __fput+0x11f/0x450
[ 35.931512] R13: ffff8880146e0028 R14: ffff8880180cbae0 R15: ffff888007095fb8
[ 35.931874] ____fput+0x1e/0x30
[ 35.932101] FS: 00007fac3687f800(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
[ 35.932460] task_work_run+0xb6/0x120
[ 35.932627] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 35.932999] exit_to_user_mode_prepare+0x200/0x210
[ 35.933160] CR2: 0000000000000000 CR3: 0000000007538005 CR4: 0000000000770ef0
[ 35.933570] syscall_exit_to_user_mode+0x2d/0x60
[ 35.933761] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 35.934056] do_syscall_64+0x4a/0x90
[ 35.934301] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[ 35.934664] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 35.934900] PKRU: 55555554
[ 35.935276] RIP: 0033:0x7fac36a8fe37
[ 35.935465] Call Trace:
[ 35.935467] <TASK>
[ 35.935822] Code: 12 b8 03 00 00 00 0f 05 48 3d 00 f0 ff ff 77 3b c3 66 90 53 89 fb 48 83 ec 10 e8 d4 fb ff ff 89 c2 89 df b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2b 89 d7 89 44 24 0c e8 16 fc ff ff 8b 44 24
[ 35.936080] hwrng_unregister+0x182/0x210
[ 35.936226] RSP: 002b:00007ffc4fdbe320 EFLAGS: 00000293
[ 35.936414] tpm_chip_unregister+0x1cc/0x1f0
[ 35.936546] ORIG_RAX: 0000000000000003
[ 35.936663] ? __pfx_vtpm_proxy_fops_release+0x10/0x10
[ 35.937599] RAX: 0000000000000000 RBX: 0000000000000004 RCX: 00007fac36a8fe37
[ 35.937803] vtpm_proxy_fops_release+0x8f/0xa0
[ 35.938068] RDX: 0000000000000000 RSI: 0000000000000080 RDI: 0000000000000004
[ 35.938289] __fput+0x11f/0x450
[ 35.938488] RBP: 00007ffc4fdbe350 R08: 0000000000000000 R09: 0000000000000032
[ 35.938751] ____fput+0x1e/0x30
[ 35.939109] R10: 00007ffc4fdbe320 R11: 0000000000000293 R12: 0000000000402400
[ 35.939341] task_work_run+0xb6/0x120
[ 35.939698] R13: 00007ffc4fdbe4e0 R14: 0000000000000000 R15: 0000000000000000
[ 35.939864] exit_to_user_mode_prepare+0x200/0x210
[ 35.940225] </TASK>
[ 35.940387] syscall_exit_to_user_mode+0x2d/0x60
[ 35.940747] irq event stamp: 1668
[ 35.940938] do_syscall_64+0x4a/0x90
[ 35.941301] hardirqs last enabled at (1667): [<ffffffff811f5718>] vprintk_emit+0x3f8/0x590
[ 35.941547] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 35.941666] hardirqs last disabled at (1668): [<ffffffff811f2efa>] console_flush_all+0x5ba/0x790
[ 35.941899] RIP: 0033:0x7fac36a8fe37
[ 35.942075] softirqs last enabled at (1596): [<ffffffff82fda6a9>] __do_softirq+0x2d9/0x3c3
[ 35.942265] Code: 12 b8 03 00 00 00 0f 05 48 3d 00 f0 ff ff 77 3b c3 66 90 53 89 fb 48 83 ec 10 e8 d4 fb ff ff 89 c2 89 df b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2b 89 d7 89 44 24 0c e8 16 fc ff ff 8b 44 24
[ 35.942704] softirqs last disabled at (1585): [<ffffffff81132b14>] irq_exit_rcu+0xc4/0x100
[ 35.942975] RSP: 002b:00007ffc4fdbe320 EFLAGS: 00000293
[ 35.943438] ---[ end trace 0000000000000000 ]---
"

I hope above info is helpful to solve this issue.

---

If you don't need the following environment to reproduce the problem or if you
already have one, please ignore the following information.

How to reproduce:
git clone https://gitlab.com/xupengfe/repro_vm_env.git
cd repro_vm_env
tar -xvf repro_vm_env.tar.gz
cd repro_vm_env; ./start3.sh // it needs qemu-system-x86_64 and I used v7.1.0
// start3.sh will load bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel
// You could change the bzImage_xxx as you want
You could use below command to log in, there is no password for root.
ssh -p 10023 root@localhost

After login vm(virtual machine) successfully, you could transfer reproduced
binary to the vm by below way, and reproduce the problem in vm:
gcc -pthread -o repro repro.c
scp -P 10023 repro root@localhost:/root/

Get the bzImage for target kernel:
Please use target kconfig and copy it to kernel_src/.config
make olddefconfig
make -jx bzImage //x should equal or less than cpu num your pc has

Fill the bzImage file into above start3.sh to load the target kernel in vm.


Tips:
If you already have qemu-system-x86_64, please ignore below info.
If you want to install qemu v7.1.0 version:
git clone https://github.com/qemu/qemu.git
cd qemu
git checkout -f v7.1.0
mkdir build
cd build
yum install -y ninja-build.x86_64
../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl
make
make install

Thanks!
BR.


2023-04-26 13:16:45

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [Syzkaller & bisect] There is WARNING in tpm_chip_unregister in upstream patch "tpm: st33zp24: Mark ACPI and OF related data as maybe unused"

On 26/04/2023 11:10, Pengfei Xu wrote:
> Hi Jarkko and Krzysztof Kozlowski,
>
> Greeting!
>
> Platform: x86 platforms
>
> There is WARNING in tpm_chip_unregister in upstream patch "tpm: st33zp24: Mark
> ACPI and OF related data as maybe unused":
> https://lore.kernel.org/lkml/[email protected]/
> -> https://lore.kernel.org/lkml/[email protected]/
>
> We tested Intel internal kernel and found that, the above patch caused below
> WARNING and then kernel BUG dmesg info. After reverted above commit on top
> of Intel internal kernel, this issue was gone.
> I checked that internal commit:"c3985d8b9c22 tpm: st33zp24: Mark ACPI and OF
> related data as maybe unused" was same as above link patch.
> This issue could be reproduced in 155s in VM.
>
> All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/230426_132902_tpm_chip_unregister_warning
> Syzkaller reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/repro.c
> Syzkaller syscall reproduced steps: https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/repro.prog
> Syzkaller analysis report: https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/repro.report
> Kconfig: https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/kconfig_origin
> Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/bisect_info.log
>
> "
> [ 24.638052] memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=329 'systemd'
> [ 28.731375] cgroup: Unknown subsys name 'net'
> [ 28.741433] cgroup: Unknown subsys name 'rlimit'
> [ 35.900833] tpm tpm0: Operation Canceled
> [ 35.901377] ------------[ cut here ]------------
> [ 35.901648] refcount_t: addition on 0; use-after-free.
> [ 35.901986] WARNING: CPU: 0 PID: 4095 at lib/refcount.c:25 refcount_warn_saturate+0xe6/0x1c0

Thanks for the report. I am quite surprised that this commit was pointed
as possible cause. The patch does not touch anything related to reported
issue... At least I do not see it. Could be some tooling problem in your
reproduction steps?


Best regards,
Krzysztof

2023-04-26 14:43:21

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [Syzkaller & bisect] There is WARNING in tpm_chip_unregister in upstream patch "tpm: st33zp24: Mark ACPI and OF related data as maybe unused"

On 26/04/2023 11:10, Pengfei Xu wrote:
> Hi Jarkko and Krzysztof Kozlowski,
>
> Greeting!
>
> Platform: x86 platforms
>
> There is WARNING in tpm_chip_unregister in upstream patch "tpm: st33zp24: Mark
> ACPI and OF related data as maybe unused":
> https://lore.kernel.org/lkml/[email protected]/
> -> https://lore.kernel.org/lkml/[email protected]/
>
> We tested Intel internal kernel and found that, the above patch caused below
> WARNING and then kernel BUG dmesg info. After reverted above commit on top
> of Intel internal kernel, this issue was gone.
> I checked that internal commit:"c3985d8b9c22 tpm: st33zp24: Mark ACPI and OF
> related data as maybe unused" was same as above link patch.
> This issue could be reproduced in 155s in VM.

I am trying to reproduce it with mentioned reproducer.

One problem is that commit c3985d8b9c22 (and that patch of mine) was
never in the linux-next. I checked last next releases, including
next-20230421 and there is no such change. It seems it was applied just
before merge window, so was not tested by anything before.

Best regards,
Krzysztof

2023-04-26 18:13:23

by Jarkko Sakkinen

[permalink] [raw]
Subject: Re: [Syzkaller & bisect] There is WARNING in tpm_chip_unregister in upstream patch "tpm: st33zp24: Mark ACPI and OF related data as maybe unused"

On Wed, 2023-04-26 at 15:12 +0200, Krzysztof Kozlowski wrote:
> On 26/04/2023 11:10, Pengfei Xu wrote:
> > Hi Jarkko and Krzysztof Kozlowski,
> >
> > Greeting!
> >
> > Platform: x86 platforms
> >
> > There is WARNING in tpm_chip_unregister in upstream patch "tpm: st33zp24: Mark
> > ACPI and OF related data as maybe unused":
> > https://lore.kernel.org/lkml/[email protected]/
> > -> https://lore.kernel.org/lkml/[email protected]/
> >
> > We tested Intel internal kernel and found that, the above patch caused below
> > WARNING and then kernel BUG dmesg info. After reverted above commit on top
> > of Intel internal kernel, this issue was gone.
> > I checked that internal commit:"c3985d8b9c22 tpm: st33zp24: Mark ACPI and OF
> > related data as maybe unused" was same as above link patch.
> > This issue could be reproduced in 155s in VM.
> >
> > All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/230426_132902_tpm_chip_unregister_warning
> > Syzkaller reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/repro.c
> > Syzkaller syscall reproduced steps: https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/repro.prog
> > Syzkaller analysis report: https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/repro.report
> > Kconfig: https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/kconfig_origin
> > Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/bisect_info.log
> >
> > "
> > [ 24.638052] memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=329 'systemd'
> > [ 28.731375] cgroup: Unknown subsys name 'net'
> > [ 28.741433] cgroup: Unknown subsys name 'rlimit'
> > [ 35.900833] tpm tpm0: Operation Canceled
> > [ 35.901377] ------------[ cut here ]------------
> > [ 35.901648] refcount_t: addition on 0; use-after-free.
> > [ 35.901986] WARNING: CPU: 0 PID: 4095 at lib/refcount.c:25 refcount_warn_saturate+0xe6/0x1c0
>
> Thanks for the report. I am quite surprised that this commit was pointed
> as possible cause. The patch does not touch anything related to reported
> issue... At least I do not see it. Could be some tooling problem in your
> reproduction steps?

I see a critical bug in this commit.

Because only tpm_tis_core calls tpm_chip_startup(), in effect none of this
code is ever executd for e.g. tpm_crb and tpm_vtpm_proxy and some other
drivers that are not based on TIS/FIFO specification.

Lino, can you submit a fix re-enabling non-tpm_tis drivers?

BR, Jarkko

2023-04-27 02:31:42

by Pengfei Xu

[permalink] [raw]
Subject: Re: [Syzkaller & bisect] There is WARNING in tpm_chip_unregister in upstream patch "tpm: st33zp24: Mark ACPI and OF related data as maybe unused"

Hi Jarkko and Krzysztof Kozlowski,

On 2023-04-26 at 21:03:19 +0300, Jarkko Sakkinen wrote:
> On Wed, 2023-04-26 at 15:12 +0200, Krzysztof Kozlowski wrote:
> > On 26/04/2023 11:10, Pengfei Xu wrote:
> > > Hi Jarkko and Krzysztof Kozlowski,
> > >
> > > Greeting!
> > >
> > > Platform: x86 platforms
> > >
> > > There is WARNING in tpm_chip_unregister in upstream patch "tpm: st33zp24: Mark
> > > ACPI and OF related data as maybe unused":
> > > https://lore.kernel.org/lkml/[email protected]/
> > > -> https://lore.kernel.org/lkml/[email protected]/
> > >
> > > We tested Intel internal kernel and found that, the above patch caused below
> > > WARNING and then kernel BUG dmesg info. After reverted above commit on top
> > > of Intel internal kernel, this issue was gone.
> > > I checked that internal commit:"c3985d8b9c22 tpm: st33zp24: Mark ACPI and OF
> > > related data as maybe unused" was same as above link patch.
> > > This issue could be reproduced in 155s in VM.
> > >
> > > All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/230426_132902_tpm_chip_unregister_warning
> > > Syzkaller reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/repro.c
> > > Syzkaller syscall reproduced steps: https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/repro.prog
> > > Syzkaller analysis report: https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/repro.report
> > > Kconfig: https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/kconfig_origin
> > > Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/bisect_info.log
> > >
> > > "
> > > [ 24.638052] memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=329 'systemd'
> > > [ 28.731375] cgroup: Unknown subsys name 'net'
> > > [ 28.741433] cgroup: Unknown subsys name 'rlimit'
> > > [ 35.900833] tpm tpm0: Operation Canceled
> > > [ 35.901377] ------------[ cut here ]------------
> > > [ 35.901648] refcount_t: addition on 0; use-after-free.
> > > [ 35.901986] WARNING: CPU: 0 PID: 4095 at lib/refcount.c:25 refcount_warn_saturate+0xe6/0x1c0
> >
> > Thanks for the report. I am quite surprised that this commit was pointed
> > as possible cause. The patch does not touch anything related to reported
> > issue... At least I do not see it. Could be some tooling problem in your
> > reproduction steps?
After checking the bisect detailed info, this issue should be related to
link https://lore.kernel.org/lkml/[email protected]/

And this email's bisect info commit was wrong.
Because https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/548eb516ec0f7a484a23a902835899341164b8ea_dmesg.log
Tested problem commit 548eb516ec which should trigger this issue, but it
met stuck issue as follow and didn't find the problem dmesg info.
"
[ 36.521619] tpm tpm2: Operation Canceled
[ 36.536390] tpm tpm0: Operation Canceled
[ 36.550622] tpm tpm1: Operation Canceled
[ 36.597232] tpm tpm3: Operation Canceled
[ 36.631157] tpm tpm4: Operation Canceled
[ 36.634093] tpm tpm5: Operation Canceled
[ 36.639222] tpm tpm6: Operation Canceled
[ 36.648086] tpm tpm5: Operation Canceled
[ 36.652523] tpm tpm6: Operation Canceled
[ 36.660375] tpm tpm5: Operation Canceled
"
And then the bisect was wrong.

When do double check step, reverted the commit on top of the kernel, it met
the same stuck issue accidently again and gave the wrong conclusion:
https://github.com/xupengfe/syzkaller_logs/blob/main/230426_132902_tpm_chip_unregister_warning/591f7c2026cba9889839a16a1a3579c38ad6234f_c3985d8b9c224b359851f0a521ad25a83db6bdca_revert_dmesg.log
"
[ 36.813933] tpm tpm0: Operation Canceled
[ 36.819376] tpm tpm1: Operation Canceled
[ 36.827303] tpm tpm2: Operation Canceled
[ 36.832653] tpm tpm2: Operation Canceled
[ 36.851290] tpm tpm4: Operation Canceled
[ 36.852480] tpm tpm5: Operation Canceled
[ 36.868413] tpm tpm5: Operation Canceled
[ 36.912134] tpm tpm6: Operation Canceled
[ 36.920819] tpm tpm6: Operation Canceled
"
So the issue was not related to commit: "tpm: st33zp24: Mark ACPI and OF
related data as maybe unused", sorry for inconvience!
And the issue was fixed by Jarkko and here is the link:
https://lore.kernel.org/lkml/[email protected]/

>
> I see a critical bug in this commit.
>
> Because only tpm_tis_core calls tpm_chip_startup(), in effect none of this
> code is ever executd for e.g. tpm_crb and tpm_vtpm_proxy and some other
> drivers that are not based on TIS/FIFO specification.
>
Thanks Jarkko's fixed patch!

Thanks!
BR.
-Pengfei

> Lino, can you submit a fix re-enabling non-tpm_tis drivers?
>
> BR, Jarkko