2023-04-26 03:56:28

by Pengfei Xu

Subject: [Syzkaller & bisect] There is "refcount bug in tpm_chip_unregister" in upstream patch "tpm_tis: startup chip before testing for interrupts"

Hi Jarkko and Lino Sanfilippo,

Greetings!

Platform: Raptor Lake and other x86 platforms

There is "refcount bug in tpm_chip_unregister" in upstream patch "tpm_tis:
startup chip before testing for interrupts":
https://lore.kernel.org/lkml/[email protected]/
-> https://lore.kernel.org/linux-integrity/[email protected]/

We tested the Intel internal kernel and found that the commit
"tpm_tis: startup chip before testing for interrupts" caused
the issue below. After reverting this commit on top of the Intel internal
kernel, the issue was gone.
I checked that this commit is the same as the one in the link above.
The issue can be reproduced within 150s.

All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/230425_154720_tpm_chip_unregister
Syzkaller reproducer code: https://github.com/xupengfe/syzkaller_logs/blob/main/230425_154720_tpm_chip_unregister/repro.c
Syzkaller reproducer prog syscalls: https://github.com/xupengfe/syzkaller_logs/blob/main/230425_154720_tpm_chip_unregister/repro.prog
Syzkaller analysis report: https://github.com/xupengfe/syzkaller_logs/blob/main/230425_154720_tpm_chip_unregister/repro.report
Kconfig: https://github.com/xupengfe/syzkaller_logs/blob/main/230425_154720_tpm_chip_unregister/kconfig_origin
Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/230425_154720_tpm_chip_unregister/bisect_info.log

"
[ 24.716504] memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=331 'systemd'
[ 28.304015] loop0: detected capacity change from 0 to 8192
[ 28.319753] loop0: p1 p2 p3 p4
[ 28.319919] loop0: partition table partially beyond EOD, truncated
[ 28.320692] loop0: p3 start 520097793 is beyond EOD, truncated
[ 28.320919] loop0: p4 start 524032 is beyond EOD, truncated
[ 28.322438] loop0: p1 p2 p3 p4
[ 28.322577] loop0: partition table partially beyond EOD, truncated
[ 28.322581] tpm tpm0: Operation Canceled
[ 28.323057] loop0: p3 start 520097793 is beyond EOD, truncated
[ 28.323284] loop0: p4 start 524032 is beyond EOD, truncated
[ 28.345542] loop0: detected capacity change from 0 to 8192
[ 28.355853] loop0: p1 p2 p3 p4
[ 28.355997] loop0: partition table partially beyond EOD, truncated
[ 28.356592] loop0: p3 start 520097793 is beyond EOD, truncated
[ 28.356845] loop0: p4 start 524032 is beyond EOD, truncated
[ 28.357902] ------------[ cut here ]------------
[ 28.358110] refcount_t: addition on 0; use-after-free.
[ 28.358394] WARNING: CPU: 1 PID: 536 at lib/refcount.c:25 refcount_warn_saturate+0xe6/0x1c0
[ 28.358759] Modules linked in:
[ 28.358894] CPU: 1 PID: 536 Comm: repro Not tainted 6.3.0-2023-04-24-intel-next-591f7c2026cb+ #1
[ 28.359257] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 28.359779] RIP: 0010:refcount_warn_saturate+0xe6/0x1c0
[ 28.360012] Code: 1d 99 79 26 02 31 ff 89 de e8 86 b1 55 ff 84 db 75 97 e8 1d b0 55 ff 48 c7 c7 78 a8 9e 83 c6 05 79 79 26 02 01 e8 3a a9 39 ff <0f> 0b e9 78 ff ff ff e8 fe af 55 ff 0f b6 1d 63 79 26 02 31 ff 89
[ 28.360756] RSP: 0018:ffffc90000f2fcf8 EFLAGS: 00010282
[ 28.360980] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff8112384b
[ 28.361275] RDX: 0000000000000000 RSI: ffff88800dfba340 RDI: 0000000000000002
[ 28.361571] RBP: ffffc90000f2fd08 R08: 0000000000000000 R09: 0000000000000001
[ 28.361866] R10: 0000000000000001 R11: ffffffff83d638d8 R12: ffff88800dfbc6a8
[ 28.362161] R13: ffff88800dfbc6a8 R14: ffff88800d4a0ae0 R15: ffff88800708f660
[ 28.362456] FS: 0000000000000000(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
[ 28.362789] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 28.363031] CR2: 00007efc8ff36500 CR3: 000000000b788004 CR4: 0000000000770ee0
[ 28.363332] PKRU: 55555554
[ 28.363461] Call Trace:
[ 28.363570] <TASK>
[ 28.363668] kthread_stop+0x349/0x360
[ 28.363840] hwrng_unregister+0x182/0x210
[ 28.364026] tpm_chip_unregister+0x1cc/0x1f0
[ 28.364216] ? __pfx_vtpm_proxy_fops_release+0x10/0x10
[ 28.364442] vtpm_proxy_fops_release+0x8f/0xa0
[ 28.364640] __fput+0x11f/0x450
[ 28.364794] ____fput+0x1e/0x30
[ 28.364941] task_work_run+0xb6/0x120
[ 28.365110] do_exit+0x547/0x12b0
[ 28.365263] ? __this_cpu_preempt_check+0x20/0x30
[ 28.365478] ? lockdep_hardirqs_on+0x8a/0x110
[ 28.365672] do_group_exit+0x5e/0xf0
[ 28.365836] __x64_sys_exit_group+0x25/0x30
[ 28.366020] do_syscall_64+0x3b/0x90
[ 28.366187] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 28.366411] RIP: 0033:0x7efc8fe2ccf6
[ 28.366565] Code: Unable to access opcode bytes at 0x7efc8fe2cccc.
[ 28.366815] RSP: 002b:00007fff8ff8a448 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
[ 28.367123] RAX: ffffffffffffffda RBX: 00007efc8ff37490 RCX: 00007efc8fe2ccf6
[ 28.367417] RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
[ 28.367711] RBP: 0000000000000000 R08: 00000000000000e7 R09: ffffffffffffff80
[ 28.368003] R10: 0000000000000004 R11: 0000000000000246 R12: 00007efc8ff37490
[ 28.368291] R13: 0000000000000001 R14: 00007efc8ff3ae88 R15: 0000000000000000
[ 28.368587] </TASK>
[ 28.368682] irq event stamp: 19811
[ 28.368826] hardirqs last enabled at (19819): [<ffffffff811f0cb1>] __up_console_sem+0x91/0xb0
[ 28.369184] hardirqs last disabled at (19826): [<ffffffff811f0c96>] __up_console_sem+0x76/0xb0
[ 28.369539] softirqs last enabled at (19488): [<ffffffff82fda6a9>] __do_softirq+0x2d9/0x3c3
[ 28.369889] softirqs last disabled at (19427): [<ffffffff81132b14>] irq_exit_rcu+0xc4/0x100
[ 28.370235] ---[ end trace 0000000000000000 ]---
[ 28.370438] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 28.370727] #PF: supervisor write access in kernel mode
[ 28.370942] #PF: error_code(0x0002) - not-present page
[ 28.371151] PGD ded5067 P4D ded5067 PUD df78067 PMD 0
[ 28.371370] Oops: 0002 [#1] PREEMPT SMP NOPTI
[ 28.371554] CPU: 1 PID: 536 Comm: repro Tainted: G W 6.3.0-2023-04-24-intel-next-591f7c2026cb+ #1
[ 28.371975] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 28.372424] RIP: 0010:kthread_stop+0xd9/0x360
[ 28.372610] Code: 44 8b 63 2c 31 ff 41 81 e4 00 00 20 00 44 89 e6 e8 1c 08 17 00 45 85 e4 0f 84 81 02 00 00 e8 2e 06 17 00 4c 8b a3 40 0a 00 00 <f0> 41 80 0c 24 02 48 89 df e8 f9 f1 ff ff f0 80 4b 02 02 48 89 df
[ 28.373338] RSP: 0018:ffffc90000f2fd18 EFLAGS: 00010246
[ 28.373553] RAX: 0000000000000000 RBX: ffff88800dfbc680 RCX: ffffffff81173814
[ 28.373839] RDX: 0000000000000000 RSI: ffff88800dfba340 RDI: 0000000000000002
[ 28.374127] RBP: ffffc90000f2fd38 R08: 0000000000000000 R09: 0000000000000001
[ 28.374414] R10: 0000000000000001 R11: ffffffff83d638d8 R12: 0000000000000000
[ 28.374702] R13: ffff88800dfbc6a8 R14: ffff88800d4a0ae0 R15: ffff88800708f660
[ 28.374990] FS: 0000000000000000(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
[ 28.375317] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 28.375557] CR2: 0000000000000000 CR3: 000000000b788004 CR4: 0000000000770ee0
[ 28.375855] PKRU: 55555554
[ 28.375973] Call Trace:
[ 28.376081] <TASK>
[ 28.376176] hwrng_unregister+0x182/0x210
[ 28.376359] tpm_chip_unregister+0x1cc/0x1f0
[ 28.376546] ? __pfx_vtpm_proxy_fops_release+0x10/0x10
[ 28.376769] vtpm_proxy_fops_release+0x8f/0xa0
[ 28.376965] __fput+0x11f/0x450
[ 28.377112] ____fput+0x1e/0x30
[ 28.377258] task_work_run+0xb6/0x120
[ 28.377423] do_exit+0x547/0x12b0
[ 28.377571] ? __this_cpu_preempt_check+0x20/0x30
[ 28.377778] ? lockdep_hardirqs_on+0x8a/0x110
[ 28.377970] do_group_exit+0x5e/0xf0
[ 28.378129] __x64_sys_exit_group+0x25/0x30
[ 28.378311] do_syscall_64+0x3b/0x90
[ 28.378476] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 28.378697] RIP: 0033:0x7efc8fe2ccf6
[ 28.378852] Code: Unable to access opcode bytes at 0x7efc8fe2cccc.
[ 28.379106] RSP: 002b:00007fff8ff8a448 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
[ 28.379419] RAX: ffffffffffffffda RBX: 00007efc8ff37490 RCX: 00007efc8fe2ccf6
[ 28.379712] RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
[ 28.380022] RBP: 0000000000000000 R08: 00000000000000e7 R09: ffffffffffffff80
[ 28.380315] R10: 0000000000000004 R11: 0000000000000246 R12: 00007efc8ff37490
[ 28.380614] R13: 0000000000000001 R14: 00007efc8ff3ae88 R15: 0000000000000000
[ 28.380911] </TASK>
[ 28.381010] Modules linked in:
[ 28.381145] CR2: 0000000000000000
[ 28.381290] ---[ end trace 0000000000000000 ]---
"

Syzkaller also reported a second issue: "task hung in tpm_chip_unregister".
This second issue can be reproduced within 2100s.
Bisection showed that the same patch causes the second issue too, and after
reverting this commit the second issue was also gone.

All detailed info for the second issue:
https://github.com/xupengfe/syzkaller_logs/tree/main/230425_154720_tpm_chip_unregister/0425_214338_tpm_chip_unregister_task_hang_same_commit_issue

I hope this info is helpful.

---

If you don't need the following environment to reproduce the problem or if you
already have one, please ignore the following information.

How to reproduce:
git clone https://gitlab.com/xupengfe/repro_vm_env.git
cd repro_vm_env
tar -xvf repro_vm_env.tar.gz
cd repro_vm_env; ./start3.sh // needs qemu-system-x86_64; I used v7.1.0
// start3.sh loads the bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 (v6.2-rc5) kernel
// You can change the bzImage_xxx file as needed
Use the command below to log in; there is no password for root.
ssh -p 10023 root@localhost

Once you can log in to the VM (virtual machine), build the reproducer
binary and transfer it to the VM as shown below, then reproduce the problem
in the VM:
gcc -pthread -o repro repro.c
scp -P 10023 repro root@localhost:/root/
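
To confirm that a run actually hit this issue, the guest kernel log can be
searched for the warning text quoted in the trace above. The helper below is
only a minimal sketch: the function name check_tpm_log is hypothetical and not
part of the reproducer environment, and it matches nothing more than the
"refcount_t: addition on 0; use-after-free" string from the log.

```shell
#!/bin/sh
# Hypothetical helper (not part of repro_vm_env): reads a kernel log on stdin
# and succeeds (exit status 0) if it contains the refcount warning shown in
# the trace above; it exits non-zero otherwise.
check_tpm_log() {
    grep -q 'refcount_t: addition on 0; use-after-free'
}

# Example use from the host, assuming the VM from start3.sh is running:
#   ssh -p 10023 root@localhost dmesg | check_tpm_log && echo "issue reproduced"
```

Matching on the message text rather than on addresses or offsets keeps the
check usable across rebuilt kernels, at the cost of also matching the same
warning raised from another code path.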

Get the bzImage for the target kernel:
Use the target kconfig and copy it to kernel_src/.config
make olddefconfig
make -jx bzImage // x should be less than or equal to the number of CPUs on your machine

Set the resulting bzImage file in start3.sh above to load the target kernel in the VM.
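
For the "make -jx" step above, the job count can be derived from the machine
instead of being typed by hand. This is a small sketch, assuming either GNU
coreutils nproc or getconf is available; the function name build_jobs is
hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the number of online CPUs to use as the make
# job count. Falls back to getconf, then to 1, if nproc is unavailable.
build_jobs() {
    n=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    echo "$n"
}

# Example use inside the kernel source tree, after "make olddefconfig":
#   make -j"$(build_jobs)" bzImage
```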

Tips:
If you already have qemu-system-x86_64, please ignore the info below.
If you want to install qemu v7.1.0:
git clone https://github.com/qemu/qemu.git
cd qemu
git checkout -f v7.1.0
mkdir build
cd build
yum install -y ninja-build.x86_64
../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl
make
make install
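
Before building qemu from source, it may be worth checking whether a usable
binary is already installed. The sketch below only tests that the command is
on PATH and prints its version line; the function names have_cmd and
qemu_status are hypothetical, and no particular version is enforced.

```shell
#!/bin/sh
# Hypothetical helpers: have_cmd succeeds if the named command is on PATH;
# qemu_status prints the installed qemu version line or a short hint.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

qemu_status() {
    if have_cmd qemu-system-x86_64; then
        qemu-system-x86_64 --version | head -n 1
    else
        echo "qemu-system-x86_64 not found; build it with the steps above"
    fi
}

# Example use:
#   qemu_status
```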

Thanks!
BR.