2023-01-12 09:04:16

by Sosthène Guédon

[permalink] [raw]
Subject: kernel BUG at mm/usercopy.c when using usbip

Hi!

I have stumbled upon a bug that is triggered reliably by using usbip.

We are using usbip to test our firmware. usbip attach works, but once
`opgpcard-tools` interacts with the firmware through pcscd, a kernel bug
happens.
Then usbip stops working, and `lsusb` as well as other tools interacting
with usb devices hang.
The symptoms are similar to
https://bugzilla.kernel.org/show_bug.cgi?id=215487 but the kernel bug is
not the same (see attached dmesg logs).

The bug can be reproduced on arch (Linux archlinux 6.1.4-arch1-1 #1 SMP
PREEMPT_DYNAMIC Sat, 07 Jan 2023 15:10:07 +0000 x86_64 GNU/Linux
) and debian (Linux nitropc 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2
(2022-10-21) x86_64 GNU/Linux), though the reproduction is not minimal.

To reproduce the bug on Arch Linux, with the packages rust, pcsclite,
and openpgp-card-tools installed:

- Compile and run the usbip runner from this
PR: https://github.com/Nitrokey/nitrokey-3-firmware/pull/149 (`cd
nitrokey-3-firmware/runners/usbip && cargo run --features alpha`)
- Run `usbip attach -r localhost -b 1-1`
- Run `pcscd`
- Run `opgpcard status`

The bug should happen.

--
Sosthène Guédon,
Software engineer

Nitrokey GmbH
https://www.nitrokey.com

Email: [email protected]
Phone: +49 30 1205 3434

Rheinstr. 10 C, 14513 Teltow, Germany
CEO / Geschäftsführer: Jan Suhr
Register: AG Potsdam, HRB 32882 P
VAT ID / USt-IdNr.: DE300136599


Attachments:
dmesg-error (8.46 kB)

2023-01-12 16:02:45

by Shuah Khan

[permalink] [raw]
Subject: Re: kernel BUG at mm/usercopy.c when using usbip

On 1/12/23 01:40, Sosthène Guédon wrote:
> Hi!
>
> I have stumbled upon a bug that is triggered reliably by using usbip.
>
> We are using usbip to test our firmware. usbip attach works, but once `opgpcard-tools` interacts with the firmware through pcscd, a kernel bug happens.
> Then usbip stops working, and `lsusb` as well as other tools interacting with usb devices hang.
> The symptoms are similar to https://bugzilla.kernel.org/show_bug.cgi?id=215487 but the kernel bug is not the same (see attached dmesg logs).
>
> The bug can be reproduced on arch (Linux archlinux 6.1.4-arch1-1 #1 SMP PREEMPT_DYNAMIC Sat, 07 Jan 2023 15:10:07 +0000 x86_64 GNU/Linux
> ) and debian (Linux nitropc 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2 (2022-10-21) x86_64 GNU/Linux), though the reproduction is not minimal.
>
> To reproduce the bug on Arch Linux, with the packages rust, pcsclite, and openpgp-card-tools installed:
>
> - Compile and run the usbip runner from this PR: https://github.com/Nitrokey/nitrokey-3-firmware/pull/149 (`cd nitrokey-3-firmware/runners/usbip && cargo run --features alpha`)
> - Run `usbip attach -r localhost -b 1-1`
> - Run `pcscd`
> - Run `opgpcard status`
>
> The bug should happen.
>

Thank you for reporting the problem. I will take a look and
get back to you with possible fix. Will you be able to test
fixes?

thanks,
-- Shuah

2023-01-12 16:29:41

by Sosthène Guédon

[permalink] [raw]
Subject: Re: kernel BUG at mm/usercopy.c when using usbip

On 1/12/23 16:13, Shuah Khan wrote:
> Thank you for reporting the problem. I will take a look and
> get back to you with possible fix. Will you be able to test
> fixes?
>
> thanks,
> -- Shuah

I can probably build then insmod a custom build of just vhci-hcd (or
some other module), but I'm not sure I want to reboot in a completely
new kernel and risk having a laptop that doesn't boot. If necessary I
might try to reproduce the bug in a VM.

> Please send me complete dmesg and your kernel config.
>
> thanks,
> -- Shuah

Yes, see attachment for the full dmesg. I'm not sure what you mean by
kernel config. I attached the config at /proc/config.gz and the results
of the ver_linux script mentioned in the bug reporting documentation.

Thanks!

--
Sosthène Guédon,
Software engineer

Nitrokey GmbH
https://www.nitrokey.com

Email:[email protected]
Phone: +49 30 1205 3434

Rheinstr. 10 C, 14513 Teltow, Germany
CEO / Geschäftsführer: Jan Suhr
Register: AG Potsdam, HRB 32882 P
VAT ID / USt-IdNr.: DE300136599


Attachments:
ver_linux (2.42 kB)
kernel_config (252.76 kB)
full_dmesg.log (125.17 kB)
Download all attachments

2023-01-12 16:31:48

by Shuah Khan

[permalink] [raw]
Subject: Re: kernel BUG at mm/usercopy.c when using usbip

On 1/12/23 08:13, Shuah Khan wrote:
> On 1/12/23 01:40, Sosthène Guédon wrote:
>> Hi!
>>
>> I have stumbled upon a bug that is triggered reliably by using usbip.
>>
>> We are using usbip to test our firmware. usbip attach works, but once `opgpcard-tools` interacts with the firmware through pcscd, a kernel bug happens.
>> Then usbip stops working, and `lsusb` as well as other tools interacting with usb devices hang.
>> The symptoms are similar to https://bugzilla.kernel.org/show_bug.cgi?id=215487 but the kernel bug is not the same (see attached dmesg logs).
>>
>> The bug can be reproduced on arch (Linux archlinux 6.1.4-arch1-1 #1 SMP PREEMPT_DYNAMIC Sat, 07 Jan 2023 15:10:07 +0000 x86_64 GNU/Linux
>> ) and debian (Linux nitropc 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2 (2022-10-21) x86_64 GNU/Linux), though the reproduction is not minimal.
>>
>> To reproduce the bug on Arch Linux, with the packages rust, pcsclite, and openpgp-card-tools installed:
>>
>> - Compile and run the usbip runner from this PR: https://github.com/Nitrokey/nitrokey-3-firmware/pull/149 (`cd nitrokey-3-firmware/runners/usbip && cargo run --features alpha`)
>> - Run `usbip attach -r localhost -b 1-1`
>> - Run `pcscd`
>> - Run `opgpcard status`
>>
>> The bug should happen.
>>
>
> Thank you for reporting the problem. I will take a look and
> get back to you with possible fix. Will you be able to test
> fixes?
>

Please send me complete dmesg and your kernel config.

thanks,
-- Shuah

2023-03-15 16:26:01

by Hongren Zheng

[permalink] [raw]
Subject: Re: kernel BUG at mm/usercopy.c when using usbip

On Thu, Jan 12, 2023 at 09:40:59AM +0100, Sosth?ne Gu?don wrote:
> Hi!
>
> I have stumbled upon a bug that is triggered reliably by using usbip.
>
> We are using usbip to test our firmware. usbip attach works, but once
> `opgpcard-tools` interacts with the firmware through pcscd, a kernel bug
> happens.
> Then usbip stops working, and `lsusb` as well as other tools interacting
> with usb devices hang.
> The symptoms are similar to
> https://bugzilla.kernel.org/show_bug.cgi?id=215487 but the kernel bug is not
> the same (see attached dmesg logs).
>
> The bug can be reproduced on arch (Linux archlinux 6.1.4-arch1-1 #1 SMP
> PREEMPT_DYNAMIC Sat, 07 Jan 2023 15:10:07 +0000 x86_64 GNU/Linux
> ) and debian (Linux nitropc 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2
> (2022-10-21) x86_64 GNU/Linux), though the reproduction is not minimal.
>
> To reproduce the bug on Arch Linux, with the packages rust, pcsclite, and
> openpgp-card-tools installed:
>
> - Compile and run the usbip runner from this
> PR:?https://github.com/Nitrokey/nitrokey-3-firmware/pull/149 (`cd
> nitrokey-3-firmware/runners/usbip && cargo run --features alpha`)
> - Run `usbip attach -r localhost -b 1-1`
> - Run `pcscd`
> - Run `opgpcard status`
>
> The bug should happen.

I can reproduce the bug and I have captured the buggy usbip request/response
by running `tcpdump -i lo "tcp port 3240"`

First the architecture is
pcscd --ioctl-> vhci-hcd --kernel usb stack-> usbip-host --usbip tcp-> usbip-runner

This is a CTRL IN request which expect a response of length 0x22,
which has indicated its expectation in the transfer_buffer_length
and USB SETUP wLength field

CMD_SUBMIT:

usbip_header_basic:
00 00 00 01 00 00 00 0d 00 01 00 02 00 00 00 01 00 00 00 00
trasfer_flags:
00 00 02 00
**transfer_buffer_length**:
00 00 00 22
other meta:
00 00 00 00 00 00 00 00 00 00 00 00
**setup**:
81 06 00 22 00 00 **22 00**

But usbip-runner did not behave according to the
SETUP request, instead it returned a buffer of length 0x32.
It seems that it accidently added a descriptor twice.

RET_SUBMIT:

usbip_header_basic:
00 00 00 03 00 00 00 0d 00 00 00 02 00 00 00 01 00 00 00 00
status:
00 00 00 00
**actual_length:**
00 00 00 32
other metas:
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00
payload of length 0x32:
06 d0 f1 09 01 a1 01 09 03 15 00 26 ff 00 75 08
06 d0 f1 09 01 a1 01 09 03 15 00 26 ff 00 75 08
95 40 81 08 09 04 15 00 26 ff 00 75 08 95 40 91
08 c0

Then this leads to consequent errors in kernel

>
> --
> Sosth?ne Gu?don,
> Software engineer
>
> Nitrokey GmbH
> https://www.nitrokey.com
>
> Email: [email protected]
> Phone: +49 30 1205 3434
>
> Rheinstr. 10 C, 14513 Teltow, Germany
> CEO / Gesch?ftsf?hrer: Jan Suhr
> Register: AG Potsdam, HRB 32882 P
> VAT ID / USt-IdNr.: DE300136599

> [ 1464.886564] usb 5-1: new high-speed USB device number 2 using vhci_hcd
> [ 1465.019883] usb 5-1: SetAddress Request (2) to port 0
> [ 1465.036533] usb 5-1: Using ep0 maxpacket: 8
> [ 1465.153573] usb 5-1: config 1 interface 1 altsetting 0 bulk endpoint 0x82 has invalid maxpacket 64
> [ 1465.153591] usb 5-1: config 1 interface 1 altsetting 0 bulk endpoint 0x2 has invalid maxpacket 64
> [ 1465.189963] usb 5-1: New USB device found, idVendor=20a0, idProduct=42b2, bcdDevice= 0.10
> [ 1465.189979] usb 5-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
> [ 1465.189982] usb 5-1: Product: Nitrokey 3
> [ 1465.189986] usb 5-1: Manufacturer: Nitrokey
> [ 1465.231395] usb 5-1: recv xbuf, 0

usbip-host first detected such abnormality:

the returned urb->actualy_length is greater than
urb->transfer_buffer_length, the "recv xbuf, 0" path
is gone through.

https://github.com/torvalds/linux/blob/6015b1aca1a233379625385feb01dd014aca60b5/drivers/usb/usbip/usbip_common.c#L789-L791

> [ 1465.231548] vhci_hcd: stop threads
> [ 1465.231552] vhci_hcd: release socket
> [ 1465.231558] vhci_hcd: disconnect device
> [ 1466.209577] vhci_hcd vhci_hcd.0: pdev(0) rhport(0) sockfd(3)
> [ 1466.209586] vhci_hcd vhci_hcd.0: devid(65538) speed(3) speed_str(high-speed)
> [ 1466.209632] vhci_hcd vhci_hcd.0: Device attached
> [ 1466.234515] hid-generic 0003:20A0:42B2.0007: hiddev98,hidraw6: USB HID v1.11 Device [Nitrokey Nitrokey 3] on usb-vhci_hcd.0-1/input0
> [ 1480.014598] usb 5-1: recv xbuf, 0
> [ 1480.014620] usercopy: Kernel memory exposure attempt detected from SLUB object 'kmalloc-16' (offset 0, size 64)!

Then vhci-hcd is not happy.

I did not reproduce the kernel BUG though. I was using 5.15.47-1-lts
from Arch.

I guess if usbip-runner behaves normally then such error
wont be happen.

But other way around: the abnormal exit of usbip-host
should not lead to vhci_hcd bugs. There may be URB
not freed when usbip-host exits, but
I have not investigated into it yet.

> [ 1480.014631] ------------[ cut here ]------------
> [ 1480.014633] kernel BUG at mm/usercopy.c:101!
> [ 1480.014640] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> [ 1480.014644] CPU: 2 PID: 3912 Comm: pcscd Not tainted 6.1.4-arch1-1 #1 b56a0be67d6a5f69f99015da4a908cae98ee5acc
> [ 1480.014649] Hardware name: LENOVO 82NW/LNVNB161216, BIOS G9CN23WW 09/01/2021
> [ 1480.014652] RIP: 0010:usercopy_abort+0x79/0x7b
> [ 1480.014665] Code: a5 a5 51 48 0f 45 d6 48 89 c1 49 c7 c3 b0 82 a5 a5 41 52 48 c7 c6 2a de a4 a5 48 c7 c7 b0 2e ae a5 49 0f 45 f3 e8 34 57 ff ff <0f> 0b 48 89 f1 49 89 e8 44 89 e2 31 f6 48 c7 c7 fa 82 a5 a5 e8 6e
> [ 1480.014667] vhci_hcd: stop threads
> [ 1480.014669] RSP: 0018:ffffbc854a5bbd28 EFLAGS: 00010246
> [ 1480.014674] RAX: 0000000000000064 RBX: 0000000000000000 RCX: 0000000000000000
> [ 1480.014678] RDX: 0000000000000000 RSI: ffff98e85fea1660 RDI: ffff98e85fea1660
> [ 1480.014680] RBP: 0000000000000040 R08: 0000000000000000 R09: ffffbc854a5bbbc0
> [ 1480.014682] vhci_hcd: release socket
> [ 1480.014682] R10: 0000000000000003 R11: ffffffffa62cb828 R12: ffff98e5c0042300
> [ 1480.014683] R13: 0000000000000001 R14: 0000000000000040 R15: ffff98e67a690e80
> [ 1480.014686] FS: 00007f7b4a3a66c0(0000) GS:ffff98e85fe80000(0000) knlGS:0000000000000000
> [ 1480.014687] vhci_hcd: disconnect device
> [ 1480.014688] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1480.014690] CR2: 00007f7b4a374ff8 CR3: 00000001ba794000 CR4: 0000000000750ee0
> [ 1480.014692] PKRU: 55555554
> [ 1480.014694] Call Trace:
> [ 1480.014697] <TASK>
> [ 1480.014699] __check_heap_object+0xd9/0x110
> [ 1480.014706] __check_object_size+0x1ea/0x210
> [ 1480.014711] copy_urb_data_to_user+0xfc/0x120
> [ 1480.014715] ? __rseq_handle_notify_resume+0xad/0x4a0
> [ 1480.014720] processcompl+0xc8/0x140
> [ 1480.014722] usbdev_ioctl+0x120/0x1280
> [ 1480.014724] usb 5-1: USB disconnect, device number 2
> [ 1480.014726] ? exit_to_user_mode_prepare+0x16f/0x1d0
> [ 1480.014731] ? syscall_exit_to_user_mode+0x1b/0x40
> [ 1480.014735] ? do_syscall_64+0x6b/0x90
> [ 1480.014739] __x64_sys_ioctl+0x94/0xd0
> [ 1480.014744] do_syscall_64+0x5f/0x90
> [ 1480.014746] ? syscall_exit_to_user_mode+0x1b/0x40
> [ 1480.014749] ? do_syscall_64+0x6b/0x90
> [ 1480.014751] entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [ 1480.014756] RIP: 0033:0x7f7b4d79cc0f
> [ 1480.014778] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
> [ 1480.014780] RSP: 002b:00007f7b4a385500 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> [ 1480.014783] RAX: ffffffffffffffda RBX: 00007f7b440705a0 RCX: 00007f7b4d79cc0f
> [ 1480.014785] RDX: 00007f7b4a385580 RSI: 000000004008550d RDI: 000000000000000d
> [ 1480.014786] RBP: 000000000000001a R08: 000000000000000d R09: 0000000000000000
> [ 1480.014787] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
> [ 1480.014789] R13: 0000000000000003 R14: 0000000000000001 R15: 00007f7b440705a0
> [ 1480.014794] </TASK>
> [ 1480.014795] Modules linked in: vhci_hcd rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device ccm wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha ip6_udp_tunnel udp_tunnel cmac algif_hash algif_skcipher af_alg snd_sof_amd_rembrandt bnep intel_rapl_msr snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci snd_sof snd_sof_utils rtw89_8852ae snd_hda_codec_realtek intel_rapl_common snd_soc_core rtw89_8852a snd_hda_codec_generic ledtrig_audio snd_compress rtw89_pci snd_hda_codec_hdmi ac97_bus rtw89_core snd_hda_intel snd_pcm_dmaengine snd_intel_dspcfg snd_intel_sdw_acpi hid_multitouch snd_pci_ps amdgpu wmi_bmof edac_mce_amd snd_hda_codec snd_rpl_pci_acp6x mac80211 btusb snd_hda_core snd_acp_pci snd_pci_acp6x snd_hwdep btrtl snd_pci_acp5x gpu_sched btbcm snd_pcm snd_rn_pci_acp3x r8169 drm_buddy kvm_amd btintel ideapad_laptop libarc4 snd_acp_config realtek drm_ttm_helper ucsi_acpi cfg80211 sp5100_tco typec_ucsi ttm btmtk snd_timer
> [ 1480.014861] mdio_devres cm32181 snd_soc_acpi vfat sparse_keymap kvm irqbypass rapl wdat_wdt pcspkr typec fat bluetooth k10temp platform_profile ecdh_generic i2c_piix4 snd snd_pci_acp3x drm_display_helper soundcore libphy cec rfkill roles video wmi mousedev industrialio lzo_rle i2c_hid_acpi joydev i2c_hid acpi_cpufreq mac_hid usbip_host usbip_core pkcs8_key_parser dm_multipath crypto_user fuse zram bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 usbhid dm_crypt cbc encrypted_keys trusted asn1_encoder tee dm_mod serio_raw atkbd libps2 crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic gf128mul ghash_clmulni_intel vivaldi_fmap sha512_ssse3 aesni_intel nvme crypto_simd nvme_core xhci_pci cryptd ccp i8042 xhci_pci_renesas nvme_common serio
> [ 1480.014912] ---[ end trace 0000000000000000 ]---
> [ 1480.014914] RIP: 0010:usercopy_abort+0x79/0x7b
> [ 1480.014917] Code: a5 a5 51 48 0f 45 d6 48 89 c1 49 c7 c3 b0 82 a5 a5 41 52 48 c7 c6 2a de a4 a5 48 c7 c7 b0 2e ae a5 49 0f 45 f3 e8 34 57 ff ff <0f> 0b 48 89 f1 49 89 e8 44 89 e2 31 f6 48 c7 c7 fa 82 a5 a5 e8 6e
> [ 1480.014919] RSP: 0018:ffffbc854a5bbd28 EFLAGS: 00010246
> [ 1480.014921] RAX: 0000000000000064 RBX: 0000000000000000 RCX: 0000000000000000
> [ 1480.014923] RDX: 0000000000000000 RSI: ffff98e85fea1660 RDI: ffff98e85fea1660
> [ 1480.014924] RBP: 0000000000000040 R08: 0000000000000000 R09: ffffbc854a5bbbc0
> [ 1480.014925] R10: 0000000000000003 R11: ffffffffa62cb828 R12: ffff98e5c0042300
> [ 1480.014927] R13: 0000000000000001 R14: 0000000000000040 R15: ffff98e67a690e80
> [ 1480.014928] FS: 00007f7b4a3a66c0(0000) GS:ffff98e85fe80000(0000) knlGS:0000000000000000
> [ 1480.014930] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1480.014932] CR2: 00007f7b4a374ff8 CR3: 00000001ba794000 CR4: 0000000000750ee0
> [ 1480.014934] PKRU: 55555554
> [ 1480.030021] vhci_hcd vhci_hcd.0: pdev(0) rhport(0) sockfd(3)
> [ 1480.030025] vhci_hcd vhci_hcd.0: devid(65538) speed(3) speed_str(high-speed)
> [ 1480.030034] vhci_hcd vhci_hcd.0: Device attached
> [ 1480.030396] kauditd_printk_skb: 3 callbacks suppressed