2022-09-04 18:55:35

by Mateusz Jończyk

Subject: [Bisected] Disconnecting pendrive with opened files causes trouble on 6.0.0-rc3

Hello,

On Linux 6.0.0-rc3, disconnecting a mounted pendrive with opened files causes these symptoms:

- "task kworker/X blocked for more than Y seconds" warnings (see below) are repeatedly printed in dmesg,
- after reconnecting the pendrive (or possibly other USB devices), it is not detected,
- the login screen (Gnome on Ubuntu 20.04) freezes for some time (around 20s), due to a hang in colord-sane.

These symptoms disappear after closing the opened files (and perhaps also directories) on the pendrive.
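
Since the symptoms depend on files still being open on the pendrive, it can help to see which processes hold them. The following is a minimal sketch, not part of the original report: it scans the per-process `fd` symlinks under `/proc` and reports those pointing below a given mount point. The function name `open_files_under` and the `proc_root` parameter are hypothetical, and the mount point `/media/usb` is only an example.

```python
import os


def open_files_under(mount_point, proc_root="/proc"):
    """Return {pid: [paths]} for file descriptors open beneath mount_point.

    Assumption: a Linux-style procfs where /proc/<pid>/fd/<n> is a symlink
    to the opened path. Working directories (/proc/<pid>/cwd) are not checked.
    """
    # Trailing separator so "/media/usb" does not match "/media/usb2".
    prefix = os.path.join(os.path.abspath(mount_point), "")
    holders = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor was closed in the meantime
            if target.startswith(prefix):
                holders.setdefault(int(entry), []).append(target)
    return holders


if __name__ == "__main__":
    for pid, paths in open_files_under("/media/usb").items():
        print(pid, paths)
```

Without root privileges the scan only sees descriptors of processes owned by the current user.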

I am aware that disconnecting a pendrive with a mounted filesystem is unwise, but many people do so
nonetheless.

I have bisected this down to
commit 16728aaba62e ("scsi: core: Make sure that hosts outlive targets")

With the previous
commit fe442604199e ("scsi: core: Make sure that targets outlive devices")
the problem does not happen.

To reproduce:

1. Plug a pendrive into a USB port (checked with a pendrive formatted with a FAT32 filesystem).
2. Mount it, open a file on this pendrive with the "less" command-line program.
3. Optionally suspend the system to RAM.
4. Disconnect the pendrive.
5. Resume the system, if it was suspended.

Warnings from dmesg:

[   76.212690] usb 1-3.4.3: USB disconnect, device number 6
[   76.219765] device offline error, dev sdc, sector 2049 op 0x1:(WRITE) flags 0x100000 phys_seg 1 prio class 2
[   76.219791] Buffer I/O error on dev sdc1, logical block 1, lost async page write
[  242.764446] INFO: task kworker/0:7:1701 blocked for more than 120 seconds.
[  242.764463]       Tainted: G            E      6.0.0-rc3mj6 #239
[  242.764470] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  242.764474] task:kworker/0:7     state:D stack:    0 pid: 1701 ppid:     2 flags:0x00004000
[  242.764488] Workqueue: usb_hub_wq hub_event
[  242.764504] Call Trace:
[  242.764507]  <TASK>
[  242.764514]  __schedule+0x2c5/0xcd0
[  242.764527]  schedule+0x5d/0xf0
[  242.764535]  scsi_remove_host+0x163/0x1b0
[  242.764547]  ? var_wake_function+0x60/0x60
[  242.764559]  usb_stor_disconnect+0x50/0xd0 [usb_storage]
[  242.764571]  usb_unbind_interface+0x8c/0x240
[  242.764581]  device_remove+0x64/0x70
[  242.764590]  device_release_driver_internal+0xd1/0x160
[  242.764599]  device_release_driver+0x12/0x20
[  242.764608]  bus_remove_device+0xde/0x150
[  242.764616]  device_del+0x192/0x3e0
[  242.764621]  ? usb_remove_ep_devs+0x1f/0x30
[  242.764632]  usb_disable_device+0xab/0x240
[  242.764639]  usb_disconnect+0xc7/0x260
[  242.764647]  ? set_port_feature+0x37/0x40
[  242.764656]  hub_event+0xeb0/0x17c0
[  242.764668]  ? __schedule+0x2cd/0xcd0
[  242.764675]  ? queue_rcu_work+0x2c/0x40
[  242.764684]  process_one_work+0x21c/0x3c0
[  242.764693]  worker_thread+0x4a/0x3b0
[  242.764703]  ? process_one_work+0x3c0/0x3c0
[  242.764711]  kthread+0xcf/0xf0
[  242.764719]  ? kthread_complete_and_exit+0x20/0x20
[  242.764728]  ret_from_fork+0x22/0x30
[  242.764737]  </TASK>
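
For anyone comparing several such logs, the hung-task report above can be summarized mechanically. The sketch below is an illustration only, not a tool used in this thread: it extracts the task name, PID, timeout, and the reliable call-trace frames (those not prefixed with `?`) from dmesg text. The regular expressions assume the exact line layout shown above, and the names `parse_hung_tasks`, `HUNG_RE`, and `FRAME_RE` are hypothetical.

```python
import re

# Matches e.g. "INFO: task kworker/0:7:1701 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
# Matches e.g. "]  scsi_remove_host+0x163/0x1b0"; a leading "? " marks an unreliable frame.
FRAME_RE = re.compile(
    r"\]\s+(?P<maybe>\? )?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_hung_tasks(log_text):
    """Return one dict per hung-task report found in log_text."""
    reports = []
    current = None
    for line in log_text.splitlines():
        hung = HUNG_RE.search(line)
        if hung:
            current = {
                "comm": hung["comm"],
                "pid": int(hung["pid"]),
                "seconds": int(hung["secs"]),
                "frames": [],
            }
            reports.append(current)
            continue
        if current is None:
            continue
        if "</TASK>" in line:
            current = None  # end of this report's call trace
            continue
        frame = FRAME_RE.search(line)
        if frame and not frame["maybe"]:
            current["frames"].append(frame["func"])
    return reports


SAMPLE = """\
[  242.764446] INFO: task kworker/0:7:1701 blocked for more than 120 seconds.
[  242.764504] Call Trace:
[  242.764507]  <TASK>
[  242.764514]  __schedule+0x2c5/0xcd0
[  242.764527]  schedule+0x5d/0xf0
[  242.764535]  scsi_remove_host+0x163/0x1b0
[  242.764547]  ? var_wake_function+0x60/0x60
[  242.764559]  usb_stor_disconnect+0x50/0xd0 [usb_storage]
[  242.764737]  </TASK>
"""

if __name__ == "__main__":
    for report in parse_hung_tasks(SAMPLE):
        print(report["comm"], report["pid"], report["frames"])
```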

The kernel is tainted AFAIK because of the following:

[    0.442486] backlight: module verification failed: signature and/or required key missing - tainting kernel
[...]
[    0.446462] Loading compiled-in X.509 certificates
[    0.447431] Loaded X.509 cert 'Build time autogenerated kernel key: cbf6b5d58385db4124f3ddd5fa0457b112415dcb'

The module was loaded before the signing keys were processed, which is an unrelated bug that has been present for a long time.
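
The `Tainted: G            E` line in the hung-task report can be decoded letter by letter. The sketch below is an illustration under stated assumptions: the table is deliberately partial, with meanings taken from the kernel's tainted-kernels documentation, and the names `TAINT_FLAGS` and `decode_taint` are hypothetical. Here `E` corresponds to the unsigned-module message quoted above.

```python
# Partial table only; see the kernel's tainted-kernels documentation for the full list.
TAINT_FLAGS = {
    "G": "no proprietary module loaded",
    "P": "proprietary module loaded",
    "F": "module was force-loaded",
    "D": "kernel oopsed or hit a BUG",
    "W": "kernel issued a warning",
    "C": "staging driver loaded",
    "O": "out-of-tree module loaded",
    "E": "unsigned module loaded",
}


def decode_taint(line):
    """Return [(letter, meaning)] for a 'Tainted:' line from dmesg."""
    _, found, rest = line.partition("Tainted:")
    if not found:
        return []  # e.g. a "Not tainted" line
    letters = []
    for token in rest.split():
        if not (token.isalpha() and token.isupper()):
            break  # reached the kernel version string
        letters.extend(token)
    return [(c, TAINT_FLAGS.get(c, "not in this partial table")) for c in letters]


if __name__ == "__main__":
    line = "[  242.764463]       Tainted: G            E      6.0.0-rc3mj6 #239"
    for letter, meaning in decode_taint(line):
        print(letter, meaning)
```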


System information:
- Ubuntu 20.04 amd64,
- HP 17-by0001nw laptop,
- problem happens also when using an external USB 3.0 UNITEK hub.

Greetings,

Mateusz Jończyk


Subject: Re: [Bisected] Disconnecting pendrive with opened files causes trouble on 6.0.0-rc3 #forregzbot

TWIMC: this mail is primarily sent for documentation purposes and for
regzbot, my Linux kernel regression tracking bot. These mails usually
contain '#forregzbot' in the subject, to make them easy to spot and filter.

[TLDR: I'm adding this regression report to the list of tracked
regressions; all text from me you find below is based on a few template
paragraphs you might have encountered already in similar form.]

Hi, this is your Linux kernel regression tracker. CCing the regression
mailing list, as it should be in the loop for all regressions, as
explained here:
https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html

On 04.09.22 20:28, Mateusz Jończyk wrote:
> Hello,
>
> On Linux 6.0.0-rc3, disconnecting a mounted pendrive with opened files causes these symptoms:
>
> - "task kworker/X blocked for more than Y seconds" warnings (see below) are repeatedly printed in dmesg,
> - after reconnecting the pendrive (or possibly other USB devices), it is not detected,
> - the login screen (Gnome on Ubuntu 20.04) freezes for some time (around 20s), due to a hang in colord-sane.
>
> These symptoms disappear after closing the opened files (and perhaps also directories) on the pendrive.
>
> I am aware that disconnecting a pendrive with a mounted filesystem is unwise, but many people do so
> nonetheless.
>
> I have bisected this down to
> commit 16728aaba62e ("scsi: core: Make sure that hosts outlive targets")
>
> With the previous
> commit fe442604199e ("scsi: core: Make sure that targets outlive devices")
> the problem does not happen.
>
> To reproduce:
>
> 1. Plug a pendrive into a USB port (checked with a pendrive formatted with a FAT32 filesystem).
> 2. Mount it, open a file on this pendrive with the "less" command-line program.
> 3. Optionally suspend the system to RAM.
> 4. Disconnect the pendrive.
> 5. Resume the system, if it was suspended.
>
> Warnings from dmesg:
>
> [   76.212690] usb 1-3.4.3: USB disconnect, device number 6
> [   76.219765] device offline error, dev sdc, sector 2049 op 0x1:(WRITE) flags 0x100000 phys_seg 1 prio class 2
> [   76.219791] Buffer I/O error on dev sdc1, logical block 1, lost async page write
> [  242.764446] INFO: task kworker/0:7:1701 blocked for more than 120 seconds.
> [  242.764463]       Tainted: G            E      6.0.0-rc3mj6 #239
> [  242.764470] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  242.764474] task:kworker/0:7     state:D stack:    0 pid: 1701 ppid:     2 flags:0x00004000
> [  242.764488] Workqueue: usb_hub_wq hub_event
> [  242.764504] Call Trace:
> [  242.764507]  <TASK>
> [  242.764514]  __schedule+0x2c5/0xcd0
> [  242.764527]  schedule+0x5d/0xf0
> [  242.764535]  scsi_remove_host+0x163/0x1b0
> [  242.764547]  ? var_wake_function+0x60/0x60
> [  242.764559]  usb_stor_disconnect+0x50/0xd0 [usb_storage]
> [  242.764571]  usb_unbind_interface+0x8c/0x240
> [  242.764581]  device_remove+0x64/0x70
> [  242.764590]  device_release_driver_internal+0xd1/0x160
> [  242.764599]  device_release_driver+0x12/0x20
> [  242.764608]  bus_remove_device+0xde/0x150
> [  242.764616]  device_del+0x192/0x3e0
> [  242.764621]  ? usb_remove_ep_devs+0x1f/0x30
> [  242.764632]  usb_disable_device+0xab/0x240
> [  242.764639]  usb_disconnect+0xc7/0x260
> [  242.764647]  ? set_port_feature+0x37/0x40
> [  242.764656]  hub_event+0xeb0/0x17c0
> [  242.764668]  ? __schedule+0x2cd/0xcd0
> [  242.764675]  ? queue_rcu_work+0x2c/0x40
> [  242.764684]  process_one_work+0x21c/0x3c0
> [  242.764693]  worker_thread+0x4a/0x3b0
> [  242.764703]  ? process_one_work+0x3c0/0x3c0
> [  242.764711]  kthread+0xcf/0xf0
> [  242.764719]  ? kthread_complete_and_exit+0x20/0x20
> [  242.764728]  ret_from_fork+0x22/0x30
> [  242.764737]  </TASK>
>
> The kernel is tainted AFAIK because of the following:
>
> [    0.442486] backlight: module verification failed: signature and/or required key missing - tainting kernel
> [...]
> [    0.446462] Loading compiled-in X.509 certificates
> [    0.447431] Loaded X.509 cert 'Build time autogenerated kernel key: cbf6b5d58385db4124f3ddd5fa0457b112415dcb'
>
> The module was loaded before the signing keys were processed, which is an unrelated bug that has been present for a long time.
>
>
> System information:
> - Ubuntu 20.04 amd64,
> - HP 17-by0001nw laptop,
> - problem happens also when using an external USB 3.0 UNITEK hub.
>
> Greetings,
>
> Mateusz Jończyk
>

Thanks for the report. To be sure the issue below doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
tracking bot:

#regzbot ^introduced 16728aaba62e
#regzbot title scsi: core: Disconnecting pendrive with opened files suddenly causes trouble
#regzbot ignore-activity
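
The `#regzbot` lines above are plain one-line commands embedded in the mail body. As a rough illustration of that structure, the sketch below pulls such commands out of a message. It is not regzbot's actual parser, and the function name `parse_regzbot_commands` is hypothetical. It assumes each command sits on a single unquoted line.

```python
def parse_regzbot_commands(body):
    """Return [(command, argument)] for unquoted '#regzbot' lines in body."""
    commands = []
    for raw in body.splitlines():
        line = raw.strip()
        if line.startswith(">"):
            continue  # quoted text from an earlier mail
        if not line.startswith("#regzbot "):
            continue
        _, _, rest = line.partition(" ")
        name, _, arg = rest.strip().partition(" ")
        # Some commands appear with a trailing colon, e.g. "fixed-by:".
        commands.append((name.rstrip(":"), arg.strip()))
    return commands


if __name__ == "__main__":
    mail = """\
#regzbot ^introduced 16728aaba62e
#regzbot ignore-activity
#regzbot fixed-by: f782201ebc2b5f6
"""
    for command, argument in parse_regzbot_commands(mail):
        print(command, argument)
```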

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply -- ideally while also
telling regzbot about it, as explained here:
https://linux-regtracking.leemhuis.info/tracked-regression/

Reminder for developers: When fixing the issue, add 'Link:' tags
pointing to the report (the mail this one replies to), as explained
in the Linux kernel's documentation; the webpage above explains why this is
important for tracked regressions.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply; it's in everyone's interest to set the public record straight.

2022-09-06 15:21:46

by Bart Van Assche

Subject: Re: [Bisected] Disconnecting pendrive with opened files causes trouble on 6.0.0-rc3

On 9/4/22 11:28, Mateusz Jończyk wrote:
> I have bisected this down to
> commit 16728aaba62e ("scsi: core: Make sure that hosts outlive targets")
>
> With the previous
> commit fe442604199e ("scsi: core: Make sure that targets outlive devices")
> the problem does not happen.

Hi Mateusz,

A revert for the patch series "Call blk_mq_free_tag_set() earlier" has been
queued and is expected to be sent soon to Linus. See also
https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/log/?h=6.0/scsi-fixes

Thanks,

Bart.


Subject: Re: [Bisected] Disconnecting pendrive with opened files causes trouble on 6.0.0-rc3 #forregzbot

On 05.09.22 12:27, Thorsten Leemhuis wrote:
>
> [TLDR: I'm adding this regression report to the list of tracked
> regressions; all text from me you find below is based on a few template
> paragraphs you might have encountered already in similar form.]

#regzbot fixed-by: f782201ebc2b5f6

For details see Bart's reply to the reporter.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply; it's in everyone's interest to set the public record straight.