LinuxLists.cc - regression caused by block: freeze the queue earlier in del

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

Hi, this is your Linux kernel regression tracker. Thx for the report.

CCing the regression mailing list, as it should be in the loop for all
regressions, as explained here:
https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html

On 26.08.22 18:15, Dusty Mabe wrote:
>
> I think I've found a regression introduced by:
> a09b314 o block: freeze the queue earlier in del_gendisk

Just FYI, in case you are not aware of it already: there was another
report that this commit causes problems. See this thread for details:
https://lore.kernel.org/all/[email protected]/#t

Anyway, let me add this report to the regressions tracking:

[TLDR: I'm adding this regression report to the list of tracked
regressions; all text from me you find below is based on a few template
paragraphs you might have encountered already in similar form.]

> In Fedora CoreOS we have tests that set up RAID1 on the /boot/ and /root/ partitions
> and then subsequently remove one of the disks to simulate a failure. Sometime recently
> this test started timing out occasionally. Looking a bit closer it appears instances are
> getting stuck during reboot with a bunch of looping messages:
>
> ```
> [ 17.978854] block device autoloading is deprecated and will be removed.
> [ 17.982555] block device autoloading is deprecated and will be removed.
> [ 17.985537] block device autoloading is deprecated and will be removed.
> [ 17.987546] block device autoloading is deprecated and will be removed.
> [ 17.989540] block device autoloading is deprecated and will be removed.
> [ 17.991547] block device autoloading is deprecated and will be removed.
> [ 17.993555] block device autoloading is deprecated and will be removed.
> [ 17.995539] block device autoloading is deprecated and will be removed.
> [ 17.997577] block device autoloading is deprecated and will be removed.
> [ 17.999544] block device autoloading is deprecated and will be removed.
> [ 22.979465] blkdev_get_no_open: 1666 callbacks suppressed
> ...
> ...
> ...
> [ 618.221270] blkdev_get_no_open: 1664 callbacks suppressed
> [ 618.221273] block device autoloading is deprecated and will be removed.
> [ 618.224274] block device autoloading is deprecated and will be removed.
> [ 618.227267] block device autoloading is deprecated and will be removed.
> [ 618.229274] block device autoloading is deprecated and will be removed.
> [ 618.231277] block device autoloading is deprecated and will be removed.
> [ 618.233277] block device autoloading is deprecated and will be removed.
> [ 618.235282] block device autoloading is deprecated and will be removed.
> [ 618.237370] block device autoloading is deprecated and will be removed.
> [ 618.239356] block device autoloading is deprecated and will be removed.
> [ 618.241290] block device autoloading is deprecated and will be removed.
> ```
>
> Using the Fedora kernels I narrowed it down to being introduced between
> `kernel-5.19.0-0.rc3.27.fc37` (good) and `kernel-5.19.0-0.rc4.33.fc37` (bad).
>
> I then did a bisect and found:
>
> ```
> $ git bisect bad
> a09b314005f3a0956ebf56e01b3b80339df577cc is the first bad commit
> commit a09b314005f3a0956ebf56e01b3b80339df577cc
> Author: Christoph Hellwig <[email protected]>
> Date: Tue Jun 14 09:48:27 2022 +0200
>
> block: freeze the queue earlier in del_gendisk
>
> Freeze the queue earlier in del_gendisk so that the state does not
> change while we remove debugfs and sysfs files.
>
> Ming mentioned that being able to observer request in debugfs might
> be useful while the queue is being frozen in del_gendisk, which is
> made possible by this change.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> Signed-off-by: Jens Axboe <[email protected]>
>
> block/genhd.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
> ```
>
> Reverting this commit on top of latest git master (4c612826b) gave me successful results.
>
> Any ideas on what could be amiss here? Luckily the patch is tiny so hopefully it might
> be obvious.
>
> More details (including logs) in the following locations:
>
> - https://bugzilla.redhat.com/show_bug.cgi?id=2121791
> - https://github.com/coreos/fedora-coreos-tracker/issues/1282
>
>
> Thanks!
> Dusty
>

Thanks for the report. To be sure the issue below doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
tracking bot:

#regzbot ^introduced a09b314005f3a0
#regzbot title block: timeouts when removing a disk from a RAID1
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply -- ideally with also
telling regzbot about it, as explained here:
https://linux-regtracking.leemhuis.info/tracked-regression/

Reminder for developers: When fixing the issue, add 'Link:' tags
pointing to the report (the mail this one replies to), as explained
in the Linux kernel's documentation; the above webpage explains why this is
important for tracked regressions.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.

2022-08-31 13:35:22

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

[CCing the mdraid maintainer and the raid ml to keep them in the loop]

Hi, this is your Linux kernel regression tracker. Top-posting for once,
to make this easily accessible to everyone.

Christoph, Jens, what's up here? Dusty bisected this and even confirmed
that a revert on top of current mainline fixes things for him; nevertheless,
he hasn't gotten a single reply since he reported the issue last Friday.

BTW, it seems quite a few Fedora users are now hitting this with the
slightly patched Fedora 5.19.y kernels that have been shipping as a
regular update for a few days, as comments in
https://bugzilla.redhat.com/show_bug.cgi?id=2121791 show -- so it seems
it's not something specific to Dusty's setup.

Could you please look into the issue? tia!

Ciao, Thorsten

On 28.08.22 12:24, Thorsten Leemhuis wrote:
> Hi, this is your Linux kernel regression tracker. Thx for the report.
>
> CCing the regression mailing list, as it should be in the loop for all
> regressions, as explained here:
> https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html
>
> On 26.08.22 18:15, Dusty Mabe wrote:
>>
>> I think I've found a regression introduced by:
>> a09b314 o block: freeze the queue earlier in del_gendisk
>
> Just FYI, in case you are not aware of it already: there was another
> report that this commit causes problems. See this thread for details:
> https://lore.kernel.org/all/[email protected]/#t
>
> Anyway, let me add this report to the regressions tracking:
>
> [TLDR: I'm adding this regression report to the list of tracked
> regressions; all text from me you find below is based on a few template
> paragraphs you might have encountered already in similar form.]
>
>> In Fedora CoreOS we have tests that set up RAID1 on the /boot/ and /root/ partitions
>> and then subsequently remove one of the disks to simulate a failure. Sometime recently
>> this test started timing out occasionally. Looking a bit closer it appears instances are
>> getting stuck during reboot with a bunch of looping messages:
>>
>> ```
>> [ 17.978854] block device autoloading is deprecated and will be removed.
>> [ 17.982555] block device autoloading is deprecated and will be removed.
>> [ 17.985537] block device autoloading is deprecated and will be removed.
>> [ 17.987546] block device autoloading is deprecated and will be removed.
>> [ 17.989540] block device autoloading is deprecated and will be removed.
>> [ 17.991547] block device autoloading is deprecated and will be removed.
>> [ 17.993555] block device autoloading is deprecated and will be removed.
>> [ 17.995539] block device autoloading is deprecated and will be removed.
>> [ 17.997577] block device autoloading is deprecated and will be removed.
>> [ 17.999544] block device autoloading is deprecated and will be removed.
>> [ 22.979465] blkdev_get_no_open: 1666 callbacks suppressed
>> ...
>> ...
>> ...
>> [ 618.221270] blkdev_get_no_open: 1664 callbacks suppressed
>> [ 618.221273] block device autoloading is deprecated and will be removed.
>> [ 618.224274] block device autoloading is deprecated and will be removed.
>> [ 618.227267] block device autoloading is deprecated and will be removed.
>> [ 618.229274] block device autoloading is deprecated and will be removed.
>> [ 618.231277] block device autoloading is deprecated and will be removed.
>> [ 618.233277] block device autoloading is deprecated and will be removed.
>> [ 618.235282] block device autoloading is deprecated and will be removed.
>> [ 618.237370] block device autoloading is deprecated and will be removed.
>> [ 618.239356] block device autoloading is deprecated and will be removed.
>> [ 618.241290] block device autoloading is deprecated and will be removed.
>> ```
>>
>> Using the Fedora kernels I narrowed it down to being introduced between
>> `kernel-5.19.0-0.rc3.27.fc37` (good) and `kernel-5.19.0-0.rc4.33.fc37` (bad).
>>
>> I then did a bisect and found:
>>
>> ```
>> $ git bisect bad
>> a09b314005f3a0956ebf56e01b3b80339df577cc is the first bad commit
>> commit a09b314005f3a0956ebf56e01b3b80339df577cc
>> Author: Christoph Hellwig <[email protected]>
>> Date: Tue Jun 14 09:48:27 2022 +0200
>>
>> block: freeze the queue earlier in del_gendisk
>>
>> Freeze the queue earlier in del_gendisk so that the state does not
>> change while we remove debugfs and sysfs files.
>>
>> Ming mentioned that being able to observer request in debugfs might
>> be useful while the queue is being frozen in del_gendisk, which is
>> made possible by this change.
>>
>> Signed-off-by: Christoph Hellwig <[email protected]>
>> Link: https://lore.kernel.org/r/[email protected]
>> Signed-off-by: Jens Axboe <[email protected]>
>>
>> block/genhd.c | 3 +--
>> 1 file changed, 1 insertion(+), 2 deletions(-)
>> ```
>>
>> Reverting this commit on top of latest git master (4c612826b) gave me successful results.
>>
>> Any ideas on what could be amiss here? Luckily the patch is tiny so hopefully it might
>> be obvious.
>>
>> More details (including logs) in the following locations:
>>
>> - https://bugzilla.redhat.com/show_bug.cgi?id=2121791
>> - https://github.com/coreos/fedora-coreos-tracker/issues/1282
>>
>>
>> Thanks!
>> Dusty
>>
>
> Thanks for the report. To be sure the issue below doesn't fall through the
> cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
> tracking bot:
>
> #regzbot ^introduced a09b314005f3a0
> #regzbot title block: timeouts when removing a disk from a RAID1
> #regzbot ignore-activity
>
> This isn't a regression? This issue or a fix for it are already
> discussed somewhere else? It was fixed already? You want to clarify when
> the regression started to happen? Or point out I got the title or
> something else totally wrong? Then just reply -- ideally with also
> telling regzbot about it, as explained here:
> https://linux-regtracking.leemhuis.info/tracked-regression/
>
> Reminder for developers: When fixing the issue, add 'Link:' tags
> pointing to the report (the mail this one replies to), as explained
> in the Linux kernel's documentation; the above webpage explains why this is
> important for tracked regressions.
>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>
> P.S.: As the Linux kernel's regression tracker I deal with a lot of
> reports and sometimes miss something important when writing mails like
> this. If that's the case here, don't hesitate to tell me in a public
> reply, it's in everyone's interest to set the public record straight.

BTW:

#regzbot link: https://bugzilla.redhat.com/show_bug.cgi?id=2121791
#regzbot poke

2022-09-01 08:00:49

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

Hi Dusty,

On Fri, Aug 26, 2022 at 12:15:22PM -0400, Dusty Mabe wrote:
> Hey All,
>
> I think I've found a regression introduced by:
>
> a09b314 o block: freeze the queue earlier in del_gendisk
>
> In Fedora CoreOS we have tests that set up RAID1 on the /boot/ and /root/ partitions
> and then subsequently remove one of the disks to simulate a failure. Sometime recently

Do you have a test case which doesn't need raid1 over /boot or /root? For
example, create raid1 over two disks, then mount it and remove one of the devices, ...

It isn't easy to set up such a test case and observe what is wrong.

> this test started timing out occasionally. Looking a bit closer it appears instances are
> getting stuck during reboot with a bunch of looping messages:
>
> ```
> [ 17.978854] block device autoloading is deprecated and will be removed.
> [ 17.982555] block device autoloading is deprecated and will be removed.
> [ 17.985537] block device autoloading is deprecated and will be removed.
> [ 17.987546] block device autoloading is deprecated and will be removed.
> [ 17.989540] block device autoloading is deprecated and will be removed.
> [ 17.991547] block device autoloading is deprecated and will be removed.
> [ 17.993555] block device autoloading is deprecated and will be removed.
> [ 17.995539] block device autoloading is deprecated and will be removed.
> [ 17.997577] block device autoloading is deprecated and will be removed.
> [ 17.999544] block device autoloading is deprecated and will be removed.
> [ 22.979465] blkdev_get_no_open: 1666 callbacks suppressed
> ...
> ...
> ...
> [ 618.221270] blkdev_get_no_open: 1664 callbacks suppressed
> [ 618.221273] block device autoloading is deprecated and will be removed.
> [ 618.224274] block device autoloading is deprecated and will be removed.
> [ 618.227267] block device autoloading is deprecated and will be removed.
> [ 618.229274] block device autoloading is deprecated and will be removed.
> [ 618.231277] block device autoloading is deprecated and will be removed.
> [ 618.233277] block device autoloading is deprecated and will be removed.
> [ 618.235282] block device autoloading is deprecated and will be removed.
> [ 618.237370] block device autoloading is deprecated and will be removed.
> [ 618.239356] block device autoloading is deprecated and will be removed.
> [ 618.241290] block device autoloading is deprecated and will be removed.
> ```
>
> Using the Fedora kernels I narrowed it down to being introduced between
> `kernel-5.19.0-0.rc3.27.fc37` (good) and `kernel-5.19.0-0.rc4.33.fc37` (bad).
>
> I then did a bisect and found:
>
> ```
> $ git bisect bad
> a09b314005f3a0956ebf56e01b3b80339df577cc is the first bad commit
> commit a09b314005f3a0956ebf56e01b3b80339df577cc
> Author: Christoph Hellwig <[email protected]>
> Date: Tue Jun 14 09:48:27 2022 +0200
>
> block: freeze the queue earlier in del_gendisk

It is a bit hard to associate the above commit with the reported issue.

Thanks,
Ming

2022-09-03 14:27:54

by Dusty Mabe

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

On 9/1/22 03:06, Ming Lei wrote:
> Hi Dusty,

Hi Ming,

>
> On Fri, Aug 26, 2022 at 12:15:22PM -0400, Dusty Mabe wrote:
>> Hey All,
>>
>> I think I've found a regression introduced by:
>>
>> a09b314 o block: freeze the queue earlier in del_gendisk
>>
>> In Fedora CoreOS we have tests that set up RAID1 on the /boot/ and /root/ partitions
>> and then subsequently remove one of the disks to simulate a failure. Sometime recently
>
> Do you have a test case which doesn't need raid1 over /boot or /root? For
> example, create raid1 over two disks, then mount it and remove one of the devices, ...
>
> It isn't easy to set up such a test case and observe what is wrong.

I don't have such a test case. For Fedora CoreOS we have a very
specific partition layout [1] so it's not easy to change that
and continue to run our test framework.

That being said, there are plenty of people in the bug report [2]
who are reporting seeing this as well, so they might have other
test cases they can share.

[1] https://github.com/coreos/fedora-coreos-tracker/blob/main/Design.md#disk-layout
[2] https://bugzilla.redhat.com/show_bug.cgi?id=2121791

>
>> this test started timing out occasionally. Looking a bit closer it appears instances are
>> getting stuck during reboot with a bunch of looping messages:
>>
>> ```
>> [ 17.978854] block device autoloading is deprecated and will be removed.
>> [ 17.982555] block device autoloading is deprecated and will be removed.
>> [ 17.985537] block device autoloading is deprecated and will be removed.
>> [ 17.987546] block device autoloading is deprecated and will be removed.
>> [ 17.989540] block device autoloading is deprecated and will be removed.
>> [ 17.991547] block device autoloading is deprecated and will be removed.
>> [ 17.993555] block device autoloading is deprecated and will be removed.
>> [ 17.995539] block device autoloading is deprecated and will be removed.
>> [ 17.997577] block device autoloading is deprecated and will be removed.
>> [ 17.999544] block device autoloading is deprecated and will be removed.
>> [ 22.979465] blkdev_get_no_open: 1666 callbacks suppressed
>> ...
>> ...
>> ...
>> [ 618.221270] blkdev_get_no_open: 1664 callbacks suppressed
>> [ 618.221273] block device autoloading is deprecated and will be removed.
>> [ 618.224274] block device autoloading is deprecated and will be removed.
>> [ 618.227267] block device autoloading is deprecated and will be removed.
>> [ 618.229274] block device autoloading is deprecated and will be removed.
>> [ 618.231277] block device autoloading is deprecated and will be removed.
>> [ 618.233277] block device autoloading is deprecated and will be removed.
>> [ 618.235282] block device autoloading is deprecated and will be removed.
>> [ 618.237370] block device autoloading is deprecated and will be removed.
>> [ 618.239356] block device autoloading is deprecated and will be removed.
>> [ 618.241290] block device autoloading is deprecated and will be removed.
>> ```
>>
>> Using the Fedora kernels I narrowed it down to being introduced between
>> `kernel-5.19.0-0.rc3.27.fc37` (good) and `kernel-5.19.0-0.rc4.33.fc37` (bad).
>>
>> I then did a bisect and found:
>>
>> ```
>> $ git bisect bad
>> a09b314005f3a0956ebf56e01b3b80339df577cc is the first bad commit
>> commit a09b314005f3a0956ebf56e01b3b80339df577cc
>> Author: Christoph Hellwig <[email protected]>
>> Date: Tue Jun 14 09:48:27 2022 +0200
>>
>> block: freeze the queue earlier in del_gendisk
>
> It is a bit hard to associate the above commit with the reported issue.

Indeed, though I think there is now enough empirical evidence that
points directly at this commit. It may ultimately turn out not to be the
root cause, but it's definitely related.

Dusty

2022-09-07 07:29:46

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

Just curious: what is the underlying device (or more specifically
driver) under the md raids?

On Fri, Aug 26, 2022 at 12:15:22PM -0400, Dusty Mabe wrote:
> Hey All,
>
> I think I've found a regression introduced by:
>
> a09b314 o block: freeze the queue earlier in del_gendisk
>
> In Fedora CoreOS we have tests that set up RAID1 on the /boot/ and /root/ partitions
> and then subsequently remove one of the disks to simulate a failure. Sometime recently
> this test started timing out occasionally. Looking a bit closer it appears instances are
> getting stuck during reboot with a bunch of looping messages:
>
> ```
> [ 17.978854] block device autoloading is deprecated and will be removed.
> [ 17.982555] block device autoloading is deprecated and will be removed.
> [ 17.985537] block device autoloading is deprecated and will be removed.
> [ 17.987546] block device autoloading is deprecated and will be removed.
> [ 17.989540] block device autoloading is deprecated and will be removed.
> [ 17.991547] block device autoloading is deprecated and will be removed.
> [ 17.993555] block device autoloading is deprecated and will be removed.
> [ 17.995539] block device autoloading is deprecated and will be removed.
> [ 17.997577] block device autoloading is deprecated and will be removed.
> [ 17.999544] block device autoloading is deprecated and will be removed.
> [ 22.979465] blkdev_get_no_open: 1666 callbacks suppressed
> ...
> ...
> ...
> [ 618.221270] blkdev_get_no_open: 1664 callbacks suppressed
> [ 618.221273] block device autoloading is deprecated and will be removed.
> [ 618.224274] block device autoloading is deprecated and will be removed.
> [ 618.227267] block device autoloading is deprecated and will be removed.
> [ 618.229274] block device autoloading is deprecated and will be removed.
> [ 618.231277] block device autoloading is deprecated and will be removed.
> [ 618.233277] block device autoloading is deprecated and will be removed.
> [ 618.235282] block device autoloading is deprecated and will be removed.
> [ 618.237370] block device autoloading is deprecated and will be removed.
> [ 618.239356] block device autoloading is deprecated and will be removed.
> [ 618.241290] block device autoloading is deprecated and will be removed.
> ```
>
> Using the Fedora kernels I narrowed it down to being introduced between
> `kernel-5.19.0-0.rc3.27.fc37` (good) and `kernel-5.19.0-0.rc4.33.fc37` (bad).
>
> I then did a bisect and found:
>
> ```
> $ git bisect bad
> a09b314005f3a0956ebf56e01b3b80339df577cc is the first bad commit
> commit a09b314005f3a0956ebf56e01b3b80339df577cc
> Author: Christoph Hellwig <[email protected]>
> Date: Tue Jun 14 09:48:27 2022 +0200
>
> block: freeze the queue earlier in del_gendisk
>
> Freeze the queue earlier in del_gendisk so that the state does not
> change while we remove debugfs and sysfs files.
>
> Ming mentioned that being able to observer request in debugfs might
> be useful while the queue is being frozen in del_gendisk, which is
> made possible by this change.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> Signed-off-by: Jens Axboe <[email protected]>
>
> block/genhd.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
> ```
>
> Reverting this commit on top of latest git master (4c612826b) gave me successful results.
>
> Any ideas on what could be amiss here? Luckily the patch is tiny so hopefully it might
> be obvious.
>
> More details (including logs) in the following locations:
>
> - https://bugzilla.redhat.com/show_bug.cgi?id=2121791
> - https://github.com/coreos/fedora-coreos-tracker/issues/1282
>
>
> Thanks!
> Dusty
---end quoted text---

2022-09-07 07:52:44

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

On Thu, Sep 01, 2022 at 03:06:08PM +0800, Ming Lei wrote:
> It is a bit hard to associate the above commit with the reported issue.

So the messages clearly are about something trying to open a device
that went away at the block layer, but somehow does not get removed
in time by udev (which seems to be a userspace bug in CoreOS). But
even with that we really should not hang.

Now the fact that it did hang before and now becomes reproducible
also makes me assume the change is not the root cause. It might still
be a good vehicle to fix the issue for real, but it really broadens
the scope.
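
For reference, the rate of these open attempts can be estimated from the log excerpt in the original report. This is a rough sketch, assuming the message goes through the kernel's default printk rate limit (a burst of 10 messages per 5-second window) and that each printed or suppressed line corresponds to one open attempt that reached the autoload path. The function name `estimate_open_rate` is made up for illustration.

```python
import re

# First rate-limit window of the log excerpt quoted in the report.
LOG = """\
[ 17.978854] block device autoloading is deprecated and will be removed.
[ 17.982555] block device autoloading is deprecated and will be removed.
[ 17.985537] block device autoloading is deprecated and will be removed.
[ 17.987546] block device autoloading is deprecated and will be removed.
[ 17.989540] block device autoloading is deprecated and will be removed.
[ 17.991547] block device autoloading is deprecated and will be removed.
[ 17.993555] block device autoloading is deprecated and will be removed.
[ 17.995539] block device autoloading is deprecated and will be removed.
[ 17.997577] block device autoloading is deprecated and will be removed.
[ 17.999544] block device autoloading is deprecated and will be removed.
[ 22.979465] blkdev_get_no_open: 1666 callbacks suppressed
"""

LINE = re.compile(r"\[\s*(\d+\.\d+)\]\s+(.*)")
SUPPRESSED = re.compile(r"(\d+) callbacks suppressed")


def estimate_open_rate(log: str) -> float:
    """Estimate open attempts per second over the first rate-limit window.

    Assumption: printed + suppressed messages equals the number of attempts
    between the first printed line and the 'callbacks suppressed' line.
    """
    printed = 0
    start = None
    for raw in log.splitlines():
        match = LINE.match(raw)
        if not match:
            continue
        timestamp, message = float(match.group(1)), match.group(2)
        suppressed = SUPPRESSED.search(message)
        if suppressed:
            if start is None:
                raise ValueError("suppressed line seen before any message")
            total = printed + int(suppressed.group(1))
            return total / (timestamp - start)
        if start is None:
            start = timestamp
        printed += 1
    raise ValueError("no 'callbacks suppressed' line found")


if __name__ == "__main__":
    print(f"{estimate_open_rate(LOG):.0f} open attempts per second")
```

On the excerpt this works out to roughly 335 attempts per second, which fits the picture of something retrying the open in a tight loop.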

2022-09-07 09:22:20

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

2022-09-07 14:45:52

by Chaitanya Kulkarni

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

Hi all,

On 9/7/22 01:38, Ming Lei wrote:
> On Wed, Sep 07, 2022 at 09:33:24AM +0200, Christoph Hellwig wrote:
>> On Thu, Sep 01, 2022 at 03:06:08PM +0800, Ming Lei wrote:
>>> It is a bit hard to associate the above commit with the reported issue.
>>
>> So the messages clearly are about something trying to open a device
>> that went away at the block layer, but somehow does not get removed
>> in time by udev (which seems to be a userspace bug in CoreOS). But
>> even with that we really should not hang.
>
> The new device should be allocated from md_probe() via blk_request_module(),
> and the underlying devices are virtio-blk from the fedora BZ2121791.
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=2121791
>
> Thanks,
> Ming
>

It would be really helpful if mdraid experts can write blktests so it
will get tested in the nightly builds along with other tests with
different distros.

-ck

2022-09-07 15:26:34

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

On Wed, Sep 07, 2022 at 02:40:57PM +0000, Chaitanya Kulkarni wrote:
> Hi all,
>
> On 9/7/22 01:38, Ming Lei wrote:
> > On Wed, Sep 07, 2022 at 09:33:24AM +0200, Christoph Hellwig wrote:
> >> On Thu, Sep 01, 2022 at 03:06:08PM +0800, Ming Lei wrote:
> >>> It is a bit hard to associate the above commit with the reported issue.
> >>
> >> So the messages clearly are about something trying to open a device
> >> that went away at the block layer, but somehow does not get removed
> >> in time by udev (which seems to be a userspace bug in CoreOS). But
> >> even with that we really should not hang.
> >
> > The new device should be allocated from md_probe() via blk_request_module(),
> > and the underlying devices are virtio-blk from the fedora BZ2121791.
> >
> > [1] https://bugzilla.redhat.com/show_bug.cgi?id=2121791
> >
> > Thanks,
> > Ming
> >
>
> It would be really helpful if mdraid experts can write blktests so it
> will get tested in the nightly builds along with other tests with
> different distros.

Can't agree more, and Cc linux-raid and our raid guys.

And it looks like this one is more related to imsm.

Thanks,
Ming

2022-09-07 15:27:47

by Dusty Mabe

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

On 9/7/22 03:22, Christoph Hellwig wrote:
> Just curious: what is the underlying device (or more specifically
> driver) under the md raids?

I think Ming already answered this, but yes virtio-blk:

```
Aug 23 12:50:59.764004 kernel: virtio_blk virtio1: [vda] 10485760 512-byte logical blocks (5.37 GB/5.00 GiB)

Aug 23 12:50:59.795968 kernel: virtio_blk virtio2: [vdb] 20971520 512-byte logical blocks (10.7 GB/10.0 GiB)
```

Dusty

2022-09-07 16:01:23

by Chaitanya Kulkarni

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

On 9/7/22 07:58, Ming Lei wrote:
> On Wed, Sep 07, 2022 at 02:40:57PM +0000, Chaitanya Kulkarni wrote:
>> Hi all,
>>
>> On 9/7/22 01:38, Ming Lei wrote:
>>> On Wed, Sep 07, 2022 at 09:33:24AM +0200, Christoph Hellwig wrote:
>>>> On Thu, Sep 01, 2022 at 03:06:08PM +0800, Ming Lei wrote:
>>>>> It is a bit hard to associate the above commit with the reported issue.
>>>>
>>>> So the messages clearly are about something trying to open a device
>>>> that went away at the block layer, but somehow does not get removed
>>>> in time by udev (which seems to be a userspace bug in CoreOS). But
>>>> even with that we really should not hang.
>>>
>>> The new device should be allocated from md_probe() via blk_request_module(),
>>> and the underlying devices are virtio-blk from the fedora BZ2121791.
>>>
>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=2121791
>>>
>>> Thanks,
>>> Ming
>>>
>>
>> It would be really helpful if mdraid experts can write blktests so it
>> will get tested in the nightly builds along with other tests with
>> different distros.
>
> Can't agree more, and Cc linux-raid and our raid guys.
>

along with linux-block, Shin'ichiro and me.

-ck

2022-09-09 08:39:22

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

On Wed, Sep 07, 2022 at 09:33:24AM +0200, Christoph Hellwig wrote:
> On Thu, Sep 01, 2022 at 03:06:08PM +0800, Ming Lei wrote:
> > It is a bit hard to associate the above commit with the reported issue.
>
> So the messages clearly are about something trying to open a device
> that went away at the block layer, but somehow does not get removed
> in time by udev (which seems to be a userspace bug in CoreOS). But
> even with that we really should not hang.

Xiao Ni provides one script[1] which can reproduce the issue more or less.

- create raid
#./imsm.sh imsm /dev/md/test 1 /dev/sda /dev/sdb
#ls /dev/md/
[root@ktest-36 md]# ls -l /dev/md/
total 0
lrwxrwxrwx. 1 root root 8 Sep 9 08:10 imsm -> ../md127
lrwxrwxrwx. 1 root root 8 Sep 9 08:10 test -> ../md126

- destroy the two raid devices
# mdadm --stop /dev/md/test /dev/md/imsm
mdadm: stopped /dev/md/test
mdadm: stopped /dev/md/imsm

# lsblk
...
md126 9:126 0 0B 0 md
md127 9:127 0 0B 0 md

md126 is actually added again after it is deleted, accompanied by the log
message "block device autoloading is deprecated and will be removed.", and
a bcc stack trace shows that the device is added by mdadm.

08:20:03 456 456 kworker/6:2 del_gendisk disk b'md126'
b'del_gendisk+0x1 [kernel]'
b'md_kobj_release+0x34 [kernel]'
b'kobject_put+0x87 [kernel]'
b'process_one_work+0x1c4 [kernel]'
b'worker_thread+0x4d [kernel]'
b'kthread+0xe6 [kernel]'
b'ret_from_fork+0x1f [kernel]'

08:20:03 2476 2476 mdadm device_add_disk disk b'md126'
b'device_add_disk+0x1 [kernel]'
b'md_alloc+0x3ba [kernel]'
b'md_probe+0x25 [kernel]'
b'blk_request_module+0x5f [kernel]'
b'blkdev_get_no_open+0x5c [kernel]'
b'blkdev_get_by_dev.part.0+0x1e [kernel]'
b'blkdev_open+0x52 [kernel]'
b'do_dentry_open+0x1ce [kernel]'
b'path_openat+0xc43 [kernel]'
b'do_filp_open+0xa1 [kernel]'
b'do_sys_openat2+0x7c [kernel]'
b'__x64_sys_openat+0x5c [kernel]'
b'do_syscall_64+0x37 [kernel]'
b'entry_SYSCALL_64_after_hwframe+0x63 [kernel]'

Also, removal of the md device is deferred by scheduling work on a wq, and
the device is actually deleted in mddev's release handler:

mddev_delayed_delete():
kobject_put(&mddev->kobj)

...

md_kobj_release():
del_gendisk(mddev->gendisk);
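
The sequence in the two traces can be replayed with a small toy model. This is only an illustrative sketch in Python, not kernel code: the class `ToyBlockLayer` and its methods are hypothetical, and the only things taken from the traces are the function names in the comments and the ordering (deferred delete first, then an open that autoloads the device again).

```python
from collections import deque


class ToyBlockLayer:
    """Toy model of the ordering seen in the traces (hypothetical names)."""

    def __init__(self):
        self.disks = set()        # disks currently registered
        self.workqueue = deque()  # deferred deletions
        self.log = []             # events in the order they happen

    def add_disk(self, name):
        self.disks.add(name)

    def stop_array(self, name):
        # Stopping the array does not remove the disk right away; the removal
        # is queued (mddev_delayed_delete -> kobject_put in the real code).
        self.workqueue.append(name)

    def run_workqueue(self):
        # The kworker runs the release handler: md_kobj_release -> del_gendisk.
        while self.workqueue:
            name = self.workqueue.popleft()
            self.disks.discard(name)
            self.log.append(f"del_gendisk {name}")

    def open_device(self, name):
        # An open of a missing disk takes the autoload path
        # (blkdev_get_no_open -> blk_request_module -> md_probe -> md_alloc).
        if name not in self.disks:
            self.log.append("autoloading warning")
            self.disks.add(name)
            self.log.append(f"device_add_disk {name}")


def replay():
    layer = ToyBlockLayer()
    layer.add_disk("md126")
    layer.stop_array("md126")    # mdadm --stop
    layer.run_workqueue()        # kworker deletes the disk
    layer.open_device("md126")   # a later open by mdadm brings it back
    return layer


if __name__ == "__main__":
    for event in replay().log:
        print(event)
```

In this model the disk is present again at the end, which mirrors the lsblk output above where md126 still shows up after the stop.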

>
> Now the fact that it did hang before and now becomes reproducible
> also makes me assume the change is not the root cause. It might still
> be a good vehicle to fix the issue for real, but it really broadens
> the scope.
>

[1] create one imsm raid1

./imsm.sh imsm /dev/md/test 1 /dev/sda /dev/sdb

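#!/bin/bash
# Let mdadm use IMSM metadata on machines without Intel platform support,
# and use the device name in place of a serial number.
export IMSM_NO_PLATFORM=1
export IMSM_DEVNAME_AS_SERIAL=1

echo ""
echo "==========================================================="
echo "./test.sh container raid level devlist"
echo "example: ./test.sh imsm /dev/md/test 1 /dev/loop0 /dev/loop1"
echo "==========================================================="
echo ""

container=$1
raid=$2
level=$3

# The remaining arguments are the member devices.
shift 3
dev_num=$#
dev_list=("$@")

# Create the IMSM container, then the RAID volume on the same devices.
mdadm -CR "$container" -e imsm -n "$dev_num" "${dev_list[@]}"
mdadm -CR "$raid" -l "$level" -n "$dev_num" "${dev_list[@]}"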

[2] destroy created raid devices
mdadm --stop /dev/md/test /dev/md/imsm

Thanks,
Ming

2022-09-12 08:04:41

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

On Fri, Sep 09, 2022 at 04:24:40PM +0800, Ming Lei wrote:
> On Wed, Sep 07, 2022 at 09:33:24AM +0200, Christoph Hellwig wrote:
> > On Thu, Sep 01, 2022 at 03:06:08PM +0800, Ming Lei wrote:
> > > It is a bit hard to associate the above commit with the reported issue.
> >
> > So the messages clearly are about something trying to open a device
> > that went away at the block layer, but somehow does not get removed
> > in time by udev (which seems to be a userspace bug in CoreOS). But
> > even with that we really should not hang.
>
> Xiao Ni provides one script[1] which can reproduce the issue more or less.

I've run the reproducer 10000 times on current mainline, and while
it prints one of the autoloading messages per run, I've not actually
seen any kind of hang.

2022-09-13 02:06:15

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

On Mon, Sep 12, 2022 at 09:16:18AM +0200, Christoph Hellwig wrote:
> On Fri, Sep 09, 2022 at 04:24:40PM +0800, Ming Lei wrote:
> > On Wed, Sep 07, 2022 at 09:33:24AM +0200, Christoph Hellwig wrote:
> > > On Thu, Sep 01, 2022 at 03:06:08PM +0800, Ming Lei wrote:
> > > > It is a bit hard to associate the above commit with reported issue.
> > >
> > > So the messages clearly are about something trying to open a device
> > > that went away at the block layer, but somehow does not get removed
> > > in time by udev (which seems to be a userspace bug in CoreOS). But
> > > even with that we really should not hang.
> >
> > Xiao Ni provides one script[1] which can reproduce the issue more or less.
>
> I've run the reproducer 10000 times on current mainline, and while
> it prints one of the autoloading messages per run, I've not actually
> seen any kind of hang.

I can't reproduce the hang either.

What I meant is that mdadm can add a new raid disk after the imsm
container and raid disk have been stopped, with the autoloading messages
printed. I understand this behavior isn't correct, but I am not familiar
enough with raid.

It might be related to the delayed deletion of the gendisk from the wq &
md kobj release handler.

During reboot, if mdadm keeps doing this stupid thing without stopping,
it could cause the hang.

I think the root question is why mdadm tries so aggressively to open/add
a new raid bdev during reboot.

Thanks,
Ming
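
One way to check whether an md device comes back after the stop is to list the md entries under /sys/block before and after. This is a minimal sketch, not something from this thread. list_md_devices is a hypothetical helper, and the optional directory argument exists only so it can be pointed at a test directory.

```shell
# list_md_devices [SYSFS_BLOCK_DIR]: print the md block devices currently
# registered, one per line. Hypothetical helper for illustration only.
list_md_devices() {
    dir=${1:-/sys/block}
    for entry in "$dir"/md*; do
        [ -e "$entry" ] || continue   # the glob matched nothing
        basename "$entry"
    done
}
```

Calling `list_md_devices` right after `mdadm --stop` and again a moment later would show whether a new md gendisk was registered in between.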

2022-09-13 02:52:07

by Dusty Mabe

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

On 9/12/22 21:55, Ming Lei wrote:
> On Mon, Sep 12, 2022 at 09:16:18AM +0200, Christoph Hellwig wrote:
>> On Fri, Sep 09, 2022 at 04:24:40PM +0800, Ming Lei wrote:
>>> On Wed, Sep 07, 2022 at 09:33:24AM +0200, Christoph Hellwig wrote:
>>>> On Thu, Sep 01, 2022 at 03:06:08PM +0800, Ming Lei wrote:
>>>>> It is a bit hard to associate the above commit with reported issue.
>>>>
>>>> So the messages clearly are about something trying to open a device
>>>> that went away at the block layer, but somehow does not get removed
>>>> in time by udev (which seems to be a userspace bug in CoreOS). But
>>>> even with that we really should not hang.
>>>
>>> Xiao Ni provides one script[1] which can reproduce the issue more or less.
>>
>> I've run the reproducer 10000 times on current mainline, and while
>> it prints one of the autoloading messages per run, I've not actually
>> seen any kind of hang.
>
> I can't reproduce the hang either.

I obviously can reproduce the issue with the test in our Fedora CoreOS
test suite. It's part of a framework (i.e. it's not simply some script
you can run) but it is very reproducible so one can add some instrumentation
to the kernel and feed it through a build/test cycle to see different
results or logs.

I'm willing to share this with other people (maybe a screen share or
some written down instructions) if anyone would be interested.

>
> What I meant is that mdadm can add a new raid disk after the imsm
> container and raid disk have been stopped, with the autoloading messages
> printed. I understand this behavior isn't correct, but I am not familiar
> enough with raid.
>
> It might be related to the delayed deletion of the gendisk from the wq &
> md kobj release handler.
>
> During reboot, if mdadm keeps doing this stupid thing without stopping,
> it could cause the hang.
>
> I think the root question is why mdadm tries so aggressively to open/add
> a new raid bdev during reboot.
>

Dusty

2022-09-20 09:52:54

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

Hi, this is your Linux kernel regression tracker.

On 13.09.22 04:36, Dusty Mabe wrote:
> On 9/12/22 21:55, Ming Lei wrote:
>> On Mon, Sep 12, 2022 at 09:16:18AM +0200, Christoph Hellwig wrote:
>>> On Fri, Sep 09, 2022 at 04:24:40PM +0800, Ming Lei wrote:
>>>> On Wed, Sep 07, 2022 at 09:33:24AM +0200, Christoph Hellwig wrote:
>>>>> On Thu, Sep 01, 2022 at 03:06:08PM +0800, Ming Lei wrote:
>>>>>> It is a bit hard to associate the above commit with reported issue.
>>>>>
>>>>> So the messages clearly are about something trying to open a device
>>>>> that went away at the block layer, but somehow does not get removed
>>>>> in time by udev (which seems to be a userspace bug in CoreOS). But
>>>>> even with that we really should not hang.
>>>>
>>>> Xiao Ni provides one script[1] which can reproduce the issue more or less.
>>>
>>> I've run the reproducer 10000 times on current mainline, and while
>>> it prints one of the autoloading messages per run, I've not actually
>>> seen any kind of hang.
>>
>> I can't reproduce the hang either.
>
> I obviously can reproduce the issue with the test in our Fedora CoreOS
> test suite. It's part of a framework (i.e. it's not simply some script
> you can run) but it is very reproducible so one can add some instrumentation
> to the kernel and feed it through a build/test cycle to see different
> results or logs.
>
> I'm willing to share this with other people (maybe a screen share or
> some written down instructions) if anyone would be interested.

This thread looked stalled, or was there any progress in the past week?
If not: Fedora apparently removed the patch in their kernels a while
ago, as quite a few users were hitting it. What is preventing us from
doing the same in mainline and 5.19.y until the issue can be resolved?
The description of a09b314005f3 ("block: freeze the queue earlier in
del_gendisk") doesn't sound like the change does something crucial that
can't wait a bit. I might be totally wrong with that, but I think it's
my duty to ask that question at this point.

>> What I meant is that mdadm can add a new raid disk after the imsm
>> container and raid disk have been stopped, with the autoloading messages
>> printed. I understand this behavior isn't correct, but I am not familiar
>> enough with raid.
>>
>> It might be related to the delayed deletion of the gendisk from the wq &
>> md kobj release handler.
>>
>> During reboot, if mdadm keeps doing this stupid thing without stopping,
>> it could cause the hang.
>>
>> I think the root question is why mdadm tries so aggressively to open/add
>> a new raid bdev during reboot.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.

2022-09-20 14:25:03

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

On 9/20/22 3:11 AM, Thorsten Leemhuis wrote:
> Hi, this is your Linux kernel regression tracker.
>
> On 13.09.22 04:36, Dusty Mabe wrote:
>> On 9/12/22 21:55, Ming Lei wrote:
>>> On Mon, Sep 12, 2022 at 09:16:18AM +0200, Christoph Hellwig wrote:
>>>> On Fri, Sep 09, 2022 at 04:24:40PM +0800, Ming Lei wrote:
>>>>> On Wed, Sep 07, 2022 at 09:33:24AM +0200, Christoph Hellwig wrote:
>>>>>> On Thu, Sep 01, 2022 at 03:06:08PM +0800, Ming Lei wrote:
>>>>>>> It is a bit hard to associate the above commit with reported issue.
>>>>>>
>>>>>> So the messages clearly are about something trying to open a device
>>>>>> that went away at the block layer, but somehow does not get removed
>>>>>> in time by udev (which seems to be a userspace bug in CoreOS). But
>>>>>> even with that we really should not hang.
>>>>>
>>>>> Xiao Ni provides one script[1] which can reproduce the issue more or less.
>>>>
>>>> I've run the reproducer 10000 times on current mainline, and while
>>>> it prints one of the autoloading messages per run, I've not actually
>>>> seen any kind of hang.
>>>
>>> I can't reproduce the hang either.
>>
>> I obviously can reproduce the issue with the test in our Fedora CoreOS
>> test suite. It's part of a framework (i.e. it's not simply some script
>> you can run) but it is very reproducible so one can add some instrumentation
>> to the kernel and feed it through a build/test cycle to see different
>> results or logs.
>>
>> I'm willing to share this with other people (maybe a screen share or
>> some written down instructions) if anyone would be interested.
>
> This thread looked stalled, or was there any progress in the past week?
> If not: Fedora apparently removed the patch in their kernels a while
> ago, as quite a few users were hitting it. What is preventing us from
> doing the same in mainline and 5.19.y until the issue can be resolved?
> The description of a09b314005f3 ("block: freeze the queue earlier in
> del_gendisk") doesn't sound like the change does something crucial that
> can't wait a bit. I might be totally wrong with that, but I think it's
> my duty to ask that question at this point.

Christoph and I discussed this one last week, and he has a plan to try
a flag approach. Christoph, did you get a chance to bang that out? Would
be nice to get this one wrapped up.

--
Jens Axboe

2022-09-20 14:36:50

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

On 9/20/22 8:12 AM, Christoph Hellwig wrote:
> On Tue, Sep 20, 2022 at 08:05:06AM -0600, Jens Axboe wrote:
>> Christoph and I discussed this one last week, and he has a plan to try
>> a flag approach. Christoph, did you get a chance to bang that out? Would
>> be nice to get this one wrapped up.
>
> I gave up on that as it will be far too much change so late in
> the cycle and sent you the revert yesterday.

Gotcha, haven't made it all the way through the emails of the morning yet.
I'll queue it up.

--
Jens Axboe

2022-09-20 15:12:27

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

On Tue, Sep 20, 2022 at 08:05:06AM -0600, Jens Axboe wrote:
> Christoph and I discussed this one last week, and he has a plan to try
> a flag approach. Christoph, did you get a chance to bang that out? Would
> be nice to get this one wrapped up.

I gave up on that as it will be far too much change so late in
the cycle and sent you the revert yesterday.

2022-09-21 09:44:36

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

On 20.09.22 16:14, Jens Axboe wrote:
> On 9/20/22 8:12 AM, Christoph Hellwig wrote:
>> On Tue, Sep 20, 2022 at 08:05:06AM -0600, Jens Axboe wrote:
>>> Christoph and I discussed this one last week, and he has a plan to try
>>> a flag approach. Christoph, did you get a chance to bang that out? Would
>>> be nice to get this one wrapped up.
>>
>> I gave up on that as it will be far too much change so late in
>> the cycle and sent you the revert yesterday.
>
> Gotcha, haven't made it all the way through the emails of the morning yet.
> I'll queue it up.

Thx to both of you for taking care of this.

Nitpicking: that patch is missing a "CC: stable@..." tag to ensure
automatic and quick backporting to 5.19.y. Or is the block layer among
the subsystems that prefer to handle such things manually?

Ohh, and a fixes tag might have been good as well; a "Link:" tag
pointing to the report, too. If either would have been there, regzbot
would have noticed Christoph's patch posting and I wouldn't have
bothered you yesterday. :-) But whatever, not that important.

#regzbot fixed-by: 4c66a326b5ab784cddd72d

Ciao, Thorsten
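
To illustrate the three tags mentioned above, here is a minimal sketch that prints them in the usual trailer layout. print_regression_trailers is a hypothetical helper, the stable address is left elided as in the mail, and the report URL is a placeholder rather than the real link.

```shell
# print_regression_trailers SHA SUBJECT STABLE_ADDR REPORT_URL
# Hypothetical helper: prints the Fixes:, Cc: and Link: trailers for a patch.
print_regression_trailers() {
    printf 'Fixes: %s ("%s")\n' "$1" "$2"
    printf 'Cc: %s\n' "$3"
    printf 'Link: %s\n' "$4"
}

# Example call; the address and URL below are placeholders, not real values.
print_regression_trailers a09b314005f3 \
    "block: freeze the queue earlier in del_gendisk" \
    "stable@..." \
    "https://example.invalid/report"
```

The commit id and subject are the ones quoted earlier in this thread; the resulting lines would go at the end of the commit message, next to the Signed-off-by line.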

2022-09-21 14:55:19

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

On 9/21/22 3:25 AM, Thorsten Leemhuis wrote:
> On 20.09.22 16:14, Jens Axboe wrote:
>> On 9/20/22 8:12 AM, Christoph Hellwig wrote:
>>> On Tue, Sep 20, 2022 at 08:05:06AM -0600, Jens Axboe wrote:
>>>> Christoph and I discussed this one last week, and he has a plan to try
>>>> a flag approach. Christoph, did you get a chance to bang that out? Would
>>>> be nice to get this one wrapped up.
>>>
>>> I gave up on that as it will be far too much change so late in
>>> the cycle and sent you the revert yesterday.
>>
>> Gotcha, haven't made it all the way through the emails of the morning yet.
>> I'll queue it up.
>
> Thx to both of you for taking care of this.
>
> Nitpicking: that patch is missing a "CC: stable@..." tag to ensure
> automatic and quick backporting to 5.19.y. Or is the block layer among
> the subsystems that prefer to handle such things manually?
>
> Ohh, and a fixes tag might have been good as well; a "Link:" tag
> pointing to the report, too. If either would have been there, regzbot
> would have noticed Christoph's patch posting and I wouldn't have
> bothered you yesterday. :-) But whatever, not that important.

We'll just have to ensure we ping stable on it when it goes in.

--
Jens Axboe

2022-09-21 15:11:23

by Greg Kroah-Hartman

Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk

On Wed, Sep 21, 2022 at 08:34:26AM -0600, Jens Axboe wrote:
> On 9/21/22 3:25 AM, Thorsten Leemhuis wrote:
> > On 20.09.22 16:14, Jens Axboe wrote:
> >> On 9/20/22 8:12 AM, Christoph Hellwig wrote:
> >>> On Tue, Sep 20, 2022 at 08:05:06AM -0600, Jens Axboe wrote:
> >>>> Christoph and I discussed this one last week, and he has a plan to try
> >>>> a flag approach. Christoph, did you get a chance to bang that out? Would
> >>>> be nice to get this one wrapped up.
> >>>
> >>> I gave up on that as it will be far too much change so late in
> >>> the cycle and sent you the revert yesterday.
> >>
> >> Gotcha, haven't made it all the way through the emails of the morning yet.
> >> I'll queue it up.
> >
> > Thx to both of you for taking care of this.
> >
> > Nitpicking: that patch is missing a "CC: stable@..." tag to ensure
> > automatic and quick backporting to 5.19.y. Or is the block layer among
> > the subsystems that prefer to handle such things manually?
> >
> > Ohh, and a fixes tag might have been good as well; a "Link:" tag
> > pointing to the report, too. If either would have been there, regzbot
> > would have noticed Christoph's patch posting and I wouldn't have
> > bothered you yesterday. :-) But whatever, not that important.
>
> We'll just have to ensure we ping stable on it when it goes in.

If you have a git id that is not going to change, I can watch out for it
to land in Linus's tree...

thanks,

greg k-h

2022-09-21 15:26:09