2023-01-06 17:42:10

by Tejun Heo

[permalink] [raw]
Subject: Re: [bug report] memcontrol: schedule throttling if we are congested

Hello,

(cc'ing Luis, Christoph and Jens and quoting whole body)

On Fri, Jan 06, 2023 at 05:58:55PM +0300, Dan Carpenter wrote:
> Hello Tejun Heo,
>
> The patch 2cf855837b89: "memcontrol: schedule throttling if we are
> congested" from Jul 3, 2018, leads to the following Smatch static
> checker warning:
>
> block/blk-cgroup.c:1863 blkcg_schedule_throttle() warn: sleeping in atomic context
>
> The call tree looks like:
>
> ioc_rqos_merge() <- disables preempt
> __cgroup_throttle_swaprate() <- disables preempt
> -> blkcg_schedule_throttle()
>
> Here is one of the callers:
> mm/swapfile.c
> 3657 spin_lock(&swap_avail_lock);
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Takes spin lock.
>
> 3658 plist_for_each_entry_safe(si, next, &swap_avail_heads[nid],
> 3659 avail_lists[nid]) {
> 3660 if (si->bdev) {
> 3661 blkcg_schedule_throttle(si->bdev->bd_disk, true);
> ^^^^^^^^^^^^^^^^^^^^^^^
> Calls blkcg_schedule_throttle().
>
> 3662 break;
> 3663 }
> 3664 }
>
> block/blk-cgroup.c
> 1851 void blkcg_schedule_throttle(struct gendisk *disk, bool use_memdelay)
> 1852 {
> 1853 struct request_queue *q = disk->queue;
> 1854
> 1855 if (unlikely(current->flags & PF_KTHREAD))
> 1856 return;
> 1857
> 1858 if (current->throttle_queue != q) {
> 1859 if (!blk_get_queue(q))
> 1860 return;
> 1861
> 1862 if (current->throttle_queue)
> 1863 blk_put_queue(current->throttle_queue);
> ^^^^^^^^^^^^^^
> Sleeps.
>
> 1864 current->throttle_queue = q;
> 1865 }
> 1866
> 1867 if (use_memdelay)
> 1868 current->use_memdelay = use_memdelay;
> 1869 set_notify_resume(current);
> 1870 }

In general, it's quite unusual for a put operation to require a sleepable
context and I could be missing sth but the actual put / release paths don't
seem to actually need might_sleep(). It seems sprious.

The might_sleep() in put was added by Christoph's 63f93fd6fa57 ("block: mark
blk_put_queue as potentially blocking") which promoted it from release to
put cuz the caller usually can't tell whether its put is the last put.

And that put in release was added by Luis in e8c7d14ac6c3 ("block: revert
back to synchronous request_queue removal") while making the release path
synchronous, the rationale being that releasing asynchronously makes dynamic
device removal / readdition behaviors unpredictable and it also seems to
note that might_sleep() is no longer needed but still kept, which seems a
bit odd to me.

Here's my take on it:

* Let's please not require a sleepable context in a put operation. It's
unusual, inconvenient and error-prone, and likely to cause its users to
implement multiple copies of async mechanisms around it.

* A better way to deal with removal / readdition race is flushing release
operaitons either at the end of removal or before trying to add something
(you can get fancy w/ flushing only if there's name collision too), not
making a put path synchronously call release which needs to sleep.

* If might_sleep() is currently not needed, let's please drop it. It just
makes people scratch their head when reading the code.

Thanks.

--
tejun


2023-01-06 19:07:16

by Jens Axboe

[permalink] [raw]
Subject: Re: [bug report] memcontrol: schedule throttling if we are congested

On 1/6/23 10:33 AM, Tejun Heo wrote:
> Hello,
>
> (cc'ing Luis, Christoph and Jens and quoting whole body)
>
> On Fri, Jan 06, 2023 at 05:58:55PM +0300, Dan Carpenter wrote:
>> Hello Tejun Heo,
>>
>> The patch 2cf855837b89: "memcontrol: schedule throttling if we are
>> congested" from Jul 3, 2018, leads to the following Smatch static
>> checker warning:
>>
>> block/blk-cgroup.c:1863 blkcg_schedule_throttle() warn: sleeping in atomic context
>>
>> The call tree looks like:
>>
>> ioc_rqos_merge() <- disables preempt
>> __cgroup_throttle_swaprate() <- disables preempt
>> -> blkcg_schedule_throttle()
>>
>> Here is one of the callers:
>> mm/swapfile.c
>> 3657 spin_lock(&swap_avail_lock);
>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> Takes spin lock.
>>
>> 3658 plist_for_each_entry_safe(si, next, &swap_avail_heads[nid],
>> 3659 avail_lists[nid]) {
>> 3660 if (si->bdev) {
>> 3661 blkcg_schedule_throttle(si->bdev->bd_disk, true);
>> ^^^^^^^^^^^^^^^^^^^^^^^
>> Calls blkcg_schedule_throttle().
>>
>> 3662 break;
>> 3663 }
>> 3664 }
>>
>> block/blk-cgroup.c
>> 1851 void blkcg_schedule_throttle(struct gendisk *disk, bool use_memdelay)
>> 1852 {
>> 1853 struct request_queue *q = disk->queue;
>> 1854
>> 1855 if (unlikely(current->flags & PF_KTHREAD))
>> 1856 return;
>> 1857
>> 1858 if (current->throttle_queue != q) {
>> 1859 if (!blk_get_queue(q))
>> 1860 return;
>> 1861
>> 1862 if (current->throttle_queue)
>> 1863 blk_put_queue(current->throttle_queue);
>> ^^^^^^^^^^^^^^
>> Sleeps.
>>
>> 1864 current->throttle_queue = q;
>> 1865 }
>> 1866
>> 1867 if (use_memdelay)
>> 1868 current->use_memdelay = use_memdelay;
>> 1869 set_notify_resume(current);
>> 1870 }
>
> In general, it's quite unusual for a put operation to require a sleepable
> context and I could be missing sth but the actual put / release paths don't
> seem to actually need might_sleep(). It seems sprious.
>
> The might_sleep() in put was added by Christoph's 63f93fd6fa57 ("block: mark
> blk_put_queue as potentially blocking") which promoted it from release to
> put cuz the caller usually can't tell whether its put is the last put.
>
> And that put in release was added by Luis in e8c7d14ac6c3 ("block: revert
> back to synchronous request_queue removal") while making the release path
> synchronous, the rationale being that releasing asynchronously makes dynamic
> device removal / readdition behaviors unpredictable and it also seems to
> note that might_sleep() is no longer needed but still kept, which seems a
> bit odd to me.
>
> Here's my take on it:
>
> * Let's please not require a sleepable context in a put operation. It's
> unusual, inconvenient and error-prone, and likely to cause its users to
> implement multiple copies of async mechanisms around it.
>
> * A better way to deal with removal / readdition race is flushing release
> operaitons either at the end of removal or before trying to add something
> (you can get fancy w/ flushing only if there's name collision too), not
> making a put path synchronously call release which needs to sleep.
>
> * If might_sleep() is currently not needed, let's please drop it. It just
> makes people scratch their head when reading the code.

I looked over the call path, and I don't think anything in there sleeps.
So should be fine to remove the might_sleep().

--
Jens Axboe


2023-01-06 19:56:37

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [bug report] memcontrol: schedule throttling if we are congested

On Fri, Jan 06, 2023 at 11:49:33AM -0700, Jens Axboe wrote:
> On 1/6/23 10:33 AM, Tejun Heo wrote:
> > Hello,
> >
> > (cc'ing Luis, Christoph and Jens and quoting whole body)
> >
> > On Fri, Jan 06, 2023 at 05:58:55PM +0300, Dan Carpenter wrote:
> >> Hello Tejun Heo,
> >>
> >> The patch 2cf855837b89: "memcontrol: schedule throttling if we are
> >> congested" from Jul 3, 2018, leads to the following Smatch static
> >> checker warning:
> >>
> >> block/blk-cgroup.c:1863 blkcg_schedule_throttle() warn: sleeping in atomic context
> >>
> >> The call tree looks like:
> >>
> >> ioc_rqos_merge() <- disables preempt
> >> __cgroup_throttle_swaprate() <- disables preempt
> >> -> blkcg_schedule_throttle()
> >>
> >> Here is one of the callers:
> >> mm/swapfile.c
> >> 3657 spin_lock(&swap_avail_lock);
> >> ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >> Takes spin lock.
> >>
> >> 3658 plist_for_each_entry_safe(si, next, &swap_avail_heads[nid],
> >> 3659 avail_lists[nid]) {
> >> 3660 if (si->bdev) {
> >> 3661 blkcg_schedule_throttle(si->bdev->bd_disk, true);
> >> ^^^^^^^^^^^^^^^^^^^^^^^
> >> Calls blkcg_schedule_throttle().
> >>
> >> 3662 break;
> >> 3663 }
> >> 3664 }
> >>
> >> block/blk-cgroup.c
> >> 1851 void blkcg_schedule_throttle(struct gendisk *disk, bool use_memdelay)
> >> 1852 {
> >> 1853 struct request_queue *q = disk->queue;
> >> 1854
> >> 1855 if (unlikely(current->flags & PF_KTHREAD))
> >> 1856 return;
> >> 1857
> >> 1858 if (current->throttle_queue != q) {
> >> 1859 if (!blk_get_queue(q))
> >> 1860 return;
> >> 1861
> >> 1862 if (current->throttle_queue)
> >> 1863 blk_put_queue(current->throttle_queue);
> >> ^^^^^^^^^^^^^^
> >> Sleeps.
> >>
> >> 1864 current->throttle_queue = q;
> >> 1865 }
> >> 1866
> >> 1867 if (use_memdelay)
> >> 1868 current->use_memdelay = use_memdelay;
> >> 1869 set_notify_resume(current);
> >> 1870 }
> >
> > In general, it's quite unusual for a put operation to require a sleepable
> > context and I could be missing sth but the actual put / release paths don't
> > seem to actually need might_sleep(). It seems sprious.
> >
> > The might_sleep() in put was added by Christoph's 63f93fd6fa57 ("block: mark
> > blk_put_queue as potentially blocking") which promoted it from release to
> > put cuz the caller usually can't tell whether its put is the last put.
> >
> > And that put in release was added by Luis in e8c7d14ac6c3 ("block: revert
> > back to synchronous request_queue removal") while making the release path
> > synchronous, the rationale being

The rationale was that we reverted exepected userspace expection for
something that was sync to async so broke userspace expectations and
we can't do that.

> > that releasing asynchronously makes dynamic
> > device removal / readdition behaviors unpredictable and it also seems to
> > note that might_sleep() is no longer needed but still kept, which seems a
> > bit odd to me.
> >
> > Here's my take on it:
> >
> > * Let's please not require a sleepable context in a put operation. It's
> > unusual, inconvenient and error-prone, and likely to cause its users to
> > implement multiple copies of async mechanisms around it.
> >
> > * A better way to deal with removal / readdition race is flushing release
> > operaitons either at the end of removal or before trying to add something
> > (you can get fancy w/ flushing only if there's name collision too), not
> > making a put path synchronously call release which needs to sleep.
> >
> > * If might_sleep() is currently not needed, let's please drop it. It just
> > makes people scratch their head when reading the code.
>
> I looked over the call path, and I don't think anything in there sleeps.
> So should be fine to remove the might_sleep().

As soon as commit 63f93fd6fa5717 ("block: mark blk_put_queue as
potentially blocking") on v6.2-rc1 it was upgraded to might_sleep()
directly on blk_put_queue(), I can't find a rationale after that
to justify the removal. But since it is not clear if we keep it,
we should document that rationale.

Luis

2023-01-06 20:40:54

by Tejun Heo

[permalink] [raw]
Subject: [PATCH block/for-6.2-fixes] block: Drop spurious might_sleep() from blk_put_queue()

Dan reports the following smatch detected the following:

block/blk-cgroup.c:1863 blkcg_schedule_throttle() warn: sleeping in atomic context

caused by blkcg_schedule_throttle() calling blk_put_queue() in an
non-sleepable context.

blk_put_queue() acquired might_sleep() in 63f93fd6fa57 ("block: mark
blk_put_queue as potentially blocking") which transferred the might_sleep()
from blk_free_queue().

blk_free_queue() acquired might_sleep() in e8c7d14ac6c3 ("block: revert back
to synchronous request_queue removal") while turning request_queue removal
synchronous. However, this isn't necessary as nothing in the free path
actually requires sleeping.

It's pretty unusual to require a sleeping context in a put operation and
it's not needed in the first place. Let's drop it.

Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Dan Carpenter <[email protected]>
Link: https://lkml.kernel.org/r/Y7g3L6fntnTtOm63@kili
Cc: Christoph Hellwig <[email protected]>
Cc: Luis Chamberlain <[email protected]>
Fixes: e8c7d14ac6c3 ("block: revert back to synchronous request_queue removal") # v5.9+
--
block/blk-core.c | 3 ---
1 file changed, 3 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 9321767470dc..b5098355d8b2 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -283,12 +283,9 @@ static void blk_free_queue(struct request_queue *q)
*
* Decrements the refcount of the request_queue and free it when the refcount
* reaches 0.
- *
- * Context: Can sleep.
*/
void blk_put_queue(struct request_queue *q)
{
- might_sleep();
if (refcount_dec_and_test(&q->refs))
blk_free_queue(q);
}

2023-01-06 21:12:50

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH block/for-6.2-fixes] block: Drop spurious might_sleep() from blk_put_queue()

On Fri, Jan 06, 2023 at 10:34:10AM -1000, Tejun Heo wrote:
> Dan reports the following smatch detected the following:
>
> block/blk-cgroup.c:1863 blkcg_schedule_throttle() warn: sleeping in atomic context
>
> caused by blkcg_schedule_throttle() calling blk_put_queue() in an
> non-sleepable context.
>
> blk_put_queue() acquired might_sleep() in 63f93fd6fa57 ("block: mark
> blk_put_queue as potentially blocking") which transferred the might_sleep()
> from blk_free_queue().
>
> blk_free_queue() acquired might_sleep() in e8c7d14ac6c3 ("block: revert back
> to synchronous request_queue removal") while turning request_queue removal
> synchronous. However, this isn't necessary as nothing in the free path
> actually requires sleeping.
>
> It's pretty unusual to require a sleeping context in a put operation and
> it's not needed in the first place. Let's drop it.
>
> Signed-off-by: Tejun Heo <[email protected]>
> Reported-by: Dan Carpenter <[email protected]>
> Link: https://lkml.kernel.org/r/Y7g3L6fntnTtOm63@kili
> Cc: Christoph Hellwig <[email protected]>
> Cc: Luis Chamberlain <[email protected]>
> Fixes: e8c7d14ac6c3 ("block: revert back to synchronous request_queue removal") # v5.9+

*tons* has changed since e8c7d14ac6c3 and so the bots might think that
*if* this patch is applied upstream it is justified for older kernels
and I don't think that's yet been verified and doubt it.

And so I think adding a "Fixes" tag is not appropriate here.

First I'd like to hear from Christoph if he agrees with this patch
upstream. For stable, someone would have to do the homework.

Luis

2023-01-06 21:14:58

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH block/for-6.2-fixes] block: Drop spurious might_sleep() from blk_put_queue()

On 1/6/23 1:45 PM, Luis Chamberlain wrote:
> On Fri, Jan 06, 2023 at 10:34:10AM -1000, Tejun Heo wrote:
>> Dan reports the following smatch detected the following:
>>
>> block/blk-cgroup.c:1863 blkcg_schedule_throttle() warn: sleeping in atomic context
>>
>> caused by blkcg_schedule_throttle() calling blk_put_queue() in an
>> non-sleepable context.
>>
>> blk_put_queue() acquired might_sleep() in 63f93fd6fa57 ("block: mark
>> blk_put_queue as potentially blocking") which transferred the might_sleep()
>> from blk_free_queue().
>>
>> blk_free_queue() acquired might_sleep() in e8c7d14ac6c3 ("block: revert back
>> to synchronous request_queue removal") while turning request_queue removal
>> synchronous. However, this isn't necessary as nothing in the free path
>> actually requires sleeping.
>>
>> It's pretty unusual to require a sleeping context in a put operation and
>> it's not needed in the first place. Let's drop it.
>>
>> Signed-off-by: Tejun Heo <[email protected]>
>> Reported-by: Dan Carpenter <[email protected]>
>> Link: https://lkml.kernel.org/r/Y7g3L6fntnTtOm63@kili
>> Cc: Christoph Hellwig <[email protected]>
>> Cc: Luis Chamberlain <[email protected]>
>> Fixes: e8c7d14ac6c3 ("block: revert back to synchronous request_queue removal") # v5.9+
>
> *tons* has changed since e8c7d14ac6c3 and so the bots might think that
> *if* this patch is applied upstream it is justified for older kernels
> and I don't think that's yet been verified and doubt it.
>
> And so I think adding a "Fixes" tag is not appropriate here.
>
> First I'd like to hear from Christoph if he agrees with this patch
> upstream. For stable, someone would have to do the homework.

Outside of the easily audited paths, the kobj release paths are the
only ones of concern. And I didn't spot anything that sleeps. Looks
fine to me.

--
Jens Axboe


2023-01-06 21:15:28

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH block/for-6.2-fixes] block: Drop spurious might_sleep() from blk_put_queue()

On Fri, Jan 06, 2023 at 12:45:12PM -0800, Luis Chamberlain wrote:
> *tons* has changed since e8c7d14ac6c3 and so the bots might think that
> *if* this patch is applied upstream it is justified for older kernels
> and I don't think that's yet been verified and doubt it.

Didn't look like anything relevant changed to me. Besides, all that the
patch does is removing a might_sleep() which can't hurt anything.

--
tejun

2023-01-07 09:12:33

by Dan Carpenter

[permalink] [raw]
Subject: Re: [PATCH block/for-6.2-fixes] block: Drop spurious might_sleep() from blk_put_queue()

On Fri, Jan 06, 2023 at 12:45:12PM -0800, Luis Chamberlain wrote:
> > Signed-off-by: Tejun Heo <[email protected]>
> > Reported-by: Dan Carpenter <[email protected]>
> > Link: https://lkml.kernel.org/r/Y7g3L6fntnTtOm63@kili
> > Cc: Christoph Hellwig <[email protected]>
> > Cc: Luis Chamberlain <[email protected]>
> > Fixes: e8c7d14ac6c3 ("block: revert back to synchronous request_queue removal") # v5.9+
>
> *tons* has changed since e8c7d14ac6c3 and so the bots might think that
> *if* this patch is applied upstream it is justified for older kernels
> and I don't think that's yet been verified and doubt it.

If you enable the correct debug option then the might_sleep() causes a
stack trace. Eventually syzbot will find it.

I would backport it.

regards,
dan carpenter

2023-01-08 18:08:49

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH block/for-6.2-fixes] block: Drop spurious might_sleep() from blk_put_queue()

I walked through everything called from blk_free_queue and can't find
anything that sleeps either, so:

Reviewed-by: Christoph Hellwig <[email protected]>

I wonder if we could not also revert
d578c770c85233af592e54537f93f3831bde7e9a as I think that was added
to avoid calling blk_put_queue from atomic context.

2023-01-09 03:39:41

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH block/for-6.2-fixes] block: Drop spurious might_sleep() from blk_put_queue()


On Fri, 06 Jan 2023 10:34:10 -1000, Tejun Heo wrote:
> Dan reports the following smatch detected the following:
>
> block/blk-cgroup.c:1863 blkcg_schedule_throttle() warn: sleeping in atomic context
>
> caused by blkcg_schedule_throttle() calling blk_put_queue() in an
> non-sleepable context.
>
> [...]

Applied, thanks!

[1/1] block: Drop spurious might_sleep() from blk_put_queue()
commit: 49e4d04f0486117ac57a97890eb1db6d52bf82b3

Best regards,
--
Jens Axboe