2022-12-10 19:37:00

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 1/2 block/for-6.2] blk-iolatency: Fix memory leak on add_disk() failures

When a gendisk is successfully initialized but add_disk() fails such as when
a loop device has invalid number of minor device numbers specified,
blkcg_init_disk() is called during init and then blkcg_exit_disk() during
error handling. Unfortunately, iolatency gets initialized in the former but
doesn't get cleaned up in the latter.

This is because, in non-error cases, the cleanup is performed by
del_gendisk() calling rq_qos_exit(), the assumption being that rq_qos
policies, iolatency being one of them, can only be activated once the disk
is fully registered and visible. That assumption is true for wbt and iocost,
but not so for iolatency as it gets initialized before add_disk() is called.

It is desirable to lazy-init rq_qos policies because they are optional
features and add to hot path overhead once initialized - each IO has to walk
all the registered rq_qos policies. So, we want to switch iolatency to lazy
init too. However, that's a bigger change. As a fix for the immediate
problem, let's just add an extra call to rq_qos_exit() in blkcg_exit_disk().
This is safe because duplicate calls to rq_qos_exit() become noop's.

Signed-off-by: Tejun Heo <[email protected]>
Reported-by: [email protected]
Cc: Josef Bacik <[email protected]>
Cc: Linus Torvalds <[email protected]>
Fixes: d70675121546 ("block: introduce blk-iolatency io controller")
Cc: [email protected] # v4.19+
---
Hello,

I'm posting two patches for the iolatency memory leak issue after add_disk()
failure. This one is the immediate fix and should be really safe. However,
any change has risks and given that the bug being address is not critical at
all, I still think it'd make sense to route it through 6.2-rc1 rather than
applying directly to master for 6.1 release. So, it's tagged for the 6.2
merge window.

Thanks.

block/blk-cgroup.c | 2 ++
1 file changed, 2 insertions(+)

--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -33,6 +33,7 @@
#include "blk-cgroup.h"
#include "blk-ioprio.h"
#include "blk-throttle.h"
+#include "blk-rq-qos.h"

/*
* blkcg_pol_mutex protects blkcg_policy[] and policy [de]activation.
@@ -1322,6 +1323,7 @@ err_unlock:
void blkcg_exit_disk(struct gendisk *disk)
{
blkg_destroy_all(disk);
+ rq_qos_exit(disk->queue);
blk_throtl_exit(disk);
}


2022-12-10 20:06:21

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH 1/2 block/for-6.2] blk-iolatency: Fix memory leak on add_disk() failures

On Sat, Dec 10, 2022 at 10:33 AM Tejun Heo <[email protected]> wrote:
>
> I'm posting two patches for the iolatency memory leak issue after add_disk()
> failure. This one is the immediate fix and should be really safe. However,
> any change has risks and given that the bug being address is not critical at
> all, I still think it'd make sense to route it through 6.2-rc1 rather than
> applying directly to master for 6.1 release. So, it's tagged for the 6.2
> merge window.

Ack. I'm archiving these patches, and expect I'll be getting them the
usual ways (ie pull request).

If people expect something else (like me applying them during the
merge window as patches), holler.

Linus

2022-12-12 09:17:33

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/2 block/for-6.2] blk-iolatency: Fix memory leak on add_disk() failures

On Sat, Dec 10, 2022 at 08:33:10AM -1000, Tejun Heo wrote:
> When a gendisk is successfully initialized but add_disk() fails such as when
> a loop device has invalid number of minor device numbers specified,
> blkcg_init_disk() is called during init and then blkcg_exit_disk() during
> error handling. Unfortunately, iolatency gets initialized in the former but
> doesn't get cleaned up in the latter.
>
> This is because, in non-error cases, the cleanup is performed by
> del_gendisk() calling rq_qos_exit(), the assumption being that rq_qos
> policies, iolatency being one of them, can only be activated once the disk
> is fully registered and visible. That assumption is true for wbt and iocost,
> but not so for iolatency as it gets initialized before add_disk() is called.
>
> It is desirable to lazy-init rq_qos policies because they are optional
> features and add to hot path overhead once initialized - each IO has to walk
> all the registered rq_qos policies. So, we want to switch iolatency to lazy
> init too. However, that's a bigger change. As a fix for the immediate
> problem, let's just add an extra call to rq_qos_exit() in blkcg_exit_disk().
> This is safe because duplicate calls to rq_qos_exit() become noop's.
>
> Signed-off-by: Tejun Heo <[email protected]>
> Reported-by: [email protected]

Looks good:

Reviewed-by: Christoph Hellwig <[email protected]>

2022-12-14 20:39:04

by Jens Axboe

[permalink] [raw]
Subject: Re: (subset) [PATCH 1/2 block/for-6.2] blk-iolatency: Fix memory leak on add_disk() failures


On Sat, 10 Dec 2022 08:33:10 -1000, Tejun Heo wrote:
> When a gendisk is successfully initialized but add_disk() fails such as when
> a loop device has invalid number of minor device numbers specified,
> blkcg_init_disk() is called during init and then blkcg_exit_disk() during
> error handling. Unfortunately, iolatency gets initialized in the former but
> doesn't get cleaned up in the latter.
>
> This is because, in non-error cases, the cleanup is performed by
> del_gendisk() calling rq_qos_exit(), the assumption being that rq_qos
> policies, iolatency being one of them, can only be activated once the disk
> is fully registered and visible. That assumption is true for wbt and iocost,
> but not so for iolatency as it gets initialized before add_disk() is called.
>
> [...]

Applied, thanks!

[1/2] blk-iolatency: Fix memory leak on add_disk() failures
commit: 813e693023ba10da9e75067780f8378465bf27cc

Best regards,
--
Jens Axboe