2020-01-22 03:21:22

by Sun Ke

Subject: [v2] nbd: add a flush_workqueue in nbd_start_device

If a kzalloc in nbd_start_device fails, a recv thread may end up
trying to destroy the workqueue from inside the workqueue.

Suppose num_connections is m (2 < m), the first n kzalloc calls
(1 < n < m) succeed, and kzalloc n + 1 fails. nbd_start_device
then returns -ENOMEM to nbd_start_device_ioctl, which returns
immediately without calling flush_workqueue, even though n recv
threads are still running. If nbd_release runs first, one of
those recv threads may drop the last config_refs and try to
destroy the workqueue from inside the workqueue.

To fix it, add a flush_workqueue in nbd_start_device.
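
For illustration only (not part of the patch): the ordering problem
described above can be sketched as a small single-threaded userspace
model. Every name in it (struct model, config_put, run_worker,
flush_workers, simulate) is hypothetical and does not exist in nbd.c;
the final reference drop merely stands in for destroying the
workqueue, and the assumption is that each recv work item holds one
config reference and drops it when it finishes.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; every name here is hypothetical, not from nbd.c. */
struct model {
	int config_refs;          /* outstanding references on the config */
	int pending_workers;      /* recv work items that have not finished */
	bool in_worker;           /* true while a work item is executing */
	bool destroyed_in_worker; /* teardown was triggered by a work item */
};

/* Drop one reference; the final drop stands in for destroying the workqueue. */
static void config_put(struct model *m)
{
	assert(m->config_refs > 0);
	if (--m->config_refs == 0 && m->in_worker)
		m->destroyed_in_worker = true;
}

/* One recv work item runs to completion and drops its reference. */
static void run_worker(struct model *m)
{
	m->in_worker = true;
	config_put(m);
	m->in_worker = false;
	m->pending_workers--;
}

/* Stand-in for flush_workqueue(): wait until every queued item has run. */
static void flush_workers(struct model *m)
{
	while (m->pending_workers > 0)
		run_worker(m);
}

/*
 * n_started work items were queued before the allocation failed.
 * Returns true if the last reference was dropped from a work item.
 */
static bool simulate(int n_started, bool flush_on_error)
{
	struct model m = { .config_refs = 1 }; /* reference held by the opener */
	int i;

	for (i = 0; i < n_started; i++) {
		m.config_refs++;
		m.pending_workers++;
	}

	/* Error path: allocation n_started + 1 has just failed. */
	if (flush_on_error && n_started)
		flush_workers(&m);

	config_put(&m);    /* release drops the opener's reference */
	flush_workers(&m); /* any remaining work items finish later */

	return m.destroyed_in_worker;
}
```

In this model, simulate(2, false) reports that the last reference was
dropped from a work item, while simulate(2, true) does not, because
the flush forces every work item to finish before the opener's
reference goes away.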

Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs")
Signed-off-by: Sun Ke <[email protected]>
---
drivers/block/nbd.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index b4607dd96185..78181908f0df 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1265,6 +1265,16 @@ static int nbd_start_device(struct nbd_device *nbd)
 		args = kzalloc(sizeof(*args), GFP_KERNEL);
 		if (!args) {
 			sock_shutdown(nbd);
+			/*
+			 * If num_connections is m (2 < m) and the first n
+			 * kzallocs (1 < n < m) succeed but kzalloc n + 1
+			 * fails, n recv threads are still running. Flush
+			 * the workqueue here so that no recv thread drops
+			 * the last config_refs and tries to destroy the
+			 * workqueue from inside the workqueue.
+			 */
+			if (i)
+				flush_workqueue(nbd->recv_workq);
 			return -ENOMEM;
 		}
 		sk_set_memalloc(config->socks[i]->sock->sk);
--
2.17.2


2020-02-04 02:30:22

by Sun Ke

Subject: Re: [v2] nbd: add a flush_workqueue in nbd_start_device

ping

On 2020/1/22 11:18, Sun Ke wrote:
> If a kzalloc in nbd_start_device fails, a recv thread may end up
> trying to destroy the workqueue from inside the workqueue.
>
> Suppose num_connections is m (2 < m), the first n kzalloc calls
> (1 < n < m) succeed, and kzalloc n + 1 fails. nbd_start_device
> then returns -ENOMEM to nbd_start_device_ioctl, which returns
> immediately without calling flush_workqueue, even though n recv
> threads are still running. If nbd_release runs first, one of
> those recv threads may drop the last config_refs and try to
> destroy the workqueue from inside the workqueue.
>
> To fix it, add a flush_workqueue in nbd_start_device.
>
> Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs")
> Signed-off-by: Sun Ke <[email protected]>
> ---
> drivers/block/nbd.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index b4607dd96185..78181908f0df 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -1265,6 +1265,16 @@ static int nbd_start_device(struct nbd_device *nbd)
>  		args = kzalloc(sizeof(*args), GFP_KERNEL);
>  		if (!args) {
>  			sock_shutdown(nbd);
> +			/*
> +			 * If num_connections is m (2 < m) and the first n
> +			 * kzallocs (1 < n < m) succeed but kzalloc n + 1
> +			 * fails, n recv threads are still running. Flush
> +			 * the workqueue here so that no recv thread drops
> +			 * the last config_refs and tries to destroy the
> +			 * workqueue from inside the workqueue.
> +			 */
> +			if (i)
> +				flush_workqueue(nbd->recv_workq);
>  			return -ENOMEM;
>  		}
>  		sk_set_memalloc(config->socks[i]->sock->sk);
>

2020-02-04 02:44:56

by Jens Axboe

Subject: Re: [v2] nbd: add a flush_workqueue in nbd_start_device

On 2/3/20 7:28 PM, sunke (E) wrote:
> ping

Maybe I forgot to reply, but I queued it up last week:

https://git.kernel.dk/cgit/linux-block/commit/?h=block-5.6&id=5c0dd228b5fc30a3b732c7ae2657e0161ec7ed80

--
Jens Axboe