2015-02-13 17:11:01

by Tony Battersby

Subject: [PATCH v2 2/2] [SCSI] sg: fix EWOULDBLOCK errors with scsi-mq

With scsi-mq enabled, userspace programs can get unexpected EWOULDBLOCK
(a.k.a. EAGAIN) errors when submitting commands to the SCSI generic
driver. Fix by calling blk_get_request() with GFP_KERNEL instead of
GFP_ATOMIC.

Note: to avoid introducing a potential deadlock, this patch should be
applied after the patch titled "sg: fix unkillable I/O wait deadlock
with scsi-mq".

Cc: Douglas Gilbert <[email protected]>
Cc: <[email protected]> # 3.17+
Signed-off-by: Tony Battersby <[email protected]>
---

For inclusion in kernel 3.20.

The difference in behavior is due to bt_get() in block/blk-mq-tag.c
checking for __GFP_WAIT.
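
As an illustration only, the behavior described above can be modeled in userspace. This is a rough sketch and not the kernel's bt_get() code: struct model_pool, MODEL_CAN_QUEUE, and the model_*() helpers are hypothetical names invented for this example, and "sleeping" is simulated by releasing tag 0 and retrying.

```c
#include <errno.h>
#include <stdbool.h>

#define MODEL_CAN_QUEUE 2	/* hypothetical; stands in for shost->can_queue */

/* Hypothetical fixed pool of preallocated requests (tags). */
struct model_pool {
	bool in_use[MODEL_CAN_QUEUE];
};

/* Return the index of a free tag and mark it used, or -1 if none is free. */
static int model_try_get(struct model_pool *pool)
{
	for (int i = 0; i < MODEL_CAN_QUEUE; i++) {
		if (!pool->in_use[i]) {
			pool->in_use[i] = true;
			return i;
		}
	}
	return -1;
}

/* Stand-in for a command completing: release one tag. */
static void model_put(struct model_pool *pool, int tag)
{
	pool->in_use[tag] = false;
}

/*
 * Model of the allocation policy: a caller that may not wait gets
 * -EWOULDBLOCK when the pool is exhausted; a caller that may wait
 * blocks until a tag is released.  Sleeping is simulated here by
 * completing the command that holds tag 0 and then retrying.
 */
static int model_get(struct model_pool *pool, bool may_wait)
{
	int tag = model_try_get(pool);

	if (tag >= 0)
		return tag;
	if (!may_wait)
		return -EWOULDBLOCK;

	model_put(pool, 0);	/* simulated completion while "asleep" */
	return model_try_get(pool);
}
```

In this model, may_wait == false plays the role of GFP_ATOMIC and may_wait == true plays the role of GFP_KERNEL; only the second can succeed once every preallocated request is in use.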

The bsg driver already calls blk_get_request() with GFP_KERNEL, so there
is no need for a change there.

--- linux-3.19.0/drivers/scsi/sg.c.orig 2015-02-13 11:04:40.000000000 -0500
+++ linux-3.19.0/drivers/scsi/sg.c 2015-02-13 11:05:14.000000000 -0500
@@ -1695,7 +1695,22 @@ sg_start_req(Sg_request *srp, unsigned c
 			return -ENOMEM;
 	}
 
-	rq = blk_get_request(q, rw, GFP_ATOMIC);
+	/*
+	 * NOTE
+	 *
+	 * With scsi-mq enabled, there are a fixed number of preallocated
+	 * requests equal in number to shost->can_queue. If all of the
+	 * preallocated requests are already in use, then using GFP_ATOMIC with
+	 * blk_get_request() will return -EWOULDBLOCK, whereas using GFP_KERNEL
+	 * will cause blk_get_request() to sleep until an active command
+	 * completes, freeing up a request. Neither option is ideal, but
+	 * GFP_KERNEL is the better choice to prevent userspace from getting an
+	 * unexpected EWOULDBLOCK.
+	 *
+	 * With scsi-mq disabled, blk_get_request() with GFP_KERNEL usually
+	 * does not sleep except under memory pressure.
+	 */
+	rq = blk_get_request(q, rw, GFP_KERNEL);
 	if (IS_ERR(rq)) {
 		kfree(long_cmdp);
 		return PTR_ERR(rq);
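
For userspace programs that may still run on a kernel without this change, one possible way to cope with the unexpected EWOULDBLOCK is a bounded retry. The sketch below is an assumption for illustration and is not part of the patch: submit_fn, submit_with_retry(), and fake_submit() are hypothetical names, and fake_submit() merely stands in for a write() or ioctl() on an sg device.

```c
#include <errno.h>
#include <time.h>

/* Hypothetical callback: returns 0 on success, -1 with errno set on failure. */
typedef int (*submit_fn)(void *ctx);

/*
 * Call submit() up to max_tries times, pausing 1 ms after each attempt
 * that fails with EAGAIN (the same value as EWOULDBLOCK on Linux).
 * Any other error is returned to the caller immediately.
 */
static int submit_with_retry(submit_fn submit, void *ctx, int max_tries)
{
	struct timespec delay = { 0, 1000000 };	/* 1 ms */

	for (int i = 0; i < max_tries; i++) {
		if (submit(ctx) == 0)
			return 0;
		if (errno != EAGAIN)
			return -1;
		nanosleep(&delay, NULL);
	}
	errno = EAGAIN;	/* still busy after max_tries attempts */
	return -1;
}

/*
 * Hypothetical stand-in for submitting a command to an sg device:
 * fails with EAGAIN until the counter pointed to by ctx reaches zero.
 */
static int fake_submit(void *ctx)
{
	int *remaining_failures = ctx;

	if (*remaining_failures > 0) {
		(*remaining_failures)--;
		errno = EAGAIN;
		return -1;
	}
	return 0;
}
```

A real caller would pass a function that performs the actual submission in place of fake_submit(). The retry limit and delay are arbitrary choices here and would need tuning for the workload.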


2015-02-15 22:12:10

by Douglas Gilbert

Subject: Re: [PATCH v2 2/2] [SCSI] sg: fix EWOULDBLOCK errors with scsi-mq

On 15-02-13 12:10 PM, Tony Battersby wrote:
> With scsi-mq enabled, userspace programs can get unexpected EWOULDBLOCK
> (a.k.a. EAGAIN) errors when submitting commands to the SCSI generic
> driver. Fix by calling blk_get_request() with GFP_KERNEL instead of
> GFP_ATOMIC.
>
> Note: to avoid introducing a potential deadlock, this patch should be
> applied after the patch titled "sg: fix unkillable I/O wait deadlock
> with scsi-mq".
>
> Cc: Douglas Gilbert <[email protected]>
> Cc: <[email protected]> # 3.17+
> Signed-off-by: Tony Battersby <[email protected]>

Acked-by: Douglas Gilbert <[email protected]>
Tested-by: Douglas Gilbert <[email protected]>

> For inclusion in kernel 3.20.
>
> The difference in behavior is due to bt_get() in block/blk-mq-tag.c
> checking for __GFP_WAIT.
>
> The bsg driver already calls blk_get_request() with GFP_KERNEL, so there
> is no need for a change there.
>
> --- linux-3.19.0/drivers/scsi/sg.c.orig 2015-02-13 11:04:40.000000000 -0500
> +++ linux-3.19.0/drivers/scsi/sg.c 2015-02-13 11:05:14.000000000 -0500
> @@ -1695,7 +1695,22 @@ sg_start_req(Sg_request *srp, unsigned c
>  			return -ENOMEM;
>  	}
>  
> -	rq = blk_get_request(q, rw, GFP_ATOMIC);
> +	/*
> +	 * NOTE
> +	 *
> +	 * With scsi-mq enabled, there are a fixed number of preallocated
> +	 * requests equal in number to shost->can_queue. If all of the
> +	 * preallocated requests are already in use, then using GFP_ATOMIC with
> +	 * blk_get_request() will return -EWOULDBLOCK, whereas using GFP_KERNEL
> +	 * will cause blk_get_request() to sleep until an active command
> +	 * completes, freeing up a request. Neither option is ideal, but
> +	 * GFP_KERNEL is the better choice to prevent userspace from getting an
> +	 * unexpected EWOULDBLOCK.
> +	 *
> +	 * With scsi-mq disabled, blk_get_request() with GFP_KERNEL usually
> +	 * does not sleep except under memory pressure.
> +	 */
> +	rq = blk_get_request(q, rw, GFP_KERNEL);
>  	if (IS_ERR(rq)) {
>  		kfree(long_cmdp);
>  		return PTR_ERR(rq);
>
> --