2015-11-26 07:47:10

by Hannes Reinecke

[permalink] [raw]
Subject: [PATCH] block: Always check queue limits for cloned requests

When a cloned request is retried on other queues it always needs
to be checked against the queue limits of that queue.
Otherwise the calculations for nr_phys_segments might be wrong,
leading to a crash in scsi_init_sgtable().

To clarify this the patch renames blk_rq_check_limits()
to blk_cloned_rq_check_limits() and removes the symbol
export, as the new function should only be used for
cloned requests and never exported.

Cc: Mike Snitzer <[email protected]>
Cc: Ewan Milne <[email protected]>
Cc: Jeff Moyer <[email protected]>
Signed-off-by: Hannes Reinecke <[email protected]>
---
block/blk-core.c | 21 +++++++--------------
include/linux/blkdev.h | 1 -
2 files changed, 7 insertions(+), 15 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 5131993b..a0af404 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2114,7 +2114,8 @@ blk_qc_t submit_bio(int rw, struct bio *bio)
EXPORT_SYMBOL(submit_bio);

/**
- * blk_rq_check_limits - Helper function to check a request for the queue limit
+ * blk_cloned_rq_check_limits - Helper function to check a cloned request
+ * for new the queue limits
* @q: the queue
* @rq: the request being checked
*
@@ -2125,20 +2126,13 @@ EXPORT_SYMBOL(submit_bio);
* after it is inserted to @q, it should be checked against @q before
* the insertion using this generic function.
*
- * This function should also be useful for request stacking drivers
- * in some cases below, so export this function.
* Request stacking drivers like request-based dm may change the queue
- * limits while requests are in the queue (e.g. dm's table swapping).
- * Such request stacking drivers should check those requests against
- * the new queue limits again when they dispatch those requests,
- * although such checkings are also done against the old queue limits
- * when submitting requests.
+ * limits when retrying requests on other queues. Those requests need
+ * to be checked against the new queue limits again during dispatch.
*/
-int blk_rq_check_limits(struct request_queue *q, struct request *rq)
+static int blk_cloned_rq_check_limits(struct request_queue *q,
+ struct request *rq)
{
- if (!rq_mergeable(rq))
- return 0;
-
if (blk_rq_sectors(rq) > blk_queue_get_max_sectors(q, rq->cmd_flags)) {
printk(KERN_ERR "%s: over max size limit.\n", __func__);
return -EIO;
@@ -2158,7 +2152,6 @@ int blk_rq_check_limits(struct request_queue *q, struct request *rq)

return 0;
}
-EXPORT_SYMBOL_GPL(blk_rq_check_limits);

/**
* blk_insert_cloned_request - Helper for stacking drivers to submit a request
@@ -2170,7 +2163,7 @@ int blk_insert_cloned_request(struct request_queue *q, struct request *rq)
unsigned long flags;
int where = ELEVATOR_INSERT_BACK;

- if (blk_rq_check_limits(q, rq))
+ if (blk_cloned_rq_check_limits(q, rq))
return -EIO;

if (rq->rq_disk &&
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 3fe27f8..d14961b 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -773,7 +773,6 @@ extern void blk_rq_set_block_pc(struct request *);
extern void blk_requeue_request(struct request_queue *, struct request *);
extern void blk_add_request_payload(struct request *rq, struct page *page,
unsigned int len);
-extern int blk_rq_check_limits(struct request_queue *q, struct request *rq);
extern int blk_lld_busy(struct request_queue *q);
extern int blk_rq_prep_clone(struct request *rq, struct request *rq_src,
struct bio_set *bs, gfp_t gfp_mask,
--
1.8.5.6


2015-11-26 13:11:08

by Mike Snitzer

[permalink] [raw]
Subject: Re: block: Always check queue limits for cloned requests

On Thu, Nov 26 2015 at 2:46am -0500,
Hannes Reinecke <[email protected]> wrote:

> When a cloned request is retried on other queues it always needs
> to be checked against the queue limits of that queue.
> Otherwise the calculations for nr_phys_segments might be wrong,
> leading to a crash in scsi_init_sgtable().
>
> To clarify this the patch renames blk_rq_check_limits()
> to blk_cloned_rq_check_limits() and removes the symbol
> export, as the new function should only be used for
> cloned requests and never exported.
>
> Cc: Mike Snitzer <[email protected]>
> Cc: Ewan Milne <[email protected]>
> Cc: Jeff Moyer <[email protected]>
> Signed-off-by: Hannes Reinecke <[email protected]>

Patch looks good. Thanks for getting to the bottom of this.

Jens, please add these extra tags when you pick this up:

Fixes: e2a60da74 ("block: Clean up special command handling logic")
Cc: [email protected] # 3.7+
Acked-by: Mike Snitzer <[email protected]>

2015-11-29 11:50:04

by Markus Trippelsdorf

[permalink] [raw]
Subject: Re: block: Always check queue limits for cloned requests

On 2015.11.26 at 08:11 -0500, Mike Snitzer wrote:
> On Thu, Nov 26 2015 at 2:46am -0500,
> Hannes Reinecke <[email protected]> wrote:
>
> > When a cloned request is retried on other queues it always needs
> > to be checked against the queue limits of that queue.
> > Otherwise the calculations for nr_phys_segments might be wrong,
> > leading to a crash in scsi_init_sgtable().
> >
> > To clarify this the patch renames blk_rq_check_limits()
> > to blk_cloned_rq_check_limits() and removes the symbol
> > export, as the new function should only be used for
> > cloned requests and never exported.
> >
> > Cc: Mike Snitzer <[email protected]>
> > Cc: Ewan Milne <[email protected]>
> > Cc: Jeff Moyer <[email protected]>
> > Signed-off-by: Hannes Reinecke <[email protected]>
>
> Patch looks good. Thanks for getting to the bottom of this.
>
> Jens, please add these extra tags when you pick this up:
>
> Fixes: e2a60da74 ("block: Clean up special command handling logic")
> Cc: [email protected] # 3.7+
> Acked-by: Mike Snitzer <[email protected]>

I'm still seeing the issue (BUG at drivers/scsi/scsi_lib.c:1096!) even
with this patch applied.

markus@x4 linux % git describe
v4.4-rc2-215-g081f3698e606

--
Markus

2015-11-29 15:43:44

by Hannes Reinecke

[permalink] [raw]
Subject: Re: block: Always check queue limits for cloned requests

On 11/29/2015 12:49 PM, Markus Trippelsdorf wrote:
> On 2015.11.26 at 08:11 -0500, Mike Snitzer wrote:
>> On Thu, Nov 26 2015 at 2:46am -0500,
>> Hannes Reinecke <[email protected]> wrote:
>>
>>> When a cloned request is retried on other queues it always needs
>>> to be checked against the queue limits of that queue.
>>> Otherwise the calculations for nr_phys_segments might be wrong,
>>> leading to a crash in scsi_init_sgtable().
>>>
>>> To clarify this the patch renames blk_rq_check_limits()
>>> to blk_cloned_rq_check_limits() and removes the symbol
>>> export, as the new function should only be used for
>>> cloned requests and never exported.
>>>
>>> Cc: Mike Snitzer <[email protected]>
>>> Cc: Ewan Milne <[email protected]>
>>> Cc: Jeff Moyer <[email protected]>
>>> Signed-off-by: Hannes Reinecke <[email protected]>
>>
>> Patch looks good. Thanks for getting to the bottom of this.
>>
>> Jens, please add these extra tags when you pick this up:
>>
>> Fixes: e2a60da74 ("block: Clean up special command handling logic")
>> Cc: [email protected] # 3.7+
>> Acked-by: Mike Snitzer <[email protected]>
>
> I'm still seeing the issue (BUG at drivers/scsi/scsi_lib.c:1096!) even
> with this patch applied.
>
> markus@x4 linux % git describe
> v4.4-rc2-215-g081f3698e606
>
Can you generate a crashdump?
I would need to cross-check with the other dumps I'm having to figure
out if this really is the same issue.
There have been other reports (and fixes) which show we're fighting
several distinct issues here.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
[email protected] +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: J. Hawn, J. Guild, F. Imend?rffer, HRB 16746 (AG N?rnberg)

2015-11-29 16:15:37

by Markus Trippelsdorf

[permalink] [raw]
Subject: Re: block: Always check queue limits for cloned requests

On 2015.11.29 at 16:43 +0100, Hannes Reinecke wrote:
> On 11/29/2015 12:49 PM, Markus Trippelsdorf wrote:
> > On 2015.11.26 at 08:11 -0500, Mike Snitzer wrote:
> >> On Thu, Nov 26 2015 at 2:46am -0500,
> >> Hannes Reinecke <[email protected]> wrote:
> >>
> >>> When a cloned request is retried on other queues it always needs
> >>> to be checked against the queue limits of that queue.
> >>> Otherwise the calculations for nr_phys_segments might be wrong,
> >>> leading to a crash in scsi_init_sgtable().
> >>>
> >>> To clarify this the patch renames blk_rq_check_limits()
> >>> to blk_cloned_rq_check_limits() and removes the symbol
> >>> export, as the new function should only be used for
> >>> cloned requests and never exported.
> >>>
> >>> Cc: Mike Snitzer <[email protected]>
> >>> Cc: Ewan Milne <[email protected]>
> >>> Cc: Jeff Moyer <[email protected]>
> >>> Signed-off-by: Hannes Reinecke <[email protected]>
> >>
> >> Patch looks good. Thanks for getting to the bottom of this.
> >>
> >> Jens, please add these extra tags when you pick this up:
> >>
> >> Fixes: e2a60da74 ("block: Clean up special command handling logic")
> >> Cc: [email protected] # 3.7+
> >> Acked-by: Mike Snitzer <[email protected]>
> >
> > I'm still seeing the issue (BUG at drivers/scsi/scsi_lib.c:1096!) even
> > with this patch applied.
> >
> > markus@x4 linux % git describe
> > v4.4-rc2-215-g081f3698e606
> >
> Can you generate a crashdump?
> I would need to cross-check with the other dumps I'm having to figure
> out if this really is the same issue.
> There have been other reports (and fixes) which show we're fighting
> several distinct issues here.

Unfortunately no. The crash happens on the disk where I store my log
files. And after it happened the magic SysRq keys don't work anymore.

The crash only happens on my spinning rust drive that uses the cfq
scheduler. The SSDs (deadline) are fine.

The BUG happens reproducibly when building http://www.sagemath.org/ on
that drive.

--
Markus

2015-11-29 16:49:53

by Mike Snitzer

[permalink] [raw]
Subject: Re: block: Always check queue limits for cloned requests

On Sun, Nov 29 2015 at 11:15am -0500,
Markus Trippelsdorf <[email protected]> wrote:

> On 2015.11.29 at 16:43 +0100, Hannes Reinecke wrote:
> > On 11/29/2015 12:49 PM, Markus Trippelsdorf wrote:
> > >
> > > I'm still seeing the issue (BUG at drivers/scsi/scsi_lib.c:1096!) even
> > > with this patch applied.
> > >
> > > markus@x4 linux % git describe
> > > v4.4-rc2-215-g081f3698e606
> > >
> > Can you generate a crashdump?
> > I would need to cross-check with the other dumps I'm having to figure
> > out if this really is the same issue.
> > There have been other reports (and fixes) which show we're fighting
> > several distinct issues here.
>
> Unfortunately no. The crash happens on the disk where I store my log
> files. And after it happened the magic SysRq keys don't work anymore.
>
> The crash only happens on my spinning rust drive that uses the cfq
> scheduler. The SSDs (deadline) are fine.
>
> The BUG happens reproducibly when building http://www.sagemath.org/ on
> that drive.

Are you using DM multipath? If unsure, please let us know which
device(s) map to the "spinning rust drive", and provide output from:
lsblk

2015-11-29 17:05:13

by Markus Trippelsdorf

[permalink] [raw]
Subject: Re: block: Always check queue limits for cloned requests

On 2015.11.29 at 11:49 -0500, Mike Snitzer wrote:
> On Sun, Nov 29 2015 at 11:15am -0500,
> Markus Trippelsdorf <[email protected]> wrote:
>
> > On 2015.11.29 at 16:43 +0100, Hannes Reinecke wrote:
> > > On 11/29/2015 12:49 PM, Markus Trippelsdorf wrote:
> > > >
> > > > I'm still seeing the issue (BUG at drivers/scsi/scsi_lib.c:1096!) even
> > > > with this patch applied.
> > > >
> > > > markus@x4 linux % git describe
> > > > v4.4-rc2-215-g081f3698e606
> > > >
> > > Can you generate a crashdump?
> > > I would need to cross-check with the other dumps I'm having to figure
> > > out if this really is the same issue.
> > > There have been other reports (and fixes) which show we're fighting
> > > several distinct issues here.
> >
> > Unfortunately no. The crash happens on the disk where I store my log
> > files. And after it happened the magic SysRq keys don't work anymore.
> >
> > The crash only happens on my spinning rust drive that uses the cfq
> > scheduler. The SSDs (deadline) are fine.
> >
> > The BUG happens reproducibly when building http://www.sagemath.org/ on
> > that drive.
>
> Are you using DM multipath? If unsure, please let us know which
> device(s) map to the "spinning rust drive", and provide output from:
> lsblk

No, I'm not using DM multipath.

/dev/sdb2 on /var type btrfs (rw,relatime,compress=lzo,noacl,space_cache)
/dev/sdb2 btrfs 1.9T 904G 944G 49% /var

scsi 1:0:0:0: Direct-Access ATA ST2000DM001-1CH1 CC29 PQ: 0 ANSI: 5
sd 1:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
sd 1:0:0:0: [sdb] 4096-byte physical blocks
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: Attached scsi generic sg1 type 0
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

Model Family: Seagate Barracuda 7200.14 (AF)
Device Model: ST2000DM001-1CH164

--
Markus

2015-11-30 06:11:56

by Ming Lei

[permalink] [raw]
Subject: Re: block: Always check queue limits for cloned requests

On Sun, 29 Nov 2015 18:05:06 +0100
Markus Trippelsdorf <[email protected]> wrote:

> On 2015.11.29 at 11:49 -0500, Mike Snitzer wrote:
> > On Sun, Nov 29 2015 at 11:15am -0500,
> > Markus Trippelsdorf <[email protected]> wrote:
> >
> > > On 2015.11.29 at 16:43 +0100, Hannes Reinecke wrote:
> > > > On 11/29/2015 12:49 PM, Markus Trippelsdorf wrote:
> > > > >
> > > > > I'm still seeing the issue (BUG at drivers/scsi/scsi_lib.c:1096!) even
> > > > > with this patch applied.
> > > > >
> > > > > markus@x4 linux % git describe
> > > > > v4.4-rc2-215-g081f3698e606
> > > > >
> > > > Can you generate a crashdump?
> > > > I would need to cross-check with the other dumps I'm having to figure
> > > > out if this really is the same issue.
> > > > There have been other reports (and fixes) which show we're fighting
> > > > several distinct issues here.
> > >
> > > Unfortunately no. The crash happens on the disk where I store my log
> > > files. And after it happened the magic SysRq keys don't work anymore.
> > >
> > > The crash only happens on my spinning rust drive that uses the cfq
> > > scheduler. The SSDs (deadline) are fine.
> > >
> > > The BUG happens reproducibly when building http://www.sagemath.org/ on
> > > that drive.
> >
> > Are you using DM multipath? If unsure, please let us know which
> > device(s) map to the "spinning rust drive", and provide output from:
> > lsblk
>
> No, I'm not using DM multipath.


OK, I guess it is still one block merge issue, care to test the
following patch?

The patch can address one issue when bio->bi_seg_front_size
is set as too small mistakenly, then fewer physical segment may
be figured out.

---
>From 7aa725205f400ee6823a0d19bf9f41a2464725ce Mon Sep 17 00:00:00 2001
From: Ming Lei <[email protected]>
Date: Mon, 30 Nov 2015 13:10:12 +0800
Subject: [PATCH] blk-merge: fix computing bio->bi_seg_front_size in case of
single segment

When bio has only one physical segment, we should set bio's
bi_seg_front_size as the real(final) size of the single segment.

Fixes: 02e707424c2ea(blk-merge: fix blk_bio_segment_split)
Reported-by: Markus Trippelsdorf <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
---
block/blk-merge.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 41a55ba..e01405a 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -103,6 +103,9 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
bvprv = bv;
bvprvp = &bvprv;
sectors += bv.bv_len >> 9;
+
+ if (nsegs == 1 && seg_size > front_seg_size)
+ front_seg_size = seg_size;
continue;
}
new_segment:
--
1.9.1








2015-11-30 06:47:55

by Hannes Reinecke

[permalink] [raw]
Subject: Re: block: Always check queue limits for cloned requests

On 11/29/2015 06:05 PM, Markus Trippelsdorf wrote:
> On 2015.11.29 at 11:49 -0500, Mike Snitzer wrote:
>> On Sun, Nov 29 2015 at 11:15am -0500,
>> Markus Trippelsdorf <[email protected]> wrote:
>>
>>> On 2015.11.29 at 16:43 +0100, Hannes Reinecke wrote:
>>>> On 11/29/2015 12:49 PM, Markus Trippelsdorf wrote:
>>>>>
>>>>> I'm still seeing the issue (BUG at drivers/scsi/scsi_lib.c:1096!) even
>>>>> with this patch applied.
>>>>>
>>>>> markus@x4 linux % git describe
>>>>> v4.4-rc2-215-g081f3698e606
>>>>>
>>>> Can you generate a crashdump?
>>>> I would need to cross-check with the other dumps I'm having to figure
>>>> out if this really is the same issue.
>>>> There have been other reports (and fixes) which show we're fighting
>>>> several distinct issues here.
>>>
>>> Unfortunately no. The crash happens on the disk where I store my log
>>> files. And after it happened the magic SysRq keys don't work anymore.
>>>
>>> The crash only happens on my spinning rust drive that uses the cfq
>>> scheduler. The SSDs (deadline) are fine.
>>>
>>> The BUG happens reproducibly when building http://www.sagemath.org/ on
>>> that drive.
>>
>> Are you using DM multipath? If unsure, please let us know which
>> device(s) map to the "spinning rust drive", and provide output from:
>> lsblk
>
> No, I'm not using DM multipath.
>
> /dev/sdb2 on /var type btrfs (rw,relatime,compress=lzo,noacl,space_cache)
> /dev/sdb2 btrfs 1.9T 904G 944G 49% /var
>
> scsi 1:0:0:0: Direct-Access ATA ST2000DM001-1CH1 CC29 PQ: 0 ANSI: 5
> sd 1:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
> sd 1:0:0:0: [sdb] 4096-byte physical blocks
> sd 1:0:0:0: [sdb] Write Protect is off
> sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> sd 1:0:0:0: Attached scsi generic sg1 type 0
> sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>
> Model Family: Seagate Barracuda 7200.14 (AF)
> Device Model: ST2000DM001-1CH164
>
As Ming Lei indicated, this is probably a different issue. My patch
is for fixing multipath-failover induced I/O errors only.
So if you're not using multipath you won't be affected, neither by
the original issue triggering the BUG_ON nor my patch attempting to
fix it.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
[email protected] +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)

2015-11-30 07:12:56

by Markus Trippelsdorf

[permalink] [raw]
Subject: Re: block: Always check queue limits for cloned requests

On 2015.11.30 at 14:11 +0800, Ming Lei wrote:
> On Sun, 29 Nov 2015 18:05:06 +0100
> Markus Trippelsdorf <[email protected]> wrote:
> >
> > No, I'm not using DM multipath.
>
>
> OK, I guess it is still one block merge issue, care to test the
> following patch?
>
> The patch can address one issue when bio->bi_seg_front_size
> is set as too small mistakenly, then fewer physical segment may
> be figured out.

Many thanks. Your patch fixes the issue for me.

--
Markus