2016-03-12 07:43:34

by Kent Overstreet

[permalink] [raw]
Subject: e827091cb1 "block: merge: get the 1st and last bvec via helpers" broken

I don't know exactly how it's broken, but with that patch segment counting is
broken - I'm seeing blk_rq_map_sg() overrun the end of the sgtable.

I suggest reverting it for 4.5...


2016-03-12 08:04:28

by Ming Lin

[permalink] [raw]
Subject: Re: e827091cb1 "block: merge: get the 1st and last bvec via helpers" broken

On Fri, Mar 11, 2016 at 11:43 PM, Kent Overstreet
<[email protected]> wrote:
> I don't know exactly how it's broken, but with that patch segment counting is
> broken - I'm seeing blk_rq_map_sg() overrun the end of the sgtable.
>
> I suggest reverting it for 4.5...

+ Ming Lei

2016-03-12 08:49:49

by Ming Lei

[permalink] [raw]
Subject: Re: e827091cb1 "block: merge: get the 1st and last bvec via helpers" broken

On Sat, Mar 12, 2016 at 3:43 PM, Kent Overstreet
<[email protected]> wrote:
> I don't know exactly how it's broken, but with that patch segment counting is
> broken - I'm seeing blk_rq_map_sg() overrun the end of the sgtable.
>
> I suggest reverting it for 4.5...

Kent, could you share your test case? I'd like to figure out the root cause.

BTW, I don't object to revert it given it is close to v4.5 release, but I am
curious how it breaks segment couting.


Thanks,
Ming Lei

2016-03-12 09:24:39

by Kent Overstreet

[permalink] [raw]
Subject: Re: e827091cb1 "block: merge: get the 1st and last bvec via helpers" broken

On Sat, Mar 12, 2016 at 04:49:41PM +0800, Ming Lei wrote:
> On Sat, Mar 12, 2016 at 3:43 PM, Kent Overstreet
> <[email protected]> wrote:
> > I don't know exactly how it's broken, but with that patch segment counting is
> > broken - I'm seeing blk_rq_map_sg() overrun the end of the sgtable.
> >
> > I suggest reverting it for 4.5...
>
> Kent, could you share your test case? I'd like to figure out the root cause.

xfstest 036 on bcachefs.

> BTW, I don't object to revert it given it is close to v4.5 release, but I am
> curious how it breaks segment couting.

If you want to debug your version (personally I'd just revert to the simpler
one), I'd start by having your helper use both methods to calculate the last
biovec, and then assert that they're equal.

Also make sure you're testing with a sub-page sized blocksize, if filesystem
blocksize == page size you're not going to be testing the interesting cases

2016-03-12 10:37:09

by Ming Lei

[permalink] [raw]
Subject: Re: e827091cb1 "block: merge: get the 1st and last bvec via helpers" broken

Hi Kent,

On Sat, Mar 12, 2016 at 5:24 PM, Kent Overstreet
<[email protected]> wrote:
> On Sat, Mar 12, 2016 at 04:49:41PM +0800, Ming Lei wrote:
>> On Sat, Mar 12, 2016 at 3:43 PM, Kent Overstreet
>> <[email protected]> wrote:
>> > I don't know exactly how it's broken, but with that patch segment counting is
>> > broken - I'm seeing blk_rq_map_sg() overrun the end of the sgtable.
>> >
>> > I suggest reverting it for 4.5...
>>
>> Kent, could you share your test case? I'd like to figure out the root cause.
>
> xfstest 036 on bcachefs.

Given bcachefs isn't merged, could you provide one way to reproduce it
with clean upstream kernel?

>
>> BTW, I don't object to revert it given it is close to v4.5 release, but I am
>> curious how it breaks segment couting.
>
> If you want to debug your version (personally I'd just revert to the simpler
> one), I'd start by having your helper use both methods to calculate the last
> biovec, and then assert that they're equal.
>
> Also make sure you're testing with a sub-page sized blocksize, if filesystem
> blocksize == page size you're not going to be testing the interesting cases

I just run xfstests 036 over bcache and md, with block size 1024/2048, with
xfs/ext4/btrfs, looks the segment counting issue can't be reproduced.

If the issue can only be reproduced with bcachefs, I suggest we don't revert
it until the root cause is figured out.

Thanks,
Ming Lei

2016-03-12 12:12:50

by Kent Overstreet

[permalink] [raw]
Subject: Re: e827091cb1 "block: merge: get the 1st and last bvec via helpers" broken

On Sat, Mar 12, 2016 at 06:36:56PM +0800, Ming Lei wrote:
> Hi Kent,
>
> On Sat, Mar 12, 2016 at 5:24 PM, Kent Overstreet
> <[email protected]> wrote:
> > On Sat, Mar 12, 2016 at 04:49:41PM +0800, Ming Lei wrote:
> >> On Sat, Mar 12, 2016 at 3:43 PM, Kent Overstreet
> >> <[email protected]> wrote:
> >> > I don't know exactly how it's broken, but with that patch segment counting is
> >> > broken - I'm seeing blk_rq_map_sg() overrun the end of the sgtable.
> >> >
> >> > I suggest reverting it for 4.5...
> >>
> >> Kent, could you share your test case? I'd like to figure out the root cause.
> >
> > xfstest 036 on bcachefs.
>
> Given bcachefs isn't merged, could you provide one way to reproduce it
> with clean upstream kernel?
>
> >
> >> BTW, I don't object to revert it given it is close to v4.5 release, but I am
> >> curious how it breaks segment couting.
> >
> > If you want to debug your version (personally I'd just revert to the simpler
> > one), I'd start by having your helper use both methods to calculate the last
> > biovec, and then assert that they're equal.
> >
> > Also make sure you're testing with a sub-page sized blocksize, if filesystem
> > blocksize == page size you're not going to be testing the interesting cases
>
> I just run xfstests 036 over bcache and md, with block size 1024/2048, with
> xfs/ext4/btrfs, looks the segment counting issue can't be reproduced.
>
> If the issue can only be reproduced with bcachefs, I suggest we don't revert
> it until the root cause is figured out.

Ming, if blk_rq_map_sg() is overrunning arrays and corrupting memory that's a
bug in your code - this is certainly a bug in the core block layer - and just
because you haven't been able to reproduce it yet does _not_ mean that no one
else will hit it.

I think I know why bcachefs hits it - that particular test is doing DIO writes
of sub page granularity, creating extents that are logically adjacent and
adjacent on disk (because they were written one right after the other), but that
don't get merged because I'm running my tests with data checksumming enabled.
Then, when we go to read that data - with a buffered read, so page granularity -
the read gets fragmented into multiple bios (because there's multiple extents),
where the two bios are adjacent (and pointing to the same page!) and adjacent on
disk - thus when they're issued you get sub page size segments from two different
bios that are able to be merged, which is otherwise a highly unusual situation.

I'm not about to write a test case for you though, it's your job to figure out
how to test your code.

Also note that it is entirely possible that the segment counting itself is
correct with your patch, and the bug is just that segments aren't getting merged
that the segment counting assumed would be. If that is the case then your patch
merely exposed the bug, but your patch still needs to be reverted in the
meantime.

2016-03-12 13:33:28

by Ming Lei

[permalink] [raw]
Subject: Re: e827091cb1 "block: merge: get the 1st and last bvec via helpers" broken

On Sat, Mar 12, 2016 at 8:12 PM, Kent Overstreet
<[email protected]> wrote:
> On Sat, Mar 12, 2016 at 06:36:56PM +0800, Ming Lei wrote:
>> Hi Kent,
>>
>> On Sat, Mar 12, 2016 at 5:24 PM, Kent Overstreet
>> <[email protected]> wrote:
>> > On Sat, Mar 12, 2016 at 04:49:41PM +0800, Ming Lei wrote:
>> >> On Sat, Mar 12, 2016 at 3:43 PM, Kent Overstreet
>> >> <[email protected]> wrote:
>> >> > I don't know exactly how it's broken, but with that patch segment counting is
>> >> > broken - I'm seeing blk_rq_map_sg() overrun the end of the sgtable.
>> >> >
>> >> > I suggest reverting it for 4.5...
>> >>
>> >> Kent, could you share your test case? I'd like to figure out the root cause.
>> >
>> > xfstest 036 on bcachefs.
>>
>> Given bcachefs isn't merged, could you provide one way to reproduce it
>> with clean upstream kernel?
>>
>> >
>> >> BTW, I don't object to revert it given it is close to v4.5 release, but I am
>> >> curious how it breaks segment couting.
>> >
>> > If you want to debug your version (personally I'd just revert to the simpler
>> > one), I'd start by having your helper use both methods to calculate the last
>> > biovec, and then assert that they're equal.
>> >
>> > Also make sure you're testing with a sub-page sized blocksize, if filesystem
>> > blocksize == page size you're not going to be testing the interesting cases
>>
>> I just run xfstests 036 over bcache and md, with block size 1024/2048, with
>> xfs/ext4/btrfs, looks the segment counting issue can't be reproduced.
>>
>> If the issue can only be reproduced with bcachefs, I suggest we don't revert
>> it until the root cause is figured out.
>
> Ming, if blk_rq_map_sg() is overrunning arrays and corrupting memory that's a
> bug in your code - this is certainly a bug in the core block layer - and just
> because you haven't been able to reproduce it yet does _not_ mean that no one
> else will hit it.

I just mean given bcachefs isn't merged yet, then maybe no one can be effected,
so we have time to figure out the root cause.

I will try your bcachefs to see if I can reproduce it, could you share me
one tree so that I can test bcachefs?

>
> I think I know why bcachefs hits it - that particular test is doing DIO writes
> of sub page granularity, creating extents that are logically adjacent and
> adjacent on disk (because they were written one right after the other), but that
> don't get merged because I'm running my tests with data checksumming enabled.
> Then, when we go to read that data - with a buffered read, so page granularity -
> the read gets fragmented into multiple bios (because there's multiple extents),
> where the two bios are adjacent (and pointing to the same page!) and adjacent on
> disk - thus when they're issued you get sub page size segments from two different
> bios that are able to be merged, which is otherwise a highly unusual situation.

That means the situation of merging two bios, but can't explain why this
commit is wrong.

Last time we had one bug report under this kind of situation, and the test case
in following link can trigger merging two bios always:

http://marc.info/?l=linux-scsi&m=144881373723594&w=2

And finally a88d32af(blk-merge: fix computing bio->bi_seg_front_size in
case of single segment) is figured out for fixing the bug.

Today I have run this test case again, and looks everything is fine.

>
> I'm not about to write a test case for you though, it's your job to figure out
> how to test your code.
>
> Also note that it is entirely possible that the segment counting itself is
> correct with your patch, and the bug is just that segments aren't getting merged
> that the segment counting assumed would be. If that is the case then your patch
> merely exposed the bug, but your patch still needs to be reverted in the
> meantime.


Thanks,
Ming Lei

2016-03-12 13:48:56

by Kent Overstreet

[permalink] [raw]
Subject: Re: e827091cb1 "block: merge: get the 1st and last bvec via helpers" broken

On Sat, Mar 12, 2016 at 09:33:18PM +0800, Ming Lei wrote:
> On Sat, Mar 12, 2016 at 8:12 PM, Kent Overstreet
> <[email protected]> wrote:
> > On Sat, Mar 12, 2016 at 06:36:56PM +0800, Ming Lei wrote:
> >> Hi Kent,
> >>
> >> On Sat, Mar 12, 2016 at 5:24 PM, Kent Overstreet
> >> <[email protected]> wrote:
> >> > On Sat, Mar 12, 2016 at 04:49:41PM +0800, Ming Lei wrote:
> >> >> On Sat, Mar 12, 2016 at 3:43 PM, Kent Overstreet
> >> >> <[email protected]> wrote:
> >> >> > I don't know exactly how it's broken, but with that patch segment counting is
> >> >> > broken - I'm seeing blk_rq_map_sg() overrun the end of the sgtable.
> >> >> >
> >> >> > I suggest reverting it for 4.5...
> >> >>
> >> >> Kent, could you share your test case? I'd like to figure out the root cause.
> >> >
> >> > xfstest 036 on bcachefs.
> >>
> >> Given bcachefs isn't merged, could you provide one way to reproduce it
> >> with clean upstream kernel?
> >>
> >> >
> >> >> BTW, I don't object to revert it given it is close to v4.5 release, but I am
> >> >> curious how it breaks segment couting.
> >> >
> >> > If you want to debug your version (personally I'd just revert to the simpler
> >> > one), I'd start by having your helper use both methods to calculate the last
> >> > biovec, and then assert that they're equal.
> >> >
> >> > Also make sure you're testing with a sub-page sized blocksize, if filesystem
> >> > blocksize == page size you're not going to be testing the interesting cases
> >>
> >> I just run xfstests 036 over bcache and md, with block size 1024/2048, with
> >> xfs/ext4/btrfs, looks the segment counting issue can't be reproduced.
> >>
> >> If the issue can only be reproduced with bcachefs, I suggest we don't revert
> >> it until the root cause is figured out.
> >
> > Ming, if blk_rq_map_sg() is overrunning arrays and corrupting memory that's a
> > bug in your code - this is certainly a bug in the core block layer - and just
> > because you haven't been able to reproduce it yet does _not_ mean that no one
> > else will hit it.
>
> I just mean given bcachefs isn't merged yet, then maybe no one can be effected,
> so we have time to figure out the root cause.

No - we're at rc7, and there is no good reason to think this is bcachefs
specific - when you introduce a regression into mainline the responsible thing
to do is revert first, then debug. Especially this close to release - also,
why'd this go in after the merge window, Jens? blk-merge.c related stuff is
notoriously fragile...

> I will try your bcachefs to see if I can reproduce it, could you share me
> one tree so that I can test bcachefs?

sigh, I'll debug it tomorrow.

bcachefs instructions are at the top of the bcache website, but you'll need some
minor patches to xfstests. easier for me to just do it.

>
> >
> > I think I know why bcachefs hits it - that particular test is doing DIO writes
> > of sub page granularity, creating extents that are logically adjacent and
> > adjacent on disk (because they were written one right after the other), but that
> > don't get merged because I'm running my tests with data checksumming enabled.
> > Then, when we go to read that data - with a buffered read, so page granularity -
> > the read gets fragmented into multiple bios (because there's multiple extents),
> > where the two bios are adjacent (and pointing to the same page!) and adjacent on
> > disk - thus when they're issued you get sub page size segments from two different
> > bios that are able to be merged, which is otherwise a highly unusual situation.
>
> That means the situation of merging two bios, but can't explain why this
> commit is wrong.
>
> Last time we had one bug report under this kind of situation, and the test case
> in following link can trigger merging two bios always:
>
> http://marc.info/?l=linux-scsi&m=144881373723594&w=2
>
> And finally a88d32af(blk-merge: fix computing bio->bi_seg_front_size in
> case of single segment) is figured out for fixing the bug.
>
> Today I have run this test case again, and looks everything is fine.
>
> >
> > I'm not about to write a test case for you though, it's your job to figure out
> > how to test your code.
> >
> > Also note that it is entirely possible that the segment counting itself is
> > correct with your patch, and the bug is just that segments aren't getting merged
> > that the segment counting assumed would be. If that is the case then your patch
> > merely exposed the bug, but your patch still needs to be reverted in the
> > meantime.
>
>
> Thanks,
> Ming Lei

2016-03-12 14:03:07

by Kent Overstreet

[permalink] [raw]
Subject: Re: e827091cb1 "block: merge: get the 1st and last bvec via helpers" broken

Here's the output of the patch below:

generic/036 11s ...run fstests generic/036 at 2016-03-12 13:58:21
end 4096 0 ffffea0001d611c0 end2 1024 0 ffffea0001d611c0
len 1024 offset 0 page ffffea0001d611c0
KGDB: Waiting for remote debugger

Your code gives a biovec with bv_len of 4096, the old code gives a biovec with
bv_len of 1024 (and then we dump every biovec, we see that the bio had only a
single biovec that did indeed have bv_len == 1024).

bio_get_last_bvec() is broken, please revert your patch for v4.5.

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 261353166d..231aec4fa3 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -318,6 +318,34 @@ static int blk_phys_contig_segment(struct request_queue *q, struct bio *bio,
bio_get_last_bvec(bio, &end_bv);
bio_get_first_bvec(nxt, &nxt_bv);

+ {
+ struct bvec_iter iter;
+ struct bio_vec end2, bv;
+
+ bio_for_each_segment(end2, bio, iter)
+ if (end2.bv_len == iter.bi_size)
+ break;
+
+ if (end_bv.bv_len != end2.bv_len ||
+ end_bv.bv_offset != end2.bv_offset ||
+ end_bv.bv_page != end2.bv_page) {
+ printk(KERN_ERR "end %u %u %p end2 %u %u %p\n",
+ end_bv.bv_len,
+ end_bv.bv_offset,
+ end_bv.bv_page,
+ end2.bv_len,
+ end2.bv_offset,
+ end2.bv_page);
+
+ bio_for_each_segment(bv, bio, iter)
+ printk(KERN_ERR "len %u offset %u page %p\n",
+ bv.bv_len,
+ bv.bv_offset,
+ bv.bv_page);
+ BUG();
+ }
+ }
+
if (!BIOVEC_PHYS_MERGEABLE(&end_bv, &nxt_bv))
return 0;


2016-03-12 14:26:12

by Ming Lei

[permalink] [raw]
Subject: Re: e827091cb1 "block: merge: get the 1st and last bvec via helpers" broken

On Sat, 12 Mar 2016 05:02:56 -0900
Kent Overstreet <[email protected]> wrote:

> Here's the output of the patch below:
>
> generic/036 11s ...run fstests generic/036 at 2016-03-12 13:58:21
> end 4096 0 ffffea0001d611c0 end2 1024 0 ffffea0001d611c0
> len 1024 offset 0 page ffffea0001d611c0
> KGDB: Waiting for remote debugger
>
> Your code gives a biovec with bv_len of 4096, the old code gives a biovec with
> bv_len of 1024 (and then we dump every biovec, we see that the bio had only a
> single biovec that did indeed have bv_len == 1024).

I guess we shouldn't have optimized for the case of non-cloned bio, could you
try the following patch?

--
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 1e7248f..4abc129 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -267,11 +267,6 @@ static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv)
struct bvec_iter iter = bio->bi_iter;
int idx;

- if (!bio_flagged(bio, BIO_CLONED)) {
- *bv = bio->bi_io_vec[bio->bi_vcnt - 1];
- return;
- }
-
if (unlikely(!bio_multiple_segments(bio))) {
*bv = bio_iovec(bio);
return;

Thanks,
Ming

>
> bio_get_last_bvec() is broken, please revert your patch for v4.5.
>
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index 261353166d..231aec4fa3 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -318,6 +318,34 @@ static int blk_phys_contig_segment(struct request_queue *q, struct bio *bio,
> bio_get_last_bvec(bio, &end_bv);
> bio_get_first_bvec(nxt, &nxt_bv);
>
> + {
> + struct bvec_iter iter;
> + struct bio_vec end2, bv;
> +
> + bio_for_each_segment(end2, bio, iter)
> + if (end2.bv_len == iter.bi_size)
> + break;
> +
> + if (end_bv.bv_len != end2.bv_len ||
> + end_bv.bv_offset != end2.bv_offset ||
> + end_bv.bv_page != end2.bv_page) {
> + printk(KERN_ERR "end %u %u %p end2 %u %u %p\n",
> + end_bv.bv_len,
> + end_bv.bv_offset,
> + end_bv.bv_page,
> + end2.bv_len,
> + end2.bv_offset,
> + end2.bv_page);
> +
> + bio_for_each_segment(bv, bio, iter)
> + printk(KERN_ERR "len %u offset %u page %p\n",
> + bv.bv_len,
> + bv.bv_offset,
> + bv.bv_page);
> + BUG();
> + }
> + }
> +
> if (!BIOVEC_PHYS_MERGEABLE(&end_bv, &nxt_bv))
> return 0;
>

2016-03-12 14:36:22

by Kent Overstreet

[permalink] [raw]
Subject: Re: e827091cb1 "block: merge: get the 1st and last bvec via helpers" broken

On Sat, Mar 12, 2016 at 10:25:48PM +0800, Ming Lei wrote:
> On Sat, 12 Mar 2016 05:02:56 -0900
> Kent Overstreet <[email protected]> wrote:
>
> > Here's the output of the patch below:
> >
> > generic/036 11s ...run fstests generic/036 at 2016-03-12 13:58:21
> > end 4096 0 ffffea0001d611c0 end2 1024 0 ffffea0001d611c0
> > len 1024 offset 0 page ffffea0001d611c0
> > KGDB: Waiting for remote debugger
> >
> > Your code gives a biovec with bv_len of 4096, the old code gives a biovec with
> > bv_len of 1024 (and then we dump every biovec, we see that the bio had only a
> > single biovec that did indeed have bv_len == 1024).
>
> I guess we shouldn't have optimized for the case of non-cloned bio, could you
> try the following patch?
>
> --
> diff --git a/include/linux/bio.h b/include/linux/bio.h
> index 1e7248f..4abc129 100644
> --- a/include/linux/bio.h
> +++ b/include/linux/bio.h
> @@ -267,11 +267,6 @@ static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv)
> struct bvec_iter iter = bio->bi_iter;
> int idx;
>
> - if (!bio_flagged(bio, BIO_CLONED)) {
> - *bv = bio->bi_io_vec[bio->bi_vcnt - 1];
> - return;
> - }
> -
> if (unlikely(!bio_multiple_segments(bio))) {
> *bv = bio_iovec(bio);
> return;
>
> Thanks,
> Ming

Yes, that's it.

!BIO_CLONED is _not_ a guarantee that bi_size doesn't straddle the middle of a
bvec - bcachefs was hitting this by bouncing a bio that had already been split
(which can happen elsewhere in the kernel...) but there's other (perfectly
legal) ways it can happen.

I would still strongly suggest reverting the patch for 4.5 and resubmitting
during the next merge window.

2016-03-12 14:40:07

by Ming Lei

[permalink] [raw]
Subject: Re: e827091cb1 "block: merge: get the 1st and last bvec via helpers" broken

On Sat, Mar 12, 2016 at 10:36 PM, Kent Overstreet
<[email protected]> wrote:
> On Sat, Mar 12, 2016 at 10:25:48PM +0800, Ming Lei wrote:
>> On Sat, 12 Mar 2016 05:02:56 -0900
>> Kent Overstreet <[email protected]> wrote:
>>
>> > Here's the output of the patch below:
>> >
>> > generic/036 11s ...run fstests generic/036 at 2016-03-12 13:58:21
>> > end 4096 0 ffffea0001d611c0 end2 1024 0 ffffea0001d611c0
>> > len 1024 offset 0 page ffffea0001d611c0
>> > KGDB: Waiting for remote debugger
>> >
>> > Your code gives a biovec with bv_len of 4096, the old code gives a biovec with
>> > bv_len of 1024 (and then we dump every biovec, we see that the bio had only a
>> > single biovec that did indeed have bv_len == 1024).
>>
>> I guess we shouldn't have optimized for the case of non-cloned bio, could you
>> try the following patch?
>>
>> --
>> diff --git a/include/linux/bio.h b/include/linux/bio.h
>> index 1e7248f..4abc129 100644
>> --- a/include/linux/bio.h
>> +++ b/include/linux/bio.h
>> @@ -267,11 +267,6 @@ static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv)
>> struct bvec_iter iter = bio->bi_iter;
>> int idx;
>>
>> - if (!bio_flagged(bio, BIO_CLONED)) {
>> - *bv = bio->bi_io_vec[bio->bi_vcnt - 1];
>> - return;
>> - }
>> -
>> if (unlikely(!bio_multiple_segments(bio))) {
>> *bv = bio_iovec(bio);
>> return;
>>
>> Thanks,
>> Ming
>
> Yes, that's it.
>
> !BIO_CLONED is _not_ a guarantee that bi_size doesn't straddle the middle of a
> bvec - bcachefs was hitting this by bouncing a bio that had already been split
> (which can happen elsewhere in the kernel...) but there's other (perfectly
> legal) ways it can happen.

Exactly!

>
> I would still strongly suggest reverting the patch for 4.5 and resubmitting
> during the next merge window.

I am fine with either way, and I will prepare one patch and let Jens
decide.


thanks,
Ming Lei

2016-03-12 19:51:20

by Linus Torvalds

[permalink] [raw]
Subject: Re: e827091cb1 "block: merge: get the 1st and last bvec via helpers" broken

On Sat, Mar 12, 2016 at 6:39 AM, Ming Lei <[email protected]> wrote:
>
> I am fine with either way, and I will prepare one patch and let Jens
> decide.

So guys, this needs to be done *now*.

And Jens - this is the last time I believe you when you say late
patches are required.

The buggy patch that introduced this problem was part of that very
late pull request that I already rejected once, and you then claimed
was absolutely required.

So the dicking around with the block layer stops *now*.

Seriously. I'm pissed off.

I don't want to see anything even half-way questionable during the
whole next release window. Not even during the merge window. You need
to do some serious quality control, and re-think the whole "large
changes" model.

Linus

2016-03-12 21:14:47

by Jens Axboe

[permalink] [raw]
Subject: Re: e827091cb1 "block: merge: get the 1st and last bvec via helpers" broken

On 03/12/2016 12:51 PM, Linus Torvalds wrote:
> On Sat, Mar 12, 2016 at 6:39 AM, Ming Lei <[email protected]> wrote:
>>
>> I am fine with either way, and I will prepare one patch and let Jens
>> decide.
>
> So guys, this needs to be done *now*.
>
> And Jens - this is the last time I believe you when you say late
> patches are required.
>
> The buggy patch that introduced this problem was part of that very
> late pull request that I already rejected once, and you then claimed
> was absolutely required.
>
> So the dicking around with the block layer stops *now*.
>
> Seriously. I'm pissed off.

Believe me, I'm as impressed as you are...

I've queued it up and will run it through testing and send it off later.
I still think it's better to apply the fix, rather than revert the
original change.

> I don't want to see anything even half-way questionable during the
> whole next release window. Not even during the merge window. You need
> to do some serious quality control, and re-think the whole "large
> changes" model.

The timing was the issue here, and yeah, it didn't work out well this
time at all. It's a momentary lapse, we'll get it sorted for sure.

--
Jens Axboe