2013-05-17 21:32:46

by Calvin Owens

[permalink] [raw]
Subject: [PATCH] md: Partially revert 2f6db2a7, which broke raid5

Commit 2f6db2a7 was part of a series that cleaned up mdraid code by
replacing explicit re-initialization of struct bio with bio_reset().

In raid5 it incorrectly assumed that a couple initializations of its
members was a full reset, erasing the existing data and unconditionally
triggering the following BUG when assembling arrays:

[ 14.653072] kernel BUG at /home/calvinow/git/linux/drivers/scsi/scsi_lib.c:1196!
[ 14.653074] invalid opcode: 0000 [#1] PREEMPT SMP
[ 14.653076] CPU: 1 PID: 40 Comm: kworker/1:0H Not tainted 3.10.0-rc1-amd-00279-g8f710dd #3
[ 14.653077] Hardware name: System manufacturer System Product Name/M5A88-M, BIOS 0601 09/20/2011
[ 14.653082] Workqueue: kblockd blk_delay_work
<snip>
[ 14.653123] Call Trace:
[ 14.653126] [<ffffffff81477248>] sd_prep_fn+0x2c8/0xb70
[ 14.653129] [<ffffffff812c8b70>] ? deadline_remove_request.isra.9+0x50/0x90
[ 14.653132] [<ffffffff812b8f5b>] blk_peek_request+0xdb/0x210
[ 14.653134] [<ffffffff81465f15>] scsi_request_fn+0x45/0x4e0
[ 14.653136] [<ffffffff812b6a51>] __blk_run_queue+0x31/0x40
[ 14.653138] [<ffffffff812b6a84>] blk_delay_work+0x24/0x40
[ 14.653141] [<ffffffff8105dc2a>] process_one_work+0x1da/0x490
[ 14.653143] [<ffffffff8105dbcd>] ? process_one_work+0x17d/0x490
[ 14.653145] [<ffffffff8105e32a>] worker_thread+0x11a/0x370
[ 14.653147] [<ffffffff8105e210>] ? rescuer_thread+0x2f0/0x2f0
[ 14.653149] [<ffffffff81066296>] kthread+0xd6/0xe0
[ 14.653151] [<ffffffff810661c0>] ? __kthread_unpark+0x50/0x50
[ 14.653154] [<ffffffff816e4d6c>] ret_from_fork+0x7c/0xb0
[ 14.653156] [<ffffffff810661c0>] ? __kthread_unpark+0x50/0x50
[ 14.653172] Code: <snip>
[ 14.653174] RIP [<ffffffff81467329>] scsi_setup_fs_cmnd+0x89/0x90

Signed-off-by: Calvin Owens <[email protected]>
---
drivers/md/raid5.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 9359828..97be03f 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -643,7 +643,6 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)

set_bit(STRIPE_IO_STARTED, &sh->state);

- bio_reset(bi);
bi->bi_bdev = rdev->bdev;
bi->bi_rw = rw;
bi->bi_end_io = (rw & WRITE)
@@ -664,9 +663,12 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
if (test_bit(R5_ReadNoMerge, &sh->dev[i].flags))
bi->bi_rw |= REQ_FLUSH;

+ bi->bi_flags = 1 << BIO_UPTODATE;
+ bi->bi_idx = 0;
bi->bi_io_vec[0].bv_len = STRIPE_SIZE;
bi->bi_io_vec[0].bv_offset = 0;
bi->bi_size = STRIPE_SIZE;
+ bi->bi_next = NULL;
if (rrdev)
set_bit(R5_DOUBLE_LOCKED, &sh->dev[i].flags);

@@ -683,7 +685,6 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)

set_bit(STRIPE_IO_STARTED, &sh->state);

- bio_reset(rbi);
rbi->bi_bdev = rrdev->bdev;
rbi->bi_rw = rw;
BUG_ON(!(rw & WRITE));
--
1.8.2.1


2013-05-18 07:05:26

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] md: Partially revert 2f6db2a7, which broke raid5

On Fri, May 17 2013, Calvin Owens wrote:
> Commit 2f6db2a7 was part of a series that cleaned up mdraid code by
> replacing explicit re-initialization of struct bio with bio_reset().
>
> In raid5 it incorrectly assumed that a couple initializations of its
> members was a full reset, erasing the existing data and unconditionally
> triggering the following BUG when assembling arrays:
>
> [ 14.653072] kernel BUG at /home/calvinow/git/linux/drivers/scsi/scsi_lib.c:1196!
> [ 14.653074] invalid opcode: 0000 [#1] PREEMPT SMP
> [ 14.653076] CPU: 1 PID: 40 Comm: kworker/1:0H Not tainted 3.10.0-rc1-amd-00279-g8f710dd #3
> [ 14.653077] Hardware name: System manufacturer System Product Name/M5A88-M, BIOS 0601 09/20/2011
> [ 14.653082] Workqueue: kblockd blk_delay_work
> <snip>
> [ 14.653123] Call Trace:
> [ 14.653126] [<ffffffff81477248>] sd_prep_fn+0x2c8/0xb70
> [ 14.653129] [<ffffffff812c8b70>] ? deadline_remove_request.isra.9+0x50/0x90
> [ 14.653132] [<ffffffff812b8f5b>] blk_peek_request+0xdb/0x210
> [ 14.653134] [<ffffffff81465f15>] scsi_request_fn+0x45/0x4e0
> [ 14.653136] [<ffffffff812b6a51>] __blk_run_queue+0x31/0x40
> [ 14.653138] [<ffffffff812b6a84>] blk_delay_work+0x24/0x40
> [ 14.653141] [<ffffffff8105dc2a>] process_one_work+0x1da/0x490
> [ 14.653143] [<ffffffff8105dbcd>] ? process_one_work+0x17d/0x490
> [ 14.653145] [<ffffffff8105e32a>] worker_thread+0x11a/0x370
> [ 14.653147] [<ffffffff8105e210>] ? rescuer_thread+0x2f0/0x2f0
> [ 14.653149] [<ffffffff81066296>] kthread+0xd6/0xe0
> [ 14.653151] [<ffffffff810661c0>] ? __kthread_unpark+0x50/0x50
> [ 14.653154] [<ffffffff816e4d6c>] ret_from_fork+0x7c/0xb0
> [ 14.653156] [<ffffffff810661c0>] ? __kthread_unpark+0x50/0x50
> [ 14.653172] Code: <snip>
> [ 14.653174] RIP [<ffffffff81467329>] scsi_setup_fs_cmnd+0x89/0x90
>
> Signed-off-by: Calvin Owens <[email protected]>

Kent, there was a report on this issue yesterday as well. We need to get
this fixed up ASAP.

--
Jens Axboe

2013-05-19 17:52:24

by Kent Overstreet

[permalink] [raw]
Subject: Re: [PATCH] md: Partially revert 2f6db2a7, which broke raid5

On Sat, May 18, 2013 at 09:05:15AM +0200, Jens Axboe wrote:
> On Fri, May 17 2013, Calvin Owens wrote:
> > Commit 2f6db2a7 was part of a series that cleaned up mdraid code by
> > replacing explicit re-initialization of struct bio with bio_reset().
> >
> > In raid5 it incorrectly assumed that a couple initializations of its
> > members was a full reset, erasing the existing data and unconditionally
> > triggering the following BUG when assembling arrays:
> >
> > [ 14.653072] kernel BUG at /home/calvinow/git/linux/drivers/scsi/scsi_lib.c:1196!
> > [ 14.653074] invalid opcode: 0000 [#1] PREEMPT SMP
> > [ 14.653076] CPU: 1 PID: 40 Comm: kworker/1:0H Not tainted 3.10.0-rc1-amd-00279-g8f710dd #3
> > [ 14.653077] Hardware name: System manufacturer System Product Name/M5A88-M, BIOS 0601 09/20/2011
> > [ 14.653082] Workqueue: kblockd blk_delay_work
> > <snip>
> > [ 14.653123] Call Trace:
> > [ 14.653126] [<ffffffff81477248>] sd_prep_fn+0x2c8/0xb70
> > [ 14.653129] [<ffffffff812c8b70>] ? deadline_remove_request.isra.9+0x50/0x90
> > [ 14.653132] [<ffffffff812b8f5b>] blk_peek_request+0xdb/0x210
> > [ 14.653134] [<ffffffff81465f15>] scsi_request_fn+0x45/0x4e0
> > [ 14.653136] [<ffffffff812b6a51>] __blk_run_queue+0x31/0x40
> > [ 14.653138] [<ffffffff812b6a84>] blk_delay_work+0x24/0x40
> > [ 14.653141] [<ffffffff8105dc2a>] process_one_work+0x1da/0x490
> > [ 14.653143] [<ffffffff8105dbcd>] ? process_one_work+0x17d/0x490
> > [ 14.653145] [<ffffffff8105e32a>] worker_thread+0x11a/0x370
> > [ 14.653147] [<ffffffff8105e210>] ? rescuer_thread+0x2f0/0x2f0
> > [ 14.653149] [<ffffffff81066296>] kthread+0xd6/0xe0
> > [ 14.653151] [<ffffffff810661c0>] ? __kthread_unpark+0x50/0x50
> > [ 14.653154] [<ffffffff816e4d6c>] ret_from_fork+0x7c/0xb0
> > [ 14.653156] [<ffffffff810661c0>] ? __kthread_unpark+0x50/0x50
> > [ 14.653172] Code: <snip>
> > [ 14.653174] RIP [<ffffffff81467329>] scsi_setup_fs_cmnd+0x89/0x90
> >
> > Signed-off-by: Calvin Owens <[email protected]>
>
> Kent, there was a report on this issue yesterday as well. We need to get
> this fixed up ASAP.

Sorry for the delay - been vacationing. Reproduced the original bug,
here's a patch that fixes it:


commit 402f5db3708b2062795a384a3d8397cf702e27bc
Author: Kent Overstreet <[email protected]>
Date: Sun May 19 10:27:07 2013 -0700

raid5: Initialize bi_vcnt

The patch that converted raid5 to use bio_reset() forgot to initialize
bi_vcnt.

Signed-off-by: Kent Overstreet <[email protected]>
Cc: NeilBrown <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: [email protected]

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 9359828..753f318 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -664,6 +664,7 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
if (test_bit(R5_ReadNoMerge, &sh->dev[i].flags))
bi->bi_rw |= REQ_FLUSH;

+ bi->bi_vcnt = 1;
bi->bi_io_vec[0].bv_len = STRIPE_SIZE;
bi->bi_io_vec[0].bv_offset = 0;
bi->bi_size = STRIPE_SIZE;
@@ -701,6 +702,7 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
else
rbi->bi_sector = (sh->sector
+ rrdev->data_offset);
+ rbi->bi_vcnt = 1;
rbi->bi_io_vec[0].bv_len = STRIPE_SIZE;
rbi->bi_io_vec[0].bv_offset = 0;
rbi->bi_size = STRIPE_SIZE;

2013-05-28 16:03:53

by Ilia Mirkin

[permalink] [raw]
Subject: Re: [PATCH] md: Partially revert 2f6db2a7, which broke raid5

On Sun, May 19, 2013 at 1:51 PM, Kent Overstreet <[email protected]> wrote:
> On Sat, May 18, 2013 at 09:05:15AM +0200, Jens Axboe wrote:
>> On Fri, May 17 2013, Calvin Owens wrote:
>> > Commit 2f6db2a7 was part of a series that cleaned up mdraid code by
>> > replacing explicit re-initialization of struct bio with bio_reset().
>> >
>> > In raid5 it incorrectly assumed that a couple initializations of its
>> > members was a full reset, erasing the existing data and unconditionally
>> > triggering the following BUG when assembling arrays:
>> >
>> > [ 14.653072] kernel BUG at /home/calvinow/git/linux/drivers/scsi/scsi_lib.c:1196!
>> > [ 14.653074] invalid opcode: 0000 [#1] PREEMPT SMP
>> > [ 14.653076] CPU: 1 PID: 40 Comm: kworker/1:0H Not tainted 3.10.0-rc1-amd-00279-g8f710dd #3
>> > [ 14.653077] Hardware name: System manufacturer System Product Name/M5A88-M, BIOS 0601 09/20/2011
>> > [ 14.653082] Workqueue: kblockd blk_delay_work
>> > <snip>
>> > [ 14.653123] Call Trace:
>> > [ 14.653126] [<ffffffff81477248>] sd_prep_fn+0x2c8/0xb70
>> > [ 14.653129] [<ffffffff812c8b70>] ? deadline_remove_request.isra.9+0x50/0x90
>> > [ 14.653132] [<ffffffff812b8f5b>] blk_peek_request+0xdb/0x210
>> > [ 14.653134] [<ffffffff81465f15>] scsi_request_fn+0x45/0x4e0
>> > [ 14.653136] [<ffffffff812b6a51>] __blk_run_queue+0x31/0x40
>> > [ 14.653138] [<ffffffff812b6a84>] blk_delay_work+0x24/0x40
>> > [ 14.653141] [<ffffffff8105dc2a>] process_one_work+0x1da/0x490
>> > [ 14.653143] [<ffffffff8105dbcd>] ? process_one_work+0x17d/0x490
>> > [ 14.653145] [<ffffffff8105e32a>] worker_thread+0x11a/0x370
>> > [ 14.653147] [<ffffffff8105e210>] ? rescuer_thread+0x2f0/0x2f0
>> > [ 14.653149] [<ffffffff81066296>] kthread+0xd6/0xe0
>> > [ 14.653151] [<ffffffff810661c0>] ? __kthread_unpark+0x50/0x50
>> > [ 14.653154] [<ffffffff816e4d6c>] ret_from_fork+0x7c/0xb0
>> > [ 14.653156] [<ffffffff810661c0>] ? __kthread_unpark+0x50/0x50
>> > [ 14.653172] Code: <snip>
>> > [ 14.653174] RIP [<ffffffff81467329>] scsi_setup_fs_cmnd+0x89/0x90
>> >
>> > Signed-off-by: Calvin Owens <[email protected]>
>>
>> Kent, there was a report on this issue yesterday as well. We need to get
>> this fixed up ASAP.
>
> Sorry for the delay - been vacationing. Reproduced the original bug,
> here's a patch that fixes it:

I saw this issue as well, and your patch fixes it for me (feel free to
add my Tested-By if necessary). It didn't appear to be in v3.10-rc3,
or any relevant git repos I could find -- just want to make sure it
didn't get lost somewhere down the line, since otherwise my system
dies.

>
>
> commit 402f5db3708b2062795a384a3d8397cf702e27bc
> Author: Kent Overstreet <[email protected]>
> Date: Sun May 19 10:27:07 2013 -0700
>
> raid5: Initialize bi_vcnt
>
> The patch that converted raid5 to use bio_reset() forgot to initialize
> bi_vcnt.
>
> Signed-off-by: Kent Overstreet <[email protected]>
> Cc: NeilBrown <[email protected]>
> Cc: Jens Axboe <[email protected]>
> Cc: [email protected]
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 9359828..753f318 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -664,6 +664,7 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
> if (test_bit(R5_ReadNoMerge, &sh->dev[i].flags))
> bi->bi_rw |= REQ_FLUSH;
>
> + bi->bi_vcnt = 1;
> bi->bi_io_vec[0].bv_len = STRIPE_SIZE;
> bi->bi_io_vec[0].bv_offset = 0;
> bi->bi_size = STRIPE_SIZE;
> @@ -701,6 +702,7 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
> else
> rbi->bi_sector = (sh->sector
> + rrdev->data_offset);
> + rbi->bi_vcnt = 1;
> rbi->bi_io_vec[0].bv_len = STRIPE_SIZE;
> rbi->bi_io_vec[0].bv_offset = 0;
> rbi->bi_size = STRIPE_SIZE;
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2013-05-29 12:44:54

by Richard W.M. Jones

[permalink] [raw]
Subject: Re: [PATCH] md: Partially revert 2f6db2a7, which broke raid5


On Sun, May 19, 2013 at 10:51:45AM -0700, Kent Overstreet wrote:
> Sorry for the delay - been vacationing. Reproduced the original bug,
> here's a patch that fixes it:
>
>
> commit 402f5db3708b2062795a384a3d8397cf702e27bc
> Author: Kent Overstreet <[email protected]>
> Date: Sun May 19 10:27:07 2013 -0700
>
> raid5: Initialize bi_vcnt
>
> The patch that converted raid5 to use bio_reset() forgot to initialize
> bi_vcnt.
>
> Signed-off-by: Kent Overstreet <[email protected]>
> Cc: NeilBrown <[email protected]>
> Cc: Jens Axboe <[email protected]>
> Cc: [email protected]
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 9359828..753f318 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -664,6 +664,7 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
> if (test_bit(R5_ReadNoMerge, &sh->dev[i].flags))
> bi->bi_rw |= REQ_FLUSH;
>
> + bi->bi_vcnt = 1;
> bi->bi_io_vec[0].bv_len = STRIPE_SIZE;
> bi->bi_io_vec[0].bv_offset = 0;
> bi->bi_size = STRIPE_SIZE;
> @@ -701,6 +702,7 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
> else
> rbi->bi_sector = (sh->sector
> + rrdev->data_offset);
> + rbi->bi_vcnt = 1;
> rbi->bi_io_vec[0].bv_len = STRIPE_SIZE;
> rbi->bi_io_vec[0].bv_offset = 0;
> rbi->bi_size = STRIPE_SIZE;

Ditto to the previous follow up. We've been tracking this
bug for nearly a month:

https://bugzilla.redhat.com/show_bug.cgi?id=962079

Please include this (or the other) patch to fix it.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines. Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top

2013-05-29 13:04:18

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] md: Partially revert 2f6db2a7, which broke raid5

On Wed, May 29 2013, Richard W.M. Jones wrote:
>
> On Sun, May 19, 2013 at 10:51:45AM -0700, Kent Overstreet wrote:
> > Sorry for the delay - been vacationing. Reproduced the original bug,
> > here's a patch that fixes it:
> >
> >
> > commit 402f5db3708b2062795a384a3d8397cf702e27bc
> > Author: Kent Overstreet <[email protected]>
> > Date: Sun May 19 10:27:07 2013 -0700
> >
> > raid5: Initialize bi_vcnt
> >
> > The patch that converted raid5 to use bio_reset() forgot to initialize
> > bi_vcnt.
> >
> > Signed-off-by: Kent Overstreet <[email protected]>
> > Cc: NeilBrown <[email protected]>
> > Cc: Jens Axboe <[email protected]>
> > Cc: [email protected]
> >
> > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> > index 9359828..753f318 100644
> > --- a/drivers/md/raid5.c
> > +++ b/drivers/md/raid5.c
> > @@ -664,6 +664,7 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
> > if (test_bit(R5_ReadNoMerge, &sh->dev[i].flags))
> > bi->bi_rw |= REQ_FLUSH;
> >
> > + bi->bi_vcnt = 1;
> > bi->bi_io_vec[0].bv_len = STRIPE_SIZE;
> > bi->bi_io_vec[0].bv_offset = 0;
> > bi->bi_size = STRIPE_SIZE;
> > @@ -701,6 +702,7 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
> > else
> > rbi->bi_sector = (sh->sector
> > + rrdev->data_offset);
> > + rbi->bi_vcnt = 1;
> > rbi->bi_io_vec[0].bv_len = STRIPE_SIZE;
> > rbi->bi_io_vec[0].bv_offset = 0;
> > rbi->bi_size = STRIPE_SIZE;
>
> Ditto to the previous follow up. We've been tracking this
> bug for nearly a month:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=962079
>
> Please include this (or the other) patch to fix it.

I'm assuming both Kent and I are waiting for Neil to pick it up. Neil, I
can include this in my next round going upstream, just let me know. It
should have been sent upstream a while back, sorry guys.

--
Jens Axboe

2013-05-29 23:23:08

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH] md: Partially revert 2f6db2a7, which broke raid5

On Wed, 29 May 2013 15:03:40 +0200 Jens Axboe <[email protected]> wrote:

> On Wed, May 29 2013, Richard W.M. Jones wrote:
> >
> > On Sun, May 19, 2013 at 10:51:45AM -0700, Kent Overstreet wrote:
> > > Sorry for the delay - been vacationing. Reproduced the original bug,
> > > here's a patch that fixes it:
> > >
> > >
> > > commit 402f5db3708b2062795a384a3d8397cf702e27bc
> > > Author: Kent Overstreet <[email protected]>
> > > Date: Sun May 19 10:27:07 2013 -0700
> > >
> > > raid5: Initialize bi_vcnt
> > >
> > > The patch that converted raid5 to use bio_reset() forgot to initialize
> > > bi_vcnt.
> > >
> > > Signed-off-by: Kent Overstreet <[email protected]>
> > > Cc: NeilBrown <[email protected]>
> > > Cc: Jens Axboe <[email protected]>
> > > Cc: [email protected]
> > >
> > > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> > > index 9359828..753f318 100644
> > > --- a/drivers/md/raid5.c
> > > +++ b/drivers/md/raid5.c
> > > @@ -664,6 +664,7 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
> > > if (test_bit(R5_ReadNoMerge, &sh->dev[i].flags))
> > > bi->bi_rw |= REQ_FLUSH;
> > >
> > > + bi->bi_vcnt = 1;
> > > bi->bi_io_vec[0].bv_len = STRIPE_SIZE;
> > > bi->bi_io_vec[0].bv_offset = 0;
> > > bi->bi_size = STRIPE_SIZE;
> > > @@ -701,6 +702,7 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
> > > else
> > > rbi->bi_sector = (sh->sector
> > > + rrdev->data_offset);
> > > + rbi->bi_vcnt = 1;
> > > rbi->bi_io_vec[0].bv_len = STRIPE_SIZE;
> > > rbi->bi_io_vec[0].bv_offset = 0;
> > > rbi->bi_size = STRIPE_SIZE;
> >
> > Ditto to the previous follow up. We've been tracking this
> > bug for nearly a month:
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=962079
> >
> > Please include this (or the other) patch to fix it.
>
> I'm assuming both Kent and I are waiting for Neil to pick it up. Neil, I
> can include this in my next round going upstream, just let me know. It
> should have been sent upstream a while back, sorry guys.
>

Seems you were waiting for me, and I was waiting for you :-)

Yes: please include it with your next round. Thanks!

NeilBrown


Attachments:
signature.asc (828.00 B)

2013-05-30 06:43:06

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] md: Partially revert 2f6db2a7, which broke raid5

On Thu, May 30 2013, NeilBrown wrote:
> On Wed, 29 May 2013 15:03:40 +0200 Jens Axboe <[email protected]> wrote:
>
> > On Wed, May 29 2013, Richard W.M. Jones wrote:
> > >
> > > On Sun, May 19, 2013 at 10:51:45AM -0700, Kent Overstreet wrote:
> > > > Sorry for the delay - been vacationing. Reproduced the original bug,
> > > > here's a patch that fixes it:
> > > >
> > > >
> > > > commit 402f5db3708b2062795a384a3d8397cf702e27bc
> > > > Author: Kent Overstreet <[email protected]>
> > > > Date: Sun May 19 10:27:07 2013 -0700
> > > >
> > > > raid5: Initialize bi_vcnt
> > > >
> > > > The patch that converted raid5 to use bio_reset() forgot to initialize
> > > > bi_vcnt.
> > > >
> > > > Signed-off-by: Kent Overstreet <[email protected]>
> > > > Cc: NeilBrown <[email protected]>
> > > > Cc: Jens Axboe <[email protected]>
> > > > Cc: [email protected]
> > > >
> > > > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> > > > index 9359828..753f318 100644
> > > > --- a/drivers/md/raid5.c
> > > > +++ b/drivers/md/raid5.c
> > > > @@ -664,6 +664,7 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
> > > > if (test_bit(R5_ReadNoMerge, &sh->dev[i].flags))
> > > > bi->bi_rw |= REQ_FLUSH;
> > > >
> > > > + bi->bi_vcnt = 1;
> > > > bi->bi_io_vec[0].bv_len = STRIPE_SIZE;
> > > > bi->bi_io_vec[0].bv_offset = 0;
> > > > bi->bi_size = STRIPE_SIZE;
> > > > @@ -701,6 +702,7 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
> > > > else
> > > > rbi->bi_sector = (sh->sector
> > > > + rrdev->data_offset);
> > > > + rbi->bi_vcnt = 1;
> > > > rbi->bi_io_vec[0].bv_len = STRIPE_SIZE;
> > > > rbi->bi_io_vec[0].bv_offset = 0;
> > > > rbi->bi_size = STRIPE_SIZE;
> > >
> > > Ditto to the previous follow up. We've been tracking this
> > > bug for nearly a month:
> > >
> > > https://bugzilla.redhat.com/show_bug.cgi?id=962079
> > >
> > > Please include this (or the other) patch to fix it.
> >
> > I'm assuming both Kent and I are waiting for Neil to pick it up. Neil, I
> > can include this in my next round going upstream, just let me know. It
> > should have been sent upstream a while back, sorry guys.
> >
>
> Seems you were waiting for me, and I was waiting for you :-)
>
> Yes: please include it with your next round. Thanks!

Hah, in that case we could have waited for a long time! I'll add it to
the current mix, thanks.

--
Jens Axboe

2013-05-30 07:31:36

by Richard W.M. Jones

[permalink] [raw]
Subject: Re: [PATCH] md: Partially revert 2f6db2a7, which broke raid5

On Thu, May 30, 2013 at 09:22:43AM +1000, NeilBrown wrote:
> On Wed, 29 May 2013 15:03:40 +0200 Jens Axboe <[email protected]> wrote:
>
> > On Wed, May 29 2013, Richard W.M. Jones wrote:
> > >
> > > On Sun, May 19, 2013 at 10:51:45AM -0700, Kent Overstreet wrote:
> > > > Sorry for the delay - been vacationing. Reproduced the original bug,
> > > > here's a patch that fixes it:
> > > >
> > > >
> > > > commit 402f5db3708b2062795a384a3d8397cf702e27bc
> > > > Author: Kent Overstreet <[email protected]>
> > > > Date: Sun May 19 10:27:07 2013 -0700
> > > >
> > > > raid5: Initialize bi_vcnt
> > > >
> > > > The patch that converted raid5 to use bio_reset() forgot to initialize
> > > > bi_vcnt.
> > > >
> > > > Signed-off-by: Kent Overstreet <[email protected]>
> > > > Cc: NeilBrown <[email protected]>
> > > > Cc: Jens Axboe <[email protected]>
> > > > Cc: [email protected]
> > > >
> > > > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> > > > index 9359828..753f318 100644
> > > > --- a/drivers/md/raid5.c
> > > > +++ b/drivers/md/raid5.c
> > > > @@ -664,6 +664,7 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
> > > > if (test_bit(R5_ReadNoMerge, &sh->dev[i].flags))
> > > > bi->bi_rw |= REQ_FLUSH;
> > > >
> > > > + bi->bi_vcnt = 1;
> > > > bi->bi_io_vec[0].bv_len = STRIPE_SIZE;
> > > > bi->bi_io_vec[0].bv_offset = 0;
> > > > bi->bi_size = STRIPE_SIZE;
> > > > @@ -701,6 +702,7 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
> > > > else
> > > > rbi->bi_sector = (sh->sector
> > > > + rrdev->data_offset);
> > > > + rbi->bi_vcnt = 1;
> > > > rbi->bi_io_vec[0].bv_len = STRIPE_SIZE;
> > > > rbi->bi_io_vec[0].bv_offset = 0;
> > > > rbi->bi_size = STRIPE_SIZE;
> > >
> > > Ditto to the previous follow up. We've been tracking this
> > > bug for nearly a month:
> > >
> > > https://bugzilla.redhat.com/show_bug.cgi?id=962079
> > >
> > > Please include this (or the other) patch to fix it.
> >
> > I'm assuming both Kent and I are waiting for Neil to pick it up. Neil, I
> > can include this in my next round going upstream, just let me know. It
> > should have been sent upstream a while back, sorry guys.
> >
>
> Seems you were waiting for me, and I was waiting for you :-)
>
> Yes: please include it with your next round. Thanks!

BTW I tested this patch and it works, so:

Tested-by: Richard W.M. Jones <[email protected]>

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW