2008-12-04 14:26:39

by Alan D. Brunelle

[permalink] [raw]
Subject: kernel BUG at block/blk-timeout.c:178!

(I'll be checking into reproducibility &c., but for now...)

Using git tree as of:

commit 6a1214113090905aca6a492fc8ef10d84c608a69
Merge: c07f62e... 2642b11...
Author: Linus Torvalds <[email protected]>
Date: Mon Dec 1 09:34:23 2008 -0800

Merge branch 'for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/ie

* 'for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linu
ieee1394: sbp2: fix race condition in state change
ieee1394: fix list corruption (reported at module removal)
firewire: fw-sbp2: another iPod mini quirk entry
ieee1394: sbp2: another iPod mini quirk entry

I just got whilst booting (16-way AMD64, 126GB RAM, 3@CCISS, 4@QLA):

------------[ cut here ]------------
kernel BUG at block/blk-timeout.c:178!
invalid opcode: 0000 [#1] SMP
last sysfs file:
CPU 0
Pid: 268, comm: scsi_wq_0 Not tainted 2.6.28-rc6 #13
RIP: 0010:[<ffffffff804e8d4d>] [<ffffffff804e8d4d>] blk_add_timer+0xdd/0xf0
RSP: 0018:ffff88087b6c3830 EFLAGS: 00010002
RAX: ffff88087cdf4408 RBX: ffff88087cded670 RCX: ffff88087cdf4408
RDX: ffff88087cded670 RSI: ffff88087cdf42c8 RDI: ffff88087cdf42c8
RBP: ffff88087b6c3840 R08: 0000000000000020 R09: ffff88087cded670
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000020
R13: ffff88087cdf42c8 R14: 0000000000000020 R15: 0000000000000008
FS: 0000000000000000(0000) GS:ffffffff80f76980(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process scsi_wq_0 (pid: 268, threadinfo ffff88087b6c2000, task
ffff88087bf706c0)
Stack:
ffff88087cded670 ffff88087b6cf460 ffff88087b6c3850 ffffffff804e2308
ffff88087b6c3890 ffffffff804e5566 ffff88087cded670 ffff88087cded670
ffff88087cdf1800 ffff88087cded670 ffff88087cded670 ffff88087cdf42c8
Call Trace:
[<ffffffff804e2308>] elv_dequeue_request+0x48/0x60
[<ffffffff804e5566>] blk_queue_start_tag+0x96/0x130
[<ffffffff8060c900>] scsi_request_fn+0x120/0x5a0
[<ffffffff804e47db>] __generic_unplug_device+0x2b/0x40
[<ffffffff804e7d56>] blk_execute_rq_nowait+0x66/0xb0
[<ffffffff804e7dfd>] blk_execute_rq+0x5d/0xb0
[<ffffffff80308c80>] ? sysfs_ilookup_test+0x0/0x20
[<ffffffff8063011e>] hp_sw_tur+0xbe/0x230
[<ffffffff8062ecc6>] ? scsi_dh_handler_lookup+0x56/0xb0
[<ffffffff806305c8>] hp_sw_bus_attach+0x58/0x160
[<ffffffff8062eba2>] scsi_dh_handler_attach+0x32/0x40
[<ffffffff8062f0da>] scsi_dh_notifier+0x6a/0xa0
[<ffffffff8025f3df>] notifier_call_chain+0x3f/0x80
[<ffffffff8025f7f8>] __blocking_notifier_call_chain+0x58/0x80
[<ffffffff8025f831>] blocking_notifier_call_chain+0x11/0x20
[<ffffffff805b4703>] device_add+0x133/0x650
[<ffffffff805ba50c>] ? attribute_container_device_trigger+0xbc/0xd0
[<ffffffff80612944>] scsi_sysfs_add_sdev+0x64/0x280
[<ffffffff80610379>] scsi_probe_and_add_lun+0xdb9/0xde0
[<ffffffff80610a07>] __scsi_scan_target+0x4d7/0x6f0
[<ffffffff806112e0>] scsi_scan_target+0xc0/0xd0
[<ffffffff8061c2f7>] fc_scsi_scan_rport+0xb7/0xc0
[<ffffffff8061c240>] ? fc_scsi_scan_rport+0x0/0xc0
[<ffffffff80256c70>] run_workqueue+0x70/0x120
[<ffffffff80256d80>] worker_thread+0x60/0xb0
[<ffffffff8025aca0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff80256d20>] ? worker_thread+0x0/0xb0
[<ffffffff8025a8e9>] kthread+0x49/0x80
[<ffffffff8020d269>] child_rip+0xa/0x11
[<ffffffff8025a8a0>] ? kthread+0x0/0x80
[<ffffffff8020d25f>] ? child_rip+0x0/0x11
Code: 48 8d bb 28 03 00 00 48 89 c6 e8 af 6f d6 ff 48 83 c4 08 5b c9 c3
48 8b 15 81 cb a9 00 89 c0 48 01 d0 48 89 87 38 01 00 00 eb 90 <0f> 0b
eb fe 0f 1f 80 00 00 00 00 0f 0b eb fe 0f 1f 40 00 55 48
RIP [<ffffffff804e8d4d>] blk_add_timer+0xdd/0xf0
RSP <ffff88087b6c3830>
---[ end trace 1d1b7db6f1a8b5dd ]---


2008-12-04 15:25:59

by Alan D. Brunelle

[permalink] [raw]
Subject: Re: kernel BUG at block/blk-timeout.c:178!

(First: It seems to be easily reproduced - 3 for 3.)

Interestingly, if I add a test forward :

diff --git a/block/elevator.c b/block/elevator.c
index 9ac82dd..c2c8342 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -586,6 +586,7 @@ void elv_insert(struct request_queue *q, struct
request *rq,
unsigned ordseq;
int unplug_it = 1;

+ BUG_ON(test_bit(REQ_ATOM_COMPLETE, &req->atomic_flags));
blk_add_trace_rq(q, rq, BLK_TA_INSERT);

rq->q = q;

the problem moves with it:

------------[ cut here ]------------
kernel BUG at block/elevator.c:589!
invalid opcode: 0000 [#1] SMP
last sysfs file:
CPU 0
Pid: 268, comm: scsi_wq_0 Not tainted 2.6.28-rc6 #14
RIP: 0010:[<ffffffff804e278e>] [<ffffffff804e278e>] elv_insert+0x25e/0x270
RSP: 0018:ffff88087cd4f8c0 EFLAGS: 00010002
RAX: 0000000000000000 RBX: ffff88087b6d52c8 RCX: 00000000fffee229
RDX: 0000000000000001 RSI: ffff88087b6d52c8 RDI: ffff88087cd57670
RBP: ffff88087cd4f8e0 R08: 0000000000000000 R09: ffff88087cd57670
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88087cd57670
R13: ffff88087cd57670 R14: ffff88087cdf5920 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffffffff80f76980(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process scsi_wq_0 (pid: 268, threadinfo ffff88087cd4e000, task
ffff88087ba9c040)
Stack:
ffff88087b6d52c8 ffff88087cd57670 0000000000000001 ffff88087cdf5920
ffff88087cd4f910 ffffffff804e281b 0000000000200200 0000000000000001
ffff88087b6d52c8 ffff88087cd57670 ffff88087cd4f940 ffffffff804e7d5e
Call Trace:
[<ffffffff804e281b>] __elv_add_request+0x7b/0xd0
[<ffffffff804e7d5e>] blk_execute_rq_nowait+0x5e/0xb0
[<ffffffff804e7e0d>] blk_execute_rq+0x5d/0xb0
[<ffffffff80308c80>] ? sysfs_ilookup_test+0x0/0x20
[<ffffffff8063012e>] hp_sw_tur+0xbe/0x230
[<ffffffff8062ecd6>] ? scsi_dh_handler_lookup+0x56/0xb0
[<ffffffff806305d8>] hp_sw_bus_attach+0x58/0x160
[<ffffffff8062ebb2>] scsi_dh_handler_attach+0x32/0x40
[<ffffffff8062f0ea>] scsi_dh_notifier+0x6a/0xa0
[<ffffffff8025f3df>] notifier_call_chain+0x3f/0x80
[<ffffffff8025f7f8>] __blocking_notifier_call_chain+0x58/0x80
[<ffffffff8025f831>] blocking_notifier_call_chain+0x11/0x20
[<ffffffff805b4713>] device_add+0x133/0x650
[<ffffffff805ba51c>] ? attribute_container_device_trigger+0xbc/0xd0
[<ffffffff80612954>] scsi_sysfs_add_sdev+0x64/0x280
[<ffffffff80610389>] scsi_probe_and_add_lun+0xdb9/0xde0
[<ffffffff80610a17>] __scsi_scan_target+0x4d7/0x6f0
[<ffffffff806112f0>] scsi_scan_target+0xc0/0xd0
[<ffffffff8061c307>] fc_scsi_scan_rport+0xb7/0xc0
[<ffffffff8061c250>] ? fc_scsi_scan_rport+0x0/0xc0
[<ffffffff80256c70>] run_workqueue+0x70/0x120
[<ffffffff80256d80>] worker_thread+0x60/0xb0
[<ffffffff8025aca0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff80256d20>] ? worker_thread+0x0/0xb0
[<ffffffff8025a8e9>] kthread+0x49/0x80
[<ffffffff8020d269>] child_rip+0xa/0x11
[<ffffffff8025a8a0>] ? kthread+0x0/0x80
[<ffffffff8020d25f>] ? child_rip+0x0/0x11
Code: 89 ef 0f 1f 00 e8 53 20 00 00 0f 1f 00 e9 29 fe ff ff 49 8b 44 24
08 4c 89 23 49 89 5c 24 08 48 89 18 48 89 43 08 e9 10 fe ff ff <0f> 0b
eb fe 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89
RIP [<ffffffff804e278e>] elv_insert+0x25e/0x270
RSP <ffff88087cd4f8c0>
---[ end trace 0bd30bab78e129e5 ]---


2008-12-04 15:50:25

by Jens Axboe

[permalink] [raw]
Subject: Re: kernel BUG at block/blk-timeout.c:178!

On Thu, Dec 04 2008, Alan D. Brunelle wrote:
> (I'll be checking into reproducibility &c., but for now...)
>
> Using git tree as of:
>
> commit 6a1214113090905aca6a492fc8ef10d84c608a69
> Merge: c07f62e... 2642b11...
> Author: Linus Torvalds <[email protected]>
> Date: Mon Dec 1 09:34:23 2008 -0800
>
> Merge branch 'for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/ie
>
> * 'for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linu
> ieee1394: sbp2: fix race condition in state change
> ieee1394: fix list corruption (reported at module removal)
> firewire: fw-sbp2: another iPod mini quirk entry
> ieee1394: sbp2: another iPod mini quirk entry
>
> I just got whilst booting (16-way AMD64, 126GB RAM, 3@CCISS, 4@QLA):
>
> ------------[ cut here ]------------
> kernel BUG at block/blk-timeout.c:178!
> invalid opcode: 0000 [#1] SMP
> last sysfs file:
> CPU 0
> Pid: 268, comm: scsi_wq_0 Not tainted 2.6.28-rc6 #13
> RIP: 0010:[<ffffffff804e8d4d>] [<ffffffff804e8d4d>] blk_add_timer+0xdd/0xf0
> RSP: 0018:ffff88087b6c3830 EFLAGS: 00010002
> RAX: ffff88087cdf4408 RBX: ffff88087cded670 RCX: ffff88087cdf4408
> RDX: ffff88087cded670 RSI: ffff88087cdf42c8 RDI: ffff88087cdf42c8
> RBP: ffff88087b6c3840 R08: 0000000000000020 R09: ffff88087cded670
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000020
> R13: ffff88087cdf42c8 R14: 0000000000000020 R15: 0000000000000008
> FS: 0000000000000000(0000) GS:ffffffff80f76980(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process scsi_wq_0 (pid: 268, threadinfo ffff88087b6c2000, task
> ffff88087bf706c0)
> Stack:
> ffff88087cded670 ffff88087b6cf460 ffff88087b6c3850 ffffffff804e2308
> ffff88087b6c3890 ffffffff804e5566 ffff88087cded670 ffff88087cded670
> ffff88087cdf1800 ffff88087cded670 ffff88087cded670 ffff88087cdf42c8
> Call Trace:
> [<ffffffff804e2308>] elv_dequeue_request+0x48/0x60
> [<ffffffff804e5566>] blk_queue_start_tag+0x96/0x130
> [<ffffffff8060c900>] scsi_request_fn+0x120/0x5a0
> [<ffffffff804e47db>] __generic_unplug_device+0x2b/0x40
> [<ffffffff804e7d56>] blk_execute_rq_nowait+0x66/0xb0
> [<ffffffff804e7dfd>] blk_execute_rq+0x5d/0xb0
> [<ffffffff80308c80>] ? sysfs_ilookup_test+0x0/0x20
> [<ffffffff8063011e>] hp_sw_tur+0xbe/0x230
> [<ffffffff8062ecc6>] ? scsi_dh_handler_lookup+0x56/0xb0
> [<ffffffff806305c8>] hp_sw_bus_attach+0x58/0x160
> [<ffffffff8062eba2>] scsi_dh_handler_attach+0x32/0x40
> [<ffffffff8062f0da>] scsi_dh_notifier+0x6a/0xa0
> [<ffffffff8025f3df>] notifier_call_chain+0x3f/0x80
> [<ffffffff8025f7f8>] __blocking_notifier_call_chain+0x58/0x80
> [<ffffffff8025f831>] blocking_notifier_call_chain+0x11/0x20
> [<ffffffff805b4703>] device_add+0x133/0x650
> [<ffffffff805ba50c>] ? attribute_container_device_trigger+0xbc/0xd0
> [<ffffffff80612944>] scsi_sysfs_add_sdev+0x64/0x280
> [<ffffffff80610379>] scsi_probe_and_add_lun+0xdb9/0xde0
> [<ffffffff80610a07>] __scsi_scan_target+0x4d7/0x6f0
> [<ffffffff806112e0>] scsi_scan_target+0xc0/0xd0
> [<ffffffff8061c2f7>] fc_scsi_scan_rport+0xb7/0xc0
> [<ffffffff8061c240>] ? fc_scsi_scan_rport+0x0/0xc0
> [<ffffffff80256c70>] run_workqueue+0x70/0x120
> [<ffffffff80256d80>] worker_thread+0x60/0xb0
> [<ffffffff8025aca0>] ? autoremove_wake_function+0x0/0x40
> [<ffffffff80256d20>] ? worker_thread+0x0/0xb0
> [<ffffffff8025a8e9>] kthread+0x49/0x80
> [<ffffffff8020d269>] child_rip+0xa/0x11
> [<ffffffff8025a8a0>] ? kthread+0x0/0x80
> [<ffffffff8020d25f>] ? child_rip+0x0/0x11
> Code: 48 8d bb 28 03 00 00 48 89 c6 e8 af 6f d6 ff 48 83 c4 08 5b c9 c3
> 48 8b 15 81 cb a9 00 89 c0 48 01 d0 48 89 87 38 01 00 00 eb 90 <0f> 0b
> eb fe 0f 1f 80 00 00 00 00 0f 0b eb fe 0f 1f 40 00 55 48
> RIP [<ffffffff804e8d4d>] blk_add_timer+0xdd/0xf0
> RSP <ffff88087b6c3830>
> ---[ end trace 1d1b7db6f1a8b5dd ]---

Alan, can you try latest -git? feaf3848a813a106f163013af6fcf6c4bfec92d9
or later.

--
Jens Axboe

2008-12-04 16:13:49

by Alan D. Brunelle

[permalink] [raw]
Subject: Re: kernel BUG at block/blk-timeout.c:178!

Jens Axboe wrote:

>
> Alan, can you try latest -git? feaf3848a813a106f163013af6fcf6c4bfec92d9
> or later.
>

git pull()ed to: feaf3848a813a106f163013af6fcf6c4bfec92d9 and the same
problem occurs.

Alan

2008-12-04 18:31:44

by Alan D. Brunelle

[permalink] [raw]
Subject: Re: kernel BUG at block/blk-timeout.c:178!

Alan D. Brunelle wrote:
> Jens Axboe wrote:
>
>> Alan, can you try latest -git? feaf3848a813a106f163013af6fcf6c4bfec92d9
>> or later.
>>
>
> git pull()ed to: feaf3848a813a106f163013af6fcf6c4bfec92d9 and the same
> problem occurs.

Maybe not - I've not been to reproduce that problem in subsequent
reboots. It could be that I booted the wrong kernel first time (rc6
instead of rc7). Will keep plugging - any idea as to what might have
"fixed" the problem between rc6 & rc7?

Alan

2008-12-04 21:07:15

by Alan D. Brunelle

[permalink] [raw]
Subject: Re: kernel BUG at block/blk-timeout.c:178!

Alan D. Brunelle wrote:
> Alan D. Brunelle wrote:
>> Jens Axboe wrote:
>>
>>> Alan, can you try latest -git? feaf3848a813a106f163013af6fcf6c4bfec92d9
>>> or later.
>>>
>> git pull()ed to: feaf3848a813a106f163013af6fcf6c4bfec92d9 and the same
>> problem occurs.
>
> Maybe not - I've not been to reproduce that problem in subsequent
> reboots. It could be that I booted the wrong kernel first time (rc6
> instead of rc7). Will keep plugging - any idea as to what might have
> "fixed" the problem between rc6 & rc7?
>
> Alan

It's back - just not as easily reproduced as before.

I'm concerned over this piece of code:

/*
* hp_sw_tur - Send TEST UNIT READY
* @sdev: sdev command should be sent to
*
* Use the TEST UNIT READY command to determine
* the path state.
*/
static int hp_sw_tur(struct scsi_device *sdev, struct hp_sw_dh_data *h)
{
struct request *req;
int ret;

req = blk_get_request(sdev->request_queue, WRITE, GFP_NOIO);
if (!req)
return SCSI_DH_RES_TEMP_UNAVAIL;

req->cmd_type = REQ_TYPE_BLOCK_PC;
req->cmd_flags |= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
REQ_FAILFAST_DRIVER;
req->cmd_len = COMMAND_SIZE(TEST_UNIT_READY);
req->cmd[0] = TEST_UNIT_READY;
req->timeout = HP_SW_TIMEOUT;
req->sense = h->sense;
memset(req->sense, 0, SCSI_SENSE_BUFFERSIZE);
req->sense_len = 0;

retry:
ret = blk_execute_rq(req->q, NULL, req, 1);
if (ret == -EIO) {
if (req->sense_len > 0) {
ret = tur_done(sdev, h->sense);
} else {
sdev_printk(KERN_WARNING, sdev,
"%s: sending tur failed with %x\n",
HP_SW_NAME, req->errors);
ret = SCSI_DH_IO;
}
} else {
h->path_state = HP_SW_PATH_ACTIVE;
ret = SCSI_DH_OK;
}
if (ret == SCSI_DH_IMM_RETRY)
goto retry;
if (ret == SCSI_DH_DEV_OFFLINED) {
h->path_state = HP_SW_PATH_PASSIVE;
ret = SCSI_DH_OK;
}

blk_put_request(req);

return ret;
}

I've pushed the BUG ON check into blk_execute_rq, and it's finding it
set there. Could we be getting SCSI_DH_IMM_RETRYs and that's causing the
same request to be used without being re-initialized, and on error the
bit is not being cleaned up properly?

I'm checking that out next...

Alan

2008-12-05 09:40:29

by Jens Axboe

[permalink] [raw]
Subject: Re: kernel BUG at block/blk-timeout.c:178!

On Thu, Dec 04 2008, Alan D. Brunelle wrote:
> Alan D. Brunelle wrote:
> > Alan D. Brunelle wrote:
> >> Jens Axboe wrote:
> >>
> >>> Alan, can you try latest -git? feaf3848a813a106f163013af6fcf6c4bfec92d9
> >>> or later.
> >>>
> >> git pull()ed to: feaf3848a813a106f163013af6fcf6c4bfec92d9 and the same
> >> problem occurs.
> >
> > Maybe not - I've not been to reproduce that problem in subsequent
> > reboots. It could be that I booted the wrong kernel first time (rc6
> > instead of rc7). Will keep plugging - any idea as to what might have
> > "fixed" the problem between rc6 & rc7?
> >
> > Alan
>
> It's back - just not as easily reproduced as before.
>
> I'm concerned over this piece of code:
>
> /*
> * hp_sw_tur - Send TEST UNIT READY
> * @sdev: sdev command should be sent to
> *
> * Use the TEST UNIT READY command to determine
> * the path state.
> */
> static int hp_sw_tur(struct scsi_device *sdev, struct hp_sw_dh_data *h)
> {
> struct request *req;
> int ret;
>
> req = blk_get_request(sdev->request_queue, WRITE, GFP_NOIO);
> if (!req)
> return SCSI_DH_RES_TEMP_UNAVAIL;
>
> req->cmd_type = REQ_TYPE_BLOCK_PC;
> req->cmd_flags |= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
> REQ_FAILFAST_DRIVER;
> req->cmd_len = COMMAND_SIZE(TEST_UNIT_READY);
> req->cmd[0] = TEST_UNIT_READY;
> req->timeout = HP_SW_TIMEOUT;
> req->sense = h->sense;
> memset(req->sense, 0, SCSI_SENSE_BUFFERSIZE);
> req->sense_len = 0;
>
> retry:
> ret = blk_execute_rq(req->q, NULL, req, 1);
> if (ret == -EIO) {
> if (req->sense_len > 0) {
> ret = tur_done(sdev, h->sense);
> } else {
> sdev_printk(KERN_WARNING, sdev,
> "%s: sending tur failed with %x\n",
> HP_SW_NAME, req->errors);
> ret = SCSI_DH_IO;
> }
> } else {
> h->path_state = HP_SW_PATH_ACTIVE;
> ret = SCSI_DH_OK;
> }
> if (ret == SCSI_DH_IMM_RETRY)
> goto retry;
> if (ret == SCSI_DH_DEV_OFFLINED) {
> h->path_state = HP_SW_PATH_PASSIVE;
> ret = SCSI_DH_OK;
> }
>
> blk_put_request(req);
>
> return ret;
> }
>
> I've pushed the BUG ON check into blk_execute_rq, and it's finding it
> set there. Could we be getting SCSI_DH_IMM_RETRYs and that's causing the
> same request to be used without being re-initialized, and on error the
> bit is not being cleaned up properly?
>
> I'm checking that out next...

That does indeed look problematic, we only init the timer stuff when
getting the request initially. So you could either make your retry loop
do blk_put_request() and jump to the very beginning again, or this
should fix the current usage.

diff --git a/block/elevator.c b/block/elevator.c
index a6951f7..0a2f378 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -590,6 +590,12 @@ void elv_insert(struct request_queue *q, struct request *rq, int where)

rq->q = q;

+ /*
+ * This could happen on a request requeue, init the timer here as well
+ */
+ blk_delete_timer(rq);
+ blk_clear_rq_complete(rq);
+
switch (where) {
case ELEVATOR_INSERT_FRONT:
rq->cmd_flags |= REQ_SOFTBARRIER;

--
Jens Axboe

2008-12-05 13:32:52

by Alan D. Brunelle

[permalink] [raw]
Subject: Re: kernel BUG at block/blk-timeout.c:178!

Jens Axboe wrote:
>>
>> I've pushed the BUG ON check into blk_execute_rq, and it's finding it
>> set there. Could we be getting SCSI_DH_IMM_RETRYs and that's causing the
>> same request to be used without being re-initialized, and on error the
>> bit is not being cleaned up properly?
>>
>> I'm checking that out next...
>
> That does indeed look problematic, we only init the timer stuff when
> getting the request initially. So you could either make your retry loop
> do blk_put_request() and jump to the very beginning again, or this
> should fix the current usage.
>
> diff --git a/block/elevator.c b/block/elevator.c
> index a6951f7..0a2f378 100644
> --- a/block/elevator.c
> +++ b/block/elevator.c
> @@ -590,6 +590,12 @@ void elv_insert(struct request_queue *q, struct request *rq, int where)
>
> rq->q = q;
>
> + /*
> + * This could happen on a request requeue, init the timer here as well
> + */
> + blk_delete_timer(rq);
> + blk_clear_rq_complete(rq);
> +
> switch (where) {
> case ELEVATOR_INSERT_FRONT:
> rq->cmd_flags |= REQ_SOFTBARRIER;
>

Hi Jens -

I was able to determine we were getting retries on TURs, and then right
(soon?) thereafter was when we triggered the problem. I put some
slightly different code than what you suggest above, and that triggered
a problem elsewhere in SCSI. I backed out what I did, inserted your code
(after updating my tree to:)

commit bbeba4c35c252b2e961f09ce6ebe76b2cd5e7e3e
Merge: 6df944c... 2cbed89...
Author: Linus Torvalds <[email protected]>
Date: Thu Dec 4 21:45:44 2008 -0800

Merge branch 'for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/vi

* 'for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev:
[PATCH] fix bogus argument of blkdev_put() in pktcdvd
[PATCH 2/2] documnt FMODE_ constants
[PATCH 1/2] kill FMODE_NDELAY_NOW
[PATCH] clean up blkdev_get a little bit
[PATCH] Fix block dev compat ioctl handling
[PATCH] kill obsolete temporary comment in swsusp_close()

and I get the following SCSI-related problem (the same error I got with
code I tried last night - basically just clearing out the atomic value).
This has now happened repeatedly 3 times. [I've cc'd linux-scsi as well,
just to make sure to cover all the bases...]

------------[ cut here ]------------
kernel BUG at drivers/scsi/scsi.c:347!
invalid opcode: 0000 [#1] SMP
last sysfs file:
CPU 0
Pid: 0, comm: swapper Not tainted 2.6.28-rc7 #7
RIP: 0010:[<ffffffff80d3df3e>] [<ffffffff80d3df3e>]
scsi_put_command+0x27/0x65
RSP: 0018:ffffffff827b3d80 EFLAGS: 00010046
RAX: 0000000000000282 RBX: ffff88087b68dec0 RCX: ffff88087b68dec8
RDX: ffff88087b68dec8 RSI: 0000000000000282 RDI: ffff88087b35d834
RBP: ffffffff827b3d90 R08: ffffffff827b3d30 R09: 00000000827b3d60
R10: 0000000000000246 R11: ffff88087b16042c R12: ffff88087b35d800
R13: ffff88087b35d920 R14: 0000000000000000 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffffffff82535500(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffffffff82548000, task ffffffff82049380)
Stack:
ffff88087b68dec0 ffff88087b36b230 ffffffff827b3dc0 ffffffff80d43793
0000000000000000 ffff88087b68dec0 ffff88087b35b658 0000000000000000
ffffffff827b3e00 ffffffff80d43980 ffff88087b36b230 ffff88087b35b658
Call Trace:
<IRQ> <0> [<ffffffff80d43793>] scsi_next_command+0x2e/0x46
[<ffffffff80d43980>] scsi_end_request+0x92/0xa4
[<ffffffff80d43fe3>] scsi_io_completion+0x1a7/0x3ad
[<ffffffff80d3d926>] scsi_finish_command+0xe9/0xf2
[<ffffffff80d444d5>] scsi_softirq_done+0x10c/0x115
[<ffffffff8062afeb>] blk_done_softirq+0x77/0x87
[<ffffffff802628a0>] ? ktime_get_ts+0x49/0x4e
[<ffffffff802514b8>] __do_softirq+0x8a/0x151
[<ffffffff8256b140>] ? early_idt_handler+0x0/0x72
[<ffffffff8020dbec>] call_softirq+0x1c/0x28
[<ffffffff8020f32c>] do_softirq+0x44/0x8b
[<ffffffff802511bf>] irq_exit+0x3f/0x82
[<ffffffff8020f5f0>] do_IRQ+0xc3/0xe3
[<ffffffff8020c983>] ret_from_intr+0x0/0x29
<EOI> <0> [<ffffffff80228446>] ? native_safe_halt+0x6/0x8
[<ffffffff8021406e>] ? default_idle+0x3c/0x59
[<ffffffff802141b0>] ? c1e_idle+0x117/0x11e
[<ffffffff8026332d>] ? atomic_notifier_call_chain+0x13/0x15
[<ffffffff8020b4ea>] ? cpu_idle+0x51/0x92
[<ffffffff815be32d>] ? rest_init+0x61/0x63
[<ffffffff8256bd6b>] ? start_kernel+0x39c/0x3a7
[<ffffffff8256b29f>] ? x86_64_start_reservations+0xa5/0xa9
[<ffffffff8256b3f0>] ? x86_64_start_kernel+0x12a/0x139
Code: 41 5e c9 c3 55 48 89 e5 41 54 53 4c 8b 27 48 89 fb 49 8d 7c 24 34
e8 69 56 98 00 48 8b 53 08 48 8d 4b 08 48 89 c6 48 39 ca 75 04 <0f> 0b
eb fe 48 8b 43 10 48 89 42 08 48 89 10 48 8b 3b 48 89 4b
RIP [<ffffffff80d3df3e>] scsi_put_command+0x27/0x65
RSP <ffffffff827b3d80>



I'll post my .config & full dump a little later.

Alan

2008-12-05 13:35:44

by Jens Axboe

[permalink] [raw]
Subject: Re: kernel BUG at block/blk-timeout.c:178!

On Fri, Dec 05 2008, Alan D. Brunelle wrote:
> Jens Axboe wrote:
> >>
> >> I've pushed the BUG ON check into blk_execute_rq, and it's finding it
> >> set there. Could we be getting SCSI_DH_IMM_RETRYs and that's causing the
> >> same request to be used without being re-initialized, and on error the
> >> bit is not being cleaned up properly?
> >>
> >> I'm checking that out next...
> >
> > That does indeed look problematic, we only init the timer stuff when
> > getting the request initially. So you could either make your retry loop
> > do blk_put_request() and jump to the very beginning again, or this
> > should fix the current usage.
> >
> > diff --git a/block/elevator.c b/block/elevator.c
> > index a6951f7..0a2f378 100644
> > --- a/block/elevator.c
> > +++ b/block/elevator.c
> > @@ -590,6 +590,12 @@ void elv_insert(struct request_queue *q, struct request *rq, int where)
> >
> > rq->q = q;
> >
> > + /*
> > + * This could happen on a request requeue, init the timer here as well
> > + */
> > + blk_delete_timer(rq);
> > + blk_clear_rq_complete(rq);
> > +
> > switch (where) {
> > case ELEVATOR_INSERT_FRONT:
> > rq->cmd_flags |= REQ_SOFTBARRIER;
> >
>
> Hi Jens -
>
> I was able to determine we were getting retries on TURs, and then right
> (soon?) thereafter was when we triggered the problem. I put some
> slightly different code than what you suggest above, and that triggered
> a problem elsewhere in SCSI. I backed out what I did, inserted your code
> (after updating my tree to:)
>
> commit bbeba4c35c252b2e961f09ce6ebe76b2cd5e7e3e
> Merge: 6df944c... 2cbed89...
> Author: Linus Torvalds <[email protected]>
> Date: Thu Dec 4 21:45:44 2008 -0800
>
> Merge branch 'for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/vi
>
> * 'for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev:
> [PATCH] fix bogus argument of blkdev_put() in pktcdvd
> [PATCH 2/2] documnt FMODE_ constants
> [PATCH 1/2] kill FMODE_NDELAY_NOW
> [PATCH] clean up blkdev_get a little bit
> [PATCH] Fix block dev compat ioctl handling
> [PATCH] kill obsolete temporary comment in swsusp_close()
>
> and I get the following SCSI-related problem (the same error I got with
> code I tried last night - basically just clearing out the atomic value).
> This has now happened repeatedly 3 times. [I've cc'd linux-scsi as well,
> just to make sure to cover all the bases...]
>
> ------------[ cut here ]------------
> kernel BUG at drivers/scsi/scsi.c:347!
> invalid opcode: 0000 [#1] SMP
> last sysfs file:
> CPU 0
> Pid: 0, comm: swapper Not tainted 2.6.28-rc7 #7
> RIP: 0010:[<ffffffff80d3df3e>] [<ffffffff80d3df3e>]
> scsi_put_command+0x27/0x65
> RSP: 0018:ffffffff827b3d80 EFLAGS: 00010046
> RAX: 0000000000000282 RBX: ffff88087b68dec0 RCX: ffff88087b68dec8
> RDX: ffff88087b68dec8 RSI: 0000000000000282 RDI: ffff88087b35d834
> RBP: ffffffff827b3d90 R08: ffffffff827b3d30 R09: 00000000827b3d60
> R10: 0000000000000246 R11: ffff88087b16042c R12: ffff88087b35d800
> R13: ffff88087b35d920 R14: 0000000000000000 R15: 0000000000000001
> FS: 0000000000000000(0000) GS:ffffffff82535500(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper (pid: 0, threadinfo ffffffff82548000, task ffffffff82049380)
> Stack:
> ffff88087b68dec0 ffff88087b36b230 ffffffff827b3dc0 ffffffff80d43793
> 0000000000000000 ffff88087b68dec0 ffff88087b35b658 0000000000000000
> ffffffff827b3e00 ffffffff80d43980 ffff88087b36b230 ffff88087b35b658
> Call Trace:
> <IRQ> <0> [<ffffffff80d43793>] scsi_next_command+0x2e/0x46
> [<ffffffff80d43980>] scsi_end_request+0x92/0xa4
> [<ffffffff80d43fe3>] scsi_io_completion+0x1a7/0x3ad
> [<ffffffff80d3d926>] scsi_finish_command+0xe9/0xf2
> [<ffffffff80d444d5>] scsi_softirq_done+0x10c/0x115
> [<ffffffff8062afeb>] blk_done_softirq+0x77/0x87
> [<ffffffff802628a0>] ? ktime_get_ts+0x49/0x4e
> [<ffffffff802514b8>] __do_softirq+0x8a/0x151
> [<ffffffff8256b140>] ? early_idt_handler+0x0/0x72
> [<ffffffff8020dbec>] call_softirq+0x1c/0x28
> [<ffffffff8020f32c>] do_softirq+0x44/0x8b
> [<ffffffff802511bf>] irq_exit+0x3f/0x82
> [<ffffffff8020f5f0>] do_IRQ+0xc3/0xe3
> [<ffffffff8020c983>] ret_from_intr+0x0/0x29
> <EOI> <0> [<ffffffff80228446>] ? native_safe_halt+0x6/0x8
> [<ffffffff8021406e>] ? default_idle+0x3c/0x59
> [<ffffffff802141b0>] ? c1e_idle+0x117/0x11e
> [<ffffffff8026332d>] ? atomic_notifier_call_chain+0x13/0x15
> [<ffffffff8020b4ea>] ? cpu_idle+0x51/0x92
> [<ffffffff815be32d>] ? rest_init+0x61/0x63
> [<ffffffff8256bd6b>] ? start_kernel+0x39c/0x3a7
> [<ffffffff8256b29f>] ? x86_64_start_reservations+0xa5/0xa9
> [<ffffffff8256b3f0>] ? x86_64_start_kernel+0x12a/0x139
> Code: 41 5e c9 c3 55 48 89 e5 41 54 53 4c 8b 27 48 89 fb 49 8d 7c 24 34
> e8 69 56 98 00 48 8b 53 08 48 8d 4b 08 48 89 c6 48 39 ca 75 04 <0f> 0b
> eb fe 48 8b 43 10 48 89 42 08 48 89 10 48 8b 3b 48 89 4b
> RIP [<ffffffff80d3df3e>] scsi_put_command+0x27/0x65
> RSP <ffffffff827b3d80>
>

I haven't audited everything for that path yet, but I don't think others
do retries like you suggest here. I'd strongly encourage you to change
the retry logic to really put and get a new request, instead of relying
on all state being clean for a reissue.

--
Jens Axboe