2005-01-11 05:24:00

by Srihari Vijayaraghavan

Subject: [PROBLEM] Badness in cfq_account_completion at drivers/block/cfq-iosched.c:916

I see zillions of these error messages in vanilla
2.6.10:
Jan 11 16:10:13 linux kernel: [<c0220ee0>] scsi_end_request+0xa9/0xd6
Jan 11 16:10:13 linux kernel: [<c0221181>] scsi_io_completion+0xee/0x48e
Jan 11 16:10:13 linux kernel: [<c028145f>] _spin_unlock_irqrestore+0x5/0x6
Jan 11 16:10:13 linux kernel: [<c01233d0>] __mod_timer+0x122/0x160
Jan 11 16:10:13 linux kernel: [<c01f187e>] i8042_interrupt+0x4b/0x19f
Jan 11 16:10:13 linux kernel: [<f881ca9c>] sd_rw_intr+0x52/0x2a3 [sd_mod]
Jan 11 16:10:14 linux kernel: [<c021d0e2>] scsi_finish_command+0x17/0xb2
Jan 11 16:10:15 linux kernel: [<c0123cc8>] run_timer_softirq+0x175/0x17d
Jan 11 16:10:15 linux kernel: [<c021d05e>] scsi_softirq+0x9c/0xd3
Jan 11 16:10:15 linux kernel: [<c011fde2>] __do_softirq+0x62/0xce
Jan 11 16:10:15 linux kernel: [<c0105112>] do_softirq+0x42/0x49
Jan 11 16:10:15 linux kernel: =======================
Jan 11 16:10:15 linux kernel: [<c0105022>] do_IRQ+0x42/0x54
Jan 11 16:10:16 linux kernel: [<c010376e>] common_interrupt+0x1a/0x20
Jan 11 16:10:16 linux kernel: [<c010101e>] default_idle+0x0/0x33
Jan 11 16:10:16 linux kernel: [<c0101047>] default_idle+0x29/0x33
Jan 11 16:10:16 linux kernel: [<c01010b2>] cpu_idle+0x2e/0x3e
Jan 11 16:10:16 linux kernel: [<c03388ff>] start_kernel+0x18d/0x1c9
Jan 11 16:10:16 linux kernel: [<c0338345>] unknown_bootoption+0x0/0x1bc
Jan 11 16:10:16 linux kernel: Badness in cfq_account_completion at drivers/block/cfq-iosched.c:916
Jan 11 16:10:16 linux kernel: [<c020566a>] cfq_completed_request+0xf6/0xfe
Jan 11 16:10:16 linux kernel: [<c01ffbff>] __blk_put_request+0x4d/0x90
Jan 11 16:10:16 linux kernel: [<c0200e7e>] end_that_request_last+0xb5/0xe3
Jan 11 16:10:16 linux kernel: [<c0220ee0>] scsi_end_request+0xa9/0xd6
Jan 11 16:10:16 linux kernel: [<c0221181>] scsi_io_completion+0xee/0x48e
Jan 11 16:10:16 linux kernel: [<c028145f>] _spin_unlock_irqrestore+0x5/0x6
Jan 11 16:10:16 linux kernel: [<c01233d0>] __mod_timer+0x122/0x160
Jan 11 16:10:16 linux kernel: [<c01f187e>] i8042_interrupt+0x4b/0x19f
Jan 11 16:10:16 linux kernel: [<f881ca9c>] sd_rw_intr+0x52/0x2a3 [sd_mod]
Jan 11 16:10:16 linux kernel: [<c021d0e2>] scsi_finish_command+0x17/0xb2
Jan 11 16:10:16 linux kernel: [<c0123cc8>] run_timer_softirq+0x175/0x17d
Jan 11 16:10:16 linux kernel: [<c021d05e>] scsi_softirq+0x9c/0xd3
Jan 11 16:10:16 linux kernel: [<c011fde2>] __do_softirq+0x62/0xce
Jan 11 16:10:16 linux kernel: [<c0105112>] do_softirq+0x42/0x49

The machine is an IBM x360 with 2 Xeon CPUs (HT), 4 GB of RAM,
and 70+ GB of hardware RAID on an IBM ServeRAID controller.

Thanks
Hari

PS: Please cc me on replies, as I am not subscribed to
LKML. I will check whether the same error messages
appear in -bk<latest>.




2005-01-11 09:04:34

by Jens Axboe

Subject: Re: [PROBLEM] Badness in cfq_account_completion at drivers/block/cfq-iosched.c:916

On Tue, Jan 11 2005, Srihari Vijayaraghavan wrote:
> I see zillion of these error messages in vanilla
> 2.6.10:
> Jan 11 16:10:13 linux kernel: [<c0220ee0>]
> scsi_end_request+0xa9/0xd6

[snip]

Does this fix it?

===== drivers/block/cfq-iosched.c 1.17 vs edited =====
--- 1.17/drivers/block/cfq-iosched.c	2004-12-24 09:12:58 +01:00
+++ edited/drivers/block/cfq-iosched.c	2005-01-11 10:03:17 +01:00
@@ -622,8 +622,10 @@
 			cfq_sort_rr_list(cfqq, 0);
 		}
 
-		crq->accounted = 0;
-		cfqq->cfqd->rq_in_driver--;
+		if (crq->accounted) {
+			crq->accounted = 0;
+			cfqq->cfqd->rq_in_driver--;
+		}
 	}
 	list_add(&rq->queuelist, &q->queue_head);
 }

--
Jens Axboe
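
[Editor's sketch, not part of the original message.] The patch above makes the decrement of rq_in_driver conditional on crq->accounted, so a request that was never counted is not subtracted from the counter when it is put back on the queue. The user-space model below illustrates that guard only. The struct and function names (sched_data, sched_rq, account_dispatch, account_requeue) are hypothetical stand-ins and are not the kernel's definitions; how this relates to the exact check at cfq-iosched.c:916 is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the scheduler state; NOT the kernel structs. */
struct sched_data {
	int rq_in_driver;	/* requests currently counted as in the driver */
};

struct sched_rq {
	struct sched_data *sd;
	bool accounted;		/* true once this request has been counted */
};

/* Model of handing a request to the driver: count it exactly once. */
static void account_dispatch(struct sched_rq *rq)
{
	if (!rq->accounted) {
		rq->accounted = true;
		rq->sd->rq_in_driver++;
	}
}

/*
 * Model of putting a request back on the queue. As in the patch, the
 * counter is only decremented when the request was actually accounted,
 * so an unaccounted request cannot drive the counter negative.
 */
static void account_requeue(struct sched_rq *rq)
{
	if (rq->accounted) {
		assert(rq->sd->rq_in_driver > 0);
		rq->accounted = false;
		rq->sd->rq_in_driver--;
	}
}
```

With the guard in place, calling account_requeue() on a request that was never dispatched, or calling it twice in a row, leaves the counter unchanged.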

2005-01-12 01:36:18

by Srihari Vijayaraghavan

Subject: Re: [PROBLEM] Badness in cfq_account_completion at drivers/block/cfq-iosched.c:916

--- Jens Axboe <[email protected]> wrote:
> ...
> Does this fix it?
>
> ===== drivers/block/cfq-iosched.c 1.17 vs edited
> =====
> --- 1.17/drivers/block/cfq-iosched.c 2004-12-24
> 09:12:58 +01:00
> +++ edited/drivers/block/cfq-iosched.c 2005-01-11
> 10:03:17 +01:00
> @@ -622,8 +622,10 @@
> cfq_sort_rr_list(cfqq, 0);
> }
>
> - crq->accounted = 0;
> - cfqq->cfqd->rq_in_driver--;
> + if (crq->accounted) {
> + crq->accounted = 0;
> + cfqq->cfqd->rq_in_driver--;
> + }
> }
> list_add(&rq->queuelist, &q->queue_head);
> }

Yes, it does fix the problem with cfq, and the system
works fine. No more "Badness" error messages. Thanks
Jens.

While you are at it, is this acceptable?:
--- test/drivers/block/elevator.c.orig	2005-01-11 15:47:07.000000000 +1100
+++ test/drivers/block/elevator.c	2005-01-12 12:16:19.365813400 +1100
@@ -170,8 +170,6 @@
 #else
 #error "You must build at least 1 IO scheduler into the kernel"
 #endif
-	printk(KERN_INFO "elevator: using %s as default io scheduler\n",
-							chosen_elevator);
 }
 
 static int __init elevator_setup(char *str)
@@ -516,6 +514,9 @@
 	spin_unlock_irq(&elv_list_lock);
 
 	printk(KERN_INFO "io scheduler %s registered\n", e->elevator_name);
+	if (!strcmp(e->elevator_name, chosen_elevator))
+		printk(KERN_INFO "elevator: using %s as default io scheduler\n",
+							e->elevator_name);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(elv_register);

It has the advantage of working even when one uses the
"elevator=" kernel boot parameter. If it is completely
wrong, I apologize.

Thank you.
Hari

PS: I am using a web email interface; sorry if things
appear funny.



2005-01-12 07:42:36

by Jens Axboe

Subject: Re: [PROBLEM] Badness in cfq_account_completion at drivers/block/cfq-iosched.c:916

On Wed, Jan 12 2005, Srihari Vijayaraghavan wrote:
> --- Jens Axboe <[email protected]> wrote:
> > ...
> > Does this fix it?
> >
> > ===== drivers/block/cfq-iosched.c 1.17 vs edited
> > =====
> > --- 1.17/drivers/block/cfq-iosched.c 2004-12-24
> > 09:12:58 +01:00
> > +++ edited/drivers/block/cfq-iosched.c 2005-01-11
> > 10:03:17 +01:00
> > @@ -622,8 +622,10 @@
> > cfq_sort_rr_list(cfqq, 0);
> > }
> >
> > - crq->accounted = 0;
> > - cfqq->cfqd->rq_in_driver--;
> > + if (crq->accounted) {
> > + crq->accounted = 0;
> > + cfqq->cfqd->rq_in_driver--;
> > + }
> > }
> > list_add(&rq->queuelist, &q->queue_head);
> > }
>
> Yes, it does fix the problem with cfq, and the system
> works fine. No more "Badness" error messages. Thanks
> Jens.

Super, thanks.

> While you are at it, is this acceptable?:
> --- test/drivers/block/elevator.c.orig 2005-01-11
> 15:47:07.000000000 +1100
> +++ test/drivers/block/elevator.c 2005-01-12
> 12:16:19.365813400 +1100
> @@ -170,8 +170,6 @@
> #else
> #error "You must build at least 1 IO scheduler into
> the kernel"
> #endif
> - printk(KERN_INFO "elevator: using %s as default io
> scheduler\n",
> - chosen_elevator);
> }
>
> static int __init elevator_setup(char *str)
> @@ -516,6 +514,9 @@
> spin_unlock_irq(&elv_list_lock);
>
> printk(KERN_INFO "io scheduler %s registered\n",
> e->elevator_name);
> + if (!strcmp(e->elevator_name, chosen_elevator))
> + printk(KERN_INFO "elevator: using %s as default io
> scheduler\n",
> +
> e->elevator_name);
> return 0;
> }
> EXPORT_SYMBOL_GPL(elv_register);
>
> It has an advantage of working even when one uses
> "elevator=" kernel boot parameter. If it is wrong
> completely, I am sorry about it.

Yes, that's a good idea. Perhaps just adding "(default)" at the end of
the registration message for the default io scheduler is better, so we
save that extra line.

--
Jens Axboe
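
[Editor's sketch, not part of the original message.] One way to read the "(default)" suggestion is to fold the default marker into the existing "io scheduler %s registered" line instead of printing a second line. The user-space helper below only illustrates that formatting. format_register_msg() and its parameters are hypothetical; in the kernel the message would go through printk() in elv_register(), and the final wording is an assumption.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: builds the registration message for scheduler
 * `name`, appending " (default)" when it matches the chosen default.
 * Returns the value of snprintf(), i.e. the length the full message needs.
 */
static int format_register_msg(char *buf, size_t len,
			       const char *name, const char *chosen)
{
	const char *suffix = strcmp(name, chosen) == 0 ? " (default)" : "";

	return snprintf(buf, len, "io scheduler %s registered%s\n",
			name, suffix);
}
```

For example, with "cfq" as the chosen default, registering "cfq" would yield "io scheduler cfq registered (default)" while registering "deadline" would yield "io scheduler deadline registered". Because the comparison happens at registration time, this would also reflect a default selected with the "elevator=" boot parameter, as noted earlier in the thread.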