2002-07-04 11:50:00

by Jochen Suckfuell

[permalink] [raw]
Subject: Disk IO statistics still buggy

Hi!

The IO statistics displayed in /proc/partitions are still buggy, because
after some time, the value for the currently running requests gets too
high or too low (see the archives, look for "/proc/partitions").

Is anyone working on a fix?

Or does someone have an idea where the error might occur?
Which parts of the kernel play a role in starting/stopping block IO
requests (that are not part of ll_rw_blk.c) and might not handle the
accounting correctly?

I'd like to fix this and would appreciate any pointers, since after a
first look there is no obvious error in ll_rw_blk.c .

Thanks in advance,

Jochen Suckfuell


2002-07-04 11:56:58

by Jochen Suckfuell

[permalink] [raw]
Subject: Re: Disk IO statistics still buggy


> The IO statistics displayed in /proc/partitions are still buggy, because
> after some time, the value for the currently running requests gets too
> high or too low (see the archives, look for "/proc/partitions").
I forgot: This is on all recent 2.4 kernels; I can't tell for 2.5.

Bye
Jochen

2002-07-04 18:29:08

by Adrian Bunk

[permalink] [raw]
Subject: Re: Disk IO statistics still buggy

On Thu, 4 Jul 2002, Jochen Suckfuell wrote:

> Hi!

Hi Jochen!

> The IO statistics displayed in /proc/partitions are still buggy, because
> after some time, the value for the currently running requests gets too
> high or too low (see the archives, look for "/proc/partitions").
>
> Is anyone working on a fix?
>...

Marcelos' BK repository (that will become 2.4.19-rc2) includes a patch to
remove these statistics completely from /proc/partitions...

> Thanks in advance,
>
> Jochen Suckfuell

cu
Adrian

--

You only think this is a free country. Like the US the UK spends a lot of
time explaining its a free country because its a police state.
Alan Cox

2002-07-06 04:23:25

by Bill Davidsen

[permalink] [raw]
Subject: Re: Disk IO statistics still buggy

On Thu, 4 Jul 2002, Adrian Bunk wrote:

> On Thu, 4 Jul 2002, Jochen Suckfuell wrote:
>
> > Hi!
>
> Hi Jochen!
>
> > The IO statistics displayed in /proc/partitions are still buggy, because
> > after some time, the value for the currently running requests gets too
> > high or too low (see the archives, look for "/proc/partitions").
> >
> > Is anyone working on a fix?
> >...
> Marcelos' BK repository (that will become 2.4.19-rc2) includes a patch to
> remove these statistics completely from /proc/partitions...

Is this the new Linux way of life? Removing modules is hard, GET RID OF
THE FEATURE! Stats in /proc/partitions are not always correct, GET RID OF
THE FEATURE!

The IDE support in 2.5 sucks rocks off the bottom of the ocean, is that
next?

To the point, the stats are there, there are used by many of the widely
used distributions, this is supposed to be stable, that's the excuse for
not adding features, how can that justify taking out a feature? That's one
of the useful things for determining which (if any) partitions have enough
traffic to be best candidates for a separate drive. Yes I know there are
other more complex ways than looking at human readable numbers. Why is
that desirable? Why not fix the feature (I don't agree it's all that far
off in most cases).

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

2002-07-06 07:45:51

by Andries Brouwer

[permalink] [raw]
Subject: Re: Disk IO statistics still buggy

On Sat, Jul 06, 2002 at 12:15:47AM -0400, Bill Davidsen wrote:

> > Marcelos' BK repository (that will become 2.4.19-rc2) includes a patch to
> > remove these statistics completely from /proc/partitions...
>
> Is this the new Linux way of life? Removing modules is hard, GET RID OF
> THE FEATURE! Stats in /proc/partitions are not always correct, GET RID OF
> THE FEATURE!

Shouting and exclamation marks - a bad sign.

You are mistaken. This has never been a feature.
It is not in 2.4.18, and it looks like it will not be in 2.4.19.

It is in some vendor kernels, but it is ugly and causes various problems.
If somebody cares about having statistics she should submit a patch
adding /proc/diskstat.

Andries

2002-07-06 13:50:24

by Thunder from the hill

[permalink] [raw]
Subject: Re: Disk IO statistics still buggy

Hi,

On Sat, Jul 06, 2002 at 12:15:47AM -0400, Bill Davidsen wrote:
> > Marcelos' BK repository (that will become 2.4.19-rc2) includes a patch to
> > remove these statistics completely from /proc/partitions...
>
> Is this the new Linux way of life? Removing modules is hard, GET RID OF
> THE FEATURE! Stats in /proc/partitions are not always correct, GET RID OF
> THE FEATURE!

You misunderstood. It was nothing incorrect about the stats in
/proc/partitions, it was just moved because it just didn't belong into
/proc/partitions.

Regards,
Thunder
--
(Use http://www.ebb.org/ungeek if you can't decode)
------BEGIN GEEK CODE BLOCK------
Version: 3.12
GCS/E/G/S/AT d- s++:-- a? C++$ ULAVHI++++$ P++$ L++++(+++++)$ E W-$
N--- o? K? w-- O- M V$ PS+ PE- Y- PGP+ t+ 5+ X+ R- !tv b++ DI? !D G
e++++ h* r--- y-
------END GEEK CODE BLOCK------

2002-07-09 16:57:41

by Jochen Suckfuell

[permalink] [raw]
Subject: [PATCH] Re: Disk IO statistics still buggy

Hi!

On Sat, Jul 06, 2002 at 07:42:42AM -0600, Thunder from the hill wrote:
> On Sat, Jul 06, 2002 at 12:15:47AM -0400, Bill Davidsen wrote:
> > > Marcelos' BK repository (that will become 2.4.19-rc2) includes a patch to
> > > remove these statistics completely from /proc/partitions...
> >
> > Is this the new Linux way of life? Removing modules is hard, GET RID OF
> > THE FEATURE! Stats in /proc/partitions are not always correct, GET RID OF
> > THE FEATURE!
>
> You misunderstood. It was nothing incorrect about the stats in
> /proc/partitions, it was just moved because it just didn't belong into
> /proc/partitions.

And it hasn't been removed either. Only the output has been removed; the
data is still collected. Meanwhile I found the bug that leads to wrong
ios_in_flight values on SCSI systems:

The accounting was done on a copy of the request _after_ the request has
been dequeued and the irq_request_lock released. I fixed this by taking
this lock again while calling the accounting function (see the patch
below).

** This patch is relevant regardless where the statistics will finally
be printed. **

One issue is still left: on my non-SCSI machine, the ios_in_flight value
is at -1 when although requests are running. It doesn't get any more
wrong than that it seems, and I'm not sure where that happened. Maybe
something went wrong when initializing this value at boot time? I have a
workaround that corrects negative ios_in_flight values to zero each time
/proc/partitions is read. After further testing I will post it to the
list.

Bye
Jochen Suckf?ll

--
Jochen Suckfuell --- http://www.suckfuell.net/jochen/ ---

--- linux/drivers/scsi/scsi_lib.c Mon Jul 8 16:15:27 2002
+++ linux_work/drivers/scsi/scsi_lib.c Tue Jul 9 17:56:39 2002
@@ -426,7 +426,9 @@
if (req->waiting != NULL) {
complete(req->waiting);
}
+ spin_lock_irq(&io_request_lock);
req_finished_io(req);
+ spin_unlock_irq(&io_request_lock);
add_blkdev_randomness(MAJOR(req->rq_dev));

SDpnt = SCpnt->device;


2002-07-11 22:18:54

by Marcelo Tosatti

[permalink] [raw]
Subject: Re: [PATCH] Re: Disk IO statistics still buggy


Christoph,

Can you take a look at this one ?

On Tue, 9 Jul 2002, Jochen Suckfuell wrote:

> And it hasn't been removed either. Only the output has been removed; the
> data is still collected. Meanwhile I found the bug that leads to wrong
> ios_in_flight values on SCSI systems:
>
> The accounting was done on a copy of the request _after_ the request has
> been dequeued and the irq_request_lock released. I fixed this by taking
> this lock again while calling the accounting function (see the patch
> below).
>
> ** This patch is relevant regardless where the statistics will finally
> be printed. **
>
> One issue is still left: on my non-SCSI machine, the ios_in_flight value
> is at -1 when although requests are running. It doesn't get any more
> wrong than that it seems, and I'm not sure where that happened. Maybe
> something went wrong when initializing this value at boot time? I have a
> workaround that corrects negative ios_in_flight values to zero each time
> /proc/partitions is read. After further testing I will post it to the
> list.
>
> Bye
> Jochen Suckf?ll
>
> --
> Jochen Suckfuell --- http://www.suckfuell.net/jochen/ ---
>
> --- linux/drivers/scsi/scsi_lib.c Mon Jul 8 16:15:27 2002
> +++ linux_work/drivers/scsi/scsi_lib.c Tue Jul 9 17:56:39 2002
> @@ -426,7 +426,9 @@
> if (req->waiting != NULL) {
> complete(req->waiting);
> }
> + spin_lock_irq(&io_request_lock);
> req_finished_io(req);
> + spin_unlock_irq(&io_request_lock);
> add_blkdev_randomness(MAJOR(req->rq_dev));
>
> SDpnt = SCpnt->device;

2002-07-12 05:47:55

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] Re: Disk IO statistics still buggy

On Tue, Jul 09 2002, Jochen Suckfuell wrote:
> --- linux/drivers/scsi/scsi_lib.c Mon Jul 8 16:15:27 2002
> +++ linux_work/drivers/scsi/scsi_lib.c Tue Jul 9 17:56:39 2002
> @@ -426,7 +426,9 @@
> if (req->waiting != NULL) {
> complete(req->waiting);
> }
> + spin_lock_irq(&io_request_lock);
> req_finished_io(req);
> + spin_unlock_irq(&io_request_lock);
> add_blkdev_randomness(MAJOR(req->rq_dev));
>
> SDpnt = SCpnt->device;

good spotting, that's definitely a bug

--
Jens Axboe

2002-07-22 15:08:23

by Jochen Suckfuell

[permalink] [raw]
Subject: Re: [PATCH] Re: Disk IO statistics still buggy

Hi!

I'm just curious about the status: will this patch be included in 2.4.19
final?

On Thu, Jul 11, 2002 at 06:27:25PM -0300, Marcelo Tosatti wrote:
>
> Christoph,
> Can you take a look at this one ?
>
> On Tue, 9 Jul 2002, Jochen Suckfuell wrote:
> > The accounting was done on a copy of the request _after_ the request has
> > been dequeued and the irq_request_lock released. I fixed this by taking
> > this lock again while calling the accounting function (see the patch
> > below).
> >
> > [...]
> >
> > --- linux/drivers/scsi/scsi_lib.c Mon Jul 8 16:15:27 2002
> > +++ linux_work/drivers/scsi/scsi_lib.c Tue Jul 9 17:56:39 2002
> > @@ -426,7 +426,9 @@
> > if (req->waiting != NULL) {
> > complete(req->waiting);
> > }
> > + spin_lock_irq(&io_request_lock);
> > req_finished_io(req);
> > + spin_unlock_irq(&io_request_lock);
> > add_blkdev_randomness(MAJOR(req->rq_dev));
> >
> > SDpnt = SCpnt->device;

Bye
Jochen