2002-09-27 19:31:14

by Andrew Morton

[permalink] [raw]
Subject: Re: Warning - running *really* short on DMA buffers while doingfiletransfers

"Justin T. Gibbs" wrote:
>
> >> Which unfortunately characterizes only a single symptom without breaking
> >> it down on a transaction by transaction basis. We need to understand
> >> how many writes were queued by the OS to the drive between each read to
> >> know if the drive is actually allowing writes to pass reads or not.
> >>
> >
> > Given that I measured a two-second read latency with four tags,
> > that would be about 60 megabytes of write traffic after the
> > read was submitted. Say, 120 requests. That's with a tag
> > depth of four.
>
> I still don't follow your reasoning. Your benchmark indicates the
> latency for several reads (cat kernel/*.c), not the per-read latency.
> The two are quite different and unless you know the per-read latency and
> whether it was affected by filling the drive's entire cache with
> pent up writes (again these are writes that are above and beyond
> those still assigned tags) you are still speculating that writes
> are passing reads.
>
> If you can tell me exactly how you ran your benchmark, I'll find the
> information I want by using a SCSI bus analyzer to sniff the traffic
> on the bus.

I did it by tracing. 4 meg printk buffer, teach printk to timestamp
its output, add tracing printk's, then stare at the output.

The patches are at

http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.38/2.5.38-mm1/experimental/printk-big-buf.patch
http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.38/2.5.38-mm1/experimental/elevator-debug.patch

One sample trace is at
http://www.zip.com.au/~akpm/linux/patches/trace.txt

Watch the read of sector 528598. It was inserted into the
elevator at 24989.185 seconds, was taken off the elevator by
the driver at 24989.186 seconds and was completed in bio_endio()
at 24992.273 seconds. That trace was taken with 253 tags. I
don't have a 4-tag trace handy but it was much the same, with
a two-second lag.

I am assuming that the driver submits the request to the disk
shortly after calling elv_next_request(). If I'm wrong, and
the driver itself is hanging onto the request for a significant
amount of time then the disk is not the source of the delay.


2002-09-27 19:47:40

by Justin T. Gibbs

[permalink] [raw]
Subject: Re: Warning - running *really* short on DMA buffers while doingfiletransfers

> Watch the read of sector 528598. It was inserted into the
> elevator at 24989.185 seconds, was taken off the elevator by
> the driver at 24989.186 seconds and was completed in bio_endio()
> at 24992.273 seconds. That trace was taken with 253 tags. I
> don't have a 4-tag trace handy but it was much the same, with
> a two-second lag.
>
> I am assuming that the driver submits the request to the disk
> shortly after calling elv_next_request(). If I'm wrong, and
> the driver itself is hanging onto the request for a significant
> amount of time then the disk is not the source of the delay.

Since your drive cannot handle 253 tags, when saturated with commands,
a new command is never submitted to the drive directly. Instead the
command waits in the aic7xxx driver's queue until space is available
on the device. In FreeBSD, this never happens as tag depth is known
to, and adjusted by, the mid-layer. In Linux I must report the
queue depth without having sufficient load or history with the device
to know anything about its capabilities so I have no choice but to
throttle internally should the device support fewer tags than initially
reported to the OS. You can determine the actual device queue
depth from "cat /proc/scsi/aic7xxx/#". Run a bunch of I/O first so
that the tag depth gets locked first.

--
Justin