2008-01-19 16:54:21

by Török Edwin

[permalink] [raw]
Subject: [PATCH] LatencyTop: make reason (blk_execute_rq+0x7a/0xd) known

Hi Arjan,

One of the top reasons in latencytop is
Unknown reason (blk_execute_rq+0x7a/0xd) 152.3 msec
49.0 msec

Its from hddtemp, hald-addon-storage, ...

I wrote a simple systemtap script [1] to find out where its called from,
and its definitely from the ioctl (see below).

1820322406
hddtemp(3999):
<- latency of 153260223 ns
Returning from: 0xffffffff80311ab0 : blk_execute_rq+0x0/0xd0 []
Returning to : 0xffffffff80315580 : sg_io+0x200/0x3c0 []
0xffffffff80315a02 : scsi_cmd_ioctl+0x2c2/0x410 []
0xffffffff80417153 : ip_queue_xmit+0x2b3/0x420 []
0xffffffff880c3be9 : sd_ioctl+0x99/0xf0 [sd_mod]
0xffffffff80313839 : blkdev_driver_ioctl+0x39/0x40 []
0xffffffff80313ab8 : blkdev_ioctl+0x278/0x710 []
0xffffffff803e540d : release_sock+0x8d/0xa0 []
0xffffffff803e54b3 : lock_sock_nested+0x93/0xa0 []
0xffffffff8029a4ff : d_alloc+0x6f/0x1a0 []
0xffffffff802f102f : security_d_instantiate+0x1f/0x30 []
0xffffffff803e540d : release_sock+0x8d/0xa0 []
0xffffffff8043c8c8 : inet_accept+0x88/0xc0 []
0xffffffff803e408e : move_addr_to_user+0x5e/0x70 []
0xffffffff802f0bb1 : security_socket_post_accept+0x11/0x20 []
0xffffffff803e472e : sys_accept+0x13e/0x1d0 []
0xffffffff802af12b : block_ioctl+0x1b/0x20 []
0xffffffff8029482b : do_ioctl+0x1b/0x70 []
0xffffffff80294a6d : vfs_ioctl+0x1ed/0x290 []
0xffffffff80296281 : sys_select+0x41/0x1b0 []
0xffffffff80294b92 : sys_ioctl+0x82/0xa0 []
0xffffffff8020c18e : system_call+0x7e/0x83 []


I suggest adding an annotation to make this reason known to latencytop
by annotating sd_ioctl/scsi_cmd_ioctl, see my patch below.
After my patch, instead of blk_execute_rq the real reason is showed:

Cause Maximum Average
SCSI device ioctl 34.2 msec
14.4 msec

I noticed there's a similar annotation in sr.c for cdrom_ioctl, but sr.c
also has annotations for scsi_nonblockable_ioctl, and scsi_ioctl.
If they are needed for sd_block_ioctl too, maybe
scsi_nonblockable_ioctl/scsi_ioctl should have the annotation themselves.

I also noticed some unsigned overflows, my top latency was sometimes a
very large number (18................................), which
was probably -1, or some other negative number. I haven't found a way
to reproduce it yet (it happens very rarely).

[1] Systemtap script:
#!/usr/bin/stap -v
global start
probe kernel.function("blk_execute_rq") {
start[tid()] = gettimeofday_ns()
}

probe kernel.function("blk_execute_rq").return {
# if we have not seen this before
if (!([tid()] in start)) next
delta = gettimeofday_ns() - start[tid()]
# show backtraces for latencies over 1ms
if(delta > 1000000) {
printf("%s <- latency of %d ns\n", thread_indent(1), delta)
print_backtrace ()
}
delete start[tid()]
}

P.S.: LatencyTop could have a feature to ask for more details on unknown
latency reasons, and it would generate a systemtap script itself, that
would show the backtraces to help figure out whats going on.

Best regards,
--Edwin

----

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index a69b155..93ccfc0 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -686,6 +687,7 @@ static int sd_ioctl(struct inode * inode, struct
file * filp,
struct scsi_device *sdp = scsi_disk(disk)->device;
void __user *p = (void __user *)arg;
int error;
+ struct latency_entry reason;

SCSI_LOG_IOCTL(1, printk("sd_ioctl: disk=%s, cmd=0x%x\n",
disk->disk_name, cmd));
@@ -710,7 +712,9 @@ static int sd_ioctl(struct inode * inode, struct
file * filp,
case SCSI_IOCTL_GET_BUS_NUMBER:
return scsi_ioctl(sdp, cmd, p);
default:
+ set_latency_reason("SCSI device ioctl", &reason);
error = scsi_cmd_ioctl(filp, disk->queue, disk, cmd, p);
+ restore_latency_reason(&reason);
if (error != -ENOTTY)
return error;
}


2008-01-19 18:30:34

by Arjan van de Ven

[permalink] [raw]
Subject: Re: [PATCH] LatencyTop: make reason (blk_execute_rq+0x7a/0xd) known

T?r?k Edwin wrote:
> Hi Arjan,
>
> One of the top reasons in latencytop is
> Unknown reason (blk_execute_rq+0x7a/0xd) 152.3 msec
> 49.0 msec
>
> Its from hddtemp, hald-addon-storage, ...
>
> I wrote a simple systemtap script [1] to find out where its called from,
> and its definitely from the ioctl (see below).
>
> 1820322406
> hddtemp(3999):
> <- latency of 153260223 ns
> Returning from: 0xffffffff80311ab0 : blk_execute_rq+0x0/0xd0 []
> Returning to : 0xffffffff80315580 : sg_io+0x200/0x3c0 []
> 0xffffffff80315a02 : scsi_cmd_ioctl+0x2c2/0x410 []
> 0xffffffff80417153 : ip_queue_xmit+0x2b3/0x420 []
> 0xffffffff880c3be9 : sd_ioctl+0x99/0xf0 [sd_mod]
> 0xffffffff80313839 : blkdev_driver_ioctl+0x39/0x40 []
> 0xffffffff80313ab8 : blkdev_ioctl+0x278/0x710 []
> 0xffffffff803e540d : release_sock+0x8d/0xa0 []
> 0xffffffff803e54b3 : lock_sock_nested+0x93/0xa0 []
> 0xffffffff8029a4ff : d_alloc+0x6f/0x1a0 []
> 0xffffffff802f102f : security_d_instantiate+0x1f/0x30 []
> 0xffffffff803e540d : release_sock+0x8d/0xa0 []
> 0xffffffff8043c8c8 : inet_accept+0x88/0xc0 []
> 0xffffffff803e408e : move_addr_to_user+0x5e/0x70 []
> 0xffffffff802f0bb1 : security_socket_post_accept+0x11/0x20 []
> 0xffffffff803e472e : sys_accept+0x13e/0x1d0 []
> 0xffffffff802af12b : block_ioctl+0x1b/0x20 []
> 0xffffffff8029482b : do_ioctl+0x1b/0x70 []
> 0xffffffff80294a6d : vfs_ioctl+0x1ed/0x290 []
> 0xffffffff80296281 : sys_select+0x41/0x1b0 []
> 0xffffffff80294b92 : sys_ioctl+0x82/0xa0 []
> 0xffffffff8020c18e : system_call+0x7e/0x83 []
>
>
> I suggest adding an annotation to make this reason known to latencytop
> by annotating sd_ioctl/scsi_cmd_ioctl, see my patch below.
> After my patch, instead of blk_execute_rq the real reason is showed:
>
> Cause Maximum Average
> SCSI device ioctl 34.2 msec
> 14.4 msec

great! I'll put this into my patchkit!

>
> I noticed there's a similar annotation in sr.c for cdrom_ioctl, but sr.c
> also has annotations for scsi_nonblockable_ioctl, and scsi_ioctl.
> If they are needed for sd_block_ioctl too, maybe
> scsi_nonblockable_ioctl/scsi_ioctl should have the annotation themselves.

yeah I'll take a look at this; some of the annotations are still a bit iffy in their placement

>
> I also noticed some unsigned overflows, my top latency was sometimes a
> very large number (18................................), which
> was probably -1, or some other negative number. I haven't found a way
> to reproduce it yet (it happens very rarely).

I've not seen this; I'll take another peak at the code.

> P.S.: LatencyTop could have a feature to ask for more details on unknown
> latency reasons, and it would generate a systemtap script itself, that
> would show the backtraces to help figure out whats going on.

I've considered having an option inside the code itself that would print a
backtrace for, say, an unknown latency larger than 100msec
but the code got ugly (the accounting gets done in the scheduler path,
where you can't printk).
I'll try again ;-)

2008-01-19 19:41:45

by Frank Ch. Eigler

[permalink] [raw]
Subject: Re: [PATCH] LatencyTop: make reason (blk_execute_rq+0x7a/0xd) known

Arjan van de Ven <[email protected]> writes:

> T?r?k Edwin wrote:
>> [...]
>> P.S.: LatencyTop could have a feature to ask for more details on unknown
>> latency reasons, and it would generate a systemtap script itself, that
>> would show the backtraces to help figure out whats going on.
>
> I've considered having an option inside the code itself that would print a
> backtrace for, say, an unknown latency larger than 100msec
> but the code got ugly (the accounting gets done in the scheduler path,
> where you can't printk). [...]

Systemtap does not use printk, and can generate decent backtraces from
the kernel (and into userspace before too long). (FWIW, it seems to
be only a small stretch to implement tools like *top via systemtap
rather than custom-written kernel code.)

- FChE

2008-01-19 19:47:49

by Török Edwin

[permalink] [raw]
Subject: Re: [PATCH] LatencyTop: make reason (blk_execute_rq+0x7a/0xd) known

Arjan van de Ven wrote:
> T?r?k Edwin wrote:
>>
>> Cause Maximum
>> Average
>> SCSI device ioctl 34.2 msec
>> 14.4 msec
>
> great! I'll put this into my patchkit!

Thanks.

>> I also noticed some unsigned overflows, my top latency was sometimes a
>> very large number (18................................), which
>> was probably -1, or some other negative number. I haven't found a way
>> to reproduce it yet (it happens very rarely).
>
>
> I've not seen this; I'll take another peak at the code.

I just captured this:

Cause Maximum Average
Waiting for userspace lock 18446744073638300.0
msec 9223372036783524.0 msec
Waiting for processes to die (waitpid) 18446744073638296.0
msec 9223372036783520.0 msec
Waiting for event (poll) 18446744073565216.0
msec 764505121372.0 msec
SCSI device ioctl 35.2 msec
13.7 msec
Application requested delay 19.5 msec
0.0 msec
page fault 18.6 msec
4.0 msec
Reading from file 16.6 msec
1.5 msec
Waiting for buffer IO 15.2 msec
3.0 msec
mutex lock 15.0 msec
15.0 msec

I was also looking at /proc/latency_stats using watch, and I've seen
that there were negative number.
Unfortunately I couldn't copy+paste it, because it was gone too fast.

I was running a program with 4 threads doing usleep(0) in an inf-loop.
However I can't reproduce the overflow now with the same program.

Best regards,
--Edwin