2006-02-02 22:39:07

by Benjamin

[permalink] [raw]
Subject: Measure the Accept Queueing Time

Hello! I am doing my research for the web server performance modeling.
After the connection request from a client complete the TCP 3-way
handshake. It would become an open request and this open request will be
placed in the accept queue. At this point the new child socket is
created and pointed to by the open request. The connection is considered
to be established at this point.
Each time the Web server process executes the "accept()" system call,
the first open request in the accept queue is removed and the socket
which is pointed to by this open request is returned.
All I want is to measure the amount of time when a open request is in
the accept queue. I've tracked the source code of the Linux kernel. I
may know the flow but still not sure my direction is correct or not. I
write down what I found as follows. If there is anything wrong, please
tell me! Thank you very mush

1. The struct "sock" in "sock.h" has a parameter call "ack_backlog".
This parameter counts how many open request in the accept queue.

2. The struct "tcp_opt" in "sock.h" has two parameters call
"accept_queque" and "accept_queue_tail".These two parameter actually
point to the exact accept queue.

3. The struct "open_request" in "tcp.h" exactly represents the open
request which I've mention above.

4. After a connection request from a client complete the TCP 3-way
handshake, the listen socket would call the function "tcp_acceptq_queue"
in "tcp.h". This function puts new open request (i.e. the struct
"open_request") into accept queue (i.e. struct "accept_queue" in
"tcp_opt").

5. Each time the Web server process executes the accept() system call,
the function "tcp_accept" in "tcp.c" would be called. This function
removes the first open request in the accept queue and return the socket
which is pointed to by the open request.


Is there anything wrong with what I describe above? Or there is any
reference regarding this matter? Please tell me! Thank you very much!

p.s. My Linux Kernel Version is 2.4.25


2006-02-04 00:13:48

by Frank Ch. Eigler

[permalink] [raw]
Subject: Re: Measure the Accept Queueing Time


Benjamin Chu <[email protected]> writes:

> [...] All I want is to measure the amount of time when a open
> request is in the accept queue. [...]
> p.s. My Linux Kernel Version is 2.4.25

Dude, you made me spend several hours playing with systemtap to solve
this simple-sounding problem. :-)

At the end, I have a script (attached below) that works on one
particular kernel build (a beta RHEL4 kernel, which has the systemtap
prerequisite kprobes, but uses the same networking structure).
Unfortunately, it also demonstrated some of the work we have to do in
systemtap land to make it work better.

The good news: here is the output of the script running on a vmware
instance that has only single socket listener program, being briefly
tickled by hand, then hammered by "nc" connections in a loop. The
lines came out every ten seconds (as requested by the script, see
below), and report on the number of successful accept()s, plus the
number of microseconds that each open_request* was in the
accept_queue.

% socket -l -s NNNN &
[1] 25384
% stap -g acceptdelay.stp
[1139011591] socket(25384) count=1 avg=1487us
[1139011601] socket(25384) count=9 avg=560us
[1139011611] socket(25384) count=6 avg=915us
[1139011621] socket(25384) count=747 avg=604us
[1139011631] socket(25384) count=1280 avg=558us
[1139011641] socket(25384) count=1147 avg=547us
[1139011645] socket(25384) count=25 avg=471us

Is that the sort of data you were hoping to see?

The systemtap translator unfortunately has problems identifying the
most ideal probe insertion points, based on source code coordinates or
function names. The interception of inline functions is weak. At the
present, we also don't have/use the benefit of hooks inserted into the
kernel just for us, which would make probes simple and precise. But
all these problems are being worked on.

So, as a stop-gap for this warts-and-all demonstration, the script
just uses hard-coded PC addresses. Please look beyond that and use
your imagination!

- FChE


#! stap -g

# This embedded-C function is needed to extract a tcp_opt field
# from a pointer cast to generic struct sock*. Later, systemtap
# will offer first class syntax and protected operation for this.
%{
#include <net/tcp.h>
%}
function tcp_aq_head:long (sock:long)
%{
struct tcp_opt *tp = tcp_sk ((struct sock*) (uintptr_t) THIS->sock);
THIS->__retvalue = (int64_t) (uintptr_t) tp->accept_queue;
%}


global open_request_arrivals # indexed by open_request pointer
global open_request_delays # stats, indexed by pid and execname

probe # a spot in the open_request creation path
# XXX: the following commented-out probe points should also work
# kernel.inline("tcp_acceptq_queue")
# kernel.function("tcp_v4_conn_request")
# kernel.inline("tcp_openreq_init")
kernel.statement("*@net/ipv4/tcp_ipv4.c:1472") # after tcp_openreq_init()
{
open_request_arrivals[$req] = gettimeofday_us()
}


# One could also probe at entry of tcp_accept itself, to track
# whether an accept() syscall was blocked by absence of pending
# open_requests.


probe # a spot in tcp_accept, after pending open_request found
# kernel.statement("*@net/ipv4/tcp_ipv4.c:1922")
kernel.statement (0xc029eec8) # near req = tp->accept_queue
{
req = tcp_aq_head ($sk) # tcp_sk(sk)->accept_queue; $req should work too
then = open_request_arrivals[req]
if (then) {
delete open_request_arrivals[req] # save memory
now = gettimeofday_us()
open_request_delays[pid(),execname()] <<< now-then
}
}


probe timer.ms(10000), end # periodically and at session shutdown
{
now=gettimeofday_s()
foreach ([pid+,execname] in open_request_delays) # sort by pid
printf("[%d] %s(%d) count=%d avg=%dus\n", now, execname, pid,
@count(open_request_delays[pid,execname]),
@avg(open_request_delays[pid,execname]))
delete open_request_delays
}


# Some preprocessor magic to prevent someone from running this
# demonstration script (with its hard-coded probe addresses)
# on another kernel build
%( kernel_vr == "2.6.9-30.ELsmp" %? %( arch == "i686" %? /* OK */
%: BADARCH %)
%: BADVERSION %)

2006-02-04 15:16:29

by Benjamin

[permalink] [raw]
Subject: Re: Measure the Accept Queueing Time

Oh! Thank you very much and sorry to take you so much time!
Now, I am trying to implement what you said. It seems not easy to me!^_^
Thanks again!
Benjamin Chu
----- Original Message -----
From: "Frank Ch. Eigler" <[email protected]>
To: "Benjamin Chu" <[email protected]>
Cc: <[email protected]>; <[email protected]>
Sent: Friday, February 03, 2006 6:13 PM
Subject: Re: Measure the Accept Queueing Time


>
> Benjamin Chu <[email protected]> writes:
>
> > [...] All I want is to measure the amount of time when a open
> > request is in the accept queue. [...]
> > p.s. My Linux Kernel Version is 2.4.25
>
> Dude, you made me spend several hours playing with systemtap to solve
> this simple-sounding problem. :-)
>
> At the end, I have a script (attached below) that works on one
> particular kernel build (a beta RHEL4 kernel, which has the systemtap
> prerequisite kprobes, but uses the same networking structure).
> Unfortunately, it also demonstrated some of the work we have to do in
> systemtap land to make it work better.
>
> The good news: here is the output of the script running on a vmware
> instance that has only single socket listener program, being briefly
> tickled by hand, then hammered by "nc" connections in a loop. The
> lines came out every ten seconds (as requested by the script, see
> below), and report on the number of successful accept()s, plus the
> number of microseconds that each open_request* was in the
> accept_queue.
>
> % socket -l -s NNNN &
> [1] 25384
> % stap -g acceptdelay.stp
> [1139011591] socket(25384) count=1 avg=1487us
> [1139011601] socket(25384) count=9 avg=560us
> [1139011611] socket(25384) count=6 avg=915us
> [1139011621] socket(25384) count=747 avg=604us
> [1139011631] socket(25384) count=1280 avg=558us
> [1139011641] socket(25384) count=1147 avg=547us
> [1139011645] socket(25384) count=25 avg=471us
>
> Is that the sort of data you were hoping to see?
>
> The systemtap translator unfortunately has problems identifying the
> most ideal probe insertion points, based on source code coordinates or
> function names. The interception of inline functions is weak. At the
> present, we also don't have/use the benefit of hooks inserted into the
> kernel just for us, which would make probes simple and precise. But
> all these problems are being worked on.
>
> So, as a stop-gap for this warts-and-all demonstration, the script
> just uses hard-coded PC addresses. Please look beyond that and use
> your imagination!
>
> - FChE
>
>
> #! stap -g
>
> # This embedded-C function is needed to extract a tcp_opt field
> # from a pointer cast to generic struct sock*. Later, systemtap
> # will offer first class syntax and protected operation for this.
> %{
> #include <net/tcp.h>
> %}
> function tcp_aq_head:long (sock:long)
> %{
> struct tcp_opt *tp = tcp_sk ((struct sock*) (uintptr_t) THIS->sock);
> THIS->__retvalue = (int64_t) (uintptr_t) tp->accept_queue;
> %}
>
>
> global open_request_arrivals # indexed by open_request pointer
> global open_request_delays # stats, indexed by pid and execname
>
> probe # a spot in the open_request creation path
> # XXX: the following commented-out probe points should also work
> # kernel.inline("tcp_acceptq_queue")
> # kernel.function("tcp_v4_conn_request")
> # kernel.inline("tcp_openreq_init")
> kernel.statement("*@net/ipv4/tcp_ipv4.c:1472") # after
tcp_openreq_init()
> {
> open_request_arrivals[$req] = gettimeofday_us()
> }
>
>
> # One could also probe at entry of tcp_accept itself, to track
> # whether an accept() syscall was blocked by absence of pending
> # open_requests.
>
>
> probe # a spot in tcp_accept, after pending open_request found
> # kernel.statement("*@net/ipv4/tcp_ipv4.c:1922")
> kernel.statement (0xc029eec8) # near req = tp->accept_queue
> {
> req = tcp_aq_head ($sk) # tcp_sk(sk)->accept_queue; $req should work too
> then = open_request_arrivals[req]
> if (then) {
> delete open_request_arrivals[req] # save memory
> now = gettimeofday_us()
> open_request_delays[pid(),execname()] <<< now-then
> }
> }
>
>
> probe timer.ms(10000), end # periodically and at session shutdown
> {
> now=gettimeofday_s()
> foreach ([pid+,execname] in open_request_delays) # sort by pid
> printf("[%d] %s(%d) count=%d avg=%dus\n", now, execname, pid,
> @count(open_request_delays[pid,execname]),
> @avg(open_request_delays[pid,execname]))
> delete open_request_delays
> }
>
>
> # Some preprocessor magic to prevent someone from running this
> # demonstration script (with its hard-coded probe addresses)
> # on another kernel build
> %( kernel_vr == "2.6.9-30.ELsmp" %? %( arch == "i686" %? /* OK */
> %: BADARCH %)
> %: BADVERSION %)
>