2007-09-09 17:33:01

by Tom Tucker

[permalink] [raw]
Subject: Question about svc_age_temp_sockets


I'm confused about how svc_age_temp_sockets is supposed to work. From my
reading, the timer fires every 6 minutes and calls the svc_age_temp_sockets
function. This function sweeps the tempsocks list and unconditionally
test_and_sets every transport SK_OLD. If it was the first to set it, it will
test if the transport's not busy and then shut it down. So here's what I
don't get:

- A transport that was added 5 minutes into the timer period, could get shut
down after only a minute of idleness

- A transport that was busy may _never_ get shutdown because it will already
have it's SK_OLD bit set when it comes back through 6 minutes later. If no
more requests ever arrive from the client, the SK_OLD bit will never get
reset and the test_and_set check skips the transport if the bit's already
set.

- There's a check for a zero sk_inuse count. I don't see a legitimate way
for a socket to be on the tempsocks list with a zero refcount.

Am I just hopelessly confused here or is this code broken?

Tom



-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2007-09-14 09:16:48

by Greg Banks

[permalink] [raw]
Subject: Re: Question about svc_age_temp_sockets

On Sun, Sep 09, 2007 at 12:33:44PM -0500, Tom Tucker wrote:
>
> I'm confused about how svc_age_temp_sockets is supposed to work. [...]
> - There's a check for a zero sk_inuse count. I don't see a legitimate way
> for a socket to be on the tempsocks list with a zero refcount.
>
> Am I just hopelessly confused here or is this code broken?

When the code was written, the natural idle state of a svcsock
was sk_inuse == 0. Later, in commit

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=aaf68cfbf2241d24d46583423f6bff5c47e088b3

Neil changed this so the natural idle state is sk_inuse == 1 and
the sk_inuse == 0 condition only happens transiently during socket
shutdown. This is a far better design, but it seems the check
in svc_age_temp_sockets() wasn't updated at the time and is now
dead code. It could probably be removed.

Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
Apparently, I'm Bedevere. Which MPHG character are you?
I don't speak for SGI.

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2007-09-14 09:19:51

by Greg Banks

[permalink] [raw]
Subject: Re: Question about svc_age_temp_sockets

On Mon, Sep 10, 2007 at 10:33:13AM +0200, Wolfgang Walter wrote:
> > I'm confused about how svc_age_temp_sockets is supposed to work.
> > [...]
>
> Hmm, I think it works like that:
>
> if (!test_and_set_bit(SK_OLD, &svsk->sk_flags))
> continue;
>
> So: if SK_OLD is NOT already set then
>
> !test_and_set_bit(SK_OLD, &svsk->sk_flags)
>
> is true and therefor continue is executed (so the transport is not closed,
> only SK_OLD is set).
>
> If SK_OLD is already set then this evaluates to false and the transport is
> removed. This means that SK_OLD was set when the timer run the last time and
> there was no request has arrived since then.

That's right, it's mark-and-sweep. The SK_OLD bit indicates
"no requests have been received since the last ageing timer".
If it's seen in the next ageing timer, then the socket had been
idle for between 1 and 2 timer periods and thus may be shut down.

Greg
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
Apparently, I'm Bedevere. Which MPHG character are you?
I don't speak for SGI.

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2007-09-10 08:34:05

by Wolfgang Walter

[permalink] [raw]
Subject: Re: Question about svc_age_temp_sockets

> I'm confused about how svc_age_temp_sockets is supposed to work. From my
> reading, the timer fires every 6 minutes and calls the svc_age_temp_socke=
ts
> function. This function sweeps the tempsocks list and unconditionally
> test_and_sets every transport SK_OLD. If it was the first to set it, it
> will test if the transport's not busy and then shut it down. So here's wh=
at
> I don't get:
>
> - A transport that was added 5 minutes into the timer period, could get
> shut down after only a minute of idleness
>
> - A transport that was busy may _never_ get shutdown because it will
> already have it's SK_OLD bit set when it comes back through 6 minutes
> later. If no more requests ever arrive from the client, the SK_OLD bit wi=
ll
> never get reset and the test_and_set check skips the transport if the bit=
's
> already set.

Hmm, I think it works like that:

if (!test_and_set_bit(SK_OLD, &svsk->sk_flags))
continue;

So: if SK_OLD is NOT already set then
=

!test_and_set_bit(SK_OLD, &svsk->sk_flags)

is true and therefor continue is executed (so the transport is not closed,
only SK_OLD is set).

If SK_OLD is already set then this evaluates to false and the transport is =

removed. This means that SK_OLD was set when the timer run the last time an=
d =

there was no request has arrived since then.

>
> - There's a check for a zero sk_inuse count. I don't see a legitimate way
> for a socket to be on the tempsocks list with a zero refcount.
>

This is something I don't understand, too. See my posting:

http://marc.info/?l=3Dlinux-nfs&m=3D118927571630268&w=3D4

As far as I see it does not work.

Regards,
-- =

Wolfgang Walter
Studentenwerk M=FCnchen
Anstalt des =F6ffentlichen Rechts

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2007-09-10 14:05:59

by Tom Tucker

[permalink] [raw]
Subject: Re: Question about svc_age_temp_sockets

On Mon, 2007-09-10 at 10:33 +0200, Wolfgang Walter wrote:
> > I'm confused about how svc_age_temp_sockets is supposed to work. From my
> > reading, the timer fires every 6 minutes and calls the svc_age_temp_sockets
> > function. This function sweeps the tempsocks list and unconditionally
> > test_and_sets every transport SK_OLD. If it was the first to set it, it
> > will test if the transport's not busy and then shut it down. So here's what
> > I don't get:
> >
> > - A transport that was added 5 minutes into the timer period, could get
> > shut down after only a minute of idleness
> >
> > - A transport that was busy may _never_ get shutdown because it will
> > already have it's SK_OLD bit set when it comes back through 6 minutes
> > later. If no more requests ever arrive from the client, the SK_OLD bit will
> > never get reset and the test_and_set check skips the transport if the bit's
> > already set.
>
> Hmm, I think it works like that:
>
> if (!test_and_set_bit(SK_OLD, &svsk->sk_flags))
> continue;
>
> So: if SK_OLD is NOT already set then
>
> !test_and_set_bit(SK_OLD, &svsk->sk_flags)
>
> is true and therefor continue is executed (so the transport is not closed,
> only SK_OLD is set).
>
> If SK_OLD is already set then this evaluates to false and the transport is
> removed. This means that SK_OLD was set when the timer run the last time and
> there was no request has arrived since then.

You're right, I misread the !. Duh. So the observation is that it takes
between 6 and 12 minutes to shut down the transport. What does
Connectathon consider reasonable variability?

>
> >
> > - There's a check for a zero sk_inuse count. I don't see a legitimate way
> > for a socket to be on the tempsocks list with a zero refcount.
> >
>
> This is something I don't understand, too. See my posting:
>
> http://marc.info/?l=linux-nfs&m=118927571630268&w=4
>
> As far as I see it does not work.
>
> Regards,


-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs