2003-02-03 14:13:54

by Franz Sirl

Subject: Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,

On 2003-02-02 15:40:33 Bill Davidsen wrote:
>On Wed, 29 Jan 2003, David C Niemi wrote:
>
> >
> > On Tue, 28 Jan 2003, David S. Miller wrote:
> > > From: [email protected]
> > > Date: Wed, 29 Jan 2003 02:56:41 +0300 (MSK)
> > >
> > > Hey! Interesting thing has just happened, it is the first time when I
> > > found the bug formulating a sentence while writing e-mail not while
> > > peering to code. :-)
> > >
> > > Congratulations :-)
> >
> > Just to confirm, this fix works for me as well.
> >
> > ...
> > > Indeed, this bug exists in 2.4 as well of course.
> > >
> > > This bug is 2.4.3 vintage :-) It got added as part of initial
> > > zerocopy merge in fact.
> >
> > Odd, then, that I was unable to reproduce the SSH hangs under 2.4.18
> > even once, despite heavily using it for several days under the same
> > circumstances. Is there any reason 2.4.x would be better able to
> > recover?
> > 2.5.59 with the fix seems to feel a bit less balky than 2.4.18 without the
> > fix, so it seemed to me that 2.4.18 had some way of recovering at the cost
> > of a several second pause in the session.
>
>The problem which I have been seeing with some regularity is not the hang
>you describe (I see that infrequently) but rather a hang after I exit an
>ssh connection. I open several dozen windows at a time to a cluster when I
>do admin, and when I close almost always at least one doesn't drop without
>"~." to help. So far in an hour I haven't seen that.

That's some internal problem in OpenSSH, can be seen on Solaris as well.
Can be easily reproduced in a ssh session:

nohup sleep 60 &
logout

The ssh session will terminate only after the sleep has exited.
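
A commonly cited explanation, which I am assuming applies here, is that the
server side keeps the session open until every process holding the session's
stdin/stdout/stderr has closed them. The effect can be sketched locally with a
plain pipe standing in for the ssh channel (this is an analogy, not OpenSSH
itself; the variable names are only for illustration):

```shell
#!/bin/sh
# A pipe reader (cat) sees EOF only once *every* writer has closed the pipe.

# Case 1: the background sleep inherits the pipe as its stdout, so cat keeps
# waiting after the parent shell has exited.
start=$(date +%s)
sh -c 'sleep 2 & exit 0' | cat
end=$(date +%s)
held=$((end - start))

# Case 2: the background sleep has all three standard descriptors redirected,
# so the pipe closes as soon as the parent shell exits.
start=$(date +%s)
sh -c 'sleep 2 </dev/null >/dev/null 2>&1 & exit 0' | cat
end=$(date +%s)
detached=$((end - start))

echo "inherited descriptors: waited ${held}s"
echo "redirected descriptors: waited ${detached}s"
```

If the same mechanism is at work in the ssh case, starting the job as
"nohup sleep 60 </dev/null >/dev/null 2>&1 &" before logout should let the
session close at once.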

Franz.



2003-02-03 20:58:17

by Bill Davidsen

Subject: Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,

On Mon, 3 Feb 2003, Franz Sirl wrote:

> On 2003-02-02 15:40:33 Bill Davidsen wrote:

> >The problem which I have been seeing with some regularity is not the hang
> >you describe (I see that infrequently) but rather a hang after I exit an
> >ssh connection. I open several dozen windows at a time to a cluster when I
> >do admin, and when I close almost always at least one doesn't drop without
> >"~." to help. So far in an hour I haven't seen that.
>
> That's some internal problem in OpenSSH, can be seen on Solaris as well.
> Can be easily reproduced in a ssh session:
>
> nohup sleep 60 &
> logout
>
> The ssh session will terminate only after the sleep has exited.

That is a problem with processes left running. I do not forward
connections, I do not forward X, I do not (in normal practice) leave
anything running. A typical thing to do is to go to each machine in a
cluster and look for a user's activity:

  grep "user" log/stats.readers
  exit

nothing more. And every once in a while that hangs after executing the
logout sequence. With the patch it hasn't to date.

That doesn't mean it's a fix, I don't see it every day, I just haven't
seen it in a few days since I put in the patch.

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.