2009-07-11 19:14:18

by Sergey Senozhatsky

Subject: possible regression with pty.c commit

Hello,
commit d945cb9cce20ac7143c2de8d88b187f62db99bdc ("pty: Rework the pty layer to use the normal buffering logic")
seems to break kdesu.

(quote: "Because some su implementations (i.e. the one from Red Hat®) don't want to
read the password from stdin, KDE su creates a pty/tty pair and
executes su with it's standard filedescriptors connected to the tty.")

Reverting "pty: Rework the pty layer to use the normal buffering logic"
(commit d945cb9cce20ac7143c2de8d88b187f62db99bdc) makes kdesu work again.

strace /usr/bin/kdesu -u root -c top:
// lots of KDE-related output was cut

execve("/usr/bin/kdesu", ["/usr/bin/kdesu", "-u", "root", "-c", "top"], [/* 36 vars */]) = 0
brk(0) = 0x9268000
...
sigreturn() = ? (mask now [])
access("/bin/su", X_OK) = 0
open("/dev/ptmx", O_RDWR) = 10
ioctl(10, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(10, TIOCGPTN, [10]) = 0
stat64("/dev/pts/10", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 10), ...}) = 0
ioctl(10, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(10, TIOCGPTN, [10]) = 0
stat64("/dev/pts/10", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 10), ...}) = 0
statfs("/dev/pts/10", {f_type="DEVPTS_SUPER_MAGIC", f_bsize=4096, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
ioctl(10, TIOCSPTLCK, [0]) = 0
open("/dev/pts/10", O_RDWR) = 11
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb68c5718) = 4072
close(11) = 0
fcntl64(10, F_GETFL) = 0x2 (flags O_RDWR)
read(10, "Password: ", 255) = 10
select(11, [10], NULL, NULL, {0, 100000}) = 0 (Timeout)
kill(4072, SIG_0) = 0
open("/dev/pts/10", O_RDWR) = 11
kill(4072, SIG_0) = 0
ioctl(11, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 -opost isig icanon -echo ...}) = 0
close(11) = 0
kill(4072, SIG_0) = 0
write(10, "**~WRITE_PASSWORD~**", 11) = 11
write(10, "\n", 1) = 1
fcntl64(10, F_GETFL) = 0x2 (flags O_RDWR)
read(10, "\n", 255) = 1
fcntl64(10, F_GETFL) = 0x2 (flags O_RDWR)
read(10, "kdesu_stub\n", 255) = 11
open("/dev/pts/10", O_RDWR) = 11
ioctl(11, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 -opost isig icanon echo ...}) = 0
ioctl(11, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 -opost isig icanon echo ...}) = 0
ioctl(11, SNDCTL_TMR_START or TCSETS, {B38400 -opost isig icanon -echo ...}) = 0
ioctl(11, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 -opost isig icanon -echo ...}) = 0
close(11) = 0
write(10, "stop", 4) = 4
write(10, "\n", 1) = 1
fcntl64(10, F_GETFL) = 0x2 (flags O_RDWR)
read(10, 0xbf840858, 255) = -1 EIO (Input/output error) <<<<<<<<<
--- SIGCHLD (Child exited) @ 0 (0) ---
write(5, "\0", 1) = 1
sigreturn() = ? (mask now [])
.....


As for "bug 13522: BUG: scheduling while atomic" - I'll test it.


Sergey



2009-07-11 22:24:31

by Alan

Subject: Re: possible regression with pty.c commit

On Sat, 11 Jul 2009 22:15:56 +0300
Sergey Senozhatsky <[email protected]> wrote:

> Hello,
> commit d945cb9cce20ac7143c2de8d88b187f62db99bdc ("pty: Rework the pty layer to use the normal buffering logic")
> seems to brake kdesu.

This looks like a timing bug in kdesu at first glance but it may be more
complex.

> close(11) = 0

We close one side of the pty/tty pair

> write(10, "stop", 4) = 4
> write(10, "\n", 1) = 1
> fcntl64(10, F_GETFL) = 0x2 (flags O_RDWR)
> read(10, 0xbf840858, 255) = -1 EIO (Input/output error) <<<<<<<<<

At this point the other side is closed, so we have a hangup and the read,
correctly I think, gets -EIO.

I will have a look at kdesu on Monday, I've got Fedora setups here so
hopefully I can reproduce it simply.

2009-07-11 23:53:45

by Sergey Senozhatsky

Subject: Re: possible regression with pty.c commit

On (07/11/09 23:24), Alan Cox wrote:
> On Sat, 11 Jul 2009 22:15:56 +0300
> Sergey Senozhatsky <[email protected]> wrote:
>
> > Hello,
> > commit d945cb9cce20ac7143c2de8d88b187f62db99bdc ("pty: Rework the pty layer to use the normal buffering logic")
> > seems to brake kdesu.
>
> This looks like a timing bug in kdesu at first glance but it may be more
> complex.
>
> > close(11) = 0
>
> We close one side of the pty/tty pair
>
> > write(10, "stop", 4) = 4
> > write(10, "\n", 1) = 1
> > fcntl64(10, F_GETFL) = 0x2 (flags O_RDWR)
> > read(10, 0xbf840858, 255) = -1 EIO (Input/output error) <<<<<<<<<
>
> At this point the other side is closed, we have a hangup and the read
> correctly I think gets -EIO.
>
> I will have a look at kdesu on Monday, I've got Fedora setups here so
> hopefully I can reproduce it simply.
>

Alan, I forgot to mention: I'm using KDE 3.5.9 (3.5.10). I don't know whether this can be reproduced with KDE 4.x.x.

Sergey



2009-07-13 11:47:47

by Alan

Subject: Re: possible regression with pty.c commit

> Alan, I forgot to tell - I'm using KDE 3.5.9 (3.5.10). Don't know whether this can be reproduced with KDE 4.x.x.

Dumping out the traces on the tty side, the tty code appears to be
working correctly. The userspace, on the other hand, appears broken and to
only work by chance with the old code.

For one, it waits for new output to appear before checking for "Password:"
and in doing so consumes any partial output it receives without checking.
I suspect it should be doing line += more.

One for the KDE people, as it's not the kind of thing we can fudge back
kernel side if my diagnosis is right.
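
The "line += more" idea can be sketched roughly as follows. This is not kdesu's actual code, only a minimal Python illustration under stated assumptions: the function name wait_for_prompt, the prompt bytes and the chunk size of 255 (taken from the strace above) are all chosen here for illustration. The point is that every fragment read is appended to what was read before, so a prompt that arrives split across several reads is still found.

```python
import os


def wait_for_prompt(fd, prompt=b"Password:", chunk_size=255):
    """Read from fd until `prompt` appears in the accumulated output.

    Returns everything read so far, or None if the stream ended first.
    Hypothetical helper for illustration; not taken from kdesu.
    """
    buffered = b""
    while prompt not in buffered:
        try:
            more = os.read(fd, chunk_size)
        except OSError:
            # On Linux, reading a pty master fails with EIO once the
            # other side has been closed; treat that as "stream ended".
            return None
        if not more:
            # Plain end of file (pipes, regular files).
            return None
        # The "line += more" step: keep earlier fragments instead of
        # discarding them before the prompt check.
        buffered += more
    return buffered
```

The same loop works on any readable file descriptor, so it can be tried on a pipe before pointing it at a pty master.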

2009-07-13 16:39:09

by Sergey Senozhatsky

Subject: Re: possible regression with pty.c commit

On (07/13/09 12:48), Alan Cox wrote:
> > Alan, I forgot to tell - I'm using KDE 3.5.9 (3.5.10). Don't know whether this can be reproduced with KDE 4.x.x.
>
> Dumping out the traces tty side the tty code appears to be
> working correctly. The userspace on the other hand appears broken and to
> only work by chance with the old code.
>
> For one it wants for new output to appear before checking for "Password:"
> and in doing so consumes any partial output it receives without checking.
> I suspect it should be doing line += more
>
> One for the KDE people as its not the kind of thing we can fudge back
> kernel side if my diagnosis is right.
>
Hello,
Sorry for the delay. d945cb9cce20ac7143c2de8d88b187f62db99bdc is not the first bad commit, as I've managed to make
kdesu work simply by adding some senseless instructions to pty_write_room(). Anyway.

As for kdesu, it's obvious that the kernel should be free from code that exists only to keep a buggy user space
program working (even one that had been working for years until 2.6.31).
Lots of versions could be affected (4.a.b, 3.c.d, ...) and I'm just not sure that all of them will be
updated (if not all, then at least the popular ones, like 3.5.x). We'll see.

Thanks,
Sergey



2009-07-14 16:26:31

by Sergey Senozhatsky

Subject: Re: possible regression with pty.c commit

Hello Alan,
I'm having another problem which I guess may be caused by the pty/tty changes.
The problem is that the ppp connection constantly hangs up under load (downloading)
(it works perfectly with the 2.6.30 kernel).

syslog:
pptp[1942]: anon log[logecho:pptp_ctrl.c:677]: Echo Reply received.
pptp[1942]: anon log[logecho:pptp_ctrl.c:677]: Echo Reply received.
pptp[1942]: anon log[logecho:pptp_ctrl.c:677]: Echo Reply received.
pptp[1942]: anon log[logecho:pptp_ctrl.c:677]: Echo Reply received.
pptp[1942]: anon log[logecho:pptp_ctrl.c:677]: Echo Reply received.
pptp[1942]: anon log[logecho:pptp_ctrl.c:677]: Echo Reply received.
pptp[1942]: anon log[logecho:pptp_ctrl.c:677]: Echo Reply received.
pptp[1942]: anon log[logecho:pptp_ctrl.c:677]: Echo Request received.
pptp[1942]: anon log[ctrlp_rep:pptp_ctrl.c:251]: Sent control packet type is 6 'Echo-Reply'
pppd[1929]: No response to 4 echo-requests
pppd[1929]: Serial link appears to be disconnected.
pppd[1929]: Connect time 8.5 minutes.

On average it works for ~10 minutes.
I did "strace -ff -F -tt -s 200 -o ... pon ..." which produced 11MiB and 46 files (it'll take some time to dig through).
Are you able to test ppp under load?

Sergey



2009-07-14 23:17:40

by Pierre Willenbrock

Subject: Re: possible regression with pty.c commit

Sergey Senozhatsky wrote:
> Hello Alan,
> I'm having another problem which I guess may be caused by pty/tty changes.
> The problem is that ppp connection constantly hangups under load (downloading)
> (it works perfectly with 30 kernel).
>
> syslog:
> pptp[1942]: anon log[logecho:pptp_ctrl.c:677]: Echo Reply received.
> pptp[1942]: anon log[logecho:pptp_ctrl.c:677]: Echo Reply received.
> pptp[1942]: anon log[logecho:pptp_ctrl.c:677]: Echo Reply received.
> pptp[1942]: anon log[logecho:pptp_ctrl.c:677]: Echo Reply received.
> pptp[1942]: anon log[logecho:pptp_ctrl.c:677]: Echo Reply received.
> pptp[1942]: anon log[logecho:pptp_ctrl.c:677]: Echo Reply received.
> pptp[1942]: anon log[logecho:pptp_ctrl.c:677]: Echo Reply received.
> pptp[1942]: anon log[logecho:pptp_ctrl.c:677]: Echo Request received.
> pptp[1942]: anon log[ctrlp_rep:pptp_ctrl.c:251]: Sent control packet type is 6 'Echo-Reply'
> pppd[1929]: No response to 4 echo-requests
> pppd[1929]: Serial link appears to be disconnected.
> pppd[1929]: Connect time 8.5 minutes.
>
> In average it works ~10 minutes.
> I did "strace -ff -F -tt -s 200 -o ... pon ..." which produced 11MiB and 46 files (it'll take some time to dig).
> Do you have any ability to test ppp under load?
>
> Sergey

Hello everyone,

I can reproduce this in mere seconds, using an otherwise idle ppp link
and ping -f -s256 -l256.
Usually, the first batch of packets hangs the pty, but if the link is
not entirely idle, the ping may work for a few more seconds. In fact,
bisect points to commit d945cb9cce20ac7143c2de8d88b187f62db99bdc ("pty:
Rework the pty layer to use the normal buffering logic").

Regards,
Pierre

2009-07-15 10:04:54

by Alan

Subject: Re: possible regression with pty.c commit

On Tue, 14 Jul 2009 19:28:37 +0300
Sergey Senozhatsky <[email protected]> wrote:

> Hello Alan,
> I'm having another problem which I guess may be caused by pty/tty changes.
> The problem is that ppp connection constantly hangups under load (downloading)
> (it works perfectly with 30 kernel).

Have you also got the unthrottle change reverted - without that revert it
certainly will hang as you describe.

2009-07-15 10:25:29

by Sergey Senozhatsky

Subject: Re: possible regression with pty.c commit

On (07/15/09 11:05), Alan Cox wrote:
> > Hello Alan,
> > I'm having another problem which I guess may be caused by pty/tty changes.
> > The problem is that ppp connection constantly hangups under load (downloading)
> > (it works perfectly with 30 kernel).
>
> Have you also got the unthrottle change reverted - without that revert it
> certainly will hang as you describe.
>

Hello Alan, do you mean

drivers/net/ppp_async.c
commit a6540f731d506d9e82444cf0020e716613d4c46c
ppp: Fix throttling bugs

- tty_unthrottle(tty);

I'll revert it and test.

Thanks,
Sergey



2009-07-15 10:29:49

by Alan

Subject: Re: possible regression with pty.c commit

On Wed, 15 Jul 2009 13:27:33 +0300
Sergey Senozhatsky <[email protected]> wrote:

> On (07/15/09 11:05), Alan Cox wrote:
> > > Hello Alan,
> > > I'm having another problem which I guess may be caused by pty/tty changes.
> > > The problem is that ppp connection constantly hangups under load (downloading)
> > > (it works perfectly with 30 kernel).
> >
> > Have you also got the unthrottle change reverted - without that revert it
> > certainly will hang as you describe.
> >
>
> Hello Alan, do yo mean
>
> drivers/net/ppp_async.c
> commit a6540f731d506d9e82444cf0020e716613d4c46c
> ppp: Fix throttling bugs
>
> - tty_unthrottle(tty);
>
> I'll revert it and test.

I do yes

2009-07-15 17:03:24

by Sergey Senozhatsky

Subject: Re: possible regression with pty.c commit

On (07/15/09 11:30), Alan Cox wrote:
> > On (07/15/09 11:05), Alan Cox wrote:
> > > > Hello Alan,
> > > > I'm having another problem which I guess may be caused by pty/tty changes.
> > > > The problem is that ppp connection constantly hangups under load (downloading)
> > > > (it works perfectly with 30 kernel).
> > >
> > > Have you also got the unthrottle change reverted - without that revert it
> > > certainly will hang as you describe.
> > >
> >
> I do yes
>

With d945cb9cce20ac7143c2de8d88b187f62db99bdc "pty: Rework the pty layer to use the normal buffering logic" applied
and a6540f731d506d9e82444cf0020e716613d4c46c "ppp: Fix throttling bugs" REVERTED,

it seems to work.

Sergey



2009-07-17 00:51:26

by Dave Young

Subject: Re: possible regression with pty.c commit

On Thu, Jul 16, 2009 at 1:03 AM, Sergey
Senozhatsky<[email protected]> wrote:
> On (07/15/09 11:30), Alan Cox wrote:
>> > On (07/15/09 11:05), Alan Cox wrote:
>> > > > Hello Alan,
>> > > > I'm having another problem which I guess may be caused by pty/tty changes.
>> > > > The problem is that ppp connection constantly hangups under load (downloading)
>> > > > (it works perfectly with 30 kernel).
>> > >
>> > > Have you also got the unthrottle change reverted - without that revert it
>> > > certainly will hang as you describe.
>> > >
>> >
>> I do yes
>>
>
>  d945cb9cce20ac7143c2de8d88b187f62db99bdc "pty: Rework the pty layer to use  the normal buffering logic"
>  and REVERTED a6540f731d506d9e82444cf0020e716613d4c46c "ppp: Fix throttling bugs"
>
> seems to work.

I see the same result: the pppoe connection is unstable with the patch.
Is there a better solution for this issue?

>
>        Sergey



--
Regards
dave

2009-07-17 08:32:22

by Sergey Senozhatsky

Subject: Re: possible regression with pty.c commit

On (07/17/09 08:51), Dave Young wrote:
> > d945cb9cce20ac7143c2de8d88b187f62db99bdc "pty: Rework the pty layer to use the normal buffering logic"
> > and REVERTED a6540f731d506d9e82444cf0020e716613d4c46c "ppp: Fix throttling bugs"
> >
> > seems to work.
>
> I have same result, the pppoe connection is unstable with the patch.
> Are there better solution for this issue?
>

Hello Dave.
The solution is in rc3-git2.
Linus has reverted the "throttling" commits.

git log drivers/net/ppp_async.c
git log drivers/net/ppp_synctty.c
4a21b8cb3550f19f838f7c48345fbbf6a0e8536b
Revert "ppp: Fix throttling bugs"

Requested-by: Alan Cox <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>


Sergey



2009-07-17 13:34:19

by Dave Young

Subject: Re: possible regression with pty.c commit

On Fri, Jul 17, 2009 at 4:34 PM, Sergey
Senozhatsky<[email protected]> wrote:
> On (07/17/09 08:51), Dave Young wrote:
>> >  d945cb9cce20ac7143c2de8d88b187f62db99bdc "pty: Rework the pty layer to use  the normal buffering logic"
>> >  and REVERTED a6540f731d506d9e82444cf0020e716613d4c46c "ppp: Fix throttling bugs"
>> >
>> > seems to work.
>>
>> I have same result, the pppoe connection is unstable with the patch.
>> Are there better solution for this issue?
>>
>
> Hello Dave.
> The solution is in rc3-git2.
> Linus has reverted "throttling" commits.
>
> git log drivers/net/ppp_async.c
> git log drivers/net/ppp_synctty.c
> 4a21b8cb3550f19f838f7c48345fbbf6a0e8536b
> Revert "ppp: Fix throttling bugs"
>
> Requested-by: Alan Cox <[email protected]>
> Signed-off-by: Linus Torvalds <[email protected]>
>
>
>        Sergey

Thanks for telling me. And what about the "scheduling while atomic" problem?




--
Regards
dave

2009-07-19 23:07:18

by Andreas Schwab

Subject: Re: possible regression with pty.c commit

Here is a testcase with expect:

#!/usr/bin/expect -f
spawn echo foo bar
expect {
"foo bar" {puts $expect_out(buffer)}
}

With d945cb9cce20ac7143c2de8d88b187f62db99bdc reverted, I get the
expected output:

$ expect -f pty.exp
spawn echo foo bar
foo bar
foo bar

Without the revert the expect process receives EIO when it tries to read
the output of the echo process which already died.
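
For readers without expect installed, roughly the same experiment can be sketched in Python. This is only an illustration and not part of the original report: the helper names drain and run_echo_under_pty are hypothetical, and treating EIO as end of file is an assumption about how a reader on the master side would want to handle a closed slave. Whether the echoed text is still delivered before the EIO is exactly the kernel behavior under discussion, so no particular output is claimed here.

```python
import errno
import os
import pty


def drain(fd, chunk_size=255):
    """Collect everything readable from fd, treating EIO like end of file."""
    chunks = []
    while True:
        try:
            data = os.read(fd, chunk_size)
        except OSError as err:
            if err.errno == errno.EIO:
                # A Linux pty master reports EIO once the slave is closed.
                break
            raise
        if not data:
            # Ordinary end of file (pipes, regular files).
            break
        chunks.append(data)
    return b"".join(chunks)


def run_echo_under_pty():
    """Run `echo foo bar` on a fresh pty and return what the master saw."""
    pid, master = pty.fork()
    if pid == 0:
        # Child: stdin/stdout/stderr are already attached to the pty slave.
        os.execvp("echo", ["echo", "foo", "bar"])
    try:
        return drain(master)
    finally:
        os.close(master)
        os.waitpid(pid, 0)
```

Calling run_echo_under_pty() on kernels with and without the revert gives a quick way to compare what the reading side receives.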

Andreas.

--
Andreas Schwab, [email protected]
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."