2008-08-10 15:18:13

by Ico Doornekamp

Subject: TIOCGWINSZ retuns old pty size after receiving SIGWINCH


Hello,

Recently my X terminals showed annoying behaviour where the application
in the terminal was not resized properly to the actual size of the X
terminal emulator window, resulting in a lot of misaligned text on the
screen. Hunting the issue down from the window manager and the terminal
emulator program, I suspect the problem might lie in the kernel. I'm
running 2.6.26 on a dual core i386.

What I see is this: the userspace application receives a SIGWINCH signal
and acquires the terminal size using the TIOCGWINSZ ioctl. It seems that
in some cases the old terminal size is returned instead of the new one.
A small delay before the ioctl seems to 'fix' this behaviour.

I noticed some changes involving locking in the pty code in the last
kernel versions. Could one of these changes cause the above behaviour? If
so, wouldn't this affect many more users?

Ico

--
:wq
^X^Cy^K^X^C^C^C^C


2008-08-12 02:51:40

by Javeed Shaikh

Subject: Re: TIOCGWINSZ retuns old pty size after receiving SIGWINCH

Ico Doornekamp wrote:
> Hello,
>
> Recently my X terminals showed annoying behaviour where the application
> in the terminal was not resized properly to the actual size of the X
> terminal emulator window, resulting in a lot of misaligned text on the
> screen. Hunting the issue down from the windowmanager and the terminal
> emulator program, I suspect the problem might lie in the kernel. I'm
> running 2.6.26 on a dual core i386.
>
> What I see is this: the userspace application receives a SIGWINCH signal
> and acquires the terminal size usign the TIOCGWINSZ ioctl. It seems that
> in some cases the old instead of the new terminal size is returned.
> A small delay before the ioctl seems to 'fix' this behaviour.
>
> I noticed some changes involving locking in the the pty code in the last
> kernel verions, could one of these changes cause the above behaviour ? If
> so, wouldn't this affect much more users ?
>
> Ico

Hi,

I've been experiencing similar issues.
Here's a screenshot: http://omploader.org/vbzA1 .
Note how emacs is only taking up half of the terminal's total height.

However, my diagnostics are different from yours. Sending SIGWINCH
(in this case, to emacs) fixes the issue, at least temporarily. However,
sending SIGWINCH to the shell process under which emacs is running
(using a negative PID, to specify the "process group") has no visible effect.
I believe the issue has to do with sending SIGWINCH to process groups,
though I do not have enough experience in this area to be sure.

git-bisect tracked it down to this:
46151122e0a2e80e5a6b2889f595e371fe2b600d is first bad commit
commit 46151122e0a2e80e5a6b2889f595e371fe2b600d
Author: Mike Galbraith <[email protected]>
Date: Thu May 8 17:00:42 2008 +0200

sched: fix weight calculations

Starting at v2.6.26, I wasn't able to easily revert the commit. However,
the following "patch" seems to have alleviated the problem.

=== begin patch ===
diff --git a/kernel/sched_features.h b/kernel/sched_features.h
index 1c7283c..0e269ed 100644
--- a/kernel/sched_features.h
+++ b/kernel/sched_features.h
@@ -1,4 +1,4 @@
-SCHED_FEAT(NEW_FAIR_SLEEPERS, 1)
+SCHED_FEAT(NEW_FAIR_SLEEPERS, 0)
 SCHED_FEAT(WAKEUP_PREEMPT, 1)
 SCHED_FEAT(START_DEBIT, 1)
 SCHED_FEAT(AFFINE_WAKEUPS, 1)
=== end patch ===

This is merely a workaround, however.

I was able to easily revert commit 46151122e0a2e80e5a6b2889f595e371fe2b600d on
the latest git version of the kernel. However, it did not alleviate the problem.

As in v2.6.26, the above patch (disabling NEW_FAIR_SLEEPERS) seems to work well.

2008-08-12 04:03:31

by Javeed Shaikh

Subject: Re: TIOCGWINSZ retuns old pty size after receiving SIGWINCH

I spoke too soon. The problem still exists with my workaround on the
latest git kernel, but it seems to be harder to reproduce.


2008-08-12 23:58:30

by Javeed Shaikh

Subject: Re: TIOCGWINSZ retuns old pty size after receiving SIGWINCH

I appear to have fixed it.

It seems that SIGWINCH was being fired off before the tty's size was
updated, as Ico hypothesized.
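
To make the ordering concern concrete, here is a tiny single-threaded model in C. It is only an illustration under simplifying assumptions: the "signal" is a plain callback that runs immediately, which represents the worst case, and no locking is modelled at all. All names (model_tty, reader, resize_notify_first, resize_update_first) are invented for this sketch and are not kernel code.

```c
#include <assert.h>
#include <stddef.h>

struct size {
    unsigned short rows, cols;
};

struct model_tty {
    struct size winsize;                  /* what a TIOCGWINSZ-like read returns */
    void (*notify)(struct model_tty *t);  /* stands in for SIGWINCH delivery */
    struct size seen;                     /* size observed by the notified reader */
};

/* The "application": reads the size as soon as it is notified. */
static void reader(struct model_tty *t)
{
    t->seen = t->winsize;
}

/* Notify first, store the new size afterwards. */
static void resize_notify_first(struct model_tty *t, struct size ws)
{
    assert(t->notify != NULL);
    t->notify(t);
    t->winsize = ws;
}

/* Store the new size first, notify afterwards. */
static void resize_update_first(struct model_tty *t, struct size ws)
{
    assert(t->notify != NULL);
    t->winsize = ws;
    t->notify(t);
}
```

In this model, a reader notified by resize_notify_first() records the old size, while one notified by resize_update_first() records the new one. Real signal delivery is asynchronous, so the actual kernel behaviour also depends on scheduling and on which locks are held.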

The patch follows.
Please comment!

diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
index 7501310..8e2fa3c 100644
--- a/drivers/char/tty_io.c
+++ b/drivers/char/tty_io.c
@@ -3021,6 +3021,9 @@ static int tiocswinsz(struct tty_struct *tty, struct tty_struct *real_tty,
 	rpgrp = get_pid(real_tty->pgrp);
 	spin_unlock_irqrestore(&tty->ctrl_lock, flags);
 
+	tty->winsize = tmp_ws;
+	real_tty->winsize = tmp_ws;
+
 	if (pgrp)
 		kill_pgrp(pgrp, SIGWINCH, 1);
 	if (rpgrp != pgrp && rpgrp)
@@ -3029,8 +3032,6 @@ static int tiocswinsz(struct tty_struct *tty, struct tty_struct *real_tty,
 	put_pid(pgrp);
 	put_pid(rpgrp);
 
-	tty->winsize = tmp_ws;
-	real_tty->winsize = tmp_ws;
 done:
 	mutex_unlock(&tty->termios_mutex);
 	return 0;



2008-08-13 09:37:38

by Alan

Subject: Re: TIOCGWINSZ retuns old pty size after receiving SIGWINCH

On Tue, 12 Aug 2008 19:58:20 -0400
"Javeed Shaikh" <[email protected]> wrote:

> I appear to have fixed it.

You've moved the race.

> It seems that SIGWINCH was being fired off before the tty's size was
> updated, as Ico hypothesized.

There is what I believe to be a correct fix in the -next tree. It is a
bit invasive for 2.6.27 given we are in the -rc tree, but I could push the
relevant pieces if the problem justifies the risk.

Alan

2008-08-19 07:40:37

by Andrew Morton

Subject: Re: TIOCGWINSZ retuns old pty size after receiving SIGWINCH

On Sun, 10 Aug 2008 17:08:59 +0200 Ico Doornekamp <[email protected]> wrote:

>
> Hello,
>
> Recently my X terminals showed annoying behaviour where the application
> in the terminal was not resized properly to the actual size of the X
> terminal emulator window, resulting in a lot of misaligned text on the
> screen. Hunting the issue down from the windowmanager and the terminal
> emulator program, I suspect the problem might lie in the kernel. I'm
> running 2.6.26 on a dual core i386.
>
> What I see is this: the userspace application receives a SIGWINCH signal
> and acquires the terminal size usign the TIOCGWINSZ ioctl. It seems that
> in some cases the old instead of the new terminal size is returned.
> A small delay before the ioctl seems to 'fix' this behaviour.
>
> I noticed some changes involving locking in the the pty code in the last
> kernel verions, could one of these changes cause the above behaviour ? If
> so, wouldn't this affect much more users ?
>

hm, that code is pretty simple and although it does the SIGWINCH and
the window-size setting in a peculiar order, it looks to be race-free.

Approximately what proportion of the time does it go wrong?

2008-08-19 07:54:34

by Ico Doornekamp

Subject: Re: TIOCGWINSZ retuns old pty size after receiving SIGWINCH



* On 2008-08-19 Andrew Morton <[email protected]> wrote :

> On Sun, 10 Aug 2008 17:08:59 +0200 Ico Doornekamp <[email protected]> wrote:
>
> > Recently my X terminals showed annoying behaviour where the application
> > in the terminal was not resized properly to the actual size of the X
> > terminal emulator window, resulting in a lot of misaligned text on the
> > screen. Hunting the issue down from the windowmanager and the terminal
> > emulator program, I suspect the problem might lie in the kernel. I'm
> > running 2.6.26 on a dual core i386.
> >
> > What I see is this: the userspace application receives a SIGWINCH signal
> > and acquires the terminal size usign the TIOCGWINSZ ioctl. It seems that
> > in some cases the old instead of the new terminal size is returned.
> > A small delay before the ioctl seems to 'fix' this behaviour.
> >
> > I noticed some changes involving locking in the the pty code in the last
> > kernel verions, could one of these changes cause the above behaviour ? If
> > so, wouldn't this affect much more users ?
>
> hm, that code is pretty simple and although it does the SIGWINCH and
> the window-size setting in a peculiar order, it looks to be race-free.
>
> Approximately what proportion of the time does it go wrong?

I guess about 10 to 20% of the resizes. I happen to be using a tiling
window manager which causes resizing more often and more aggressively than
'normal' window managers; I guess this helps trigger the problem.

I temporarily worked around this issue by changing the order of the
signal and the updating of the pty size in tty_io.c's tiocswinsz(), but
this is not much of a real fix.

Ico

2008-08-19 08:07:50

by Andrew Morton

Subject: Re: TIOCGWINSZ retuns old pty size after receiving SIGWINCH

On Tue, 19 Aug 2008 09:54:14 +0200 Ico Doornekamp <[email protected]> wrote:

>
>
> * On 2008-08-19 Andrew Morton <[email protected]> wrote :
>
> > On Sun, 10 Aug 2008 17:08:59 +0200 Ico Doornekamp <[email protected]> wrote:
> >
> > > Recently my X terminals showed annoying behaviour where the application
> > > in the terminal was not resized properly to the actual size of the X
> > > terminal emulator window, resulting in a lot of misaligned text on the
> > > screen. Hunting the issue down from the windowmanager and the terminal
> > > emulator program, I suspect the problem might lie in the kernel. I'm
> > > running 2.6.26 on a dual core i386.
> > >
> > > What I see is this: the userspace application receives a SIGWINCH signal
> > > and acquires the terminal size usign the TIOCGWINSZ ioctl. It seems that
> > > in some cases the old instead of the new terminal size is returned.
> > > A small delay before the ioctl seems to 'fix' this behaviour.
> > >
> > > I noticed some changes involving locking in the the pty code in the last
> > > kernel verions, could one of these changes cause the above behaviour ? If
> > > so, wouldn't this affect much more users ?
> >
> > hm, that code is pretty simple and although it does the SIGWINCH and
> > the window-size setting in a peculiar order, it looks to be race-free.
> >
> > Approximately what proportion of the time does it go wrong?
>
> I guess about 10 to 20% of the resizes. I happen to be using a tiling
> window manager which causes resizing more often and more agressive then
> 'normal' window managers, I guess this helps triggering the problem.
>
> I temporary worked around this issue this by changing the order of the
> signal and the updating of the pty size in tty_io.c's tiocswinsz(), but
> this is not much of a real fix.
>

Well damn. Are you sure? The code looks solid to me.

At least, it does after

Author: Alan Cox <[email protected]> 2008-08-15 02:39:38
Committer: Linus Torvalds <[email protected]> 2008-08-15 10:34:07
Parent: 000b9151d7851cc1e490b2a76d0206e524f43cca (Fix race/oops in tty layer after BKL pushdown)
Branches: git-cifs, git-ia64, git-nfs, git-powerpc-merge, linux-next, remotes/origin/master
Follows: v2.6.27-rc3
Precedes: next-20080818

tty: remove resize window special case

perhaps you're still running a kernel which is earlier than that?

2008-08-19 11:44:50

by Ico Doornekamp

Subject: Re: TIOCGWINSZ retuns old pty size after receiving SIGWINCH



* On 2008-08-19 Andrew Morton <[email protected]> wrote :

> On Tue, 19 Aug 2008 09:54:14 +0200 Ico Doornekamp <[email protected]> wrote:
>
> >
> >
> > * On 2008-08-19 Andrew Morton <[email protected]> wrote :
> >
> > > On Sun, 10 Aug 2008 17:08:59 +0200 Ico Doornekamp <[email protected]> wrote:
> > >
> > > > Recently my X terminals showed annoying behaviour where the application
> > > > in the terminal was not resized properly to the actual size of the X
> > > > terminal emulator window, resulting in a lot of misaligned text on the
> > > > screen. Hunting the issue down from the windowmanager and the terminal
> > > > emulator program, I suspect the problem might lie in the kernel. I'm
> > > > running 2.6.26 on a dual core i386.
> > > >
> > > > What I see is this: the userspace application receives a SIGWINCH signal
> > > > and acquires the terminal size usign the TIOCGWINSZ ioctl. It seems that
> > > > in some cases the old instead of the new terminal size is returned.
> > > > A small delay before the ioctl seems to 'fix' this behaviour.
> > > >
> > > > I noticed some changes involving locking in the the pty code in the last
> > > > kernel verions, could one of these changes cause the above behaviour ? If
> > > > so, wouldn't this affect much more users ?
> > >
> > > hm, that code is pretty simple and although it does the SIGWINCH and
> > > the window-size setting in a peculiar order, it looks to be race-free.
> > >
> > > Approximately what proportion of the time does it go wrong?
> >
> > I guess about 10 to 20% of the resizes. I happen to be using a tiling
> > window manager which causes resizing more often and more agressive then
> > 'normal' window managers, I guess this helps triggering the problem.
> >
> > I temporary worked around this issue this by changing the order of the
> > signal and the updating of the pty size in tty_io.c's tiocswinsz(), but
> > this is not much of a real fix.
> >
>
> Well damn. Are you sure? The code looks solid to me.
>
> At least, it does after
>
> Author: Alan Cox <[email protected]> 2008-08-15 02:39:38
> Committer: Linus Torvalds <[email protected]> 2008-08-15 10:34:07
> Parent: 000b9151d7851cc1e490b2a76d0206e524f43cca (Fix race/oops in tty layer after BKL pushdown)
> Branches: git-cifs, git-ia64, git-nfs, git-powerpc-merge, linux-next, remotes/origin/master
> Follows: v2.6.27-rc3
> Precedes: next-20080818
>
> tty: remove resize window special case
>
> perhaps you're still running a kernel which is earlier than that?

I reported the behaviour on 2.6.26.2, but I was not aware this issue was
already addressed in the 2.6.27 tree. I will update to the latest 2.6.27
rc and report whether the problem still persists.

2008-08-19 17:57:01

by Ico Doornekamp

Subject: Re: TIOCGWINSZ retuns old pty size after receiving SIGWINCH



* On 2008-08-19 Ico Doornekamp <[email protected]> wrote :

>
>
> * On 2008-08-19 Andrew Morton <[email protected]> wrote :
>
> > On Tue, 19 Aug 2008 09:54:14 +0200 Ico Doornekamp <[email protected]> wrote:
> > >
> > > * On 2008-08-19 Andrew Morton <[email protected]> wrote :
> > >
> > > > On Sun, 10 Aug 2008 17:08:59 +0200 Ico Doornekamp <[email protected]> wrote:
> > > >
> > > > > Recently my X terminals showed annoying behaviour where the application
> > > > > in the terminal was not resized properly to the actual size of the X
> > > > > terminal emulator window, resulting in a lot of misaligned text on the
> > > > > screen. Hunting the issue down from the windowmanager and the terminal
> > > > > emulator program, I suspect the problem might lie in the kernel. I'm
> > > > > running 2.6.26 on a dual core i386.
> > > > >
> > > > > What I see is this: the userspace application receives a SIGWINCH signal
> > > > > and acquires the terminal size usign the TIOCGWINSZ ioctl. It seems that
> > > > > in some cases the old instead of the new terminal size is returned.
> > > > > A small delay before the ioctl seems to 'fix' this behaviour.
> > > > >
> > > > > I noticed some changes involving locking in the the pty code in the last
> > > > > kernel verions, could one of these changes cause the above behaviour ? If
> > > > > so, wouldn't this affect much more users ?
> > > >
> > > > hm, that code is pretty simple and although it does the SIGWINCH and
> > > > the window-size setting in a peculiar order, it looks to be race-free.
> > > >
> > > > Approximately what proportion of the time does it go wrong?
> > >
> > > I guess about 10 to 20% of the resizes. I happen to be using a tiling
> > > window manager which causes resizing more often and more agressive then
> > > 'normal' window managers, I guess this helps triggering the problem.
> > >
> > > I temporary worked around this issue this by changing the order of the
> > > signal and the updating of the pty size in tty_io.c's tiocswinsz(), but
> > > this is not much of a real fix.
> > >
> >
> > Well damn. Are you sure? The code looks solid to me.
> >
> > At least, it does after
> >
> > Author: Alan Cox <[email protected]> 2008-08-15 02:39:38
> > Committer: Linus Torvalds <[email protected]> 2008-08-15 10:34:07
> > Parent: 000b9151d7851cc1e490b2a76d0206e524f43cca (Fix race/oops in tty layer after BKL pushdown)
> > Branches: git-cifs, git-ia64, git-nfs, git-powerpc-merge, linux-next, remotes/origin/master
> > Follows: v2.6.27-rc3
> > Precedes: next-20080818
> >
> > tty: remove resize window special case
> >
> > perhaps you're still running a kernel which is earlier than that?
>
> I reported the behaviour on 2.6.26.2, but I was not aware this issue was
> adressed already in the 2.6.27 tree. I will update to the latest 2.6.27
> rc and report if the problem still persists.

I am not able to reproduce the problem with git 2.6.27-rc3-next-20080819.

2008-08-19 19:03:38

by Andrew Morton

Subject: Re: TIOCGWINSZ retuns old pty size after receiving SIGWINCH

On Tue, 19 Aug 2008 19:56:39 +0200
Ico Doornekamp <[email protected]> wrote:

>
>
> * On 2008-08-19 Ico Doornekamp <[email protected]> wrote :
>
> >
> >
> > * On 2008-08-19 Andrew Morton <[email protected]> wrote :
> >
> > > On Tue, 19 Aug 2008 09:54:14 +0200 Ico Doornekamp <[email protected]> wrote:
> > > >
> > > > * On 2008-08-19 Andrew Morton <[email protected]> wrote :
> > > >
> > > > > On Sun, 10 Aug 2008 17:08:59 +0200 Ico Doornekamp <[email protected]> wrote:
> > > > >
> > > > > > Recently my X terminals showed annoying behaviour where the application
> > > > > > in the terminal was not resized properly to the actual size of the X
> > > > > > terminal emulator window, resulting in a lot of misaligned text on the
> > > > > > screen. Hunting the issue down from the windowmanager and the terminal
> > > > > > emulator program, I suspect the problem might lie in the kernel. I'm
> > > > > > running 2.6.26 on a dual core i386.
> > > > > >
> > > > > > What I see is this: the userspace application receives a SIGWINCH signal
> > > > > > and acquires the terminal size usign the TIOCGWINSZ ioctl. It seems that
> > > > > > in some cases the old instead of the new terminal size is returned.
> > > > > > A small delay before the ioctl seems to 'fix' this behaviour.
> > > > > >
> > > > > > I noticed some changes involving locking in the the pty code in the last
> > > > > > kernel verions, could one of these changes cause the above behaviour ? If
> > > > > > so, wouldn't this affect much more users ?
> > > > >
> > > > > hm, that code is pretty simple and although it does the SIGWINCH and
> > > > > the window-size setting in a peculiar order, it looks to be race-free.
> > > > >
> > > > > Approximately what proportion of the time does it go wrong?
> > > >
> > > > I guess about 10 to 20% of the resizes. I happen to be using a tiling
> > > > window manager which causes resizing more often and more agressive then
> > > > 'normal' window managers, I guess this helps triggering the problem.
> > > >
> > > > I temporary worked around this issue this by changing the order of the
> > > > signal and the updating of the pty size in tty_io.c's tiocswinsz(), but
> > > > this is not much of a real fix.
> > > >
> > >
> > > Well damn. Are you sure? The code looks solid to me.
> > >
> > > At least, it does after
> > >
> > > Author: Alan Cox <[email protected]> 2008-08-15 02:39:38
> > > Committer: Linus Torvalds <[email protected]> 2008-08-15 10:34:07
> > > Parent: 000b9151d7851cc1e490b2a76d0206e524f43cca (Fix race/oops in tty layer after BKL pushdown)
> > > Branches: git-cifs, git-ia64, git-nfs, git-powerpc-merge, linux-next, remotes/origin/master
> > > Follows: v2.6.27-rc3
> > > Precedes: next-20080818
> > >
> > > tty: remove resize window special case
> > >
> > > perhaps you're still running a kernel which is earlier than that?
> >
> > I reported the behaviour on 2.6.26.2, but I was not aware this issue was
> > adressed already in the 2.6.27 tree. I will update to the latest 2.6.27
> > rc and report if the problem still persists.
>
> I am not able to reproduce the problem with git 2.6.27-rc3-next-20080819.

That's linux-next, yes?

linux-next has additional locking in there:

commit 2283faa9cec083b6ddc1fa02a974ce1c797e847f
Author: Alan Cox <[email protected]>
Date: Thu Aug 14 09:53:22 2008 +1000

tty-fix-pty-termios-race

Kanru Chen posted a patch versus the old code which deals with the case
where you resize the pty side of a pty/tty pair. In that situation the
termios data is updated for both pty and tty but the locks are not held
on both sides.

This reimplements the fix against the updated tty code. Patch by self but
the hard bit (noticing and fixing the bug) is thanks to Kanru Chen.

Signed-off-by: Alan Cox <[email protected]>

diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
index a8ddcba..779c6b5 100644
--- a/drivers/char/tty_io.c
+++ b/drivers/char/tty_io.c
@@ -2068,7 +2068,7 @@ static int tiocgwinsz(struct tty_struct *tty, struct winsize __user *arg)
 /**
  * tty_do_resize - resize event
  * @tty: tty being resized
- * @real_tty: real tty (if using a pty/tty pair)
+ * @real_tty: real tty (not the same as tty if using a pty/tty pair)
  * @rows: rows (character)
  * @cols: cols (character)
  *
@@ -2085,6 +2085,14 @@ int tty_do_resize(struct tty_struct *tty, struct tty_struct *real_tty,
 	mutex_lock(&tty->termios_mutex);
 	if (!memcmp(ws, &tty->winsize, sizeof(*ws)))
 		goto done;
+
+	/* If a pty/tty pair is updated we will have a real_tty defined
+	   which doesn't match the tty. In this case as we will update
+	   both of the tty termios sets. We can lock both mutex safely here
+	   as in this case real_tty is the tty, tty is the pty side and we
+	   have lock ordering */
+	if (real_tty != tty)
+		mutex_lock(&real_tty->termios_mutex);
 	/* Get the PID values and reference them so we can
 	   avoid holding the tty ctrl lock while sending signals */
 	spin_lock_irqsave(&tty->ctrl_lock, flags);
@@ -2102,6 +2110,8 @@ int tty_do_resize(struct tty_struct *tty, struct tty_struct *real_tty,
 
 	tty->winsize = *ws;
 	real_tty->winsize = *ws;
+	if (real_tty != tty)
+		mutex_unlock(&real_tty->termios_mutex);
 done:
 	mutex_unlock(&tty->termios_mutex);
 	return 0;

which perhaps fixed the problem.
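
The locking idea in that patch can be mimicked in userspace with pthread mutexes. The following is a rough analogue only, assuming each side of the pair carries its own mutex as in the diff above. The names model_tty, model_tty_init, model_resize and model_get_size are invented for this sketch, and signal delivery is left out entirely.

```c
#include <assert.h>
#include <pthread.h>

struct size {
    unsigned short rows, cols;
};

struct model_tty {
    pthread_mutex_t termios_mutex;  /* protects winsize on this side */
    struct size winsize;
};

static void model_tty_init(struct model_tty *t, struct size ws)
{
    int rc = pthread_mutex_init(&t->termios_mutex, NULL);

    assert(rc == 0);
    (void)rc;
    t->winsize = ws;
}

/* Update both sides while holding both locks. The locks are always
 * taken in the order tty, then real_tty, and the second one is
 * skipped when both pointers name the same object. */
static void model_resize(struct model_tty *tty, struct model_tty *real_tty,
                         struct size ws)
{
    pthread_mutex_lock(&tty->termios_mutex);
    if (real_tty != tty)
        pthread_mutex_lock(&real_tty->termios_mutex);

    tty->winsize = ws;
    real_tty->winsize = ws;

    if (real_tty != tty)
        pthread_mutex_unlock(&real_tty->termios_mutex);
    pthread_mutex_unlock(&tty->termios_mutex);
}

/* A reader on either side takes only its own lock. */
static struct size model_get_size(struct model_tty *t)
{
    struct size ws;

    pthread_mutex_lock(&t->termios_mutex);
    ws = t->winsize;
    pthread_mutex_unlock(&t->termios_mutex);
    return ws;
}
```

Because model_resize() holds the lock of each side it writes to, a reader calling model_get_size() on either side sees the old size or the new size, never a half-finished update.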


This bug may have been present in released kernels for some time - I
think I read in another thread that CPU scheduler changes might have
caused it to surface. Doesn't matter really - we don't want
user-visible races like this affecting desktop applications in stable
kernels!

Can you please help us to complete this list?

2.6.25.x: ?
2.6.26.x: bad
2.6.27-rc3/4: ?
linux-next good

Thanks.

2008-08-19 20:13:33

by Ico Doornekamp

Subject: Re: TIOCGWINSZ retuns old pty size after receiving SIGWINCH



* On 2008-08-19 Andrew Morton <[email protected]> wrote :

> On Tue, 19 Aug 2008 19:56:39 +0200
> Ico Doornekamp <[email protected]> wrote:
>
> >
> >
> > * On 2008-08-19 Ico Doornekamp <[email protected]> wrote :
> >
> > >
> > >
> > > * On 2008-08-19 Andrew Morton <[email protected]> wrote :
> > >
> > > > On Tue, 19 Aug 2008 09:54:14 +0200 Ico Doornekamp <[email protected]> wrote:
> > > > >
> > > > > * On 2008-08-19 Andrew Morton <[email protected]> wrote :
> > > > >
> > > > > > On Sun, 10 Aug 2008 17:08:59 +0200 Ico Doornekamp <[email protected]> wrote:
> > > > > >
> > > > > > > Recently my X terminals showed annoying behaviour where the application
> > > > > > > in the terminal was not resized properly to the actual size of the X
> > > > > > > terminal emulator window, resulting in a lot of misaligned text on the
> > > > > > > screen. Hunting the issue down from the windowmanager and the terminal
> > > > > > > emulator program, I suspect the problem might lie in the kernel. I'm
> > > > > > > running 2.6.26 on a dual core i386.
> > > > > > >
> > > > > > > What I see is this: the userspace application receives a SIGWINCH signal
> > > > > > > and acquires the terminal size usign the TIOCGWINSZ ioctl. It seems that
> > > > > > > in some cases the old instead of the new terminal size is returned.
> > > > > > > A small delay before the ioctl seems to 'fix' this behaviour.
> > > > > > >
> > > > > > > I noticed some changes involving locking in the the pty code in the last
> > > > > > > kernel verions, could one of these changes cause the above behaviour ? If
> > > > > > > so, wouldn't this affect much more users ?
> > > > > >
> > > > > > hm, that code is pretty simple and although it does the SIGWINCH and
> > > > > > the window-size setting in a peculiar order, it looks to be race-free.
> > > > > >
> > > > > > Approximately what proportion of the time does it go wrong?
> > > > >
> > > > > I guess about 10 to 20% of the resizes. I happen to be using a tiling
> > > > > window manager which causes resizing more often and more agressive then
> > > > > 'normal' window managers, I guess this helps triggering the problem.
> > > > >
> > > > > I temporary worked around this issue this by changing the order of the
> > > > > signal and the updating of the pty size in tty_io.c's tiocswinsz(), but
> > > > > this is not much of a real fix.
> > > > >
> > > >
> > > > Well damn. Are you sure? The code looks solid to me.
> > > >
> > > > At least, it does after
> > > >
> > > > Author: Alan Cox <[email protected]> 2008-08-15 02:39:38
> > > > Committer: Linus Torvalds <[email protected]> 2008-08-15 10:34:07
> > > > Parent: 000b9151d7851cc1e490b2a76d0206e524f43cca (Fix race/oops in tty layer after BKL pushdown)
> > > > Branches: git-cifs, git-ia64, git-nfs, git-powerpc-merge, linux-next, remotes/origin/master
> > > > Follows: v2.6.27-rc3
> > > > Precedes: next-20080818
> > > >
> > > > tty: remove resize window special case
> > > >
> > > > perhaps you're still running a kernel which is earlier than that?
> > >
> > > I reported the behaviour on 2.6.26.2, but I was not aware this issue was
> > > adressed already in the 2.6.27 tree. I will update to the latest 2.6.27
> > > rc and report if the problem still persists.
> >
> > I am not able to reproduce the problem with git 2.6.27-rc3-next-20080819.
>
> That's linux-next, yes?
>
> linux-next has additional locking in there:
>
> commit 2283faa9cec083b6ddc1fa02a974ce1c797e847f
> Author: Alan Cox <[email protected]>
> Date: Thu Aug 14 09:53:22 2008 +1000
>
> tty-fix-pty-termios-race
>
> Kanru Chen posted a patch versus the old code which deals with the case
> where you resize the pty side of a pty/tty pair. In that situation the
> termios data is updated for both pty and tty but the locks are not held
> on both sides.
>
> This reimplements the fix against the updated tty code. Patch by self but
> the hard bit (noticing and fixing the bug) is thanks to Kanru Chen.
>
> Signed-off-by: Alan Cox <[email protected]>
>
> diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
> index a8ddcba..779c6b5 100644
> --- a/drivers/char/tty_io.c
> +++ b/drivers/char/tty_io.c
> @@ -2068,7 +2068,7 @@ static int tiocgwinsz(struct tty_struct *tty, struct winsize __user *arg)
>  /**
>   * tty_do_resize - resize event
>   * @tty: tty being resized
> - * @real_tty: real tty (if using a pty/tty pair)
> + * @real_tty: real tty (not the same as tty if using a pty/tty pair)
>   * @rows: rows (character)
>   * @cols: cols (character)
>   *
> @@ -2085,6 +2085,14 @@ int tty_do_resize(struct tty_struct *tty, struct tty_struct *real_tty,
>      mutex_lock(&tty->termios_mutex);
>      if (!memcmp(ws, &tty->winsize, sizeof(*ws)))
>          goto done;
> +
> +    /* If a pty/tty pair is updated we will have a real_tty defined
> +       which doesn't match the tty. In this case as we will update
> +       both of the tty termios sets. We can lock both mutex safely here
> +       as in this case real_tty is the tty, tty is the pty side and we
> +       have lock ordering */
> +    if (real_tty != tty)
> +        mutex_lock(&real_tty->termios_mutex);
>      /* Get the PID values and reference them so we can
>         avoid holding the tty ctrl lock while sending signals */
>      spin_lock_irqsave(&tty->ctrl_lock, flags);
> @@ -2102,6 +2110,8 @@ int tty_do_resize(struct tty_struct *tty, struct tty_struct *real_tty,
>
>      tty->winsize = *ws;
>      real_tty->winsize = *ws;
> +    if (real_tty != tty)
> +        mutex_unlock(&real_tty->termios_mutex);
>  done:
>      mutex_unlock(&tty->termios_mutex);
>      return 0;
>
> which perhaps fixed the problem.
>
>
> This bug may have been present in released kernels for some time - I
> think I read in another thread that CPU scheduler changes might have
> caused it to surface. Doesn't matter really - we don't want
> user-visible races like this affecting desktop applications in stable
> kernels!
>
> Can you please help us to complete this list?

2.6.25.9: good
2.6.26-rc1: bad (+/- 10% of the resizes)
2.6.26.2: still bad (+/- 10% of the resizes)
2.6.27-rc3: ugly bad (> 75% of the resizes)
linux-next: good

Any other versions to test?

--
:wq
^X^Cy^K^X^C^C^C^C
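
For anyone reading the patch quoted above without the kernel tree at hand,
here is a rough user-space model of its locking pattern. It is only a sketch:
pthread mutexes stand in for termios_mutex, the names model_tty,
model_winsize and model_do_resize are made up for illustration, and the
signal delivery that the real tty_do_resize() performs is left out.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

/* Hypothetical stand-in for struct winsize (rows and columns only). */
struct model_winsize {
    unsigned short rows, cols;
};

/* Hypothetical stand-in for struct tty_struct: one mutex, one size. */
struct model_tty {
    pthread_mutex_t termios_mutex;
    struct model_winsize winsize;
};

void model_tty_init(struct model_tty *t)
{
    pthread_mutex_init(&t->termios_mutex, NULL);
    memset(&t->winsize, 0, sizeof(t->winsize));
}

/*
 * Store the new size on both ends of a pair.
 * tty is the end the resize arrived on; real_tty is the tty side,
 * or the same pointer as tty when there is no pair.
 * The lock order is always tty first, then real_tty.
 * Returns 1 if the size changed, 0 if it was already current.
 */
int model_do_resize(struct model_tty *tty, struct model_tty *real_tty,
                    const struct model_winsize *ws)
{
    int changed = 0;

    assert(tty != NULL && real_tty != NULL && ws != NULL);

    pthread_mutex_lock(&tty->termios_mutex);
    if (memcmp(ws, &tty->winsize, sizeof(*ws)) != 0) {
        /* Take the second mutex only when the two ends differ. */
        if (real_tty != tty)
            pthread_mutex_lock(&real_tty->termios_mutex);

        tty->winsize = *ws;
        real_tty->winsize = *ws;

        if (real_tty != tty)
            pthread_mutex_unlock(&real_tty->termios_mutex);
        changed = 1;
    }
    pthread_mutex_unlock(&tty->termios_mutex);
    return changed;
}
```

Calling model_do_resize(&pty, &tty, &ws) takes both mutexes, pty side first.
Passing the same pointer twice takes only one mutex, which mirrors the
real_tty != tty check in the patch.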

2008-08-19 20:49:19

by Andrew Morton

Subject: Re: TIOCGWINSZ retuns old pty size after receiving SIGWINCH

On Tue, 19 Aug 2008 22:13:11 +0200
Ico Doornekamp <[email protected]> wrote:

>
>
> >
> > This bug may have been present in released kernels for some time - I
> > think I read in another thread that CPU scheduler changes might have
> > caused it to surface. Doesn't matter really - we don't want
> > user-visible races like this affecting desktop applications in stable
> > kernels!
> >
> > Can you please help us to complete this list?
>
> 2.6.25.9: good
> 2.6.26-rc1: bad (+/- 10% of the resizes)
> 2.6.26.2: still bad (+/- 10% of the resizes)
> 2.6.27-rc3: ugly bad (> 75% of the resizes)
> linux-next: good
>
> Any other versions to test?

Current Linus mainline has

commit 8c9a9dd0fa3a269d380eaae2dc1bee39e865fae1
Author: Alan Cox <[email protected]>
Date: Fri Aug 15 10:39:38 2008 +0100

tty: remove resize window special case

which might have fixed this after 2.6.27-rc3, so testing
ftp://ftp.kernel.org/pub/linux/kernel/v2.6/snapshots/patch-2.6.27-rc3-git6.gz
would be most interesting.

We should hunt down the problem and get 2.6.26.x fixed up too.

Thanks.

2008-08-20 07:23:18

by Ico Doornekamp

Subject: Re: TIOCGWINSZ retuns old pty size after receiving SIGWINCH



* On 2008-08-19 Andrew Morton <[email protected]> wrote :

> On Tue, 19 Aug 2008 22:13:11 +0200 Ico Doornekamp <[email protected]> wrote:
> >
> > >
> > > This bug may have been present in released kernels for some time - I
> > > think I read in another thread that CPU scheduler changes might have
> > > caused it to surface. Doesn't matter really - we don't want
> > > user-visible races like this affecting desktop applications in stable
> > > kernels!
> > >
> > > Can you please help us to complete this list?
> >
> > 2.6.25.9: good
> > 2.6.26-rc1: bad (+/- 10% of the resizes)
> > 2.6.26.2: still bad (+/- 10% of the resizes)
> > 2.6.27-rc3: ugly bad (> 75% of the resizes)
> > linux-next: good
> >
> > Any other versions to test?
>
> Current Linus mainline has
>
> commit 8c9a9dd0fa3a269d380eaae2dc1bee39e865fae1
> Author: Alan Cox <[email protected]>
> Date: Fri Aug 15 10:39:38 2008 +0100
>
> tty: remove resize window special case
>
> which might have fixed this after 2.6.27-rc3, so testing
> ftp://ftp.kernel.org/pub/linux/kernel/v2.6/snapshots/patch-2.6.27-rc3-git6.gz
> would be most interesting.

Still very broken.

--
:wq
^X^Cy^K^X^C^C^C^C

2008-10-03 16:45:30

by Christoph

Subject: Re: TIOCGWINSZ retuns old pty size after receiving SIGWINCH

Hello list!

This is a follow up of
http://linux.derkeiler.com/Mailing-Lists/Kernel/2008-08/msg08509.html

Summary:
Resizing a terminal emulator window results in a lot of misaligned text
on the screen. The SIGWINCH handling appears to be defective.

It ends with this question and I'd like to contribute some confusion:

> > 2.6.25.9: good
> > 2.6.26-rc1: bad (+/- 10% of the resizes)
> > 2.6.26.2: still bad (+/- 10% of the resizes)
> > 2.6.27-rc3: ugly bad (> 75% of the resizes)
> > linux-next: good

> Any other versions to test?

I got this bug after a kernel upgrade from 2.6.25.25 to 2.6.26.5 on my ppc
notebook. That made me wonder, because on my x86 desktop I didn't have that bug.
I tested:

ppc, notebook
2.6.25.25 ok
2.6.26.5 bug
2.6.26.4 bug

x86, desktop
2.6.26.1 ok
2.6.26.2 ok
2.6.26.3 ok
2.6.26.4 ok !

So I diffed the configs of the .4 builds and one difference stands out:

ppc: CONFIG_PREEMPT (=> CONFIG_LOCK_KERNEL)
x86: CONFIG_PREEMPT_VOLUNTARY

So, recompiled and tested:

2.6.26.4, x86, CONFIG_PREEMPT => bug
2.6.26.5, ppc, CONFIG_PREEMPT_VOLUNTARY => ok

I hope this information helps you hunt down the bug.

greets

chr

2008-10-05 11:57:19

by Kanru Chen

Subject: Re: TIOCGWINSZ retuns old pty size after receiving SIGWINCH

Hi Alan,

Could you push a fix for this problem for the .27 release? It seems many
people encounter this if they compile with PREEMPT enabled, and it's quite
annoying.

Regards,
--
~ Kanru Chen <[email protected]>
'v' http://stu.csie.ncnu.edu.tw/~kanru.96/
// \\ GnuPG-Key ID: 365CC7A2
/( )\ Fingerprint: 3278 DFB4 BB28 6E8C 9E1F 1ECB B1B7 5B5F 365C C7A2
^`~'^



2008-10-05 12:03:56

by Alan

Subject: Re: TIOCGWINSZ retuns old pty size after receiving SIGWINCH

On Sun, 5 Oct 2008 19:39:44 +0800
Kanru Chen <[email protected]> wrote:

> Hi Alan,
>
> Could you push a fix for this problem for the .27 release? It seems many
> people encounter this if they compile with PREEMPT enabled, and it's quite
> annoying.

The relevant fixes were pushed and are in the current -rc series kernels.
The TIOCGWINSZ/TIOCSWINSZ calls are now serialized using the mutex of the
tty side of tty/pty pairs.

Alan
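
As a rough illustration of why one shared lock is enough, here is a small
user-space model in which both the getter and the setter take the same
mutex. It is a sketch only, not the kernel code: pair_model, size_model,
pair_model_set and pair_model_get are made-up names, and a counter stands
in for SIGWINCH delivery. Because the reader has to take the lock the
writer holds, the order of "notify" and "store" inside the critical section
no longer matters to it.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for struct winsize. */
struct size_model {
    unsigned short rows, cols;
};

/* Hypothetical model of a tty/pty pair guarded by the tty side's mutex. */
struct pair_model {
    pthread_mutex_t tty_mutex;   /* the single lock both calls take */
    struct size_model size;      /* size as seen through either end */
    unsigned long notifications; /* stands in for SIGWINCH deliveries */
};

void pair_model_init(struct pair_model *p)
{
    assert(p != NULL);
    pthread_mutex_init(&p->tty_mutex, NULL);
    p->size.rows = 0;
    p->size.cols = 0;
    p->notifications = 0;
}

/* TIOCSWINSZ stand-in. Returns 1 if the size changed, 0 otherwise. */
int pair_model_set(struct pair_model *p, struct size_model ws)
{
    int changed;

    assert(p != NULL);
    pthread_mutex_lock(&p->tty_mutex);
    changed = (p->size.rows != ws.rows || p->size.cols != ws.cols);
    if (changed) {
        p->notifications++; /* notify first ... */
        p->size = ws;       /* ... then store, still under the lock */
    }
    pthread_mutex_unlock(&p->tty_mutex);
    return changed;
}

/* TIOCGWINSZ stand-in: waits for any update in progress to finish. */
struct size_model pair_model_get(struct pair_model *p)
{
    struct size_model ws;

    assert(p != NULL);
    pthread_mutex_lock(&p->tty_mutex);
    ws = p->size;
    pthread_mutex_unlock(&p->tty_mutex);
    return ws;
}
```

In this model a reader that was woken by the notification and then calls
pair_model_get() blocks until the setter has dropped the mutex, so it
cannot observe the old size.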

2008-10-05 12:18:18

by Kanru Chen

Subject: Re: TIOCGWINSZ retuns old pty size after receiving SIGWINCH

On Sun, 5 Oct 2008 13:03:30 +0100
Alan Cox <[email protected]> wrote:

> The relevant fixes were pushed and are in the current -rc series kernels.
> The TIOCGWINSZ/TIOCSWINSZ calls are now serialized using the mutex of the
> tty side of tty/pty pairs.

Thanks! I didn't notice that.

Regards,
--
~ Kanru Chen <[email protected]>
'v' http://stu.csie.ncnu.edu.tw/~kanru.96/
// \\ GnuPG-Key ID: 365CC7A2
/( )\ Fingerprint: 3278 DFB4 BB28 6E8C 9E1F 1ECB B1B7 5B5F 365C C7A2
^`~'^

