2002-01-21 14:11:01

by Jan Hudec

Subject: 2.4.17 OOPS in tty code.

Hello All,

The tty device code oopses when /dev/console is closed and devfs is in use.
The bug is reproducible on the 2.4.17 UML port; the UML arch code, however,
does not seem to be involved. The problem is that the tty flip buffer flushing
task somehow remains in the tq_timer task queue when the tty struct is freed.
When the device is subsequently reopened (or the memory is allocated for
another purpose), run_task_queue oopses when it comes across the entry, which
has had its pointers overwritten.
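
The failure mode described above can be modelled in user space. The following is only a sketch under stated assumptions: `struct task`, `struct fake_tty` and every `*_model` function are hypothetical stand-ins invented for illustration, not the kernel's tq_struct or tty code. It shows a queue entry embedded in a structure, and a release path that unlinks the entry before the structure is freed, so that a later queue run never walks into freed memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for a task queue entry (hypothetical, not tq_struct). */
struct task {
    struct task *next;
    int queued;                 /* plays the role of a "pending" flag */
    void (*routine)(void *);
    void *data;
};

struct task_queue { struct task *head; };

/* Hypothetical tty-like object with the queue entry embedded in it. */
struct fake_tty {
    struct task flip_task;
    int flushed;
};

/* Queue a task once; returns 1 if queued, 0 if it was already pending. */
static int queue_task_model(struct task *t, struct task_queue *q)
{
    if (t->queued)
        return 0;
    t->queued = 1;
    t->next = q->head;
    q->head = t;
    return 1;
}

/* Detach the whole list, then run each entry; returns how many ran. */
static int run_task_queue_model(struct task_queue *q)
{
    struct task *t = q->head;
    int n = 0;

    q->head = NULL;
    while (t) {
        struct task *next = t->next;  /* read before the routine runs */
        t->queued = 0;
        t->routine(t->data);
        n++;
        t = next;
    }
    return n;
}

/* Unlink one entry if it is still queued; returns 1 if it was removed. */
static int dequeue_task_model(struct task *t, struct task_queue *q)
{
    struct task **pp;

    for (pp = &q->head; *pp; pp = &(*pp)->next) {
        if (*pp == t) {
            *pp = t->next;
            t->next = NULL;
            t->queued = 0;
            return 1;
        }
    }
    return 0;
}

static void flush_model(void *data)
{
    ((struct fake_tty *)data)->flushed++;
}

static struct fake_tty *fake_tty_alloc(void)
{
    struct fake_tty *tty = calloc(1, sizeof(*tty));

    if (tty) {
        tty->flip_task.routine = flush_model;
        tty->flip_task.data = tty;
    }
    return tty;
}

/* Release path: nothing in the queue may point into tty once it is freed. */
static void fake_tty_release(struct fake_tty *tty, struct task_queue *q)
{
    dequeue_task_model(&tty->flip_task, q);
    free(tty);
}
```

If `fake_tty_release` skipped the dequeue step, the next `run_task_queue_model` call would dereference the freed entry, which is the kind of crash the report describes.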

The bug is regularly triggered during the shutdown process (init seems to
close and reopen /dev/console).

As it's the user-mode port, I don't have a standard OOPS message, but I am
willing to provide any backtraces and logs you request.

------------------------------------------------------------------------
- Jan Hudec <[email protected]>


2002-01-21 22:03:54

by Jeff Dike

Subject: Re: 2.4.17 OOPS in tty code.

[email protected] said:
> Tty device code causes oopses when closing /dev/console and devfs is
> used. The bug is reproducible on 2.4.17 UML port.

How do you reproduce it?

UML config, command line, a backtrace, etc would be nice.

Jeff

2002-01-21 22:48:26

by Richard Gooch

Subject: Re: 2.4.17 OOPS in tty code.

Jeff Dike writes:
> [email protected] said:
> > Tty device code causes oopses when closing /dev/console and devfs is
> > used. The bug is reproducible on 2.4.17 UML port.
>
> How do you reproduce it?
>
> UML config, command line, a backtrace, etc would be nice.

Furthermore, this was done without applying the latest devfs patch
(v199.8 as I write this). Bug reports with old versions of devfs are
(and should be) dropped in the bit-bucket, especially considering
recent devfs patches have ChangeLog entries which talk about fixing
Oopses!

Regards,

Richard....
Permanent: [email protected]
Current: [email protected]

2002-01-21 23:14:56

by Andrew Morton

Subject: Re: 2.4.17 OOPS in tty code.

Richard Gooch wrote:
>
> Jeff Dike writes:
> > [email protected] said:
> > > Tty device code causes oopses when closing /dev/console and devfs is
> > > used. The bug is reproducible on 2.4.17 UML port.
> >
> > How do you reproduce it?
> >
> > UML config, command line, a backtrace, etc would be nice.
>
> Furthermore, this was done without applying the latest devfs patch
> (v199.8 as I write this). Bug reports with old versions of devfs are
> (and should be) dropped in the bit-bucket, especially considering
> recent devfs patches have ChangeLog entries which talk about fixing
> Oopses!
>

Jan's report seems to have nothing to do with devfs. It
sounds like it's purely a tty-layer thing.

I'd like to see the full backtrace before we bitbucket
this one, please.



2002-01-24 13:19:08

by Jan Hudec

Subject: Re: 2.4.17 OOPS in tty code.

Hello,

Sorry for the late reply. My access to mail is a bit complicated.

Here are some traces for the OOPS. I added the following suggested patch:
--- linux-2.4.18-pre4/drivers/char/tty_io.c Tue Jan 15 15:08:24 2002
+++ linux-akpm-1/drivers/char/tty_io.c Mon Jan 21 15:23:32 2002
@@ -1266,8 +1266,14 @@ static void release_dev(struct file * fi
     /*
      * Make sure that the tty's task queue isn't activated.
      */
+    if (test_bit(TTY_DONT_FLIP, &tty->flags))
+        BUG();
     run_task_queue(&tq_timer);
+    if (tty->flip.tqueue.sync)
+        BUG();
     flush_scheduled_tasks();
+    if (tty->flip.tqueue.sync)
+        BUG();
and none of the BUG() checks was triggered.

I also added printks to some functions to print the argument values on entry
(and on return from alloc_tty_struct). For every call to tty_flip_buffer_push
I added a breakpoint that prints a full backtrace.

This is the backtrace at the actual segfault:
Breakpoint 1, panic (fmt=0xa0146420 "Kernel mode fault at addr 0x%lx, ip 0x%lx") at panic.c:45
45 {
(gdb) bt
#0 panic (fmt=0xa0146420 "Kernel mode fault at addr 0x%lx, ip 0x%lx") at panic.c:45
#1 0xa00c578c in segv (address=0, ip=2684428006, is_write=0, is_user=0) at trap_kern.c:94
#2 0xa00c63ea in segv_handler (sig=11, sc=0xa0893ae0, usermode=0) at trap_user.c:369
#3 0xa00c6575 in sig_handler (sig=11, sc={gs = 0, __gsh = 0, fs = 0, __fsh = 0, es = 43, __esh = 0, ds = 43, __dsh = 0, edi = 2693587200, esi = 2693348808, ebp = 2693348816, esp = 2693348792, ebx = 0, edx = 0, ecx = 2693348576, eax = 0, trapno = 14, err = 4, eip = 2684428006, cs = 35, __csh = 0, eflags = 2163219, esp_at_signal = 2693348792, ss = 43, __ssh = 0, fpstate = 0xa0893b38, oldmask = 134217728, cr2 = 0}) at trap_user.c:428
#4 <signal handler called>
#5 __run_task_queue (list=0xa0169330) at softirq.c:352
#6 0xa00701d9 in release_dev (filp=0xa25d7340) at /usr/home/bulb/umlinux/include/linux/tqueue.h:122
#7 0xa0070629 in tty_release (inode=0xa08d1080, filp=0xa25d7340) at tty_io.c:1440
#8 0xa002aa25 in fput (file=0xa25d7340) at file_table.c:113
#9 0xa0029ac7 in filp_close (filp=0xa25d7340, id=0xa0171840) at open.c:838
#10 0xa0029b1c in sys_close (fd=0) at open.c:862
#11 0xa00c4575 in execute_syscall (regs={regs = {0, 2048, 1, 0, 2684353436, 2684353084, 4294967258, 43, 43, 0, 0, 6, 1074649757, 35, 2097799, 2684353040, 43}}) at syscall_kern.c:326
#12 0xa00c4671 in syscall_handler (unused=0x0) at syscall_user.c:70

I attach the console output (including the debugging printks; each is the
first line of the named function, except for alloc_tty_struct, where it is the
last one). The debugger output contains backtraces of all entries to
tty_flip_buffer_push for the same session. In the session I just waited for
the shell to start (it's started directly from inittab) and then quickly typed
halt and <CR>.

The um-kernel was compiled with the attached config. The host kernel was
2.4.13-ac8. The inittab, rc and rcS scripts used in the session are included.
All binaries (including /sbin/halt) are copied from a Debian (unstable)
installation (last updated about a month ago).

--------------------------------------------------------------------------------
- Jan Hudec `Bulb' <[email protected]>


Attachments:
(No filename) (3.19 kB)
output (5.32 kB)
output_on_console
debugger (12.81 kB)
debugger_output
.config (5.03 kB)
inittab (1.43 kB)
rc (238.00 B)
rcS (299.00 B)

2002-01-31 00:21:45

by Jan Hudec

Subject: Re: 2.4.17 OOPS in tty code.

> Hello All,
>
> The tty device code oopses when /dev/console is closed and devfs is in use.
> The bug is reproducible on the 2.4.17 UML port; the UML arch code, however,
> does not seem to be involved. The problem is that the tty flip buffer flushing
> task somehow remains in the tq_timer task queue when the tty struct is freed.
> When the device is subsequently reopened (or the memory is allocated for
> another purpose), run_task_queue oopses when it comes across the entry, which
> has had its pointers overwritten.

Well, I hunted down the bug a bit more. The user-mode code DOES get involved.
When /dev/console is opened, the tty pointer is written to vts[line].tty (in
console_open), but nothing removes it when the tty is freed. And I don't
have any process running on line 0. I am just not sure whether the correct
fix is to avoid freeing the structure (e.g. via a ref-count) or to remove the
pointer in close_console.
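
The second option (removing the pointer on close) might look roughly like the user-space sketch below. It is not the UML driver code: `line_tty[]` merely stands in for vts[line].tty, and `struct fake_tty`, `MAX_LINES` and all `*_model` functions are hypothetical names chosen for illustration. The point is only that the close path clears the per-line pointer before the structure is freed, so a later consumer finds NULL instead of a stale address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_LINES 4                     /* arbitrary size for the sketch */

/* Hypothetical tty-like object; not the kernel's struct tty_struct. */
struct fake_tty {
    int line;
    int chars_received;
};

/* Stands in for the per-line table that remembers the open tty. */
static struct fake_tty *line_tty[MAX_LINES];

static struct fake_tty *fake_tty_new(int line)
{
    struct fake_tty *tty = calloc(1, sizeof(*tty));

    if (tty)
        tty->line = line;
    return tty;
}

/* Open: remember the tty for this line. Returns 0 on success, -1 on error. */
static int console_open_model(struct fake_tty *tty)
{
    if (tty->line < 0 || tty->line >= MAX_LINES)
        return -1;
    line_tty[tty->line] = tty;
    return 0;
}

/* Close: forget the tty so that nothing can reach it after it is freed. */
static void console_close_model(struct fake_tty *tty)
{
    if (tty->line < 0 || tty->line >= MAX_LINES)
        return;
    if (line_tty[tty->line] == tty)
        line_tty[tty->line] = NULL;
}

/*
 * Input path (think: interrupt handler delivering one character).
 * Returns 1 if a tty was registered for the line, 0 if the input was dropped.
 */
static int console_input_model(int line)
{
    struct fake_tty *tty;

    if (line < 0 || line >= MAX_LINES)
        return 0;
    tty = line_tty[line];
    if (!tty)
        return 0;                       /* closed: do not touch freed memory */
    tty->chars_received++;
    return 1;
}
```

The ref-count alternative would instead keep the structure alive while the table still refers to it; which of the two fits the real driver better is exactly the open question above.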

--------------------------------------------------------------------------------
- Jan Hudec `Bulb' <[email protected]>