2004-09-07 19:58:42

by Alan

[permalink] [raw]
Subject: The Serial Layer

Is anyone currently looking at fixing this before I start applying
extreme violence ? In particular to start trying to do something about
the races in TIOCSTI, line discipline setting, hangup v receive, drivers
abusing the API and calling ldisc.receive_buf direct ?

Alan


2004-09-07 20:24:44

by Arjan van de Ven

[permalink] [raw]
Subject: Re: The Serial Layer

On Tue, 2004-09-07 at 20:49, Alan Cox wrote:
> Is anyone currently looking at fixing this before I start applying
> extreme violence ? In particular to start trying to do something about
> the races in TIOCSTI, line discipline setting, hangup v receive, drivers
> abusing the API and calling ldisc.receive_buf direct ?

don't you mean the TTY layer instead of the serial layer ?


Attachments:
signature.asc (189.00 B)
This is a digitally signed message part

2004-09-07 20:27:42

by Russell King

[permalink] [raw]
Subject: Re: The Serial Layer

On Tue, Sep 07, 2004 at 07:49:42PM +0100, Alan Cox wrote:
> Is anyone currently looking at fixing this before I start applying
> extreme violence ? In particular to start trying to do something about
> the races in TIOCSTI, line discipline setting, hangup v receive, drivers
> abusing the API and calling ldisc.receive_buf direct ?

I'm certainly not delving into the TTY layers itself - there's far
too many drivers which would break horribly if I were to do that.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core

2004-09-07 20:27:43

by Alan

[permalink] [raw]
Subject: Re: The Serial Layer

On Maw, 2004-09-07 at 21:06, Arjan van de Ven wrote:
> On Tue, 2004-09-07 at 20:49, Alan Cox wrote:
> > Is anyone currently looking at fixing this before I start applying
> > extreme violence ? In particular to start trying to do something about
> > the races in TIOCSTI, line discipline setting, hangup v receive, drivers
> > abusing the API and calling ldisc.receive_buf direct ?
>
> don't you mean the TTY layer instead of the serial layer ?

Both. A lot of hangup/receive races are in the serial drivers themselves
doing things like

hangup
[close ldisc]
send bytes to the ldisc
[Boom!]

Alan

2004-09-09 16:43:47

by Theodore Ts'o

[permalink] [raw]
Subject: Re: The Serial Layer

On Tue, Sep 07, 2004 at 08:14:20PM +0100, Alan Cox wrote:
> On Maw, 2004-09-07 at 21:06, Arjan van de Ven wrote:
> > On Tue, 2004-09-07 at 20:49, Alan Cox wrote:
> > > Is anyone currently looking at fixing this before I start applying
> > > extreme violence ? In particular to start trying to do something about
> > > the races in TIOCSTI, line discipline setting, hangup v receive, drivers
> > > abusing the API and calling ldisc.receive_buf direct ?

Calling ldisc.receive_buf directly() should be OK as long as you're
not in an interrupt handler. (For example the comtrol driver polls in
a timer bottom-half, so it's OK to call ldisc.receive_buf).
Unfortunately, some drivers don't quite follow the rules.

> > don't you mean the TTY layer instead of the serial layer ?
>
> Both. A lot of hangup/receive races are in the serial drivers themselves
> doing things like
>
> hangup
> [close ldisc]
> send bytes to the ldisc
> [Boom!]

The hangup handling needs to be completely redone, so that we don't
force serial drivers to do a completely shutdown of the port in an
interrupt context. If the drivers are careful, it can be safe, but
it's too hard to handle hangup correctly.

If you have time to work on the tty layer (sucker!!!), please go ahead
and start work by all means. I was hoping to have time to clean up
some of the more egregious problems sometime next year (after I escape
back into development), but getting this fixed sooner rather the later
would be a definitely Good Thing.

- Ted

2004-09-09 17:41:07

by Alan

[permalink] [raw]
Subject: Re: The Serial Layer

On Iau, 2004-09-09 at 13:57, Theodore Ts'o wrote:
> Calling ldisc.receive_buf directly() should be OK as long as you're
> not in an interrupt handler. (For example the comtrol driver polls in
> a timer bottom-half, so it's OK to call ldisc.receive_buf).
> Unfortunately, some drivers don't quite follow the rules.

If the driver calls ldisc->receive_buf it means it bypasses the
TTY_DONT_FLIP locking used by the n_tty driver. Am I missing something
here.

> The hangup handling needs to be completely redone, so that we don't
> force serial drivers to do a completely shutdown of the port in an
> interrupt context. If the drivers are careful, it can be safe, but
> it's too hard to handle hangup correctly.

Most of that I think comes out in the wash with the ldisc locking.
Once the driver uses tty_ldisc_ref() it'll get NULL back in the case
where the hangup raced the driver. I'm also no longer dropping back
to N_TTY in the hangup path. I couldn't see any reason this was
neccessary since the release will clean it up anyway ?

Alan

2004-09-10 15:48:24

by Theodore Ts'o

[permalink] [raw]
Subject: Re: The Serial Layer

On Thu, Sep 09, 2004 at 05:28:46PM +0100, Alan Cox wrote:
> On Iau, 2004-09-09 at 13:57, Theodore Ts'o wrote:
> > Calling ldisc.receive_buf directly() should be OK as long as you're
> > not in an interrupt handler. (For example the comtrol driver polls in
> > a timer bottom-half, so it's OK to call ldisc.receive_buf).
> > Unfortunately, some drivers don't quite follow the rules.
>
> If the driver calls ldisc->receive_buf it means it bypasses the
> TTY_DONT_FLIP locking used by the n_tty driver. Am I missing something
> here.

Um, yes, you're right. The direct calls to ldisc->receive_buf predate
TTY_DONT_FLIP; TTY_DONT_FLIP was added by Bill Hawes (if I remember
correctly) in an attempt to fix various locking problems, but
unfortunately the direct callers weren't fixed.

> Most of that I think comes out in the wash with the ldisc locking.
> Once the driver uses tty_ldisc_ref() it'll get NULL back in the case
> where the hangup raced the driver. I'm also no longer dropping back
> to N_TTY in the hangup path. I couldn't see any reason this was
> neccessary since the release will clean it up anyway ?

We needed to close the line discpline in order to prevent a line
discipline (such as ppp) from trying to write to the tty. Given there
was an invariant that a tty always had a line discpline always, we
reassigned it back to N_TTY. The right answer is probably to be to
add checks to the line discpline code before they attempt to send data
to the tty.

- Ted

2004-09-10 16:31:44

by Alan

[permalink] [raw]
Subject: Re: The Serial Layer

On Gwe, 2004-09-10 at 15:32, Theodore Ts'o wrote:
> We needed to close the line discpline in order to prevent a line
> discipline (such as ppp) from trying to write to the tty. Given there
> was an invariant that a tty always had a line discpline always, we
> reassigned it back to N_TTY. The right answer is probably to be to
> add checks to the line discpline code before they attempt to send data
> to the tty.

Ok that makes sense. So we need the same kind of ordering rules about
device->close() as we now enforce with ldisc->close(). That suggests we
simply need to call ldisc->close before device->close. I think we can
now do that safely as the device won't deliver data to a closed ldisc.

Will look into it

2004-09-12 03:02:27

by David Eger

[permalink] [raw]
Subject: Re: The Serial Layer

On Tue, Sep 07, 2004 at 07:49:42PM +0100, Alan Cox wrote:
> Is anyone currently looking at fixing this before I start applying
> extreme violence ?

While you're at it, could you patch it up so we can have more than one
serial device again? I tracked down a bug a month ago having to do
with the pmac_zilog driver freaking out because it tried to

uart_register(major=TTY_MAJOR, minor=64, nr=foo)

and someone else had gotten there first. It boils down to a call
to register_chrdev_region(), which fails if the requested space is
taken, instead of looking past the claimed device numbers for some
more empty ones...

-dte

2004-09-12 16:27:44

by Alan

[permalink] [raw]
Subject: Re: The Serial Layer

On Sul, 2004-09-12 at 04:01, David Eger wrote:
> While you're at it, could you patch it up so we can have more than one
> serial device again? I tracked down a bug a month ago having to do
> with the pmac_zilog driver freaking out because it tried to

uart layer is rather below the stuff I'm touching in the tty code.