2009-10-15 11:06:35

by Christian Borntraeger

[permalink] [raw]
Subject: Re: [PATCH] hvc_console: returning 0 from put_chars is not an error

Am Mittwoch 14 Oktober 2009 23:53:46 schrieben Sie:
> hvc_console_print() calls the HVC client driver's put_chars() callback
> to write some characters to the console. If the callback returns 0, that
> indicates that no characters were written (perhaps the output buffer is
> full), but hvc_console_print() treats that as an error and discards the
> rest of the buffer.
>
> So change hvc_console_print() to just loop and call put_chars() again if it
> returns a 0 return code.
>
> This change makes hvc_console_print() behave more like hvc_push(), which
> does check for a 0 return code and re-schedules itself.

There is a difference between console and tty, if the console call does not
return, it might bring the full system to a stop. (if its the preferred console,
init will stop)

> This patch might fix drivers that return 0 to indicate that they're busy, such
> as arch/powerpc/platforms/pseries/hvconsole.c. It will break drivers that
> return 0 if their output buffer is full, but where those buffers cannot be
> emptied while the kernel is in a loop.

Yep. I think it really depends on the backend if looping will result in any
progress or not. My experience wth hvc_console is, that its quite hard to get
changes tested on all backends. (e.g. XEN, pseries, iseries, virtio_console,
s390_iucv...), so even if this change turns out to be correct, it should
probably sit in linux-next for a while. In additon I really dont oversee, what
backends wil break due to this patch.

The fact that struct console->write returns void indicates that the console
layer is not interested in errors. We have two policies we can implement:

1. drop console messages if case of congestion but keep the system going
2. dont drop messages and wait, even if the system might come to a complete stop

Looking at drivers/char/vt.c
/* console busy or not yet initialized */
if (!printable)
return;
if (!spin_trylock(&printing_lock))
return;
could mean that Linux consoles should not block.

Maybe its time to ask some of the elder magicians (CCing Alan Cox and linux-
kernel) about blocking and error handling in console code.


> --- a/drivers/char/hvc_console.c
> +++ b/drivers/char/hvc_console.c
> @@ -161,7 +161,7 @@ static void hvc_console_print(struct console *co, const
> char *b, }
> } else {
> r = cons_ops[index]->put_chars(vtermnos[index], c, i);
> - if (r <= 0) {
> + if (r < 0) {
> /* throw away chars on error */
> i = 0;
> } else if (r > 0) {
>


2009-10-15 16:09:34

by Scott Wood

[permalink] [raw]
Subject: Re: [PATCH] hvc_console: returning 0 from put_chars is not an error

On Thu, Oct 15, 2009 at 01:05:47PM +0200, Christian Borntraeger wrote:
> The fact that struct console->write returns void indicates that the console
> layer is not interested in errors. We have two policies we can implement:
>
> 1. drop console messages if case of congestion but keep the system going
> 2. dont drop messages and wait, even if the system might come to a complete stop
>
> Looking at drivers/char/vt.c
> /* console busy or not yet initialized */
> if (!printable)
> return;
> if (!spin_trylock(&printing_lock))
> return;
> could mean that Linux consoles should not block.

That's a bit different -- the code above is testing for potential deadlocks
within Linux (or a not-yet-initialized console), not a device that has yet
to process the last batch of characters we threw at it. Plus, given the
"console must be locked when we get here" comment, I'm not sure that you'll
ever see contention on printing_lock?

Serial consoles currently block when waiting for the buffer to drain:

static void serial8250_console_putchar(struct uart_port *port, int ch)
{
struct uart_8250_port *up = (struct uart_8250_port *)port;

wait_for_xmitr(up, UART_LSR_THRE);
serial_out(up, UART_TX, ch);
}


-Scott

2009-10-15 18:42:50

by Christian Borntraeger

[permalink] [raw]
Subject: Re: [PATCH] hvc_console: returning 0 from put_chars is not an error

Am Donnerstag 15 Oktober 2009 18:09:06 schrieb Scott Wood:
> On Thu, Oct 15, 2009 at 01:05:47PM +0200, Christian Borntraeger wrote:
> > The fact that struct console->write returns void indicates that the
> > console layer is not interested in errors. We have two policies we can
> > implement:
> >
> > 1. drop console messages if case of congestion but keep the system going
> > 2. dont drop messages and wait, even if the system might come to a
> > complete stop
> >
> > Looking at drivers/char/vt.c
> > /* console busy or not yet initialized */
> > if (!printable)
> > return;
> > if (!spin_trylock(&printing_lock))
> > return;
> > could mean that Linux consoles should not block.
>
> That's a bit different -- the code above is testing for potential deadlocks
> within Linux (or a not-yet-initialized console), not a device that has yet
> to process the last batch of characters we threw at it. Plus, given the
> "console must be locked when we get here" comment, I'm not sure that you'll
> ever see contention on printing_lock?
>
> Serial consoles currently block when waiting for the buffer to drain:

Right. Looking at more drivers it seems that both ways (waiting and dropping)
are used.

Hmmm, if we are ok with having both options, we should let the hvc backend
decide if it wants to drain or to discard.

If we just busy loop, it actually does not matter how we let hvc_console react
on 0, as long as we adopt all backends to use that interface consistent.

On the other hand, backends might want to do special magic on congestion so I
personally tend to let the backend loop instead of hvc_console. But I am really
not sure.

Christian

2009-10-15 18:55:45

by Timur Tabi

[permalink] [raw]
Subject: Re: [PATCH] hvc_console: returning 0 from put_chars is not an error

Christian Borntraeger wrote:
> Hmmm, if we are ok with having both options, we should let the hvc backend
> decide if it wants to drain or to discard.
>
> If we just busy loop, it actually does not matter how we let hvc_console react
> on 0, as long as we adopt all backends to use that interface consistent.
>
> On the other hand, backends might want to do special magic on congestion so I
> personally tend to let the backend loop instead of hvc_console. But I am really
> not sure.

The reason that we're asking for this change is that I have an hvc client driver that drops characters during heavy printk() output. hvc calls my driver, but the output buffer is still full, so the driver just returns 0.

If I add a spin-loop in the driver, then we don't need to change hvc, but now the driver is blocking during user-space print operations (via hvc_write and hvc_push).

--
Timur Tabi
Linux kernel developer at Freescale

2009-10-15 18:57:58

by Scott Wood

[permalink] [raw]
Subject: Re: [PATCH] hvc_console: returning 0 from put_chars is not an error

Christian Borntraeger wrote:
> Right. Looking at more drivers it seems that both ways (waiting and dropping)
> are used.
>
> Hmmm, if we are ok with having both options, we should let the hvc backend
> decide if it wants to drain or to discard.

I'd say the dropping approach is quite undesirable (significant
potential for output loss unless the buffer is huge), unless there's
simply no way to safely spin. Hopefully there are no such backends, but
if there are perhaps we can have them return some special code to
indicate that.

> If we just busy loop, it actually does not matter how we let hvc_console react
> on 0, as long as we adopt all backends to use that interface consistent.
>
> On the other hand, backends might want to do special magic on congestion so I
> personally tend to let the backend loop instead of hvc_console. But I am really
> not sure.

Doing it in the backend requires the backend to know whether it's being
called for printk or for user I/O. In the latter case, we don't want to
spin, but rather wait for an IRQ (or poll with a timer if there's no IRQ).

-Scott

2009-10-15 19:27:08

by Christian Borntraeger

[permalink] [raw]
Subject: Re: [PATCH] hvc_console: returning 0 from put_chars is not an error

Am Donnerstag 15 Oktober 2009 20:57:45 schrieb Scott Wood:
> Doing it in the backend requires the backend to know whether it's being
> called for printk or for user I/O. In the latter case, we don't want to
> spin, but rather wait for an IRQ (or poll with a timer if there's no IRQ).

Right. Now you have me convinced that we really should have some logic in
hvc_console because, its the only place that knows if it came from printk or
user.

About the backends, there are some that spin until the text is delivered (e.g.
virtio) , others can drop (e.g. iucv is a connection oriented protocol and it
will (and has to) drop if there is no connection).

2009-10-15 19:33:07

by Scott Wood

[permalink] [raw]
Subject: Re: [PATCH] hvc_console: returning 0 from put_chars is not an error

Christian Borntraeger wrote:
> About the backends, there are some that spin until the text is delivered (e.g.
> virtio) , others can drop (e.g. iucv is a connection oriented protocol and it
> will (and has to) drop if there is no connection).

Sure, dropping due to not having a connection makes sense. That's
different from merely being busy. Can the iucv code tell the difference
between those two states?

-Scott

2009-10-16 04:47:53

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: [PATCH] hvc_console: returning 0 from put_chars is not an error

On Thu, 2009-10-15 at 13:57 -0500, Scott Wood wrote:
> I'd say the dropping approach is quite undesirable (significant
> potential for output loss unless the buffer is huge), unless there's
> simply no way to safely spin. Hopefully there are no such backends, but
> if there are perhaps we can have them return some special code to
> indicate that.

Should never spin. Best is to keep a copy in the upper layer of the
pending data and throttle (not accept further data from tty layer) until
we have managed to flush out that "pending" buffer.

> > If we just busy loop, it actually does not matter how we let hvc_console react
> > on 0, as long as we adopt all backends to use that interface consistent.
> >
> > On the other hand, backends might want to do special magic on congestion so I
> > personally tend to let the backend loop instead of hvc_console. But I am really
> > not sure.
>
> Doing it in the backend requires the backend to know whether it's being
> called for printk or for user I/O. In the latter case, we don't want to
> spin, but rather wait for an IRQ (or poll with a timer if there's no IRQ).

Ben.

2009-10-16 08:50:01

by Hendrik Brueckner

[permalink] [raw]
Subject: Re: [PATCH] hvc_console: returning 0 from put_chars is not an error

On Thu, Oct 15, 2009 at 02:32:54PM -0500, Scott Wood wrote:
> Christian Borntraeger wrote:
>> About the backends, there are some that spin until the text is
>> delivered (e.g. virtio) , others can drop (e.g. iucv is a connection
>> oriented protocol and it will (and has to) drop if there is no
>> connection).
>
> Sure, dropping due to not having a connection makes sense. That's
> different from merely being busy. Can the iucv code tell the difference
> between those two states?

The states are handled by the hvc_iucv itself:
If the hvc_iucv code has a connection established, terminal or console data
are queued and sent to the peer. If the state is disconnected, terminal and
console data is discarded internally.

- Hendrik

2009-10-16 15:34:29

by Scott Wood

[permalink] [raw]
Subject: Re: [PATCH] hvc_console: returning 0 from put_chars is not an error

On Fri, Oct 16, 2009 at 03:46:45PM +1100, Benjamin Herrenschmidt wrote:
> On Thu, 2009-10-15 at 13:57 -0500, Scott Wood wrote:
> > I'd say the dropping approach is quite undesirable (significant
> > potential for output loss unless the buffer is huge), unless there's
> > simply no way to safely spin. Hopefully there are no such backends, but
> > if there are perhaps we can have them return some special code to
> > indicate that.
>
> Should never spin.

Why is a hypervisor console different than a serial port in this regard?

> Best is to keep a copy in the upper layer of the pending data and throttle
> (not accept further data from tty layer) until we have managed to flush
> out that "pending" buffer.

The data isn't coming from the tty layer -- we're talking about printk. How
do you throttle that without spinning?

I agree that it shouldn't spin when handling tty I/O.

-Scott

2009-10-16 18:03:50

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: [PATCH] hvc_console: returning 0 from put_chars is not an error

On Fri, 2009-10-16 at 10:33 -0500, Scott Wood wrote:
> On Fri, Oct 16, 2009 at 03:46:45PM +1100, Benjamin Herrenschmidt wrote:
> > On Thu, 2009-10-15 at 13:57 -0500, Scott Wood wrote:
> > > I'd say the dropping approach is quite undesirable (significant
> > > potential for output loss unless the buffer is huge), unless there's
> > > simply no way to safely spin. Hopefully there are no such backends, but
> > > if there are perhaps we can have them return some special code to
> > > indicate that.
> >
> > Should never spin.
>
> Why is a hypervisor console different than a serial port in this regard?

Ah sorry, yeah, struct console can I suppose, it's the tty that
shouldn't.

> > Best is to keep a copy in the upper layer of the pending data and throttle
> > (not accept further data from tty layer) until we have managed to flush
> > out that "pending" buffer.
>
> The data isn't coming from the tty layer -- we're talking about printk. How
> do you throttle that without spinning?
>
> I agree that it shouldn't spin when handling tty I/O.

Ben.

2009-10-17 23:17:56

by Timur Tabi

[permalink] [raw]
Subject: Re: [PATCH] hvc_console: returning 0 from put_chars is not an error

On Fri, Oct 16, 2009 at 3:49 AM, Hendrik Brueckner
<[email protected]> wrote:

> The states are handled by the hvc_iucv itself:
> If the hvc_iucv code has a connection established, terminal or console data
> are queued and sent to the peer. If the state is disconnected, terminal and
> console data is discarded internally.

So what is the plan now?

--
Timur Tabi
Linux kernel developer at Freescale