Move the hadover message to after the boot console has been released to
avoid bad interactions between it and the real console.
Signed-off-by: Maciej W. Rozycki <[email protected]>
---
The 69331af79cf29e26d1231152a172a1a10c2df511 commit of May 8th added a
"console handover: ..." message to register_console() that is output
during the short period when both the boot and the newly-registered
console are registered.
This is presumably fine for boot consoles implemented entirely by Linux
as they are fully controlled. But it may produce problems when the boot
console is actually implemented as a call to the firmware which may not be
quite happy about how the OS driver for the piece of hardware involved
controls it.
I hit this problem with the DECstation. Depending on the configuration
the fimrware uses a graphics adapter or a predefined serial port for
console output -- which device actually that is cannot be reliably
determined by Linux, though an approximation may be possible. Now if the
firmware uses the serial port and Linux is asked to use the same serial
port for the real console, then this printk() hangs forever in the
firmware. The driver used is drivers/serial/zs.c.
The reason is by the time the ->write() call is issued for the boot
console as a result of this printk(), the zs.c driver has been initialised
and because at the moment the serial port has not been opened, the serial
transmitter is disabled. The firmware polls for the transmit buffer empty
condition, but does not enable the module, presumably under the assumption
it will not be called once an OS driver has taken control of the device
(the register containing the enable bit is write-only anyway, so it would
be hard to restore the previous value). This causes a hang, because once
a single character is put into the transmit buffer it will not become
empty until the transmitter has been enabled.
The serial console as implemented by zs.c handles the case correctly, by
enabling the transmitter, outputting what should be output, waiting for
the transmit shift register to drain and restoring the state of the
transmitter enable (which is held in a shadow variable).
I feel a bit uneasy about keeping serial transmitters enabled for lines
that have not been opened; I gather others may agree as for example while
not explicitly mentioned, I believe it is implied by what is said in
Documentation/serial/driver referring to the ->shutdown() call: "Disable
the port, [...]" -- with the transmitter enabled a port can hardly be
considered fully disabled. Below is a change which makes the problem
disappear for me, but I suppose there was a deliberate reason for placing
the printk() where it is now and nowhere else.
Any suggestions will be appreciated.
Maciej
patch-mips-2.6.23-rc5-20070904-printk-handover-0
diff -up --recursive --new-file linux-mips-2.6.23-rc5-20070904.macro/kernel/printk.c linux-mips-2.6.23-rc5-20070904/kernel/printk.c
--- linux-mips-2.6.23-rc5-20070904.macro/kernel/printk.c 2007-09-04 04:56:21.000000000 +0000
+++ linux-mips-2.6.23-rc5-20070904/kernel/printk.c 2007-09-19 21:10:16.000000000 +0000
@@ -1014,11 +1014,11 @@ void register_console(struct console *co
return;
if (bootconsole && (console->flags & CON_CONSDEV)) {
+ unregister_console(bootconsole);
+ console->flags &= ~CON_PRINTBUFFER;
printk(KERN_INFO "console handover: boot [%s%d] -> real [%s%d]\n",
bootconsole->name, bootconsole->index,
console->name, console->index);
- unregister_console(bootconsole);
- console->flags &= ~CON_PRINTBUFFER;
} else {
printk(KERN_INFO "console [%s%d] enabled\n",
console->name, console->index);
On Thu, 20 Sep 2007 18:28:49 +0100 (BST) "Maciej W. Rozycki" <[email protected]> wrote:
> Move the hadover message to after the boot console has been released to
> avoid bad interactions between it and the real console.
>
> Signed-off-by: Maciej W. Rozycki <[email protected]>
> ---
> The 69331af79cf29e26d1231152a172a1a10c2df511 commit of May 8th added a
> "console handover: ..." message to register_console() that is output
> during the short period when both the boot and the newly-registered
> console are registered.
>
> This is presumably fine for boot consoles implemented entirely by Linux
> as they are fully controlled. But it may produce problems when the boot
> console is actually implemented as a call to the firmware which may not be
> quite happy about how the OS driver for the piece of hardware involved
> controls it.
>
> I hit this problem with the DECstation. Depending on the configuration
> the fimrware uses a graphics adapter or a predefined serial port for
> console output -- which device actually that is cannot be reliably
> determined by Linux, though an approximation may be possible. Now if the
> firmware uses the serial port and Linux is asked to use the same serial
> port for the real console, then this printk() hangs forever in the
> firmware. The driver used is drivers/serial/zs.c.
>
> The reason is by the time the ->write() call is issued for the boot
> console as a result of this printk(), the zs.c driver has been initialised
> and because at the moment the serial port has not been opened, the serial
> transmitter is disabled. The firmware polls for the transmit buffer empty
> condition, but does not enable the module, presumably under the assumption
> it will not be called once an OS driver has taken control of the device
> (the register containing the enable bit is write-only anyway, so it would
> be hard to restore the previous value). This causes a hang, because once
> a single character is put into the transmit buffer it will not become
> empty until the transmitter has been enabled.
>
> The serial console as implemented by zs.c handles the case correctly, by
> enabling the transmitter, outputting what should be output, waiting for
> the transmit shift register to drain and restoring the state of the
> transmitter enable (which is held in a shadow variable).
>
> I feel a bit uneasy about keeping serial transmitters enabled for lines
> that have not been opened; I gather others may agree as for example while
> not explicitly mentioned, I believe it is implied by what is said in
> Documentation/serial/driver referring to the ->shutdown() call: "Disable
> the port, [...]" -- with the transmitter enabled a port can hardly be
> considered fully disabled. Below is a change which makes the problem
> disappear for me, but I suppose there was a deliberate reason for placing
> the printk() where it is now and nowhere else.
>
> Any suggestions will be appreciated.
>
> Maciej
>
> patch-mips-2.6.23-rc5-20070904-printk-handover-0
> diff -up --recursive --new-file linux-mips-2.6.23-rc5-20070904.macro/kernel/printk.c linux-mips-2.6.23-rc5-20070904/kernel/printk.c
> --- linux-mips-2.6.23-rc5-20070904.macro/kernel/printk.c 2007-09-04 04:56:21.000000000 +0000
> +++ linux-mips-2.6.23-rc5-20070904/kernel/printk.c 2007-09-19 21:10:16.000000000 +0000
> @@ -1014,11 +1014,11 @@ void register_console(struct console *co
> return;
>
> if (bootconsole && (console->flags & CON_CONSDEV)) {
> + unregister_console(bootconsole);
> + console->flags &= ~CON_PRINTBUFFER;
> printk(KERN_INFO "console handover: boot [%s%d] -> real [%s%d]\n",
> bootconsole->name, bootconsole->index,
> console->name, console->index);
> - unregister_console(bootconsole);
> - console->flags &= ~CON_PRINTBUFFER;
> } else {
> printk(KERN_INFO "console [%s%d] enabled\n",
> console->name, console->index);
It would be useful to have some basic information like: Which kernel
version was this found in? Which kernel version last worked?
Maciej W. Rozycki <[email protected]> wrote:
> Move the hadover message to after the boot console has been released to
> avoid bad interactions between it and the real console.
This message is usefull if the handover fails, therefore it should be printed
on the boot console, while successfull switchover is implied by having any
output on the real console. Isn't it?
--
Our last fight was my fault: My wife asked me "What's on the TV?"
I said, "Dust!"
Fri?, Spammer: [email protected]
Andrew Morton wrote:
> On Thu, 20 Sep 2007 18:28:49 +0100 (BST) "Maciej W. Rozycki" <[email protected]> wrote:
>
>> Move the hadover message to after the boot console has been released to
>> avoid bad interactions between it and the real console.
[ longish problem discussion snipped ]
>> considered fully disabled. Below is a change which makes the problem
>> disappear for me, but I suppose there was a deliberate reason for placing
>> the printk() where it is now and nowhere else.
Well, I placed the printk there is for user interface reasons. I think
especially in case the early console and the real console go to
different physical devices it is useful to have the reason it stops
printing messages displayed on the early console. So people don't think
the computer hangs although it just prints messages elsewhere ...
If that isn't going to work due to two instances not knowing each other
(kernel & firmware) should not mess with the same physical device, then
I'd just drop the printk. And I see no pretty and easy way around that
issue :-(
We could do the printk and unregister before we setup the new console.
Which has the drawback that we are in trouble in case the setup() call
for the new console fails ...
We could split the printk into two, one early ("trying to setup new
console foo") which goes to the boot console, then (assuming the setup
worked ok) unregister silently and print a message about the successful
init and boot console unregister on the new console only. Which results
in two lines being printed for the handover when both consoles address
the same physical device. Not that nice IMHO, but maybe still the best
way to handle it.
cheers,
Gerd
On Fri, Sep 21, 2007 at 02:42:34PM +0200, Gerd Hoffmann wrote:
> Andrew Morton wrote:
> > On Thu, 20 Sep 2007 18:28:49 +0100 (BST) "Maciej W. Rozycki" <[email protected]> wrote:
> >
> >> Move the hadover message to after the boot console has been released to
> >> avoid bad interactions between it and the real console.
>
> [ longish problem discussion snipped ]
>
> >> considered fully disabled. Below is a change which makes the problem
> >> disappear for me, but I suppose there was a deliberate reason for placing
> >> the printk() where it is now and nowhere else.
>
> Well, I placed the printk there is for user interface reasons. I think
> especially in case the early console and the real console go to
> different physical devices it is useful to have the reason it stops
> printing messages displayed on the early console. So people don't think
> the computer hangs although it just prints messages elsewhere ...
>
> If that isn't going to work due to two instances not knowing each other
> (kernel & firmware) should not mess with the same physical device, then
> I'd just drop the printk. And I see no pretty and easy way around that
> issue :-(
>
> We could do the printk and unregister before we setup the new console.
> Which has the drawback that we are in trouble in case the setup() call
> for the new console fails ...
>
> We could split the printk into two, one early ("trying to setup new
> console foo") which goes to the boot console, then (assuming the setup
> worked ok) unregister silently and print a message about the successful
> init and boot console unregister on the new console only. Which results
> in two lines being printed for the handover when both consoles address
> the same physical device. Not that nice IMHO, but maybe still the best
> way to handle it.
I had an issue with the console initialisation on serial ports, which I
discovered during my PXA work. My reason for asking about the kernel
versions (which Andrew forwarded to LKML) is to determine whether the
report is as a result of those changes, or lack of those changes.
Those -mm versions with git-arm in probably have that change. Ergo
the importance to answer this question about kernel versions.
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of:
Russell King wrote:
> I had an issue with the console initialisation on serial ports, which I
> discovered during my PXA work. My reason for asking about the kernel
> versions (which Andrew forwarded to LKML) is to determine whether the
> report is as a result of those changes, or lack of those changes.
>
> Those -mm versions with git-arm in probably have that change. Ergo
> the importance to answer this question about kernel versions.
The problem description sounds like it could have been introduced by
that patch:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=69331af79cf29e26d1231152a172a1a10c2df511
cheers,
Gerd
On Fri, 21 Sep 2007, Gerd Hoffmann wrote:
> Well, I placed the printk there is for user interface reasons. I think
> especially in case the early console and the real console go to
> different physical devices it is useful to have the reason it stops
> printing messages displayed on the early console. So people don't think
> the computer hangs although it just prints messages elsewhere ...
I have thought this might be the reason, but did not want to suggest
anything. Yes, it is useful in a way and I was bitten a few times in the
past with the output disappearing on the boot console with no further
output (and usually a complete hang) because of a problem in the real
console driver. It just happens I always knew which driver was involved.
> If that isn't going to work due to two instances not knowing each other
> (kernel & firmware) should not mess with the same physical device, then
> I'd just drop the printk. And I see no pretty and easy way around that
> issue :-(
Not so fast please. :-)
> We could do the printk and unregister before we setup the new console.
> Which has the drawback that we are in trouble in case the setup() call
> for the new console fails ...
>
> We could split the printk into two, one early ("trying to setup new
> console foo") which goes to the boot console, then (assuming the setup
> worked ok) unregister silently and print a message about the successful
> init and boot console unregister on the new console only. Which results
> in two lines being printed for the handover when both consoles address
> the same physical device. Not that nice IMHO, but maybe still the best
> way to handle it.
I have thought of it too. It is just the "for" loop responsible for
doing this is somewhat convoluted so I did not want to proceed with a
possibly complicated patch just to demonstrate the problem.
I think it is not a bad approach and in some sense it could be nicer as
would provide some feedback on which drivers are tried for the console.
The drawback is if there are many candidate console drivers that may fail
for some reason, like being optional for the platform, then a lot of
output will result. The way it could work might be by printing the:
handover: boot [<foo>] -> real [<bar>]
line to the boot console only and then printing:
console [<bar>] enabled
unconditionally if registering of <bar> succeeded and once <foo> has been
unregistered.
On the other hand the current approach has this limitation that if it
actually works for a given platform, then the two lines may be printed to
the same devices and overlap. This is for example the case with the SWARM
board (using drivers/serial/sb1250-duart.c) which also uses a firmware
call for the initial console. It does not seem to be that concerned about
the transmitter being disabled, but it does not wait for the transmitter
shift register to drain and output is garbled. Here is an excerpt from a
SWARM boot log as seen on the serial port (codes in agle brackets coming
from the interpretation of non-ASCII characters by `less'):
PID hash table entries: 1024 (order: 10, 8192 bytes)
Using 1.000 MHz high precision timer.
Console: colour dum7<BD><B9><CD><BD><B1><95><81>handover: boot [early0] -> real [duart0]
Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
Though frankly it is even some previous line being destroyed here...
Maciej
On Fri, 21 Sep 2007, Andrew Morton wrote:
> It would be useful to have some basic information like: Which kernel
> version was this found in? Which kernel version last worked?
The kernel used was a 2.6.23-rc5 snapshot from Sep 4th, 2007 taken from
the linux-mips.org tree (as stated by the revision of the patch provided).
There is no version that can be considered working as the change to add
the printk() to register_console() went in before drivers/serial/zs.c was
added and the old drivers/tc/zs.c driver might have been to sloppy to
handle such bits as the transmitter enable properly. I may investigate.
Chances are nobody tried any version of the kernel from the period
between the printk() was added and drivers/tc/zs.c removed with the
DECstation; I have not.
Anyway, I did confirm keeping the transmitter enabled all the time makes
the problem go away.
Maciej
On Fri, Sep 21, 2007 at 02:36:50PM +0100, Maciej W. Rozycki wrote:
> On Fri, 21 Sep 2007, Andrew Morton wrote:
>
> > It would be useful to have some basic information like: Which kernel
> > version was this found in? Which kernel version last worked?
>
> The kernel used was a 2.6.23-rc5 snapshot from Sep 4th, 2007 taken from
> the linux-mips.org tree (as stated by the revision of the patch provided).
> There is no version that can be considered working as the change to add
> the printk() to register_console() went in before drivers/serial/zs.c was
> added and the old drivers/tc/zs.c driver might have been to sloppy to
> handle such bits as the transmitter enable properly. I may investigate.
>
> Chances are nobody tried any version of the kernel from the period
> between the printk() was added and drivers/tc/zs.c removed with the
> DECstation; I have not.
>
> Anyway, I did confirm keeping the transmitter enabled all the time makes
> the problem go away.
I suspect the correct fix for this is to take the fix in the ARM tree
for serial_core's console initialisation, and ensure that zs.c handles
the power management of the port (enabling/disabling transmitter)
in the power management callback.
You can find the fix in the mbox archive:
http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cut/arm:devel.mbox
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of:
On Fri, 21 Sep 2007, Russell King wrote:
> I suspect the correct fix for this is to take the fix in the ARM tree
> for serial_core's console initialisation, and ensure that zs.c handles
> the power management of the port (enabling/disabling transmitter)
> in the power management callback.
Hmm, it sounds a bit hackish to me. The serial part is from the 1970s --
nobody thought of power management back then. And even though the
DECstation is about 20 years younger, it does not provide any means for
power management either -- it is a workstation/server class system after
all.
> You can find the fix in the mbox archive:
>
> http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cut/arm:devel.mbox
Hmm, nothing of relevance there.
I see no problem with implementing the ->pm() call as something like:
void zs_pm(struct uart_port uport, unsigned int state, unsigned int oldstate)
{
struct zs_port *zport = to_zport(uport);
if (state < 3)
zport->regs[5] |= TxENAB;
else
zport->regs[5] &= ~TxENAB;
write_zsreg(zport, R5, zport->regs[5]);
}
itself, but it looks more like a workaround than a fix to me. I'll have
to think about it and also double-check it is safe. The SCC is about as
complicated as you can get with a serial port.
Thanks for the hint though.
Maciej
On Fri, Sep 21, 2007 at 03:36:05PM +0100, Maciej W. Rozycki wrote:
> On Fri, 21 Sep 2007, Russell King wrote:
> > You can find the fix in the mbox archive:
> >
> > http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cut/arm:devel.mbox
>
> Hmm, nothing of relevance there.
Hmm, it does help to occasionally run the script which uploads the
patches and stuff. Also I typoed the URL:
http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/arm:devel.mbox
> I see no problem with implementing the ->pm() call as something like:
>
> void zs_pm(struct uart_port uport, unsigned int state, unsigned int oldstate)
> {
> struct zs_port *zport = to_zport(uport);
>
> if (state < 3)
> zport->regs[5] |= TxENAB;
> else
> zport->regs[5] &= ~TxENAB;
> write_zsreg(zport, R5, zport->regs[5]);
> }
>
> itself, but it looks more like a workaround than a fix to me. I'll have
> to think about it and also double-check it is safe. The SCC is about as
> complicated as you can get with a serial port.
You will need the serial patch from the above mbox.
PS, Gerd's bouncing email address removed from the CC line.
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of:
On Fri, 21 Sep 2007, Russell King wrote:
> > I see no problem with implementing the ->pm() call as something like:
> >
> > void zs_pm(struct uart_port uport, unsigned int state, unsigned int oldstate)
> > {
> > struct zs_port *zport = to_zport(uport);
> >
> > if (state < 3)
> > zport->regs[5] |= TxENAB;
> > else
> > zport->regs[5] &= ~TxENAB;
> > write_zsreg(zport, R5, zport->regs[5]);
> > }
> >
> > itself, but it looks more like a workaround than a fix to me. I'll have
> > to think about it and also double-check it is safe. The SCC is about as
> > complicated as you can get with a serial port.
>
> You will need the serial patch from the above mbox.
I have got it now and tried together with the function above and some
minor adjustments as appropriate elsewhere. It seems to work; by
coincidence (because of the sequence the bootstrap initialisation of the
console and the serial core is done) it even works when the selected real
console is other than the bootstrap one. I do not particularly like to
have things working by accident and I stand by what I said previously
about this approach looking hackish to me.
Though lacking better proposals I suppose this will have to stay. Here
is a change I have implemented; if no further comments are seen, I will
submit it for inclusion.
Maciej
patch-mips-2.6.23-rc5-20070904-zs-pm-4
diff -up --recursive --new-file linux-mips-2.6.23-rc5-20070904.macro/drivers/serial/zs.c linux-mips-2.6.23-rc5-20070904/drivers/serial/zs.c
--- linux-mips-2.6.23-rc5-20070904.macro/drivers/serial/zs.c 2007-09-04 04:55:44.000000000 +0000
+++ linux-mips-2.6.23-rc5-20070904/drivers/serial/zs.c 2007-09-22 20:25:10.000000000 +0000
@@ -787,7 +787,6 @@ static int zs_startup(struct uart_port *
zport->regs[1] &= ~RxINT_MASK;
zport->regs[1] |= RxINT_ALL | TxINT_ENAB | EXT_INT_ENAB;
zport->regs[3] |= RxENABLE;
- zport->regs[5] |= TxENAB;
zport->regs[15] |= BRKIE;
write_zsreg(zport, R1, zport->regs[1]);
write_zsreg(zport, R3, zport->regs[3]);
@@ -814,7 +813,6 @@ static void zs_shutdown(struct uart_port
spin_lock_irqsave(&scc->zlock, flags);
- zport->regs[5] &= ~TxENAB;
zport->regs[3] &= ~RxENABLE;
write_zsreg(zport, R5, zport->regs[5]);
write_zsreg(zport, R3, zport->regs[3]);
@@ -959,6 +957,23 @@ static void zs_set_termios(struct uart_p
spin_unlock_irqrestore(&scc->zlock, flags);
}
+/*
+ * Hack alert!
+ * Required solely so that the initial PROM-based console
+ * works undisturbed in parallel with this one.
+ */
+static void zs_pm(struct uart_port *uport, unsigned int state,
+ unsigned int oldstate)
+{
+ struct zs_port *zport = to_zport(uport);
+
+ if (state < 3)
+ zport->regs[5] |= TxENAB;
+ else
+ zport->regs[5] &= ~TxENAB;
+ write_zsreg(zport, R5, zport->regs[5]);
+}
+
static const char *zs_type(struct uart_port *uport)
{
@@ -1041,6 +1056,7 @@ static struct uart_ops zs_ops = {
.startup = zs_startup,
.shutdown = zs_shutdown,
.set_termios = zs_set_termios,
+ .pm = zs_pm,
.type = zs_type,
.release_port = zs_release_port,
.request_port = zs_request_port,
@@ -1190,6 +1206,7 @@ static int __init zs_console_setup(struc
return ret;
zs_reset(zport);
+ zs_pm(uport, 0, -1);
if (options)
uart_parse_options(options, &baud, &parity, &bits, &flow);