2001-12-03 08:51:56

by Junichi Nomura

[permalink] [raw]
Subject: [PATCH] 2.4.16 kernel/printk.c (per processor initialization check)

Hello,

I experienced system hang on my SMP machine and it turned out to be due to
console write before mmu initialization completes.

To be more specific, even if secondary processors are not in status enough
to do actual console I/O (e.g. mmu is not initialized), call_console_drivers()
tries to do it.
This leads to unpredictable result. For me, for example, it cause machine
check abort and hang up system.

Attached is a patch for it.

--- kernel/printk.c 2001/11/27 04:41:49 1.1.1.8
+++ kernel/printk.c 2001/12/03 05:25:26
@@ -491,20 +491,22 @@
*/
void release_console_sem(void)
{
unsigned long flags;
unsigned long _con_start, _log_end;
unsigned long must_wake_klogd = 0;

for ( ; ; ) {
spin_lock_irqsave(&logbuf_lock, flags);
must_wake_klogd |= log_start - log_end;
+ if (!(cpu_online_map & 1UL << smp_processor_id()))
+ break;
if (con_start == log_end)
break; /* Nothing to print */
_con_start = con_start;
_log_end = log_end;
con_start = log_end; /* Flush */
spin_unlock_irqrestore(&logbuf_lock, flags);
call_console_drivers(_con_start, _log_end);
}
console_may_schedule = 0;
up(&console_sem);

Best regards.
--
NOMURA, Jun'ichi <[email protected], [email protected]>
HPC Operating System Group, 1st Computers Software Division,
Computers Software Operations Unit, NEC Solutions.


2001-12-04 00:20:52

by Junichi Nomura

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processor initialization check)

Hi,

Thank you for commenting.

From: Andrew Morton <[email protected]>
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processor initialization check)
Date: Mon, 03 Dec 2001 01:20:28 -0800

> Seems that there is some sort of ordering problem here - someone
> is calling printk before the MMU is initialised, but after some
> console drivers have been installed.

Yes.
Because smp_init() is later in place than console_init(), printk() can be
called in such a situation.
For example, in IA-64, identify_cpu() is called before ia64_mmu_init(),
while identify_cpu() calls printk() in it.
I don't think the ordering itself is a problem.

> I suspect the real fix is elsewhere, but I'm not sure where.
>
> Probably a clearer place to put this test would be within
> printk itself, immediately before the down_trylock. Does that
> work?

The reason I put it in release_console_sem() is that release_console_sem()
can be called from other functions than printk(), e.g. console_unblank().
I agree with you that it is clearer but I think it is not sufficient.

Best regards.
--
NOMURA, Jun'ichi <[email protected], [email protected]>
HPC Operating System Group, 1st Computers Software Division,
Computers Software Operations Unit, NEC Solutions.

2001-12-04 01:45:54

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processor initialization check)

[email protected] wrote:
>
> Hello,
>
> I experienced system hang on my SMP machine and it turned out to be due to
> console write before mmu initialization completes.
>
> To be more specific, even if secondary processors are not in status enough
> to do actual console I/O (e.g. mmu is not initialized), call_console_drivers()
> tries to do it.
> This leads to unpredictable result. For me, for example, it cause machine
> check abort and hang up system.
>
> Attached is a patch for it.
>
> --- kernel/printk.c 2001/11/27 04:41:49 1.1.1.8
> +++ kernel/printk.c 2001/12/03 05:25:26
> @@ -491,20 +491,22 @@
> */
> void release_console_sem(void)
> {
> unsigned long flags;
> unsigned long _con_start, _log_end;
> unsigned long must_wake_klogd = 0;
>
> for ( ; ; ) {
> spin_lock_irqsave(&logbuf_lock, flags);
> must_wake_klogd |= log_start - log_end;
> + if (!(cpu_online_map & 1UL << smp_processor_id()))
> + break;
> if (con_start == log_end)
> break; /* Nothing to print */
> _con_start = con_start;
> _log_end = log_end;
> con_start = log_end; /* Flush */
> spin_unlock_irqrestore(&logbuf_lock, flags);
> call_console_drivers(_con_start, _log_end);
> }
> console_may_schedule = 0;
> up(&console_sem);
>

Seems that there is some sort of ordering problem here - someone
is calling printk before the MMU is initialised, but after some
console drivers have been installed.

I suspect the real fix is elsewhere, but I'm not sure where.

Probably a clearer place to put this test would be within
printk itself, immediately before the down_trylock. Does that
work?

-

2001-12-04 02:16:31

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processor initializationcheck)

[email protected] wrote:
>
> Hi,
>
> Thank you for commenting.
>
> From: Andrew Morton <[email protected]>
> Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processor initialization check)
> Date: Mon, 03 Dec 2001 01:20:28 -0800
>
> > Seems that there is some sort of ordering problem here - someone
> > is calling printk before the MMU is initialised, but after some
> > console drivers have been installed.
>
> Yes.
> Because smp_init() is later in place than console_init(), printk() can be
> called in such a situation.
> For example, in IA-64, identify_cpu() is called before ia64_mmu_init(),
> while identify_cpu() calls printk() in it.
> I don't think the ordering itself is a problem.
>
> > I suspect the real fix is elsewhere, but I'm not sure where.
> >
> > Probably a clearer place to put this test would be within
> > printk itself, immediately before the down_trylock. Does that
> > work?
>
> The reason I put it in release_console_sem() is that release_console_sem()
> can be called from other functions than printk(), e.g. console_unblank().
> I agree with you that it is clearer but I think it is not sufficient.
>

I really doubt if any of those paths could be called before
even the MMU is set up.

It seems that the ia64 port has installed some console drivers,
and has then called them before it is ready to do so. Via printk.

It should not have installed the console drivers that early. Do
you know what console driver is causing the problem?

If the console driver is not fixable then a more general approach
would be, in printk.c:

#ifndef ARCH_HAS_PRINTK_MAY_BE_USED
#define printk_may_be_used() (1)
#endif

then, in printk() itself:

if (*p == '\n')
log_level_unknown = 1;
}

+ if (!printk_may_be_used())
+ return printed_len;

if (!down_trylock(&console_sem)) {
/*

then, for ia64, give it a printk_may_be_used() function, and
define ARCH_HAS_PRINTK_MAY_BE_USED somewhere.

Or just not install console drivers before they may be safely
used!

-

2001-12-06 05:01:46

by Junichi Nomura

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processor initializationcheck)

Hi,
excuse me for not prompt response. I've been off-line for 2 days.

> > The reason I put it in release_console_sem() is that release_console_sem()
> > can be called from other functions than printk(), e.g. console_unblank().
> > I agree with you that it is clearer but I think it is not sufficient.
>
> I really doubt if any of those paths could be called before
> even the MMU is set up.

I didn't have any intention to say that console_unblank() is called so early.

OK. Here is revised patch which checks if cpu initialization is done
just before down_trylock(). This works for me.

diff -u -r1.1.1.8 printk.c
--- kernel/printk.c 2001/11/27 04:41:49 1.1.1.8
+++ kernel/printk.c 2001/12/06 04:54:50
@@ -438,7 +438,13 @@
log_level_unknown = 1;
}

- if (!down_trylock(&console_sem)) {
+ if (!(cpu_online_map & 1UL << smp_processor_id())) {
+ /*
+ * The cpu has not been initialized completely
+ * enough to call console drivers.
+ */
+ spin_unlock_irqrestore(&logbuf_lock, flags);
+ } else if (!down_trylock(&console_sem)) {
/*
* We own the drivers. We can drop the spinlock and let
* release_console_sem() print the text



Best regards.
--
NOMURA, Jun'ichi <[email protected], [email protected]>
HPC Operating System Group, 1st Computers Software Division,
Computers Software Operations Unit, NEC Solutions.

2001-12-07 03:40:57

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processorinitializationcheck)

[email protected] wrote:
>
> Hi,
> excuse me for not prompt response. I've been off-line for 2 days.
>
> > > The reason I put it in release_console_sem() is that release_console_sem()
> > > can be called from other functions than printk(), e.g. console_unblank().
> > > I agree with you that it is clearer but I think it is not sufficient.
> >
> > I really doubt if any of those paths could be called before
> > even the MMU is set up.
>

Marcelo,

after a fairly lengthy off-list discussion, it turns out that
special-casing printk is probably the best way to proceed
to fix this one.

The problem is that the boot processor sets up the console drivers,
and is able to call them via printk(), but the application processors
at that time are not able to call printk() because the console device
driver mappings are not set up. Undoing this inside the ia64 boot code is
complex and fragile. Possibly the problem exists on other platforms
but hasn't been discovered yet.

So the patch defines an architecture-specific macro `arch_consoles_callable()'
which, if not defined, defaults to `1', so the impact to other platforms
(and to uniprocessor ia64) is zero.



--- linux-2.4.17-pre4/include/asm-ia64/system.h Thu Nov 22 23:02:59 2001
+++ linux-akpm/include/asm-ia64/system.h Wed Dec 5 23:09:15 2001
@@ -405,6 +405,10 @@ extern void ia64_load_extra (struct task
ia64_psr(ia64_task_regs(prev))->dfh = 1; \
__switch_to(prev,next,last); \
} while (0)
+
+/* Return true if this CPU can call the console drivers in printk() */
+#define arch_consoles_callable() (cpu_online_map & (1UL << smp_processor_id()))
+
#else
# define switch_to(prev,next,last) do { \
ia64_psr(ia64_task_regs(next))->dfh = (ia64_get_fpu_owner() != (next)); \
--- linux-2.4.17-pre4/kernel/printk.c Thu Nov 22 23:02:59 2001
+++ linux-akpm/kernel/printk.c Wed Dec 5 23:03:54 2001
@@ -38,6 +38,10 @@

#define LOG_BUF_MASK (LOG_BUF_LEN-1)

+#ifndef arch_consoles_callable
+#define arch_consoles_callable() (1)
+#endif
+
/* printk's without a loglevel use this.. */
#define DEFAULT_MESSAGE_LOGLEVEL 4 /* KERN_WARNING */

@@ -438,6 +442,14 @@ asmlinkage int printk(const char *fmt, .
log_level_unknown = 1;
}

+ if (!arch_consoles_callable()) {
+ /*
+ * On some architectures, the consoles are not usable
+ * on secondary CPUs early in the boot process.
+ */
+ spin_unlock_irqrestore(&logbuf_lock, flags);
+ goto out;
+ }
if (!down_trylock(&console_sem)) {
/*
* We own the drivers. We can drop the spinlock and let
@@ -454,6 +466,7 @@ asmlinkage int printk(const char *fmt, .
*/
spin_unlock_irqrestore(&logbuf_lock, flags);
}
+out:
return printed_len;
}
EXPORT_SYMBOL(printk);

2001-12-07 20:08:57

by Marcelo Tosatti

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processorinitializationcheck)


How hard would it be to fix that on ia64 code?

I'm really not willing to apply this kludge...


On Thu, 6 Dec 2001, Andrew Morton wrote:

> [email protected] wrote:
> >
> > Hi,
> > excuse me for not prompt response. I've been off-line for 2 days.
> >
> > > > The reason I put it in release_console_sem() is that release_console_sem()
> > > > can be called from other functions than printk(), e.g. console_unblank().
> > > > I agree with you that it is clearer but I think it is not sufficient.
> > >
> > > I really doubt if any of those paths could be called before
> > > even the MMU is set up.
> >
>
> Marcelo,
>
> after a fairly lengthy off-list discussion, it turns out that
> special-casing printk is probably the best way to proceed
> to fix this one.
>
> The problem is that the boot processor sets up the console drivers,
> and is able to call them via printk(), but the application processors
> at that time are not able to call printk() because the console device
> driver mappings are not set up. Undoing this inside the ia64 boot code is
> complex and fragile. Possibly the problem exists on other platforms
> but hasn't been discovered yet.
>
> So the patch defines an architecture-specific macro `arch_consoles_callable()'
> which, if not defined, defaults to `1', so the impact to other platforms
> (and to uniprocessor ia64) is zero.
>
>
>
> --- linux-2.4.17-pre4/include/asm-ia64/system.h Thu Nov 22 23:02:59 2001
> +++ linux-akpm/include/asm-ia64/system.h Wed Dec 5 23:09:15 2001
> @@ -405,6 +405,10 @@ extern void ia64_load_extra (struct task
> ia64_psr(ia64_task_regs(prev))->dfh = 1; \
> __switch_to(prev,next,last); \
> } while (0)
> +
> +/* Return true if this CPU can call the console drivers in printk() */
> +#define arch_consoles_callable() (cpu_online_map & (1UL << smp_processor_id()))
> +
> #else
> # define switch_to(prev,next,last) do { \
> ia64_psr(ia64_task_regs(next))->dfh = (ia64_get_fpu_owner() != (next)); \
> --- linux-2.4.17-pre4/kernel/printk.c Thu Nov 22 23:02:59 2001
> +++ linux-akpm/kernel/printk.c Wed Dec 5 23:03:54 2001
> @@ -38,6 +38,10 @@
>
> #define LOG_BUF_MASK (LOG_BUF_LEN-1)
>
> +#ifndef arch_consoles_callable
> +#define arch_consoles_callable() (1)
> +#endif
> +
> /* printk's without a loglevel use this.. */
> #define DEFAULT_MESSAGE_LOGLEVEL 4 /* KERN_WARNING */
>
> @@ -438,6 +442,14 @@ asmlinkage int printk(const char *fmt, .
> log_level_unknown = 1;
> }
>
> + if (!arch_consoles_callable()) {
> + /*
> + * On some architectures, the consoles are not usable
> + * on secondary CPUs early in the boot process.
> + */
> + spin_unlock_irqrestore(&logbuf_lock, flags);
> + goto out;
> + }
> if (!down_trylock(&console_sem)) {
> /*
> * We own the drivers. We can drop the spinlock and let
> @@ -454,6 +466,7 @@ asmlinkage int printk(const char *fmt, .
> */
> spin_unlock_irqrestore(&logbuf_lock, flags);
> }
> +out:
> return printed_len;
> }
> EXPORT_SYMBOL(printk);
>

2001-12-07 20:52:49

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processorinitializationcheck)

On Fri, Dec 07, 2001 at 04:52:07PM -0200, Marcelo Tosatti wrote:
> How hard would it be to fix that on ia64 code?
> I'm really not willing to apply this kludge...

It's possible, but the off-list discussion's consensus centered
around this being a bootstrap ordering issue, where drivers may
not be called from the application processors (at least not universally)
until they have been initialized to some degree. printk() is in the
unique position of performing such calls, and that is the idea around
which this patch is centered. Specifically, on those architectures
where some initialization of kernel virtual addressing is required to
access memory regions from which console (or any) I/O is done, the
driver calls may not be done from application processors until they
have been initialized (on IA64, this is initializing the uncached
region's region register and installing TLB fault handlers).

So this is actually a broader issue than IA64 alone, though
IA64 seems to be the first to have encountered it. The patch here
provides a hook for all affected architectures to protect themselves
against this issue, although there are perhaps other ways.


Cheers,
Bill

2001-12-07 21:39:45

by David Mosberger

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processorinitializationcheck)

>>>>> On Fri, 7 Dec 2001 16:52:07 -0200 (BRST), Marcelo Tosatti <[email protected]> said:

Marcelo> I'm really not willing to apply this kludge...

Do you agree that it should always be safe to call printk() from C code?

--david

2001-12-07 22:01:14

by Alan

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processorinitializationcheck)

> >>>>> On Fri, 7 Dec 2001 16:52:07 -0200 (BRST), Marcelo Tosatti <[email protected]> said:
> Marcelo> I'm really not willing to apply this kludge...
>
> Do you agree that it should always be safe to call printk() from C code?

Sounds a good goal, but surely thats up to the arch code to get right

2001-12-07 22:04:14

by Marcelo Tosatti

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processorinitializationcheck)



On Fri, 7 Dec 2001, David Mosberger wrote:

> >>>>> On Fri, 7 Dec 2001 16:52:07 -0200 (BRST), Marcelo Tosatti <[email protected]> said:
>
> Marcelo> I'm really not willing to apply this kludge...
>
> Do you agree that it should always be safe to call printk() from C code?

No if you can't access the console to print the message :)

Its just that I would prefer to see the thing fixed in arch-dependant code
instead special casing core code.

2001-12-07 22:11:01

by Chris Friesen

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processorinitializationcheck)

David Mosberger wrote:
>
> >>>>> On Fri, 7 Dec 2001 16:52:07 -0200 (BRST), Marcelo Tosatti <[email protected]> said:
>
> Marcelo> I'm really not willing to apply this kludge...
>
> Do you agree that it should always be safe to call printk() from C code?

Is it really safe to call this from interrupt handlers? I can think of cases
where the time required to print can totally mess stuff up...

Chris

--
Chris Friesen | MailStop: 043/33/F10
Nortel Networks | work: (613) 765-0557
3500 Carling Avenue | fax: (613) 765-2986
Nepean, ON K2H 8E9 Canada | email: [email protected]

2001-12-07 22:18:21

by David Mosberger

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processorinitializationcheck)

>>>>> On Fri, 7 Dec 2001 18:47:11 -0200 (BRST), Marcelo Tosatti <[email protected]> said:

Marcelo> On Fri, 7 Dec 2001, David Mosberger wrote:

>> >>>>> On Fri, 7 Dec 2001 16:52:07 -0200 (BRST), Marcelo Tosatti
>> <[email protected]> said:
>>
Marcelo> I'm really not willing to apply this kludge...
>> Do you agree that it should always be safe to call printk() from
>> C code?

Marcelo> No if you can't access the console to print the message :)

Let me quote the first few lines of the Linux kernel:

----
asmlinkage void __init start_kernel(void)
{
char * command_line;
unsigned long mempages;
extern char saved_command_line[];
/*
* Interrupts are still disabled. Do necessary setups, then
* enable them
*/
lock_kernel();
printk(linux_banner);
----

You still think it doesn't make sense?

Marcelo> Its just that I would prefer to see the thing fixed in
Marcelo> arch-dependant code instead special casing core code.

Only architecture specific problems should be fixed with architecture
specific code.

I'm not entirely sure whether this particular problem is architecture
specific. Perhaps it is and, if so, I'm certainly happy to fix it in
the ia64 specific code. However, are you really 100% certain that on
x86 there are no console drivers which in some fashion depend on
cpu_init() having completed execution? Note that the x86 version of
cpu_init() also has printk()s. If you're not certain of this, the AP
startup problem could occur on x86, too. I haven't looked at all the
other platforms, but I suspect similar things will be true, there.

--david

2001-12-07 22:26:07

by Marcelo Tosatti

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processorinitializationcheck)



On Fri, 7 Dec 2001, David Mosberger wrote:

> >>>>> On Fri, 7 Dec 2001 18:47:11 -0200 (BRST), Marcelo Tosatti <[email protected]> said:
>
> Marcelo> On Fri, 7 Dec 2001, David Mosberger wrote:
>
> >> >>>>> On Fri, 7 Dec 2001 16:52:07 -0200 (BRST), Marcelo Tosatti
> >> <[email protected]> said:
> >>
> Marcelo> I'm really not willing to apply this kludge...
> >> Do you agree that it should always be safe to call printk() from
> >> C code?
>
> Marcelo> No if you can't access the console to print the message :)
>
> Let me quote the first few lines of the Linux kernel:
>
> ----
> asmlinkage void __init start_kernel(void)
> {
> char * command_line;
> unsigned long mempages;
> extern char saved_command_line[];
> /*
> * Interrupts are still disabled. Do necessary setups, then
> * enable them
> */
> lock_kernel();
> printk(linux_banner);
> ----
>
> You still think it doesn't make sense?

Ok, it does. However, this still does not make me change my mind.

> Marcelo> Its just that I would prefer to see the thing fixed in
> Marcelo> arch-dependant code instead special casing core code.
>
> Only architecture specific problems should be fixed with architecture
> specific code.
>
> I'm not entirely sure whether this particular problem is architecture
> specific. Perhaps it is and, if so, I'm certainly happy to fix it in
> the ia64 specific code. However, are you really 100% certain that on
> x86 there are no console drivers which in some fashion depend on
> cpu_init() having completed execution? Note that the x86 version of
> cpu_init() also has printk()s. If you're not certain of this, the AP
> startup problem could occur on x86, too. I haven't looked at all the
> other platforms, but I suspect similar things will be true, there.

Prove, please. If you show me it can also happen on other architectures,
I'll be glad to apply the patch.

2001-12-08 01:10:46

by David Mosberger

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processorinitializationcheck)

>>>>> On Fri, 7 Dec 2001 19:09:03 -0200 (BRST), Marcelo Tosatti <[email protected]> said:

>> I'm not entirely sure whether this particular problem is
>> architecture specific. Perhaps it is and, if so, I'm certainly
>> happy to fix it in the ia64 specific code. However, are you
>> really 100% certain that on x86 there are no console drivers
>> which in some fashion depend on cpu_init() having completed
>> execution? Note that the x86 version of cpu_init() also has
>> printk()s. If you're not certain of this, the AP startup problem
>> could occur on x86, too. I haven't looked at all the other
>> platforms, but I suspect similar things will be true, there.

Marcelo> Prove, please. If you show me it can also happen on other
Marcelo> architectures, I'll be glad to apply the patch.

I'm no x86 expert, but I have the impression that
current_cpu_data.loops_per_jiffy will be invalid (probably 0) until
smp_store_cpu_info() is called in smp_callin(). If so, a console
driver using udelay() might not work properly. I suspect there are
other issues, but this is just based on looking at the x86 source code
for 5 minutes.

--david

2001-12-08 11:24:20

by Alan

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processorinitializationcheck)

> I'm no x86 expert, but I have the impression that
> current_cpu_data.loops_per_jiffy will be invalid (probably 0) until
> smp_store_cpu_info() is called in smp_callin(). If so, a console
> driver using udelay() might not work properly. I suspect there are
> other issues, but this is just based on looking at the x86 source code
> for 5 minutes.

x86_udelay_tsc wont have been set at that point so the main timer is still
being used.

2001-12-08 16:42:06

by David Mosberger

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processorinitializationcheck)

>>>>> On Sat, 8 Dec 2001 11:27:19 +0000 (GMT), Alan Cox <[email protected]> said:

>> I'm no x86 expert, but I have the impression that
>> current_cpu_data.loops_per_jiffy will be invalid (probably 0)
>> until smp_store_cpu_info() is called in smp_callin(). If so, a
>> console driver using udelay() might not work properly. I suspect
>> there are other issues, but this is just based on looking at the
>> x86 source code for 5 minutes.

Alan> x86_udelay_tsc wont have been set at that point so the main
Alan> timer is still being used.

x86 does use current_cpu_data.loops_per_jiffy in the non-TSC case, no?

--david

2001-12-08 20:37:09

by Alan

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processorinitializationcheck)

> Alan> x86_udelay_tsc wont have been set at that point so the main
> Alan> timer is still being used.
>
> x86 does use current_cpu_data.loops_per_jiffy in the non-TSC case, no?

I believe so. So we should propogate that across earlier, although its
not needed for our current console drivers that I can see

2001-12-09 00:33:33

by David Mosberger

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processorinitializationcheck)

>>>>> On Sat, 8 Dec 2001 20:45:07 +0000 (GMT), Alan Cox <[email protected]> said:

Alan> x86_udelay_tsc wont have been set at that point so the main
Alan> timer is still being used.

>> x86 does use current_cpu_data.loops_per_jiffy in the non-TSC
>> case, no?

Alan> I believe so. So we should propogate that across earlier,
Alan> although its not needed for our current console drivers that I
Alan> can see

I don't think you can do it early enough. calibrate_delay() requires
irqs to be enabled and the first printk() happens long before irqs are
enabled on an AP.

--david

2001-12-09 00:52:24

by Alan

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processorinitializationcheck)

> I don't think you can do it early enough. calibrate_delay() requires
> irqs to be enabled and the first printk() happens long before irqs are
> enabled on an AP.

So we make sure our initial console code doesnt need udelay(), or set an
initial safe default like 25MHz

2001-12-09 00:59:14

by David Mosberger

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processorinitializationcheck)

>>>>> On Sun, 9 Dec 2001 00:55:25 +0000 (GMT), Alan Cox <[email protected]> said:

>> I don't think you can do it early enough. calibrate_delay()
>> requires irqs to be enabled and the first printk() happens long
>> before irqs are enabled on an AP.

Alan> So we make sure our initial console code doesnt need udelay(),
Alan> or set an initial safe default like 25MHz

So someone is going to maintain a list of what a console driver can
and cannot do for all 12+ ports in existence?

The alternative is to do:

--- linux-2.4.16/kernel/printk.c Mon Nov 26 11:19:24 2001
+++ lia64-kdb/kernel/printk.c Thu Nov 29 21:45:08 2001
@@ -498,6 +505,10 @@
for ( ; ; ) {
spin_lock_irqsave(&logbuf_lock, flags);
must_wake_klogd |= log_start - log_end;
+#ifdef CONFIG_SMP
+ if (!(cpu_online_map & (1UL << smp_processor_id())))
+ break;
+#endif
if (con_start == log_end)
break; /* Nothing to print */
_con_start = con_start;

and be done with it.

--david

2001-12-09 01:10:25

by Alan

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processorinitializationcheck)

> Alan> So we make sure our initial console code doesnt need udelay(),
> Alan> or set an initial safe default like 25MHz
>
> So someone is going to maintain a list of what a console driver can
> and cannot do for all 12+ ports in existence?
>
> The alternative is to do:

And break the ability for non broken setups to debug SMP boot up. Lets do
the job properly.

2001-12-09 01:15:15

by David Mosberger

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processorinitializationcheck)

>>>>> On Sun, 9 Dec 2001 01:15:25 +0000 (GMT), Alan Cox <[email protected]> said:

Alan> And break the ability for non broken setups to debug SMP boot
Alan> up. Lets do the job properly.

Then use Andrew's patch (attached below).

--david

--- linux-2.4.17-pre4/include/asm-ia64/system.h Thu Nov 22 23:02:59 2001
+++ linux-akpm/include/asm-ia64/system.h Wed Dec 5 23:09:15 2001
@@ -405,6 +405,10 @@ extern void ia64_load_extra (struct task
ia64_psr(ia64_task_regs(prev))->dfh = 1; \
__switch_to(prev,next,last); \
} while (0)
+
+/* Return true if this CPU can call the console drivers in printk() */
+#define arch_consoles_callable() (cpu_online_map & (1UL << smp_processor_id()))
+
#else
# define switch_to(prev,next,last) do { \
ia64_psr(ia64_task_regs(next))->dfh = (ia64_get_fpu_owner() != (next)); \
--- linux-2.4.17-pre4/kernel/printk.c Thu Nov 22 23:02:59 2001
+++ linux-akpm/kernel/printk.c Wed Dec 5 23:03:54 2001
@@ -38,6 +38,10 @@

#define LOG_BUF_MASK (LOG_BUF_LEN-1)

+#ifndef arch_consoles_callable
+#define arch_consoles_callable() (1)
+#endif
+
/* printk's without a loglevel use this.. */
#define DEFAULT_MESSAGE_LOGLEVEL 4 /* KERN_WARNING */

@@ -438,6 +442,14 @@ asmlinkage int printk(const char *fmt, .
log_level_unknown = 1;
}

+ if (!arch_consoles_callable()) {
+ /*
+ * On some architectures, the consoles are not usable
+ * on secondary CPUs early in the boot process.
+ */
+ spin_unlock_irqrestore(&logbuf_lock, flags);
+ goto out;
+ }
if (!down_trylock(&console_sem)) {
/*
* We own the drivers. We can drop the spinlock and let
@@ -454,6 +466,7 @@ asmlinkage int printk(const char *fmt, .
*/
spin_unlock_irqrestore(&logbuf_lock, flags);
}
+out:
return printed_len;
}
EXPORT_SYMBOL(printk);

2001-12-09 01:24:25

by Alan

[permalink] [raw]
Subject: Re: [PATCH] 2.4.16 kernel/printk.c (per processorinitializationcheck)

> Alan> And break the ability for non broken setups to debug SMP boot
> Alan> up. Lets do the job properly.
>
> Then use Andrew's patch (attached below).

No objection.