2005-12-15 02:38:31

by Yasunori Goto

Subject: 2.6.15-rc5-mm2 can't boot on ia64 due to changing on_each_cpu().


Hello.

I ran into trouble with 2.6.15-rc5-mm2 on my ia64 box (Tiger4).
The kernel panics early in boot because it executes a strange
"break 0" instruction in smp_flush_tlb_all().

I investigated the cause and found that gcc emits the following
warnings:
"arch/ia64/kernel/smp.c:228 Warning: function called through a non-compatible type"
"arch/ia64/kernel/smp.c:228: note: if this code is reached,
the program will abort"
"arch/ia64/kernel/smp.c:251 Warning: function called through a non-compatible type"
"arch/ia64/kernel/smp.c:251: note: if this code is reached,
the program will abort"

Lines 228 and 251 are the calls to on_each_cpu(), and the last
instruction of this function was indeed "break 0".

void
smp_flush_tlb_all (void)
{
	on_each_cpu((void (*)(void *))local_flush_tlb_all, NULL, 1, 1);
}


void
smp_flush_tlb_mm (struct mm_struct *mm)
{
	:
	:
	 */
	on_each_cpu((void (*)(void *))local_finish_flush_tlb_mm, mm, 1, 1);
}

When I removed the following patch from 2.6.15-rc5-mm2, which changes
on_each_cpu() from a static inline function to a macro, the warnings
disappeared and the kernel booted.
So I guess that, with this change, gcc can no longer handle the
somewhat messy cast used to call local_flush_tlb_all().
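
A rough userspace sketch of why the macro form may behave differently, on the
assumption that gcc only inserts the abort when it can see a call made directly
through a cast of a known function. As an inline function, on_each_cpu()
received func as an ordinary parameter; as a macro, the argument text is pasted
straight into (func)(info). The names fake_smp_call_function, fake_on_each_cpu
and fake_flush_tlb_all are hypothetical stand-ins, not kernel APIs. Here the
callback already has the type void (*)(void *), so no cast is involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters standing in for real cross-CPU work. */
static int remote_calls;   /* simulated calls on the other CPUs */
static int local_calls;    /* calls made on the "local CPU" */

/* Stand-in for smp_call_function(); it only records that it was asked. */
static int fake_smp_call_function(void (*func)(void *info), void *info)
{
	(void)func;
	(void)info;
	remote_calls++;
	return 0;
}

/*
 * Macro form modelled on the patch. ({ ... }) is a GNU C statement
 * expression, so this needs gcc or a compatible compiler. Whatever is
 * passed as func is substituted textually into (func)(info).
 */
#define fake_on_each_cpu(func, info)				\
({								\
	int _ret = fake_smp_call_function(func, info);		\
	(func)(info);						\
	_ret;							\
})

/* Callback declared with the matching type, so callers need no cast. */
static void fake_flush_tlb_all(void *unused)
{
	(void)unused;
	local_calls++;
}

static int flush_everywhere(void)
{
	int before = local_calls;
	int ret = fake_on_each_cpu(fake_flush_tlb_all, NULL);

	assert(local_calls == before + 1);	/* the local call really ran */
	return ret;
}
```

With a cast argument such as (void (*)(void *))some_function, the expanded
macro would contain ((void (*)(void *))some_function)(NULL), which is a direct
call through an incompatible type.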

Thanks.

--------------------------------------------------------------------------

From: Benjamin LaHaise <[email protected]>

An inline function in smp.h introduces messy ordering requirements on
thread_info by way of using an inline function instead of macro. Convert
on_each_cpu to a macro in order to avoid a big include mess.

Signed-off-by: Andrew Morton <[email protected]>
---

include/linux/smp.h | 25 +++++++++++++------------
1 files changed, 13 insertions(+), 12 deletions(-)

diff -puN include/linux/smp.h~untangle-smph-vs-thread_info include/linux/smp.h
--- 25/include/linux/smp.h~untangle-smph-vs-thread_info Fri Dec 9 15:16:46 2005
+++ 25-akpm/include/linux/smp.h Fri Dec 9 15:16:46 2005
@@ -57,19 +57,20 @@ extern int smp_call_function (void (*fun
int retry, int wait);

/*
- * Call a function on all processors
+ * Call a function on all processors.
+ * This needs to be a macro to allow for arch specific dependances on
+ * sched.h in preempt_*().
*/
-static inline int on_each_cpu(void (*func) (void *info), void *info,
- int retry, int wait)
-{
- int ret = 0;
-
- preempt_disable();
- ret = smp_call_function(func, info, retry, wait);
- func(info);
- preempt_enable();
- return ret;
-}
+#define on_each_cpu(func, info, retry, wait) \
+({ \
+ int _ret = 0; \
+ \
+ preempt_disable(); \
+ _ret = smp_call_function(func, info, retry, wait); \
+ (func)(info); \
+ preempt_enable(); \
+ _ret; \
+})

#define MSG_ALL_BUT_SELF 0x8000 /* Assume <32768 CPU's */
#define MSG_ALL 0x8001
_

--
Yasunori Goto



2005-12-15 02:57:19

by Andrew Morton

Subject: Re: 2.6.15-rc5-mm2 can't boot on ia64 due to changing on_each_cpu().

Yasunori Goto <[email protected]> wrote:
>
> When I removed the following patch from 2.6.15-rc5-mm2, which changes
> on_each_cpu() from a static inline function to a macro, the warnings
> disappeared and the kernel booted.
> So I guess that, with this change, gcc can no longer handle the
> somewhat messy cast used to call local_flush_tlb_all().

Thanks. I'll drop it.

I built and booted that kernel on my Tiger. Odd. I suspect there's
something very non-aggressive about my .config - this sort of thing has
happened before.

2005-12-15 03:03:52

by Benjamin LaHaise

Subject: Re: 2.6.15-rc5-mm2 can't boot on ia64 due to changing on_each_cpu().

On Wed, Dec 14, 2005 at 06:56:58PM -0800, Andrew Morton wrote:
> Thanks. I'll drop it.

Please don't. Fix ia64's brain damage instead. Function pointers
should not be cast, period.

-ben
--
"You know, I've seen some crystals do some pretty trippy shit, man."
Don't Email: <[email protected]>.

2005-12-15 05:26:26

by Kenji Kaneshige

Subject: Re: 2.6.15-rc5-mm2 can't boot on ia64 due to changing on_each_cpu().

Benjamin LaHaise wrote:
> On Wed, Dec 14, 2005 at 06:56:58PM -0800, Andrew Morton wrote:
>
>>Thanks. I'll drop it.
>
>
> Please don't. Fix ia64's brain damage instead. Function pointers
> should not be cast, period.
>
> -ben

How about this?

Thanks,
Kenji Kaneshige


We need this patch on ia64 to convert on_each_cpu to a macro.

Signed-off-by: Kenji Kaneshige <[email protected]>

arch/ia64/kernel/smp.c | 4 ++--
arch/ia64/kernel/smpboot.c | 2 +-
arch/ia64/mm/tlb.c | 6 +++---
include/asm-ia64/mmu_context.h | 4 ++--
include/asm-ia64/tlbflush.h | 5 +++--
5 files changed, 11 insertions(+), 10 deletions(-)

Index: linux-2.6.15-rc5-mm2/arch/ia64/kernel/smpboot.c
===================================================================
--- linux-2.6.15-rc5-mm2.orig/arch/ia64/kernel/smpboot.c
+++ linux-2.6.15-rc5-mm2/arch/ia64/kernel/smpboot.c
@@ -652,7 +652,7 @@ int __cpu_disable(void)
remove_siblinginfo(cpu);
cpu_clear(cpu, cpu_online_map);
fixup_irqs();
- local_flush_tlb_all();
+ local_flush_tlb_all(NULL);
cpu_clear(cpu, cpu_callin_map);
return 0;
}
Index: linux-2.6.15-rc5-mm2/arch/ia64/mm/tlb.c
===================================================================
--- linux-2.6.15-rc5-mm2.orig/arch/ia64/mm/tlb.c
+++ linux-2.6.15-rc5-mm2/arch/ia64/mm/tlb.c
@@ -81,7 +81,7 @@ wrap_mmu_context (struct mm_struct *mm)
if (i != cpu)
per_cpu(ia64_need_tlb_flush, i) = 1;
put_cpu();
- local_flush_tlb_all();
+ local_flush_tlb_all(NULL);
}

void
@@ -111,7 +111,7 @@ ia64_global_tlb_purge (struct mm_struct
}

void
-local_flush_tlb_all (void)
+local_flush_tlb_all (void *dummy)
{
unsigned long i, j, flags, count0, count1, stride0, stride1, addr;

@@ -192,5 +192,5 @@ ia64_tlb_init (void)
local_cpu_data->ptce_stride[0] = ptce_info.stride[0];
local_cpu_data->ptce_stride[1] = ptce_info.stride[1];

- local_flush_tlb_all(); /* nuke left overs from bootstrapping... */
+ local_flush_tlb_all(NULL);/* nuke left overs from bootstrapping... */
}
Index: linux-2.6.15-rc5-mm2/arch/ia64/kernel/smp.c
===================================================================
--- linux-2.6.15-rc5-mm2.orig/arch/ia64/kernel/smp.c
+++ linux-2.6.15-rc5-mm2/arch/ia64/kernel/smp.c
@@ -225,7 +225,7 @@ smp_send_reschedule (int cpu)
void
smp_flush_tlb_all (void)
{
- on_each_cpu((void (*)(void *))local_flush_tlb_all, NULL, 1, 1);
+ on_each_cpu(local_flush_tlb_all, NULL, 1, 1);
}

void
@@ -248,7 +248,7 @@ smp_flush_tlb_mm (struct mm_struct *mm)
* anyhow, and once a CPU is interrupted, the cost of local_flush_tlb_all() is
* rather trivial.
*/
- on_each_cpu((void (*)(void *))local_finish_flush_tlb_mm, mm, 1, 1);
+ on_each_cpu(local_finish_flush_tlb_mm, mm, 1, 1);
}

/*
Index: linux-2.6.15-rc5-mm2/include/asm-ia64/tlbflush.h
===================================================================
--- linux-2.6.15-rc5-mm2.orig/include/asm-ia64/tlbflush.h
+++ linux-2.6.15-rc5-mm2/include/asm-ia64/tlbflush.h
@@ -23,7 +23,7 @@
* Flush everything (kernel mapping may also have changed due to
* vmalloc/vfree).
*/
-extern void local_flush_tlb_all (void);
+extern void local_flush_tlb_all (void *dummy);

#ifdef CONFIG_SMP
extern void smp_flush_tlb_all (void);
@@ -34,8 +34,9 @@ extern void local_flush_tlb_all (void);
#endif

static inline void
-local_finish_flush_tlb_mm (struct mm_struct *mm)
+local_finish_flush_tlb_mm (void *info)
{
+ struct mm_struct *mm = (struct mm_struct *)info;
if (mm == current->active_mm)
activate_context(mm);
}
Index: linux-2.6.15-rc5-mm2/include/asm-ia64/mmu_context.h
===================================================================
--- linux-2.6.15-rc5-mm2.orig/include/asm-ia64/mmu_context.h
+++ linux-2.6.15-rc5-mm2/include/asm-ia64/mmu_context.h
@@ -60,13 +60,13 @@ enter_lazy_tlb (struct mm_struct *mm, st
static inline void
delayed_tlb_flush (void)
{
- extern void local_flush_tlb_all (void);
+ extern void local_flush_tlb_all (void *);
unsigned long flags;

if (unlikely(__ia64_per_cpu_var(ia64_need_tlb_flush))) {
spin_lock_irqsave(&ia64_ctx.lock, flags);
if (__ia64_per_cpu_var(ia64_need_tlb_flush)) {
- local_flush_tlb_all();
+ local_flush_tlb_all(NULL);
__ia64_per_cpu_var(ia64_need_tlb_flush) = 0;
}
spin_unlock_irqrestore(&ia64_ctx.lock, flags);
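
The pattern this patch applies to local_finish_flush_tlb_mm() can be sketched
in plain userspace C: the callback takes a generic void *info and recovers the
typed pointer inside the function, so the caller never casts the function
pointer. Everything named demo_* below is hypothetical and exists only for
illustration; it is not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mm_struct. */
struct demo_mm {
	int id;
	int activations;
};

/* Hypothetical stand-in for current->active_mm. */
static struct demo_mm *demo_active_mm;

/* Callback with the generic type; the typed pointer is recovered inside. */
static void demo_finish_flush(void *info)
{
	struct demo_mm *mm = info;	/* void * converts implicitly in C */

	assert(mm != NULL);
	if (mm == demo_active_mm)
		mm->activations++;
}

/* Stand-in for a dispatcher that only knows about void (*)(void *). */
static void demo_call(void (*func)(void *info), void *info)
{
	func(info);
}

/* Runs the callback for mm and reports how often mm has been activated. */
static int demo_flush_mm(struct demo_mm *mm)
{
	demo_call(demo_finish_flush, mm);	/* no function-pointer cast */
	return mm->activations;
}
```

Only the data pointer is converted here, and that conversion to and from
void * is well defined in C, unlike calling a function through a pointer of an
incompatible type.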


2005-12-15 06:35:38

by Yasunori Goto

Subject: Re: 2.6.15-rc5-mm2 can't boot on ia64 due to changing on_each_cpu().

Thanks! It works!

BTW, I found the same kind of cast function pointers passed to
on_each_cpu() in the parisc code.
(arch/parisc/kernel/cache.c
arch/parisc/kernel/smp.c
arch/parisc/mm/init.c)

Should they be fixed as well?
I don't have a parisc box, so I don't know whether the same problem
occurs there, and I can't test it.

Bye.



--
Yasunori Goto



2005-12-15 14:31:23

by Benjamin LaHaise

Subject: Re: 2.6.15-rc5-mm2 can't boot on ia64 due to changing on_each_cpu().

On Thu, Dec 15, 2005 at 02:24:29PM +0900, Kenji Kaneshige wrote:
> How about this?

Excellent! Thanks Kenji. Tony, are you okay with this patch going in?

-ben

2005-12-15 15:41:04

by Matthew Wilcox

Subject: Re: 2.6.15-rc5-mm2 can't boot on ia64 due to changing on_each_cpu().

On Thu, Dec 15, 2005 at 03:34:27PM +0900, Yasunori Goto wrote:
> Thanks! It works!
>
> BTW, I found the same kind of cast function pointers passed to
> on_each_cpu() in the parisc code.
> (arch/parisc/kernel/cache.c
> arch/parisc/kernel/smp.c
> arch/parisc/mm/init.c)
>
> Should they be fixed as well?
> I don't have a parisc box, so I don't know whether the same problem
> occurs there, and I can't test it.

Yes, these will also need to be changed. Thanks.

2005-12-15 17:24:24

by Luck, Tony

Subject: RE: 2.6.15-rc5-mm2 can't boot on ia64 due to changing on_each_cpu().

> On Thu, Dec 15, 2005 at 02:24:29PM +0900, Kenji Kaneshige wrote:
> > How about this?
>
> Excellent! Thanks Kenji. Tony, are you okay with this patch going in?

It is a bit annoying to have to add an argument that is never
used to local_flush_tlb_all() just to get the compiler to generate
the right code when we want to use it with on_each_cpu(). But
I don't see a better way.
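
For comparison only, one conceivable alternative that leaves the original
signature alone is a thin adapter with the callback type. This is a sketch with
hypothetical demo_* names, not something proposed in this thread; it trades the
unused parameter for an extra function.

```c
#include <assert.h>
#include <stddef.h>

static int flush_count;

/* The existing function keeps its natural no-argument signature. */
static void demo_flush_tlb_all(void)
{
	flush_count++;
}

/* Hypothetical adapter whose type is exactly void (*)(void *). */
static void demo_flush_tlb_all_cb(void *unused)
{
	(void)unused;
	demo_flush_tlb_all();
}

/* Stand-in for a dispatcher that takes the generic callback type. */
static void demo_run_callback(void (*func)(void *info), void *info)
{
	assert(func != NULL);
	func(info);
}

/* Uses the adapter for the callback and the plain function directly. */
static int demo_flush_via_adapter(void)
{
	demo_run_callback(demo_flush_tlb_all_cb, NULL);	/* callback path */
	demo_flush_tlb_all();				/* direct callers unchanged */
	return flush_count;
}
```

Direct callers would keep calling the no-argument function, and only the
on_each_cpu()-style call site would name the adapter.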

Acked-by: Tony Luck <[email protected]>

-Tony