2005-09-21 13:11:19

by Brice Goglin

Subject: Kernel panic during SysRq-b on Alpha

Hi,

I get the following panic each time I try SysRq-b on my
Alpha EV56. This is a home-compiled 2.6.13.1 kernel.
I don't see anything relevant in the 2.6.13.2 changelog.

This one was actually caught through a serial line,
but I see the same behavior when sysrq-b is passed
through the main console.

Regards,
Brice



SysRq : Resetting
Kernel bug at kernel/printk.c:683
swapper(0): Kernel Bug 1
pc = [<fffffc000032706c>] ra = [<fffffc00004352d4>] ps = 0007 Not tainted
pc is at acquire_console_sem+0x2c/0x90
ra is at take_over_console+0x74/0x500
v0 = 0000000000000007 t0 = 000000000fffff00 t1 = 0000000000010000
t2 = fffffc0000002240 t3 = 0000000000000000 t4 = 000000000000000d
t5 = 000000000000000e t6 = ffffffffffffe051 t7 = fffffc000059c000
a0 = fffffc00005022c0 a1 = 0000000000000000 a2 = 000000000000003e
a3 = 0000000000000001 a4 = 0000000000000001 a5 = 0000000000000005
t8 = 000000000000001f t9 = fffffc00004038d0 t10= 0000000000000000
t11= 000000000000000a pv = fffffc0000327040 at = 0000000000000000
gp = fffffc0000648e00 sp = fffffc000059fbc0
Trace:
[<fffffc00004352d4>] take_over_console+0x74/0x500
[<fffffc0000312de8>] common_shutdown_1+0x78/0x130
[<fffffc0000312ec0>] common_shutdown+0x20/0x30
[<fffffc00004372f4>] __handle_sysrq+0xd4/0x200
[<fffffc0000441398>] receive_chars+0x1e8/0x330
[<fffffc000044197c>] serial8250_interrupt+0x12c/0x130
[<fffffc0000315dfc>] handle_IRQ_event+0x6c/0xf0
[<fffffc00003167a0>] handle_irq+0xd0/0x180
[<fffffc000031f530>] miata_srm_device_interrupt+0x30/0x50
[<fffffc0000316d64>] do_entInt+0xf4/0x140
[<fffffc0000311140>] ret_from_sys_call+0x0/0x10
[<fffffc0000312ce0>] default_idle+0x0/0x10
[<fffffc0000312d48>] cpu_idle+0x58/0x80
[<fffffc0000312ce0>] default_idle+0x0/0x10
[<fffffc0000312ce0>] default_idle+0x0/0x10
[<fffffc00003100a4>] rest_init+0x44/0x60
[<fffffc000031001c>] __start+0x1c/0x20

Code: 2021ff00 b53e0008 a0480064 44410002 e4400004 00000081 <000002ab> 0050944f
Kernel panic - not syncing: Aiee, killing interrupt handler!


2005-09-22 06:13:10

by Ivan Kokshaysky

Subject: Re: Kernel panic during SysRq-b on Alpha

On Wed, Sep 21, 2005 at 03:11:07PM +0200, Brice Goglin wrote:
> Kernel bug at kernel/printk.c:683
> swapper(0): Kernel Bug 1
> pc = [<fffffc000032706c>] ra = [<fffffc00004352d4>] ps = 0007 Not tainted
> pc is at acquire_console_sem+0x2c/0x90

Indeed, acquire_console_sem() does BUG() in interrupt context now,
as in the case of SysRq-b.

Ivan.

--- linux/arch/alpha/kernel/process.c.orig Mon Aug 29 03:41:01 2005
+++ linux/arch/alpha/kernel/process.c Thu Sep 22 09:51:26 2005
@@ -127,6 +127,10 @@ common_shutdown_1(void *generic_ptr)
 	/* If booted from SRM, reset some of the original environment. */
 	if (alpha_using_srm) {
 #ifdef CONFIG_DUMMY_CONSOLE
+		/* If we've gotten here after SysRq-b, leave interrupt
+		   context before taking over the console. */
+		if (in_interrupt())
+			irq_exit();
 		/* This has the effect of resetting the VGA video origin. */
 		take_over_console(&dummy_con, 0, MAX_NR_CONSOLES-1, 1);
 #endif
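
The effect of the patch above can be modeled outside the kernel. What follows is only a user-space sketch under stated assumptions, not kernel code: `model_irq_count`, `model_irq_enter()`, `model_irq_exit()`, `model_in_interrupt()`, `model_acquire_console_sem()` and `model_shutdown()` are hypothetical stand-ins, a single counter replaces the kernel's real interrupt accounting (softirqs and nesting are ignored), and the BUG() is replaced by an error return so the example can run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's hardirq accounting. */
static unsigned int model_irq_count;

static void model_irq_enter(void)
{
	model_irq_count++;
}

static void model_irq_exit(void)
{
	/* Leaving an interrupt that was never entered is a caller bug. */
	assert(model_irq_count > 0);
	model_irq_count--;
}

static bool model_in_interrupt(void)
{
	return model_irq_count != 0;
}

/*
 * Models the check in acquire_console_sem(): the real function calls
 * BUG() in interrupt context, the model returns -1 instead.
 */
static int model_acquire_console_sem(void)
{
	if (model_in_interrupt())
		return -1;
	return 0;
}

/* Models the shutdown path with and without the proposed change. */
static int model_shutdown(bool apply_patch)
{
	if (apply_patch && model_in_interrupt())
		model_irq_exit();
	return model_acquire_console_sem();
}
```

Calling `model_irq_enter()` first and then `model_shutdown(false)` reproduces the failure (the model returns -1), while `model_shutdown(true)` leaves the modeled interrupt context first and succeeds.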

2005-09-22 06:43:17

by Andrew Morton

Subject: Re: Kernel panic during SysRq-b on Alpha

Ivan Kokshaysky wrote:
>
> On Wed, Sep 21, 2005 at 03:11:07PM +0200, Brice Goglin wrote:
> > Kernel bug at kernel/printk.c:683
> > swapper(0): Kernel Bug 1
> > pc = [<fffffc000032706c>] ra = [<fffffc00004352d4>] ps = 0007 Not tainted
> > pc is at acquire_console_sem+0x2c/0x90
>
> Indeed, acquire_console_sem() does BUG() in interrupt context now,
> as in the case of SysRq-b.
>
> Ivan.
>
> --- linux/arch/alpha/kernel/process.c.orig Mon Aug 29 03:41:01 2005
> +++ linux/arch/alpha/kernel/process.c Thu Sep 22 09:51:26 2005
> @@ -127,6 +127,10 @@ common_shutdown_1(void *generic_ptr)
>  	/* If booted from SRM, reset some of the original environment. */
>  	if (alpha_using_srm) {
>  #ifdef CONFIG_DUMMY_CONSOLE
> +		/* If we've gotten here after SysRq-b, leave interrupt
> +		   context before taking over the console. */
> +		if (in_interrupt())
> +			irq_exit();
>  		/* This has the effect of resetting the VGA video origin. */
>  		take_over_console(&dummy_con, 0, MAX_NR_CONSOLES-1, 1);

Wow, never seen that done before. Does it actually work? For keyboard,
serial console and /proc/sysrq-trigger?

2005-09-22 09:05:06

by Ivan Kokshaysky

Subject: Re: Kernel panic during SysRq-b on Alpha

On Wed, Sep 21, 2005 at 11:42:32PM -0700, Andrew Morton wrote:
> Wow, never seen that done before. Does it actually work? For keyboard,
> serial console and /proc/sysrq-trigger?

Yes, all of this works for me.

There is another problem on Alpha with 2.6.14-rc kernels, much worse:
slab.c:index_of() works _only_ when it's really inlined, because of
the __builtin_constant_p() check. It happens to work on other archs
due to the "always_inline" alchemy in compiler.h, but on Alpha we undo
the "inline" redefinitions as they heavily break our internal stuff.
So slab.c blows up very early on boot (at least when compiled
with gcc3).

I'd be happy if it were possible to stop globally redefining the "inline"
keywords and just use __attribute__((always_inline)) where needed.
If not, I don't know how to fix that cleanly.

Richard?

Ivan.
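
To illustrate the dependency on inlining described above, here is a small user-space sketch, assuming GCC or a compatible compiler. `SKETCH_ALWAYS_INLINE`, `size_is_constant()` and `constant_p_demo()` are hypothetical names made up for this example; nothing here is taken from slab.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical local macro; the kernel has its own __always_inline. */
#define SKETCH_ALWAYS_INLINE inline __attribute__((always_inline))

/*
 * Reports whether the compiler treats `size` as a compile-time constant.
 * For a function parameter this can only hold once the call has been
 * inlined and optimized; in an out-of-line copy the compiler generally
 * cannot prove it, so the result depends on the build options.
 */
static SKETCH_ALWAYS_INLINE int size_is_constant(const size_t size)
{
	return __builtin_constant_p(size);
}

/* Two cases whose answer does not depend on inlining at all. */
static void constant_p_demo(void)
{
	volatile size_t runtime_size = 64;

	assert(__builtin_constant_p(64) == 1);           /* a literal */
	assert(__builtin_constant_p(runtime_size) == 0); /* a volatile read */
}
```

The result of `size_is_constant(64)` is deliberately left unasserted: it varies with optimization level and inlining, which is exactly the fragility that a function relying on the constant branch is exposed to.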

2005-09-22 09:22:38

by Andrew Morton

Subject: Re: Kernel panic during SysRq-b on Alpha

Ivan Kokshaysky wrote:
>
> On Wed, Sep 21, 2005 at 11:42:32PM -0700, Andrew Morton wrote:
> > Wow, never seen that done before. Does it actually work? For keyboard,
> > serial console and /proc/sysrq-trigger?
>
> Yes, all of this works for me.
>
> There is another problem on Alpha with 2.6.14-rc kernels, much worse:
> slab.c:index_of() works _only_ when it's really inlined, because of
> the __builtin_constant_p() check. It happens to work on other archs
> due to the "always_inline" alchemy in compiler.h, but on Alpha we undo
> the "inline" redefinitions as they heavily break our internal stuff.
> So slab.c blows up very early on boot (at least when compiled
> with gcc3).

hm, you might need to do some special-casing around that function.

> I'd be happy if it were possible to stop globally redefining the "inline"
> keywords and just use __attribute__((always_inline)) where needed.
> If not, I don't know how to fix that cleanly.

We did that because gcc 3.3 (iirc) was utterly buggered. I forget what it
was doing exactly - generating out-of-line copies in various compilation
units, using more stack space as a result. That workaround shrunk typical
x86 kernels by ~64k.

If recent gcc's have a -fdont-be-so-damn-stupid option we could use that.

2005-09-22 10:14:04

by Brice Goglin

Subject: Re: Kernel panic during SysRq-b on Alpha

On 22.09.2005 11:04, Ivan Kokshaysky wrote:
> On Wed, Sep 21, 2005 at 11:42:32PM -0700, Andrew Morton wrote:
>
>>Wow, never seen that done before. Does it actually work? For keyboard,
>>serial console and /proc/sysrq-trigger?
>
>
> Yes, all of this works for me.

Thanks a lot, works here too (only tried on serial console).

By the way, Ivan, do you have problems with gcc 4 on Alpha?
All kernels I tried between 2.6.11 and 2.6.13 with Debian gcc-4.0.1-2
have a strange bug that does not appear with gcc 3.3 and 3.4
(non-root ssh sessions are immediately closed).

Regards,
Brice

2005-09-22 10:34:46

by Ivan Kokshaysky

Subject: Re: Kernel panic during SysRq-b on Alpha

On Thu, Sep 22, 2005 at 02:21:52AM -0700, Andrew Morton wrote:
> hm, you might need to do some special-casing around that function.

Maybe something like this?
I think this also makes more obvious how index_of() was
supposed to work.

Ivan.

--- 2.6.14-rc1/include/asm-alpha/compiler.h.inline Mon Aug 29 03:41:01 2005
+++ linux/include/asm-alpha/compiler.h Thu Sep 22 13:49:53 2005
@@ -98,6 +98,9 @@
 #undef inline
 #undef __inline__
 #undef __inline
-
+#if __GNUC__ == 3 && __GNUC_MINOR__ >= 1 || __GNUC__ > 3
+#undef __always_inline
+#define __always_inline inline __attribute__((always_inline))
+#endif
 
 #endif /* __ALPHA_COMPILER_H */
--- 2.6.14-rc1/mm/slab.c Tue Sep 13 14:16:37 2005
+++ linux/mm/slab.c Thu Sep 22 13:58:18 2005
@@ -308,12 +308,12 @@ struct kmem_list3 __initdata initkmem_li
 #define SIZE_L3 (1 + MAX_NUMNODES)
 
 /*
- * This function may be completely optimized away if
+ * This function must be completely optimized away if
  * a constant is passed to it. Mostly the same as
  * what is in linux/slab.h except it returns an
  * index.
  */
-static inline int index_of(const size_t size)
+static __always_inline int index_of(const size_t size)
 {
 	if (__builtin_constant_p(size)) {
 		int i = 0;
@@ -329,7 +329,8 @@ static inline int index_of(const size_t
 			extern void __bad_size(void);
 			__bad_size();
 		}
-	}
+	} else
+		BUG();
 	return 0;
 }
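
For readers who want to see the shape of such a lookup in isolation, here is a rough user-space sketch of the idea, not the slab.c code. The size list in `SKETCH_SIZES` and the names `SKETCH_ALWAYS_INLINE`, `index_of_sketch()`, `index_of_runtime()` and `index_of_demo()` are all assumptions made for this example. Unlike the patched index_of(), which calls BUG() when the argument is not a compile-time constant, the sketch falls back to a runtime loop so that it gives the same answer at any optimization level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical local macro standing in for the kernel's __always_inline. */
#define SKETCH_ALWAYS_INLINE inline __attribute__((always_inline))

/* Hypothetical size classes, smallest first. */
#define SKETCH_SIZES(X) X(32) X(64) X(128) X(256)

/* Runtime fallback: index of the first class that fits, or -1. */
static int index_of_runtime(size_t size)
{
	static const size_t classes[] = {
#define AS_ENTRY(x) (x),
		SKETCH_SIZES(AS_ENTRY)
#undef AS_ENTRY
	};
	size_t i;

	for (i = 0; i < sizeof(classes) / sizeof(classes[0]); i++)
		if (size <= classes[i])
			return (int)i;
	return -1;
}

/*
 * With a constant argument and inlining, the if/else chain below can be
 * folded to a single constant. Otherwise the runtime loop is used.
 */
static SKETCH_ALWAYS_INLINE int index_of_sketch(const size_t size)
{
	if (__builtin_constant_p(size)) {
		int i = 0;
#define AS_TEST(x) if (size <= (x)) return i; else i++;
		SKETCH_SIZES(AS_TEST)
#undef AS_TEST
		return -1;	/* larger than every class */
	}
	return index_of_runtime(size);
}

/* Usage: both paths must agree, whichever one the compiler picks. */
static void index_of_demo(void)
{
	assert(index_of_sketch(64) == index_of_runtime(64));
	assert(index_of_sketch(257) == index_of_runtime(257));
}
```

The runtime fallback is a design choice of the sketch only; it trades the hard failure of the patch for an example that behaves identically whether or not the compiler folds the constant branch.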

2005-09-22 10:42:34

by Ivan Kokshaysky

Subject: Re: Kernel panic during SysRq-b on Alpha

On Thu, Sep 22, 2005 at 12:13:54PM +0200, Brice Goglin wrote:
> By the way, Ivan, do you have problems with gcc 4 on Alpha?

Not that I use gcc 4 every day, but I don't recall any problems.
I'll try that again.

Ivan.

2005-09-23 22:03:23

by Ivan Kokshaysky

Subject: Re: Kernel panic during SysRq-b on Alpha

On Thu, Sep 22, 2005 at 12:13:54PM +0200, Brice Goglin wrote:
> All kernels I tried between 2.6.11 and 2.6.13 with Debian gcc-4.0.1-2
> have a strange bug that does not appear with gcc 3.3 and 3.4
> (non-root ssh sessions are immediately closed).

Confirmed. :-(
This happens with the gcc 4.0.1 release and the 4.1-20050917 snapshot.
The sshd child process dies with the following errors:

Sep 24 01:11:14 den sshd[568]: fatal: mm_send_fd: sendmsg(3): Invalid argument
Sep 24 01:11:14 den sshd[568]: syslogin_perform_logout: logout() returned an error
Sep 24 01:11:14 den sshd[571]: fatal: mm_receive_fd: recvmsg: expected received 1 got 0

At least this gives some clue about where to look...

Ivan.
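
For orientation, the mm_send_fd/mm_receive_fd messages in the log above come from sshd passing a file descriptor between its processes with sendmsg() and recvmsg(). The following is a minimal sketch of that mechanism (an SCM_RIGHTS control message on a Unix-domain socket), assuming a POSIX system. It is not sshd's code, and `fd_ctl_buf`, `send_fd_sketch()` and `recv_fd_sketch()` are hypothetical names.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_ctl_buf {
	struct cmsghdr hdr;
	char buf[CMSG_SPACE(sizeof(int))];
};

/* Sends `fd` over the Unix-domain socket `sock`; 0 on success, -1 on error. */
static int send_fd_sketch(int sock, int fd)
{
	union fd_ctl_buf ctl;
	char byte = 0;	/* one byte of ordinary data carries the message */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	struct msghdr msg;
	struct cmsghdr *cmsg;

	assert(sock >= 0 && fd >= 0);
	memset(&msg, 0, sizeof(msg));
	memset(&ctl, 0, sizeof(ctl));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receives a descriptor from `sock`; returns it, or -1 on error. */
static int recv_fd_sketch(int sock)
{
	union fd_ctl_buf ctl;
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	assert(sock >= 0);
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	/* Exactly one data byte is expected; 0 means the peer went away. */
	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return fd;
}
```

A quick way to exercise the pair is a socketpair(AF_UNIX, SOCK_STREAM, 0, sv): send the write end of a pipe on sv[0], receive it on sv[1], and check that data written through the received descriptor arrives at the pipe's read end.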