2002-03-15 13:16:56

by Russell King

[permalink] [raw]
Subject: [PATCH] 2.4 and 2.5: remove Alt-Sysrq-L

Linus, Marcelo, Dave,

The following patch removes Alt-Sysrq-L and its associated hack to kill
of PID1, the init process. This is a mis-feature.

If PID1 is killed, the kernel immediately enters an infinite loop in the
depths of do_exit() with interrupts disabled, completely locking the
machine. Obviously you can only reach for the reset button or power
switch after this, leaving you with dirty filesystems.

This patch has appeared on LKML a couple of months ago.

--- orig/drivers/char/sysrq.c Fri Mar 15 10:13:07 2002
+++ linux/drivers/char/sysrq.c Mon Mar 11 11:44:18 2002
@@ -284,24 +284,20 @@

/* signal sysrq helper function
* Sends a signal to all user processes */
-static void send_sig_all(int sig, int even_init)
+static void send_sig_all(int sig)
{
struct task_struct *p;

for_each_task(p) {
- if (p->mm) { /* Not swapper nor kernel thread */
- if (p->pid == 1 && even_init)
- /* Ugly hack to kill init */
- p->pid = 0x8000;
- if (p->pid != 1)
- force_sig(sig, p);
- }
+ if (p->mm && p->pid != 1)
+ /* Not swapper, init nor kernel thread */
+ force_sig(sig, p);
}
}

static void sysrq_handle_term(int key, struct pt_regs *pt_regs,
struct kbd_struct *kbd, struct tty_struct *tty) {
- send_sig_all(SIGTERM, 0);
+ send_sig_all(SIGTERM);
console_loglevel = 8;
}
static struct sysrq_key_op sysrq_term_op = {
@@ -312,7 +308,7 @@

static void sysrq_handle_kill(int key, struct pt_regs *pt_regs,
struct kbd_struct *kbd, struct tty_struct *tty) {
- send_sig_all(SIGKILL, 0);
+ send_sig_all(SIGKILL);
console_loglevel = 8;
}
static struct sysrq_key_op sysrq_kill_op = {
@@ -321,17 +317,6 @@
action_msg: "Kill All Tasks",
};

-static void sysrq_handle_killall(int key, struct pt_regs *pt_regs,
- struct kbd_struct *kbd, struct tty_struct *tty) {
- send_sig_all(SIGKILL, 1);
- console_loglevel = 8;
-}
-static struct sysrq_key_op sysrq_killall_op = {
- handler: sysrq_handle_killall,
- help_msg: "killalL",
- action_msg: "Kill All Tasks (even init)",
-};
-
/* END SIGNAL SYSRQ HANDLERS BLOCK */


@@ -366,7 +351,7 @@
#else
/* k */ NULL,
#endif
-/* l */ &sysrq_killall_op,
+/* l */ NULL,
/* m */ &sysrq_showmem_op,
/* n */ NULL,
/* o */ NULL, /* This will often be registered

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html


2002-03-15 14:11:31

by David Woodhouse

[permalink] [raw]
Subject: Re: [PATCH] 2.4 and 2.5: remove Alt-Sysrq-L


[email protected] said:
> The following patch removes Alt-Sysrq-L and its associated hack to
> kill of PID1, the init process. This is a mis-feature.

This is not a mis-feature.

> If PID1 is killed, the kernel immediately enters an infinite loop in
> the depths of do_exit() with interrupts disabled, completely locking
> the machine. Obviously you can only reach for the reset button or
> power switch after this, leaving you with dirty filesystems.

This is a mis-feature. Leaving you without even the facility to use SysRq
any further is just insane.

--
dwmw2


2002-03-15 14:30:12

by Russell King

[permalink] [raw]
Subject: Re: [PATCH] 2.4 and 2.5: remove Alt-Sysrq-L

On Fri, Mar 15, 2002 at 02:11:04PM +0000, David Woodhouse wrote:
> [email protected] said:
> > The following patch removes Alt-Sysrq-L and its associated hack to
> > kill of PID1, the init process. This is a mis-feature.
>
> This is not a mis-feature.
>
> > If PID1 is killed, the kernel immediately enters an infinite loop in
> > the depths of do_exit() with interrupts disabled, completely locking
> > the machine. Obviously you can only reach for the reset button or
> > power switch after this, leaving you with dirty filesystems.
>
> This is a mis-feature. Leaving you without even the facility to use SysRq
> any further is just insane.

Well, I've tried this approach, Linus rejected it.

If you'd like to take up this problem, be my guest.

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2002-03-15 14:33:02

by David Woodhouse

[permalink] [raw]
Subject: Re: [PATCH] 2.4 and 2.5: remove Alt-Sysrq-L


[email protected] said:
> Well, I've tried this approach, Linus rejected it.
> If you'd like to take up this problem, be my guest.

Not really - I also tried already. But I'm disinclined to offer band-aids
for the brokenness.

--
dwmw2


2002-03-15 14:43:02

by Russell King

[permalink] [raw]
Subject: Re: [PATCH] 2.4 and 2.5: remove Alt-Sysrq-L

On Fri, Mar 15, 2002 at 02:32:39PM +0000, David Woodhouse wrote:
> [email protected] said:
> > Well, I've tried this approach, Linus rejected it.
> > If you'd like to take up this problem, be my guest.
>
> Not really - I also tried already. But I'm disinclined to offer band-aids
> for the brokenness.

I don't know of any Linux kernel that has ever been able to cope with PID1
dying. I certainly remember facing the PID1 dying causing lockup as far
back as 1.3 kernels, and I even tried to fix it back then. The argument
put forward for not fixing it is that PID1 should not exit. Period.

The point here is not that the kernel itself can't cope with PID1 exiting,
but that the code _bypasses_ the protection put into the kernel against
PID1 exiting.

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2002-03-15 14:47:12

by David Woodhouse

[permalink] [raw]
Subject: Re: [PATCH] 2.4 and 2.5: remove Alt-Sysrq-L


[email protected] said:
> I don't know of any Linux kernel that has ever been able to cope with
> PID1 dying. I certainly remember facing the PID1 dying causing lockup
> as far back as 1.3 kernels, and I even tried to fix it back then. The
> argument put forward for not fixing it is that PID1 should not exit.
> Period.

I distinctly recall it working perfectly OK in around 2.1.50. I had boxen
where /sbin/init was a shell script which would bring up the interfaces,
enable routing, and exit.

--
dwmw2


2002-03-15 14:54:52

by Nicholas Berry

[permalink] [raw]
Subject: Re: [PATCH] 2.4 and 2.5: remove Alt-Sysrq-L

> I distinctly recall it working perfectly OK in around 2.1.50. I had boxen
> where /sbin/init was a shell script which would bring up the interfaces,
> enable routing, and exit.

That's a different thing, I think.

That is, 'init exiting' versus 'all the code to prevent init being killed is bypassed and init is killed'

2002-03-15 15:03:02

by Russell King

[permalink] [raw]
Subject: Re: [PATCH] 2.4 and 2.5: remove Alt-Sysrq-L

On Fri, Mar 15, 2002 at 09:54:05AM -0500, Nicholas Berry wrote:
> > I distinctly recall it working perfectly OK in around 2.1.50. I had boxen
> > where /sbin/init was a shell script which would bring up the interfaces,
> > enable routing, and exit.
>
> That's a different thing, I think.
>
> That is, 'init exiting' versus 'all the code to prevent init being killed
> is bypassed and init is killed'

Very true.

With all recent kernels, init exiting causes the last of these to trigger:

NORET_TYPE void do_exit(long code)
{
struct task_struct *tsk = current;

if (in_interrupt())
panic("Aiee, killing interrupt handler!");
if (!tsk->pid)
panic("Attempted to kill the idle task!");
if (tsk->pid == 1)
panic("Attempted to kill init!");

It is this very test that Alt-SysRQ-L is attempting to bypass which causes
the problem.

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2002-03-19 08:00:46

by Kasper Dupont

[permalink] [raw]
Subject: Re: [PATCH] 2.4 and 2.5: remove Alt-Sysrq-L

Russell King wrote:
>
> With all recent kernels, init exiting causes the last of these to trigger:
>
> NORET_TYPE void do_exit(long code)
> {
> struct task_struct *tsk = current;
>
> if (in_interrupt())
> panic("Aiee, killing interrupt handler!");
> if (!tsk->pid)
> panic("Attempted to kill the idle task!");
> if (tsk->pid == 1)
> panic("Attempted to kill init!");

Why actually panic because of an attempt to kill init?

Of course a message should be printed, but after that
couldn't do_exit enter a loop where it just handles
signals and zombies?

--
Kasper Dupont -- der bruger for meget tid p? usenet.
For sending spam use mailto:[email protected]

2002-03-19 16:29:18

by Russell King

[permalink] [raw]
Subject: Re: [PATCH] 2.4 and 2.5: remove Alt-Sysrq-L

On Tue, Mar 19, 2002 at 09:00:21AM +0100, Kasper Dupont wrote:
> Russell King wrote:
> >
> > With all recent kernels, init exiting causes the last of these to trigger:
> >
> > NORET_TYPE void do_exit(long code)
> > {
> > struct task_struct *tsk = current;
> >
> > if (in_interrupt())
> > panic("Aiee, killing interrupt handler!");
> > if (!tsk->pid)
> > panic("Attempted to kill the idle task!");
> > if (tsk->pid == 1)
> > panic("Attempted to kill init!");
>
> Why actually panic because of an attempt to kill init?
>
> Of course a message should be printed, but after that
> couldn't do_exit enter a loop where it just handles
> signals and zombies?

Examine the LKML archive around 23rd December 2001, where Alan Cox wrote:

| pid1 ends up trying to kill pid1 and it goes deeply down the toilet from
| that point onwards. The Unix traditional world reboots when pid 1 dies.

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2002-03-19 21:30:42

by Kasper Dupont

[permalink] [raw]
Subject: Re: [PATCH] 2.4 and 2.5: remove Alt-Sysrq-L

--- exit.c~ Mon Feb 25 20:38:13 2002
+++ exit.c Tue Mar 19 21:47:58 2002
@@ -429,6 +429,38 @@
write_unlock_irq(&tasklist_lock);
}

+#define __KERNEL_SYSCALLS__
+#include <linux/unistd.h>
+NORET_TYPE void flush_child_loop(struct task_struct *curtask)
+{
+ struct k_sigaction sa;
+ daemonize();
+
+ spin_lock_irq(&curtask->sigmask_lock);
+ siginitsetinv(&curtask->blocked, sigmask(SIGCHLD));
+ recalc_sigpending(curtask);
+ spin_unlock_irq(&curtask->sigmask_lock);
+
+ /* Install a handler so SIGCLD is delivered */
+ sa.sa.sa_handler = SIG_IGN;
+ sa.sa.sa_flags = 0;
+ siginitset(&sa.sa.sa_mask, sigmask(SIGCHLD));
+ do_sigaction(SIGCHLD, &sa, (struct k_sigaction *)0);
+
+ for (;;) {
+ set_task_state(curtask, TASK_INTERRUPTIBLE);
+ schedule();
+ if (signal_pending(curtask)) {
+ while (waitpid(-1, (unsigned int *)0, __WALL|WNOHANG) > 0)
+ ;
+ spin_lock_irq(&curtask->sigmask_lock);
+ flush_signals(curtask);
+ recalc_sigpending(curtask);
+ spin_unlock_irq(&curtask->sigmask_lock);
+ }
+ }
+}
+
NORET_TYPE void do_exit(long code)
{
struct task_struct *tsk = current;
@@ -437,8 +469,10 @@
panic("Aiee, killing interrupt handler!");
if (!tsk->pid)
panic("Attempted to kill the idle task!");
- if (tsk->pid == 1)
- panic("Attempted to kill init!");
+ if (tsk->pid == 1) {
+ printk(KERN_EMERG "Attempted to kill init!\n");
+ flush_child_loop(tsk);
+ }
tsk->flags |= PF_EXITING;
del_timer_sync(&tsk->real_timer);


Attachments:
killinit.patch (1.42 kB)

2002-03-20 00:07:56

by Alan

[permalink] [raw]
Subject: Re: [PATCH] 2.4 and 2.5: remove Alt-Sysrq-L

> - panic("Attempted to kill init!");
> + if (tsk->pid == 1) {
> + printk(KERN_EMERG "Attempted to kill init!\n");
> + flush_child_loop(tsk);
> + }

This can occur in IRQ path - your code won't handle that. Otherwise it
seems to have potential