2002-09-08 22:46:19

by Anton Altaparmakov

Subject: PANIC caused by dequeue_signal() in current Linus BK tree

Hi,

The current Linus BK tree panics when starting init on my UP Athlon highmem
box (kernel compiled with SMP and preempt enabled). More info is available
on request:

ksymoops 2.4.5 on i686 2.4.19. Options used
-v vmlinux (specified)
-K (specified)
-L (specified)
-o /lib/modules/2.5.33/ (specified)
-m ./System.map (specified)

No modules in ksyms, skipping objects
Unable to handle kernel paging request at virtual address 5a5a5a5e
c01283a7
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0060:[<c01283a7>] Tainted: G S
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010006
eax: ffffffff ebx: 00000002 ecx: c1b74040 edx: 5a5a5a5a
esi: c1b78000 edi: c1b79f30 ebp: f7aeffd0 esp: c1b79ed0
ds: 0068 es: 0068 ss: 0068
Stack: 00000000 40400000 40017000 f7af9400 c04f00e0 c0131309 c04f00e0 f7af9400
c1b78000 c1b78000 00000000 c1b79f30 c01297e7 f7aeffd0 c1b74658 c1b79f30
c1b74658 c1b79fc4 c1b74658 c1b78000 c1b79f30 c01078d5 c1b79f30 c1b79fc4
Call Trace: [<c0131309>] [<c01297e7>] [<c01078d5>] [<c01354eb>] [<c0135377>]
[<c0135808>] [<c013539a>] [<c013584b>] [<c0129c30>] [<c0107b2a>]
Code: 39 5a 04 74 38 89 d5 8b 12 85 d2 75 f3 8b 54 24 34 8d 43 ff


>>EIP; c01283a7 <dequeue_signal+87/140> <=====

>>eax; ffffffff <END_OF_CODE+3faa41db/????>
>>ecx; c1b74040 <END_OF_CODE+161821c/????>
>>edx; 5a5a5a5a Before first symbol
>>esi; c1b78000 <END_OF_CODE+161c1dc/????>
>>edi; c1b79f30 <END_OF_CODE+161e10c/????>
>>ebp; f7aeffd0 <END_OF_CODE+375941ac/????>
>>esp; c1b79ed0 <END_OF_CODE+161e0ac/????>

Trace; c0131309 <zap_pmd_range+49/60>
Trace; c01297e7 <get_signal_to_deliver+97/370>
Trace; c01078d5 <do_signal+55/b0>
Trace; c01354eb <unmap_region+13b/170>
Trace; c0135377 <unmap_vma+87/90>
Trace; c0135808 <do_munmap+138/190>
Trace; c013539a <unmap_vma_list+1a/30>
Trace; c013584b <do_munmap+17b/190>
Trace; c0129c30 <sys_rt_sigprocmask+170/2d0>
Trace; c0107b2a <work_notifysig+13/15>

Code; c01283a7 <dequeue_signal+87/140>
00000000 <_EIP>:
Code; c01283a7 <dequeue_signal+87/140> <=====
0: 39 5a 04 cmp %ebx,0x4(%edx) <=====
Code; c01283aa <dequeue_signal+8a/140>
3: 74 38 je 3d <_EIP+0x3d> c01283e4 <dequeue_signal+c4/140>
Code; c01283ac <dequeue_signal+8c/140>
5: 89 d5 mov %edx,%ebp
Code; c01283ae <dequeue_signal+8e/140>
7: 8b 12 mov (%edx),%edx
Code; c01283b0 <dequeue_signal+90/140>
9: 85 d2 test %edx,%edx
Code; c01283b2 <dequeue_signal+92/140>
b: 75 f3 jne 0 <_EIP>
Code; c01283b4 <dequeue_signal+94/140>
d: 8b 54 24 34 mov 0x34(%esp,1),%edx
Code; c01283b8 <dequeue_signal+98/140>
11: 8d 43 ff lea 0xffffffff(%ebx),%eax

<0>Kernel panic: Attempted to kill init!

Best regards,

Anton


--
"I've not lost my mind. It's backed up on tape somewhere." - Unknown
--
Anton Altaparmakov <aia21 at cantab.net> (replace at with @)
Linux NTFS Maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/


2002-09-08 23:16:26

by Anton Altaparmakov

Subject: pinpointed: PANIC caused by dequeue_signal() in current Linus BK tree

Hi,

I had a look and the panic actually happens in collect_signal() in here:

static inline int collect_signal(int sig, struct sigpending *list,
                                 siginfo_t *info)
{
        if (sigismember(&list->signal, sig)) {
                /* Collect the siginfo appropriate to this signal. */
                struct sigqueue *q, **pp;
                pp = &list->head;
                while ((q = *pp) != NULL) {
  q becomes 0x5a5a5a5a ^^^^^^^^^
                        if (q->info.si_signo == sig)
0x5a5a5a5a is dereferenced  ^^^^^^^^^^^^^^^^
                                goto found_it;
                        pp = &q->next;
                }
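
For readers unfamiliar with the pointer-to-pointer idiom in the loop above, here is a minimal user-space sketch of the same walk. The names demo_entry, demo_queue and demo_collect are hypothetical, and the unlink step stands in for the found_it: branch, which is not shown in the excerpt, so treat that part as an assumption rather than the kernel's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct sigqueue / struct sigpending. */
struct demo_entry {
	struct demo_entry *next;
	int signo;
};

struct demo_queue {
	struct demo_entry *head;	/* first entry, NULL when empty */
	struct demo_entry **tail;	/* slot where the next entry is linked */
};

/* Unlink and return the first entry carrying signo, or NULL if none. */
static struct demo_entry *demo_collect(struct demo_queue *list, int signo)
{
	struct demo_entry *q, **pp;

	assert(list != NULL);
	pp = &list->head;
	while ((q = *pp) != NULL) {
		/* If head was never initialized, q is garbage right here. */
		if (q->signo == signo) {
			*pp = q->next;		/* splice q out of the list */
			if (*pp == NULL)
				list->tail = pp;	/* q was the last entry */
			q->next = NULL;
			return q;
		}
		pp = &q->next;
	}
	return NULL;
}
```

The walk only terminates cleanly if head, and every next pointer reachable from it, is either a valid entry or NULL. A queue whose head still holds whatever the allocator left there breaks that assumption on the first iteration.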

Hope this helps.

Best regards,

Anton



2002-09-09 00:12:36

by Anton Altaparmakov

Subject: Re: pinpointed: PANIC caused by dequeue_signal() in current Linus BK tree

On Andrew Morton's suggestion I tried with preempt disabled. That still
gives the same result.

I then also tried to compile the kernel for UP and it still gives the same
result.

Anton

At 00:21 09/09/02, Anton Altaparmakov wrote:
>Hi,
>
>I had a look and the panic actually happens in collect_signal() in here:
>
>static inline int collect_signal(int sig, struct sigpending *list,
>                                 siginfo_t *info)
>{
>        if (sigismember(&list->signal, sig)) {
>                /* Collect the siginfo appropriate to this signal. */
>                struct sigqueue *q, **pp;
>                pp = &list->head;
>                while ((q = *pp) != NULL) {
>  q becomes 0x5a5a5a5a ^^^^^^^^^
>                        if (q->info.si_signo == sig)
>0x5a5a5a5a is dereferenced  ^^^^^^^^^^^^^^^^
>                                goto found_it;
>                        pp = &q->next;
>                }
>
>Hope this helps.
>
>Best regards,
>
> Anton


2002-09-09 01:31:42

by Linus Torvalds

Subject: Re: pinpointed: PANIC caused by dequeue_signal() in current Linus BK tree


On Mon, 9 Sep 2002, Anton Altaparmakov wrote:

> Hi,
>
> I had a look and the panic actually happens in collect_signal() in here:
>
> static inline int collect_signal(int sig, struct sigpending *list,
>                                  siginfo_t *info)
> {
>         if (sigismember(&list->signal, sig)) {
>                 /* Collect the siginfo appropriate to this signal. */
>                 struct sigqueue *q, **pp;
>                 pp = &list->head;
>                 while ((q = *pp) != NULL) {
>   q becomes 0x5a5a5a5a ^^^^^^^^^
>                         if (q->info.si_signo == sig)
> 0x5a5a5a5a is dereferenced  ^^^^^^^^^^^^^^^^
>                                 goto found_it;
>                         pp = &q->next;
>                 }
>
> Hope this helps.

0x5a5a5a5a is the slab poisoning pattern. I bet somebody frees the thing,
and Ingo and I never noticed because we didn't have slab debugging
enabled.
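
The link between the poison pattern and the faulting address 5a5a5a5e in the oops can be checked with a small user-space model. This is only a sketch: model_sigqueue, poison and si_signo_address are hypothetical names, and the layout (next at offset 0, si_signo at offset 4 on 32-bit x86) is inferred from the faulting instruction "cmp %ebx,0x4(%edx)" rather than taken from the kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POISON_BYTE 0x5a	/* fill value visible in edx in the oops */

/* Assumed 32-bit layout; a model, not the kernel's struct sigqueue. */
struct model_sigqueue {
	uint32_t next;		/* 32-bit "pointer" to the next entry */
	int32_t  si_signo;	/* first field of the embedded siginfo */
};

/* Fill an object with the poison byte, as a debugging allocator might. */
static void poison(void *obj, size_t size)
{
	assert(obj != NULL);
	memset(obj, POISON_BYTE, size);
}

/* Address a 32-bit CPU would read when evaluating q->info.si_signo. */
static uint32_t si_signo_address(uint32_t q)
{
	return q + (uint32_t)offsetof(struct model_sigqueue, si_signo);
}
```

Reading the next field of a poisoned object yields 0x5a5a5a5a, and si_signo_address(0x5a5a5a5a) gives 0x5a5a5a5e, which matches the "Unable to handle kernel paging request" address above. Any memory that still holds the poison fill, whether freed or never initialized, would produce the same symptom.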

Ingo, mind looking at this a bit?

Linus

2002-09-09 01:43:22

by Ingo Molnar

Subject: Re: pinpointed: PANIC caused by dequeue_signal() in current Linus BK tree


On Sun, 8 Sep 2002, Linus Torvalds wrote:

> 0x5a5a5a5a is the slab poisoning pattern. I bet somebody frees the thing,
> and Ingo and I never noticed because we didn't have slab debugging
> enabled.
>
> Ingo, mind looking at this a bit?

yes, i'm on it. It could also be the missing initialization of the
shared-pending queue. Funny - i usually have CONFIG_DEBUG_SLAB enabled
all the time - but not for this patch :-|

Ingo

2002-09-09 01:49:49

by Ingo Molnar

Subject: Re: pinpointed: PANIC caused by dequeue_signal() in current Linus BK tree


indeed the problem is that the shared_pending queue is not initialized in
INIT_SIGNALS. Patch in a few minutes.

Ingo

2002-09-09 02:35:14

by Ingo Molnar

Subject: Re: pinpointed: PANIC caused by dequeue_signal() in current Linus BK tree


the attached patch fixes the bootup crash. There were two initialization
bugs:

- INIT_SIGNALS needs to set shared_pending.

- exec() needs to set up newsig properly.

the second one caused the crash Anton saw.

Ingo

--- linux/arch/i386/kernel/init_task.c.orig Mon Sep 9 04:04:01 2002
+++ linux/arch/i386/kernel/init_task.c Mon Sep 9 04:04:35 2002
@@ -10,7 +10,7 @@

 static struct fs_struct init_fs = INIT_FS;
 static struct files_struct init_files = INIT_FILES;
-static struct signal_struct init_signals = INIT_SIGNALS;
+static struct signal_struct init_signals = INIT_SIGNALS(init_signals);
 struct mm_struct init_mm = INIT_MM(init_mm);
 
 /*
--- linux/include/linux/init_task.h.orig Mon Sep 9 04:02:19 2002
+++ linux/include/linux/init_task.h Mon Sep 9 04:08:08 2002
@@ -29,10 +29,11 @@
 	.mmlist		= LIST_HEAD_INIT(name.mmlist),	\
 }
 
-#define INIT_SIGNALS {	\
+#define INIT_SIGNALS(sig) {	\
 	.count		= ATOMIC_INIT(1),	\
 	.action		= { {{0,}}, },	\
-	.siglock	= SPIN_LOCK_UNLOCKED	\
+	.siglock	= SPIN_LOCK_UNLOCKED,	\
+	.shared_pending	= { NULL, &sig.shared_pending.head, {{0}}},	\
 }
 
 /*
--- linux/fs/exec.c.orig Mon Sep 9 04:40:49 2002
+++ linux/fs/exec.c Mon Sep 9 04:41:28 2002
@@ -514,6 +514,8 @@
 	spin_lock_init(&newsig->siglock);
 	atomic_set(&newsig->count, 1);
 	memcpy(newsig->action, current->sig->action, sizeof(newsig->action));
+	init_sigpending(&newsig->shared_pending);
+
 	spin_lock_irq(&current->sigmask_lock);
 	current->sig = newsig;
 	spin_unlock_irq(&current->sigmask_lock);
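
To illustrate why both hunks matter, here is a simplified user-space model of an empty pending queue. The names model_sigqueue, model_sigpending, MODEL_SIGPENDING_INIT, model_init_sigpending, model_enqueue and model_count are hypothetical. The field order (head, tail, signal set) follows the initializer in the patch, and the runtime helper is an assumption about what init_sigpending() has to establish, not a copy of the kernel function.

```c
#include <assert.h>
#include <stddef.h>

struct model_sigqueue {
	struct model_sigqueue *next;
	int signo;
};

struct model_sigpending {
	struct model_sigqueue *head;	/* NULL when the queue is empty */
	struct model_sigqueue **tail;	/* slot where the next entry is linked */
	unsigned long signal;		/* stand-in for the pending sigset */
};

/* Static form: needs the variable's name so tail can point at its own head. */
#define MODEL_SIGPENDING_INIT(name) { NULL, &(name).head, 0UL }

/* Runtime form, for structures allocated later (the exec() case). */
static void model_init_sigpending(struct model_sigpending *p)
{
	assert(p != NULL);
	p->signal = 0UL;
	p->head = NULL;
	p->tail = &p->head;
}

/* Append an entry; only valid when tail points into the queue. */
static void model_enqueue(struct model_sigpending *p, struct model_sigqueue *q)
{
	assert(q->signo >= 0 && q->signo < 32);
	q->next = NULL;
	*p->tail = q;
	p->tail = &q->next;
	p->signal |= 1UL << q->signo;
}

/* Count queued entries; terminates only if the list ends in NULL. */
static int model_count(const struct model_sigpending *p)
{
	int n = 0;
	const struct model_sigqueue *q;

	for (q = p->head; q != NULL; q = q->next)
		n++;
	return n;
}
```

The macro takes the variable name for the same reason INIT_SIGNALS gained a (sig) argument in the patch: an empty queue's tail must hold the address of that queue's own head, which a name-less initializer cannot express. A freshly allocated structure gets the same invariant from the runtime helper. Without it, head keeps whatever bytes the allocator left there.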