LinuxLists.cc - 2.6.10-mm1 panic in sysfs ?

2005-01-05 18:04:17

by Badari Pulavarty

Subject: 2.6.10-mm1 panic in sysfs ?

Hi Andrew,

I get a panic in sysfs_readdir() while booting 2.6.10-mm1
kernel. Known fixes ?

Thanks,
Badari

Attachments:

sysfs-panic.out (13.18 kB)

2005-01-06 00:52:56

by Andrew Morton

Subject: Re: 2.6.10-mm1 panic in sysfs ?

Badari Pulavarty <[email protected]> wrote:
>
> Hi Andrew,
>
> I get a panic in sysfs_readdir() while booting 2.6.10-mm1
> kernel. Known fixes ?
>

It's news to me.

> Unable to handle kernel NULL pointer dereference at virtual address 00000020
> printing eip:
> c109c8ef
> *pde = 0191c001
> Oops: 0000 [#1]
> SMP
> Modules linked in:
> CPU: 2
> EIP: 0060:[<c109c8ef>] Not tainted VLI
> EFLAGS: 00010282 (2.6.10-mm1kexec)

What is "2.6.10-mm1kexec"?

> EIP is at sysfs_readdir+0xef/0x280
> eax: 00000000 ebx: c15e1160 ecx: 0000000c edx: 00000020
> esi: c15e1164 edi: c15dd72d ebp: c1a7df78 esp: c1a7df3c
> ds: 007b es: 007b ss: 0068
> Process getcfg (pid: 1927, threadinfo=c1a7c000 task=c2ba3040)

Try to work out what arguments are being passed to `getcfg', then run it by
hand, under strace, to see which sysfs file is being accessed when it oopses.
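
If running getcfg by hand is awkward, reading a suspect directory directly
may narrow things down the same way. The following is only a minimal sketch,
not something posted in this thread: list_dir() is a hypothetical helper, and
it assumes the oops can be triggered by any reader of the directory. On Linux,
readdir() is backed by the getdents family of system calls, which is the path
shown in the call trace (sys_getdents -> vfs_readdir -> sysfs_readdir).

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: read every entry of one directory and print each
 * name as it arrives. Returns the number of entries read, or -1 if the
 * directory cannot be opened or a read fails.
 */
static int list_dir(const char *path)
{
    assert(path != NULL);

    DIR *dir = opendir(path);
    if (dir == NULL) {
        fprintf(stderr, "opendir(%s): %s\n", path, strerror(errno));
        return -1;
    }

    int count = 0;
    for (;;) {
        /* readdir() returns NULL both at end of directory and on error;
         * only errno tells them apart, so clear it before every call. */
        errno = 0;
        struct dirent *entry = readdir(dir);
        if (entry == NULL)
            break;

        printf("%s/%s\n", path, entry->d_name);
        /* Flush so names already printed are not lost if the process
         * is killed during a later read. */
        fflush(stdout);
        count++;
    }

    int read_failed = (errno != 0);
    closedir(dir);
    return read_failed ? -1 : count;
}
```

Calling list_dir() on one sysfs directory at a time shows which directory the
read dies in. Since the boot log stops at "scanning pci", something like
list_dir("/sys/bus/pci/devices") might be a place to start, but that path is a
guess and not something the oops confirms.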

> Stack: 00000001 00000000 00000017 00000004 c156d62c 0000000c c15dd720 c21bf324
> c156d620 c1071f80 c1a7dfa0 c1c837e0 c131caa0 c1c837e0 c1587428 c1a7df94
> c1071e48 c1a7dfa0 c1071f80 c1a7c000 0804f944 fffffff7 c1a7dfbc c10720aa
> Call Trace:
> [<c1004dc6>] show_stack+0xa6/0xb0
> [<c1004f42>] show_registers+0x152/0x1c0
> [<c100514d>] die+0xed/0x180
> [<c1018b6d>] do_page_fault+0x45d/0x6e9
> [<c1004a2b>] error_code+0x2b/0x30
> [<c1071e48>] vfs_readdir+0x98/0xb0
> [<c10720aa>] sys_getdents+0x6a/0xd0
> [<c1003f31>] sysenter_past_esp+0x52/0x75
> Code: eb 89 d8 e8 c4 ea ff ff 89 45 dc b9 ff ff ff ff 31 c0 8b 7d dc f2 ae f7 d1 49 89 4d d8 8b 43 20 85 c0 0f 84 37 01 00 00 8b 40 0c <8b> 50 20 0f b7 43 1c 89 54 24 08 c1 e8 0c 89 44 24 0c 8b 4d f0

2005-01-06 01:06:18

by Badari Pulavarty

Subject: Re: 2.6.10-mm1 panic in sysfs ?

On Wed, 2005-01-05 at 16:52, Andrew Morton wrote:
> Badari Pulavarty <[email protected]> wrote:
> >
> > Hi Andrew,
> >
> > I get a panic in sysfs_readdir() while booting 2.6.10-mm1
> > kernel. Known fixes ?
> >
>
> It's news to me.
>
> > Unable to handle kernel NULL pointer dereference at virtual address 00000020
> > printing eip:
> > c109c8ef
> > *pde = 0191c001
> > Oops: 0000 [#1]
> > SMP
> > Modules linked in:
> > CPU: 2
> > EIP: 0060:[<c109c8ef>] Not tainted VLI
> > EFLAGS: 00010282 (2.6.10-mm1kexec)
>
> What is "2.6.10-mm1kexec"?

I enabled kexec in the kernel, so I added "kexec" to EXTRAVERSION.

>
> > EIP is at sysfs_readdir+0xef/0x280
> > eax: 00000000 ebx: c15e1160 ecx: 0000000c edx: 00000020
> > esi: c15e1164 edi: c15dd72d ebp: c1a7df78 esp: c1a7df3c
> > ds: 007b es: 007b ss: 0068
> > Process getcfg (pid: 1927, threadinfo=c1a7c000 task=c2ba3040)
>
> Try to work out what arguments are being passed to `getcfg', then run it by
> hand, under strace, to see which sysfs file is being accessed when it oopses.

Sure, will do and let you know.

Thanks,
Badari

2005-01-06 11:11:05

by Maneesh Soni

Subject: Re: 2.6.10-mm1 panic in sysfs ?

On Wed, Jan 05, 2005 at 09:36:42AM -0800, Badari Pulavarty wrote:
> Hi Andrew,
>
> I get a panic in sysfs_readdir() while booting 2.6.10-mm1
> kernel. Known fixes ?
>
> Thanks,
> Badari
>
>

[....]
> Creating /var/log/boot.msg done
> showconsole: Warning: the ioctl TIOCGDEV is not known by the kernel
> System Boot Control: The system has been set up
> Skipped features: boot.cycle boot.sched
> System Boot Control: Running /etc/init.d/boot.local done
> INIT: Entering runlevel: 1
> Boot logging started on /dev/ttyS0(/dev/console) at Wed Jan 5 00:33:53 2005
> Master Resource Control: previous runlevel: N, switching to runlevel:1
> Hotplug is already active (disable with NOHOTPLUG=1 at the boot prodone
> coldplug scanning input: *** done
> scanning pci: ****.W*.*..*Unable to handle kernel NULL pointer dereference at virtual address 00000020
> printing eip:
> c109c8ef
> *pde = 0191c001
> Oops: 0000 [#1]
> SMP
> Modules linked in:
> CPU: 2
> EIP: 0060:[<c109c8ef>] Not tainted VLI
> EFLAGS: 00010282 (2.6.10-mm1kexec)
> EIP is at sysfs_readdir+0xef/0x280
> eax: 00000000 ebx: c15e1160 ecx: 0000000c edx: 00000020
> esi: c15e1164 edi: c15dd72d ebp: c1a7df78 esp: c1a7df3c
> ds: 007b es: 007b ss: 0068
> Process getcfg (pid: 1927, threadinfo=c1a7c000 task=c2ba3040)

I think it crashed because dentry->d_inode is NULL, which is surprising. Getting
some information on the file being processed will certainly help.

--------
static int sysfs_readdir(struct file * filp, void * dirent, filldir_t filldir)
{
        struct dentry *dentry = filp->f_dentry;
        struct sysfs_dirent * parent_sd = dentry->d_fsdata;
        struct sysfs_dirent *cursor = filp->private_data;
        struct list_head *p, *q = &cursor->s_sibling;
        ino_t ino;
        int i = filp->f_pos;

        switch (i) {
                case 0:
                        ino = dentry->d_inode->i_ino;
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

                        if (filldir(dirent, ".", 1, i, ino, DT_DIR) < 0)
                                break;
                        filp->f_pos++;
                        i++;

-------
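
The fault address in the oops fits a NULL struct pointer: a load through such
a pointer faults at the byte offset of the member being read. The register dump
shows eax: 00000000, and the bytes at the faulting EIP (<8b> 50 20) decode to a
load from 0x20(%eax). The sketch below illustrates only the arithmetic. The
struct fake_inode is a made-up stand-in whose padding is chosen so that i_ino
lands at offset 0x20; the real struct inode layout for this kernel
configuration is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in, NOT the kernel's struct inode: eight 32-bit
 * padding fields so that i_ino sits at byte offset 0x20.
 */
struct fake_inode {
    uint32_t pad[8];
    uint32_t i_ino;
};

/* Address the CPU would try to load when reading i_ino from an inode
 * located at 'base'. With base == 0 this is just the member offset. */
static uintptr_t fault_address(uintptr_t base)
{
    return base + offsetof(struct fake_inode, i_ino);
}
```

With a base of 0, fault_address() returns 0x20, matching the "virtual address
00000020" reported in the oops under the assumed layout.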

BTW, is this a kexec boot or a normal boot? I don't know whether it has any
effect, but I am trying to find the reason behind messages like
"Entering runlevel: 1".

Thanks
Maneesh

--
Maneesh Soni
Linux Technology Center,
IBM India Software Labs,
Bangalore, India
email: [email protected]
Phone: 91-80-25044990