2006-01-14 00:18:22

by P. Christeas

[permalink] [raw]
Subject: Regression in Autofs, 2.6.15-git

I had been testing some other thing in the kernel, when I noticed that the
commit (in Linus' tree) given below makes *autofs4* crash.
It does crash with a hard oops whenever I try to enter a /net/host/... address
in konqueror's location bar.
Is this a known issue or should I provide more information?


commit fc33a7bb9c6dd8f6e4a014976200f8fdabb3a45c
Author: Christoph Hellwig <[email protected]>
Date: Mon Jan 9 20:52:17 2006 -0800

[PATCH] per-mountpoint noatime/nodiratime

Turn noatime and nodiratime into per-mount instead of per-sb flags.

After all the preparations this is a rather trivial patch. The mount code
needs to treat the two options as per-mount instead of per-superblock, and
touch_atime needs to be changed to check the new MNT_ flags in addition to
the MS_ flags that are kept for filesystems that are always
noatime/nodiratime but not user settable anymore. Besides that core code
only nfs needed an update because it's leaving atime updates to the server
and thus sets the S_NOATIME flag on every inode, but needs to know whether
it's a real noatime mount for an getattr optimization.

While we're at it I've killed the IS_NOATIME/IS_NODIRATIME macros that
were
only used by touch_atime.


2006-01-14 11:35:15

by Andrew Morton

[permalink] [raw]
Subject: Re: Regression in Autofs, 2.6.15-git

"P. Christeas" <[email protected]> wrote:
>
> I had been testing some other thing in the kernel, when I noticed that the
> commit (in Linus' tree) given below makes *autofs4* crash.
> It does crash with a hard oops whenever I try to enter a /net/host/... address
> in konqueror's location bar.
> Is this a known issue or should I provide more information?
>
>
> commit fc33a7bb9c6dd8f6e4a014976200f8fdabb3a45c
> Author: Christoph Hellwig <[email protected]>
> Date: Mon Jan 9 20:52:17 2006 -0800
>
> [PATCH] per-mountpoint noatime/nodiratime
>
> Turn noatime and nodiratime into per-mount instead of per-sb flags.
>
> After all the preparations this is a rather trivial patch. The mount code
> needs to treat the two options as per-mount instead of per-superblock, and
> touch_atime needs to be changed to check the new MNT_ flags in addition to
> the MS_ flags that are kept for filesystems that are always
> noatime/nodiratime but not user settable anymore. Besides that core code
> only nfs needed an update because it's leaving atime updates to the server
> and thus sets the S_NOATIME flag on every inode, but needs to know whether
> it's a real noatime mount for an getattr optimization.
>
> While we're at it I've killed the IS_NOATIME/IS_NODIRATIME macros that
> were
> only used by touch_atime.

Thanks for working that out.

It works for me. Are you able to capture the oops output?

2006-01-14 11:50:53

by P. Christeas

[permalink] [raw]
Subject: Re: Regression in Autofs, 2.6.15-git

On Saturday 14 January 2006 1:34 pm, you wrote:
> Thanks for working that out.
>
> It works for me. Are you able to capture the oops output?
Works in what sense? Are you able to reproduce the oops?
It is quite difficult to reproduce the oops, since it makes the whole system
freeze (the fs part is oopsed, and then all processes depend on it). Hence
I've called it "hard" . It may be captured with a serial console, I 'll give
it a try..

2006-01-14 11:55:26

by Andrew Morton

[permalink] [raw]
Subject: Re: Regression in Autofs, 2.6.15-git

"P. Christeas" <[email protected]> wrote:
>
> On Saturday 14 January 2006 1:34 pm, you wrote:
> > Thanks for working that out.
> >
> > It works for me. Are you able to capture the oops output?
> Works in what sense? Are you able to reproduce the oops?

No, I am not. I did `cd /net/<host>/usr/src' and things worked OK.

> It is quite difficult to reproduce the oops, since it makes the whole system
> freeze (the fs part is oopsed, and then all processes depend on it). Hence
> I've called it "hard" . It may be captured with a serial console, I 'll give
> it a try..

OK, thanks. Also if you're in the console a digital photo of the screen
works nicely.

2006-01-14 12:57:17

by P. Christeas

[permalink] [raw]
Subject: Re: Regression in Autofs, 2.6.15-git

On Saturday 14 January 2006 1:54 pm, Andrew Morton wrote:
> "P. Christeas" <[email protected]> wrote:
> > On Saturday 14 January 2006 1:34 pm, you wrote:
> > > Thanks for working that out.
> > >
> > > It works for me. Are you able to capture the oops output?
> >
> > Works in what sense? Are you able to reproduce the oops?
>
> No, I am not. I did `cd /net/<host>/usr/src' and things worked OK.
>
> > It is quite difficult to reproduce the oops, since it makes the whole
> > system freeze (the fs part is oopsed, and then all processes depend on
> > it). Hence I've called it "hard" . It may be captured with a serial
> > console, I 'll give it a try..
>
> OK, thanks. Also if you're in the console a digital photo of the screen
> works nicely.

Here it is.
(how do I load the symbols into gdb, so that I can see the source listing?
With vmlinux on i386 it doesn't work.)

My auto.net supplied parameter is:
opts="-fstype=nfs,soft,intr,nodev,nosuid,nonstrict"
I use autofs-4.1.3 and mount-2.12r.
The nfs server is a 2.6.15-rc6.


Attachments:
(No filename) (1.01 kB)
autofs.oops (2.08 kB)
Download all attachments

2006-01-14 13:18:11

by Andrew Morton

[permalink] [raw]
Subject: Re: Regression in Autofs, 2.6.15-git

"P. Christeas" <[email protected]> wrote:
>
> On Saturday 14 January 2006 1:54 pm, Andrew Morton wrote:
> > "P. Christeas" <[email protected]> wrote:
> > > On Saturday 14 January 2006 1:34 pm, you wrote:
> > > > Thanks for working that out.
> > > >
> > > > It works for me. Are you able to capture the oops output?
> > >
> > > Works in what sense? Are you able to reproduce the oops?
> >
> > No, I am not. I did `cd /net/<host>/usr/src' and things worked OK.
> >
> > > It is quite difficult to reproduce the oops, since it makes the whole
> > > system freeze (the fs part is oopsed, and then all processes depend on
> > > it). Hence I've called it "hard" . It may be captured with a serial
> > > console, I 'll give it a try..
> >
> > OK, thanks. Also if you're in the console a digital photo of the screen
> > works nicely.
>
> Here it is.

Great, thanks.

> (how do I load the symbols into gdb, so that I can see the source listing?
> With vmlinux on i386 it doesn't work.)

umm, I think this'll work:

Set CONFIG_DEBUG_INFO=y, rebuild, reboot

Look in /proc/modules, see autofs4's starting address

Calculate <offset>=<EIP>-<autofs4's starting address>

gdb /lib/modules/$(uname -r)/kernel/fs/autofs4/autofs4.ko

(gdb) l *<offset>

Still, we can work out what happened:

> Unable to handle kernel NULL pointer dereference at virtual address 00000030
> printing eip:
> *pde = 00000000
> Oops: 0000 [#1]
> PREEMPT SMP
> Modules linked in: nfs autofs4 cpufreq_ondemand cpufreq_userspace cpufreq_powersave p4_clockmod speedstep_lib freq_table nfsd exportfs lockd sunrpc irtty_sir sir_dev irda crc_ccitt rfcomm l2cap bluetooth snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_atiixp snd_ac97_codec snd_ac97_bus snd_pcm snd_timer snd_page_alloc i2c_isa 8139too eth1394 sd_mod ohci1394 ieee1394 loop cx88_blackbird cx8802 tda9887 tuner cx8800 cx88xx i2c_algo_bit video_buf ir_common tveeprom i2c_core btcx_risc usb_storage scsi_mod usbhid ehci_hcd ohci_hcd usbcore video container button battery
> CPU: 1
> EIP: 0060:[<c0162875>] Not tainted VLI
> EFLAGS: 00210202 (2.6.15xrg-gf33dc619)
> EIP is at touch_atime+0x43/0x9f
> eax: 40000000 ebx: db67435c ecx: d8942a00 edx: 00000004
> esi: d3aba6c0 edi: d7e942b0 ebp: 00000004 esp: d3cede50
> ds: 007b es: 007b ss: 0068
> Process konqueror (pid: 4751, threadinfo=d3cec000 task=dfda6a90)
> Stack: <0>00000001 00000001 d362fd50 d3aba6c0 e1b0e727 00000004 d362fd50 00000000
> d3aba6c0 d362fd50 00000000 e1b0edd7 00000004 d362fd50 00000002 d371b8bc
> d362fd50 d362fd50 c1627d40 e1b0e909 d362fd50 d3cedea8 db67435c 00000004
> Call Trace:
> [<e1b0e727>] autofs4_update_usage+0x2c/0x4b [autofs4]
> [<e1b0edd7>] autofs4_revalidate+0x10d/0x121 [autofs4]
> [<e1b0e909>] autofs4_dir_open+0xb7/0x19b [autofs4]
> [<c0158627>] permission+0x7f/0x8c
> [<c0158647>] vfs_permission+0x13/0x17
> [<c0159da5>] may_open+0x53/0x1a1
> [<e1b0e852>] autofs4_dir_open+0x0/0x19b [autofs4]
> [<c014c7cf>] __dentry_open+0xe7/0x1e5
> [<c014c98c>] nameidata_to_filp+0x1f/0x31
> [<c014c8fd>] filp_open+0x30/0x38
> [<c014cb69>] do_sys_open+0x3c/0xaf
> [<c01027cf>] sysenter_past_esp+0x54/0x75
> Code: a8 01 75 7e f6 83 78 01 00 00 02 75 75 f6 c4 04 75 70 f6 c4 08 74 10 0f b7 43 28 25 00 f0 00 00 3d 00 40 00 00 74 5b 85 d2 74 1b <8b> 42 2c a8 08 75 50 a8 10 74 10 0f b7 43 28 25 00 f0 00 00 3d
> <6>note: konqueror[4751] exited with preempt_count 1
>

We test incoming arg `mnt' for NULL so we can ignore that.

You oopsed accessing 0x00000030, and offsetof(super_block, s_flags) is
0x30. So autofs4 has passed in a dentry which has a NULL
dentry->d_inode->i_sb and the new code didn't expect that.


A temp workaround would be something like this:

diff -puN fs/inode.c~a fs/inode.c
--- devel/fs/inode.c~a 2006-01-14 05:16:16.000000000 -0800
+++ devel-akpm/fs/inode.c 2006-01-14 05:17:00.000000000 -0800
@@ -1194,8 +1194,9 @@ void touch_atime(struct vfsmount *mnt, s
return;

if ((inode->i_flags & S_NOATIME) ||
- (inode->i_sb->s_flags & MS_NOATIME) ||
- ((inode->i_sb->s_flags & MS_NODIRATIME) && S_ISDIR(inode->i_mode)))
+ (inode->i_sb && (inode->i_sb->s_flags & MS_NOATIME)) ||
+ ((inode->i_sb && (inode->i_sb->s_flags & MS_NODIRATIME)) &&
+ S_ISDIR(inode->i_mode)))
return;

/*
_

2006-01-14 14:01:33

by Al Viro

[permalink] [raw]
Subject: Re: Regression in Autofs, 2.6.15-git

On Sat, Jan 14, 2006 at 05:17:37AM -0800, Andrew Morton wrote:
> You oopsed accessing 0x00000030, and offsetof(super_block, s_flags) is
> 0x30. So autofs4 has passed in a dentry which has a NULL
> dentry->d_inode->i_sb and the new code didn't expect that.

It's not just new code... inode->i_sb should _never_ be NULL, period.
We set it immediately after memory had been allocated by alloc_inode()
and never modify afterwards, let alone reset to NULL.

2006-01-14 14:05:32

by P. Christeas

[permalink] [raw]
Subject: Re: Regression in Autofs, 2.6.15-git

On Saturday 14 January 2006 4:01 pm, Al Viro wrote:
> On Sat, Jan 14, 2006 at 05:17:37AM -0800, Andrew Morton wrote:
> > You oopsed accessing 0x00000030, and offsetof(super_block, s_flags) is
> > 0x30. So autofs4 has passed in a dentry which has a NULL
> > dentry->d_inode->i_sb and the new code didn't expect that.
>
> It's not just new code... inode->i_sb should _never_ be NULL, period.
> We set it immediately after memory had been allocated by alloc_inode()
> and never modify afterwards, let alone reset to NULL.

Andrew's fix didn't work, so I'm seeking it further..

2006-01-14 14:29:11

by Ian Kent

[permalink] [raw]
Subject: Re: Regression in Autofs, 2.6.15-git

On Sat, 14 Jan 2006, Al Viro wrote:

> On Sat, Jan 14, 2006 at 05:17:37AM -0800, Andrew Morton wrote:
> > You oopsed accessing 0x00000030, and offsetof(super_block, s_flags) is
> > 0x30. So autofs4 has passed in a dentry which has a NULL
> > dentry->d_inode->i_sb and the new code didn't expect that.
>
> It's not just new code... inode->i_sb should _never_ be NULL, period.
> We set it immediately after memory had been allocated by alloc_inode()
> and never modify afterwards, let alone reset to NULL.
>

I don't do such things in the autofs4 module either.
I'm puzzled by this?

Ian

2006-01-14 15:11:27

by P. Christeas

[permalink] [raw]
Subject: Re: Regression in Autofs, 2.6.15-git

> > Unable to handle kernel NULL pointer dereference at virtual address
> > 00000030 printing eip:
> > *pde = 00000000
> > Oops: 0000 [#1]
> > PREEMPT SMP
> > Modules linked in: nfs autofs4 cpufreq_ondemand cpufreq_userspace
> > cpufreq_powersave p4_clockmod speedstep_lib freq_table nfsd exportfs
> > lockd sunrpc irtty_sir sir_dev irda crc_ccitt rfcomm l2cap bluetooth
> > snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device
> > snd_pcm_oss snd_mixer_oss snd_atiixp snd_ac97_codec snd_ac97_bus snd_pcm
> > snd_timer snd_page_alloc i2c_isa 8139too eth1394 sd_mod ohci1394 ieee1394
> > loop cx88_blackbird cx8802 tda9887 tuner cx8800 cx88xx i2c_algo_bit
> > video_buf ir_common tveeprom i2c_core btcx_risc usb_storage scsi_mod
> > usbhid ehci_hcd ohci_hcd usbcore video container button battery CPU: 1
> > EIP: 0060:[<c0162875>] Not tainted VLI
> > EFLAGS: 00210202 (2.6.15xrg-gf33dc619)
> > EIP is at touch_atime+0x43/0x9f
> > eax: 40000000 ebx: db67435c ecx: d8942a00 edx: 00000004
> > esi: d3aba6c0 edi: d7e942b0 ebp: 00000004 esp: d3cede50
^^^^^^^^^^^^^^^^^^^^^
Note these..

> > ds: 007b es: 007b ss: 0068
> > Process konqueror (pid: 4751, threadinfo=d3cec000 task=dfda6a90)
> > Stack: <0>00000001 00000001 d362fd50 d3aba6c0 e1b0e727 00000004 d362fd50
> > 00000000 d3aba6c0 d362fd50 00000000 e1b0edd7 00000004 d362fd50 00000002
> > d371b8bc d362fd50 d362fd50 c1627d40 e1b0e909 d362fd50 d3cedea8 db67435c
> > 00000004 Call Trace:
> > [<e1b0e727>] autofs4_update_usage+0x2c/0x4b [autofs4]
> > [<e1b0edd7>] autofs4_revalidate+0x10d/0x121 [autofs4]
> > [<e1b0e909>] autofs4_dir_open+0xb7/0x19b [autofs4]
> > [<c0158627>] permission+0x7f/0x8c
> > [<c0158647>] vfs_permission+0x13/0x17
> > [<c0159da5>] may_open+0x53/0x1a1
> > [<e1b0e852>] autofs4_dir_open+0x0/0x19b [autofs4]
> > [<c014c7cf>] __dentry_open+0xe7/0x1e5
> > [<c014c98c>] nameidata_to_filp+0x1f/0x31
> > [<c014c8fd>] filp_open+0x30/0x38
> > [<c014cb69>] do_sys_open+0x3c/0xaf
> > [<c01027cf>] sysenter_past_esp+0x54/0x75
> > Code: a8 01 75 7e f6 83 78 01 00 00 02 75 75 f6 c4 04 75 70 f6 c4 08 74
> > 10 0f b7 43 28 25 00 f0 00 00 3d 00 40 00 00 74 5b 85 d2 74 1b <8b> 42 2c
> > a8 08 75 50 a8 10 74 10 0f b7 43 28 25 00 f0 00 00 3d <6>note:
> > konqueror[4751] exited with preempt_count 1
>
> We test incoming arg `mnt' for NULL so we can ignore that.
>
Still, it seems that the problem lies in 'mnt'.
EIP: 0060:[<c0162875>] -->
0xc0162875 is in touch_atime (fs/inode.c:1205).
and:
Dump of assembler code for function touch_atime:
0xc0162832 <touch_atime+0>: push %esi
0xc0162833 <touch_atime+1>: push %ebx
0xc0162834 <touch_atime+2>: push %eax
0xc0162835 <touch_atime+3>: push %eax
0xc0162836 <touch_atime+4>: mov 0x14(%esp),%edx
...
0xc0162871 <touch_atime+63>: test %edx,%edx
0xc0162873 <touch_atime+65>: je 0xc0162890 <touch_atime+94>
0xc0162875 <touch_atime+67>: mov 0x2c(%edx),%eax
0xc0162878 <touch_atime+70>: test $0x8,%al
0xc016287a <touch_atime+72>: jne 0xc01628cc <touch_atime+154>

Doesn't that mean that mnt==0x0004 ? Clearly wrong. I can also see from
Christian's patch that mnt wasn't previously used, so it makes perfect sense
for that commit to introduce the oops.

I guess the problem lies in autofs4_revalidate (fs/autofs4/root.c:420), the
nd->mnt value..

I will add a silly validator (mnt>0xff) instead of (mnt) and see..

2006-01-14 15:25:50

by P. Christeas

[permalink] [raw]
Subject: Re: Regression in Autofs, 2.6.15-git

On Saturday 14 January 2006 5:11 pm, P. Christeas wrote:
> Doesn't that mean that mnt==0x0004 ? Clearly wrong. I can also see from
> Christian's patch that mnt wasn't previously used, so it makes perfect
> sense for that commit to introduce the oops.
>
> I guess the problem lies in autofs4_revalidate (fs/autofs4/root.c:420), the
> nd->mnt value..
>
> I will add a silly validator (mnt>0xff) instead of (mnt) and see..
Confirmed: ((u32)mnt>0xff) discards the invalid 'mnt' value and the oops
disappears.
That is, the autofs4 code needs some debugging :( .

2006-01-14 17:07:12

by Ian Kent

[permalink] [raw]
Subject: Re: Regression in Autofs, 2.6.15-git

On Sat, 2006-01-14 at 17:25 +0200, P. Christeas wrote:
> On Saturday 14 January 2006 5:11 pm, P. Christeas wrote:
> > Doesn't that mean that mnt==0x0004 ? Clearly wrong. I can also see from
> > Christian's patch that mnt wasn't previously used, so it makes perfect
> > sense for that commit to introduce the oops.
> >
> > I guess the problem lies in autofs4_revalidate (fs/autofs4/root.c:420), the
> > nd->mnt value..
> >
> > I will add a silly validator (mnt>0xff) instead of (mnt) and see..
> Confirmed: ((u32)mnt>0xff) discards the invalid 'mnt' value and the oops
> disappears.
> That is, the autofs4 code needs some debugging :( .

Yes. It's me again.

Could you try this patch please.

--- linux-2.6.15/fs/autofs4/root.c.dumb-nameidata 2006-01-15 01:01:26.000000000 +0800
+++ linux-2.6.15/fs/autofs4/root.c 2006-01-15 01:02:12.000000000 +0800
@@ -193,6 +193,8 @@ static int autofs4_dir_open(struct inode
if (!empty)
d_invalidate(dentry);

+ nd.dentry = dentry;
+ nd.mnt = mnt;
nd.flags = LOOKUP_DIRECTORY;
status = (dentry->d_op->d_revalidate)(dentry, &nd);



2006-01-14 17:54:42

by P. Christeas

[permalink] [raw]
Subject: Re: Regression in Autofs, 2.6.15-git

Well done Ian!
That patch fixes the problem.
That explains the conditions needed to reproduce the oops. Konqueror would
perform a number of lookups (some failing) before requesting the mountpoint
directory.

On Saturday 14 January 2006 7:06 pm, Ian Kent wrote:

> Yes. It's me again.
>
> Could you try this patch please.
>
> --- linux-2.6.15/fs/autofs4/root.c.dumb-nameidata 2006-01-15
> 01:01:26.000000000 +0800 +++ linux-2.6.15/fs/autofs4/root.c 2006-01-15
> 01:02:12.000000000 +0800 @@ -193,6 +193,8 @@ static int
> autofs4_dir_open(struct inode
> if (!empty)
> d_invalidate(dentry);
>
> + nd.dentry = dentry;
> + nd.mnt = mnt;
> nd.flags = LOOKUP_DIRECTORY;
> status = (dentry->d_op->d_revalidate)(dentry, &nd);