2009-01-18 22:23:32

by Bernhard Schmidt

Subject: BUG: Possible cifs+IPv6-Regression 2.6.27.4 -> 2.6.27.9

Hello,

First off, this might be an Ubuntu regression and I've also opened a
bug report there, but I think it is in mainline as well. Maybe someone
familiar with the code can see the problem at first glance without too
much debugging.

For quite a while I've been mounting my CIFS fileserver at home
(running Samba 3.2) using IPv6 transport:

# mount -t cifs -o user=berni,ip=2001:xxx::yy //fileserv/pub /pub

This used to work fine with the Ubuntu kernel 2.6.27-9, which according
to the changelog is 2.6.27.4-based. Ubuntu kernel 2.6.27-11 (currently
in the -proposed repository, so not rolled out yet, and apparently based
on 2.6.27.9) throws a BUG in my face (and does not mount, obviously).
Since according to the Ubuntu changelog the only changes in cifs have
been the ones in -stable 2.6.27.7 to 2.6.27.8, namely

* cifs: Reduce number of socket retries in large write path
* cifs: Fix error in smb_send2
* cifs: Fix cifs reconnection flags
* cifs: remove unused list, add new cifs sock list to prepare for
mount/umount fix
* cifs: clean up server protocol handling
* cifs: disable sharing session and tcon and add new TCP sharing code
* cifs: reinstate sharing of SMB sessions sans races
* cifs: minor cleanup to cifs_mount
* cifs: reinstate sharing of tree connections
* cifs: Fix build break
* cifs: Fix check for tcon seal setting and fix oops on failed mount
from earlier patch
* cifs: prevent cifs_writepages() from skipping unwritten pages
* cifs: fix check for dead tcon in smb_init

I assume it's an upstream problem.

[28816.788084] CIFS VFS: Error connecting to socket. Aborting operation
[28816.788094] CIFS VFS: cifs_mount failed w/return code = -113
[28816.788121] BUG: unable to handle kernel paging request at 69000030
[28816.788125] IP: [<f9bfde00>] :cifs:cifs_read_super+0xa0/0x1e0
[28816.788140] *pde = 00000000
[28816.788144] Oops: 0000 [#1] SMP
[28816.788148] Modules linked in: nls_utf8 ufs qnx4 hfsplus hfs minix
ntfs vfat msdos fat jfs xfs reiserfs ext2 nls_cp437 cifs af_packet
binfmt_misc rfcomm bridge stp bnep sco l2cap bluetooth kvm_amd kvm ppdev
tun ipv6 pci_slot container sbs sbshc video output battery
iptable_filter ip_tables x_tables ac parport_pc lp parport serio_raw
psmouse snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm pcspkr
snd_seq_dummy k8temp snd_seq_oss snd_seq_midi snd_rawmidi
snd_seq_midi_event snd_seq snd_timer snd_seq_device snd soundcore
i2c_piix4 snd_page_alloc i2c_core evdev dm_multipath scsi_dh pl2303
usbserial fglrx(P) agpgart wmi button shpchp pci_hotplug ext3 jbd
mbcache sr_mod cdrom pata_acpi sd_mod crc_t10dif pata_atiixp sg usbhid
hid usb_storage libusual ata_generic ahci ohci_hcd ehci_hcd libata
usbcore scsi_mod dock r8169 mii dm_mirror dm_log dm_snapshot dm_mod
thermal processor fan fbcon tileblit font bitblit softcursor fuse
[28816.788215]
[28816.788219] Pid: 20540, comm: mount.cifs Tainted: P
(2.6.27-11-generic #1)
[28816.788222] EIP: 0060:[<f9bfde00>] EFLAGS: 00010286 CPU: 0
[28816.788232] EIP is at cifs_read_super+0xa0/0x1e0 [cifs]
[28816.788234] EAX: 00000044 EBX: 69000000 ECX: ffffffff EDX: 00000046
[28816.788237] ESI: d43b5000 EDI: ffffff8f EBP: d42a1e8c ESP: d42a1e6c
[28816.788239] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[28816.788242] Process mount.cifs (pid: 20540, ti=d42a0000 task=d1b48c90
task.ti=d42a0000)
[28816.788245] Stack: f9c24428 ffffff8f d31aa000 d8951e00 0000004e
d8951e00 d8951e00 00000000
[28816.788252] d42a1eb0 f9bfdfa9 00000000 f6603e80 fffffff4
d31aa000 f6603e80 00000000
[28816.788258] f9c38520 d42a1ed8 c01b468e d43b5000 f6603e80
d31aa000 00000040 d4086000
[28816.788264] Call Trace:
[28816.788270] [<f9bfdfa9>] ? cifs_get_sb+0x69/0xc0 [cifs]
[28816.788282] [<c01b468e>] ? vfs_kern_mount+0x5e/0x130
[28816.788292] [<c01b47be>] ? do_kern_mount+0x3e/0xe0
[28816.788296] [<c01cccff>] ? do_new_mount+0x6f/0x90
[28816.788301] [<c01cd242>] ? do_mount+0x1d2/0x1f0
[28816.788306] [<c01ca95d>] ? exact_copy_from_user+0x4d/0xa0
[28816.788310] [<c01caf6e>] ? copy_mount_options+0x6e/0xd0
[28816.788314] [<c01cd2f1>] ? sys_mount+0x91/0xc0
[28816.788318] [<c0103f7b>] ? sysenter_do_call+0x12/0x2f
[28816.788323] =======================
[28816.788324] Code: 65 c6 8b 43 30 8b 55 f0 c6 04 10 00 8b 45 e8 89 f1
89 da 89 04 24 8b 45 ec e8 fd dc 00 00 85 c0 89 c7 74 57 8b 45 08 85 c0
74 30 <8b> 43 30 85 c0 74 0c e8 a4 f0 5a c6 c7 43 30 00 00 00 00 8b 43
[28816.788354] EIP: [<f9bfde00>] cifs_read_super+0xa0/0x1e0 [cifs]
SS:ESP 0068:d42a1e6c
[28816.788365] ---[ end trace 9d71176ecad6924f ]---

Do you need any further information? As far as I can see the Ubuntu
Jaunty packaged version of 2.6.28 works fine, but I did not run
extensive tests on it.

Bernhard


2009-01-19 02:03:52

by Jeff Layton

Subject: Re: [linux-cifs-client] BUG: Possible cifs+IPv6-Regression 2.6.27.4 -> 2.6.27.9

On Sun, 18 Jan 2009 23:13:29 +0100
Bernhard Schmidt <[email protected]> wrote:

> Hello,
>
> first off, this might be an Ubuntu regression and I've also opened a
> bugreport there, but I think it is in mainline as well. Maybe someone
> familiar with the code can see the problem at first glance without too
> much debugging.
>
> For quite a while I've been mounting my CIFS fileserver at home
> (running Samba 3.2) using IPv6 transport:
>
> # mount -t cifs -o user=berni,ip=2001:xxx::yy //fileserv/pub /pub
>
> This used to work fine with the Ubuntu kernel 2.6.27-9, which according
> to the changelog is 2.6.27.4-based. Ubuntu kernel 2.6.27-11 (currently
> in the -proposed repository, so not rolled out yet, appears to be based
> on 2.6.27.9) throws a BUG in my face (and does not mount, obviously).
> Since according to the Ubuntu changelog the only changes in cifs have
> been the ones in -stable 2.6.27.7 to 2.6.27.8, namely
>
> * cifs: Reduce number of socket retries in large write path
> * cifs: Fix error in smb_send2
> * cifs: Fix cifs reconnection flags
> * cifs: remove unused list, add new cifs sock list to prepare for
> mount/umount fix
> * cifs: clean up server protocol handling
> * cifs: disable sharing session and tcon and add new TCP sharing code
> * cifs: reinstate sharing of SMB sessions sans races
> * cifs: minor cleanup to cifs_mount
> * cifs: reinstate sharing of tree connections
> * cifs: Fix build break
> * cifs: Fix check for tcon seal setting and fix oops on failed mount
> from earlier patch
> * cifs: prevent cifs_writepages() from skipping unwritten pages
> * cifs: fix check for dead tcon in smb_init
>
> I assume it's an upstream problem.
>
> [28816.788084] CIFS VFS: Error connecting to socket. Aborting operation
> [28816.788094] CIFS VFS: cifs_mount failed w/return code = -113
> [28816.788121] BUG: unable to handle kernel paging request at 69000030
> [28816.788125] IP: [<f9bfde00>] :cifs:cifs_read_super+0xa0/0x1e0
> [28816.788140] *pde = 00000000
> [28816.788144] Oops: 0000 [#1] SMP
> [28816.788148] Modules linked in: nls_utf8 ufs qnx4 hfsplus hfs minix
> ntfs vfat msdos fat jfs xfs reiserfs ext2 nls_cp437 cifs af_packet
> binfmt_misc rfcomm bridge stp bnep sco l2cap bluetooth kvm_amd kvm ppdev
> tun ipv6 pci_slot container sbs sbshc video output battery
> iptable_filter ip_tables x_tables ac parport_pc lp parport serio_raw
> psmouse snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm pcspkr
> snd_seq_dummy k8temp snd_seq_oss snd_seq_midi snd_rawmidi
> snd_seq_midi_event snd_seq snd_timer snd_seq_device snd soundcore
> i2c_piix4 snd_page_alloc i2c_core evdev dm_multipath scsi_dh pl2303
> usbserial fglrx(P) agpgart wmi button shpchp pci_hotplug ext3 jbd
> mbcache sr_mod cdrom pata_acpi sd_mod crc_t10dif pata_atiixp sg usbhid
> hid usb_storage libusual ata_generic ahci ohci_hcd ehci_hcd libata
> usbcore scsi_mod dock r8169 mii dm_mirror dm_log dm_snapshot dm_mod
> thermal processor fan fbcon tileblit font bitblit softcursor fuse
> [28816.788215]
> [28816.788219] Pid: 20540, comm: mount.cifs Tainted: P
> (2.6.27-11-generic #1)
> [28816.788222] EIP: 0060:[<f9bfde00>] EFLAGS: 00010286 CPU: 0
> [28816.788232] EIP is at cifs_read_super+0xa0/0x1e0 [cifs]
> [28816.788234] EAX: 00000044 EBX: 69000000 ECX: ffffffff EDX: 00000046
> [28816.788237] ESI: d43b5000 EDI: ffffff8f EBP: d42a1e8c ESP: d42a1e6c
> [28816.788239] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> [28816.788242] Process mount.cifs (pid: 20540, ti=d42a0000 task=d1b48c90
> task.ti=d42a0000)
> [28816.788245] Stack: f9c24428 ffffff8f d31aa000 d8951e00 0000004e
> d8951e00 d8951e00 00000000
> [28816.788252] d42a1eb0 f9bfdfa9 00000000 f6603e80 fffffff4
> d31aa000 f6603e80 00000000
> [28816.788258] f9c38520 d42a1ed8 c01b468e d43b5000 f6603e80
> d31aa000 00000040 d4086000
> [28816.788264] Call Trace:
> [28816.788270] [<f9bfdfa9>] ? cifs_get_sb+0x69/0xc0 [cifs]
> [28816.788282] [<c01b468e>] ? vfs_kern_mount+0x5e/0x130
> [28816.788292] [<c01b47be>] ? do_kern_mount+0x3e/0xe0
> [28816.788296] [<c01cccff>] ? do_new_mount+0x6f/0x90
> [28816.788301] [<c01cd242>] ? do_mount+0x1d2/0x1f0
> [28816.788306] [<c01ca95d>] ? exact_copy_from_user+0x4d/0xa0
> [28816.788310] [<c01caf6e>] ? copy_mount_options+0x6e/0xd0
> [28816.788314] [<c01cd2f1>] ? sys_mount+0x91/0xc0
> [28816.788318] [<c0103f7b>] ? sysenter_do_call+0x12/0x2f
> [28816.788323] =======================
> [28816.788324] Code: 65 c6 8b 43 30 8b 55 f0 c6 04 10 00 8b 45 e8 89 f1
> 89 da 89 04 24 8b 45 ec e8 fd dc 00 00 85 c0 89 c7 74 57 8b 45 08 85 c0
> 74 30 <8b> 43 30 85 c0 74 0c e8 a4 f0 5a c6 c7 43 30 00 00 00 00 8b 43
> [28816.788354] EIP: [<f9bfde00>] cifs_read_super+0xa0/0x1e0 [cifs]
> SS:ESP 0068:d42a1e6c
> [28816.788365] ---[ end trace 9d71176ecad6924f ]---
>
> Do you need any further information? As far as I can see the Ubuntu
> Jaunty packaged version of 2.6.28 works fine, but I did not run
> extensive tests on it.

How reproducible is this? Can you make it happen on every attempt?
Is this kernel being built with CONFIG_CIFS_DFS_UPCALL=y ?

We see this message:

> [28816.788094] CIFS VFS: cifs_mount failed w/return code = -113

...and then we see it oops in:

> [28816.788125] IP: [<f9bfde00>] :cifs:cifs_read_super+0xa0/0x1e0

After you see that error message, the code in that function doesn't do
much. Mainly, it just checks some pointers and frees a few things. If
it's crashing then it's probably happening when the code dereferences
the struct members in out_mount_failed, but I don't see any obvious
bugs jumping out at me.

Could you email me the cifs.ko module from this kernel? I'd like to
disassemble it and have a look at where it crashed. I may not be
able to tell much, but it's worth a look...

Thanks,
--
Jeff Layton <[email protected]>

2009-01-19 10:33:15

by Bernhard Schmidt

Subject: Re: [linux-cifs-client] BUG: Possible cifs+IPv6-Regression 2.6.27.4 -> 2.6.27.9

On Sun, Jan 18, 2009 at 09:03:14PM -0500, Jeff Layton wrote:

> How reproducible is this? Can you make it happen on every attempt?
> Is this kernel being built with CONFIG_CIFS_DFS_UPCALL=y ?

Happens every time. CIFS_DFS_UPCALL is set, yes.

> Could you email me the cifs.ko module from this kernel? I'd like to
> disassemble it and have a look at where it crashed. I may not be
> able to tell much, but it's worth a look...

Will do unicast when I'm back at my workstation at home.

Thanks,
Bernhard

2009-01-19 21:07:55

by Jeff Layton

Subject: Re: [linux-cifs-client] BUG: Possible cifs+IPv6-Regression 2.6.27.4 -> 2.6.27.9

On Mon, 19 Jan 2009 11:32:48 +0100
Bernhard Schmidt <[email protected]> wrote:

> On Sun, Jan 18, 2009 at 09:03:14PM -0500, Jeff Layton wrote:
>
> > How reproducible is this? Can you make it happen on every attempt?
> > Is this kernel being built with CONFIG_CIFS_DFS_UPCALL=y ?
>
> Happens every time. CIFS_DFS_UPCALL is set, yes.
>
> > Could you email me the cifs.ko module from this kernel? I'd like to
> > disassemble it and have a look at where it crashed. I may not be
> > able to tell much, but it's worth a look...
>
> Will do unicast when I'm back at my workstation at home.
>
> Thanks,
> Bernhard

Thanks for the kmod. Kernel crashed doing this:

e00: 8b 43 30 mov 0x30(%ebx),%eax

...which checks out with the register dump: %ebx is 0x69000000,
and the address we failed to look up was 0x69000030.
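
The check against the register dump can be reproduced with a little
arithmetic. A minimal Python sketch, assuming only the values printed in
the oops; the helper name effective_address is made up for illustration:

```python
# Values copied from the oops: EBX from the register dump, and the
# displacement from "mov 0x30(%ebx),%eax" (opcode bytes 8b 43 30).
EBX = 0x69000000
DISPLACEMENT = 0x30


def effective_address(base, disp):
    """Address of a base+displacement memory operand on 32-bit x86."""
    return (base + disp) & 0xFFFFFFFF


fault = effective_address(EBX, DISPLACEMENT)
print(f"{fault:08x}")  # 69000030, the address in the "BUG:" line
```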

My guess from a cursory look at the assembly is that %ebx should be
holding a pointer to cifs_sb. It's referenced quite a few times, but
doesn't seem to change until just before returning from the function.
The interesting bit is that there are a lot of other places (even some
that look like they've probably already been traversed) in this code
where %ebx is dereferenced but they didn't trigger the problem...

That said, there's a lot of jumping around in this assembly code and
it's not completely clear to me how it got to the point where it crashed.
We'll probably need to see if this can be independently reproduced. Can
you email along the details of how you're reproducing this? In
particular:

1) all mount options being used
2) details on the server (what OS, what version of samba, etc)
3) version of mount.cifs being used

--
Jeff Layton <[email protected]>

2009-01-19 21:38:37

by Bernhard Schmidt

Subject: Re: [linux-cifs-client] BUG: Possible cifs+IPv6-Regression 2.6.27.4 -> 2.6.27.9

On Mon, Jan 19, 2009 at 04:07:25PM -0500, Jeff Layton wrote:

> That said, there's a lot of jumping around in this assembly code and
> it's not completely clear to me how it got to the point where it crashed.
> We'll probably need to see if this can be independently reproduced. Can
> you email along the details of how you're reproducing this? In
> particular:
>
> 1) all mount options being used

Not more than the ones shown in my original mail. The exact mount
command is *checkingfirewall*

mount -t cifs -o user=berni,ip=2001:a60:f001:1::69 //fileserv/pub /pub

> 2) details on the server (what OS, what version of samba, etc)
> 3) version of mount.cifs being used

The server is a Debian Lenny based i386 system running a self-compiled
vanilla 2.6.28 kernel and Debian packaged Samba 3.2.5-3.

The client is Ubuntu Intrepid (8.10) with intrepid-security,
intrepid-updates and intrepid-proposed enabled. mount.cifs -V does not
work mysteriously (displays the usage instruction), the package version
is 3.2.3-1ubuntu3.4.

None of those programs have changed. After booting into kernel
2.6.27-7.14 (intrepid release) or 2.6.27-9.19 (intrepid-security,
intrepid-updates) the above noted mount command works fine, after
booting 2.6.27-11.23 (intrepid-proposed) it's broken. I can change
between those client kernel versions and reproduce it.

The changes between the working and the non-working version should
be as noted in

https://launchpad.net/ubuntu/intrepid/+source/linux/2.6.27-10.20
https://launchpad.net/ubuntu/intrepid/+source/linux/2.6.27-11.23

And I've just discovered something. Mind you, I'm trying to mount
2001:a60:f001:1::69. However, after entering the password the system
tries to reach another address

22:31:24.336790 IP6 2001:a60:f001:1:219:66ff:fe8b:a6e > ff02::1:ff00:69:
ICMP6, neighbor solicitation, who has 2001:a60:f001:1:55:abe3:0:69,
length 32

The (wrong) address changes slightly between consecutive tries, I've seen
2001:a60:f001:1:4f:abe3:0:69
2001:a60:f001:1:58:abe3:0:69
as well.

Also, when I try some other (legal, but unused) address it bails out at
another code location, e.g.

mount -t cifs -o user=berni,ip=2001:a60:f001:1:cc::69 //fileserv/pub
/pub

results in an immediate (!)

[ 486.470983] BUG: unable to handle kernel paging request at 0000cc0c
[ 486.470992] IP: [<f9c0a3d6>] :cifs:ipv6_connect+0x46/0x1a0
[ 486.471008] *pde = 00000000
[ 486.471012] Oops: 0000 [#13] SMP
[ 486.471016] Modules linked in: nls_cp437 cifs af_packet binfmt_misc rfcomm bridge stp bnep sco l2cap bluetooth kvm_amd kvm ppdev tun ipv6 pci_slot container sbs sbshc video output battery
iptable_filter ip_tables x_tables ac parport_pc lp parport serio_raw pcspkr psmouse snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi k8temp snd_rawmidi
snd_seq_midi_event snd_seq snd_timer snd_seq_device snd soundcore snd_page_alloc i2c_piix4 i2c_core evdev dm_multipath scsi_dh pl2303 usbserial fglrx(P) agpgart wmi button shpchp pci_hotplug ext3 jbd
mbcache sr_mod cdrom pata_acpi sd_mod crc_t10dif pata_atiixp sg usbhid hid usb_storage libusual ata_generic ahci ohci_hcd ehci_hcd libata usbcore scsi_mod dock r8169 mii dm_mirror dm_log dm_snapshot
dm_mod thermal processor fan fbcon tileblit font bitblit softcursor fuse
[ 486.471076]
[ 486.471079] Pid: 6607, comm: mount.cifs Tainted: P D (2.6.27-11-generic #1)
[ 486.471082] EIP: 0060:[<f9c0a3d6>] EFLAGS: 00010246 CPU: 0
[ 486.471092] EIP is at ipv6_connect+0x46/0x1a0 [cifs]
[ 486.471095] EAX: 0000cc00 EBX: f9c3b094 ECX: 0000001c EDX: c4ecfe44
[ 486.471097] ESI: c4ecfe44 EDI: c4ecfe54 EBP: c4ecfd94 ESP: c4ecfd7c
[ 486.471100] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 486.471103] Process mount.cifs (pid: 6607, ti=c4ece000 task=ce5abed0 task.ti=c4ece000)
[ 486.471105] Stack: c4ecfd94 f9c1708b 00000000 f9c3b094 c4eaa000 f9c3b094 c4ecfe64 f9c0bf5c
[ 486.471112] c10b1030 c4eac000 c4ecfdcc c0189c8d c4eab000 000280d0 f9c3b094 000a100c
[ 486.471118] 00000001 caa0e3c0 dbf3e200 0000000c 00000000 00000020 c4ecfe4c c4ecfe0c
[ 486.471124] Call Trace:
[ 486.471127] [<f9c1708b>] ? cifs_inet_pton+0x7b/0x80 [cifs]
[ 486.471141] [<f9c0bf5c>] ? cifs_mount+0x46c/0xda0 [cifs]
[ 486.471151] [<c0189c8d>] ? prep_new_page+0xdd/0x160
[ 486.471161] [<c024e132>] ? idr_get_empty_slot+0xe2/0x270
[ 486.471167] [<f9bfddf3>] ? cifs_read_super+0x93/0x1e0 [cifs]
[ 486.471176] [<c01b38f0>] ? set_anon_super+0x0/0xd0
[ 486.471181] [<f9bfdfa9>] ? cifs_get_sb+0x69/0xc0 [cifs]
[ 486.471192] [<c01b468e>] ? vfs_kern_mount+0x5e/0x130
[ 486.471197] [<c01b47be>] ? do_kern_mount+0x3e/0xe0
[ 486.471201] [<c01cccff>] ? do_new_mount+0x6f/0x90
[ 486.471206] [<c01cd242>] ? do_mount+0x1d2/0x1f0
[ 486.471210] [<c01ca95d>] ? exact_copy_from_user+0x4d/0xa0
[ 486.471214] [<c01caf6e>] ? copy_mount_options+0x6e/0xd0
[ 486.471218] [<c01cd2f1>] ? sys_mount+0x91/0xc0
[ 486.471222] [<c0103f7b>] ? sysenter_do_call+0x12/0x2f
[ 486.471227] =======================
[ 486.471228] Code: 8b 02 85 c0 0f 84 eb 00 00 00 66 83 7e 02 00 66 c7 06 0a 00 66 c7 45 f2 00 00 75 59 66 c7 46 02 01 bd 8b 07 b9 1c 00 00 00 89 f2 <8b> 58 0c c7 04 24 00 00 00 00 ff 53 10 85 c0 89
c3 78 67 8b 07
[ 486.471257] EIP: [<f9c0a3d6>] ipv6_connect+0x46/0x1a0 [cifs] SS:ESP 0068:c4ecfd7c
[ 486.471269] ---[ end trace cbe7d886432a8981 ]---#

This looks like the address is being corrupted in memory. Maybe the
original bug was there before and did not trigger because the server was
reachable.
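
One way to see which part of the address is being mangled is to compare
the requested address with the ones seen in the neighbor solicitations,
byte by byte. A rough Python sketch using the standard ipaddress module;
the helper differing_bytes is a hypothetical name, and this only inspects
the addresses quoted above, not any kernel state:

```python
import ipaddress


def differing_bytes(a, b):
    """Return the indexes (0-15) at which two IPv6 addresses differ."""
    pa = ipaddress.IPv6Address(a).packed
    pb = ipaddress.IPv6Address(b).packed
    return [i for i in range(16) if pa[i] != pb[i]]


requested = "2001:a60:f001:1::69"
observed = [
    "2001:a60:f001:1:55:abe3:0:69",
    "2001:a60:f001:1:4f:abe3:0:69",
    "2001:a60:f001:1:58:abe3:0:69",
]

results = {addr: differing_bytes(requested, addr) for addr in observed}
for addr, diff in results.items():
    print(addr, diff)
```

For all three observed addresses only bytes 9 to 11 differ (counting
from 0), which is inside the run of zeroes abbreviated by "::" in the
mount option.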

Bernhard

2009-01-20 12:25:49

by Jeff Layton

Subject: Re: [linux-cifs-client] BUG: Possible cifs+IPv6-Regression 2.6.27.4 -> 2.6.27.9

On Mon, 19 Jan 2009 22:38:20 +0100
Bernhard Schmidt <[email protected]> wrote:

> On Mon, Jan 19, 2009 at 04:07:25PM -0500, Jeff Layton wrote:
>
> > That said, there's a lot of jumping around in this assembly code and
> > it's not completely clear to me how it got to the point where it crashed.
> > We'll probably need to see if this can be independently reproduced. Can
> > you email along the details of how you're reproducing this? In
> > particular:
> >
> > 1) all mount options being used
>
> Not more than the ones shown in my original mail. The exact mount
> command is *checkingfirewall*
>
> mount -t cifs -o user=berni,ip=2001:a60:f001:1::69 //fileserv/pub /pub
>
> > 2) details on the server (what OS, what version of samba, etc)
> > 3) version of mount.cifs being used
>
> The server is a Debian Lenny based i386 system running a self-compiled
> vanilla 2.6.28 kernel and Debian packaged Samba 3.2.5-3.
>
> The client is Ubuntu Intrepid (8.10) with intrepid-security,
> intrepid-updates and intrepid-proposed enabled. mount.cifs -V does not
> work mysteriously (displays the usage instruction), the package version
> is 3.2.3-1ubuntu3.4.
>
> None of those programs have changed. After booting into kernel
> 2.6.27-7.14 (intrepid release) or 2.6.27-9.19 (intrepid-security,
> intrepid-updates) the above noted mount command works fine, after
> booting 2.6.27-11.23 (intrepid-proposed) it's broken. I can change
> between those client kernel versions and reproduce it.
>
> The changes between the working and the non-working version should
> be as noted in
>
> https://launchpad.net/ubuntu/intrepid/+source/linux/2.6.27-10.20
> https://launchpad.net/ubuntu/intrepid/+source/linux/2.6.27-11.23
>
> And I've just discovered something. Mind you, I'm trying to mount
> 2001:a60:f001:1::69. However, after entering the password the system
> tries to reach another address
>
> 22:31:24.336790 IP6 2001:a60:f001:1:219:66ff:fe8b:a6e > ff02::1:ff00:69:
> ICMP6, neighbor solicitation, who has 2001:a60:f001:1:55:abe3:0:69,
> length 32
>
> The (wrong) address changes slightly between consecutive tries, I've seen
> 2001:a60:f001:1:4f:abe3:0:69
> 2001:a60:f001:1:58:abe3:0:69
> as well.
>
> Also, when I try some other (legal, but unused) address it bails out at
> another code location, e.g.
>
> mount -t cifs -o user=berni,ip=2001:a60:f001:1:cc::69 //fileserv/pub
> /pub
>
> results in an immediate (!)
>
> [ 486.470983] BUG: unable to handle kernel paging request at 0000cc0c
> [ 486.470992] IP: [<f9c0a3d6>] :cifs:ipv6_connect+0x46/0x1a0
> [ 486.471008] *pde = 00000000
> [ 486.471012] Oops: 0000 [#13] SMP
> [ 486.471016] Modules linked in: nls_cp437 cifs af_packet binfmt_misc rfcomm bridge stp bnep sco l2cap bluetooth kvm_amd kvm ppdev tun ipv6 pci_slot container sbs sbshc video output battery
> iptable_filter ip_tables x_tables ac parport_pc lp parport serio_raw pcspkr psmouse snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi k8temp snd_rawmidi
> snd_seq_midi_event snd_seq snd_timer snd_seq_device snd soundcore snd_page_alloc i2c_piix4 i2c_core evdev dm_multipath scsi_dh pl2303 usbserial fglrx(P) agpgart wmi button shpchp pci_hotplug ext3 jbd
> mbcache sr_mod cdrom pata_acpi sd_mod crc_t10dif pata_atiixp sg usbhid hid usb_storage libusual ata_generic ahci ohci_hcd ehci_hcd libata usbcore scsi_mod dock r8169 mii dm_mirror dm_log dm_snapshot
> dm_mod thermal processor fan fbcon tileblit font bitblit softcursor fuse
> [ 486.471076]
> [ 486.471079] Pid: 6607, comm: mount.cifs Tainted: P D (2.6.27-11-generic #1)
> [ 486.471082] EIP: 0060:[<f9c0a3d6>] EFLAGS: 00010246 CPU: 0
> [ 486.471092] EIP is at ipv6_connect+0x46/0x1a0 [cifs]
> [ 486.471095] EAX: 0000cc00 EBX: f9c3b094 ECX: 0000001c EDX: c4ecfe44
> [ 486.471097] ESI: c4ecfe44 EDI: c4ecfe54 EBP: c4ecfd94 ESP: c4ecfd7c
> [ 486.471100] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> [ 486.471103] Process mount.cifs (pid: 6607, ti=c4ece000 task=ce5abed0 task.ti=c4ece000)
> [ 486.471105] Stack: c4ecfd94 f9c1708b 00000000 f9c3b094 c4eaa000 f9c3b094 c4ecfe64 f9c0bf5c
> [ 486.471112] c10b1030 c4eac000 c4ecfdcc c0189c8d c4eab000 000280d0 f9c3b094 000a100c
> [ 486.471118] 00000001 caa0e3c0 dbf3e200 0000000c 00000000 00000020 c4ecfe4c c4ecfe0c
> [ 486.471124] Call Trace:
> [ 486.471127] [<f9c1708b>] ? cifs_inet_pton+0x7b/0x80 [cifs]
> [ 486.471141] [<f9c0bf5c>] ? cifs_mount+0x46c/0xda0 [cifs]
> [ 486.471151] [<c0189c8d>] ? prep_new_page+0xdd/0x160
> [ 486.471161] [<c024e132>] ? idr_get_empty_slot+0xe2/0x270
> [ 486.471167] [<f9bfddf3>] ? cifs_read_super+0x93/0x1e0 [cifs]
> [ 486.471176] [<c01b38f0>] ? set_anon_super+0x0/0xd0
> [ 486.471181] [<f9bfdfa9>] ? cifs_get_sb+0x69/0xc0 [cifs]
> [ 486.471192] [<c01b468e>] ? vfs_kern_mount+0x5e/0x130
> [ 486.471197] [<c01b47be>] ? do_kern_mount+0x3e/0xe0
> [ 486.471201] [<c01cccff>] ? do_new_mount+0x6f/0x90
> [ 486.471206] [<c01cd242>] ? do_mount+0x1d2/0x1f0
> [ 486.471210] [<c01ca95d>] ? exact_copy_from_user+0x4d/0xa0
> [ 486.471214] [<c01caf6e>] ? copy_mount_options+0x6e/0xd0
> [ 486.471218] [<c01cd2f1>] ? sys_mount+0x91/0xc0
> [ 486.471222] [<c0103f7b>] ? sysenter_do_call+0x12/0x2f
> [ 486.471227] =======================
> [ 486.471228] Code: 8b 02 85 c0 0f 84 eb 00 00 00 66 83 7e 02 00 66 c7 06 0a 00 66 c7 45 f2 00 00 75 59 66 c7 46 02 01 bd 8b 07 b9 1c 00 00 00 89 f2 <8b> 58 0c c7 04 24 00 00 00 00 ff 53 10 85 c0 89
> c3 78 67 8b 07
> [ 486.471257] EIP: [<f9c0a3d6>] ipv6_connect+0x46/0x1a0 [cifs] SS:ESP 0068:c4ecfd7c
> [ 486.471269] ---[ end trace cbe7d886432a8981 ]---#
>
> This looks like an address corruption in the memory, maybe the original
> BUG was there before and did not trigger because the server was reachable.
>
> Bernhard
>

I've still not been able to reproduce this here, though I don't have any
test boxes with kernels this old and I don't have any 32-bit x86 boxes
handy. Here's something that may be helpful, though. Could you turn up
cifsFYI, reproduce this and then send along the log messages?

# echo 7 > /proc/fs/cifs/cifsFYI
# mount -t cifs -o user=berni,ip=2001:a60:f001:1:cc::69 //fileserv/pub /pub

...then send along all of the cifs debug messages and the oops message.
I suspect either the parsing of the ip= option is falling down somehow, or
in6_pton is failing to translate the address.
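
To rule out the address string itself, the ip= value can be pulled out of
the option string and converted in userspace. A minimal Python sketch,
assuming a plain comma-separated key=value option string; parse_ip_option
is a hypothetical helper and is not how the kernel or mount.cifs parses
its options:

```python
import ipaddress


def parse_ip_option(options):
    """Return the value of the ip= option, or None if it is absent."""
    for opt in options.split(","):
        key, sep, value = opt.partition("=")
        if sep and key == "ip":
            return value
    return None


opts = "user=berni,ip=2001:a60:f001:1:cc::69"
addr = ipaddress.IPv6Address(parse_ip_option(opts))

# The 16 bytes a correct text-to-binary conversion should produce.
print(addr.packed.hex())  # 20010a60f001000100cc000000000069
```

If the bytes the kernel ends up connecting to differ from these, the
corruption happens after the string leaves userspace.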

--
Jeff Layton <[email protected]>

2009-01-20 18:51:24

by Jeff Layton

Subject: Re: [linux-cifs-client] BUG: Possible cifs+IPv6-Regression 2.6.27.4 -> 2.6.27.9

On Tue, 20 Jan 2009 07:25:16 -0500
Jeff Layton <[email protected]> wrote:

> On Mon, 19 Jan 2009 22:38:20 +0100
> Bernhard Schmidt <[email protected]> wrote:
>
> > On Mon, Jan 19, 2009 at 04:07:25PM -0500, Jeff Layton wrote:
> >
> > > That said, there's a lot of jumping around in this assembly code and
> > > it's not completely clear to me how it got to the point where it crashed.
> > > We'll probably need to see if this can be independently reproduced. Can
> > > you email along the details of how you're reproducing this? In
> > > particular:
> > >
> > > 1) all mount options being used
> >
> > Not more than the ones shown in my original mail. The exact mount
> > command is *checkingfirewall*
> >
> > mount -t cifs -o user=berni,ip=2001:a60:f001:1::69 //fileserv/pub /pub
> >
> > > 2) details on the server (what OS, what version of samba, etc)
> > > 3) version of mount.cifs being used
> >
> > The server is a Debian Lenny based i386 system running a self-compiled
> > vanilla 2.6.28 kernel and Debian packaged Samba 3.2.5-3.
> >
> > The client is Ubuntu Intrepid (8.10) with intrepid-security,
> > intrepid-updates and intrepid-proposed enabled. mount.cifs -V does not
> > work mysteriously (displays the usage instruction), the package version
> > is 3.2.3-1ubuntu3.4.
> >
> > None of those programs have changed. After booting into kernel
> > 2.6.27-7.14 (intrepid release) or 2.6.27-9.19 (intrepid-security,
> > intrepid-updates) the above noted mount command works fine, after
> > booting 2.6.27-11.23 (intrepid-proposed) it's broken. I can change
> > between those client kernel versions and reproduce it.
> >
> > The changes between the working and the non-working version should
> > be as noted in
> >
> > https://launchpad.net/ubuntu/intrepid/+source/linux/2.6.27-10.20
> > https://launchpad.net/ubuntu/intrepid/+source/linux/2.6.27-11.23
> >
> > And I've just discovered something. Mind you, I'm trying to mount
> > 2001:a60:f001:1::69. However, after entering the password the system
> > tries to reach another address
> >
> > 22:31:24.336790 IP6 2001:a60:f001:1:219:66ff:fe8b:a6e > ff02::1:ff00:69:
> > ICMP6, neighbor solicitation, who has 2001:a60:f001:1:55:abe3:0:69,
> > length 32
> >
> > The (wrong) address changes slightly between consecutive tries, I've seen
> > 2001:a60:f001:1:4f:abe3:0:69
> > 2001:a60:f001:1:58:abe3:0:69
> > as well.
> >
> > Also, when I try some other (legal, but unused) address it bails out at
> > another code location, e.g.
> >
> > mount -t cifs -o user=berni,ip=2001:a60:f001:1:cc::69 //fileserv/pub
> > /pub
> >
> > results in an immediate (!)
> >
> > [ 486.470983] BUG: unable to handle kernel paging request at 0000cc0c
> > [ 486.470992] IP: [<f9c0a3d6>] :cifs:ipv6_connect+0x46/0x1a0
> > [ 486.471008] *pde = 00000000
> > [ 486.471012] Oops: 0000 [#13] SMP
> > [ 486.471016] Modules linked in: nls_cp437 cifs af_packet binfmt_misc rfcomm bridge stp bnep sco l2cap bluetooth kvm_amd kvm ppdev tun ipv6 pci_slot container sbs sbshc video output battery
> > iptable_filter ip_tables x_tables ac parport_pc lp parport serio_raw pcspkr psmouse snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi k8temp snd_rawmidi
> > snd_seq_midi_event snd_seq snd_timer snd_seq_device snd soundcore snd_page_alloc i2c_piix4 i2c_core evdev dm_multipath scsi_dh pl2303 usbserial fglrx(P) agpgart wmi button shpchp pci_hotplug ext3 jbd
> > mbcache sr_mod cdrom pata_acpi sd_mod crc_t10dif pata_atiixp sg usbhid hid usb_storage libusual ata_generic ahci ohci_hcd ehci_hcd libata usbcore scsi_mod dock r8169 mii dm_mirror dm_log dm_snapshot
> > dm_mod thermal processor fan fbcon tileblit font bitblit softcursor fuse
> > [ 486.471076]
> > [ 486.471079] Pid: 6607, comm: mount.cifs Tainted: P D (2.6.27-11-generic #1)
> > [ 486.471082] EIP: 0060:[<f9c0a3d6>] EFLAGS: 00010246 CPU: 0
> > [ 486.471092] EIP is at ipv6_connect+0x46/0x1a0 [cifs]
> > [ 486.471095] EAX: 0000cc00 EBX: f9c3b094 ECX: 0000001c EDX: c4ecfe44
> > [ 486.471097] ESI: c4ecfe44 EDI: c4ecfe54 EBP: c4ecfd94 ESP: c4ecfd7c
> > [ 486.471100] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> > [ 486.471103] Process mount.cifs (pid: 6607, ti=c4ece000 task=ce5abed0 task.ti=c4ece000)
> > [ 486.471105] Stack: c4ecfd94 f9c1708b 00000000 f9c3b094 c4eaa000 f9c3b094 c4ecfe64 f9c0bf5c
> > [ 486.471112] c10b1030 c4eac000 c4ecfdcc c0189c8d c4eab000 000280d0 f9c3b094 000a100c
> > [ 486.471118] 00000001 caa0e3c0 dbf3e200 0000000c 00000000 00000020 c4ecfe4c c4ecfe0c
> > [ 486.471124] Call Trace:
> > [ 486.471127] [<f9c1708b>] ? cifs_inet_pton+0x7b/0x80 [cifs]
> > [ 486.471141] [<f9c0bf5c>] ? cifs_mount+0x46c/0xda0 [cifs]
> > [ 486.471151] [<c0189c8d>] ? prep_new_page+0xdd/0x160
> > [ 486.471161] [<c024e132>] ? idr_get_empty_slot+0xe2/0x270
> > [ 486.471167] [<f9bfddf3>] ? cifs_read_super+0x93/0x1e0 [cifs]
> > [ 486.471176] [<c01b38f0>] ? set_anon_super+0x0/0xd0
> > [ 486.471181] [<f9bfdfa9>] ? cifs_get_sb+0x69/0xc0 [cifs]
> > [ 486.471192] [<c01b468e>] ? vfs_kern_mount+0x5e/0x130
> > [ 486.471197] [<c01b47be>] ? do_kern_mount+0x3e/0xe0
> > [ 486.471201] [<c01cccff>] ? do_new_mount+0x6f/0x90
> > [ 486.471206] [<c01cd242>] ? do_mount+0x1d2/0x1f0
> > [ 486.471210] [<c01ca95d>] ? exact_copy_from_user+0x4d/0xa0
> > [ 486.471214] [<c01caf6e>] ? copy_mount_options+0x6e/0xd0
> > [ 486.471218] [<c01cd2f1>] ? sys_mount+0x91/0xc0
> > [ 486.471222] [<c0103f7b>] ? sysenter_do_call+0x12/0x2f
> > [ 486.471227] =======================
> > [ 486.471228] Code: 8b 02 85 c0 0f 84 eb 00 00 00 66 83 7e 02 00 66 c7 06 0a 00 66 c7 45 f2 00 00 75 59 66 c7 46 02 01 bd 8b 07 b9 1c 00 00 00 89 f2 <8b> 58 0c c7 04 24 00 00 00 00 ff 53 10 85 c0 89
> > c3 78 67 8b 07
> > [ 486.471257] EIP: [<f9c0a3d6>] ipv6_connect+0x46/0x1a0 [cifs] SS:ESP 0068:c4ecfd7c
> > [ 486.471269] ---[ end trace cbe7d886432a8981 ]---#
> >
> > This looks like an address corruption in the memory, maybe the original
> > BUG was there before and did not trigger because the server was reachable.
> >
> > Bernhard
> >
>
> I've still not been able to reproduce this here though I don't have any
> test boxes with kernels this old and I don't have any 32-bit x86 boxes
> handy. Here's something that may be helpful though. Could you turn up
> cifsFYI, reproduce this and then send along the log messages?
>
> # echo 7 > /proc/fs/cifs/cifsFYI
> # mount -t cifs -o user=berni,ip=2001:a60:f001:1:cc::69 //fileserv/pub /pub
>
> ...then send along all of the cifs debug messages and the oops message.
> I suspect either the parsing of the ip= option is falling down somehow, or
> in6_pton is failing to translate the address.
>

FWIW, I went ahead and built a Fedora 10 i686 Xen guest and tried to
reproduce this there (kernel 2.6.27.9-159.fc10.i686). I assigned the
guest the same IPv6 address that your client is using and tried the
exact same mount string.

Everything seems to work as expected, and I don't see any neighbor
solicitations for bogus IPv6 addrs.

The best thing I can suggest at this point is to try inserting some
debug printk's at various points where the IPv6 address is parsed and
see if you can determine at what point the address becomes corrupted.

--
Jeff Layton <[email protected]>

2009-01-22 13:42:01

by Jeff Layton

Subject: Re: [linux-cifs-client] BUG: Possible cifs+IPv6-Regression 2.6.27.4 -> 2.6.27.9

On Mon, 19 Jan 2009 22:38:20 +0100
Bernhard Schmidt <[email protected]> wrote:

> On Mon, Jan 19, 2009 at 04:07:25PM -0500, Jeff Layton wrote:
>
> > That said, there's a lot of jumping around in this assembly code and
> > it's not completely clear to me how it got to the point where it crashed.
> > We'll probably need to see if this can be independently reproduced. Can
> > you email along the details of how you're reproducing this? In
> > particular:
> >
> > 1) all mount options being used
>
> Not more than the ones shown in my original mail. The exact mount
> command is *checkingfirewall*
>
> mount -t cifs -o user=berni,ip=2001:a60:f001:1::69 //fileserv/pub /pub
>
> > 2) details on the server (what OS, what version of samba, etc)
> > 3) version of mount.cifs being used
>
> The server is a Debian Lenny based i386 system running a self-compiled
> vanilla 2.6.28 kernel and Debian packaged Samba 3.2.5-3.
>
> The client is Ubuntu Intrepid (8.10) with intrepid-security,
> intrepid-updates and intrepid-proposed enabled. mount.cifs -V mysteriously
> does not work (it displays the usage instructions); the package version
> is 3.2.3-1ubuntu3.4.
>
> None of those programs have changed. After booting into kernel
> 2.6.27-7.14 (intrepid release) or 2.6.27-9.19 (intrepid-security,
> intrepid-updates) the above mount command works fine; after
> booting 2.6.27-11.23 (intrepid-proposed) it is broken. I can switch
> between those client kernel versions and reproduce it.
>
> The changes between the working and the non-working version should
> be as noted in
>
> https://launchpad.net/ubuntu/intrepid/+source/linux/2.6.27-10.20
> https://launchpad.net/ubuntu/intrepid/+source/linux/2.6.27-11.23
>
> And I've just discovered something. Mind you, I'm trying to mount
> 2001:a60:f001:1::69. However, after entering the password the system
> tries to reach a different address:
>
> 22:31:24.336790 IP6 2001:a60:f001:1:219:66ff:fe8b:a6e > ff02::1:ff00:69:
> ICMP6, neighbor solicitation, who has 2001:a60:f001:1:55:abe3:0:69,
> length 32
>
> The (wrong) address changes slightly between consecutive tries, I've seen
> 2001:a60:f001:1:4f:abe3:0:69
> 2001:a60:f001:1:58:abe3:0:69
> as well.
>
> Also, when I try some other (legal, but unused) address it bails out at
> another code location, e.g.
>
> mount -t cifs -o user=berni,ip=2001:a60:f001:1:cc::69 //fileserv/pub
> /pub
>
> results in an immediate (!)
>
> [ 486.470983] BUG: unable to handle kernel paging request at 0000cc0c
> [ 486.470992] IP: [<f9c0a3d6>] :cifs:ipv6_connect+0x46/0x1a0
> [ 486.471008] *pde = 00000000
> [ 486.471012] Oops: 0000 [#13] SMP
> [ 486.471016] Modules linked in: nls_cp437 cifs af_packet binfmt_misc rfcomm bridge stp bnep sco l2cap bluetooth kvm_amd kvm ppdev tun ipv6 pci_slot container sbs sbshc video output battery
> iptable_filter ip_tables x_tables ac parport_pc lp parport serio_raw pcspkr psmouse snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi k8temp snd_rawmidi
> snd_seq_midi_event snd_seq snd_timer snd_seq_device snd soundcore snd_page_alloc i2c_piix4 i2c_core evdev dm_multipath scsi_dh pl2303 usbserial fglrx(P) agpgart wmi button shpchp pci_hotplug ext3 jbd
> mbcache sr_mod cdrom pata_acpi sd_mod crc_t10dif pata_atiixp sg usbhid hid usb_storage libusual ata_generic ahci ohci_hcd ehci_hcd libata usbcore scsi_mod dock r8169 mii dm_mirror dm_log dm_snapshot
> dm_mod thermal processor fan fbcon tileblit font bitblit softcursor fuse
> [ 486.471076]
> [ 486.471079] Pid: 6607, comm: mount.cifs Tainted: P D (2.6.27-11-generic #1)
> [ 486.471082] EIP: 0060:[<f9c0a3d6>] EFLAGS: 00010246 CPU: 0
> [ 486.471092] EIP is at ipv6_connect+0x46/0x1a0 [cifs]
> [ 486.471095] EAX: 0000cc00 EBX: f9c3b094 ECX: 0000001c EDX: c4ecfe44
> [ 486.471097] ESI: c4ecfe44 EDI: c4ecfe54 EBP: c4ecfd94 ESP: c4ecfd7c
> [ 486.471100] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> [ 486.471103] Process mount.cifs (pid: 6607, ti=c4ece000 task=ce5abed0 task.ti=c4ece000)
> [ 486.471105] Stack: c4ecfd94 f9c1708b 00000000 f9c3b094 c4eaa000 f9c3b094 c4ecfe64 f9c0bf5c
> [ 486.471112] c10b1030 c4eac000 c4ecfdcc c0189c8d c4eab000 000280d0 f9c3b094 000a100c
> [ 486.471118] 00000001 caa0e3c0 dbf3e200 0000000c 00000000 00000020 c4ecfe4c c4ecfe0c
> [ 486.471124] Call Trace:
> [ 486.471127] [<f9c1708b>] ? cifs_inet_pton+0x7b/0x80 [cifs]
> [ 486.471141] [<f9c0bf5c>] ? cifs_mount+0x46c/0xda0 [cifs]
> [ 486.471151] [<c0189c8d>] ? prep_new_page+0xdd/0x160
> [ 486.471161] [<c024e132>] ? idr_get_empty_slot+0xe2/0x270
> [ 486.471167] [<f9bfddf3>] ? cifs_read_super+0x93/0x1e0 [cifs]
> [ 486.471176] [<c01b38f0>] ? set_anon_super+0x0/0xd0
> [ 486.471181] [<f9bfdfa9>] ? cifs_get_sb+0x69/0xc0 [cifs]
> [ 486.471192] [<c01b468e>] ? vfs_kern_mount+0x5e/0x130
> [ 486.471197] [<c01b47be>] ? do_kern_mount+0x3e/0xe0
> [ 486.471201] [<c01cccff>] ? do_new_mount+0x6f/0x90
> [ 486.471206] [<c01cd242>] ? do_mount+0x1d2/0x1f0
> [ 486.471210] [<c01ca95d>] ? exact_copy_from_user+0x4d/0xa0
> [ 486.471214] [<c01caf6e>] ? copy_mount_options+0x6e/0xd0
> [ 486.471218] [<c01cd2f1>] ? sys_mount+0x91/0xc0
> [ 486.471222] [<c0103f7b>] ? sysenter_do_call+0x12/0x2f
> [ 486.471227] =======================
> [ 486.471228] Code: 8b 02 85 c0 0f 84 eb 00 00 00 66 83 7e 02 00 66 c7 06 0a 00 66 c7 45 f2 00 00 75 59 66 c7 46 02 01 bd 8b 07 b9 1c 00 00 00 89 f2 <8b> 58 0c c7 04 24 00 00 00 00 ff 53 10 85 c0 89
> c3 78 67 8b 07
> [ 486.471257] EIP: [<f9c0a3d6>] ipv6_connect+0x46/0x1a0 [cifs] SS:ESP 0068:c4ecfd7c
> [ 486.471269] ---[ end trace cbe7d886432a8981 ]---#
>
> This looks like an address corruption in the memory, maybe the original
> BUG was there before and did not trigger because the server was reachable.
>

Unfortunately, I'm still unable to reproduce this.

Here's a small debug patch that may be helpful. Essentially it extends
one of the existing cifsFYI printk's to make it also dump the address
to which we're going to connect. This would allow us to verify that the
address parser is working correctly.

Basically you'll need to build a new cifs.ko (or entire kernel) with
this patch, and then turn up cifsFYI debug messages:

# insmod /location/of/new/cifs.ko
# echo 7 > /proc/fs/cifs/cifsFYI
# mount -t cifs -o user=berni,ip=2001:a60:f001:1:cc::69 //fileserv/pub /pub

Once it crashes, please send the entire set of messages dumped to the
console after the mount attempt, including the oops message. Hopefully
that will allow us to verify whether the address is being corrupted in
the parsing phase.

Thanks,
--
Jeff Layton <[email protected]>


Attachments:
(No filename) (6.49 kB)
cifs-ipv6-debug.patch (628.00 B)

2009-01-22 14:45:52

by Jeff Layton

Subject: Re: [linux-cifs-client] BUG: Possible cifs+IPv6-Regression 2.6.27.4 -> 2.6.27.9

On Mon, 19 Jan 2009 22:38:20 +0100
Bernhard Schmidt <[email protected]> wrote:

> This looks like an address corruption in the memory, maybe the original
> BUG was there before and did not trigger because the server was reachable.
>

I think I may see the bug...

I think the "addr" struct in cifs_mount is too small for ipv6 addresses.
Here's a proposed patch for 2.6.27.y. Could you apply it and let me
know if it fixes the bug?

--
Jeff Layton <[email protected]>


Attachments:
(No filename) (5.90 kB)
0001-cifs-make-sure-we-allocate-enough-storage-for-socke.patch (3.33 kB)

2009-01-22 17:03:19

by Bernhard Schmidt

Subject: Re: [linux-cifs-client] BUG: Possible cifs+IPv6-Regression 2.6.27.4 -> 2.6.27.9

Hello Jeff,

> I think I may see the bug...
>
> I think the "addr" struct in cifs_mount is too small for ipv6 addresses.
> Here's a proposed patch for 2.6.27.y. Could you apply it and let me
> know if it fixes the bug?

Sorry, I was out of town, I'll build a kernel asap.

Stefan confirmed the bug (on the i386 platform; apparently x86_64 did not
expose the broken behaviour), and it was fixed in Ubuntu with your
patch. See https://bugs.launchpad.net/bugs/318565 . So I assume he
tested it and it fixed the problem, but I'll test it myself as well.

Thanks!
Bernhard

2009-01-23 00:01:18

by Steve French

Subject: Re: [linux-cifs-client] BUG: Possible cifs+IPv6-Regression 2.6.27.4 -> 2.6.27.9

Good catch.

Merged into cifs-2.6.git

When Linus gets back - will request push upstream - this is important.

On Thu, Jan 22, 2009 at 8:45 AM, Jeff Layton <[email protected]> wrote:
> On Mon, 19 Jan 2009 22:38:20 +0100
> Bernhard Schmidt <[email protected]> wrote:
>
>> This looks like an address corruption in the memory, maybe the original
>> BUG was there before and did not trigger because the server was reachable.
>>
>
> I think I may see the bug...
>
> I think the "addr" struct in cifs_mount is too small for ipv6 addresses.
> Here's a proposed patch for 2.6.27.y. Could you apply it and let me
> know if it fixes the bug?
>
> --
> Jeff Layton <[email protected]>
>



--
Thanks,

Steve

2009-01-23 01:49:43

by Bernhard Schmidt

Subject: Re: [linux-cifs-client] BUG: Possible cifs+IPv6-Regression 2.6.27.4 -> 2.6.27.9

On Thu, Jan 22, 2009 at 09:45:17AM -0500, Jeff Layton wrote:

> I think I may see the bug...
>
> I think the "addr" struct in cifs_mount is too small for ipv6 addresses.
> Here's a proposed patch for 2.6.27.y. Could you apply it and let me
> know if it fixes the bug?

I can confirm that Ubuntu kernel 2.6.27-11.25 (which has only your patch
in the changes) fixes this issue.

Thanks a lot,
Bernhard