2010-11-01 12:08:52

by Linus Torvalds

Subject: Linux 2.6.37-rc1

The merge window for 2.6.37 is over, and -rc1 is out there (or will be
soon, as things are uploading from my laptop here in Boston and then
mirroring out)

There's a lot of changes there - just shy of 10k commits since 2.6.36
- despite the slightly shortened merge window. Way too many to list.
But the part that I think deserves some extra mention is that we've
finally largely gotten rid of the BKL (big kernel lock) in all the
core stuff, and you can easily compile a kernel without any BKL
support at all. It's been a long road, and thanks to Arnd and others
who did it.

Note that "core code" does not mean "everything". There are still
drivers out there that need the BKL, and if you configure the kernel
without it, you won't be able to configure in the V4L drivers, for
example. They still have lock_kernel/unlock_kernel calls in them, but
hopefully that will get fixed too, to the point where in the not too
distant future we will hopefully see only some legacy drivers that
nobody uses needing the old locking.

Other than that, it looks like a fairly normal release. Lots of
changes all over, with drivers - as usual - dominating the patches.
Give it all a try, and hopefully this will be a quiet week wrt
development with lots of core developers here at the kernel summit.

Linus


2010-11-01 15:38:13

by Thomas Gleixner

Subject: Re: Linux 2.6.37-rc1

On Mon, 1 Nov 2010, Linus Torvalds wrote:

> The merge window for 2.6.37 is over, and -rc1 is out there (or will be
> soon, as things are uploading from my laptop here in Boston and then
> mirroring out)

Did you switch back to tarballs and patches only or is there going to
be an update of your git tree in the foreseeable future ?

tglx

2010-11-01 15:47:27

by Nick Bowler

Subject: Re: Linux 2.6.37-rc1

On 2010-11-01 16:37 +0100, Thomas Gleixner wrote:
> On Mon, 1 Nov 2010, Linus Torvalds wrote:
> > The merge window for 2.6.37 is over, and -rc1 is out there (or will be
> > soon, as things are uploading from my laptop here in Boston and then
> > mirroring out)
>
> Did you switch back to tarballs and patches only or is there going to
> be an update of your git tree in the foreseeable future ?

Seems that the v2.6.37-rc1 tag is in the git tree, but is not reachable
from the master branch. See

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=shortlog;h=v2.6.37-rc1

--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)

2010-11-01 15:53:25

by Ryusuke Konishi

Subject: Re: Linux 2.6.37-rc1

On Mon, 1 Nov 2010 11:47:17 -0400, Nick Bowler wrote:
> On 2010-11-01 16:37 +0100, Thomas Gleixner wrote:
> > On Mon, 1 Nov 2010, Linus Torvalds wrote:
> > > The merge window for 2.6.37 is over, and -rc1 is out there (or will be
> > > soon, as things are uploading from my laptop here in Boston and then
> > > mirroring out)
> >
> > Did you switch back to tarballs and patches only or is there going to
> > be an update of your git tree in the foreseeable future ?
>
> Seems that the v2.6.37-rc1 tag is in the git tree, but is not reachable
> from the master branch. See
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=shortlog;h=v2.6.37-rc1
>

Yes, it looks like only the master branch hasn't been updated.

We can get -rc1 with 'git fetch --tags'.

Ryusuke Konishi

2010-11-01 16:08:07

by Linus Torvalds

Subject: Re: Linux 2.6.37-rc1

On Mon, Nov 1, 2010 at 11:52 AM, Ryusuke Konishi wrote:
>
> Yes, it looks like only the master branch hasn't been updated.

Ahh yes. I pushed out the new work, but only the tag, not the branch.

Fixed.

Linus

2010-11-02 18:53:41

by Randy Dunlap

Subject: Re: Linux 2.6.37-rc1 (acpi_video)


When I disable CONFIG_INPUT (on purpose):

ERROR: "input_event" [drivers/acpi/video.ko] undefined!
ERROR: "input_register_device" [drivers/acpi/video.ko] undefined!
ERROR: "input_free_device" [drivers/acpi/video.ko] undefined!
ERROR: "input_unregister_device" [drivers/acpi/video.ko] undefined!
ERROR: "input_allocate_device" [drivers/acpi/video.ko] undefined!


due to the gpu/stub (poulsbo) driver:

config STUB_POULSBO
	tristate "Intel GMA500 Stub Driver"
	depends on PCI
	# Poulsbo stub depends on ACPI_VIDEO when ACPI is enabled
	# but for select to work, need to select ACPI_VIDEO's dependencies, ick
	select ACPI_VIDEO if ACPI

The config does not select INPUT, even though the comment says that
ACPI_VIDEO's dependencies need to be selected, and ACPI_VIDEO does
depend on INPUT. But then, I'm trying to build a kernel with
CONFIG_INPUT disabled....


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2010-11-03 21:09:38

by Randy Dunlap

Subject: Re: Linux 2.6.37-rc1 (cciss: remove_proc_entry warning)



[ 109.073624] ------------[ cut here ]------------
[ 109.078814] WARNING: at /local/linsrc/lnx-2637-rc1/fs/proc/generic.c:816 remove_proc_entry+0x156/0x35e()
[ 109.088567] Hardware name: OptiPlex GX620
[ 109.095824] name 'driver/cciss'
[ 109.100141] Modules linked in: cciss(-) ipt_MASQUERADE iptable_nat nf_nat af_packet nfsd lockd nfs_acl auth_rpcgss exportfs sco bridge stp llc bnep l2cap crc16 bluetooth rfkill sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 p4_clockmod freq_table speedstep_lib binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod kvm uinput joydev mousedev ppdev snd_intel8x0 snd_ac97_codec usbkbd ac97_bus snd_seq usbmouse snd_seq_device usbhid snd_pcm led_class hid snd_timer iTCO_wdt tg3 dcdbas iTCO_vendor_support snd sr_mod i2c_i801 sg pcspkr rtc_cmos parport_pc cdrom rtc_core evdev shpchp soundcore rng_core rtc_lib parport 8250_pnp snd_page_alloc pci_hotplug mac_hid unix ide_pci_generic ide_core ata_generic pata_acpi ata_piix sd_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ssb mmc_core pcmcia pcmcia_core firmware_class ehci_hcd usbcore nls_base i915 drm_kms_helper intel_agp button intel_gtt video thermal_sys hwmon output [last unloaded: mperf]
[ 109.164277] Pid: 3463, comm: rmmod Not tainted 2.6.37-rc1 #7
[ 109.164280] Call Trace:
[ 109.164292] [<ffffffff8107eb8d>] warn_slowpath_common+0xc6/0xf3
[ 109.164299] [<ffffffff8107ecaa>] warn_slowpath_fmt+0x5b/0x6b
[ 109.164307] [<ffffffff8155175b>] ? _raw_spin_unlock+0x40/0x4b
[ 109.164313] [<ffffffff8123dd1e>] remove_proc_entry+0x156/0x35e
[ 109.164320] [<ffffffff812cd91b>] ? do_raw_spin_unlock+0xff/0x10f
[ 109.164327] [<ffffffff8113823d>] ? trace_hardirqs_on+0x10/0x4a
[ 109.164333] [<ffffffff8155162d>] ? _raw_spin_unlock_irq+0x4c/0x7b
[ 109.164339] [<ffffffff8154d4d1>] ? wait_for_common+0x145/0x15e
[ 109.164345] [<ffffffff81075337>] ? default_wake_function+0x0/0x22
[ 109.164357] [<ffffffffa0615a8f>] cciss_cleanup+0xa9/0xc7 [cciss]
[ 109.164365] [<ffffffff810d3cb0>] sys_delete_module+0x2d6/0x368
[ 109.164371] [<ffffffff8155036b>] ? lockdep_sys_exit_thunk+0x35/0x67
[ 109.164377] [<ffffffff810fdfaf>] ? audit_syscall_entry+0x172/0x1a5
[ 109.164383] [<ffffffff815502f5>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 109.164389] [<ffffffff8100ea72>] system_call_fastpath+0x16/0x1b
[ 109.164394] ---[ end trace 88e8568246ed0b1d ]---



---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2010-11-03 21:16:33

by Randy Dunlap

Subject: Re: Linux 2.6.37-rc1 (pcrypt fault)


modprobe pcrypt; rmmod pcrypt ==>


[ 76.081639] calling pcrypt_init+0x0/0x107 [pcrypt] @ 3016
[ 76.089883] initcall pcrypt_init+0x0/0x107 [pcrypt] returned 0 after 2476 usecs


[ 79.940445] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 79.946419] last sysfs file: /sys/devices/pci0000:00/0000:00:1d.1/usb3/3-1/3-1.3/devnum
[ 79.954652] CPU 0
[ 79.954652] Modules linked in: pcrypt(-) ipt_MASQUERADE iptable_nat nf_nat af_packet nfsd lockd nfs_acl auth_rpcgss exportfs sco bridge stp llc bnep l2cap crc16 bluetooth rfkill sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 p4_clockmod freq_table speedstep_lib binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod kvm uinput mousedev joydev snd_intel8x0 snd_ac97_codec ppdev ac97_bus snd_seq snd_seq_device usbmouse snd_pcm led_class usbkbd usbhid hid iTCO_wdt snd_timer iTCO_vendor_support tg3 snd sr_mod dcdbas sg soundcore rng_core cdrom pcspkr parport_pc i2c_i801 rtc_cmos snd_page_alloc evdev shpchp rtc_core parport rtc_lib mac_hid pci_hotplug 8250_pnp unix ide_pci_generic ide_core ata_generic pata_acpi ata_piix sd_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ssb mmc_core pcmcia pcmcia_core firmware_class ehci_hcd usbcore nls_base i915 drm_kms_helper intel_agp button intel_gtt video thermal_sys hwmon output [last unloaded: mperf]
[ 80.054943]
[ 80.058247] Pid: 3074, comm: rmmod Not tainted 2.6.37-rc1 #7 0HH807/OptiPlex GX620
[ 80.058247] RIP: 0010:[<ffffffff810c3a98>] [<ffffffff810c3a98>] __lock_acquire+0x131/0x4e8
[ 80.058247] RSP: 0018:ffff88006d0b9cd8 EFLAGS: 00010002
[ 80.058247] RAX: 6b6b6b6b6b6b6b6b RBX: ffff88006d3c93c8 RCX: 0000000000000000
[ 80.058247] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88006d3c93c8
[ 80.058247] RBP: ffff88006d0b9d38 R08: 0000000000000001 R09: 0000000000000000
[ 80.058247] R10: ffff88006d0b9ea8 R11: ffff88007c002c80 R12: 0000000000000000
[ 80.058247] R13: ffff88006c5f8000 R14: 0000000000000000 R15: 0000000000000000
[ 80.058247] FS: 00007f6fed3ba6f0(0000) GS:ffff88007c600000(0000) knlGS:0000000000000000
[ 80.058247] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 80.058247] CR2: 0000000000627410 CR3: 000000006d047000 CR4: 00000000000006f0
[ 80.058247] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 80.058247] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 80.058247] Process rmmod (pid: 3074, threadinfo ffff88006d0b8000, task ffff88006c5f8000)
[ 80.058247] Stack:
[ 80.058247] ffff88006d3c9240 ffffea00017e53c0 ffffffff81158ccc ffffea00017e10d0
[ 80.058247] ffff88006d0b9d38 000000007c002640 ffff88006d2963c8 0000000000000000
[ 80.173526] ffff88006c5f8000 ffffffff81158c28 0000000000000001 0000000000000000
[ 80.184364] Call Trace:
[ 80.185803] [<ffffffff81158ccc>] ? padata_sysfs_release+0x4/0x25
[ 80.185803] [<ffffffff81158c28>] ? padata_stop+0x27/0x51
[ 80.185803] [<ffffffff810c3f4f>] lock_acquire+0x100/0x150
[ 80.200956] [<ffffffff81158c28>] ? padata_stop+0x27/0x51
[ 80.205967] [<ffffffff81158c28>] ? padata_stop+0x27/0x51
[ 80.205967] [<ffffffff8154e90b>] __mutex_lock_common+0x45/0x658
[ 80.222282] [<ffffffff81158c28>] ? padata_stop+0x27/0x51
[ 80.226676] [<ffffffff811b32a2>] ? free_debug_processing+0x245/0x27d
[ 80.237637] [<ffffffffa003db0c>] ? pcrypt_fini_padata+0x4a/0x96 [pcrypt]
[ 80.249091] [<ffffffff811b3493>] ? __slab_free+0x1b9/0x1d6
[ 80.257880] [<ffffffff8154f020>] mutex_lock_nested+0x4e/0x5a
[ 80.257880] [<ffffffff81158c28>] padata_stop+0x27/0x51
[ 80.273324] [<ffffffffa003db1b>] pcrypt_fini_padata+0x59/0x96 [pcrypt]
[ 80.283024] [<ffffffffa003df60>] pcrypt_exit+0x1c/0x5e [pcrypt]
[ 80.290927] [<ffffffff810d3cb0>] sys_delete_module+0x2d6/0x368
[ 80.298370] [<ffffffff8155036b>] ? lockdep_sys_exit_thunk+0x35/0x67
[ 80.305282] [<ffffffff810fdfaf>] ? audit_syscall_entry+0x172/0x1a5
[ 80.313996] [<ffffffff815502f5>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 80.313996] [<ffffffff8100ea72>] system_call_fastpath+0x16/0x1b
[ 80.333600] Code: ff 05 8d 1b 72 01 44 89 4d c0 e8 a4 db ff ff 48 ff 05 85 1b 72 01 48 85 c0 44 8b 4d c0 75 0c 48 ff 05 7d 1b 72 01 e9 a8 03 00 00 <f0> ff 80 98 01 00 00 8b 35 4b 18 f9 00 48 ff 05 6c 1b 72 01 45
[ 80.355817] RIP [<ffffffff810c3a98>] __lock_acquire+0x131/0x4e8
[ 80.359975] RSP <ffff88006d0b9cd8>
[ 80.371613] ---[ end trace 8f6f53761e872c8f ]---


The kernel config file is attached (nearly allmodconfig).
Some of the CONFIG options in it may not be helpful...

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***


Attachments:
config-2637-rc1 (116.19 kB)

2010-11-03 21:23:47

by Randy Dunlap

Subject: Re: Linux 2.6.37-rc1 (net/sched: cls_cgroup)


Maybe this isn't normal usage: just modprobe cls_cgroup && rmmod cls_cgroup:


[ 107.806607] ------------[ cut here ]------------
[ 107.810180] kernel BUG at /local/linsrc/lnx-2637-rc1/kernel/cgroup.c:3855!
[ 107.810180] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 107.822274] last sysfs file: /sys/devices/pci0000:00/0000:00:1d.1/usb3/3-1/3-1.3/devnum
[ 107.824889] CPU 0
[ 107.832854] Modules linked in: cls_cgroup(-) ipt_MASQUERADE iptable_nat nf_nat af_packet nfsd lockd nfs_acl auth_rpcgss exportfs sco bridge stp llc bnep l2cap crc16 bluetooth rfkill sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 p4_clockmod freq_table speedstep_lib binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod kvm uinput mousedev joydev snd_intel8x0 snd_ac97_codec ac97_bus usbmouse snd_seq snd_seq_device usbkbd usbhid snd_pcm ppdev hid tg3 led_class snd_timer dcdbas sr_mod snd iTCO_wdt cdrom iTCO_vendor_support sg rtc_cmos pcspkr soundcore i2c_i801 rng_core snd_page_alloc rtc_core parport_pc shpchp evdev rtc_lib parport 8250_pnp pci_hotplug mac_hid unix ide_pci_generic ide_core ata_generic pata_acpi ata_piix sd_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ssb mmc_core pcmcia pcmcia_core firmware_class ehci_hcd usbcore nls_base i915 drm_kms_helper intel_agp button intel_gtt video thermal_sys hwmon output [last unloaded: mperf]
[ 107.933458]
[ 107.933458] Pid: 3400, comm: rmmod Not tainted 2.6.37-rc1 #7 0HH807/OptiPlex GX620
[ 107.937800] RIP: 0010:[<ffffffff810e6c9d>] [<ffffffff810e6c9d>] cgroup_unload_subsys+0x64/0x1c8
[ 107.937800] RSP: 0018:ffff88006c107ea8 EFLAGS: 00010202
[ 107.937800] RAX: 0000000000000000 RBX: ffffffffa0009d50 RCX: 0000000000000000
[ 107.937800] RDX: ffffffff81a3a5f0 RSI: ffff88006c107dc8 RDI: ffff88006c107e48
[ 107.937800] RBP: ffff88006c107ec8 R08: ffffffff81a3a5f0 R09: 000000000000039a
[ 107.937800] R10: 0000000000000001 R11: ffff88006c107e48 R12: 0000000000000000
[ 107.937800] R13: 00007fff2664ffc0 R14: 0000000000000000 R15: 0000000000000001
[ 107.937800] FS: 00007f52809e46f0(0000) GS:ffff88007c600000(0000) knlGS:0000000000000000
[ 107.937800] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 107.937800] CR2: 0000003fb5a7bf20 CR3: 000000006c1d8000 CR4: 00000000000006f0
[ 107.937800] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 107.937800] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 107.937800] Process rmmod (pid: 3400, threadinfo ffff88006c106000, task ffff880075a33000)
[ 107.937800] Stack:
[ 107.937800] ffff88006c107ec8 ffffffffa000a0e0 0000000000000000 00007fff2664ffc0
[ 107.937800] ffff88006c107ed8 ffffffffa0009819 ffff88006c107f78 ffffffff810d3cb0
[ 108.048442] ffffffffa000a0e0 0000000000000880 ffff88006c107f14 ffffffff8155036b
[ 108.057485] Call Trace:
[ 108.065148] [<ffffffffa0009819>] exit_cgroup_cls+0x45/0x4e [cls_cgroup]
[ 108.070071] [<ffffffff810d3cb0>] sys_delete_module+0x2d6/0x368
[ 108.085255] [<ffffffff8155036b>] ? lockdep_sys_exit_thunk+0x35/0x67
[ 108.093771] [<ffffffff81007075>] ? xen_zap_pfn_range+0x53/0x139
[ 108.101589] [<ffffffff815502f5>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 108.111624] [<ffffffff8100ea72>] system_call_fastpath+0x16/0x1b
[ 108.119099] Code: 05 51 8d 71 01 0f 0b eb fe 31 f6 48 c7 c7 a0 a5 a3 81 48 ff 05 45 8d 71 01 e8 42 83 46 00 83 7b 58 07 7f 0b 48 ff 05 43 8d 71 01 <0f> 0b eb fe 48 ff 05 40 8d 71 01 48 8d bb 30 01 00 00 48 63 43
[ 108.145840] RIP [<ffffffff810e6c9d>] cgroup_unload_subsys+0x64/0x1c8
[ 108.152902] RSP <ffff88006c107ea8>
[ 108.161767] ---[ end trace 659fde6f8f5f2810 ]---



kernel config file is attached (almost allmodconfig).
There may be some CONFIG options that are not helping...

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***


Attachments:
config-2637-rc1 (116.19 kB)

2010-11-03 22:01:26

by Eric Dumazet

Subject: Re: Linux 2.6.37-rc1 (net/sched: cls_cgroup)

On Wednesday, 3 November 2010 at 14:21 -0700, Randy Dunlap wrote:
> Maybe this isn't normal usage: just modprobe cls_cgroup && rmmod cls_cgroup:
>
>
> [ 107.806607] ------------[ cut here ]------------
> [ 107.810180] kernel BUG at /local/linsrc/lnx-2637-rc1/kernel/cgroup.c:3855!
> [ 107.810180] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
> [ 107.822274] last sysfs file: /sys/devices/pci0000:00/0000:00:1d.1/usb3/3-1/3-1.3/devnum
> [ 107.824889] CPU 0
> [ 107.832854] Modules linked in: cls_cgroup(-) ipt_MASQUERADE iptable_nat nf_nat af_packet nfsd lockd nfs_acl auth_rpcgss exportfs sco bridge stp llc bnep l2cap crc16 bluetooth rfkill sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 p4_clockmod freq_table speedstep_lib binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod kvm uinput mousedev joydev snd_intel8x0 snd_ac97_codec ac97_bus usbmouse snd_seq snd_seq_device usbkbd usbhid snd_pcm ppdev hid tg3 led_class snd_timer dcdbas sr_mod snd iTCO_wdt cdrom iTCO_vendor_support sg rtc_cmos pcspkr soundcore i2c_i801 rng_core snd_page_alloc rtc_core parport_pc shpchp evdev rtc_lib parport 8250_pnp pci_hotplug mac_hid unix ide_pci_generic ide_core ata_generic pata_acpi ata_piix sd_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ssb mmc_core pcmcia pcmcia_core firmware_class ehci_hcd usbcore nls_base i915 drm_kms_helper intel_agp button intel_gtt video thermal_sys hwmon output [last unloaded: mperf]
> [ 107.933458]
> [ 107.933458] Pid: 3400, comm: rmmod Not tainted 2.6.37-rc1 #7 0HH807/OptiPlex GX620
> [ 107.937800] RIP: 0010:[<ffffffff810e6c9d>] [<ffffffff810e6c9d>] cgroup_unload_subsys+0x64/0x1c8
> [ 107.937800] RSP: 0018:ffff88006c107ea8 EFLAGS: 00010202
> [ 107.937800] RAX: 0000000000000000 RBX: ffffffffa0009d50 RCX: 0000000000000000
> [ 107.937800] RDX: ffffffff81a3a5f0 RSI: ffff88006c107dc8 RDI: ffff88006c107e48
> [ 107.937800] RBP: ffff88006c107ec8 R08: ffffffff81a3a5f0 R09: 000000000000039a
> [ 107.937800] R10: 0000000000000001 R11: ffff88006c107e48 R12: 0000000000000000
> [ 107.937800] R13: 00007fff2664ffc0 R14: 0000000000000000 R15: 0000000000000001
> [ 107.937800] FS: 00007f52809e46f0(0000) GS:ffff88007c600000(0000) knlGS:0000000000000000
> [ 107.937800] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 107.937800] CR2: 0000003fb5a7bf20 CR3: 000000006c1d8000 CR4: 00000000000006f0
> [ 107.937800] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 107.937800] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 107.937800] Process rmmod (pid: 3400, threadinfo ffff88006c106000, task ffff880075a33000)
> [ 107.937800] Stack:
> [ 107.937800] ffff88006c107ec8 ffffffffa000a0e0 0000000000000000 00007fff2664ffc0
> [ 107.937800] ffff88006c107ed8 ffffffffa0009819 ffff88006c107f78 ffffffff810d3cb0
> [ 108.048442] ffffffffa000a0e0 0000000000000880 ffff88006c107f14 ffffffff8155036b
> [ 108.057485] Call Trace:
> [ 108.065148] [<ffffffffa0009819>] exit_cgroup_cls+0x45/0x4e [cls_cgroup]
> [ 108.070071] [<ffffffff810d3cb0>] sys_delete_module+0x2d6/0x368
> [ 108.085255] [<ffffffff8155036b>] ? lockdep_sys_exit_thunk+0x35/0x67
> [ 108.093771] [<ffffffff81007075>] ? xen_zap_pfn_range+0x53/0x139
> [ 108.101589] [<ffffffff815502f5>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [ 108.111624] [<ffffffff8100ea72>] system_call_fastpath+0x16/0x1b
> [ 108.119099] Code: 05 51 8d 71 01 0f 0b eb fe 31 f6 48 c7 c7 a0 a5 a3 81 48 ff 05 45 8d 71 01 e8 42 83 46 00 83 7b 58 07 7f 0b 48 ff 05 43 8d 71 01 <0f> 0b eb fe 48 ff 05 40 8d 71 01 48 8d bb 30 01 00 00 48 63 43
> [ 108.145840] RIP [<ffffffff810e6c9d>] cgroup_unload_subsys+0x64/0x1c8
> [ 108.152902] RSP <ffff88006c107ea8>
> [ 108.161767] ---[ end trace 659fde6f8f5f2810 ]---
>
>
>
> kernel config file is attached (almost allmodconfig).
> There may be some CONFIG options that are not helping...
>
> ---

commits 8e039d84b323c450
(cgroups: net_cls as module)

followed by commit f845172531f
(cls_cgroup: Store classid in struct sock)

are the problem :

if CONFIG_NET_CLS_CGROUP is not defined

exit_cgroup_cls() does :

#ifndef CONFIG_NET_CLS_CGROUP
	net_cls_subsys_id = -1; <<< -1
	synchronize_rcu();
#endif
	cgroup_unload_subsys(&net_cls_subsys);


but net_cls_subsys_id is an alias of net_cls_subsys.subsys_id

so putting -1 in it triggers BUG_ON() on line 3855 of kernel/cgroup.c

BUG_ON(ss->subsys_id < CGROUP_BUILTIN_SUBSYS_COUNT);
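
A minimal userspace sketch of the aliasing problem described above, not the kernel code itself. It assumes the alias is a macro that expands to the struct field. BUILTIN_SUBSYS_COUNT, struct subsys, unload_check_passes() and module_exit_as_reported() are hypothetical names made up for illustration, and the value 8 is only a placeholder for whatever the real configuration yields.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for CGROUP_BUILTIN_SUBSYS_COUNT (placeholder value). */
#define BUILTIN_SUBSYS_COUNT 8

/* Hypothetical, stripped-down model of a cgroup subsystem descriptor. */
struct subsys {
	int subsys_id;
};

/* A module-loaded subsystem gets an id at or above the builtin count. */
static struct subsys net_cls_subsys = { .subsys_id = BUILTIN_SUBSYS_COUNT + 1 };

/* Assumption: in the modular build the "global" id is just another name
 * for the struct field, so both names refer to the same storage. */
#define net_cls_subsys_id net_cls_subsys.subsys_id

/* Models the sanity check done on unload; returns false where the
 * kernel would hit BUG_ON(). */
static bool unload_check_passes(const struct subsys *ss)
{
	return ss->subsys_id >= BUILTIN_SUBSYS_COUNT;
}

/* Same order of operations as the reported exit path: invalidate the id
 * first, then run the unload check. */
static bool module_exit_as_reported(void)
{
	net_cls_subsys_id = -1;
	/* The write went through the alias, so the struct field changed too. */
	assert(net_cls_subsys.subsys_id == -1);
	return unload_check_passes(&net_cls_subsys);
}
```

Calling module_exit_as_reported() returns false in this model, which corresponds to the BUG_ON() firing. One way around it in the sketch would be to run the unload check before invalidating the id, or to keep the published id in separate storage; whether either matches the eventual kernel fix is not something this thread states.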



Herbert, I'll let you fix it ?

Thanks

2010-11-03 22:19:35

by Li Zefan

Subject: Re: Linux 2.6.37-rc1 (net/sched: cls_cgroup)

On 2010-11-04 06:01, Eric Dumazet wrote:
> On Wednesday, 3 November 2010 at 14:21 -0700, Randy Dunlap wrote:
>> Maybe this isn't normal usage: just modprobe cls_cgroup && rmmod cls_cgroup:
>>
>>
>> [ 107.806607] ------------[ cut here ]------------
>> [ 107.810180] kernel BUG at /local/linsrc/lnx-2637-rc1/kernel/cgroup.c:3855!
>> [ 107.810180] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
>> [ 107.822274] last sysfs file: /sys/devices/pci0000:00/0000:00:1d.1/usb3/3-1/3-1.3/devnum
>> [ 107.824889] CPU 0
>> [ 107.832854] Modules linked in: cls_cgroup(-) ipt_MASQUERADE iptable_nat nf_nat af_packet nfsd lockd nfs_acl auth_rpcgss exportfs sco bridge stp llc bnep l2cap crc16 bluetooth rfkill sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 p4_clockmod freq_table speedstep_lib binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod kvm uinput mousedev joydev snd_intel8x0 snd_ac97_codec ac97_bus usbmouse snd_seq snd_seq_device usbkbd usbhid snd_pcm ppdev hid tg3 led_class snd_timer dcdbas sr_mod snd iTCO_wdt cdrom iTCO_vendor_support sg rtc_cmos pcspkr soundcore i2c_i801 rng_core snd_page_alloc rtc_core parport_pc shpchp evdev rtc_lib parport 8250_pnp pci_hotplug mac_hid unix ide_pci_generic ide_core ata_generic pata_acpi ata_piix sd_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ssb mmc_core pcmcia pcmcia_core firmware_class ehci_hcd usbcore nls_base i915 drm_kms_helper intel_agp button intel_gtt video thermal_sys hwmon output [last unloaded: mperf]
>> [ 107.933458]
>> [ 107.933458] Pid: 3400, comm: rmmod Not tainted 2.6.37-rc1 #7 0HH807/OptiPlex GX620
>> [ 107.937800] RIP: 0010:[<ffffffff810e6c9d>] [<ffffffff810e6c9d>] cgroup_unload_subsys+0x64/0x1c8
>> [ 107.937800] RSP: 0018:ffff88006c107ea8 EFLAGS: 00010202
>> [ 107.937800] RAX: 0000000000000000 RBX: ffffffffa0009d50 RCX: 0000000000000000
>> [ 107.937800] RDX: ffffffff81a3a5f0 RSI: ffff88006c107dc8 RDI: ffff88006c107e48
>> [ 107.937800] RBP: ffff88006c107ec8 R08: ffffffff81a3a5f0 R09: 000000000000039a
>> [ 107.937800] R10: 0000000000000001 R11: ffff88006c107e48 R12: 0000000000000000
>> [ 107.937800] R13: 00007fff2664ffc0 R14: 0000000000000000 R15: 0000000000000001
>> [ 107.937800] FS: 00007f52809e46f0(0000) GS:ffff88007c600000(0000) knlGS:0000000000000000
>> [ 107.937800] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [ 107.937800] CR2: 0000003fb5a7bf20 CR3: 000000006c1d8000 CR4: 00000000000006f0
>> [ 107.937800] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [ 107.937800] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> [ 107.937800] Process rmmod (pid: 3400, threadinfo ffff88006c106000, task ffff880075a33000)
>> [ 107.937800] Stack:
>> [ 107.937800] ffff88006c107ec8 ffffffffa000a0e0 0000000000000000 00007fff2664ffc0
>> [ 107.937800] ffff88006c107ed8 ffffffffa0009819 ffff88006c107f78 ffffffff810d3cb0
>> [ 108.048442] ffffffffa000a0e0 0000000000000880 ffff88006c107f14 ffffffff8155036b
>> [ 108.057485] Call Trace:
>> [ 108.065148] [<ffffffffa0009819>] exit_cgroup_cls+0x45/0x4e [cls_cgroup]
>> [ 108.070071] [<ffffffff810d3cb0>] sys_delete_module+0x2d6/0x368
>> [ 108.085255] [<ffffffff8155036b>] ? lockdep_sys_exit_thunk+0x35/0x67
>> [ 108.093771] [<ffffffff81007075>] ? xen_zap_pfn_range+0x53/0x139
>> [ 108.101589] [<ffffffff815502f5>] ? trace_hardirqs_on_thunk+0x3a/0x3f
>> [ 108.111624] [<ffffffff8100ea72>] system_call_fastpath+0x16/0x1b
>> [ 108.119099] Code: 05 51 8d 71 01 0f 0b eb fe 31 f6 48 c7 c7 a0 a5 a3 81 48 ff 05 45 8d 71 01 e8 42 83 46 00 83 7b 58 07 7f 0b 48 ff 05 43 8d 71 01 <0f> 0b eb fe 48 ff 05 40 8d 71 01 48 8d bb 30 01 00 00 48 63 43
>> [ 108.145840] RIP [<ffffffff810e6c9d>] cgroup_unload_subsys+0x64/0x1c8
>> [ 108.152902] RSP <ffff88006c107ea8>
>> [ 108.161767] ---[ end trace 659fde6f8f5f2810 ]---
>>
>>
>>
>> kernel config file is attached (almost allmodconfig).
>> There may be some CONFIG options that are not helping...
>>
>> ---
>
> commits 8e039d84b323c450
> (cgroups: net_cls as module)
>
> followed by commit f845172531f
> (cls_cgroup: Store classid in struct sock)
>
> are the problem :
>
> if CONFIG_NET_CLS_CGROUP is not defined
>
> exit_cgroup_cls() does :
>
> #ifndef CONFIG_NET_CLS_CGROUP
> net_cls_subsys_id = -1; <<< -1
> synchronize_rcu();
> #endif
> cgroup_unload_subsys(&net_cls_subsys);
>
>
> but net_cls_subsys_id is an alias of net_cls_subsys.subsys_id
>
> so putting -1 in it triggers BUG_ON() on line 3855 of kernel/cgroup.c
>
> BUG_ON(ss->subsys_id < CGROUP_BUILTIN_SUBSYS_COUNT);
>
> Herbert, I'll let you fix it ?
>

Exactly what I was going to reply. This bug report also reveals
another bug..

I'll post fixes for the 2 bugs in minutes.

2010-11-03 22:32:01

by Li Zefan

Subject: Re: Linux 2.6.37-rc1 (net/sched: cls_cgroup)

Li Zefan wrote:
> On 2010-11-04 06:01, Eric Dumazet wrote:
>> On Wednesday, 3 November 2010 at 14:21 -0700, Randy Dunlap wrote:
>>> Maybe this isn't normal usage: just modprobe cls_cgroup && rmmod cls_cgroup:
>>>
>>>
>>> [ 107.806607] ------------[ cut here ]------------
>>> [ 107.810180] kernel BUG at /local/linsrc/lnx-2637-rc1/kernel/cgroup.c:3855!
>>> [ 107.810180] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
>>> [ 107.822274] last sysfs file: /sys/devices/pci0000:00/0000:00:1d.1/usb3/3-1/3-1.3/devnum
>>> [ 107.824889] CPU 0
>>> [ 107.832854] Modules linked in: cls_cgroup(-) ipt_MASQUERADE iptable_nat nf_nat af_packet nfsd lockd nfs_acl auth_rpcgss exportfs sco bridge stp llc bnep l2cap crc16 bluetooth rfkill sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 p4_clockmod freq_table speedstep_lib binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod kvm uinput mousedev joydev snd_intel8x0 snd_ac97_codec ac97_bus usbmouse snd_seq snd_seq_device usbkbd usbhid snd_pcm ppdev hid tg3 led_class snd_timer dcdbas sr_mod snd iTCO_wdt cdrom iTCO_vendor_support sg rtc_cmos pcspkr soundcore i2c_i801 rng_core snd_page_alloc rtc_core parport_pc shpchp evdev rtc_lib parport 8250_pnp pci_hotplug mac_hid unix ide_pci_generic ide_core ata_generic pata_acpi ata_piix sd_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ssb mmc_core pcmcia pcmcia_core firmware_class ehci_hcd usbcore nls_base i915 drm_kms_helper intel_agp button intel_gtt video thermal_sys hwmon output [last unloaded: mperf]
>>> [ 107.933458]
>>> [ 107.933458] Pid: 3400, comm: rmmod Not tainted 2.6.37-rc1 #7 0HH807/OptiPlex GX620
>>> [ 107.937800] RIP: 0010:[<ffffffff810e6c9d>] [<ffffffff810e6c9d>] cgroup_unload_subsys+0x64/0x1c8
>>> [ 107.937800] RSP: 0018:ffff88006c107ea8 EFLAGS: 00010202
>>> [ 107.937800] RAX: 0000000000000000 RBX: ffffffffa0009d50 RCX: 0000000000000000
>>> [ 107.937800] RDX: ffffffff81a3a5f0 RSI: ffff88006c107dc8 RDI: ffff88006c107e48
>>> [ 107.937800] RBP: ffff88006c107ec8 R08: ffffffff81a3a5f0 R09: 000000000000039a
>>> [ 107.937800] R10: 0000000000000001 R11: ffff88006c107e48 R12: 0000000000000000
>>> [ 107.937800] R13: 00007fff2664ffc0 R14: 0000000000000000 R15: 0000000000000001
>>> [ 107.937800] FS: 00007f52809e46f0(0000) GS:ffff88007c600000(0000) knlGS:0000000000000000
>>> [ 107.937800] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>> [ 107.937800] CR2: 0000003fb5a7bf20 CR3: 000000006c1d8000 CR4: 00000000000006f0
>>> [ 107.937800] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> [ 107.937800] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> [ 107.937800] Process rmmod (pid: 3400, threadinfo ffff88006c106000, task ffff880075a33000)
>>> [ 107.937800] Stack:
>>> [ 107.937800] ffff88006c107ec8 ffffffffa000a0e0 0000000000000000 00007fff2664ffc0
>>> [ 107.937800] ffff88006c107ed8 ffffffffa0009819 ffff88006c107f78 ffffffff810d3cb0
>>> [ 108.048442] ffffffffa000a0e0 0000000000000880 ffff88006c107f14 ffffffff8155036b
>>> [ 108.057485] Call Trace:
>>> [ 108.065148] [<ffffffffa0009819>] exit_cgroup_cls+0x45/0x4e [cls_cgroup]
>>> [ 108.070071] [<ffffffff810d3cb0>] sys_delete_module+0x2d6/0x368
>>> [ 108.085255] [<ffffffff8155036b>] ? lockdep_sys_exit_thunk+0x35/0x67
>>> [ 108.093771] [<ffffffff81007075>] ? xen_zap_pfn_range+0x53/0x139
>>> [ 108.101589] [<ffffffff815502f5>] ? trace_hardirqs_on_thunk+0x3a/0x3f
>>> [ 108.111624] [<ffffffff8100ea72>] system_call_fastpath+0x16/0x1b
>>> [ 108.119099] Code: 05 51 8d 71 01 0f 0b eb fe 31 f6 48 c7 c7 a0 a5 a3 81 48 ff 05 45 8d 71 01 e8 42 83 46 00 83 7b 58 07 7f 0b 48 ff 05 43 8d 71 01 <0f> 0b eb fe 48 ff 05 40 8d 71 01 48 8d bb 30 01 00 00 48 63 43
>>> [ 108.145840] RIP [<ffffffff810e6c9d>] cgroup_unload_subsys+0x64/0x1c8
>>> [ 108.152902] RSP <ffff88006c107ea8>
>>> [ 108.161767] ---[ end trace 659fde6f8f5f2810 ]---
>>>
>>>
>>>
>>> kernel config file is attached (almost allmodconfig).
>>> There may be some CONFIG options that are not helping...
>>>
>>> ---
>>
>> commits 8e039d84b323c450
>> (cgroups: net_cls as module)
>>
>> followed by commit f845172531f
>> (cls_cgroup: Store classid in struct sock)
>>
>> are the problem :
>>
>> if CONFIG_NET_CLS_CGROUP is not defined
>>
>> exit_cgroup_cls() does :
>>
>> #ifndef CONFIG_NET_CLS_CGROUP
>> net_cls_subsys_id = -1; <<< -1
>> synchronize_rcu();
>> #endif
>> cgroup_unload_subsys(&net_cls_subsys);
>>
>>
>> but net_cls_subsys_id is an alias of net_cls_subsys.subsys_id
>>
>> so putting -1 in it triggers BUG_ON() on line 3855 of kernel/cgroup.c
>>
>> BUG_ON(ss->subsys_id < CGROUP_BUILTIN_SUBSYS_COUNT);
>>
>> Herbert, I'll let you fix it ?
>>
>
> Exactly what I was going to reply. This bug report also reveals
> another bug..
>
> I'll post fixes for the 2 bugs in minutes.

Sorry, I have to leave now, so I can't make it. I'll fix this later
if Herbert hasn't fixed it by then.

2010-11-03 23:18:16

by Randy Dunlap

Subject: Re: Linux 2.6.37-rc1 (floppy module load: no device found)



[ 303.127418] calling floppy_module_init+0x0/0x93 [floppy] @ 5726
[ 303.134577] ------------[ cut here ]------------
[ 303.139329] WARNING: at /local/linsrc/lnx-2637-rc1/lib/list_debug.c:26 __list_add+0x4d/0xa5()
[ 303.148248] Hardware name: OptiPlex GX620
[ 303.153682] list_add corruption. next->prev should be prev (ffffffff81ae5e50), but was 6b6b6b6b6b6b6b6b. (next=ffff88006c908590).
[ 303.165678] Modules linked in: floppy(+) ipt_MASQUERADE iptable_nat nf_nat af_packet nfsd lockd nfs_acl auth_rpcgss exportfs sco bridge stp llc bnep l2cap crc16 bluetooth rfkill sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 p4_clockmod freq_table speedstep_lib binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod kvm uinput mousedev joydev snd_intel8x0 snd_ac97_codec ppdev ac97_bus snd_seq snd_seq_device led_class snd_pcm iTCO_wdt snd_timer usbmouse iTCO_vendor_support usbkbd snd usbhid tg3 hid sg dcdbas sr_mod soundcore rtc_cmos pcspkr i2c_i801 cdrom parport_pc rng_core evdev snd_page_alloc shpchp rtc_core parport rtc_lib mac_hid pci_hotplug 8250_pnp unix ide_pci_generic ide_core ata_generic pata_acpi ata_piix sd_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ssb mmc_core pcmcia pcmcia_core firmware_class ehci_hcd usbcore nls_base i915 drm_kms_helper intel_agp button intel_gtt video thermal_sys hwmon output [last unloaded: mperf]
[ 303.269866] Pid: 5726, comm: modprobe Not tainted 2.6.37-rc1 #10
[ 303.275973] Call Trace:
[ 303.278775] [<ffffffff8107e1ed>] warn_slowpath_common+0xc6/0xf3
[ 303.284885] [<ffffffff812d3582>] ? __percpu_counter_init+0x9e/0xdf
[ 303.292959] [<ffffffff8107e30a>] warn_slowpath_fmt+0x5b/0x6b
[ 303.299754] [<ffffffff812cf38b>] __list_add+0x4d/0xa5
[ 303.306249] [<ffffffff812d359f>] __percpu_counter_init+0xbb/0xdf
[ 303.314093] [<ffffffff8117c577>] bdi_init+0x13f/0x1c2
[ 303.320270] [<ffffffffa0bdb3bb>] ? do_fd_request+0x0/0x111 [floppy]
[ 303.328429] [<ffffffffa0bdb3bb>] ? do_fd_request+0x0/0x111 [floppy]
[ 303.335791] [<ffffffff812a7444>] blk_alloc_queue_node+0x8f/0x220
[ 303.343257] [<ffffffff812a773b>] blk_init_queue_node+0x30/0x90
[ 303.350874] [<ffffffff812a77b3>] blk_init_queue+0x18/0x21
[ 303.357447] [<ffffffffa0bf29da>] floppy_init+0x95/0x7c0 [floppy]
[ 303.365279] [<ffffffff81017b19>] ? read_tsc+0x17/0x29
[ 303.371436] [<ffffffffa0bf3105>] ? floppy_module_init+0x0/0x93 [floppy]
[ 303.379517] [<ffffffffa0bf318d>] floppy_module_init+0x88/0x93 [floppy]
[ 303.387890] [<ffffffffa0bf3105>] ? floppy_module_init+0x0/0x93 [floppy]
[ 303.395650] [<ffffffff810020a6>] do_one_initcall+0x6c/0x1ef
[ 303.403068] [<ffffffff810d52f3>] sys_init_module+0xe1/0x2a5
[ 303.409709] [<ffffffff8100ea72>] system_call_fastpath+0x16/0x1b
[ 303.419712] ---[ end trace e26c2a9ce976be75 ]---
[ 303.429812] Floppy drive(s): fd0 is 1.44M

[ 306.480304] floppy0: no floppy controllers found
[ 306.488160] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 306.492030] last sysfs file: /sys/devices/pci0000:00/0000:00:1d.1/usb3/3-1/3-1.3/devnum
[ 306.492030] CPU 0
[ 306.492030] Modules linked in: floppy(+) ipt_MASQUERADE iptable_nat nf_nat af_packet nfsd lockd nfs_acl auth_rpcgss exportfs sco bridge stp llc bnep l2cap crc16 bluetooth rfkill sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 p4_clockmod freq_table speedstep_lib binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod kvm uinput mousedev joydev snd_intel8x0 snd_ac97_codec ppdev ac97_bus snd_seq snd_seq_device led_class snd_pcm iTCO_wdt snd_timer usbmouse iTCO_vendor_support usbkbd snd usbhid tg3 hid sg dcdbas sr_mod soundcore rtc_cmos pcspkr i2c_i801 cdrom parport_pc rng_core evdev snd_page_alloc shpchp rtc_core parport rtc_lib mac_hid pci_hotplug 8250_pnp unix ide_pci_generic ide_core ata_generic pata_acpi ata_piix sd_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ssb mmc_core pcmcia pcmcia_core firmware_class ehci_hcd usbcore nls_base i915 drm_kms_helper intel_agp button intel_gtt video thermal_sys hwmon output [last unloaded: mperf]
[ 306.626211]
[ 306.626211] Pid: 5726, comm: modprobe Tainted: G W 2.6.37-rc1 #10 0HH807/OptiPlex GX620
[ 306.626211] RIP: 0010:[<ffffffff810c309f>] [<ffffffff810c309f>] __lock_acquire+0xd8/0x4e8
[ 306.626211] RSP: 0018:ffff88006d38dd48 EFLAGS: 00010002
[ 306.626211] RAX: 0000000000000006 RBX: 6b6b6b6b6b6b6d13 RCX: 0000000000000000
[ 306.626211] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 6b6b6b6b6b6b6d13
[ 306.626211] RBP: ffff88006d38dda8 R08: 0000000000000001 R09: 0000000000000001
[ 306.626211] R10: ffffffff81812d48 R11: ffff88006d38de78 R12: 0000000000000000
[ 306.626211] R13: ffff88006cb13000 R14: 0000000000000000 R15: 0000000000000000
[ 306.626211] FS: 00007f248ff436f0(0000) GS:ffff88007c600000(0000) knlGS:0000000000000000
[ 306.626211] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 306.626211] CR2: 000000000064b000 CR3: 000000006c976000 CR4: 00000000000006f0
[ 306.626211] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 306.626211] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 306.626211] Process modprobe (pid: 5726, threadinfo ffff88006d38c000, task ffff88006cb13000)
[ 306.626211] Stack:
[ 306.783845] 0000000000000202 ffffffff812b10c2 ffff88006d38dd88 ffffffff811b2af3
[ 306.787082] ffff88006d24ca88 000000006d24ca88 ffff88006d38dd88 0000000000000000
[ 306.787082] ffff88006cb13000 ffffffff81092650 0000000000000001 0000000000000000
[ 306.787082] Call Trace:
[ 306.787082] [<ffffffff812b10c2>] ? disk_release+0x97/0xa3
[ 306.787082] [<ffffffff811b2af3>] ? __slab_free+0x1b9/0x1d6
[ 306.787082] [<ffffffff81092650>] ? del_timer_sync+0x50/0x15c
[ 306.787082] [<ffffffff810c35af>] lock_acquire+0x100/0x150
[ 306.787082] [<ffffffff81092650>] ? del_timer_sync+0x50/0x15c
[ 306.787082] [<ffffffff81092694>] del_timer_sync+0x94/0x15c
[ 306.874354] [<ffffffff81092650>] ? del_timer_sync+0x50/0x15c
[ 306.874354] [<ffffffff812a763f>] blk_sync_queue+0x24/0x55
[ 306.874354] [<ffffffff812a7692>] blk_cleanup_queue+0x22/0x9b
[ 306.874354] [<ffffffffa0bf30e1>] floppy_init+0x79c/0x7c0 [floppy]
[ 306.874354] [<ffffffff81017b19>] ? read_tsc+0x17/0x29
[ 306.874354] [<ffffffffa0bf3105>] ? floppy_module_init+0x0/0x93 [floppy]
[ 306.874354] [<ffffffffa0bf318d>] floppy_module_init+0x88/0x93 [floppy]
[ 306.874354] [<ffffffffa0bf3105>] ? floppy_module_init+0x0/0x93 [floppy]
[ 306.940576] [<ffffffff810020a6>] do_one_initcall+0x6c/0x1ef
[ 306.943732] [<ffffffff810d52f3>] sys_init_module+0xe1/0x2a5
[ 306.943732] [<ffffffff8100ea72>] system_call_fastpath+0x16/0x1b
[ 306.943732] Code: 05 4f 15 72 01 e8 9c b1 fb ff 48 ff 05 4b 15 72 01 48 ff 05 4c 15 72 01 48 ff 05 55 15 72 01 e9 e3 03 00 00 48 ff 05 41 15 72 01 <48> 81 3b 40 34 0b 82 75 07 48 ff 05 41 15 72 01 83 fe 01 77 13
[ 306.982675] RIP [<ffffffff810c309f>] __lock_acquire+0xd8/0x4e8
[ 306.982675] RSP <ffff88006d38dd48>
[ 306.982675] ---[ end trace e26c2a9ce976be76 ]---


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2010-11-03 23:18:59

by Randy Dunlap

Subject: Re: Linux 2.6.37-rc1 (libipw remove_proc_entry warning)


[ 74.846676] calling libipw_init+0x0/0xe4 [libipw] @ 2992
[ 74.852356] libipw: 802.11 data/management/control stack, git-1.1.13
[ 74.858790] libipw: Copyright (C) 2004-2005 Intel Corporation <[email protected]>
[ 74.866977] initcall libipw_init+0x0/0xe4 [libipw] returned 0 after 14318 usecs


[ 78.273409] ------------[ cut here ]------------
[ 78.278210] WARNING: at /local/linsrc/lnx-2637-rc1/fs/proc/generic.c:816 remove_proc_entry+0x156/0x35e()
[ 78.288314] Hardware name: OptiPlex GX620
[ 78.294870] name 'libipw'
[ 78.298520] Modules linked in: libipw(-) lib80211 cfg80211 ipt_MASQUERADE iptable_nat nf_nat af_packet nfsd lockd nfs_acl auth_rpcgss exportfs sco bridge stp llc bnep l2cap crc16 bluetooth rfkill sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 p4_clockmod freq_table speedstep_lib binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod kvm uinput mousedev joydev snd_intel8x0 snd_ac97_codec ac97_bus ppdev snd_seq snd_seq_device usbmouse snd_pcm led_class usbkbd snd_timer usbhid iTCO_wdt hid snd iTCO_vendor_support tg3 sr_mod dcdbas cdrom soundcore i2c_i801 rtc_cmos sg pcspkr rng_core parport_pc snd_page_alloc rtc_core shpchp evdev rtc_lib parport 8250_pnp pci_hotplug mac_hid unix ide_pci_generic ide_core ata_generic pata_acpi ata_piix sd_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ssb mmc_core pcmcia pcmcia_core firmware_class ehci_hcd usbcore nls_base i915 drm_kms_helper intel_agp button intel_gtt video thermal_sys hwmon output [last unloaded: mperf]
[ 78.429364] Pid: 3067, comm: rmmod Not tainted 2.6.37-rc1 #10
[ 78.435209] Call Trace:
[ 78.437982] [<ffffffff8107e1ed>] warn_slowpath_common+0xc6/0xf3
[ 78.444119] [<ffffffff8107e30a>] warn_slowpath_fmt+0x5b/0x6b
[ 78.449973] [<ffffffff81550dbb>] ? _raw_spin_unlock+0x40/0x4b
[ 78.456247] [<ffffffff8123d37e>] remove_proc_entry+0x156/0x35e
[ 78.462289] [<ffffffff81017d29>] ? native_sched_clock+0x3b/0x6d
[ 78.470047] [<ffffffff810b067f>] ? sched_clock_cpu+0x147/0x160
[ 78.477041] [<ffffffff81137479>] ? trace_hardirqs_off+0x10/0x4a
[ 78.487138] [<ffffffff810b0742>] ? local_clock+0xaa/0xf4
[ 78.496297] [<ffffffff810c0ab9>] ? lock_release_holdtime+0x41/0x177
[ 78.505725] [<ffffffff810c3a1b>] ? lock_release_nested+0xfb/0x133
[ 78.514477] [<ffffffffa0c39cad>] libipw_exit+0x49/0x5d [libipw]
[ 78.523311] [<ffffffff810d3310>] sys_delete_module+0x2d6/0x368
[ 78.531755] [<ffffffff8154f9cb>] ? lockdep_sys_exit_thunk+0x35/0x67
[ 78.540954] [<ffffffff810fd60f>] ? audit_syscall_entry+0x172/0x1a5
[ 78.550029] [<ffffffff8154f955>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 78.559196] [<ffffffff8100ea72>] system_call_fastpath+0x16/0x1b
[ 78.567700] ---[ end trace b9ae9f3ab8d89ea5 ]---


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2010-11-03 23:21:58

by Randy Dunlap

Subject: Re: Linux 2.6.37-rc1 (scsi_debug: list corruption)



[ 102.555847] calling scsi_debug_init+0x0/0x704 [scsi_debug] @ 3337
[ 102.622974] scsi_debug: host protection
[ 102.627513] scsi4 : scsi_debug, version 1.82 [20100324], dev_size_mb=8, opts=0x0
[ 102.639095] initcall scsi_debug_init+0x0/0x704 [scsi_debug] returned 0 after 75039 usecs
[ 102.651072] ------------[ cut here ]------------
[ 102.657373] WARNING: at /local/linsrc/lnx-2637-rc1/lib/list_debug.c:26 __list_add+0x4d/0xa5()
[ 102.666012] Hardware name: OptiPlex GX620
[ 102.671396] list_add corruption. next->prev should be prev (ffffffff81ae5e50), but was 6b6b6b6b6b6b6b6b. (next=ffff88006c880590).
[ 102.683509] Modules linked in: scsi_debug ipt_MASQUERADE iptable_nat nf_nat af_packet nfsd lockd nfs_acl auth_rpcgss exportfs sco bridge stp llc bnep l2cap crc16 bluetooth rfkill sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 p4_clockmod freq_table speedstep_lib binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod kvm uinput mousedev joydev ppdev snd_intel8x0 snd_ac97_codec ac97_bus snd_seq snd_seq_device usbkbd snd_pcm usbmouse led_class snd_timer usbhid iTCO_wdt hid tg3 snd sr_mod iTCO_vendor_support dcdbas cdrom pcspkr i2c_i801 sg soundcore rtc_cmos parport_pc rng_core evdev snd_page_alloc shpchp rtc_core parport rtc_lib mac_hid 8250_pnp pci_hotplug unix ide_pci_generic ide_core ata_generic pata_acpi ata_piix sd_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ssb mmc_core pcmcia pcmcia_core firmware_class ehci_hcd usbcore nls_base i915 drm_kms_helper intel_agp button intel_gtt video thermal_sys hwmon output [last unloaded: mperf]
[ 102.787884] Pid: 3349, comm: scsi_scan_4 Not tainted 2.6.37-rc1 #10
[ 102.794602] Call Trace:
[ 102.797152] [<ffffffff8107e1ed>] warn_slowpath_common+0xc6/0xf3
[ 102.803270] [<ffffffff812d3582>] ? __percpu_counter_init+0x9e/0xdf
[ 102.809973] [<ffffffff8107e30a>] warn_slowpath_fmt+0x5b/0x6b
[ 102.815846] [<ffffffff812cf38b>] __list_add+0x4d/0xa5
[ 102.821109] [<ffffffff812d359f>] __percpu_counter_init+0xbb/0xdf
[ 102.829046] [<ffffffff8117c577>] bdi_init+0x13f/0x1c2
[ 102.835171] [<ffffffff81411c7c>] ? scsi_request_fn+0x0/0x6f4
[ 102.842260] [<ffffffff81411c7c>] ? scsi_request_fn+0x0/0x6f4
[ 102.849903] [<ffffffff812a7444>] blk_alloc_queue_node+0x8f/0x220
[ 102.856915] [<ffffffff812a773b>] blk_init_queue_node+0x30/0x90
[ 102.864592] [<ffffffff812a77b3>] blk_init_queue+0x18/0x21
[ 102.871037] [<ffffffff8141239d>] __scsi_alloc_queue+0x2d/0x207
[ 102.878377] [<ffffffff8141259b>] scsi_alloc_queue+0x24/0x9c
[ 102.884579] [<ffffffff81414f27>] scsi_alloc_sdev+0x1de/0x31d
[ 102.891302] [<ffffffff814162bb>] scsi_probe_and_add_lun+0x191/0x615
[ 102.899044] [<ffffffff813f9cde>] ? attribute_container_add_device+0x258/0x26e
[ 102.908167] [<ffffffff813eedda>] ? get_device+0x1e/0x36
[ 102.914335] [<ffffffff81414c3b>] ? scsi_alloc_target+0x2d9/0x33e
[ 102.922209] [<ffffffff814173c5>] ? scsi_scan_host_selected+0xec/0x1a7
[ 102.929769] [<ffffffff814173c5>] ? scsi_scan_host_selected+0xec/0x1a7
[ 102.938062] [<ffffffff8141708b>] __scsi_scan_target+0xbe/0x25c
[ 102.944996] [<ffffffff8141729c>] scsi_scan_channel+0x73/0xb0
[ 102.952467] [<ffffffff8141742b>] scsi_scan_host_selected+0x152/0x1a7
[ 102.959976] [<ffffffff8141752d>] ? do_scan_async+0x0/0x3d
[ 102.966801] [<ffffffff81417521>] do_scsi_scan_host+0xa1/0xad
[ 102.976609] [<ffffffff8141754e>] do_scan_async+0x21/0x3d
[ 102.985889] [<ffffffff8141752d>] ? do_scan_async+0x0/0x3d
[ 102.994412] [<ffffffff810a77f6>] kthread+0xc3/0xd2
[ 103.003098] [<ffffffff8113785d>] ? trace_hardirqs_on_caller+0x18/0x48
[ 103.012765] [<ffffffff8100f904>] kernel_thread_helper+0x4/0x10
[ 103.022542] [<ffffffff81551210>] ? restore_args+0x0/0x30
[ 103.031013] [<ffffffff810a7733>] ? kthread+0x0/0xd2
[ 103.039803] [<ffffffff8100f900>] ? kernel_thread_helper+0x0/0x10
[ 103.048967] ---[ end trace 0acaa11d0c3e9c22 ]---
[ 103.070125] scsi 4:0:0:0: Direct-Access Linux scsi_debug 0004 PQ: 0 ANSI: 5
[ 103.110940] sd 4:0:0:0: Attached scsi generic sg2 type 0
[ 103.125754] sd 4:0:0:0: [sdb] 16384 512-byte logical blocks: (8.38 MB/8.00 MiB)
[ 103.164255] sd 4:0:0:0: [sdb] Write Protect is off
[ 103.171863] sd 4:0:0:0: [sdb] Mode Sense: 73 00 10 08
[ 103.224509] sd 4:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 103.284456] sdb: unknown partition table
[ 103.308541] sd 4:0:0:0: [sdb] Attached SCSI disk


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2010-11-03 23:31:59

by Herbert Xu

Subject: Re: Linux 2.6.37-rc1 (net/sched: cls_cgroup)

On Wed, Nov 03, 2010 at 11:01:17PM +0100, Eric Dumazet wrote:
>
> commits 8e039d84b323c450
> (cgroups: net_cls as module)
>
> followed by commit f845172531f
> (cls_cgroup: Store classid in struct sock)

Indeed, it looks like the tree I worked on didn't have the first
patch applied for some reason.

Anyway, this patch should fix the problem. Thanks Eric!

cls_cgroup: Fix crash on module unload

Somewhere along the lines net_cls_subsys_id became a macro when
cls_cgroup is built as a module. Not only did it make cls_cgroup
completely useless, it also causes it to crash on module unload.

This patch fixes this by removing that macro.

Thanks to Eric Dumazet for diagnosing this problem.

Reported-by: Randy Dunlap <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
index 37dff78..d49c40f 100644
--- a/net/sched/cls_cgroup.c
+++ b/net/sched/cls_cgroup.c
@@ -34,8 +34,6 @@ struct cgroup_subsys net_cls_subsys = {
.populate = cgrp_populate,
#ifdef CONFIG_NET_CLS_CGROUP
.subsys_id = net_cls_subsys_id,
-#else
-#define net_cls_subsys_id net_cls_subsys.subsys_id
#endif
.module = THIS_MODULE,
};
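
For readers following the diagnosis, here is a minimal userspace sketch of why the removed macro was harmful. It is not the kernel code: `struct fake_subsys`, `BUILTIN_SUBSYS_COUNT` (and its value of 8), and both helper functions are hypothetical stand-ins chosen for illustration. Only the shape of the macro and of the check is taken from the messages in this thread.

```c
#include <assert.h>

/* Assumed value, for illustration only; the real constant lives in the kernel. */
#define BUILTIN_SUBSYS_COUNT 8

/* Hypothetical stand-in for struct cgroup_subsys. */
struct fake_subsys {
    int subsys_id;
};

/* A modular subsystem gets an id at or above the builtin count. */
static struct fake_subsys net_cls_subsys = {
    .subsys_id = BUILTIN_SUBSYS_COUNT + 1,
};

/* Same shape as the removed macro: the "variable" is really the struct field. */
#define net_cls_subsys_id net_cls_subsys.subsys_id

/* Mirrors the quoted check: unloading requires subsys_id >= builtin count. */
static int unload_check_passes(const struct fake_subsys *ss)
{
    return ss->subsys_id >= BUILTIN_SUBSYS_COUNT;
}

/* Mimics the module exit path: invalidate the id, then try to unload. */
static int exit_path_with_alias(void)
{
    net_cls_subsys_id = -1;   /* silently overwrites net_cls_subsys.subsys_id */
    return unload_check_passes(&net_cls_subsys);
}
```

In this sketch, `exit_path_with_alias()` reports that the unload check fails, which corresponds to the BUG_ON() firing in the trace. Without the alias, the assignment of -1 would land in a separate variable and the struct field would keep its valid id.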

Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2010-11-04 01:46:15

by Li Zefan

Subject: Re: Linux 2.6.37-rc1 (net/sched: cls_cgroup)

>> commits 8e039d84b323c450
>> (cgroups: net_cls as module)
>>
>> followed by commit f845172531f
>> (cls_cgroup: Store classid in struct sock)
>
> Indeed, it looks like the tree I worked on didn't have the first
> patch applied for some reason.
>

The first patch was merged in .34 and the second one in .35, and
from the changelog and the diff, it seems you did know cls_cgroup
can be a module. ;)

> Anyway, this patch should fix the problem. Thanks Eric!
>
> cls_cgroup: Fix crash on module unload
>
> Somewhere along the lines net_cls_subsys_id became a macro when
> cls_cgroup is built as a module. Not only did it make cls_cgroup
> completely useless, it also causes it to crash on module unload.
>
> This patch fixes this by removing that macro.
>
> Thanks to Eric Dumazet for diagnosing this problem.
>
> Reported-by: Randy Dunlap <[email protected]>
> Signed-off-by: Herbert Xu <[email protected]>
>

Reviewed-by: Li Zefan <[email protected]>

> diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
> index 37dff78..d49c40f 100644
> --- a/net/sched/cls_cgroup.c
> +++ b/net/sched/cls_cgroup.c
> @@ -34,8 +34,6 @@ struct cgroup_subsys net_cls_subsys = {
> .populate = cgrp_populate,
> #ifdef CONFIG_NET_CLS_CGROUP
> .subsys_id = net_cls_subsys_id,
> -#else
> -#define net_cls_subsys_id net_cls_subsys.subsys_id
> #endif
> .module = THIS_MODULE,
> };
>
> Cheers,

2010-11-04 01:55:48

by David Miller

Subject: Re: Linux 2.6.37-rc1 (net/sched: cls_cgroup)

From: Herbert Xu <[email protected]>
Date: Wed, 3 Nov 2010 18:31:05 -0500

> cls_cgroup: Fix crash on module unload
>
> Somewhere along the lines net_cls_subsys_id became a macro when
> cls_cgroup is built as a module. Not only did it make cls_cgroup
> completely useless, it also causes it to crash on module unload.
>
> This patch fixes this by removing that macro.
>
> Thanks to Eric Dumazet for diagnosing this problem.
>
> Reported-by: Randy Dunlap <[email protected]>
> Signed-off-by: Herbert Xu <[email protected]>

Applied, and queued up for -stable, thanks everyone!

2010-11-04 02:25:13

by Xiaotian Feng

Subject: Re: Linux 2.6.37-rc1 (scsi_debug: list corruption)

On Thu, Nov 4, 2010 at 7:20 AM, Randy Dunlap <[email protected]> wrote:
>
>
> [ 102.555847] calling scsi_debug_init+0x0/0x704 [scsi_debug] @ 3337
> [ 102.622974] scsi_debug: host protection
> [ 102.627513] scsi4 : scsi_debug, version 1.82 [20100324], dev_size_mb=8, opts=0x0
> [ 102.639095] initcall scsi_debug_init+0x0/0x704 [scsi_debug] returned 0 after 75039 usecs
> [ 102.651072] ------------[ cut here ]------------
> [ 102.657373] WARNING: at /local/linsrc/lnx-2637-rc1/lib/list_debug.c:26 __list_add+0x4d/0xa5()
> [ 102.666012] Hardware name: OptiPlex GX620
> [ 102.671396] list_add corruption. next->prev should be prev (ffffffff81ae5e50), but was 6b6b6b6b6b6b6b6b. (next=ffff88006c880590).

This might be related to the net rds percpu_counter corruption. Does the
following patch fix your issue?
http://patchwork.ozlabs.org/patch/69939/

>
> ---
> ~Randy
> *** Remember to use Documentation/SubmitChecklist when testing your code ***

2010-11-04 15:57:01

by Randy Dunlap

Subject: Re: Linux 2.6.37-rc1 (net/sched: cls_cgroup)

On 11/03/10 16:31, Herbert Xu wrote:
> On Wed, Nov 03, 2010 at 11:01:17PM +0100, Eric Dumazet wrote:
>>
>> commits 8e039d84b323c450
>> (cgroups: net_cls as module)
>>
>> followed by commit f845172531f
>> (cls_cgroup: Store classid in struct sock)
>
> Indeed, it looks like the tree I worked on didn't have the first
> patch applied for some reason.
>
> Anyway, this patch should fix the problem. Thanks Eric!
>
> cls_cgroup: Fix crash on module unload
>
> Somewhere along the lines net_cls_subsys_id became a macro when
> cls_cgroup is built as a module. Not only did it make cls_cgroup
> completely useless, it also causes it to crash on module unload.
>
> This patch fixes this by removing that macro.
>
> Thanks to Eric Dumazet for diagnosing this problem.
>
> Reported-by: Randy Dunlap <[email protected]>
> Signed-off-by: Herbert Xu <[email protected]>

Tested-by: Randy Dunlap <[email protected]>

Thanks.

>
> diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
> index 37dff78..d49c40f 100644
> --- a/net/sched/cls_cgroup.c
> +++ b/net/sched/cls_cgroup.c
> @@ -34,8 +34,6 @@ struct cgroup_subsys net_cls_subsys = {
> .populate = cgrp_populate,
> #ifdef CONFIG_NET_CLS_CGROUP
> .subsys_id = net_cls_subsys_id,
> -#else
> -#define net_cls_subsys_id net_cls_subsys.subsys_id
> #endif
> .module = THIS_MODULE,
> };
>
> Cheers,


--
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2010-11-04 16:12:44

by Randy Dunlap

Subject: Re: Linux 2.6.37-rc1 (scsi_debug: list corruption)

On 11/03/10 19:25, Xiaotian Feng wrote:
> On Thu, Nov 4, 2010 at 7:20 AM, Randy Dunlap <[email protected]> wrote:
>>
>>
>> [ 102.555847] calling scsi_debug_init+0x0/0x704 [scsi_debug] @ 3337
>> [ 102.622974] scsi_debug: host protection
>> [ 102.627513] scsi4 : scsi_debug, version 1.82 [20100324], dev_size_mb=8, opts=0x0
>> [ 102.639095] initcall scsi_debug_init+0x0/0x704 [scsi_debug] returned 0 after 75039 usecs
>> [ 102.651072] ------------[ cut here ]------------
>> [ 102.657373] WARNING: at /local/linsrc/lnx-2637-rc1/lib/list_debug.c:26 __list_add+0x4d/0xa5()
>> [ 102.666012] Hardware name: OptiPlex GX620
>> [ 102.671396] list_add corruption. next->prev should be prev (ffffffff81ae5e50), but was 6b6b6b6b6b6b6b6b. (next=ffff88006c880590).
>
> This might be related with net rds percpu_counter corruption. Does
> following patch fix your issue?
> http://patchwork.ozlabs.org/patch/69939/

Yes, somehow that patch fixes it.

What good eyes you have.

thanks,
--
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2010-11-05 22:10:36

by Linus Torvalds

Subject: Re: Linux 2.6.37-rc1 (floppy module load: no device found)

That first warning says that something stayed on a list even though it
was released (the 0x6b thing is the kmalloc free poison pattern). And
the oops looks related to something similar.
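
A minimal userspace sketch of that failure mode, under stated assumptions: the `struct list_head` here is a local stand-in rather than the kernel's, all function names are hypothetical, and "freeing" is simulated by filling the node with the 0x6b byte while its storage stays alive, so the demo itself contains no real use-after-free.

```c
#include <assert.h>
#include <string.h>

#define POISON_FREE 0x6b   /* the free-poison byte mentioned above */

/* Local stand-in for a doubly linked list node. */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert after head, but first check that head's neighbour still points back.
 * Returns 1 on success, 0 if the list looks corrupted. */
static int checked_list_add(struct list_head *item, struct list_head *head)
{
    struct list_head *prev = head;
    struct list_head *next = head->next;

    /* Compare raw bytes so a poisoned pointer is never loaded as a pointer. */
    if (memcmp(&next->prev, &prev, sizeof(prev)) != 0)
        return 0;

    item->next = next;
    item->prev = prev;
    next->prev = item;
    prev->next = item;
    return 1;
}

static void list_del(struct list_head *node)
{
    node->prev->next = node->next;
    node->next->prev = node->prev;
}

/* Simulated free: poison the node, keep the storage. */
static void poison_node(struct list_head *node)
{
    memset(node, POISON_FREE, sizeof(*node));
}

/* Buggy teardown: the node is "freed" while still linked. */
static int add_after_stale_free(void)
{
    struct list_head head, stale, fresh;

    list_init(&head);
    assert(checked_list_add(&stale, &head) == 1);
    poison_node(&stale);               /* head.next still points at stale */
    return checked_list_add(&fresh, &head);
}

/* Correct teardown: unlink first, then "free". */
static int add_after_clean_free(void)
{
    struct list_head head, gone, fresh;

    list_init(&head);
    assert(checked_list_add(&gone, &head) == 1);
    list_del(&gone);
    poison_node(&gone);
    return checked_list_add(&fresh, &head);
}
```

The stale case fails on the next, unrelated insertion because the neighbour's back pointer now reads as 0x6b6b..., the same pattern as in the list_add warning. That is why the victim shows up in code that has nothing to do with the real culprit.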

Randy, is this one also related to that ipv6 percpu list corruption?
IOW, does it go away with

http://patchwork.ozlabs.org/patch/69939/

like one of your other reports did?

And David - I think we need that patch merged. The error case for when
the percpu list entry is free'd without unlinking is _so_ annoying
(random crashes in totally unrelated code) that I think we need to get
that one closed asap. Hmm?

Linus

On Wed, Nov 3, 2010 at 4:16 PM, Randy Dunlap <[email protected]> wrote:
>
>
> [ 303.127418] calling floppy_module_init+0x0/0x93 [floppy] @ 5726
> [ 303.134577] ------------[ cut here ]------------
> [ 303.139329] WARNING: at /local/linsrc/lnx-2637-rc1/lib/list_debug.c:26 __list_add+0x4d/0xa5()
> [ 303.148248] Hardware name: OptiPlex GX620
> [ 303.153682] list_add corruption. next->prev should be prev (ffffffff81ae5e50), but was 6b6b6b6b6b6b6b6b. (next=ffff88006c908590).
> [ 303.165678] Modules linked in: floppy(+) ipt_MASQUERADE iptable_nat nf_nat af_packet nfsd lockd nfs_acl auth_rpcgss exportfs sco bridge stp llc bnep l2cap crc16 bluetooth rfkill sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 p4_clockmod freq_table speedstep_lib binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod kvm uinput mousedev joydev snd_intel8x0 snd_ac97_codec ppdev ac97_bus snd_seq snd_seq_device led_class snd_pcm iTCO_wdt snd_timer usbmouse iTCO_vendor_support usbkbd snd usbhid tg3 hid sg dcdbas sr_mod soundcore rtc_cmos pcspkr i2c_i801 cdrom parport_pc rng_core evdev snd_page_alloc shpchp rtc_core parport rtc_lib mac_hid pci_hotplug 8250_pnp unix ide_pci_generic ide_core ata_generic pata_acpi ata_piix sd_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ssb mmc_core pcmcia pcmcia_core firmware_class ehci_hcd usbcore nls_base i915 drm_kms_helper intel_agp button intel_gtt video thermal_sys hwmon output [last unloaded: mperf]
> [ 303.269866] Pid: 5726, comm: modprobe Not tainted 2.6.37-rc1 #10
> [ 303.275973] Call Trace:
> [ 303.278775] [<ffffffff8107e1ed>] warn_slowpath_common+0xc6/0xf3
> [ 303.284885] [<ffffffff812d3582>] ? __percpu_counter_init+0x9e/0xdf
> [ 303.292959] [<ffffffff8107e30a>] warn_slowpath_fmt+0x5b/0x6b
> [ 303.299754] [<ffffffff812cf38b>] __list_add+0x4d/0xa5
> [ 303.306249] [<ffffffff812d359f>] __percpu_counter_init+0xbb/0xdf
> [ 303.314093] [<ffffffff8117c577>] bdi_init+0x13f/0x1c2
> [ 303.320270] [<ffffffffa0bdb3bb>] ? do_fd_request+0x0/0x111 [floppy]
> [ 303.328429] [<ffffffffa0bdb3bb>] ? do_fd_request+0x0/0x111 [floppy]
> [ 303.335791] [<ffffffff812a7444>] blk_alloc_queue_node+0x8f/0x220
> [ 303.343257] [<ffffffff812a773b>] blk_init_queue_node+0x30/0x90
> [ 303.350874] [<ffffffff812a77b3>] blk_init_queue+0x18/0x21
> [ 303.357447] [<ffffffffa0bf29da>] floppy_init+0x95/0x7c0 [floppy]
> [ 303.365279] [<ffffffff81017b19>] ? read_tsc+0x17/0x29
> [ 303.371436] [<ffffffffa0bf3105>] ? floppy_module_init+0x0/0x93 [floppy]
> [ 303.379517] [<ffffffffa0bf318d>] floppy_module_init+0x88/0x93 [floppy]
> [ 303.387890] [<ffffffffa0bf3105>] ? floppy_module_init+0x0/0x93 [floppy]
> [ 303.395650] [<ffffffff810020a6>] do_one_initcall+0x6c/0x1ef
> [ 303.403068] [<ffffffff810d52f3>] sys_init_module+0xe1/0x2a5
> [ 303.409709] [<ffffffff8100ea72>] system_call_fastpath+0x16/0x1b
> [ 303.419712] ---[ end trace e26c2a9ce976be75 ]---
> [ 303.429812] Floppy drive(s): fd0 is 1.44M
>
> [ 306.480304] floppy0: no floppy controllers found
> [ 306.488160] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
> [ 306.492030] last sysfs file: /sys/devices/pci0000:00/0000:00:1d.1/usb3/3-1/3-1.3/devnum
> [ 306.492030] CPU 0
> [ 306.492030] Modules linked in: floppy(+) ipt_MASQUERADE iptable_nat nf_nat af_packet nfsd lockd nfs_acl auth_rpcgss exportfs sco bridge stp llc bnep l2cap crc16 bluetooth rfkill sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 p4_clockmod freq_table speedstep_lib binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod kvm uinput mousedev joydev snd_intel8x0 snd_ac97_codec ppdev ac97_bus snd_seq snd_seq_device led_class snd_pcm iTCO_wdt snd_timer usbmouse iTCO_vendor_support usbkbd snd usbhid tg3 hid sg dcdbas sr_mod soundcore rtc_cmos pcspkr i2c_i801 cdrom parport_pc rng_core evdev snd_page_alloc shpchp rtc_core parport rtc_lib mac_hid pci_hotplug 8250_pnp unix ide_pci_generic ide_core ata_generic pata_acpi ata_piix sd_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ssb mmc_core pcmcia pcmcia_core firmware_class
> ehci_hcd usbcore nls_base i915 drm_kms_helper intel_agp button intel_gtt video thermal_sys hwmon output [last unloaded: mperf]
> [ 306.626211]
> [ 306.626211] Pid: 5726, comm: modprobe Tainted: G W 2.6.37-rc1 #10 0HH807/OptiPlex GX620
> [ 306.626211] RIP: 0010:[<ffffffff810c309f>] [<ffffffff810c309f>] __lock_acquire+0xd8/0x4e8
> [ 306.626211] RSP: 0018:ffff88006d38dd48 EFLAGS: 00010002
> [ 306.626211] RAX: 0000000000000006 RBX: 6b6b6b6b6b6b6d13 RCX: 0000000000000000
> [ 306.626211] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 6b6b6b6b6b6b6d13
> [ 306.626211] RBP: ffff88006d38dda8 R08: 0000000000000001 R09: 0000000000000001
> [ 306.626211] R10: ffffffff81812d48 R11: ffff88006d38de78 R12: 0000000000000000
> [ 306.626211] R13: ffff88006cb13000 R14: 0000000000000000 R15: 0000000000000000
> [ 306.626211] FS: 00007f248ff436f0(0000) GS:ffff88007c600000(0000) knlGS:0000000000000000
> [ 306.626211] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 306.626211] CR2: 000000000064b000 CR3: 000000006c976000 CR4: 00000000000006f0
> [ 306.626211] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 306.626211] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 306.626211] Process modprobe (pid: 5726, threadinfo ffff88006d38c000, task ffff88006cb13000)
> [ 306.626211] Stack:
> [ 306.783845] 0000000000000202 ffffffff812b10c2 ffff88006d38dd88 ffffffff811b2af3
> [ 306.787082] ffff88006d24ca88 000000006d24ca88 ffff88006d38dd88 0000000000000000
> [ 306.787082] ffff88006cb13000 ffffffff81092650 0000000000000001 0000000000000000
> [ 306.787082] Call Trace:
> [ 306.787082] [<ffffffff812b10c2>] ? disk_release+0x97/0xa3
> [ 306.787082] [<ffffffff811b2af3>] ? __slab_free+0x1b9/0x1d6
> [ 306.787082] [<ffffffff81092650>] ? del_timer_sync+0x50/0x15c
> [ 306.787082] [<ffffffff810c35af>] lock_acquire+0x100/0x150
> [ 306.787082] [<ffffffff81092650>] ? del_timer_sync+0x50/0x15c
> [ 306.787082] [<ffffffff81092694>] del_timer_sync+0x94/0x15c
> [ 306.874354] [<ffffffff81092650>] ? del_timer_sync+0x50/0x15c
> [ 306.874354] [<ffffffff812a763f>] blk_sync_queue+0x24/0x55
> [ 306.874354] [<ffffffff812a7692>] blk_cleanup_queue+0x22/0x9b
> [ 306.874354] [<ffffffffa0bf30e1>] floppy_init+0x79c/0x7c0 [floppy]
> [ 306.874354] [<ffffffff81017b19>] ? read_tsc+0x17/0x29
> [ 306.874354] [<ffffffffa0bf3105>] ? floppy_module_init+0x0/0x93 [floppy]
> [ 306.874354] [<ffffffffa0bf318d>] floppy_module_init+0x88/0x93 [floppy]
> [ 306.874354] [<ffffffffa0bf3105>] ? floppy_module_init+0x0/0x93 [floppy]
> [ 306.940576] [<ffffffff810020a6>] do_one_initcall+0x6c/0x1ef
> [ 306.943732] [<ffffffff810d52f3>] sys_init_module+0xe1/0x2a5
> [ 306.943732] [<ffffffff8100ea72>] system_call_fastpath+0x16/0x1b
> [ 306.943732] Code: 05 4f 15 72 01 e8 9c b1 fb ff 48 ff 05 4b 15 72 01 48 ff 05 4c 15 72 01 48 ff 05 55 15 72 01 e9 e3 03 00 00 48 ff 05 41 15 72 01 <48> 81 3b 40 34 0b 82 75 07 48 ff 05 41 15 72 01 83 fe 01 77 13
> [ 306.982675] RIP [<ffffffff810c309f>] __lock_acquire+0xd8/0x4e8
> [ 306.982675] RSP <ffff88006d38dd48>
> [ 306.982675] ---[ end trace e26c2a9ce976be76 ]---
>
>
> ---
> ~Randy
> *** Remember to use Documentation/SubmitChecklist when testing your code ***
>

2010-11-05 22:12:14

by David Miller

Subject: Re: Linux 2.6.37-rc1 (floppy module load: no device found)

From: Linus Torvalds <[email protected]>
Date: Fri, 5 Nov 2010 15:10:07 -0700

> Randy, is this one also related to that ipv6 percpu list corruption?
> IOW, does it go away with
>
> http://patchwork.ozlabs.org/patch/69939/
>
> like one of your other reports did?
>
> And David - I think we need that patch merged. The error case for when
> the percpu list entry is free'd without unlinking is _so_ annoying
> (random crashes in totally unrelated code) that I think we need to get
> that one closed asap. Hmm?

Agreed, I'll push this to you right now.

2010-11-05 22:25:32

by Linus Torvalds

Subject: Re: Linux 2.6.37-rc1 (libipw remove_proc_entry warning)

This bug seems to be due to commit 27ae60f8f7aac ("ipw2x00: replace
"ieee80211" with "libipw" where appropriate"), where Pavel did this:

- libipw_proc = proc_mkdir(DRV_NAME, init_net.proc_net);
+ libipw_proc = proc_mkdir("ieee80211", init_net.proc_net);

but then the cleanup was kept as

remove_proc_entry(DRV_NAME, init_net.proc_net);

in both places (both in the failure case and in the unload case). The
error string is also total crap, and says

"Unable to create " DRV_NAME " proc directory\n");

Even though it doesn't actually create a proc directory named DRV_NAME at all.

So that patch looks like total and utter crap to me. The commit message says

"Keep /proc/net/ieee80211 under the original name to avoid breaking user
interface."

but the thing is, it really didn't fix anything but that one create
thing. It needs to fix all the other cases too.
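
To make the mismatch concrete, here is a minimal userland sketch in C. It is not kernel code and not the attached patch: `demo_mkdir()`, `demo_remove()` and the `DEMO_PROC_NAME` macro are hypothetical stand-ins for `proc_mkdir()`, `remove_proc_entry()` and whatever name a real fix would share between them. It only shows why creating under "ieee80211" and removing under DRV_NAME fails, and why one shared name avoids it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DRV_NAME         "libipw"     /* module name, as quoted in the thread */
#define DEMO_PROC_NAME   "ieee80211"  /* hypothetical: the single directory name */
#define DEMO_MAX_ENTRIES 8

/* Toy stand-in for a proc directory: a fixed table of entry names. */
static const char *demo_entries[DEMO_MAX_ENTRIES];

/* Register a name; returns 0 on success, -1 if the table is full. */
static int demo_mkdir(const char *name)
{
	assert(name != NULL);
	for (int i = 0; i < DEMO_MAX_ENTRIES; i++) {
		if (!demo_entries[i]) {
			demo_entries[i] = name;
			return 0;
		}
	}
	return -1;
}

/* Remove a name; returns 0 on success, -1 if no such entry exists. */
static int demo_remove(const char *name)
{
	for (int i = 0; i < DEMO_MAX_ENTRIES; i++) {
		if (demo_entries[i] && strcmp(demo_entries[i], name) == 0) {
			demo_entries[i] = NULL;
			return 0;
		}
	}
	fprintf(stderr, "warning: no entry named '%s'\n", name);
	return -1;
}

/* Mismatched pair: create under one name, remove under another. */
static int demo_buggy_cycle(void)
{
	if (demo_mkdir("ieee80211"))
		return -1;
	int ret = demo_remove(DRV_NAME);  /* fails: "libipw" was never created */
	demo_remove("ieee80211");         /* tidy up so the table is empty again */
	return ret;
}

/* Matched pair: both sides use the same macro. */
static int demo_fixed_cycle(void)
{
	if (demo_mkdir(DEMO_PROC_NAME))
		return -1;
	return demo_remove(DEMO_PROC_NAME);
}
```

In this sketch `demo_buggy_cycle()` returns -1 and prints a warning naming 'libipw', loosely analogous to the "name 'libipw'" line in the remove_proc_entry warning quoted below, while `demo_fixed_cycle()` returns 0.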

Totally UNTESTED patch attached. It may or may not compile. And maybe
it doesn't catch all cases, but it should catch the obvious ones.

Linus

On Wed, Nov 3, 2010 at 4:18 PM, Randy Dunlap <[email protected]> wrote:
>
> Nov 3 16:03:11 control kernel: [ 74.701367] (5170000[ 74.846676] calling libipw_init+0x0/0xe4 [libipw] @ 2992
> [ 74.852356] libipw: 802.11 data/management/control stack, git-1.1.13
> [ 74.858790] libipw: Copyright (C) 2004-2005 Intel Corporation <[email protected]>
> [ 74.866977] initcall libipw_init+0x0/0xe4 [libipw] returned 0 after 14318 usecs
>
>
> [ 78.273409] ------------[ cut here ]------------
> [ 78.278210] WARNING: at /local/linsrc/lnx-2637-rc1/fs/proc/generic.c:816 remove_proc_entry+0x156/0x35e()
> [ 78.288314] Hardware name: OptiPlex GX620
> [ 78.294870] name 'libipw'
> [ 78.298520] Modules linked in: libipw(-) lib80211 cfg80211 ipt_MASQUERADE iptable_nat nf_nat af_packet nfsd lockd nfs_acl auth_rpcgss exportfs sco bridge stp llc bnep l2cap crc16 bluetooth rfkill sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 p4_clockmod freq_table speedstep_lib binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod kvm uinput mousedev joydev snd_intel8x0 snd_ac97_codec ac97_bus ppdev snd_seq snd_seq_device usbmouse snd_pcm led_class usbkbd snd_timer usbhid iTCO_wdt hid snd iTCO_vendor_support tg3 sr_mod dcdbas cdrom soundcore i2c_i801 rtc_cmos sg pcspkr rng_core parport_pc snd_page_alloc rtc_core shpchp evdev rtc_lib parport 8250_pnp pci_hotplug mac_hid unix ide_pci_generic ide_core ata_generic pata_acpi ata_piix sd_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ssb mmc_core pcmcia pcmcia_core firmware_class ehci_hcd usbcore nls_base i915 drm_kms_helper intel_agp button intel_gtt video thermal_sys hwmon output [last unloaded: mperf]
> [ 78.429364] Pid: 3067, comm: rmmod Not tainted 2.6.37-rc1 #10
> [ 78.435209] Call Trace:
> [ 78.437982] [<ffffffff8107e1ed>] warn_slowpath_common+0xc6/0xf3
> [ 78.444119] [<ffffffff8107e30a>] warn_slowpath_fmt+0x5b/0x6b
> [ 78.449973] [<ffffffff81550dbb>] ? _raw_spin_unlock+0x40/0x4b
> [ 78.456247] [<ffffffff8123d37e>] remove_proc_entry+0x156/0x35e
> [ 78.462289] [<ffffffff81017d29>] ? native_sched_clock+0x3b/0x6d
> [ 78.470047] [<ffffffff810b067f>] ? sched_clock_cpu+0x147/0x160
> [ 78.477041] [<ffffffff81137479>] ? trace_hardirqs_off+0x10/0x4a
> [ 78.487138] [<ffffffff810b0742>] ? local_clock+0xaa/0xf4
> [ 78.496297] [<ffffffff810c0ab9>] ? lock_release_holdtime+0x41/0x177
> [ 78.505725] [<ffffffff810c3a1b>] ? lock_release_nested+0xfb/0x133
> [ 78.514477] [<ffffffffa0c39cad>] libipw_exit+0x49/0x5d [libipw]
> [ 78.523311] [<ffffffff810d3310>] sys_delete_module+0x2d6/0x368
> [ 78.531755] [<ffffffff8154f9cb>] ? lockdep_sys_exit_thunk+0x35/0x67
> [ 78.540954] [<ffffffff810fd60f>] ? audit_syscall_entry+0x172/0x1a5
> [ 78.550029] [<ffffffff8154f955>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [ 78.559196] [<ffffffff8100ea72>] system_call_fastpath+0x16/0x1b
> [ 78.567700] ---[ end trace b9ae9f3ab8d89ea5 ]---
>
>
> ---
> ~Randy
> *** Remember to use Documentation/SubmitChecklist when testing your code ***
>


Attachments:
patch.diff (1.59 kB)

2010-11-05 23:06:08

by Randy Dunlap

Subject: Re: Linux 2.6.37-rc1 (floppy module load: no device found)

On 11/05/10 15:10, Linus Torvalds wrote:
> That first warning says that something stayed on a list even though it
> was released (the 0x6b thing is the kmalloc free poison pattern). And
> the oops looks related to something similar.
>
> Randy, is this one also related to that ipv6 percpu list corruption?
> IOW, does it go away with
>
> http://patchwork.ozlabs.org/patch/69939/
>
> like one of your other reports did?

The list_debug.c message goes away, but it still gets the GP fault.

> And David - I think we need that patch merged. The error case for when
> the percpu list entry is free'd without unlinking is _so_ annoying
> (random crashes in totally unrelated code) that I think we need to get
> that one closed asap. Hmm?
>
> Linus

New GP fault message, on 2.6.37-rc1-git3 + several patches:

[ 93.235463] calling floppy_module_init+0x0/0x93 [floppy] @ 3220
[ 93.243305] Floppy drive(s): fd0 is 1.44M
[ 96.272319] floppy0: no floppy controllers found
[ 96.278224] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 96.280934] last sysfs file: /sys/devices/pci0000:00/0000:00:1d.1/usb3/3-1/3-1.3/devnum
[ 96.280934] CPU 1
[ 96.280934] Modules linked in: floppy(+) ipt_MASQUERADE iptable_nat nf_nat af_packet nfsd lockd nfs_acl auth_rpcgss exportfs sco bridge stp llc bnep l2cap crc16 bluetooth rfkill sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 p4_clockmod freq_table speedstep_lib binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod kvm uinput mousedev joydev ppdev snd_intel8x0 snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm usbmouse usbkbd snd_timer led_class snd usbhid iTCO_wdt tg3 hid iTCO_vendor_support dcdbas sg sr_mod i2c_i801 rtc_cmos pcspkr soundcore evdev rtc_core parport_pc rng_core cdrom shpchp snd_page_alloc rtc_lib mac_hid parport pci_hotplug 8250_pnp unix ide_pci_generic ide_core ata_generic pata_acpi ata_piix sd_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ssb mmc_core pcmcia pcmcia_core firmware_class
ehci_hcd usbcore nls_base i915 drm_kms_helper intel_agp button intel_gtt video thermal_sys hwmon output [last unloaded: mperf]
[ 96.312371]
[ 96.312371] Pid: 3220, comm: modprobe Not tainted 2.6.37-rc1-git3 #1 0HH807/OptiPlex GX620
[ 96.312371] RIP: 0010:[<ffffffff810c304b>] [<ffffffff810c304b>] __lock_acquire+0xd8/0x4e8
[ 96.312371] RSP: 0018:ffff88006c8c3d48 EFLAGS: 00010002
[ 96.312371] RAX: 0000000000000002 RBX: 6b6b6b6b6b6b6d13 RCX: 0000000000000000
[ 96.312371] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 6b6b6b6b6b6b6d13
[ 96.312371] RBP: ffff88006c8c3da8 R08: 0000000000000001 R09: 0000000000000001
[ 96.312371] R10: ffffffff81814820 R11: ffff88006c8c3e78 R12: 0000000000000000
[ 96.312371] R13: ffff88006c9e8000 R14: 0000000000000000 R15: 0000000000000000
[ 96.312371] FS: 00007f0fa99786f0(0000) GS:ffff88007c800000(0000) knlGS:0000000000000000
[ 96.312371] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 96.312371] CR2: 00007fff7cde9040 CR3: 000000006c8da000 CR4: 00000000000006e0
[ 96.312371] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 96.312371] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 96.312371] Process modprobe (pid: 3220, threadinfo ffff88006c8c2000, task ffff88006c9e8000)
[ 96.312371] Stack:
[ 96.312371] 0000000000000206 ffffffff812b04ee ffff88006c8c3d88 ffffffff811b1f1b
[ 96.312371] ffff88006d2f6ba8 000000006d2f6ba8 ffff88006c8c3d88 0000000000000000
[ 96.312371] ffff88006c9e8000 ffffffff81092644 0000000000000001 0000000000000000
[ 96.312371] Call Trace:
[ 96.312371] [<ffffffff812b04ee>] ? disk_release+0x97/0xa3
[ 96.312371] [<ffffffff811b1f1b>] ? __slab_free+0x1b9/0x1d6
[ 96.312371] [<ffffffff81092644>] ? del_timer_sync+0x50/0x15c
[ 96.312371] [<ffffffff810c355b>] lock_acquire+0x100/0x150
[ 96.312371] [<ffffffff81092644>] ? del_timer_sync+0x50/0x15c
[ 96.312371] [<ffffffff81092688>] del_timer_sync+0x94/0x15c
[ 96.312371] [<ffffffff81092644>] ? del_timer_sync+0x50/0x15c
[ 96.312371] [<ffffffff812a6a6a>] blk_sync_queue+0x24/0x55
[ 96.312371] [<ffffffff812a6abd>] blk_cleanup_queue+0x22/0x9b
[ 96.312371] [<ffffffffa0bd90e1>] floppy_init+0x79c/0x7c0 [floppy]
[ 96.312371] [<ffffffff81017b19>] ? read_tsc+0x17/0x29
[ 96.312371] [<ffffffffa0bd9105>] ? floppy_module_init+0x0/0x93 [floppy]
[ 96.312371] [<ffffffffa0bd918d>] floppy_module_init+0x88/0x93 [floppy]
[ 96.312371] [<ffffffffa0bd9105>] ? floppy_module_init+0x0/0x93 [floppy]
[ 96.312371] [<ffffffff810020a6>] do_one_initcall+0x6c/0x1ef
[ 96.312371] [<ffffffff810d529f>] sys_init_module+0xe1/0x2a5
[ 96.312371] [<ffffffff8100ea72>] system_call_fastpath+0x16/0x1b
[ 96.312371] Code: 05 63 45 72 01 e8 e4 b1 fb ff 48 ff 05 5f 45 72 01 48 ff 05 60 45 72 01 48 ff 05 69 45 72 01 e9 e3 03 00 00 48 ff 05 55 45 72 01 <48> 81 3b 00 64 0b 82 75 07 48 ff 05 55 45 72 01 83 fe 01 77 13
[ 96.312371] RIP [<ffffffff810c304b>] __lock_acquire+0xd8/0x4e8
[ 96.312371] RSP <ffff88006c8c3d48>
[ 96.312371] ---[ end trace 7bbc73ed1fe8d49d ]---



--
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2010-11-05 23:07:28

by Randy Dunlap

Subject: Re: Linux 2.6.37-rc1 (libipw remove_proc_entry warning)

On 11/05/10 15:24, Linus Torvalds wrote:
> This bug seems to be due to commit 27ae60f8f7aac ("ipw2x00: replace
> "ieee80211" with "libipw" where appropriate"), where Pavel did this:
>
> - libipw_proc = proc_mkdir(DRV_NAME, init_net.proc_net);
> + libipw_proc = proc_mkdir("ieee80211", init_net.proc_net);
>
> but then the cleanup was kept as
>
> remove_proc_entry(DRV_NAME, init_net.proc_net);
>
> in both places (both in the failure case and in the unload case). The
> error string is also total crap, and says
>
> "Unable to create " DRV_NAME " proc directory\n");
>
> Even though it doesn't actually create a proc directory named DRV_NAME at all.
>
> So that patch looks like total and utter crap to me. The commit message says
>
> "Keep /proc/net/ieee80211 under the original name to avoid breaking user
> interface."
>
> but the thing is, it really didn't fix anything but that one create
> thing. It needs to fix all the other cases too.
>
> Totally UNTESTED patch attached. It may or may not compile. And maybe
> it doesn't catch all cases, but it should catch the obvious ones.

That works for me.

Tested-by: Randy Dunlap <[email protected]>


thanks,
--
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2010-11-05 23:38:05

by Linus Torvalds

Subject: Re: Linux 2.6.37-rc1 (floppy module load: no device found)

On Fri, Nov 5, 2010 at 4:03 PM, Randy Dunlap <[email protected]> wrote:
> On 11/05/10 15:10, Linus Torvalds wrote:
>>
>> Randy, is this one also related to that ipv6 percpu list corruption?
>> IOW, does it go away with
>>
>> http://patchwork.ozlabs.org/patch/69939/
>>
>> like one of your other reports did?
>
> The list_debug.c message goes away, but it still gets the GP fault.

Ok. I think there's a separate floppy.c bug introduced in commit
488211844e0c ("floppy: switch to one queue per drive instead of
sharing a queue").

We do "put_disk()" on the disk device _before_ we then clean up the
queue associated with that disk.
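
The ordering problem can be sketched in plain userland C. This is only an illustration under stated assumptions, not the floppy driver or the attached patch: `struct demo_disk`, `demo_put_disk()` and `demo_cleanup_queue()` are hypothetical stand-ins for the gendisk, `put_disk()` and `blk_cleanup_queue()`, and "freeing" is simulated by overwriting the object with the 0x6b poison byte mentioned earlier in the thread, so the sketch never touches freed memory.

```c
#include <assert.h>
#include <string.h>

#define DEMO_POISON 0x6b  /* byte value the thread cites as the free poison pattern */

struct demo_queue { int cleaned; };

struct demo_disk {
	int refcount;
	struct demo_queue *queue;
};

/* Drop a reference; on the last put, simulate a poisoned free. */
static void demo_put_disk(struct demo_disk *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0)
		memset(d, DEMO_POISON, sizeof(*d));
}

/* Returns 1 if every byte of the field holds the poison value. */
static int demo_is_poison(const void *field, size_t len)
{
	const unsigned char *b = field;

	for (size_t i = 0; i < len; i++)
		if (b[i] != DEMO_POISON)
			return 0;
	return 1;
}

/* Clean up the queue reached through the disk; -1 if the disk is already gone. */
static int demo_cleanup_queue(struct demo_disk *d)
{
	if (demo_is_poison(&d->queue, sizeof(d->queue)))
		return -1;  /* a real kernel would dereference the poison and fault */
	d->queue->cleaned = 1;
	return 0;
}

/* Order reported as buggy: put the disk, then reach through it. */
static int demo_teardown_buggy(void)
{
	struct demo_queue q = { 0 };
	struct demo_disk d = { .refcount = 1, .queue = &q };

	demo_put_disk(&d);
	return demo_cleanup_queue(&d);
}

/* Reordered: clean up the queue first, then put the disk. */
static int demo_teardown_fixed(void)
{
	struct demo_queue q = { 0 };
	struct demo_disk d = { .refcount = 1, .queue = &q };
	int ret = demo_cleanup_queue(&d);

	demo_put_disk(&d);
	return ret;
}
```

Here `demo_teardown_buggy()` returns -1 because the queue pointer has already been poisoned by the final put, and `demo_teardown_fixed()` returns 0.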

So maybe this trivial patch is in order?

Again - UNTESTED. Jens, Vivek?

Linus


Attachments:
patch.diff (477.00 B)

2010-11-05 23:39:08

by Linus Torvalds

Subject: Re: Linux 2.6.37-rc1 (libipw remove_proc_entry warning)

On Fri, Nov 5, 2010 at 4:06 PM, Randy Dunlap <[email protected]> wrote:
> On 11/05/10 15:24, Linus Torvalds wrote:
>>
>> Totally UNTESTED patch attached. It may or may not compile. And maybe
>> it doesn't catch all cases, but it should catch the obvious ones.
>
> That works for me.
>
> Tested-by: Randy Dunlap <[email protected]>

Goodie. Pavel, John - feel free to add my sign-off on that patch. Make
up a relevant commit message. Ok?

Linus

2010-11-06 00:34:59

by Randy Dunlap

Subject: Re: Linux 2.6.37-rc1 (floppy module load: no device found)

On 11/05/10 16:36, Linus Torvalds wrote:
> On Fri, Nov 5, 2010 at 4:03 PM, Randy Dunlap <[email protected]> wrote:
>> On 11/05/10 15:10, Linus Torvalds wrote:
>>>
>>> Randy, is this one also related to that ipv6 percpu list corruption?
>>> IOW, does it go away with
>>>
>>> http://patchwork.ozlabs.org/patch/69939/
>>>
>>> like one of your other reports did?
>>
>> The list_debug.c message goes away, but it still gets the GP fault.
>
> Ok. I think there's a separate floppy.c bug introduced in commit
> 488211844e0c ("floppy: switch to one queue per drive instead of
> sharing a queue").
>
> We do "put_disk()" on the disk device _before_ we then clean up the
> queue associated with that disk.
>
> So maybe this trivial patch is in order?
>
> Again - UNTESTED. Jens, Vivek?

That survives load/unload 3 times.

Tested-by: Randy Dunlap <[email protected]>


--
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2010-11-06 12:22:46

by Vivek Goyal

Subject: Re: Linux 2.6.37-rc1 (floppy module load: no device found)

On Fri, Nov 05, 2010 at 04:36:59PM -0700, Linus Torvalds wrote:
> On Fri, Nov 5, 2010 at 4:03 PM, Randy Dunlap <[email protected]> wrote:
> > On 11/05/10 15:10, Linus Torvalds wrote:
> >>
> >> Randy, is this one also related to that ipv6 percpu list corruption?
> >> IOW, does it go away with
> >>
> >> http://patchwork.ozlabs.org/patch/69939/
> >>
> >> like one of your other reports did?
> >
> > The list_debug.c message goes away, but it still gets the GP fault.
>
> Ok. I think there's a separate floppy.c bug introduced in commit
> 488211844e0c ("floppy: switch to one queue per drive instead of
> sharing a queue").
>
> We do "put_disk()" on the disk device _before_ we then clean up the
> queue associated with that disk.
>
> So maybe this trivial patch is in order?
>
> Again - UNTESTED. Jens, Vivek?

This one looks good to me.

While scanning the floppy code, I found one more instance of trying to
access the disk->queue pointer after doing put_disk() on the gendisk. For
some reason, the floppy module still loads/unloads fine. Maybe the object
is still around with the right pointer values.


o There seems to be one more instance of trying to clean up the request queue
after we have called put_disk() on the associated gendisk.

o This fix comes from code inspection. Even without it, for some reason I
am able to load/unload the floppy module without any issues.

o Floppy module loads/unloads fine after the fix.

Signed-off-by: Vivek Goyal <[email protected]>
---
drivers/block/floppy.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/drivers/block/floppy.c
===================================================================
--- linux-2.6.orig/drivers/block/floppy.c 2010-11-06 07:49:29.000000000 -0400
+++ linux-2.6/drivers/block/floppy.c 2010-11-06 08:03:37.646062993 -0400
@@ -4573,8 +4573,8 @@ static void __exit floppy_module_exit(vo
 			device_remove_file(&floppy_device[drive].dev, &dev_attr_cmos);
 			platform_device_unregister(&floppy_device[drive]);
 		}
-		put_disk(disks[drive]);
 		blk_cleanup_queue(disks[drive]->queue);
+		put_disk(disks[drive]);
 	}
 
 	del_timer_sync(&fd_timeout);

2010-11-06 14:53:14

by Linus Torvalds

Subject: Re: Linux 2.6.37-rc1 (floppy module load: no device found)

On Sat, Nov 6, 2010 at 5:16 AM, Vivek Goyal <[email protected]> wrote:
>
> While scanning the floppy code, I found one more instance of trying to
> access the disk->queue pointer after doing put_disk() on the gendisk. For
> some reason, the floppy module still loads/unloads fine. Maybe the object
> is still around with the right pointer values.

Yes - the normal use-after-free is fairly silent and only causes
problems if something re-allocates the same memory immediately, which
is quite a small race to hit under normal load.

But if you had had slab poisoning on, you'd have seen the same oops
Randy did (well, not the exact same one since the call trace would be
slightly different due to being from a different point, but _very_
similar).

Anyway, applied.

Linus

2010-11-08 21:45:00

by John W. Linville

Subject: Re: Linux 2.6.37-rc1 (libipw remove_proc_entry warning)

On Fri, Nov 05, 2010 at 04:38:14PM -0700, Linus Torvalds wrote:
> On Fri, Nov 5, 2010 at 4:06 PM, Randy Dunlap <[email protected]> wrote:
> > On 11/05/10 15:24, Linus Torvalds wrote:
> >>
> >> Totally UNTESTED patch attached. It may or may not compile. And maybe
> >> it doesn't catch all cases, but it should catch the obvious ones.
> >
> > That works for me.
> >
> > Tested-by: Randy Dunlap <[email protected]>
>
> Goodie. Pavel, John - feel free to add my sign-off on that patch. Make
> up a relevant commit message. Ok?

In the queue...

commit 269e2d77b82d92d8dad543a2375e74372e9d773e
Author: Linus Torvalds <[email protected]>
Date: Mon Nov 8 16:27:12 2010 -0500

libipw: fix proc entry removal

This bug seems to be due to commit 27ae60f8f7aac ("ipw2x00: replace
"ieee80211" with "libipw" where appropriate"), where Pavel did this:

- libipw_proc = proc_mkdir(DRV_NAME, init_net.proc_net);
+ libipw_proc = proc_mkdir("ieee80211", init_net.proc_net);

but then the cleanup was kept as

remove_proc_entry(DRV_NAME, init_net.proc_net);

in both places (both in the failure case and in the unload case). The
error string is also total crap, and says

"Unable to create " DRV_NAME " proc directory\n");

Even though it doesn't actually create a proc directory named DRV_NAME at all.

So that patch looks like total and utter crap to me. The commit message says

"Keep /proc/net/ieee80211 under the original name to avoid breaking user
interface."

but the thing is, it really didn't fix anything but that one create
thing. It needs to fix all the other cases too.

Signed-off-by: Linus Torvalds <[email protected]>
Tested-by: Randy Dunlap <[email protected]>
Signed-off-by: John W. Linville <[email protected]>

Sorry for the delay -- I'm not a good traveler!
--
John W. Linville Someday the world will need a hero, and you
[email protected] might be all we have. Be ready.

2010-11-10 11:32:46

by Steffen Klassert

Subject: Re: Linux 2.6.37-rc1 (pcrypt fault)

On Wed, Nov 03, 2010 at 02:15:19PM -0700, Randy Dunlap wrote:
>
> modprobe pcrypt; rmmod pcrypt ==>
>
>
> [ 76.081639] calling pcrypt_init+0x0/0x107 [pcrypt] @ 3016
> Nov 3 13:02:15 control kernel: [ 76.089883] initcall pcrypt_init+0x0/0x107 [pcrypt] returned 0 after 2476 usecs
>
>
> [ 79.940445] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC

Looks like a use after free of the padata instance.
Does the patch below fix it?

Thanks for reporting,

Steffen


Subject: [PATCH] crypto: pcrypt - Fix use after free on padata_free

kobject_put is called from padata_free for the padata kobject.
The kobject's release function frees the padata instance,
so don't call kobject_put for the padata kobject from pcrypt.

Signed-off-by: Steffen Klassert <[email protected]>
---
crypto/pcrypt.c | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/crypto/pcrypt.c b/crypto/pcrypt.c
index de30782..75586f1 100644
--- a/crypto/pcrypt.c
+++ b/crypto/pcrypt.c
@@ -504,7 +504,6 @@ err:

 static void pcrypt_fini_padata(struct padata_pcrypt *pcrypt)
 {
-	kobject_put(&pcrypt->pinst->kobj);
 	free_cpumask_var(pcrypt->cb_cpumask->mask);
 	kfree(pcrypt->cb_cpumask);

--
1.7.0.4
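
The double put described in the changelog can be sketched in userland C. This is an illustration only, under the assumption that the caller's extra put runs before the "free" helper, which itself drops the reference. `struct demo_obj`, `demo_put()` and `demo_instance_free()` are hypothetical stand-ins for the kobject, `kobject_put()` and `padata_free()`; the release callback just counts calls instead of freeing, so the sketch stays well defined.

```c
#include <assert.h>
#include <stddef.h>

struct demo_obj {
	int refcount;
	int released;  /* how many times the release callback ran */
};

/* Stand-in for the release function; a real one would free the object. */
static void demo_release(struct demo_obj *o)
{
	o->released++;
}

/* Drop one reference; run release when the count reaches zero. */
static void demo_put(struct demo_obj *o)
{
	assert(o != NULL);
	if (--o->refcount == 0)
		demo_release(o);
}

/* Stand-in for a "free" helper that already drops the reference itself. */
static void demo_instance_free(struct demo_obj *o)
{
	demo_put(o);
}

/* Caller puts the object and then also calls the free helper. */
static int demo_buggy_fini(void)
{
	struct demo_obj o = { .refcount = 1, .released = 0 };

	demo_put(&o);            /* last reference gone, release runs here */
	demo_instance_free(&o);  /* in the kernel this would touch freed memory */
	return o.refcount;       /* count has gone below zero */
}

/* Caller leaves the put to the free helper. */
static int demo_fixed_fini(void)
{
	struct demo_obj o = { .refcount = 1, .released = 0 };

	demo_instance_free(&o);
	return o.refcount;
}
```

With one initial reference, `demo_buggy_fini()` ends with a count of -1, the sketch's analogue of the use after free, and `demo_fixed_fini()` ends with 0.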

2010-11-10 18:13:30

by Randy Dunlap

Subject: Re: Linux 2.6.37-rc1 (pcrypt fault)

On 11/10/10 03:21, Steffen Klassert wrote:
> On Wed, Nov 03, 2010 at 02:15:19PM -0700, Randy Dunlap wrote:
>>
>> modprobe pcrypt; rmmod pcrypt ==>
>>
>>
>> [ 76.081639] calling pcrypt_init+0x0/0x107 [pcrypt] @ 3016
>> Nov 3 13:02:15 control kernel: [ 76.089883] initcall pcrypt_init+0x0/0x107 [pcrypt] returned 0 after 2476 usecs
>>
>>
>> [ 79.940445] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
>
> Looks like a use after free of the padata instance.
> Does the patch below fix it?

Yes, it does. Thanks.

Tested-by: Randy Dunlap <[email protected]>


> Thanks for reporting,
>
> Steffen
>
>
> Subject: [PATCH] crypto: pcrypt - Fix use after free on padata_free
>
> kobject_put is called from padata_free for the padata kobject.
> The kobject's release function frees the padata instance,
> so don't call kobject_put for the padata kobject from pcrypt.
>
> Signed-off-by: Steffen Klassert <[email protected]>
> ---
> crypto/pcrypt.c | 1 -
> 1 files changed, 0 insertions(+), 1 deletions(-)
>
> diff --git a/crypto/pcrypt.c b/crypto/pcrypt.c
> index de30782..75586f1 100644
> --- a/crypto/pcrypt.c
> +++ b/crypto/pcrypt.c
> @@ -504,7 +504,6 @@ err:
>
>  static void pcrypt_fini_padata(struct padata_pcrypt *pcrypt)
>  {
> -	kobject_put(&pcrypt->pinst->kobj);
>  	free_cpumask_var(pcrypt->cb_cpumask->mask);
>  	kfree(pcrypt->cb_cpumask);
>


--
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***