2010-12-18 00:04:16

by Christian Hesse

Subject: Thunar crashes kernel

Hello everybody,

I'm running kernel 2.6.36.2 from Arch Linux, patched with autogroup. It has
been perfectly stable so far, but now I have found a way to crash it:

Thunar (1.0.2-1, binary Arch version) is configured to show the tree view in
the side pane (View -> Side Pane -> Tree). If I click the arrow to expand "File
System", I can see the contents of my root filesystem for a moment, then the
screen goes black and the following error is printed:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
IP: [<ffffffff810a0ff1>] cgroup_path+0x21/0xe0
PGD 7b98d067 PUD 7b483067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:0e/PNP0C09:00/PNP0C0A:00/power_supply/BAT1/voltage_now
CPU 1
Modules linked in: usb_storage netconsole configfs tun michael_mic arc4
ecb fuse cpufreq_ondemand rfcomm microcode sco bnep l2cap crc16 ip6t_REJECT
ip6t_LOG nf_conntrack_ipv6 ip6table_mangle ip6table_filter ip6_tables ipv6
xt_pkttype ipt_REDIRECT ipt_MASQUERADE xt_DSCP xt_dscp xt_tcpudp ipt_REJECT
ipt_LOG xt_limit xt_recent xt_state iptable_nat nf_nat nf_conntrack_ipv4
nf_conntrack nf_defrag_ipv4 iptable_mangle iptable_filter ip_tables x_tables
loop snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device
vboxnetadp vboxnetflt lib80211_crypt_tkip snd_pcm_oss snd_mixer_oss joydev
battery ac wl(P) vboxdrv i915 thermal snd_hda_codec_realtek btusb bluetooth
rfkill drm_kms_helper lib80211 drm snd_hda_intel snd_hda_codec snd_hwdep sky2
sg snd_pcm snd_timer i2c_algo_bit button snd video soundcore output
snd_page_alloc psmouse acpi_cpufreq i2c_i801 evdev shpchp i2c_core pcspkr
intel_agp freq_table pci_hotplug serio_raw processor mperf dummy ext3 jbd
mbcache sha256_generic cryptd aes_x86_64 aes_generic cbc dm_crypt dm_mod
sd_mod ahci uhci_hcd libahci libata ehci_hcd scsi_mod usbcore

Pid: 14239, comm: Thunar Tainted: P 2.6.36-ARCH #1 NF110/NF210/NF310 /NF110/NF210/NF310
RIP: 0010:[<ffffffff810a0ff1>] [<ffffffff810a0ff1>] cgroup_path+0x21/0xe0
RSP: 0018:ffff88007ababd08 EFLAGS: 00010082
RAX: ffff88007b9a7400 RBX: ffff880078310000 RCX: 0000000000000001
RDX: 0000000000000040 RSI: ffff88007ababda8 RDI: 0000000000000000
RBP: ffff88007ababd28 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
R13: ffff88007ababda8 R14: 0000000000000000 R15: ffff88007abaa000
FS: 00007f0a5e794710(0000) GS:ffff880001a80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000038 CR3: 000000007b1a8000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process Thunar (pid: 14239, threadinfo ffff88007abaa000, task ffff88007b946f00)
Stack:
ffff880078310000 ffff880078310000 ffff88007f050100 0000000000000000
<0> ffff88007ababe28 ffffffff8104b0c2 0000000000000000 0000000000000000
<0> 0000000000000000 0000000000000000 ffff88005c739cb0 ffff88007abaa000
Call Trace:
[<ffffffff8104b0c2>] sched_debug_show+0x7a2/0xd70
[<ffffffff8114caed>] seq_read+0xdd/0x420
[<ffffffff8114ca10>] ? seq_read+0x0/0x420
[<ffffffff81184c2e>] proc_reg_read+0x7e/0xc0
[<ffffffff8112e6f3>] vfs_read+0xc3/0x180
[<ffffffff8112e7fc>] sys_read+0x4c/0x80
[<ffffffff8100af42>] system_call_fastpath+0x16/0x1b
Code: ff 0f 0b 0f 1f 80 00 00 00 00 55 48 89 e5 48 83 ec 20 4c 89 64 24 08 4c 89 6c 24 10 49 89 fc 48 89 1c 24 4c 89 74 24 18 49 89 f5 <48> 8b 47 38 48 85 c0 74 09 48 81 ff b0 c5 71 81 75 25 66 41 c7
RIP [<ffffffff810a0ff1>] cgroup_path+0x21/0xe0
RSP <ffff88007ababd08>
CR2: 0000000000000038
---[ end trace 0a7b41d179fb781e ]---
note: Thunar[14239] exited with preempt_count 2

Any ideas what goes wrong?
--
Regards,
Chris
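
A possible way to read the oops above, offered as a sketch and not as a diagnosis from the thread: the byte marked `<48>` on the Code: line is where RIP points, and `48 8b 47 38` decodes as `mov 0x38(%rdi),%rax`. On x86-64, RDI carries the first function argument, and the register dump shows RDI = 0, so the load would touch address 0x38, which matches CR2. That is consistent with cgroup_path() being handed a NULL pointer from sched_debug_show(). The small decoder below handles only this one instruction form; the names `decode_mov_load`, `FAULT_BYTES`, and `REGS` are made up for illustration.

```python
# Sketch: decode the faulting instruction from the oops "Code:" line and
# check that base register + displacement equals the faulting address (CR2).
# Only the values copied from the oops are real; all names are hypothetical.

FAULT_BYTES = bytes.fromhex("488b4738")  # "<48> 8b 47 38" from the Code: line
REGS = {"rdi": 0x0, "cr2": 0x38}         # RDI and CR2 from the register dump
REG_NAMES = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_mov_load(code):
    """Decode only `REX.W 8B /r` with mod == 01 (base register + disp8)."""
    rex, opcode, modrm, disp8 = code[0], code[1], code[2], code[3]
    assert rex == 0x48 and opcode == 0x8B, "not a 64-bit MOV r64, r/m64"
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    assert mod == 1 and rm != 4, "only base+disp8 without a SIB byte is handled"
    # The 8-bit displacement is sign-extended by the CPU.
    disp = disp8 - 256 if disp8 >= 0x80 else disp8
    return REG_NAMES[reg], REG_NAMES[rm], disp


dest, base, disp = decode_mov_load(FAULT_BYTES)
fault_addr = REGS[base] + disp
print(f"mov {disp:#x}(%{base}),%{dest} -> address {fault_addr:#018x}")
print("matches CR2:", fault_addr == REGS["cr2"])
```

For anything beyond a single instruction, a real disassembler (for example the kernel's scripts/decodecode helper) is the better tool; the hand decoding here only illustrates why a NULL pointer plus a 0x38 field offset produces the reported address.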


2010-12-18 04:33:00

by Mike Galbraith

Subject: Re: Thunar crashes kernel

On Sat, 2010-12-18 at 00:55 +0100, Christian Hesse wrote:
> Hello everybody,
>
> I'm running kernel 2.6.36.2 from Arch Linux, patched with autogroup. It has
> been perfectly stable so far, but now I have found a way to crash it:

You're running a buggy version of the patch. If you want autogroup in
36.2, you'll want what was integrated into tip, plus a fix to that.

Attached is a quilt tarball of what I've got plugged into 36.2 here if
you want to try it.

-Mike


Attachments:
autogroup-2.6.36.2.tar.gz (9.17 kB)

2010-12-18 17:02:42

by Romain Francoise

Subject: Re: Thunar crashes kernel

Mike Galbraith writes:

> Attached is a quilt tarball of what I've got plugged into 36.2
> here if you want to try it.

FWIW, I added fix_skip_clock_update.diff on top of what I was already
running (2.6.36.2 plus the two patches you posted earlier, which are also
included in the tarball), and the WARN_ON_ONCE(test_tsk_need_resched(next));
now triggers on two machines. It looks like this:

[ 2712.156085] ------------[ cut here ]------------
[ 2712.156097] WARNING: at /home/romain/tmp/linux/linux-2.6-2.6.36/debian/build/source_amd64_none/kernel/sched.c:3807 schedule+0x448/0x5f9()
[ 2712.156102] Hardware name:
[ 2712.156105] Modules linked in: authenc xfrm6_mode_tunnel xfrm4_mode_tunnel deflate zlib_deflate ctr twofish_generic twofish_x86_64 twofish_common camellia serpent blowfish cast5 des_generic xcbc rmd160 sha512_generic sha1_generic hmac crypto_null xfrm_user xfrm4_tunnel tunnel4 ipcomp xfrm_ipcomp esp4 ah4 af_key ext4 jbd2 crc16 cpufreq_userspace cpufreq_stats cpufreq_conservative cpufreq_powersave xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables coretemp acpi_cpufreq mperf loop sha256_generic aes_x86_64 aes_generic cbc dm_crypt dm_mod i915 ftdi_sio snd_hda_codec_realtek drm_kms_helper drm snd_hda_intel snd_hda_codec usbserial tpm_tis i2c_i801 tpm snd_pcsp i2c_algo_bit i2c_core rng_core tpm_bios snd_hwdep snd_pcm video psmouse parport_pc parport output snd_timer snd evdev soundcore serio_raw snd_page_alloc led_class processor button ext3 jbd mbcache raid1 md_mod sd_mod crc_t10dif ata_generic uhci_hcd thermal thermal_sys ata_piix ehci_hcd floppy r8169 usbcore mii libata scsi_mod nls_base [last unloaded: scsi_wait_scan]
[ 2712.156225] Pid: 2958, comm: tmux Not tainted 2.6.36-ore-amd64 #1
[ 2712.156229] Call Trace:
[ 2712.156236] [<ffffffff8104487c>] ? warn_slowpath_common+0x78/0x8c
[ 2712.156242] [<ffffffff8130e173>] ? schedule+0x448/0x5f9
[ 2712.156248] [<ffffffff8130e719>] ? schedule_timeout+0xa0/0xd7
[ 2712.156254] [<ffffffff81050c65>] ? process_timeout+0x0/0xb
[ 2712.156261] [<ffffffff811192e6>] ? sys_epoll_wait+0x17f/0x267
[ 2712.156268] [<ffffffff8103fab3>] ? default_wake_function+0x0/0xf
[ 2712.156274] [<ffffffff81008a02>] ? system_call_fastpath+0x16/0x1b
[ 2712.156279] ---[ end trace 06ece612984887c2 ]---

On the other machine:

[ 6279.152199] Pid: 29865, comm: kworker/0:2 Tainted: P 2.6.36-ore-amd64 #1
[ 6279.152201] Call Trace:
[ 6279.152207] [<ffffffff8104487c>] ? warn_slowpath_common+0x78/0x8c
[ 6279.152211] [<ffffffff8130e173>] ? schedule+0x448/0x5f9
[ 6279.152215] [<ffffffff8105a396>] ? worker_thread+0x23b/0x240
[ 6279.152218] [<ffffffff8105a15b>] ? worker_thread+0x0/0x240
[ 6279.152221] [<ffffffff8105a15b>] ? worker_thread+0x0/0x240
[ 6279.152225] [<ffffffff8105d10c>] ? kthread+0x7a/0x82
[ 6279.152228] [<ffffffff81009824>] ? kernel_thread_helper+0x4/0x10
[ 6279.152232] [<ffffffff8105d092>] ? kthread+0x0/0x82
[ 6279.152234] [<ffffffff81009820>] ? kernel_thread_helper+0x0/0x10
[ 6279.152237] ---[ end trace 5be61ee2b4fbcfdd ]---

2010-12-18 17:13:09

by Mike Galbraith

Subject: Re: Thunar crashes kernel

On Sat, 2010-12-18 at 18:02 +0100, Romain Francoise wrote:
> Mike Galbraith writes:
>
> > Attached is a quilt tarball of what I've got plugged into 36.2
> > here if you want to try it.
>
> FWIW, I added fix_skip_clock_update.diff on top of what I was already
> running (2.6.36.2 plus the two patches you posted earlier, which are also
> included in the tarball), and the WARN_ON_ONCE(test_tsk_need_resched(next));
> now triggers on two machines. It looks like this:

Thanks. I've hit it twice too, and I'm still looking for how the heck it can
happen. Both of mine were in worker_thread.


2010-12-20 14:45:58

by Christian Hesse

Subject: Re: Thunar crashes kernel

On Sat, 18 Dec 2010 05:32:53 +0100 Mike Galbraith wrote:
> On Sat, 2010-12-18 at 00:55 +0100, Christian Hesse wrote:
> > Hello everybody,
> >
> > I'm running kernel 2.6.36.2 from Arch Linux, patched with autogroup. It
> > has been perfectly stable so far, but now I have found a way to crash it:
>
> You're running a buggy version of the patch. If you want autogroup in
> 36.2, you'll want what was integrated into tip, plus a fix to that.
>
> Attached is a quilt tarball of what I've got plugged into 36.2 here if
> you want to try it.

That fixed it for me.
Thanks a lot!
--
Regards,
Chris