2011-06-09 21:30:02

by Dave Jones

Subject: 3.0rc2 oops in mem_cgroup_from_task


I just got the oops below while building a kernel.
When it oopsed, the kernel modesetting oops-on-framebuffer thing
happened, and the box wedged solid for about a minute.
Then it woke up, and I was able to ctrl-f1 back to my X session
to capture the dmesg. The stuff that follows the oops looks
quite disturbing, but I think it's from hanging with interrupts off
for a minute.

Dave

general protection fault: 0000 [#1] PREEMPT SMP
CPU 1
Modules linked in: nfs fscache fuse nfsd lockd nfs_acl auth_rpcgss sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables btusb bluetooth arc4 zaurus dell_wmi sparse_keymap snd_usb_audio cdc_ether usbnet cdc_wdm mii snd_usbmidi_lib snd_rawmidi snd_hda_codec_hdmi snd_hda_codec_idt cdc_acm dell_laptop uvcvideo snd_hda_intel snd_hda_codec dcdbas snd_hwdep videodev microcode v4l2_compat_ioctl32 snd_seq snd_seq_device snd_pcm joydev iTCO_wdt i2c_i801 iTCO_vendor_support iwlagn pcspkr snd_timer mac80211 snd soundcore snd_page_alloc cfg80211 tg3 rfkill wmi virtio_net kvm_intel kvm ipv6 xts gf128mul dm_crypt i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]

Pid: 34, comm: khugepaged Not tainted 3.0.0-rc2+ #72 Dell Inc. Adamo 13 /0N70T0
RIP: 0010:[<ffffffff81138590>] [<ffffffff81138590>] task_subsys_state.constprop.30+0x16/0x78
RSP: 0018:ffff880135c97bd0 EFLAGS: 00010286
RAX: 6b6b6b6b6b6b6b6b RBX: ffff880013c48000 RCX: 0000000000000000
RDX: 0000000000000246 RSI: ffffffff81a26610 RDI: ffff880013c48000
RBP: ffff880135c97be0 R08: 0000000000000001 R09: 0000000000000000
R10: ffff880135c97cf0 R11: 0000000005491edb R12: ffff880013c48000
R13: 0000000000000200 R14: ffff880135c97ce8 R15: 0000000000000200
FS: 0000000000000000(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000338f21400a CR3: 000000004f64c000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process khugepaged (pid: 34, threadinfo ffff880135c96000, task ffff880135c98000)
Stack:
ffff880013c48000 0000000000000200 ffff880135c97bf0 ffffffff81139792
ffff880135c97cc0 ffffffff8113a75a ffff880135c96000 ffff88013fc00000
ffff880135c98000 ffff880135c97c78 ffff880135c98000 00000000001d2c40
Call Trace:
[<ffffffff81139792>] mem_cgroup_from_task+0x15/0x17
[<ffffffff8113a75a>] __mem_cgroup_try_charge+0x148/0x4b4
[<ffffffff810493f3>] ? need_resched+0x23/0x2d
[<ffffffff814cbf43>] ? preempt_schedule+0x46/0x4f
[<ffffffff8113afe8>] mem_cgroup_charge_common+0x9a/0xce
[<ffffffff8113b6d1>] mem_cgroup_newpage_charge+0x5d/0x5f
[<ffffffff81134024>] khugepaged+0x5da/0xfaf
[<ffffffff81078ea0>] ? __init_waitqueue_head+0x4b/0x4b
[<ffffffff81133a4a>] ? add_mm_counter.constprop.5+0x13/0x13
[<ffffffff81078625>] kthread+0xa8/0xb0
[<ffffffff814d13e8>] ? sub_preempt_count+0xa1/0xb4
[<ffffffff814d5664>] kernel_thread_helper+0x4/0x10
[<ffffffff814ce858>] ? retint_restore_args+0x13/0x13
[<ffffffff8107857d>] ? __init_kthread_worker+0x5a/0x5a
[<ffffffff814d5660>] ? gs_change+0x13/0x13
Code: ff 84 c0 74 b5 eb 03 45 31 e4 5a 5b 4c 89 e0 41 5c 41 5d 5d c3 55 48 89 e5 41 54 53 66 66 66 66 90 48 8b 87 90 12 00 00 49 89 fc
8b 58 50 e8 b4 d8 f3 ff 85 c0 74 4d 80 3d 8c e5 6a 01 00 75
RIP [<ffffffff81138590>] task_subsys_state.constprop.30+0x16/0x78
RSP <ffff880135c97bd0>
psmouse.c: TouchPad at isa0060/serio1/input0 lost synchronization, throwing 1 bytes away.
iwlagn 0000:04:00.0: Queue 4 stuck for 10000 ms.
iwlagn 0000:04:00.0: On demand firmware reload
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 db c6 48 00 00 18 00
end_request: I/O error, dev sda, sector 366724680
Buffer I/O error on device dm-3, logical block 44179657
Buffer I/O error on device dm-3, logical block 44179658
Buffer I/O error on device dm-3, logical block 44179659
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031834 (offset 0 size 12288 starting block 44179657)
ieee80211 phy0: Hardware restart was requested
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 db c7 80 00 00 18 00
end_request: I/O error, dev sda, sector 366724992
Buffer I/O error on device dm-3, logical block 44179696
Buffer I/O error on device dm-3, logical block 44179697
Buffer I/O error on device dm-3, logical block 44179698
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031834 (offset 12288 size 12288 starting block 44179696)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 db c7 b8 00 00 18 00
end_request: I/O error, dev sda, sector 366725048
Buffer I/O error on device dm-3, logical block 44179703
Buffer I/O error on device dm-3, logical block 44179704
Buffer I/O error on device dm-3, logical block 44179705
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031834 (offset 24576 size 12288 starting block 44179703)
---[ end trace 95e652595eaf01aa ]---
psmouse.c: resync failed, issuing reconnect request
BUG: sleeping function called from invalid context at kernel/mutex.c:271
in_atomic(): 0, irqs_disabled(): 0, pid: 34, name: khugepaged
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 17 ee a0 00 00 50 00
end_request: I/O error, dev sda, sector 353889952
Buffer I/O error on device dm-3, logical block 42575316
Buffer I/O error on device dm-3, logical block 42575317
Buffer I/O error on device dm-3, logical block 42575318
Buffer I/O error on device dm-3, logical block 42575319
Buffer I/O error on device dm-3, logical block 42575320
Buffer I/O error on device dm-3, logical block 42575321
Buffer I/O error on device dm-3, logical block 42575322
Buffer I/O error on device dm-3, logical block 42575323
Buffer I/O error on device dm-3, logical block 42575324
Buffer I/O error on device dm-3, logical block 42575325
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10629148 (offset 0 size 40960 starting block 42575316)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 f8 40 00 00 20 00
end_request: I/O error, dev sda, sector 366213184
Buffer I/O error on device dm-3, logical block 44115720
Buffer I/O error on device dm-3, logical block 44115721
Buffer I/O error on device dm-3, logical block 44115722
Buffer I/O error on device dm-3, logical block 44115723
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031693 (offset 0 size 16384 starting block 44115720)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 cf 69 80 00 00 10 00
end_request: I/O error, dev sda, sector 365914496
Buffer I/O error on device dm-3, logical block 44078384
Buffer I/O error on device dm-3, logical block 44078385
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031978 (offset 0 size 8192 starting block 44078384)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 12 d5 00 00 00 18 00
end_request: I/O error, dev sda, sector 353555712
Buffer I/O error on device dm-3, logical block 42533536
Buffer I/O error on device dm-3, logical block 42533537
Buffer I/O error on device dm-3, logical block 42533538
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10628547 (offset 0 size 12288 starting block 42533536)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 fb a0 00 00 20 00
end_request: I/O error, dev sda, sector 366214048
Buffer I/O error on device dm-3, logical block 44115828
Buffer I/O error on device dm-3, logical block 44115829
Buffer I/O error on device dm-3, logical block 44115830
Buffer I/O error on device dm-3, logical block 44115831
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031956 (offset 0 size 16384 starting block 44115828)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 dd dd 80 00 00 28 00
end_request: I/O error, dev sda, sector 366861696
Buffer I/O error on device dm-3, logical block 44196784
Buffer I/O error on device dm-3, logical block 44196785
Buffer I/O error on device dm-3, logical block 44196786
Buffer I/O error on device dm-3, logical block 44196787
Buffer I/O error on device dm-3, logical block 44196788
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031940 (offset 0 size 20480 starting block 44196784)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 93 f1 78 00 00 18 00
end_request: I/O error, dev sda, sector 362017144
Buffer I/O error on device dm-3, logical block 43591215
Buffer I/O error on device dm-3, logical block 43591216
Buffer I/O error on device dm-3, logical block 43591217
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10887935 (offset 0 size 12288 starting block 43591215)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 92 dc 60 00 00 10 00
end_request: I/O error, dev sda, sector 361946208
Buffer I/O error on device dm-3, logical block 43582348
Buffer I/O error on device dm-3, logical block 43582349
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10887935 (offset 12288 size 8192 starting block 43582348)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 dd dd a8 00 00 28 00
end_request: I/O error, dev sda, sector 366861736
Buffer I/O error on device dm-3, logical block 44196789
Buffer I/O error on device dm-3, logical block 44196790
Buffer I/O error on device dm-3, logical block 44196791
Buffer I/O error on device dm-3, logical block 44196792
Buffer I/O error on device dm-3, logical block 44196793
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031841 (offset 0 size 20480 starting block 44196789)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 dd dd d0 00 00 48 00
end_request: I/O error, dev sda, sector 366861776
Buffer I/O error on device dm-3, logical block 44196794
Buffer I/O error on device dm-3, logical block 44196795
Buffer I/O error on device dm-3, logical block 44196796
Buffer I/O error on device dm-3, logical block 44196797
Buffer I/O error on device dm-3, logical block 44196798
Buffer I/O error on device dm-3, logical block 44196799
Buffer I/O error on device dm-3, logical block 44196800
Buffer I/O error on device dm-3, logical block 44196801
Buffer I/O error on device dm-3, logical block 44196802
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031992 (offset 0 size 36864 starting block 44196794)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 17 ee f0 00 00 48 00
end_request: I/O error, dev sda, sector 353890032
Buffer I/O error on device dm-3, logical block 42575326
Buffer I/O error on device dm-3, logical block 42575327
Buffer I/O error on device dm-3, logical block 42575328
Buffer I/O error on device dm-3, logical block 42575329
Buffer I/O error on device dm-3, logical block 42575330
Buffer I/O error on device dm-3, logical block 42575331
Buffer I/O error on device dm-3, logical block 42575332
Buffer I/O error on device dm-3, logical block 42575333
Buffer I/O error on device dm-3, logical block 42575334
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10629931 (offset 0 size 36864 starting block 42575326)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 fd c0 00 00 20 00
end_request: I/O error, dev sda, sector 366214592
Buffer I/O error on device dm-3, logical block 44115896
Buffer I/O error on device dm-3, logical block 44115897
Buffer I/O error on device dm-3, logical block 44115898
Buffer I/O error on device dm-3, logical block 44115899
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11011057 (offset 0 size 16384 starting block 44115896)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 cf 24 50 00 00 08 00
end_request: I/O error, dev sda, sector 365896784
Buffer I/O error on device dm-3, logical block 44076170
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11032013 (offset 0 size 4096 starting block 44076170)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 20 00 00 08 00
end_request: I/O error, dev sda, sector 382721056
Buffer I/O error on device dm-3, logical block 46179204
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 16384 size 4096 starting block 46179204)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 40 00 00 08 00
end_request: I/O error, dev sda, sector 382721088
Buffer I/O error on device dm-3, logical block 46179208
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 32768 size 4096 starting block 46179208)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 60 00 00 08 00
end_request: I/O error, dev sda, sector 382721120
Buffer I/O error on device dm-3, logical block 46179212
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 49152 size 4096 starting block 46179212)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 78 00 00 08 00
end_request: I/O error, dev sda, sector 382721144
Buffer I/O error on device dm-3, logical block 46179215
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 61440 size 4096 starting block 46179215)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 90 00 00 10 00
end_request: I/O error, dev sda, sector 382721168
Buffer I/O error on device dm-3, logical block 46179218
Buffer I/O error on device dm-3, logical block 46179219
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 73728 size 8192 starting block 46179218)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc c8 00 00 08 00
end_request: I/O error, dev sda, sector 382721224
Buffer I/O error on device dm-3, logical block 46179225
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 102400 size 4096 starting block 46179225)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc f8 00 00 08 00
end_request: I/O error, dev sda, sector 382721272
Buffer I/O error on device dm-3, logical block 46179231
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 126976 size 4096 starting block 46179231)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 10 af 80 00 00 10 00
end_request: I/O error, dev sda, sector 353415040
Buffer I/O error on device dm-3, logical block 42515952
Buffer I/O error on device dm-3, logical block 42515953
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10628554 (offset 0 size 8192 starting block 42515952)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 fe e0 00 00 20 00
end_request: I/O error, dev sda, sector 366214880
Buffer I/O error on device dm-3, logical block 44115932
Buffer I/O error on device dm-3, logical block 44115933
Buffer I/O error on device dm-3, logical block 44115934
Buffer I/O error on device dm-3, logical block 44115935
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031999 (offset 0 size 16384 starting block 44115932)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 91 6e f0 00 00 10 00
end_request: I/O error, dev sda, sector 361852656
Buffer I/O error on device dm-3, logical block 43570654
Buffer I/O error on device dm-3, logical block 43570655
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10896269 (offset 0 size 8192 starting block 43570654)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 10 b2 20 00 00 10 00
end_request: I/O error, dev sda, sector 353415712
Buffer I/O error on device dm-3, logical block 42516036
Buffer I/O error on device dm-3, logical block 42516037
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10629161 (offset 0 size 8192 starting block 42516036)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 92 00 60 00 00 20 00
end_request: I/O error, dev sda, sector 361889888
Buffer I/O error on device dm-3, logical block 43575308
Buffer I/O error on device dm-3, logical block 43575309
Buffer I/O error on device dm-3, logical block 43575310
Buffer I/O error on device dm-3, logical block 43575311
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10896254 (offset 0 size 16384 starting block 43575308)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] INFO: lockdep is turned off.
Pid: 34, comm: khugepaged Tainted: G D 3.0.0-rc2+ #72
Call Trace:
[<ffffffff8104d276>] __might_sleep+0x112/0x117
[<ffffffff814cca48>] mutex_lock_nested+0x25/0x40
[<ffffffff810ee9b7>] perf_event_exit_task+0x2d/0x1cd
[<ffffffff8105ea1f>] do_exit+0x353/0x7fa
[<ffffffff8105c569>] ? kmsg_dump+0x89/0x13c
[<ffffffff814cf78d>] oops_end+0xbc/0xc5
[<ffffffff8100d087>] die+0x5a/0x63
[<ffffffff814cf18f>] do_general_protection+0x128/0x131
[<ffffffff814cead5>] general_protection+0x25/0x30
[<ffffffff81138590>] ? task_subsys_state.constprop.30+0x16/0x78
[<ffffffff81139792>] mem_cgroup_from_task+0x15/0x17
[<ffffffff8113a75a>] __mem_cgroup_try_charge+0x148/0x4b4
[<ffffffff810493f3>] ? need_resched+0x23/0x2d
[<ffffffff814cbf43>] ? preempt_schedule+0x46/0x4f
[<ffffffff8113afe8>] mem_cgroup_charge_common+0x9a/0xce
[<ffffffff8113b6d1>] mem_cgroup_newpage_charge+0x5d/0x5f
[<ffffffff81134024>] khugepaged+0x5da/0xfaf
[<ffffffff81078ea0>] ? __init_waitqueue_head+0x4b/0x4b
[<ffffffff81133a4a>] ? add_mm_counter.constprop.5+0x13/0x13
[<ffffffff81078625>] kthread+0xa8/0xb0
[<ffffffff814d13e8>] ? sub_preempt_count+0xa1/0xb4
[<ffffffff814d5664>] kernel_thread_helper+0x4/0x10
[<ffffffff814ce858>] ? retint_restore_args+0x13/0x13
[<ffffffff8107857d>] ? __init_kthread_worker+0x5a/0x5a
[<ffffffff814d5660>] ? gs_change+0x13/0x13
Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 94 1b 40 00 00 18 00
end_request: I/O error, dev sda, sector 362027840
Buffer I/O error on device dm-3, logical block 43592552
Buffer I/O error on device dm-3, logical block 43592553
Buffer I/O error on device dm-3, logical block 43592554
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10896287 (offset 0 size 12288 starting block 43592552)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 93 06 b0 00 00 10 00
end_request: I/O error, dev sda, sector 361957040
Buffer I/O error on device dm-3, logical block 43583702
Buffer I/O error on device dm-3, logical block 43583703
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10896287 (offset 12288 size 8192 starting block 43583702)
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 ff 40 00 00 20 00
end_request: I/O error, dev sda, sector 366214976
Buffer I/O error on device dm-3, logical block 44115944
Buffer I/O error on device dm-3, logical block 44115945
Buffer I/O error on device dm-3, logical block 44115946
Buffer I/O error on device dm-3, logical block 44115947
EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11032040 (offset 0 size 16384 starting block 44115944)
JBD2: Detected IO errors while flushing file data on dm-3-8
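
A note on reading the I/O errors above: each "CDB: Write(10)" line carries the target sector and the transfer length, so it can be checked against the "end_request" and EXT4 lines that follow it. The sketch below is only an illustration. The function name decode_write10 is made up for this note, and the 512-byte sector size is an assumption (it is consistent with the sizes in the log, but the log does not state it).

```python
def decode_write10(cdb_hex, sector_size=512):
    """Decode a SCSI WRITE(10) CDB given as space-separated hex bytes.

    Returns (lba, sectors, byte_count). sector_size is an assumption,
    not something the log reports.
    """
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # bytes 2..5: logical block address
    sectors = int.from_bytes(cdb[7:9], "big")  # bytes 7..8: transfer length
    return lba, sectors, sectors * sector_size


# First failed write in the log above.
print(decode_write10("2a 00 15 db c6 48 00 00 18 00"))
# (366724680, 24, 12288)
```

The decoded LBA matches "sector 366724680", and 12288 bytes matches the size in the EXT4 warning. At 4096 bytes per filesystem block that is three blocks, which agrees with the three "Buffer I/O error" lines for that request.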

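A note on the register dump in the oops above: RAX is 6b6b6b6b6b6b6b6b, which is the byte 0x6b repeated. The kernel's include/linux/poison.h defines 0x6b as POISON_FREE, the value slab debugging writes over freed objects, so this may point at a pointer being read out of freed memory. That is a hint rather than a diagnosis, and it assumes slab poisoning was enabled in this build. The value is also not a canonical x86-64 address, which would be consistent with a general protection fault rather than a page fault. The helper names below are hypothetical and exist only for this sketch.

```python
# Slab poison bytes as defined in the kernel's include/linux/poison.h.
POISON_BYTES = {
    0x6B: "POISON_FREE",   # written over an object when it is freed
    0x5A: "POISON_INUSE",  # written over an object handed out uninitialised
}


def classify_register(hex_value):
    """Report whether a 64-bit register value is one poison byte repeated."""
    raw = bytes.fromhex(hex_value.zfill(16))
    if len(set(raw)) == 1 and raw[0] in POISON_BYTES:
        return POISON_BYTES[raw[0]]
    return "no slab poison pattern"


def is_canonical(addr):
    """x86-64 with 48-bit virtual addresses: bits 63..47 must all be equal."""
    top = addr >> 47
    return top in (0, 0x1FFFF)


print(classify_register("6b6b6b6b6b6b6b6b"))  # RAX from the oops
print(classify_register("ffff880013c48000"))  # RBX from the oops
print(is_canonical(0x6B6B6B6B6B6B6B6B))
```

RBX, RDI and R12 all hold ffff880013c48000, which looks like an ordinary kernel address, so only the value loaded into RAX shows the pattern.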

2011-06-09 22:47:39

by Ying Han

Subject: Re: 3.0rc2 oops in mem_cgroup_from_task

++cc to more mm folks.

--Ying

On Thu, Jun 9, 2011 at 2:29 PM, Dave Jones <[email protected]> wrote:
>
> I just got the oops below while building a kernel.
> When it oopsed, the kernel modesetting oops-on-framebuffer thing
> happened, and the box wedged solid for about a minute.
> Then it woke up, and I was able to ctrl-f1 back to my X session
> to capture the dmesg. The stuff that follows the oops looks
> quite disturbing, but I think it's from hanging with interrupts off
> for a minute.
>
>         Dave
>
> general protection fault: 0000 [#1] PREEMPT SMP
> CPU 1
> Modules linked in: nfs fscache fuse nfsd lockd nfs_acl auth_rpcgss sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables btusb bluetooth arc4 zaurus dell_wmi sparse_keymap snd_usb_audio cdc_ether usbnet cdc_wdm mii snd_usbmidi_lib snd_rawmidi snd_hda_codec_hdmi snd_hda_codec_idt cdc_acm dell_laptop uvcvideo snd_hda_intel snd_hda_codec dcdbas snd_hwdep videodev microcode v4l2_compat_ioctl32 snd_seq snd_seq_device snd_pcm joydev iTCO_wdt i2c_i801 iTCO_vendor_support iwlagn pcspkr snd_timer mac80211 snd soundcore snd_page_alloc cfg80211 tg3 rfkill wmi virtio_net kvm_intel kvm ipv6 xts gf128mul dm_crypt i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
>
> Pid: 34, comm: khugepaged Not tainted 3.0.0-rc2+ #72 Dell Inc. Adamo 13 /0N70T0
> RIP: 0010:[<ffffffff81138590>] [<ffffffff81138590>] task_subsys_state.constprop.30+0x16/0x78
> RSP: 0018:ffff880135c97bd0 EFLAGS: 00010286
> RAX: 6b6b6b6b6b6b6b6b RBX: ffff880013c48000 RCX: 0000000000000000
> RDX: 0000000000000246 RSI: ffffffff81a26610 RDI: ffff880013c48000
> RBP: ffff880135c97be0 R08: 0000000000000001 R09: 0000000000000000
> R10: ffff880135c97cf0 R11: 0000000005491edb R12: ffff880013c48000
> R13: 0000000000000200 R14: ffff880135c97ce8 R15: 0000000000000200
> FS: 0000000000000000(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 000000338f21400a CR3: 000000004f64c000 CR4: 00000000000406e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process khugepaged (pid: 34, threadinfo ffff880135c96000, task ffff880135c98000)
> Stack:
> ffff880013c48000 0000000000000200 ffff880135c97bf0 ffffffff81139792
> ffff880135c97cc0 ffffffff8113a75a ffff880135c96000 ffff88013fc00000
> ffff880135c98000 ffff880135c97c78 ffff880135c98000 00000000001d2c40
> Call Trace:
> [<ffffffff81139792>] mem_cgroup_from_task+0x15/0x17
> [<ffffffff8113a75a>] __mem_cgroup_try_charge+0x148/0x4b4
> [<ffffffff810493f3>] ? need_resched+0x23/0x2d
> [<ffffffff814cbf43>] ? preempt_schedule+0x46/0x4f
> [<ffffffff8113afe8>] mem_cgroup_charge_common+0x9a/0xce
> [<ffffffff8113b6d1>] mem_cgroup_newpage_charge+0x5d/0x5f
> [<ffffffff81134024>] khugepaged+0x5da/0xfaf
> [<ffffffff81078ea0>] ? __init_waitqueue_head+0x4b/0x4b
> [<ffffffff81133a4a>] ? add_mm_counter.constprop.5+0x13/0x13
> [<ffffffff81078625>] kthread+0xa8/0xb0
> [<ffffffff814d13e8>] ? sub_preempt_count+0xa1/0xb4
> [<ffffffff814d5664>] kernel_thread_helper+0x4/0x10
> [<ffffffff814ce858>] ? retint_restore_args+0x13/0x13
> [<ffffffff8107857d>] ? __init_kthread_worker+0x5a/0x5a
> [<ffffffff814d5660>] ? gs_change+0x13/0x13
> Code: ff 84 c0 74 b5 eb 03 45 31 e4 5a 5b 4c 89 e0 41 5c 41 5d 5d c3 55 48 89 e5 41 54 53 66 66 66 66 90 48 8b 87 90 12 00 00 49 89 fc
> 8b 58 50 e8 b4 d8 f3 ff 85 c0 74 4d 80 3d 8c e5 6a 01 00 75
> RIP [<ffffffff81138590>] task_subsys_state.constprop.30+0x16/0x78
> RSP <ffff880135c97bd0>
> psmouse.c: TouchPad at isa0060/serio1/input0 lost synchronization, throwing 1 bytes away.
> iwlagn 0000:04:00.0: Queue 4 stuck for 10000 ms.
> iwlagn 0000:04:00.0: On demand firmware reload
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 db c6 48 00 00 18 00
> end_request: I/O error, dev sda, sector 366724680
> Buffer I/O error on device dm-3, logical block 44179657
> Buffer I/O error on device dm-3, logical block 44179658
> Buffer I/O error on device dm-3, logical block 44179659
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031834 (offset 0 size 12288 starting block 44179657)
> ieee80211 phy0: Hardware restart was requested
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 db c7 80 00 00 18 00
> end_request: I/O error, dev sda, sector 366724992
> Buffer I/O error on device dm-3, logical block 44179696
> Buffer I/O error on device dm-3, logical block 44179697
> Buffer I/O error on device dm-3, logical block 44179698
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031834 (offset 12288 size 12288 starting block 44179696)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 db c7 b8 00 00 18 00
> end_request: I/O error, dev sda, sector 366725048
> Buffer I/O error on device dm-3, logical block 44179703
> Buffer I/O error on device dm-3, logical block 44179704
> Buffer I/O error on device dm-3, logical block 44179705
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031834 (offset 24576 size 12288 starting block 44179703)
> ---[ end trace 95e652595eaf01aa ]---
> psmouse.c: resync failed, issuing reconnect request
> BUG: sleeping function called from invalid context at kernel/mutex.c:271
> in_atomic(): 0, irqs_disabled(): 0, pid: 34, name: khugepaged
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 17 ee a0 00 00 50 00
> end_request: I/O error, dev sda, sector 353889952
> Buffer I/O error on device dm-3, logical block 42575316
> Buffer I/O error on device dm-3, logical block 42575317
> Buffer I/O error on device dm-3, logical block 42575318
> Buffer I/O error on device dm-3, logical block 42575319
> Buffer I/O error on device dm-3, logical block 42575320
> Buffer I/O error on device dm-3, logical block 42575321
> Buffer I/O error on device dm-3, logical block 42575322
> Buffer I/O error on device dm-3, logical block 42575323
> Buffer I/O error on device dm-3, logical block 42575324
> Buffer I/O error on device dm-3, logical block 42575325
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10629148 (offset 0 size 40960 starting block 42575316)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 f8 40 00 00 20 00
> end_request: I/O error, dev sda, sector 366213184
> Buffer I/O error on device dm-3, logical block 44115720
> Buffer I/O error on device dm-3, logical block 44115721
> Buffer I/O error on device dm-3, logical block 44115722
> Buffer I/O error on device dm-3, logical block 44115723
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031693 (offset 0 size 16384 starting block 44115720)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 cf 69 80 00 00 10 00
> end_request: I/O error, dev sda, sector 365914496
> Buffer I/O error on device dm-3, logical block 44078384
> Buffer I/O error on device dm-3, logical block 44078385
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031978 (offset 0 size 8192 starting block 44078384)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 12 d5 00 00 00 18 00
> end_request: I/O error, dev sda, sector 353555712
> Buffer I/O error on device dm-3, logical block 42533536
> Buffer I/O error on device dm-3, logical block 42533537
> Buffer I/O error on device dm-3, logical block 42533538
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10628547 (offset 0 size 12288 starting block 42533536)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 fb a0 00 00 20 00
> end_request: I/O error, dev sda, sector 366214048
> Buffer I/O error on device dm-3, logical block 44115828
> Buffer I/O error on device dm-3, logical block 44115829
> Buffer I/O error on device dm-3, logical block 44115830
> Buffer I/O error on device dm-3, logical block 44115831
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031956 (offset 0 size 16384 starting block 44115828)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 dd dd 80 00 00 28 00
> end_request: I/O error, dev sda, sector 366861696
> Buffer I/O error on device dm-3, logical block 44196784
> Buffer I/O error on device dm-3, logical block 44196785
> Buffer I/O error on device dm-3, logical block 44196786
> Buffer I/O error on device dm-3, logical block 44196787
> Buffer I/O error on device dm-3, logical block 44196788
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031940 (offset 0 size 20480 starting block 44196784)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 93 f1 78 00 00 18 00
> end_request: I/O error, dev sda, sector 362017144
> Buffer I/O error on device dm-3, logical block 43591215
> Buffer I/O error on device dm-3, logical block 43591216
> Buffer I/O error on device dm-3, logical block 43591217
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10887935 (offset 0 size 12288 starting block 43591215)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 92 dc 60 00 00 10 00
> end_request: I/O error, dev sda, sector 361946208
> Buffer I/O error on device dm-3, logical block 43582348
> Buffer I/O error on device dm-3, logical block 43582349
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10887935 (offset 12288 size 8192 starting block 43582348)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 dd dd a8 00 00 28 00
> end_request: I/O error, dev sda, sector 366861736
> Buffer I/O error on device dm-3, logical block 44196789
> Buffer I/O error on device dm-3, logical block 44196790
> Buffer I/O error on device dm-3, logical block 44196791
> Buffer I/O error on device dm-3, logical block 44196792
> Buffer I/O error on device dm-3, logical block 44196793
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031841 (offset 0 size 20480 starting block 44196789)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 dd dd d0 00 00 48 00
> end_request: I/O error, dev sda, sector 366861776
> Buffer I/O error on device dm-3, logical block 44196794
> Buffer I/O error on device dm-3, logical block 44196795
> Buffer I/O error on device dm-3, logical block 44196796
> Buffer I/O error on device dm-3, logical block 44196797
> Buffer I/O error on device dm-3, logical block 44196798
> Buffer I/O error on device dm-3, logical block 44196799
> Buffer I/O error on device dm-3, logical block 44196800
> Buffer I/O error on device dm-3, logical block 44196801
> Buffer I/O error on device dm-3, logical block 44196802
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031992 (offset 0 size 36864 starting block 44196794)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 17 ee f0 00 00 48 00
> end_request: I/O error, dev sda, sector 353890032
> Buffer I/O error on device dm-3, logical block 42575326
> Buffer I/O error on device dm-3, logical block 42575327
> Buffer I/O error on device dm-3, logical block 42575328
> Buffer I/O error on device dm-3, logical block 42575329
> Buffer I/O error on device dm-3, logical block 42575330
> Buffer I/O error on device dm-3, logical block 42575331
> Buffer I/O error on device dm-3, logical block 42575332
> Buffer I/O error on device dm-3, logical block 42575333
> Buffer I/O error on device dm-3, logical block 42575334
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10629931 (offset 0 size 36864 starting block 42575326)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 fd c0 00 00 20 00
> end_request: I/O error, dev sda, sector 366214592
> Buffer I/O error on device dm-3, logical block 44115896
> Buffer I/O error on device dm-3, logical block 44115897
> Buffer I/O error on device dm-3, logical block 44115898
> Buffer I/O error on device dm-3, logical block 44115899
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11011057 (offset 0 size 16384 starting block 44115896)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 cf 24 50 00 00 08 00
> end_request: I/O error, dev sda, sector 365896784
> Buffer I/O error on device dm-3, logical block 44076170
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11032013 (offset 0 size 4096 starting block 44076170)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 20 00 00 08 00
> end_request: I/O error, dev sda, sector 382721056
> Buffer I/O error on device dm-3, logical block 46179204
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 16384 size 4096 starting block 46179204)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 40 00 00 08 00
> end_request: I/O error, dev sda, sector 382721088
> Buffer I/O error on device dm-3, logical block 46179208
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 32768 size 4096 starting block 46179208)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 60 00 00 08 00
> end_request: I/O error, dev sda, sector 382721120
> Buffer I/O error on device dm-3, logical block 46179212
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 49152 size 4096 starting block 46179212)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 78 00 00 08 00
> end_request: I/O error, dev sda, sector 382721144
> Buffer I/O error on device dm-3, logical block 46179215
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 61440 size 4096 starting block 46179215)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 90 00 00 10 00
> end_request: I/O error, dev sda, sector 382721168
> Buffer I/O error on device dm-3, logical block 46179218
> Buffer I/O error on device dm-3, logical block 46179219
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 73728 size 8192 starting block 46179218)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc c8 00 00 08 00
> end_request: I/O error, dev sda, sector 382721224
> Buffer I/O error on device dm-3, logical block 46179225
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 102400 size 4096 starting block 46179225)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc f8 00 00 08 00
> end_request: I/O error, dev sda, sector 382721272
> Buffer I/O error on device dm-3, logical block 46179231
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 126976 size 4096 starting block 46179231)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 10 af 80 00 00 10 00
> end_request: I/O error, dev sda, sector 353415040
> Buffer I/O error on device dm-3, logical block 42515952
> Buffer I/O error on device dm-3, logical block 42515953
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10628554 (offset 0 size 8192 starting block 42515952)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 fe e0 00 00 20 00
> end_request: I/O error, dev sda, sector 366214880
> Buffer I/O error on device dm-3, logical block 44115932
> Buffer I/O error on device dm-3, logical block 44115933
> Buffer I/O error on device dm-3, logical block 44115934
> Buffer I/O error on device dm-3, logical block 44115935
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031999 (offset 0 size 16384 starting block 44115932)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 91 6e f0 00 00 10 00
> end_request: I/O error, dev sda, sector 361852656
> Buffer I/O error on device dm-3, logical block 43570654
> Buffer I/O error on device dm-3, logical block 43570655
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10896269 (offset 0 size 8192 starting block 43570654)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 10 b2 20 00 00 10 00
> end_request: I/O error, dev sda, sector 353415712
> Buffer I/O error on device dm-3, logical block 42516036
> Buffer I/O error on device dm-3, logical block 42516037
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10629161 (offset 0 size 8192 starting block 42516036)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 92 00 60 00 00 20 00
> end_request: I/O error, dev sda, sector 361889888
> Buffer I/O error on device dm-3, logical block 43575308
> Buffer I/O error on device dm-3, logical block 43575309
> Buffer I/O error on device dm-3, logical block 43575310
> Buffer I/O error on device dm-3, logical block 43575311
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10896254 (offset 0 size 16384 starting block 43575308)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] INFO: lockdep is turned off.
> Pid: 34, comm: khugepaged Tainted: G      D     3.0.0-rc2+ #72
> Call Trace:
>  [<ffffffff8104d276>] __might_sleep+0x112/0x117
>  [<ffffffff814cca48>] mutex_lock_nested+0x25/0x40
>  [<ffffffff810ee9b7>] perf_event_exit_task+0x2d/0x1cd
>  [<ffffffff8105ea1f>] do_exit+0x353/0x7fa
>  [<ffffffff8105c569>] ? kmsg_dump+0x89/0x13c
>  [<ffffffff814cf78d>] oops_end+0xbc/0xc5
>  [<ffffffff8100d087>] die+0x5a/0x63
>  [<ffffffff814cf18f>] do_general_protection+0x128/0x131
>  [<ffffffff814cead5>] general_protection+0x25/0x30
>  [<ffffffff81138590>] ? task_subsys_state.constprop.30+0x16/0x78
>  [<ffffffff81139792>] mem_cgroup_from_task+0x15/0x17
>  [<ffffffff8113a75a>] __mem_cgroup_try_charge+0x148/0x4b4
>  [<ffffffff810493f3>] ? need_resched+0x23/0x2d
>  [<ffffffff814cbf43>] ? preempt_schedule+0x46/0x4f
>  [<ffffffff8113afe8>] mem_cgroup_charge_common+0x9a/0xce
>  [<ffffffff8113b6d1>] mem_cgroup_newpage_charge+0x5d/0x5f
>  [<ffffffff81134024>] khugepaged+0x5da/0xfaf
>  [<ffffffff81078ea0>] ? __init_waitqueue_head+0x4b/0x4b
>  [<ffffffff81133a4a>] ? add_mm_counter.constprop.5+0x13/0x13
>  [<ffffffff81078625>] kthread+0xa8/0xb0
>  [<ffffffff814d13e8>] ? sub_preempt_count+0xa1/0xb4
>  [<ffffffff814d5664>] kernel_thread_helper+0x4/0x10
>  [<ffffffff814ce858>] ? retint_restore_args+0x13/0x13
>  [<ffffffff8107857d>] ? __init_kthread_worker+0x5a/0x5a
>  [<ffffffff814d5660>] ? gs_change+0x13/0x13
> Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 94 1b 40 00 00 18 00
> end_request: I/O error, dev sda, sector 362027840
> Buffer I/O error on device dm-3, logical block 43592552
> Buffer I/O error on device dm-3, logical block 43592553
> Buffer I/O error on device dm-3, logical block 43592554
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10896287 (offset 0 size 12288 starting block 43592552)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 93 06 b0 00 00 10 00
> end_request: I/O error, dev sda, sector 361957040
> Buffer I/O error on device dm-3, logical block 43583702
> Buffer I/O error on device dm-3, logical block 43583703
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10896287 (offset 12288 size 8192 starting block 43583702)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 ff 40 00 00 20 00
> end_request: I/O error, dev sda, sector 366214976
> Buffer I/O error on device dm-3, logical block 44115944
> Buffer I/O error on device dm-3, logical block 44115945
> Buffer I/O error on device dm-3, logical block 44115946
> Buffer I/O error on device dm-3, logical block 44115947
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11032040 (offset 0 size 16384 starting block 44115944)
> JBD2: Detected IO errors while flushing file data on dm-3-8
>

2011-06-09 23:42:15

by Ying Han

Subject: Re: 3.0rc2 oops in mem_cgroup_from_task

++cc Hugh who might have seen similar crashes on his machine.

--Ying

On Thu, Jun 9, 2011 at 3:47 PM, Ying Han wrote:
> ++cc to more mm folks.
>
> --Ying
>
> On Thu, Jun 9, 2011 at 2:29 PM, Dave Jones wrote:
>>
>> I just got the oops below while building a kernel.
>> When it oopsed, the kernel modesetting oops-on-framebuffer thing
>> happened, and the box wedged solid for about a minute.
>> Then it woke up, and I was able to ctrl-f1 back to my X session
>> to capture the dmesg. The stuff that follows the oops looks
>> quite disturbing, but I think it's from hanging with interrupts off
>> for a minute.
>>
>>         Dave
>>
>> general protection fault: 0000 [#1] PREEMPT SMP
>> CPU 1
>> Modules linked in: nfs fscache fuse nfsd lockd nfs_acl auth_rpcgss sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables btusb bluetooth arc4 zaurus dell_wmi sparse_keymap snd_usb_audio cdc_ether usbnet cdc_wdm mii snd_usbmidi_lib snd_rawmidi snd_hda_codec_hdmi snd_hda_codec_idt cdc_acm dell_laptop uvcvideo snd_hda_intel snd_hda_codec dcdbas snd_hwdep videodev microcode v4l2_compat_ioctl32 snd_seq snd_seq_device snd_pcm joydev iTCO_wdt i2c_i801 iTCO_vendor_support iwlagn pcspkr snd_timer mac80211 snd soundcore snd_page_alloc cfg80211 tg3 rfkill wmi virtio_net kvm_intel kvm ipv6 xts gf128mul dm_crypt i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
>>
>> Pid: 34, comm: khugepaged Not tainted 3.0.0-rc2+ #72 Dell Inc. Adamo 13 /0N70T0
>> RIP: 0010:[<ffffffff81138590>] [<ffffffff81138590>] task_subsys_state.constprop.30+0x16/0x78
>> RSP: 0018:ffff880135c97bd0 EFLAGS: 00010286
>> RAX: 6b6b6b6b6b6b6b6b RBX: ffff880013c48000 RCX: 0000000000000000
>> RDX: 0000000000000246 RSI: ffffffff81a26610 RDI: ffff880013c48000
>> RBP: ffff880135c97be0 R08: 0000000000000001 R09: 0000000000000000
>> R10: ffff880135c97cf0 R11: 0000000005491edb R12: ffff880013c48000
>> R13: 0000000000000200 R14: ffff880135c97ce8 R15: 0000000000000200
>> FS: 0000000000000000(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> CR2: 000000338f21400a CR3: 000000004f64c000 CR4: 00000000000406e0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Process khugepaged (pid: 34, threadinfo ffff880135c96000, task ffff880135c98000)
>> Stack:
>>  ffff880013c48000 0000000000000200 ffff880135c97bf0 ffffffff81139792
>>  ffff880135c97cc0 ffffffff8113a75a ffff880135c96000 ffff88013fc00000
>>  ffff880135c98000 ffff880135c97c78 ffff880135c98000 00000000001d2c40
>> Call Trace:
>>  [<ffffffff81139792>] mem_cgroup_from_task+0x15/0x17
>>  [<ffffffff8113a75a>] __mem_cgroup_try_charge+0x148/0x4b4
>>  [<ffffffff810493f3>] ? need_resched+0x23/0x2d
>>  [<ffffffff814cbf43>] ? preempt_schedule+0x46/0x4f
>>  [<ffffffff8113afe8>] mem_cgroup_charge_common+0x9a/0xce
>>  [<ffffffff8113b6d1>] mem_cgroup_newpage_charge+0x5d/0x5f
>>  [<ffffffff81134024>] khugepaged+0x5da/0xfaf
>>  [<ffffffff81078ea0>] ? __init_waitqueue_head+0x4b/0x4b
>>  [<ffffffff81133a4a>] ? add_mm_counter.constprop.5+0x13/0x13
>>  [<ffffffff81078625>] kthread+0xa8/0xb0
>>  [<ffffffff814d13e8>] ? sub_preempt_count+0xa1/0xb4
>>  [<ffffffff814d5664>] kernel_thread_helper+0x4/0x10
>>  [<ffffffff814ce858>] ? retint_restore_args+0x13/0x13
>>  [<ffffffff8107857d>] ? __init_kthread_worker+0x5a/0x5a
>>  [<ffffffff814d5660>] ? gs_change+0x13/0x13
>> Code: ff 84 c0 74 b5 eb 03 45 31 e4 5a 5b 4c 89 e0 41 5c 41 5d 5d c3 55 48 89 e5 41 54 53 66 66 66 66 90 48 8b 87 90 12 00 00 49 89 fc
>>  8b 58 50 e8 b4 d8 f3 ff 85 c0 74 4d 80 3d 8c e5 6a 01 00 75
>> RIP  [<ffffffff81138590>] task_subsys_state.constprop.30+0x16/0x78
>>  RSP <ffff880135c97bd0>
>> psmouse.c: TouchPad at isa0060/serio1/input0 lost synchronization, throwing 1 bytes away.
>> iwlagn 0000:04:00.0: Queue 4 stuck for 10000 ms.
>> iwlagn 0000:04:00.0: On demand firmware reload
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 db c6 48 00 00 18 00
>> end_request: I/O error, dev sda, sector 366724680
>> Buffer I/O error on device dm-3, logical block 44179657
>> Buffer I/O error on device dm-3, logical block 44179658
>> Buffer I/O error on device dm-3, logical block 44179659
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031834 (offset 0 size 12288 starting block 44179657)
>> ieee80211 phy0: Hardware restart was requested
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 db c7 80 00 00 18 00
>> end_request: I/O error, dev sda, sector 366724992
>> Buffer I/O error on device dm-3, logical block 44179696
>> Buffer I/O error on device dm-3, logical block 44179697
>> Buffer I/O error on device dm-3, logical block 44179698
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031834 (offset 12288 size 12288 starting block 44179696)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 db c7 b8 00 00 18 00
>> end_request: I/O error, dev sda, sector 366725048
>> Buffer I/O error on device dm-3, logical block 44179703
>> Buffer I/O error on device dm-3, logical block 44179704
>> Buffer I/O error on device dm-3, logical block 44179705
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031834 (offset 24576 size 12288 starting block 44179703)
>> ---[ end trace 95e652595eaf01aa ]---
>> psmouse.c: resync failed, issuing reconnect request
>> BUG: sleeping function called from invalid context at kernel/mutex.c:271
>> in_atomic(): 0, irqs_disabled(): 0, pid: 34, name: khugepaged
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 17 ee a0 00 00 50 00
>> end_request: I/O error, dev sda, sector 353889952
>> Buffer I/O error on device dm-3, logical block 42575316
>> Buffer I/O error on device dm-3, logical block 42575317
>> Buffer I/O error on device dm-3, logical block 42575318
>> Buffer I/O error on device dm-3, logical block 42575319
>> Buffer I/O error on device dm-3, logical block 42575320
>> Buffer I/O error on device dm-3, logical block 42575321
>> Buffer I/O error on device dm-3, logical block 42575322
>> Buffer I/O error on device dm-3, logical block 42575323
>> Buffer I/O error on device dm-3, logical block 42575324
>> Buffer I/O error on device dm-3, logical block 42575325
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10629148 (offset 0 size 40960 starting block 42575316)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 f8 40 00 00 20 00
>> end_request: I/O error, dev sda, sector 366213184
>> Buffer I/O error on device dm-3, logical block 44115720
>> Buffer I/O error on device dm-3, logical block 44115721
>> Buffer I/O error on device dm-3, logical block 44115722
>> Buffer I/O error on device dm-3, logical block 44115723
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031693 (offset 0 size 16384 starting block 44115720)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 cf 69 80 00 00 10 00
>> end_request: I/O error, dev sda, sector 365914496
>> Buffer I/O error on device dm-3, logical block 44078384
>> Buffer I/O error on device dm-3, logical block 44078385
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031978 (offset 0 size 8192 starting block 44078384)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 12 d5 00 00 00 18 00
>> end_request: I/O error, dev sda, sector 353555712
>> Buffer I/O error on device dm-3, logical block 42533536
>> Buffer I/O error on device dm-3, logical block 42533537
>> Buffer I/O error on device dm-3, logical block 42533538
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10628547 (offset 0 size 12288 starting block 42533536)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 fb a0 00 00 20 00
>> end_request: I/O error, dev sda, sector 366214048
>> Buffer I/O error on device dm-3, logical block 44115828
>> Buffer I/O error on device dm-3, logical block 44115829
>> Buffer I/O error on device dm-3, logical block 44115830
>> Buffer I/O error on device dm-3, logical block 44115831
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031956 (offset 0 size 16384 starting block 44115828)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 dd dd 80 00 00 28 00
>> end_request: I/O error, dev sda, sector 366861696
>> Buffer I/O error on device dm-3, logical block 44196784
>> Buffer I/O error on device dm-3, logical block 44196785
>> Buffer I/O error on device dm-3, logical block 44196786
>> Buffer I/O error on device dm-3, logical block 44196787
>> Buffer I/O error on device dm-3, logical block 44196788
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031940 (offset 0 size 20480 starting block 44196784)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 93 f1 78 00 00 18 00
>> end_request: I/O error, dev sda, sector 362017144
>> Buffer I/O error on device dm-3, logical block 43591215
>> Buffer I/O error on device dm-3, logical block 43591216
>> Buffer I/O error on device dm-3, logical block 43591217
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10887935 (offset 0 size 12288 starting block 43591215)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 92 dc 60 00 00 10 00
>> end_request: I/O error, dev sda, sector 361946208
>> Buffer I/O error on device dm-3, logical block 43582348
>> Buffer I/O error on device dm-3, logical block 43582349
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10887935 (offset 12288 size 8192 starting block 43582348)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 dd dd a8 00 00 28 00
>> end_request: I/O error, dev sda, sector 366861736
>> Buffer I/O error on device dm-3, logical block 44196789
>> Buffer I/O error on device dm-3, logical block 44196790
>> Buffer I/O error on device dm-3, logical block 44196791
>> Buffer I/O error on device dm-3, logical block 44196792
>> Buffer I/O error on device dm-3, logical block 44196793
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031841 (offset 0 size 20480 starting block 44196789)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 dd dd d0 00 00 48 00
>> end_request: I/O error, dev sda, sector 366861776
>> Buffer I/O error on device dm-3, logical block 44196794
>> Buffer I/O error on device dm-3, logical block 44196795
>> Buffer I/O error on device dm-3, logical block 44196796
>> Buffer I/O error on device dm-3, logical block 44196797
>> Buffer I/O error on device dm-3, logical block 44196798
>> Buffer I/O error on device dm-3, logical block 44196799
>> Buffer I/O error on device dm-3, logical block 44196800
>> Buffer I/O error on device dm-3, logical block 44196801
>> Buffer I/O error on device dm-3, logical block 44196802
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031992 (offset 0 size 36864 starting block 44196794)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 17 ee f0 00 00 48 00
>> end_request: I/O error, dev sda, sector 353890032
>> Buffer I/O error on device dm-3, logical block 42575326
>> Buffer I/O error on device dm-3, logical block 42575327
>> Buffer I/O error on device dm-3, logical block 42575328
>> Buffer I/O error on device dm-3, logical block 42575329
>> Buffer I/O error on device dm-3, logical block 42575330
>> Buffer I/O error on device dm-3, logical block 42575331
>> Buffer I/O error on device dm-3, logical block 42575332
>> Buffer I/O error on device dm-3, logical block 42575333
>> Buffer I/O error on device dm-3, logical block 42575334
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10629931 (offset 0 size 36864 starting block 42575326)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 fd c0 00 00 20 00
>> end_request: I/O error, dev sda, sector 366214592
>> Buffer I/O error on device dm-3, logical block 44115896
>> Buffer I/O error on device dm-3, logical block 44115897
>> Buffer I/O error on device dm-3, logical block 44115898
>> Buffer I/O error on device dm-3, logical block 44115899
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11011057 (offset 0 size 16384 starting block 44115896)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 cf 24 50 00 00 08 00
>> end_request: I/O error, dev sda, sector 365896784
>> Buffer I/O error on device dm-3, logical block 44076170
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11032013 (offset 0 size 4096 starting block 44076170)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 20 00 00 08 00
>> end_request: I/O error, dev sda, sector 382721056
>> Buffer I/O error on device dm-3, logical block 46179204
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 16384 size 4096 starting block 46179204)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 40 00 00 08 00
>> end_request: I/O error, dev sda, sector 382721088
>> Buffer I/O error on device dm-3, logical block 46179208
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 32768 size 4096 starting block 46179208)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 60 00 00 08 00
>> end_request: I/O error, dev sda, sector 382721120
>> Buffer I/O error on device dm-3, logical block 46179212
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 49152 size 4096 starting block 46179212)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 78 00 00 08 00
>> end_request: I/O error, dev sda, sector 382721144
>> Buffer I/O error on device dm-3, logical block 46179215
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 61440 size 4096 starting block 46179215)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 90 00 00 10 00
>> end_request: I/O error, dev sda, sector 382721168
>> Buffer I/O error on device dm-3, logical block 46179218
>> Buffer I/O error on device dm-3, logical block 46179219
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 73728 size 8192 starting block 46179218)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc c8 00 00 08 00
>> end_request: I/O error, dev sda, sector 382721224
>> Buffer I/O error on device dm-3, logical block 46179225
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 102400 size 4096 starting block 46179225)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc f8 00 00 08 00
>> end_request: I/O error, dev sda, sector 382721272
>> Buffer I/O error on device dm-3, logical block 46179231
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 126976 size 4096 starting block 46179231)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 10 af 80 00 00 10 00
>> end_request: I/O error, dev sda, sector 353415040
>> Buffer I/O error on device dm-3, logical block 42515952
>> Buffer I/O error on device dm-3, logical block 42515953
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10628554 (offset 0 size 8192 starting block 42515952)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 fe e0 00 00 20 00
>> end_request: I/O error, dev sda, sector 366214880
>> Buffer I/O error on device dm-3, logical block 44115932
>> Buffer I/O error on device dm-3, logical block 44115933
>> Buffer I/O error on device dm-3, logical block 44115934
>> Buffer I/O error on device dm-3, logical block 44115935
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031999 (offset 0 size 16384 starting block 44115932)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 91 6e f0 00 00 10 00
>> end_request: I/O error, dev sda, sector 361852656
>> Buffer I/O error on device dm-3, logical block 43570654
>> Buffer I/O error on device dm-3, logical block 43570655
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10896269 (offset 0 size 8192 starting block 43570654)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 10 b2 20 00 00 10 00
>> end_request: I/O error, dev sda, sector 353415712
>> Buffer I/O error on device dm-3, logical block 42516036
>> Buffer I/O error on device dm-3, logical block 42516037
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10629161 (offset 0 size 8192 starting block 42516036)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 92 00 60 00 00 20 00
>> end_request: I/O error, dev sda, sector 361889888
>> Buffer I/O error on device dm-3, logical block 43575308
>> Buffer I/O error on device dm-3, logical block 43575309
>> Buffer I/O error on device dm-3, logical block 43575310
>> Buffer I/O error on device dm-3, logical block 43575311
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10896254 (offset 0 size 16384 starting block 43575308)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] INFO: lockdep is turned off.
>> Pid: 34, comm: khugepaged Tainted: G      D     3.0.0-rc2+ #72
>> Call Trace:
>>  [<ffffffff8104d276>] __might_sleep+0x112/0x117
>>  [<ffffffff814cca48>] mutex_lock_nested+0x25/0x40
>>  [<ffffffff810ee9b7>] perf_event_exit_task+0x2d/0x1cd
>>  [<ffffffff8105ea1f>] do_exit+0x353/0x7fa
>>  [<ffffffff8105c569>] ? kmsg_dump+0x89/0x13c
>>  [<ffffffff814cf78d>] oops_end+0xbc/0xc5
>>  [<ffffffff8100d087>] die+0x5a/0x63
>>  [<ffffffff814cf18f>] do_general_protection+0x128/0x131
>>  [<ffffffff814cead5>] general_protection+0x25/0x30
>>  [<ffffffff81138590>] ? task_subsys_state.constprop.30+0x16/0x78
>>  [<ffffffff81139792>] mem_cgroup_from_task+0x15/0x17
>>  [<ffffffff8113a75a>] __mem_cgroup_try_charge+0x148/0x4b4
>>  [<ffffffff810493f3>] ? need_resched+0x23/0x2d
>>  [<ffffffff814cbf43>] ? preempt_schedule+0x46/0x4f
>>  [<ffffffff8113afe8>] mem_cgroup_charge_common+0x9a/0xce
>>  [<ffffffff8113b6d1>] mem_cgroup_newpage_charge+0x5d/0x5f
>>  [<ffffffff81134024>] khugepaged+0x5da/0xfaf
>>  [<ffffffff81078ea0>] ? __init_waitqueue_head+0x4b/0x4b
>>  [<ffffffff81133a4a>] ? add_mm_counter.constprop.5+0x13/0x13
>>  [<ffffffff81078625>] kthread+0xa8/0xb0
>>  [<ffffffff814d13e8>] ? sub_preempt_count+0xa1/0xb4
>>  [<ffffffff814d5664>] kernel_thread_helper+0x4/0x10
>>  [<ffffffff814ce858>] ? retint_restore_args+0x13/0x13
>>  [<ffffffff8107857d>] ? __init_kthread_worker+0x5a/0x5a
>>  [<ffffffff814d5660>] ? gs_change+0x13/0x13
>> Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 94 1b 40 00 00 18 00
>> end_request: I/O error, dev sda, sector 362027840
>> Buffer I/O error on device dm-3, logical block 43592552
>> Buffer I/O error on device dm-3, logical block 43592553
>> Buffer I/O error on device dm-3, logical block 43592554
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10896287 (offset 0 size 12288 starting block 43592552)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 93 06 b0 00 00 10 00
>> end_request: I/O error, dev sda, sector 361957040
>> Buffer I/O error on device dm-3, logical block 43583702
>> Buffer I/O error on device dm-3, logical block 43583703
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10896287 (offset 12288 size 8192 starting block 43583702)
>> sd 0:0:0:0: [sda] Unhandled error code
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
>> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 ff 40 00 00 20 00
>> end_request: I/O error, dev sda, sector 366214976
>> Buffer I/O error on device dm-3, logical block 44115944
>> Buffer I/O error on device dm-3, logical block 44115945
>> Buffer I/O error on device dm-3, logical block 44115946
>> Buffer I/O error on device dm-3, logical block 44115947
>> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11032040 (offset 0 size 16384 starting block 44115944)
>> JBD2: Detected IO errors while flushing file data on dm-3-8
>>
>

2011-06-10 00:20:55

by Kamezawa Hiroyuki

Subject: Re: 3.0rc2 oops in mem_cgroup_from_task

On Thu, 9 Jun 2011 16:42:09 -0700
Ying Han <[email protected]> wrote:

> ++cc Hugh who might have seen similar crashes on his machine.
>

Thank you for forwarding. Hmm. It seems the panic happens in khugepaged's
collapse_huge_page().

==
count_vm_event(THP_COLLAPSE_ALLOC);
if (unlikely(mem_cgroup_newpage_charge(new_page, mm, GFP_KERNEL))) {
==
It passes target mm to memcg and memcg gets a cgroup by
==
mem = mem_cgroup_from_task(rcu_dereference(mm->owner));
==
A panic here means... mm->owner's task_subsys_state contains a bad pointer?

I'll dig. Thanks.
-Kame

2011-06-10 01:31:08

by Hugh Dickins

Subject: Re: 3.0rc2 oops in mem_cgroup_from_task

On Fri, 10 Jun 2011, KAMEZAWA Hiroyuki wrote:
> On Thu, 9 Jun 2011 16:42:09 -0700
> Ying Han <[email protected]> wrote:
>
> > ++cc Hugh who might have seen similar crashes on his machine.

Yes, I was testing my tmpfs changes, and saw it on i386 yesterday
morning. Same trace as Dave's (including khugepaged, which may or
may not be relevant), aside from the i386/x86_64 differences.

BUG: unable to handle kernel paging request at 6b6b6b87

I needed to move forward with other work on that laptop, so just
jotted down the details to come back to later. It came after one
hour of building swapping load in memcg, I've not tried again since.

>
> Thank you for forwarding. Hmm. It seems the panic happens at khugepaged's
> page collapse_huge_page().

Yes, the inlining in my kernel was different,
so collapse_huge_page() showed up in my backtrace.

>
> ==
> count_vm_event(THP_COLLAPSE_ALLOC);
> if (unlikely(mem_cgroup_newpage_charge(new_page, mm, GFP_KERNEL))) {
> ==
> It passes target mm to memcg and memcg gets a cgroup by
> ==
> mem = mem_cgroup_from_task(rcu_dereference(mm->owner));
> ==
> Panic here means....mm->owner's task_subsys_state contains bad pointer ?

781cc621 <mem_cgroup_from_task>:
781cc621: 55 push %ebp
781cc622: 31 c0 xor %eax,%eax
781cc624: 89 e5 mov %esp,%ebp
781cc626: 8b 55 08 mov 0x8(%ebp),%edx
781cc629: 85 d2 test %edx,%edx
781cc62b: 74 09 je 781cc636 <mem_cgroup_from_task+0x15>
781cc62d: 8b 82 fc 08 00 00 mov 0x8fc(%edx),%eax
781cc633: 8b 40 1c mov 0x1c(%eax),%eax <==========
781cc636: c9 leave
781cc637: c3 ret

Hugh

2011-06-10 02:40:13

by Kamezawa Hiroyuki

Subject: Re: 3.0rc2 oops in mem_cgroup_from_task

On Thu, 9 Jun 2011 18:30:49 -0700 (PDT)
Hugh Dickins <[email protected]> wrote:

> On Fri, 10 Jun 2011, KAMEZAWA Hiroyuki wrote:
> > On Thu, 9 Jun 2011 16:42:09 -0700
> > Ying Han <[email protected]> wrote:
> >
> > > ++cc Hugh who might have seen similar crashes on his machine.
>
> Yes, I was testing my tmpfs changes, and saw it on i386 yesterday
> morning. Same trace as Dave's (including khugepaged, which may or
> may not be relevant), aside from the i386/x86_64 differences.
>
> BUG: unable to handle kernel paging request at 6b6b6b87
>
> I needed to move forward with other work on that laptop, so just
> jotted down the details to come back to later. It came after one
> hour of building swapping load in memcg, I've not tried again since.
>
> >
> > Thank you for forwarding. Hmm. It seems the panic happens at khugepaged's
> > page collapse_huge_page().
>
> Yes, the inlining in my kernel was different,
> so collapse_huge_page() showed up in my backtrace.
>
> >
> > ==
> > count_vm_event(THP_COLLAPSE_ALLOC);
> > if (unlikely(mem_cgroup_newpage_charge(new_page, mm, GFP_KERNEL))) {
> > ==
> > It passes target mm to memcg and memcg gets a cgroup by
> > ==
> > mem = mem_cgroup_from_task(rcu_dereference(mm->owner));
> > ==
> > Panic here means....mm->owner's task_subsys_state contains bad pointer ?
>
> 781cc621 <mem_cgroup_from_task>:
> 781cc621: 55 push %ebp
> 781cc622: 31 c0 xor %eax,%eax
> 781cc624: 89 e5 mov %esp,%ebp
> 781cc626: 8b 55 08 mov 0x8(%ebp),%edx
> 781cc629: 85 d2 test %edx,%edx
> 781cc62b: 74 09 je 781cc636 <mem_cgroup_from_task+0x15>
> 781cc62d: 8b 82 fc 08 00 00 mov 0x8fc(%edx),%eax
> 781cc633: 8b 40 1c mov 0x1c(%eax),%eax <==========
> 781cc636: c9 leave
> 781cc637: c3 ret
>

Then, the access to task->cgroups->subsys[?] causes the access to 6b6b6b87...

So task->cgroups or task->cgroups->subsys contains a bad pointer.
In the khugepaged case, it grabs an mm_struct and memcg accesses
(mm->owner)->cgroups->subsys.

So, from memcg's point of view, we need to question whether mm->owner is valid
for this kind of task.

Thank you for the input.

-Kame






2011-06-10 03:26:51

by Kamezawa Hiroyuki

Subject: Re: 3.0rc2 oops in mem_cgroup_from_task

On Fri, 10 Jun 2011 11:33:11 +0900
KAMEZAWA Hiroyuki <[email protected]> wrote:

> On Thu, 9 Jun 2011 18:30:49 -0700 (PDT)
> Hugh Dickins <[email protected]> wrote:

> > 781cc621 <mem_cgroup_from_task>:
> > 781cc621: 55 push %ebp
> > 781cc622: 31 c0 xor %eax,%eax
> > 781cc624: 89 e5 mov %esp,%ebp
> > 781cc626: 8b 55 08 mov 0x8(%ebp),%edx
> > 781cc629: 85 d2 test %edx,%edx
> > 781cc62b: 74 09 je 781cc636 <mem_cgroup_from_task+0x15>
> > 781cc62d: 8b 82 fc 08 00 00 mov 0x8fc(%edx),%eax
> > 781cc633: 8b 40 1c mov 0x1c(%eax),%eax <==========
> > 781cc636: c9 leave
> > 781cc637: c3 ret
> >
>
> then, access to task->cgroups->subsys[?] causes access to 6b6b6b87...
>
> Then, task->cgroups or task->cgroups->subsys contains bad pointer.
> Considering khugepaged, it grabs mm_struct and memcg make an access to
> (mm->owner)->cgroups->subsys.
>
> Then, from memcg's point of view, we need to doubt mm->owner is valid or not
> for this kind of tasks.
>

Dave's log shows 6b6b6b6b6b..., too.

I guess it is the "POISON_FREE" pattern of a freed slab object. Then task->cgroups may have been used after free.


Thanks,
-Kame

2011-06-10 04:02:50

by Kamezawa Hiroyuki

Subject: Re: 3.0rc2 oops in mem_cgroup_from_task

On Fri, 10 Jun 2011 12:19:49 +0900
KAMEZAWA Hiroyuki <[email protected]> wrote:

> On Fri, 10 Jun 2011 11:33:11 +0900
> KAMEZAWA Hiroyuki <[email protected]> wrote:
>
> > On Thu, 9 Jun 2011 18:30:49 -0700 (PDT)
> > Hugh Dickins <[email protected]> wrote:
>
> > > 781cc621 <mem_cgroup_from_task>:
> > > 781cc621: 55 push %ebp
> > > 781cc622: 31 c0 xor %eax,%eax
> > > 781cc624: 89 e5 mov %esp,%ebp
> > > 781cc626: 8b 55 08 mov 0x8(%ebp),%edx
> > > 781cc629: 85 d2 test %edx,%edx
> > > 781cc62b: 74 09 je 781cc636 <mem_cgroup_from_task+0x15>
> > > 781cc62d: 8b 82 fc 08 00 00 mov 0x8fc(%edx),%eax
> > > 781cc633: 8b 40 1c mov 0x1c(%eax),%eax <==========
> > > 781cc636: c9 leave
> > > 781cc637: c3 ret
> > >
> >
> > then, access to task->cgroups->subsys[?] causes access to 6b6b6b87...
> >
> > Then, task->cgroups or task->cgroups->subsys contains bad pointer.
> > Considering khugepaged, it grabs mm_struct and memcg make an access to
> > (mm->owner)->cgroups->subsys.
> >
> > Then, from memcg's point of view, we need to doubt mm->owner is valid or not
> > for this kind of tasks.
> >
>
> Dave's log shows 6b6b6b6b6b..., too.
>
> I guess it as "POISON_FREE" of slab object. Then, task->cgroups may used after free.
>
Ah, sorry.

0x1c(%eax) == 6b6b6b87 means %eax was 0x6b6b6b6b.
%eax held the contents of [task->cgroups]... hmm, then the task itself is a
freed (and poisoned) pointer. So it seems to be a problem with accessing mm->owner.

Thanks,
-Kame
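
The arithmetic in the message above can be checked with a small user-space sketch. It assumes a hypothetical stand-in for a freed i386 task_struct (poisoned_task and its helpers are illustration names, not kernel code) and takes the two offsets, 0x8fc and 0x1c, from the disassembly quoted in this thread. POISON_FREE is the 0x6b byte the slab debugging code writes over freed objects.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Slab poison byte written over freed objects (POISON_FREE in the kernel). */
#define POISON_FREE 0x6b

/* Offsets taken from the i386 disassembly quoted in this thread. */
#define TASK_CGROUPS_OFFSET 0x8fc /* mov 0x8fc(%edx),%eax */
#define CSS_SLOT_OFFSET     0x1c  /* mov 0x1c(%eax),%eax  */

/* Hypothetical stand-in for a freed, poisoned i386 task_struct. */
struct poisoned_task {
	unsigned char bytes[TASK_CGROUPS_OFFSET + sizeof(uint32_t)];
};

/* Simulate the slab allocator poisoning an object when it is freed. */
static void poison_task(struct poisoned_task *t)
{
	memset(t->bytes, POISON_FREE, sizeof(t->bytes));
}

/* First load: read the 32-bit "cgroups" pointer out of the task. */
static uint32_t load_cgroups(const struct poisoned_task *t)
{
	uint32_t v;

	memcpy(&v, t->bytes + TASK_CGROUPS_OFFSET, sizeof(v));
	return v;
}

/* Address the second load would dereference, i.e. the faulting address. */
static uint32_t fault_address(const struct poisoned_task *t)
{
	return load_cgroups(t) + CSS_SLOT_OFFSET;
}

/* Walk through both loads and check the values reported in the thread. */
static void poison_demo(void)
{
	struct poisoned_task t;

	poison_task(&t);
	assert(load_cgroups(&t) == 0x6b6b6b6bu);  /* %eax after first load */
	assert(fault_address(&t) == 0x6b6b6b87u); /* address in Hugh's BUG line */
}
```

Calling poison_demo() from a main() passes silently. The point of the sketch is that the first load succeeds and returns poison, so it is the task itself that was freed, not only task->cgroups.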











2011-06-10 04:37:20

by Kamezawa Hiroyuki

Subject: [PATCH] [BUGFIX] update mm->owner even if no next owner.


I think this can be a fix.
maybe good to CC Oleg.
==
From dff52fb35af0cf36486965d19ee79e04b59f1dc4 Mon Sep 17 00:00:00 2001
From: KAMEZAWA Hiroyuki <[email protected]>
Date: Fri, 10 Jun 2011 13:15:14 +0900
Subject: [PATCH] [BUGFIX] update mm->owner even if no next owner.

A panic is reported.

> Call Trace:
>  [<ffffffff81139792>] mem_cgroup_from_task+0x15/0x17
>  [<ffffffff8113a75a>] __mem_cgroup_try_charge+0x148/0x4b4
>  [<ffffffff810493f3>] ? need_resched+0x23/0x2d
>  [<ffffffff814cbf43>] ? preempt_schedule+0x46/0x4f
>  [<ffffffff8113afe8>] mem_cgroup_charge_common+0x9a/0xce
>  [<ffffffff8113b6d1>] mem_cgroup_newpage_charge+0x5d/0x5f
>  [<ffffffff81134024>] khugepaged+0x5da/0xfaf
>  [<ffffffff81078ea0>] ? __init_waitqueue_head+0x4b/0x4b
>  [<ffffffff81133a4a>] ? add_mm_counter.constprop.5+0x13/0x13
>  [<ffffffff81078625>] kthread+0xa8/0xb0
>  [<ffffffff814d13e8>] ? sub_preempt_count+0xa1/0xb4
>  [<ffffffff814d5664>] kernel_thread_helper+0x4/0x10
>  [<ffffffff814ce858>] ? retint_restore_args+0x13/0x13
>  [<ffffffff8107857d>] ? __init_kthread_worker+0x5a/0x5a

The code is.
> return container_of(task_subsys_state(p, mem_cgroup_subsys_id),
> struct mem_cgroup, css);


What happens here is accessing a freed task struct "p" obtained from mm->owner.
So, it is suspected that mm->owner points to a freed task struct.

At thread exit, we need to handle mm->owner. If the exiting thread == mm->owner,
we modify mm->owner to point to another existing task. But we do not update
mm->owner when there are no more threads. If a kernel thread, like khugepaged,
picks up an mm_struct without updating mm->mm_users, there is trouble.

When mm_users shows that the task is the last task belonging to the mm,
mm->owner is not updated and remains pointing to the task. So, in this case,
mm->owner points to a task that no longer exists. This was fine in the old days
because if there were no threads, no charge happened. But now we have ksm and
khugepaged.

rcu_read_lock() used in memcg is of no use because mm->owner can be
freed before we take rcu_read_lock().
So mm->owner should be cleared if there is no next owner.

Reported-by: Hugh Dickins <[email protected]>
Reported-by: Dave Jones <[email protected]>
Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
---
kernel/exit.c | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 20a4064..dbc3736 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -582,8 +582,10 @@ void mm_update_next_owner(struct mm_struct *mm)
 	struct task_struct *c, *g, *p = current;

 retry:
-	if (!mm_need_new_owner(mm, p))
+	if (!mm_need_new_owner(mm, p)) {
+		rcu_assign_pointer(mm->owner, NULL);
 		return;
+	}

 	read_lock(&tasklist_lock);
 	/*
@@ -617,7 +619,7 @@ retry:
 	 * most likely racing with swapoff (try_to_unuse()) or /proc or
 	 * ptrace or page migration (get_task_mm()). Mark owner as NULL.
 	 */
-	mm->owner = NULL;
+	rcu_assign_pointer(mm->owner, NULL);
 	return;

 assign_new_owner:
--
1.7.4.1

2011-06-10 05:15:34

by Daisuke Nishimura

Subject: Re: [PATCH] [BUGFIX] update mm->owner even if no next owner.

Thank you for your investigation and a patch.

I've not been able to replicate this issue on my machine, but I think it would be
better to push this patch to -stable if it fixes the issue.

Thanks,
Daisuke Nishimura.

On Fri, 10 Jun 2011 13:30:21 +0900
KAMEZAWA Hiroyuki <[email protected]> wrote:

>
> I think this can be a fix.
> maybe good to CC Oleg.
> ==
> From dff52fb35af0cf36486965d19ee79e04b59f1dc4 Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <[email protected]>
> Date: Fri, 10 Jun 2011 13:15:14 +0900
> Subject: [PATCH] [BUGFIX] update mm->owner even if no next owner.
>
> A panic is reported.
>
> > Call Trace:
> >  [<ffffffff81139792>] mem_cgroup_from_task+0x15/0x17
> >  [<ffffffff8113a75a>] __mem_cgroup_try_charge+0x148/0x4b4
> >  [<ffffffff810493f3>] ? need_resched+0x23/0x2d
> >  [<ffffffff814cbf43>] ? preempt_schedule+0x46/0x4f
> >  [<ffffffff8113afe8>] mem_cgroup_charge_common+0x9a/0xce
> >  [<ffffffff8113b6d1>] mem_cgroup_newpage_charge+0x5d/0x5f
> >  [<ffffffff81134024>] khugepaged+0x5da/0xfaf
> >  [<ffffffff81078ea0>] ? __init_waitqueue_head+0x4b/0x4b
> >  [<ffffffff81133a4a>] ? add_mm_counter.constprop.5+0x13/0x13
> >  [<ffffffff81078625>] kthread+0xa8/0xb0
> >  [<ffffffff814d13e8>] ? sub_preempt_count+0xa1/0xb4
> >  [<ffffffff814d5664>] kernel_thread_helper+0x4/0x10
> >  [<ffffffff814ce858>] ? retint_restore_args+0x13/0x13
> >  [<ffffffff8107857d>] ? __init_kthread_worker+0x5a/0x5a
>
> The code is.
> > return container_of(task_subsys_state(p, mem_cgroup_subsys_id),
> > struct mem_cgroup, css);
>
>
> What happens here is accessing a freed task struct "p" obtained from mm->owner.
> So, it is suspected that mm->owner points to a freed task struct.
>
> At thread exit, we need to handle mm->owner. If the exiting thread == mm->owner,
> we modify mm->owner to point to another existing task. But we do not update
> mm->owner when there are no more threads. If a kernel thread, like khugepaged,
> picks up an mm_struct without updating mm->mm_users, there is trouble.
>
> When mm_users shows that the task is the last task belonging to the mm,
> mm->owner is not updated and remains pointing to the task. So, in this case,
> mm->owner points to a task that no longer exists. This was fine in the old days
> because if there were no threads, no charge happened. But now we have ksm and
> khugepaged.
>
> rcu_read_lock() used in memcg is of no use because mm->owner can be
> freed before we take rcu_read_lock().
> So mm->owner should be cleared if there is no next owner.
>
> Reported-by: Hugh Dickins <[email protected]>
> Reported-by: Dave Jones <[email protected]>
> Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
> ---
> kernel/exit.c | 6 ++++--
> 1 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 20a4064..dbc3736 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -582,8 +582,10 @@ void mm_update_next_owner(struct mm_struct *mm)
> struct task_struct *c, *g, *p = current;
>
> retry:
> - if (!mm_need_new_owner(mm, p))
> + if (!mm_need_new_owner(mm, p)) {
> + rcu_assign_pointer(mm->owner, NULL);
> return;
> + }
>
> read_lock(&tasklist_lock);
> /*
> @@ -617,7 +619,7 @@ retry:
> * most likely racing with swapoff (try_to_unuse()) or /proc or
> * ptrace or page migration (get_task_mm()). Mark owner as NULL.
> */
> - mm->owner = NULL;
> + rcu_assign_pointer(mm->owner, NULL);
> return;
>
> assign_new_owner:
> --
> 1.7.4.1
>
>

2011-06-10 05:29:07

by Kamezawa Hiroyuki

Subject: Re: [PATCH] [BUGFIX] update mm->owner even if no next owner.

On Fri, 10 Jun 2011 13:21:46 +0800
Xiaotian Feng <[email protected]> wrote:

> On Fri, Jun 10, 2011 at 12:30 PM, KAMEZAWA Hiroyuki <
> [email protected]> wrote:
>
> >
> > I think this can be a fix.
> > maybe good to CC Oleg.
>
> ==
> > From dff52fb35af0cf36486965d19ee79e04b59f1dc4 Mon Sep 17 00:00:00 2001
> > From: KAMEZAWA Hiroyuki <[email protected]>
> > Date: Fri, 10 Jun 2011 13:15:14 +0900
> > Subject: [PATCH] [BUGFIX] update mm->owner even if no next owner.
> >
> > A panic is reported.
> >
> > > Call Trace:
> > > [<ffffffff81139792>] mem_cgroup_from_task+0x15/0x17
> > > [<ffffffff8113a75a>] __mem_cgroup_try_charge+0x148/0x4b4
> > > [<ffffffff810493f3>] ? need_resched+0x23/0x2d
> > > [<ffffffff814cbf43>] ? preempt_schedule+0x46/0x4f
> > > [<ffffffff8113afe8>] mem_cgroup_charge_common+0x9a/0xce
> > > [<ffffffff8113b6d1>] mem_cgroup_newpage_charge+0x5d/0x5f
> > > [<ffffffff81134024>] khugepaged+0x5da/0xfaf
> > > [<ffffffff81078ea0>] ? __init_waitqueue_head+0x4b/0x4b
> > > [<ffffffff81133a4a>] ? add_mm_counter.constprop.5+0x13/0x13
> > > [<ffffffff81078625>] kthread+0xa8/0xb0
> > > [<ffffffff814d13e8>] ? sub_preempt_count+0xa1/0xb4
> > > [<ffffffff814d5664>] kernel_thread_helper+0x4/0x10
> > > [<ffffffff814ce858>] ? retint_restore_args+0x13/0x13
> > > [<ffffffff8107857d>] ? __init_kthread_worker+0x5a/0x5a
> >
> > The code is.
> > > return container_of(task_subsys_state(p, mem_cgroup_subsys_id),
> > > struct mem_cgroup, css);
> >
> >
> > What happens here is accessing a freed task struct "p" obtained from mm->owner.
> > So, it is suspected that mm->owner points to a freed task struct.
> >
> >
> But from the bug itself, it looks more likely kernel is hitting a freed
> p->cgroups, right?
> If p is already freed, the kernel will fault on
> 781cc62d: 8b 82 fc 08 00 00 mov 0x8fc(%edx),%eax
>
> Then you will not get a value of 6b6b6b87, right?


%edx here is a pointer to the task struct.
Then task->cgroups == 0x6b6b6b6b, which means "task" itself has been freed.

Thanks,
-Kame



2011-06-10 21:49:51

by Hugh Dickins

Subject: Re: [PATCH] [BUGFIX] update mm->owner even if no next owner.

On Fri, 10 Jun 2011, KAMEZAWA Hiroyuki wrote:
>
> I think this can be a fix.

Sorry, I think not: I've not digested your rationale,
but three things stand out:

1. Why has this only just started happening? I may not have run that
test on 3.0-rc1, but surely I ran it for hours with 2.6.39;
maybe not with khugepaged, but certainly with ksmd.

2. Your hunk below:
> - if (!mm_need_new_owner(mm, p))
> + if (!mm_need_new_owner(mm, p)) {
> + rcu_assign_pointer(mm->owner, NULL);
is now setting mm->owner to NULL at times when we were sure it did not
need updating before (task is not the owner): you're damaging mm->owner.

3. There's a patch from Andrea in 3.0-rc1 which looks very likely to be
relevant, 692e0b35427a "mm: thp: optimize memcg charge in khugepaged".
I'll try reproducing without that tonight (I crashed in 20 minutes
this morning, so it's not too hard).

Hugh
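
Point 2 above can be illustrated with a simplified user-space model. It assumes that mm_need_new_owner() returns false in two cases: when the mm has no other users, and when the exiting task is not the owner. The struct and function names below (model_task, model_mm, model_need_new_owner, model_exit_patched) are hypothetical stand-ins, not the kernel's definitions, and locking and RCU are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; only the fields used here. */
struct model_task {
	int pid;
};

struct model_mm {
	int mm_users;             /* models mm->mm_users */
	struct model_task *owner; /* models mm->owner */
};

/*
 * Model of the check: a new owner is needed only if other users of the
 * mm remain AND the exiting task is the current owner.
 */
static int model_need_new_owner(const struct model_mm *mm,
				const struct model_task *p)
{
	if (mm->mm_users <= 1)
		return 0;
	if (mm->owner != p)
		return 0;
	return 1;
}

/* Early-return path as changed by the proposed hunk. */
static void model_exit_patched(struct model_mm *mm, const struct model_task *p)
{
	if (!model_need_new_owner(mm, p)) {
		mm->owner = NULL; /* also runs when p was never the owner */
		return;
	}
	/* Owner hand-over is omitted in this sketch. */
}

/* A non-owner thread exits while the owning thread is still alive. */
static void model_demo(void)
{
	struct model_task leader = { 1 }, worker = { 2 };
	struct model_mm mm = { 2, &leader };

	model_exit_patched(&mm, &worker);
	assert(mm.owner == NULL); /* the still-valid owner was discarded */
}
```

In this model the owner field is cleared although the leader task is still running, which is the damage to mm->owner described in point 2.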

> maybe good to CC Oleg.
> ==
> From dff52fb35af0cf36486965d19ee79e04b59f1dc4 Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <[email protected]>
> Date: Fri, 10 Jun 2011 13:15:14 +0900
> Subject: [PATCH] [BUGFIX] update mm->owner even if no next owner.
>
> A panic is reported.
>
> > Call Trace:
> >  [<ffffffff81139792>] mem_cgroup_from_task+0x15/0x17
> >  [<ffffffff8113a75a>] __mem_cgroup_try_charge+0x148/0x4b4
> >  [<ffffffff810493f3>] ? need_resched+0x23/0x2d
> >  [<ffffffff814cbf43>] ? preempt_schedule+0x46/0x4f
> >  [<ffffffff8113afe8>] mem_cgroup_charge_common+0x9a/0xce
> >  [<ffffffff8113b6d1>] mem_cgroup_newpage_charge+0x5d/0x5f
> >  [<ffffffff81134024>] khugepaged+0x5da/0xfaf
> >  [<ffffffff81078ea0>] ? __init_waitqueue_head+0x4b/0x4b
> >  [<ffffffff81133a4a>] ? add_mm_counter.constprop.5+0x13/0x13
> >  [<ffffffff81078625>] kthread+0xa8/0xb0
> >  [<ffffffff814d13e8>] ? sub_preempt_count+0xa1/0xb4
> >  [<ffffffff814d5664>] kernel_thread_helper+0x4/0x10
> >  [<ffffffff814ce858>] ? retint_restore_args+0x13/0x13
> >  [<ffffffff8107857d>] ? __init_kthread_worker+0x5a/0x5a
>
> The code is.
> > return container_of(task_subsys_state(p, mem_cgroup_subsys_id),
> > struct mem_cgroup, css);
>
>
> What happens here is accessing a freed task struct "p" obtained from mm->owner.
> So, it is suspected that mm->owner points to a freed task struct.
>
> At thread exit, we need to handle mm->owner. If the exiting thread == mm->owner,
> we modify mm->owner to point to another existing task. But we do not update
> mm->owner when there are no more threads. If a kernel thread, like khugepaged,
> picks up an mm_struct without updating mm->mm_users, there is trouble.
>
> When mm_users shows that the task is the last task belonging to the mm,
> mm->owner is not updated and remains pointing to the task. So, in this case,
> mm->owner points to a task that no longer exists. This was fine in the old days
> because if there were no threads, no charge happened. But now we have ksm and
> khugepaged.
>
> rcu_read_lock() used in memcg is of no use because mm->owner can be
> freed before we take rcu_read_lock().
> So mm->owner should be cleared if there is no next owner.
>
> Reported-by: Hugh Dickins <[email protected]>
> Reported-by: Dave Jones <[email protected]>
> Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
> ---
> kernel/exit.c | 6 ++++--
> 1 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 20a4064..dbc3736 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -582,8 +582,10 @@ void mm_update_next_owner(struct mm_struct *mm)
> struct task_struct *c, *g, *p = current;
>
> retry:
> - if (!mm_need_new_owner(mm, p))
> + if (!mm_need_new_owner(mm, p)) {
> + rcu_assign_pointer(mm->owner, NULL);
> return;
> + }
>
> read_lock(&tasklist_lock);
> /*
> @@ -617,7 +619,7 @@ retry:
> * most likely racing with swapoff (try_to_unuse()) or /proc or
> * ptrace or page migration (get_task_mm()). Mark owner as NULL.
> */
> - mm->owner = NULL;
> + rcu_assign_pointer(mm->owner, NULL);
> return;
>
> assign_new_owner:
> --
> 1.7.4.1

2011-06-10 23:54:56

by Johannes Weiner

Subject: Re: [PATCH] [BUGFIX] update mm->owner even if no next owner.

On Fri, Jun 10, 2011 at 02:49:35PM -0700, Hugh Dickins wrote:
> On Fri, 10 Jun 2011, KAMEZAWA Hiroyuki wrote:
> >
> > I think this can be a fix.
>
> Sorry, I think not: I've not digested your rationale,
> but three things stand out:
>
> 1. Why has this only just started happening? I may not have run that
> test on 3.0-rc1, but surely I ran it for hours with 2.6.39;
> maybe not with khugepaged, but certainly with ksmd.
>
> 2. Your hunk below:
> > - if (!mm_need_new_owner(mm, p))
> > + if (!mm_need_new_owner(mm, p)) {
> > + rcu_assign_pointer(mm->owner, NULL);
> is now setting mm->owner to NULL at times when we were sure it did not
> need updating before (task is not the owner): you're damaging mm->owner.
>
> 3. There's a patch from Andrea in 3.0-rc1 which looks very likely to be
> relevant, 692e0b35427a "mm: thp: optimize memcg charge in khugepaged".
> I'll try reproducing without that tonight (I crashed in 20 minutes
> this morning, so it's not too hard).

It looks likely. This change moved the memcg charge out of the
mmap_sem read section, which kept the last task of the mm from
exiting:

do_exit
exit_mm
mmput
khugepaged_exit
down_write(&mm->mmap_sem);
up_write(&mm->mmap_sem);
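
The ordering argument above can be sketched with a toy reader/writer lock. This is a single-threaded model under stated assumptions, not kernel code: sketch_rwsem and its helpers are hypothetical, and the exit path is reduced to "must be able to take the semaphore for write", going by the call chain shown above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy reader/writer semaphore: any number of readers or one writer. */
struct sketch_rwsem {
	int readers;
	bool writer;
};

static bool sketch_down_read_trylock(struct sketch_rwsem *s)
{
	if (s->writer)
		return false;
	s->readers++;
	return true;
}

static void sketch_up_read(struct sketch_rwsem *s)
{
	s->readers--;
}

static bool sketch_down_write_trylock(struct sketch_rwsem *s)
{
	if (s->writer || s->readers > 0)
		return false;
	s->writer = true;
	return true;
}

static void sketch_up_write(struct sketch_rwsem *s)
{
	s->writer = false;
}

/*
 * Models the exiting task: it gets past the down_write()/up_write()
 * pair only when nobody holds the semaphore.
 */
static bool sketch_exit_can_proceed(struct sketch_rwsem *mmap_sem)
{
	if (!sketch_down_write_trylock(mmap_sem))
		return false;
	sketch_up_write(mmap_sem);
	return true;
}

static void sketch_rwsem_demo(void)
{
	struct sketch_rwsem mmap_sem = { 0, false };

	/* Charge inside the read section: the exit path is held off. */
	assert(sketch_down_read_trylock(&mmap_sem));
	assert(!sketch_exit_can_proceed(&mmap_sem));
	sketch_up_read(&mmap_sem);

	/* Charge outside the read section: nothing holds the exit back. */
	assert(sketch_exit_can_proceed(&mmap_sem));
}
```

In the model, a charge done while the read side is held cannot overlap with the last task finishing its exit, while a charge done after the read side is dropped can.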

2011-06-11 00:46:42

by Hiroyuki Kamezawa

Subject: Re: [PATCH] [BUGFIX] update mm->owner even if no next owner.

2011/6/11 Hugh Dickins <[email protected]>:
> On Fri, 10 Jun 2011, KAMEZAWA Hiroyuki wrote:
>>
>> I think this can be a fix.
>
> Sorry, I think not: I've not digested your rationale,
> but three things stand out:
>
> 1. Why has this only just started happening?  I may not have run that
>    test on 3.0-rc1, but surely I ran it for hours with 2.6.39;
>    maybe not with khugepaged, but certainly with ksmd.
>
Not sure. I pointed this out just from review, because I found that the
"charge" in khugepaged is now outside mmap_sem.

> 2. Your hunk below:
>> -       if (!mm_need_new_owner(mm, p))
>> +       if (!mm_need_new_owner(mm, p)) {
>> +               rcu_assign_pointer(mm->owner, NULL);
>    is now setting mm->owner to NULL at times when we were sure it did not
>    need updating before (task is not the owner): you're damaging mm->owner.
>
Ah, yes. It's my mistake.

> 3. There's a patch from Andrea in 3.0-rc1 which looks very likely to be
>    relevant, 692e0b35427a "mm: thp: optimize memcg charge in khugepaged".
>    I'll try reproducing without that tonight (I crashed in 20 minutes
>    this morning, so it's not too hard).
>

Thanks,
-Kame

2011-06-11 08:19:46

by Michal Hocko

Subject: Re: 3.0rc2 oops in mem_cgroup_from_task

[Let's add some more people to CC]

On Thu 09-06-11 17:29:57, Dave Jones wrote:
>
> I just got the oops below while building a kernel.
> When it oopsed, the kernel modesetting oops-on-framebuffer thing
> happened, and the box wedged solid for about a minute.
> Then it woke up, and I was able to ctrl-f1 back to my X session
> to capture the dmesg. The stuff that follows the oops looks
> quite disturbing, but I think it's from hanging with interrupts off
> for a minute.
>
> Dave
>
> general protection fault: 0000 [#1] PREEMPT SMP
> CPU 1
> Modules linked in: nfs fscache fuse nfsd lockd nfs_acl auth_rpcgss sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables btusb bluetooth arc4 zaurus dell_wmi sparse_keymap snd_usb_audio cdc_ether usbnet cdc_wdm mii snd_usbmidi_lib snd_rawmidi snd_hda_codec_hdmi snd_hda_codec_idt cdc_acm dell_laptop uvcvideo snd_hda_intel snd_hda_codec dcdbas snd_hwdep videodev microcode v4l2_compat_ioctl32 snd_seq snd_seq_device snd_pcm joydev iTCO_wdt i2c_i801 iTCO_vendor_support iwlagn pcspkr snd_timer mac80211 snd soundcore snd_page_alloc cfg80211 tg3 rfkill wmi virtio_net kvm_intel kvm ipv6 xts gf128mul dm_crypt i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
>
> Pid: 34, comm: khugepaged Not tainted 3.0.0-rc2+ #72 Dell Inc. Adamo 13 /0N70T0
> RIP: 0010:[<ffffffff81138590>] [<ffffffff81138590>] task_subsys_state.constprop.30+0x16/0x78
> RSP: 0018:ffff880135c97bd0 EFLAGS: 00010286
> RAX: 6b6b6b6b6b6b6b6b RBX: ffff880013c48000 RCX: 0000000000000000
> RDX: 0000000000000246 RSI: ffffffff81a26610 RDI: ffff880013c48000
> RBP: ffff880135c97be0 R08: 0000000000000001 R09: 0000000000000000
> R10: ffff880135c97cf0 R11: 0000000005491edb R12: ffff880013c48000
> R13: 0000000000000200 R14: ffff880135c97ce8 R15: 0000000000000200
> FS: 0000000000000000(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 000000338f21400a CR3: 000000004f64c000 CR4: 00000000000406e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process khugepaged (pid: 34, threadinfo ffff880135c96000, task ffff880135c98000)
> Stack:
> ffff880013c48000 0000000000000200 ffff880135c97bf0 ffffffff81139792
> ffff880135c97cc0 ffffffff8113a75a ffff880135c96000 ffff88013fc00000
> ffff880135c98000 ffff880135c97c78 ffff880135c98000 00000000001d2c40
> Call Trace:
> [<ffffffff81139792>] mem_cgroup_from_task+0x15/0x17
> [<ffffffff8113a75a>] __mem_cgroup_try_charge+0x148/0x4b4
> [<ffffffff810493f3>] ? need_resched+0x23/0x2d
> [<ffffffff814cbf43>] ? preempt_schedule+0x46/0x4f
> [<ffffffff8113afe8>] mem_cgroup_charge_common+0x9a/0xce
> [<ffffffff8113b6d1>] mem_cgroup_newpage_charge+0x5d/0x5f
> [<ffffffff81134024>] khugepaged+0x5da/0xfaf
> [<ffffffff81078ea0>] ? __init_waitqueue_head+0x4b/0x4b
> [<ffffffff81133a4a>] ? add_mm_counter.constprop.5+0x13/0x13
> [<ffffffff81078625>] kthread+0xa8/0xb0
> [<ffffffff814d13e8>] ? sub_preempt_count+0xa1/0xb4
> [<ffffffff814d5664>] kernel_thread_helper+0x4/0x10
> [<ffffffff814ce858>] ? retint_restore_args+0x13/0x13
> [<ffffffff8107857d>] ? __init_kthread_worker+0x5a/0x5a
> [<ffffffff814d5660>] ? gs_change+0x13/0x13
> Code: ff 84 c0 74 b5 eb 03 45 31 e4 5a 5b 4c 89 e0 41 5c 41 5d 5d c3 55 48 89 e5 41 54 53 66 66 66 66 90 48 8b 87 90 12 00 00 49 89 fc
> 8b 58 50 e8 b4 d8 f3 ff 85 c0 74 4d 80 3d 8c e5 6a 01 00 75
> RIP [<ffffffff81138590>] task_subsys_state.constprop.30+0x16/0x78
> RSP <ffff880135c97bd0>
> psmouse.c: TouchPad at isa0060/serio1/input0 lost synchronization, throwing 1 bytes away.
> iwlagn 0000:04:00.0: Queue 4 stuck for 10000 ms.
> iwlagn 0000:04:00.0: On demand firmware reload
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 db c6 48 00 00 18 00
> end_request: I/O error, dev sda, sector 366724680
> Buffer I/O error on device dm-3, logical block 44179657
> Buffer I/O error on device dm-3, logical block 44179658
> Buffer I/O error on device dm-3, logical block 44179659
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031834 (offset 0 size 12288 starting block 44179657)
> ieee80211 phy0: Hardware restart was requested
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 db c7 80 00 00 18 00
> end_request: I/O error, dev sda, sector 366724992
> Buffer I/O error on device dm-3, logical block 44179696
> Buffer I/O error on device dm-3, logical block 44179697
> Buffer I/O error on device dm-3, logical block 44179698
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031834 (offset 12288 size 12288 starting block 44179696)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 db c7 b8 00 00 18 00
> end_request: I/O error, dev sda, sector 366725048
> Buffer I/O error on device dm-3, logical block 44179703
> Buffer I/O error on device dm-3, logical block 44179704
> Buffer I/O error on device dm-3, logical block 44179705
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031834 (offset 24576 size 12288 starting block 44179703)
> ---[ end trace 95e652595eaf01aa ]---
> psmouse.c: resync failed, issuing reconnect request
> BUG: sleeping function called from invalid context at kernel/mutex.c:271
> in_atomic(): 0, irqs_disabled(): 0, pid: 34, name: khugepaged
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 17 ee a0 00 00 50 00
> end_request: I/O error, dev sda, sector 353889952
> Buffer I/O error on device dm-3, logical block 42575316
> Buffer I/O error on device dm-3, logical block 42575317
> Buffer I/O error on device dm-3, logical block 42575318
> Buffer I/O error on device dm-3, logical block 42575319
> Buffer I/O error on device dm-3, logical block 42575320
> Buffer I/O error on device dm-3, logical block 42575321
> Buffer I/O error on device dm-3, logical block 42575322
> Buffer I/O error on device dm-3, logical block 42575323
> Buffer I/O error on device dm-3, logical block 42575324
> Buffer I/O error on device dm-3, logical block 42575325
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10629148 (offset 0 size 40960 starting block 42575316)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 f8 40 00 00 20 00
> end_request: I/O error, dev sda, sector 366213184
> Buffer I/O error on device dm-3, logical block 44115720
> Buffer I/O error on device dm-3, logical block 44115721
> Buffer I/O error on device dm-3, logical block 44115722
> Buffer I/O error on device dm-3, logical block 44115723
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031693 (offset 0 size 16384 starting block 44115720)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 cf 69 80 00 00 10 00
> end_request: I/O error, dev sda, sector 365914496
> Buffer I/O error on device dm-3, logical block 44078384
> Buffer I/O error on device dm-3, logical block 44078385
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031978 (offset 0 size 8192 starting block 44078384)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 12 d5 00 00 00 18 00
> end_request: I/O error, dev sda, sector 353555712
> Buffer I/O error on device dm-3, logical block 42533536
> Buffer I/O error on device dm-3, logical block 42533537
> Buffer I/O error on device dm-3, logical block 42533538
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10628547 (offset 0 size 12288 starting block 42533536)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 fb a0 00 00 20 00
> end_request: I/O error, dev sda, sector 366214048
> Buffer I/O error on device dm-3, logical block 44115828
> Buffer I/O error on device dm-3, logical block 44115829
> Buffer I/O error on device dm-3, logical block 44115830
> Buffer I/O error on device dm-3, logical block 44115831
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031956 (offset 0 size 16384 starting block 44115828)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 dd dd 80 00 00 28 00
> end_request: I/O error, dev sda, sector 366861696
> Buffer I/O error on device dm-3, logical block 44196784
> Buffer I/O error on device dm-3, logical block 44196785
> Buffer I/O error on device dm-3, logical block 44196786
> Buffer I/O error on device dm-3, logical block 44196787
> Buffer I/O error on device dm-3, logical block 44196788
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031940 (offset 0 size 20480 starting block 44196784)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 93 f1 78 00 00 18 00
> end_request: I/O error, dev sda, sector 362017144
> Buffer I/O error on device dm-3, logical block 43591215
> Buffer I/O error on device dm-3, logical block 43591216
> Buffer I/O error on device dm-3, logical block 43591217
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10887935 (offset 0 size 12288 starting block 43591215)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 92 dc 60 00 00 10 00
> end_request: I/O error, dev sda, sector 361946208
> Buffer I/O error on device dm-3, logical block 43582348
> Buffer I/O error on device dm-3, logical block 43582349
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10887935 (offset 12288 size 8192 starting block 43582348)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 dd dd a8 00 00 28 00
> end_request: I/O error, dev sda, sector 366861736
> Buffer I/O error on device dm-3, logical block 44196789
> Buffer I/O error on device dm-3, logical block 44196790
> Buffer I/O error on device dm-3, logical block 44196791
> Buffer I/O error on device dm-3, logical block 44196792
> Buffer I/O error on device dm-3, logical block 44196793
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031841 (offset 0 size 20480 starting block 44196789)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 dd dd d0 00 00 48 00
> end_request: I/O error, dev sda, sector 366861776
> Buffer I/O error on device dm-3, logical block 44196794
> Buffer I/O error on device dm-3, logical block 44196795
> Buffer I/O error on device dm-3, logical block 44196796
> Buffer I/O error on device dm-3, logical block 44196797
> Buffer I/O error on device dm-3, logical block 44196798
> Buffer I/O error on device dm-3, logical block 44196799
> Buffer I/O error on device dm-3, logical block 44196800
> Buffer I/O error on device dm-3, logical block 44196801
> Buffer I/O error on device dm-3, logical block 44196802
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031992 (offset 0 size 36864 starting block 44196794)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 17 ee f0 00 00 48 00
> end_request: I/O error, dev sda, sector 353890032
> Buffer I/O error on device dm-3, logical block 42575326
> Buffer I/O error on device dm-3, logical block 42575327
> Buffer I/O error on device dm-3, logical block 42575328
> Buffer I/O error on device dm-3, logical block 42575329
> Buffer I/O error on device dm-3, logical block 42575330
> Buffer I/O error on device dm-3, logical block 42575331
> Buffer I/O error on device dm-3, logical block 42575332
> Buffer I/O error on device dm-3, logical block 42575333
> Buffer I/O error on device dm-3, logical block 42575334
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10629931 (offset 0 size 36864 starting block 42575326)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 fd c0 00 00 20 00
> end_request: I/O error, dev sda, sector 366214592
> Buffer I/O error on device dm-3, logical block 44115896
> Buffer I/O error on device dm-3, logical block 44115897
> Buffer I/O error on device dm-3, logical block 44115898
> Buffer I/O error on device dm-3, logical block 44115899
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11011057 (offset 0 size 16384 starting block 44115896)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 cf 24 50 00 00 08 00
> end_request: I/O error, dev sda, sector 365896784
> Buffer I/O error on device dm-3, logical block 44076170
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11032013 (offset 0 size 4096 starting block 44076170)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 20 00 00 08 00
> end_request: I/O error, dev sda, sector 382721056
> Buffer I/O error on device dm-3, logical block 46179204
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 16384 size 4096 starting block 46179204)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 40 00 00 08 00
> end_request: I/O error, dev sda, sector 382721088
> Buffer I/O error on device dm-3, logical block 46179208
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 32768 size 4096 starting block 46179208)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 60 00 00 08 00
> end_request: I/O error, dev sda, sector 382721120
> Buffer I/O error on device dm-3, logical block 46179212
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 49152 size 4096 starting block 46179212)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 78 00 00 08 00
> end_request: I/O error, dev sda, sector 382721144
> Buffer I/O error on device dm-3, logical block 46179215
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 61440 size 4096 starting block 46179215)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 90 00 00 10 00
> end_request: I/O error, dev sda, sector 382721168
> Buffer I/O error on device dm-3, logical block 46179218
> Buffer I/O error on device dm-3, logical block 46179219
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 73728 size 8192 starting block 46179218)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc c8 00 00 08 00
> end_request: I/O error, dev sda, sector 382721224
> Buffer I/O error on device dm-3, logical block 46179225
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 102400 size 4096 starting block 46179225)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc f8 00 00 08 00
> end_request: I/O error, dev sda, sector 382721272
> Buffer I/O error on device dm-3, logical block 46179231
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 126976 size 4096 starting block 46179231)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 10 af 80 00 00 10 00
> end_request: I/O error, dev sda, sector 353415040
> Buffer I/O error on device dm-3, logical block 42515952
> Buffer I/O error on device dm-3, logical block 42515953
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10628554 (offset 0 size 8192 starting block 42515952)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 fe e0 00 00 20 00
> end_request: I/O error, dev sda, sector 366214880
> Buffer I/O error on device dm-3, logical block 44115932
> Buffer I/O error on device dm-3, logical block 44115933
> Buffer I/O error on device dm-3, logical block 44115934
> Buffer I/O error on device dm-3, logical block 44115935
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031999 (offset 0 size 16384 starting block 44115932)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 91 6e f0 00 00 10 00
> end_request: I/O error, dev sda, sector 361852656
> Buffer I/O error on device dm-3, logical block 43570654
> Buffer I/O error on device dm-3, logical block 43570655
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10896269 (offset 0 size 8192 starting block 43570654)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 10 b2 20 00 00 10 00
> end_request: I/O error, dev sda, sector 353415712
> Buffer I/O error on device dm-3, logical block 42516036
> Buffer I/O error on device dm-3, logical block 42516037
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10629161 (offset 0 size 8192 starting block 42516036)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 92 00 60 00 00 20 00
> end_request: I/O error, dev sda, sector 361889888
> Buffer I/O error on device dm-3, logical block 43575308
> Buffer I/O error on device dm-3, logical block 43575309
> Buffer I/O error on device dm-3, logical block 43575310
> Buffer I/O error on device dm-3, logical block 43575311
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10896254 (offset 0 size 16384 starting block 43575308)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] INFO: lockdep is turned off.
> Pid: 34, comm: khugepaged Tainted: G D 3.0.0-rc2+ #72
> Call Trace:
> [<ffffffff8104d276>] __might_sleep+0x112/0x117
> [<ffffffff814cca48>] mutex_lock_nested+0x25/0x40
> [<ffffffff810ee9b7>] perf_event_exit_task+0x2d/0x1cd
> [<ffffffff8105ea1f>] do_exit+0x353/0x7fa
> [<ffffffff8105c569>] ? kmsg_dump+0x89/0x13c
> [<ffffffff814cf78d>] oops_end+0xbc/0xc5
> [<ffffffff8100d087>] die+0x5a/0x63
> [<ffffffff814cf18f>] do_general_protection+0x128/0x131
> [<ffffffff814cead5>] general_protection+0x25/0x30
> [<ffffffff81138590>] ? task_subsys_state.constprop.30+0x16/0x78
> [<ffffffff81139792>] mem_cgroup_from_task+0x15/0x17
> [<ffffffff8113a75a>] __mem_cgroup_try_charge+0x148/0x4b4
> [<ffffffff810493f3>] ? need_resched+0x23/0x2d
> [<ffffffff814cbf43>] ? preempt_schedule+0x46/0x4f
> [<ffffffff8113afe8>] mem_cgroup_charge_common+0x9a/0xce
> [<ffffffff8113b6d1>] mem_cgroup_newpage_charge+0x5d/0x5f
> [<ffffffff81134024>] khugepaged+0x5da/0xfaf
> [<ffffffff81078ea0>] ? __init_waitqueue_head+0x4b/0x4b
> [<ffffffff81133a4a>] ? add_mm_counter.constprop.5+0x13/0x13
> [<ffffffff81078625>] kthread+0xa8/0xb0
> [<ffffffff814d13e8>] ? sub_preempt_count+0xa1/0xb4
> [<ffffffff814d5664>] kernel_thread_helper+0x4/0x10
> [<ffffffff814ce858>] ? retint_restore_args+0x13/0x13
> [<ffffffff8107857d>] ? __init_kthread_worker+0x5a/0x5a
> [<ffffffff814d5660>] ? gs_change+0x13/0x13
> Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 94 1b 40 00 00 18 00
> end_request: I/O error, dev sda, sector 362027840
> Buffer I/O error on device dm-3, logical block 43592552
> Buffer I/O error on device dm-3, logical block 43592553
> Buffer I/O error on device dm-3, logical block 43592554
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10896287 (offset 0 size 12288 starting block 43592552)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 93 06 b0 00 00 10 00
> end_request: I/O error, dev sda, sector 361957040
> Buffer I/O error on device dm-3, logical block 43583702
> Buffer I/O error on device dm-3, logical block 43583703
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10896287 (offset 12288 size 8192 starting block 43583702)
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 ff 40 00 00 20 00
> end_request: I/O error, dev sda, sector 366214976
> Buffer I/O error on device dm-3, logical block 44115944
> Buffer I/O error on device dm-3, logical block 44115945
> Buffer I/O error on device dm-3, logical block 44115946
> Buffer I/O error on device dm-3, logical block 44115947
> EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11032040 (offset 0 size 16384 starting block 44115944)
> JBD2: Detected IO errors while flushing file data on dm-3-8

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

2011-06-11 15:47:12

by Hugh Dickins

Subject: Re: 3.0rc2 oops in mem_cgroup_from_task

The discussion in https://lkml.org/lkml/2011/6/9/538
has continued in https://lkml.org/lkml/2011/6/10/2

On Sat, 11 Jun 2011, Michal Hocko wrote:

> [Let's add some more people to CC]
>
> On Thu 09-06-11 17:29:57, Dave Jones wrote:
> >
> > I just got the oops below while building a kernel.
> > When it oopsed, the kernel modesetting oops-on-framebuffer thing
> > happened, and the box wedged solid for about a minute.
> > Then it woke up, and I was able to ctrl-f1 back to my X session
> > to capture the dmesg. The stuff that follows the oops looks
> > quite disturbing, but I think it's from hanging with interrupts off
> > for a minute.
> >
> > Dave
> >
> > general protection fault: 0000 [#1] PREEMPT SMP
> > CPU 1
> > Modules linked in: nfs fscache fuse nfsd lockd nfs_acl auth_rpcgss sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables btusb bluetooth arc4 zaurus dell_wmi sparse_keymap snd_usb_audio cdc_ether usbnet cdc_wdm mii snd_usbmidi_lib snd_rawmidi snd_hda_codec_hdmi snd_hda_codec_idt cdc_acm dell_laptop uvcvideo snd_hda_intel snd_hda_codec dcdbas snd_hwdep videodev microcode v4l2_compat_ioctl32 snd_seq snd_seq_device snd_pcm joydev iTCO_wdt i2c_i801 iTCO_vendor_support iwlagn pcspkr snd_timer mac80211 snd soundcore snd_page_alloc cfg80211 tg3 rfkill wmi virtio_net kvm_intel kvm ipv6 xts gf128mul dm_crypt i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
> >
> > Pid: 34, comm: khugepaged Not tainted 3.0.0-rc2+ #72 Dell Inc. Adamo 13 /0N70T0
> > RIP: 0010:[<ffffffff81138590>] [<ffffffff81138590>] task_subsys_state.constprop.30+0x16/0x78
> > RSP: 0018:ffff880135c97bd0 EFLAGS: 00010286
> > RAX: 6b6b6b6b6b6b6b6b RBX: ffff880013c48000 RCX: 0000000000000000
> > RDX: 0000000000000246 RSI: ffffffff81a26610 RDI: ffff880013c48000
> > RBP: ffff880135c97be0 R08: 0000000000000001 R09: 0000000000000000
> > R10: ffff880135c97cf0 R11: 0000000005491edb R12: ffff880013c48000
> > R13: 0000000000000200 R14: ffff880135c97ce8 R15: 0000000000000200
> > FS: 0000000000000000(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > CR2: 000000338f21400a CR3: 000000004f64c000 CR4: 00000000000406e0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > Process khugepaged (pid: 34, threadinfo ffff880135c96000, task ffff880135c98000)
> > Stack:
> > ffff880013c48000 0000000000000200 ffff880135c97bf0 ffffffff81139792
> > ffff880135c97cc0 ffffffff8113a75a ffff880135c96000 ffff88013fc00000
> > ffff880135c98000 ffff880135c97c78 ffff880135c98000 00000000001d2c40
> > Call Trace:
> > [<ffffffff81139792>] mem_cgroup_from_task+0x15/0x17
> > [<ffffffff8113a75a>] __mem_cgroup_try_charge+0x148/0x4b4
> > [<ffffffff810493f3>] ? need_resched+0x23/0x2d
> > [<ffffffff814cbf43>] ? preempt_schedule+0x46/0x4f
> > [<ffffffff8113afe8>] mem_cgroup_charge_common+0x9a/0xce
> > [<ffffffff8113b6d1>] mem_cgroup_newpage_charge+0x5d/0x5f
> > [<ffffffff81134024>] khugepaged+0x5da/0xfaf
> > [<ffffffff81078ea0>] ? __init_waitqueue_head+0x4b/0x4b
> > [<ffffffff81133a4a>] ? add_mm_counter.constprop.5+0x13/0x13
> > [<ffffffff81078625>] kthread+0xa8/0xb0
> > [<ffffffff814d13e8>] ? sub_preempt_count+0xa1/0xb4
> > [<ffffffff814d5664>] kernel_thread_helper+0x4/0x10
> > [<ffffffff814ce858>] ? retint_restore_args+0x13/0x13
> > [<ffffffff8107857d>] ? __init_kthread_worker+0x5a/0x5a
> > [<ffffffff814d5660>] ? gs_change+0x13/0x13
> > Code: ff 84 c0 74 b5 eb 03 45 31 e4 5a 5b 4c 89 e0 41 5c 41 5d 5d c3 55 48 89 e5 41 54 53 66 66 66 66 90 48 8b 87 90 12 00 00 49 89 fc
> > 8b 58 50 e8 b4 d8 f3 ff 85 c0 74 4d 80 3d 8c e5 6a 01 00 75
> > RIP [<ffffffff81138590>] task_subsys_state.constprop.30+0x16/0x78
> > RSP <ffff880135c97bd0>
> > psmouse.c: TouchPad at isa0060/serio1/input0 lost synchronization, throwing 1 bytes away.
> > iwlagn 0000:04:00.0: Queue 4 stuck for 10000 ms.
> > iwlagn 0000:04:00.0: On demand firmware reload
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 db c6 48 00 00 18 00
> > end_request: I/O error, dev sda, sector 366724680
> > Buffer I/O error on device dm-3, logical block 44179657
> > Buffer I/O error on device dm-3, logical block 44179658
> > Buffer I/O error on device dm-3, logical block 44179659
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031834 (offset 0 size 12288 starting block 44179657)
> > ieee80211 phy0: Hardware restart was requested
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 db c7 80 00 00 18 00
> > end_request: I/O error, dev sda, sector 366724992
> > Buffer I/O error on device dm-3, logical block 44179696
> > Buffer I/O error on device dm-3, logical block 44179697
> > Buffer I/O error on device dm-3, logical block 44179698
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031834 (offset 12288 size 12288 starting block 44179696)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 db c7 b8 00 00 18 00
> > end_request: I/O error, dev sda, sector 366725048
> > Buffer I/O error on device dm-3, logical block 44179703
> > Buffer I/O error on device dm-3, logical block 44179704
> > Buffer I/O error on device dm-3, logical block 44179705
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031834 (offset 24576 size 12288 starting block 44179703)
> > ---[ end trace 95e652595eaf01aa ]---
> > psmouse.c: resync failed, issuing reconnect request
> > BUG: sleeping function called from invalid context at kernel/mutex.c:271
> > in_atomic(): 0, irqs_disabled(): 0, pid: 34, name: khugepaged
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 17 ee a0 00 00 50 00
> > end_request: I/O error, dev sda, sector 353889952
> > Buffer I/O error on device dm-3, logical block 42575316
> > Buffer I/O error on device dm-3, logical block 42575317
> > Buffer I/O error on device dm-3, logical block 42575318
> > Buffer I/O error on device dm-3, logical block 42575319
> > Buffer I/O error on device dm-3, logical block 42575320
> > Buffer I/O error on device dm-3, logical block 42575321
> > Buffer I/O error on device dm-3, logical block 42575322
> > Buffer I/O error on device dm-3, logical block 42575323
> > Buffer I/O error on device dm-3, logical block 42575324
> > Buffer I/O error on device dm-3, logical block 42575325
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10629148 (offset 0 size 40960 starting block 42575316)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 f8 40 00 00 20 00
> > end_request: I/O error, dev sda, sector 366213184
> > Buffer I/O error on device dm-3, logical block 44115720
> > Buffer I/O error on device dm-3, logical block 44115721
> > Buffer I/O error on device dm-3, logical block 44115722
> > Buffer I/O error on device dm-3, logical block 44115723
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031693 (offset 0 size 16384 starting block 44115720)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 cf 69 80 00 00 10 00
> > end_request: I/O error, dev sda, sector 365914496
> > Buffer I/O error on device dm-3, logical block 44078384
> > Buffer I/O error on device dm-3, logical block 44078385
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031978 (offset 0 size 8192 starting block 44078384)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 12 d5 00 00 00 18 00
> > end_request: I/O error, dev sda, sector 353555712
> > Buffer I/O error on device dm-3, logical block 42533536
> > Buffer I/O error on device dm-3, logical block 42533537
> > Buffer I/O error on device dm-3, logical block 42533538
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10628547 (offset 0 size 12288 starting block 42533536)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 fb a0 00 00 20 00
> > end_request: I/O error, dev sda, sector 366214048
> > Buffer I/O error on device dm-3, logical block 44115828
> > Buffer I/O error on device dm-3, logical block 44115829
> > Buffer I/O error on device dm-3, logical block 44115830
> > Buffer I/O error on device dm-3, logical block 44115831
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031956 (offset 0 size 16384 starting block 44115828)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 dd dd 80 00 00 28 00
> > end_request: I/O error, dev sda, sector 366861696
> > Buffer I/O error on device dm-3, logical block 44196784
> > Buffer I/O error on device dm-3, logical block 44196785
> > Buffer I/O error on device dm-3, logical block 44196786
> > Buffer I/O error on device dm-3, logical block 44196787
> > Buffer I/O error on device dm-3, logical block 44196788
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031940 (offset 0 size 20480 starting block 44196784)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 93 f1 78 00 00 18 00
> > end_request: I/O error, dev sda, sector 362017144
> > Buffer I/O error on device dm-3, logical block 43591215
> > Buffer I/O error on device dm-3, logical block 43591216
> > Buffer I/O error on device dm-3, logical block 43591217
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10887935 (offset 0 size 12288 starting block 43591215)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 92 dc 60 00 00 10 00
> > end_request: I/O error, dev sda, sector 361946208
> > Buffer I/O error on device dm-3, logical block 43582348
> > Buffer I/O error on device dm-3, logical block 43582349
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10887935 (offset 12288 size 8192 starting block 43582348)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 dd dd a8 00 00 28 00
> > end_request: I/O error, dev sda, sector 366861736
> > Buffer I/O error on device dm-3, logical block 44196789
> > Buffer I/O error on device dm-3, logical block 44196790
> > Buffer I/O error on device dm-3, logical block 44196791
> > Buffer I/O error on device dm-3, logical block 44196792
> > Buffer I/O error on device dm-3, logical block 44196793
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031841 (offset 0 size 20480 starting block 44196789)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 dd dd d0 00 00 48 00
> > end_request: I/O error, dev sda, sector 366861776
> > Buffer I/O error on device dm-3, logical block 44196794
> > Buffer I/O error on device dm-3, logical block 44196795
> > Buffer I/O error on device dm-3, logical block 44196796
> > Buffer I/O error on device dm-3, logical block 44196797
> > Buffer I/O error on device dm-3, logical block 44196798
> > Buffer I/O error on device dm-3, logical block 44196799
> > Buffer I/O error on device dm-3, logical block 44196800
> > Buffer I/O error on device dm-3, logical block 44196801
> > Buffer I/O error on device dm-3, logical block 44196802
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031992 (offset 0 size 36864 starting block 44196794)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 17 ee f0 00 00 48 00
> > end_request: I/O error, dev sda, sector 353890032
> > Buffer I/O error on device dm-3, logical block 42575326
> > Buffer I/O error on device dm-3, logical block 42575327
> > Buffer I/O error on device dm-3, logical block 42575328
> > Buffer I/O error on device dm-3, logical block 42575329
> > Buffer I/O error on device dm-3, logical block 42575330
> > Buffer I/O error on device dm-3, logical block 42575331
> > Buffer I/O error on device dm-3, logical block 42575332
> > Buffer I/O error on device dm-3, logical block 42575333
> > Buffer I/O error on device dm-3, logical block 42575334
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10629931 (offset 0 size 36864 starting block 42575326)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 fd c0 00 00 20 00
> > end_request: I/O error, dev sda, sector 366214592
> > Buffer I/O error on device dm-3, logical block 44115896
> > Buffer I/O error on device dm-3, logical block 44115897
> > Buffer I/O error on device dm-3, logical block 44115898
> > Buffer I/O error on device dm-3, logical block 44115899
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11011057 (offset 0 size 16384 starting block 44115896)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 cf 24 50 00 00 08 00
> > end_request: I/O error, dev sda, sector 365896784
> > Buffer I/O error on device dm-3, logical block 44076170
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11032013 (offset 0 size 4096 starting block 44076170)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 20 00 00 08 00
> > end_request: I/O error, dev sda, sector 382721056
> > Buffer I/O error on device dm-3, logical block 46179204
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 16384 size 4096 starting block 46179204)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 40 00 00 08 00
> > end_request: I/O error, dev sda, sector 382721088
> > Buffer I/O error on device dm-3, logical block 46179208
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 32768 size 4096 starting block 46179208)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 60 00 00 08 00
> > end_request: I/O error, dev sda, sector 382721120
> > Buffer I/O error on device dm-3, logical block 46179212
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 49152 size 4096 starting block 46179212)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 78 00 00 08 00
> > end_request: I/O error, dev sda, sector 382721144
> > Buffer I/O error on device dm-3, logical block 46179215
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 61440 size 4096 starting block 46179215)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc 90 00 00 10 00
> > end_request: I/O error, dev sda, sector 382721168
> > Buffer I/O error on device dm-3, logical block 46179218
> > Buffer I/O error on device dm-3, logical block 46179219
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 73728 size 8192 starting block 46179218)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc c8 00 00 08 00
> > end_request: I/O error, dev sda, sector 382721224
> > Buffer I/O error on device dm-3, logical block 46179225
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 102400 size 4096 starting block 46179225)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 16 cf dc f8 00 00 08 00
> > end_request: I/O error, dev sda, sector 382721272
> > Buffer I/O error on device dm-3, logical block 46179231
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11535009 (offset 126976 size 4096 starting block 46179231)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 10 af 80 00 00 10 00
> > end_request: I/O error, dev sda, sector 353415040
> > Buffer I/O error on device dm-3, logical block 42515952
> > Buffer I/O error on device dm-3, logical block 42515953
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10628554 (offset 0 size 8192 starting block 42515952)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 fe e0 00 00 20 00
> > end_request: I/O error, dev sda, sector 366214880
> > Buffer I/O error on device dm-3, logical block 44115932
> > Buffer I/O error on device dm-3, logical block 44115933
> > Buffer I/O error on device dm-3, logical block 44115934
> > Buffer I/O error on device dm-3, logical block 44115935
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11031999 (offset 0 size 16384 starting block 44115932)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 91 6e f0 00 00 10 00
> > end_request: I/O error, dev sda, sector 361852656
> > Buffer I/O error on device dm-3, logical block 43570654
> > Buffer I/O error on device dm-3, logical block 43570655
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10896269 (offset 0 size 8192 starting block 43570654)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 10 b2 20 00 00 10 00
> > end_request: I/O error, dev sda, sector 353415712
> > Buffer I/O error on device dm-3, logical block 42516036
> > Buffer I/O error on device dm-3, logical block 42516037
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10629161 (offset 0 size 8192 starting block 42516036)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 92 00 60 00 00 20 00
> > end_request: I/O error, dev sda, sector 361889888
> > Buffer I/O error on device dm-3, logical block 43575308
> > Buffer I/O error on device dm-3, logical block 43575309
> > Buffer I/O error on device dm-3, logical block 43575310
> > Buffer I/O error on device dm-3, logical block 43575311
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10896254 (offset 0 size 16384 starting block 43575308)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] INFO: lockdep is turned off.
> > Pid: 34, comm: khugepaged Tainted: G D 3.0.0-rc2+ #72
> > Call Trace:
> > [<ffffffff8104d276>] __might_sleep+0x112/0x117
> > [<ffffffff814cca48>] mutex_lock_nested+0x25/0x40
> > [<ffffffff810ee9b7>] perf_event_exit_task+0x2d/0x1cd
> > [<ffffffff8105ea1f>] do_exit+0x353/0x7fa
> > [<ffffffff8105c569>] ? kmsg_dump+0x89/0x13c
> > [<ffffffff814cf78d>] oops_end+0xbc/0xc5
> > [<ffffffff8100d087>] die+0x5a/0x63
> > [<ffffffff814cf18f>] do_general_protection+0x128/0x131
> > [<ffffffff814cead5>] general_protection+0x25/0x30
> > [<ffffffff81138590>] ? task_subsys_state.constprop.30+0x16/0x78
> > [<ffffffff81139792>] mem_cgroup_from_task+0x15/0x17
> > [<ffffffff8113a75a>] __mem_cgroup_try_charge+0x148/0x4b4
> > [<ffffffff810493f3>] ? need_resched+0x23/0x2d
> > [<ffffffff814cbf43>] ? preempt_schedule+0x46/0x4f
> > [<ffffffff8113afe8>] mem_cgroup_charge_common+0x9a/0xce
> > [<ffffffff8113b6d1>] mem_cgroup_newpage_charge+0x5d/0x5f
> > [<ffffffff81134024>] khugepaged+0x5da/0xfaf
> > [<ffffffff81078ea0>] ? __init_waitqueue_head+0x4b/0x4b
> > [<ffffffff81133a4a>] ? add_mm_counter.constprop.5+0x13/0x13
> > [<ffffffff81078625>] kthread+0xa8/0xb0
> > [<ffffffff814d13e8>] ? sub_preempt_count+0xa1/0xb4
> > [<ffffffff814d5664>] kernel_thread_helper+0x4/0x10
> > [<ffffffff814ce858>] ? retint_restore_args+0x13/0x13
> > [<ffffffff8107857d>] ? __init_kthread_worker+0x5a/0x5a
> > [<ffffffff814d5660>] ? gs_change+0x13/0x13
> > Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 94 1b 40 00 00 18 00
> > end_request: I/O error, dev sda, sector 362027840
> > Buffer I/O error on device dm-3, logical block 43592552
> > Buffer I/O error on device dm-3, logical block 43592553
> > Buffer I/O error on device dm-3, logical block 43592554
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10896287 (offset 0 size 12288 starting block 43592552)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 93 06 b0 00 00 10 00
> > end_request: I/O error, dev sda, sector 361957040
> > Buffer I/O error on device dm-3, logical block 43583702
> > Buffer I/O error on device dm-3, logical block 43583703
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 10896287 (offset 12288 size 8192 starting block 43583702)
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> > sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 15 d3 ff 40 00 00 20 00
> > end_request: I/O error, dev sda, sector 366214976
> > Buffer I/O error on device dm-3, logical block 44115944
> > Buffer I/O error on device dm-3, logical block 44115945
> > Buffer I/O error on device dm-3, logical block 44115946
> > Buffer I/O error on device dm-3, logical block 44115947
> > EXT4-fs warning (device dm-3): ext4_end_bio:242: I/O error writing to inode 11032040 (offset 0 size 16384 starting block 44115944)
> > JBD2: Detected IO errors while flushing file data on dm-3-8
> > --

2011-06-11 16:04:24

by Hugh Dickins

Subject: Re: [PATCH] [BUGFIX] update mm->owner even if no next owner.

On Sat, 11 Jun 2011, Hiroyuki Kamezawa wrote:
> 2011/6/11 Hugh Dickins <[email protected]>:
> > On Fri, 10 Jun 2011, KAMEZAWA Hiroyuki wrote:
> >>
> >> I think this can be a fix.
> >
> > Sorry, I think not: I've not digested your rationale,
> > but three things stand out:
> >
> > 1. Why has this only just started happening? I may not have run that
> >    test on 3.0-rc1, but surely I ran it for hours with 2.6.39;
> >    maybe not with khugepaged, but certainly with ksmd.
> >
> Not sure. I pointed this just by review because I found "charge" in
> khugepaged is out of mmap_sem now.

Right, Andrea's patch cited below.

>
> > 2. Your hunk below:
> >> -       if (!mm_need_new_owner(mm, p))
> >> +       if (!mm_need_new_owner(mm, p)) {
> >> +               rcu_assign_pointer(mm->owner, NULL);
> >    is now setting mm->owner to NULL at times when we were sure it did not
> >    need updating before (task is not the owner): you're damaging mm->owner.
> >
> Ah, yes. It's my mistake.
>
> > 3. There's a patch from Andrea in 3.0-rc1 which looks very likely to be
> >    relevant, 692e0b35427a "mm: thp: optimize memcg charge in khugepaged".
> >    I'll try reproducing without that tonight (I crashed in 20 minutes
> >    this morning, so it's not too hard).

I had another go at reproducing it, 2 hours that time, then a try with
692e0b35427a reverted: it ran overnight for 9 hours when I stopped it.

Andrea, please would you ask Linus to revert that commit before -rc3?
Or is there something else you'd like us to try instead? I admit that
I've not actually taken the time to think through exactly how it goes
wrong, but it does look dangerous.

The way I reproduce it is with my tmpfs kbuilds swapping load,
in this case restricting mem by memcg, and (perhaps the important
detail, not certain) doing concurrent swapoff/swapon repeatedly -
swapoff takes another mm_users reference to the mm it's working on,
which can cause surprises.

Hugh

2011-06-11 17:06:44

by Andrea Arcangeli

Subject: Re: [PATCH] [BUGFIX] update mm->owner even if no next owner.

On Sat, Jun 11, 2011 at 09:04:14AM -0700, Hugh Dickins wrote:
> I had another go at reproducing it, 2 hours that time, then a try with
> 692e0b35427a reverted: it ran overnight for 9 hours when I stopped it.
>
> Andrea, please would you ask Linus to revert that commit before -rc3?
> Or is there something else you'd like us to try instead? I admit that
> I've not actually taken the time to think through exactly how it goes
> wrong, but it does look dangerous.

Here I was asked if the mem_cgroup_newpage_charge need the mmap_sem at
all. And if not why not to release the mmap_sem early.

https://lkml.org/lkml/2011/3/14/276

So I didn't see why mmap_sem was needed, I also asked confirmation and
who answered agreed it was safe without mmap_sem even if it's the only
place doing that. Maybe that assumption was wrong and we need
mmap_sem after all if this commit is causing problems.

Or did you find something wrong in the actual patch?

Do I understand right that the bug is just that we must run
alloc_hugepage_vma+mem_cgroup_newpage_charge within the same critical
section protected by the mmap_sem read mode? Do we know why?

> The way I reproduce it is with my tmpfs kbuilds swapping load,
> in this case restricting mem by memcg, and (perhaps the important
> detail, not certain) doing concurrent swapoff/swapon repeatedly -
> swapoff takes another mm_users reference to the mm it's working on,
> which can cause surprises.

Ok.

2011-06-11 17:51:51

by Johannes Weiner

Subject: Re: [PATCH] [BUGFIX] update mm->owner even if no next owner.

On Sat, Jun 11, 2011 at 01:54:42AM +0200, Johannes Weiner wrote:
> On Fri, Jun 10, 2011 at 02:49:35PM -0700, Hugh Dickins wrote:
> > On Fri, 10 Jun 2011, KAMEZAWA Hiroyuki wrote:
> > >
> > > I think this can be a fix.
> >
> > Sorry, I think not: I've not digested your rationale,
> > but three things stand out:
> >
> > 1. Why has this only just started happening? I may not have run that
> > test on 3.0-rc1, but surely I ran it for hours with 2.6.39;
> > maybe not with khugepaged, but certainly with ksmd.
> >
> > 2. Your hunk below:
> > > - if (!mm_need_new_owner(mm, p))
> > > + if (!mm_need_new_owner(mm, p)) {
> > > + rcu_assign_pointer(mm->owner, NULL);
> > is now setting mm->owner to NULL at times when we were sure it did not
> > need updating before (task is not the owner): you're damaging mm->owner.

This is a problem with the patch, but I think Kame's analysis and
approach to fix it are still correct.

mm_update_next_owner() does not set mm->owner to NULL when the last
possible owner goes away, but leaves it pointing to a possibly stale
task struct.

No one cared before khugepaged, and up to Andrea's patch khugepaged
prevented the last possible owner from exiting until the call into the
memory controller had finished.

Here is a revised version of Kame's fix.

---
From: KAMEZAWA Hiroyuki <[email protected]>
Subject: [PATCH] [BUGFIX] mm: clear mm->owner when last possible owner leaves

The following crash was reported:

> Call Trace:
> [<ffffffff81139792>] mem_cgroup_from_task+0x15/0x17
> [<ffffffff8113a75a>] __mem_cgroup_try_charge+0x148/0x4b4
> [<ffffffff810493f3>] ? need_resched+0x23/0x2d
> [<ffffffff814cbf43>] ? preempt_schedule+0x46/0x4f
> [<ffffffff8113afe8>] mem_cgroup_charge_common+0x9a/0xce
> [<ffffffff8113b6d1>] mem_cgroup_newpage_charge+0x5d/0x5f
> [<ffffffff81134024>] khugepaged+0x5da/0xfaf
> [<ffffffff81078ea0>] ? __init_waitqueue_head+0x4b/0x4b
> [<ffffffff81133a4a>] ? add_mm_counter.constprop.5+0x13/0x13
> [<ffffffff81078625>] kthread+0xa8/0xb0
> [<ffffffff814d13e8>] ? sub_preempt_count+0xa1/0xb4
> [<ffffffff814d5664>] kernel_thread_helper+0x4/0x10
> [<ffffffff814ce858>] ? retint_restore_args+0x13/0x13
> [<ffffffff8107857d>] ? __init_kthread_worker+0x5a/0x5a

What happens is that khugepaged tries to charge a huge page against an
mm whose last possible owner has already exited, and the memory
controller crashes when the stale mm->owner is used to look up the
cgroup to charge.

mm->owner has never been set to NULL with the last owner going away,
but nobody cared until khugepaged came along.

Even then it wasn't a problem because the final mmput() on an mm was
forced to acquire and release mmap_sem in write-mode, preventing an
exiting owner to go away while the mmap_sem was held, and until
"692e0b3 mm: thp: optimize memcg charge in khugepaged", the memory
cgroup charge was protected by mmap_sem in read-mode.

Instead of going back to relying on the mmap_sem to enforce lifetime
of a task, this patch ensures that mm->owner is properly set to NULL
when the last possible owner is exiting, which the memory controller
can handle just fine.

Reported-by: Hugh Dickins <[email protected]>
Reported-by: Dave Jones <[email protected]>
Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
Signed-off-by: Johannes Weiner <[email protected]>
---

diff --git a/kernel/exit.c b/kernel/exit.c
index 20a4064..ef8ff79 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -563,27 +563,27 @@ void exit_files(struct task_struct *tsk)
 /*
  * Task p is exiting and it owned mm, lets find a new owner for it
  */
-static inline int
-mm_need_new_owner(struct mm_struct *mm, struct task_struct *p)
-{
-	/*
-	 * If there are other users of the mm and the owner (us) is exiting
-	 * we need to find a new owner to take on the responsibility.
-	 */
-	if (atomic_read(&mm->mm_users) <= 1)
-		return 0;
-	if (mm->owner != p)
-		return 0;
-	return 1;
-}
-
 void mm_update_next_owner(struct mm_struct *mm)
 {
 	struct task_struct *c, *g, *p = current;

 retry:
-	if (!mm_need_new_owner(mm, p))
+	/*
+	 * If the exiting or execing task is not the owner, it's
+	 * someone else's problem.
+	 */
+	if (mm->owner != p)
+		return;
+
+	/*
+	 * The current owner is exiting/execing and there are no other
+	 * candidates. Do not leave the mm pointing to a possibly
+	 * freed task structure.
+	 */
+	if (atomic_read(&mm->mm_users <= 1)) {
+		mm->owner = NULL;
 		return;
+	}

 	read_lock(&tasklist_lock);
 	/*

2011-06-11 18:04:28

by Johannes Weiner

Subject: Re: [PATCH] [BUGFIX] update mm->owner even if no next owner.

On Sat, Jun 11, 2011 at 06:39:43PM +0200, Andrea Arcangeli wrote:
> On Sat, Jun 11, 2011 at 09:04:14AM -0700, Hugh Dickins wrote:
> > I had another go at reproducing it, 2 hours that time, then a try with
> > 692e0b35427a reverted: it ran overnight for 9 hours when I stopped it.
> >
> > Andrea, please would you ask Linus to revert that commit before -rc3?
> > Or is there something else you'd like us to try instead? I admit that
> > I've not actually taken the time to think through exactly how it goes
> > wrong, but it does look dangerous.
>
> Here I was asked if the mem_cgroup_newpage_charge need the mmap_sem at
> all. And if not why not to release the mmap_sem early.
>
> https://lkml.org/lkml/2011/3/14/276
>
> So I didn't see why mmap_sem was needed, I also asked confirmation and
> who answered agreed it was safe without mmap_sem even if it's the only
> place doing that. Maybe that assumption was wrong and we need
> mmap_sem after all if this commit is causing problems.
>
> Or did you find something wrong in the actual patch?
>
> Do I understand right that the bug is just that we must run
> alloc_hugepage_vma+mem_cgroup_newpage_charge within the same critical
> section protected by the mmap_sem read mode? Do we know why?

The problem is that mm->owner points to a stale task structure if the
last possible owner is exiting. The mmap_sem just prevented the task
from actually exiting through write-acquiring the mmap_sem in
khugepaged_exit().

I think enforcing lifetime of an object through locks is not the
nicest thing to do, so I stand by what I wrote in the mail you linked
to above :) and agree with Kame that mm->owner should just not point
to a stale task struct. The memcg code can handle it going NULL.
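
The point about memcg coping with a NULL owner can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the struct names, memcg_from_task_model() and charge_target() are hypothetical stand-ins rather than kernel code, and the fallback to a root group is an assumption of the model, not a description of what __mem_cgroup_try_charge() does in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct cgroup_model { const char *name; };
struct task_model { struct cgroup_model *memcg; };
struct mm_model { struct task_model *owner; };

/* Model of the lookup: a NULL owner yields no cgroup instead of a crash. */
static struct cgroup_model *memcg_from_task_model(const struct task_model *p)
{
	if (p == NULL)
		return NULL;
	return p->memcg;
}

/* Model of the charge path (assumption: no owner means "use the root group"). */
static struct cgroup_model *charge_target(const struct mm_model *mm,
					   struct cgroup_model *root)
{
	struct cgroup_model *memcg = memcg_from_task_model(mm->owner);

	return memcg ? memcg : root;
}

/* An owned mm charges its owner's group; an orphaned mm takes the fallback. */
static void check_null_owner(void)
{
	struct cgroup_model root = { "root" }, grp = { "grp" };
	struct task_model task = { &grp };
	struct mm_model owned = { &task }, orphan = { NULL };

	assert(charge_target(&owned, &root) == &grp);
	assert(charge_target(&orphan, &root) == &root);
}
```

A stale, non-NULL owner has no safe equivalent in this model: the lookup would read p->memcg from freed memory, which matches the 0x6b poison pattern visible in RAX in the original oops.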

2011-06-11 18:44:28

by Andrea Arcangeli

Subject: Re: [PATCH] [BUGFIX] update mm->owner even if no next owner.

On Sat, Jun 11, 2011 at 07:51:36PM +0200, Johannes Weiner wrote:
> This is a problem with the patch, but I think Kame's analysis and
> approach to fix it are still correct.

I agree with Kame's analysis too. This explains why removing the
mmap_sem read mode introduced the problem, it was quite some
unexpected subtleness not apparent to the naked eye, as memcg didn't
explicitly rely on mmap_sem but it did implicitly during exit because
of the __khugepaged_exit waiting if we were collapsing an hugepage...

So the fix is safe because the task struct is freed with
delayed_put_task_struct and that won't run until we rcu_read_unlock
after mem_cgroup_from_task.

Reviewed-by: Andrea Arcangeli <[email protected]>

Thanks,
Andrea

2011-06-11 23:04:24

by Hiroyuki Kamezawa

Subject: Re: [PATCH] [BUGFIX] update mm->owner even if no next owner.

2011/6/12 Andrea Arcangeli <[email protected]>:
> On Sat, Jun 11, 2011 at 07:51:36PM +0200, Johannes Weiner wrote:
>> This is a problem with the patch, but I think Kame's analysis and
>> approach to fix it are still correct.
>
> I agree with Kame's analysis too. This explains why removing the
> mmap_sem read mode introduced the problem, it was quite some
> unexpected subtleness not apparent to the naked eye, as memcg didn't
> explicitly rely on mmap_sem but it did implicitly during exit because
> of the __khugepaged_exit waiting if we were collapsing an hugepage...
>
> So the fix is safe because the task struct is freed with
> delayed_put_task_struct and that won't run until we rcu_read_unlock
> after mem_cgroup_from_task.
>
> Reviewed-by: Andrea Arcangeli <[email protected]>
>

Thank you all.

Regards,
-Kame

2011-06-12 09:09:24

by Michal Hocko

Subject: Re: 3.0rc2 oops in mem_cgroup_from_task

On Sat 11-06-11 08:46:54, Hugh Dickins wrote:
> The discussion in https://lkml.org/lkml/2011/6/9/538
> has continued in https://lkml.org/lkml/2011/6/10/2

Thanks for the pointer, Hugh. I have missed that thread @kvack.
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

2011-06-13 01:42:21

by Hugh Dickins

Subject: Re: [PATCH] [BUGFIX] update mm->owner even if no next owner.

On Sat, 11 Jun 2011, Johannes Weiner wrote:
> On Sat, Jun 11, 2011 at 01:54:42AM +0200, Johannes Weiner wrote:
> > On Fri, Jun 10, 2011 at 02:49:35PM -0700, Hugh Dickins wrote:
> > > On Fri, 10 Jun 2011, KAMEZAWA Hiroyuki wrote:
> > > >
> > > > I think this can be a fix.
> > >
> > > Sorry, I think not: I've not digested your rationale,
> > > but three things stand out:
> > >
> > > 1. Why has this only just started happening? I may not have run that
> > > test on 3.0-rc1, but surely I ran it for hours with 2.6.39;
> > > maybe not with khugepaged, but certainly with ksmd.
> > >
> > > 2. Your hunk below:
> > > > - if (!mm_need_new_owner(mm, p))
> > > > + if (!mm_need_new_owner(mm, p)) {
> > > > + rcu_assign_pointer(mm->owner, NULL);
> > > is now setting mm->owner to NULL at times when we were sure it did not
> > > need updating before (task is not the owner): you're damaging mm->owner.
>
> This is a problem with the patch, but I think Kame's analysis and
> approach to fix it are still correct.

Yes, I was looking at his patch, when I should have spent more time
reading his comments: you're right, the analysis is fine, and I too
dislike stale pointers.

>
> mm_update_next_owner() does not set mm->owner to NULL when the last
> possible owner goes away, but leaves it pointing to a possibly stale
> task struct.
>
> No one cared before khugepaged, and up to Andrea's patch khugepaged
> prevented the last possible owner from exiting until the call into the
> memory controller had finished.
>
> Here is a revised version of Kame's fix.

It seems to be strangely difficult to get right!
I have no idea what your
if (atomic_read(&mm->mm_users <= 1)) {
actually ends up doing, I'm surprised it only gives compiler warnings
rather than an error.
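
A rough reading of that expression, as a userspace sketch rather than a statement about what any particular compiler emitted: `->` binds tighter than unary `&`, which binds tighter than `<=`, so the misplaced parenthesis compares the address of the counter with 1 and passes the result of that comparison to atomic_read(). The struct and function names below are hypothetical, and the uintptr_t cast is only there so the pointer-versus-integer comparison compiles cleanly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the one mm_struct field involved. */
struct mm_model {
	int mm_users;
};

/*
 * The mistyped argument `&mm->mm_users <= 1` parses as
 * `(&(mm->mm_users)) <= 1`: the address of the counter is compared
 * with 1. The cast only makes that comparison explicit.
 */
static int mistyped_argument(const struct mm_model *mm)
{
	return (uintptr_t)&mm->mm_users <= 1;
}

/* The intended test: the counter's value is compared with 1. */
static int intended_test(const struct mm_model *mm)
{
	return mm->mm_users <= 1;
}

/* The two expressions disagree for an mm with a single user. */
static void check_precedence(void)
{
	struct mm_model mm = { .mm_users = 1 };

	assert(mistyped_argument(&mm) == 0);	/* a real object's address is not <= 1 */
	assert(intended_test(&mm) == 1);
}
```

Since the comparison yields 0, atomic_read() would be handed 0 where it expects a pointer to the counter, which fits the warnings-without-error observation; what the generated code then does with it is not something this sketch can show.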

The version I've signed off and am actually testing is below;
but I've not had enough time to spare on the machine which reproduced
it before, and another I thought I'd delegate it to last night,
failed to reproduce without the patch. Try again tonight.

Thought I'd better respond despite inadequate testing, given the flaw
in the posted patch. Hope the one below is flawless.

Hugh

>
> ---

From: KAMEZAWA Hiroyuki <[email protected]>
Subject: [PATCH] [BUGFIX] mm: clear mm->owner when last possible owner leaves

The following crash was reported:

> Call Trace:
> [<ffffffff81139792>] mem_cgroup_from_task+0x15/0x17
> [<ffffffff8113a75a>] __mem_cgroup_try_charge+0x148/0x4b4
> [<ffffffff810493f3>] ? need_resched+0x23/0x2d
> [<ffffffff814cbf43>] ? preempt_schedule+0x46/0x4f
> [<ffffffff8113afe8>] mem_cgroup_charge_common+0x9a/0xce
> [<ffffffff8113b6d1>] mem_cgroup_newpage_charge+0x5d/0x5f
> [<ffffffff81134024>] khugepaged+0x5da/0xfaf
> [<ffffffff81078ea0>] ? __init_waitqueue_head+0x4b/0x4b
> [<ffffffff81133a4a>] ? add_mm_counter.constprop.5+0x13/0x13
> [<ffffffff81078625>] kthread+0xa8/0xb0
> [<ffffffff814d13e8>] ? sub_preempt_count+0xa1/0xb4
> [<ffffffff814d5664>] kernel_thread_helper+0x4/0x10
> [<ffffffff814ce858>] ? retint_restore_args+0x13/0x13
> [<ffffffff8107857d>] ? __init_kthread_worker+0x5a/0x5a

What happens is that khugepaged tries to charge a huge page against an
mm whose last possible owner has already exited, and the memory
controller crashes when the stale mm->owner is used to look up the
cgroup to charge.

mm->owner has never been set to NULL with the last owner going away,
but nobody cared until khugepaged came along.

Even then it wasn't a problem because the final mmput() on an mm was
forced to acquire and release mmap_sem in write-mode, preventing an
exiting owner to go away while the mmap_sem was held, and until
"692e0b3 mm: thp: optimize memcg charge in khugepaged", the memory
cgroup charge was protected by mmap_sem in read-mode.

Instead of going back to relying on the mmap_sem to enforce lifetime
of a task, this patch ensures that mm->owner is properly set to NULL
when the last possible owner is exiting, which the memory controller
can handle just fine.

Reported-by: Hugh Dickins <[email protected]>
Reported-by: Dave Jones <[email protected]>
Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
Signed-off-by: Johannes Weiner <[email protected]>
Reviewed-by: Andrea Arcangeli <[email protected]>
Signed-off-by: Hugh Dickins <[email protected]>
---

diff --git a/kernel/exit.c b/kernel/exit.c
index 20a4064..ef8ff79 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -563,27 +563,27 @@ void exit_files(struct task_struct *tsk)
 /*
  * Task p is exiting and it owned mm, lets find a new owner for it
  */
-static inline int
-mm_need_new_owner(struct mm_struct *mm, struct task_struct *p)
-{
-	/*
-	 * If there are other users of the mm and the owner (us) is exiting
-	 * we need to find a new owner to take on the responsibility.
-	 */
-	if (atomic_read(&mm->mm_users) <= 1)
-		return 0;
-	if (mm->owner != p)
-		return 0;
-	return 1;
-}
-
 void mm_update_next_owner(struct mm_struct *mm)
 {
 	struct task_struct *c, *g, *p = current;

 retry:
-	if (!mm_need_new_owner(mm, p))
+	/*
+	 * If the exiting or execing task is not the owner, it's
+	 * someone else's problem.
+	 */
+	if (mm->owner != p)
+		return;
+
+	/*
+	 * The current owner is exiting/execing and there are no other
+	 * candidates. Do not leave the mm pointing to a possibly
+	 * freed task structure.
+	 */
+	if (atomic_read(&mm->mm_users) <= 1) {
+		mm->owner = NULL;
 		return;
+	}

 	read_lock(&tasklist_lock);
 	/*

2011-06-13 02:01:19

by Kamezawa Hiroyuki

Subject: Re: [PATCH] [BUGFIX] update mm->owner even if no next owner.

On Sun, 12 Jun 2011 18:41:58 -0700 (PDT)
Hugh Dickins <[email protected]> wrote:

> On Sat, 11 Jun 2011, Johannes Weiner wrote:
> > On Sat, Jun 11, 2011 at 01:54:42AM +0200, Johannes Weiner wrote:
> > > On Fri, Jun 10, 2011 at 02:49:35PM -0700, Hugh Dickins wrote:
> > > > On Fri, 10 Jun 2011, KAMEZAWA Hiroyuki wrote:
> > > > >
> > > > > I think this can be a fix.
> > > >
> > > > Sorry, I think not: I've not digested your rationale,
> > > > but three things stand out:
> > > >
> > > > 1. Why has this only just started happening? I may not have run that
> > > > test on 3.0-rc1, but surely I ran it for hours with 2.6.39;
> > > > maybe not with khugepaged, but certainly with ksmd.
> > > >
> > > > 2. Your hunk below:
> > > > > - if (!mm_need_new_owner(mm, p))
> > > > > + if (!mm_need_new_owner(mm, p)) {
> > > > > + rcu_assign_pointer(mm->owner, NULL);
> > > > is now setting mm->owner to NULL at times when we were sure it did not
> > > > need updating before (task is not the owner): you're damaging mm->owner.
> >
> > This is a problem with the patch, but I think Kame's analysis and
> > approach to fix it are still correct.
>
> Yes, I was looking at his patch, when I should have spent more time
> reading his comments: you're right, the analysis is fine, and I too
> dislike stale pointers.
>
> >
> > mm_update_next_owner() does not set mm->owner to NULL when the last
> > possible owner goes away, but leaves it pointing to a possibly stale
> > task struct.
> >
> > > Noone cared before khugepaged, and up to Andrea's patch khugepaged
> > prevented the last possible owner from exiting until the call into the
> > memory controller had finished.
> >
> > Here is a revised version of Kame's fix.
>
> It seems to be strangely difficult to get right!
> I have no idea what your
> if (atomic_read(&mm->mm_users <= 1)) {
> actually ends up doing, I'm surprised it only gives compiler warnings
> rather than an error.
>
> The version I've signed off and am actually testing is below;
> but I've not had enough time to spare on the machine which reproduced
> it before, and another I thought I'd delegate it to last night,
> failed to reproduce without the patch. Try again tonight.
>
> Thought I'd better respond despite inadequate testing, given the flaw
> in the posted patch. Hope the one below is flawless.
>

Thank you, I'll do test, too.

-Kame

2011-06-13 14:03:32

by Hugh Dickins

Subject: Re: [PATCH] [BUGFIX] update mm->owner even if no next owner.

On Mon, 13 Jun 2011, KAMEZAWA Hiroyuki wrote:
> On Sun, 12 Jun 2011 18:41:58 -0700 (PDT)
> Hugh Dickins <[email protected]> wrote:
> >
> > The version I've signed off and am actually testing is below;
> > but I've not had enough time to spare on the machine which reproduced
> > it before, and another I thought I'd delegate it to last night,
> > failed to reproduce without the patch. Try again tonight.
> >
> > Thought I'd better respond despite inadequate testing, given the flaw
> > in the posted patch. Hope the one below is flawless.
> >
>
> Thank you, I'll do test, too.

I confirm it fixes the bug: ran ten hours last night, when a couple
of tries just before without the patch each failed in ten minutes.

(But the load was not testing whether it keeps ownership when it should,
I hope you know a quick check on that: our earlier fixes should fail that.)

Hugh
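
The ownership question raised above can at least be checked on paper with a userspace model of the early-exit decisions. This is a sketch, not a kernel test: the enum and the three functions are hypothetical reductions of the pre-patch code, the first hunk quoted in the thread (assumed to return right after clearing mm->owner), and the signed-off patch. They model only the checks before the tasklist search.

```c
#include <assert.h>
#include <stdbool.h>

/* What the early part of mm_update_next_owner() decides to do with mm->owner. */
enum owner_action { OWNER_KEEP, OWNER_CLEAR, OWNER_SEARCH };

/* Pre-patch logic: mm_need_new_owner() false means "leave mm->owner alone". */
static enum owner_action original_logic(bool is_owner, int mm_users)
{
	if (mm_users <= 1 || !is_owner)
		return OWNER_KEEP;
	return OWNER_SEARCH;
}

/* First attempt: clears mm->owner whenever no new owner is needed. */
static enum owner_action first_fix(bool is_owner, int mm_users)
{
	if (mm_users <= 1 || !is_owner)
		return OWNER_CLEAR;
	return OWNER_SEARCH;
}

/* Signed-off version: a non-owner never touches mm->owner. */
static enum owner_action final_fix(bool is_owner, int mm_users)
{
	if (!is_owner)
		return OWNER_KEEP;
	if (mm_users <= 1)
		return OWNER_CLEAR;
	return OWNER_SEARCH;
}

static void check_ownership_cases(void)
{
	/* A non-owner thread exits while others still use the mm. */
	assert(original_logic(false, 3) == OWNER_KEEP);
	assert(first_fix(false, 3) == OWNER_CLEAR);	/* ownership is lost here */
	assert(final_fix(false, 3) == OWNER_KEEP);

	/* The owner exits as the last user: only the original keeps a stale pointer. */
	assert(original_logic(true, 1) == OWNER_KEEP);
	assert(first_fix(true, 1) == OWNER_CLEAR);
	assert(final_fix(true, 1) == OWNER_CLEAR);

	/* The owner exits with other users left: the search for a new owner runs. */
	assert(final_fix(true, 2) == OWNER_SEARCH);
}
```

In this model the first hunk fails the "keeps ownership when it should" case (a non-owner exiting), while the signed-off version differs from the original only when the owner is the last user.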