2010-12-22 12:28:22

by Thomas Meyer

Subject: 2.6.37-rc7: NULL pointer dereference

BUG: unable to handle kernel NULL pointer dereference at 00000008
IP: [<c04eae14>] __mem_cgroup_try_charge+0x234/0x430
*pde = 00000000
Oops: 0000 [#1]
last sysfs file: /sys/devices/platform/regulatory.0/uevent
Modules linked in: vfat fat usb_storage fuse sco bnep l2cap bluetooth cpufreq_ondemand acpi_cpufreq mperf ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables kvm_intel kvm uinput arc4 ecb snd_hda_codec_hdmi snd_hda_codec_realtek iwlagn snd_hda_intel snd_hda_codec iwlcore uvcvideo snd_hwdep mac80211 snd_seq videodev snd_seq_device snd_pcm cfg80211 snd_timer rfkill v4l1_compat wmi snd pcspkr soundcore joydev serio_raw snd_page_alloc ipv6 sha256_generic aes_i586 aes_generic cbc dm_crypt [last unloaded: scsi_wait_scan]
Pid: 8058, comm: swapoff Tainted: G I 2.6.37-rc7 #221 JM11-MS/Aspire 1810T
EIP: 0060:[<c04eae14>] EFLAGS: 00010246 CPU: 0
EIP is at __mem_cgroup_try_charge+0x234/0x430
EAX: 00000008 EBX: 00000000 ECX: f2e71f10 EDX: f2f96380
ESI: f3e55860 EDI: 00020000 EBP: f2e71eb4 ESP: f2e71e54
DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
Process swapoff (pid: 8058, ti=f2e70000 task=f3e55860 task.ti=f2e70000)
Stack:
f2e71e88 c0456607 26ba7c1c f3e55860 00000010 f3e55860 069d208a b2ee651d
00000008 000000d0 f2f96380 00000005 01ffffff f2e71f10 00000246 ec1a64ae
ffffffff 00000000 27b52eae f044dc84 00000000 f2f96380 00000000 000000d0
Call Trace:
[<c0456607>] ? ktime_get_ts+0x107/0x140
[<c04ebb89>] ? mem_cgroup_try_charge_swapin+0x49/0xb0
[<c04d9b4b>] ? unuse_mm+0x1db/0x300
[<c04dad9a>] ? sys_swapoff+0x2aa/0x890
[<c047cd58>] ? audit_syscall_entry+0x218/0x240
[<c047d043>] ? audit_syscall_exit+0x1f3/0x220
[<c0403013>] ? sysenter_do_call+0x12/0x22
Code: 55 c8 8b 82 90 01 00 00 85 c0 74 09 8b 80 7c 03 00 00 8b 58 2c 3b 1d 54 20 a9 c0 74 61 3b 1d 4c ca a4 c0 74 6a 8d 43 08 89 45 c0 <8b> 43 08 a8 01 0f 85 73 fe ff ff 8d 4b 04 89 5d bc 8d 76 00 8b
EIP: [<c04eae14>] __mem_cgroup_try_charge+0x234/0x430 SS:ESP 0068:f2e71e54
CR2: 0000000000000008


2010-12-22 15:37:13

by Minchan Kim

Subject: Re: 2.6.37-rc7: NULL pointer dereference

Cced linux-mm and maintainers of memcg.

On Wed, Dec 22, 2010 at 9:25 PM, Thomas Meyer <[email protected]> wrote:
> BUG: unable to handle kernel NULL pointer dereference at 00000008
> IP: [<c04eae14>] __mem_cgroup_try_charge+0x234/0x430
> *pde = 00000000
> Oops: 0000 [#1]
> last sysfs file: /sys/devices/platform/regulatory.0/uevent
> Modules linked in: vfat fat usb_storage fuse sco bnep l2cap bluetooth cpufreq_ondemand acpi_cpufreq mperf ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables kvm_intel kvm uinput arc4 ecb snd_hda_codec_hdmi snd_hda_codec_realtek iwlagn snd_hda_intel snd_hda_codec iwlcore uvcvideo snd_hwdep mac80211 snd_seq videodev snd_seq_device snd_pcm cfg80211 snd_timer rfkill v4l1_compat wmi snd pcspkr soundcore joydev serio_raw snd_page_alloc ipv6 sha256_generic aes_i586 aes_generic cbc dm_crypt [last unloaded: scsi_wait_scan]
> Pid: 8058, comm: swapoff Tainted: G I 2.6.37-rc7 #221 JM11-MS/Aspire 1810T
> EIP: 0060:[<c04eae14>] EFLAGS: 00010246 CPU: 0
> EIP is at __mem_cgroup_try_charge+0x234/0x430
> EAX: 00000008 EBX: 00000000 ECX: f2e71f10 EDX: f2f96380
> ESI: f3e55860 EDI: 00020000 EBP: f2e71eb4 ESP: f2e71e54
> DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
> Process swapoff (pid: 8058, ti=f2e70000 task=f3e55860 task.ti=f2e70000)
> Stack:
> f2e71e88 c0456607 26ba7c1c f3e55860 00000010 f3e55860 069d208a b2ee651d
> 00000008 000000d0 f2f96380 00000005 01ffffff f2e71f10 00000246 ec1a64ae
> ffffffff 00000000 27b52eae f044dc84 00000000 f2f96380 00000000 000000d0
> Call Trace:
> [<c0456607>] ? ktime_get_ts+0x107/0x140
> [<c04ebb89>] ? mem_cgroup_try_charge_swapin+0x49/0xb0
> [<c04d9b4b>] ? unuse_mm+0x1db/0x300
> [<c04dad9a>] ? sys_swapoff+0x2aa/0x890
> [<c047cd58>] ? audit_syscall_entry+0x218/0x240
> [<c047d043>] ? audit_syscall_exit+0x1f3/0x220
> [<c0403013>] ? sysenter_do_call+0x12/0x22
> Code: 55 c8 8b 82 90 01 00 00 85 c0 74 09 8b 80 7c 03 00 00 8b 58 2c 3b 1d 54 20 a9 c0 74 61 3b 1d 4c ca a4 c0 74 6a 8d 43 08 89 45 c0 <8b> 43 08 a8 01 0f 85 73 fe ff ff 8d 4b 04 89 5d bc 8d 76 00 8b
> EIP: [<c04eae14>] __mem_cgroup_try_charge+0x234/0x430 SS:ESP 0068:f2e71e54
> CR2: 0000000000000008



--
Kind regards,
Minchan Kim

2010-12-22 16:42:08

by Johannes Weiner

Subject: Re: 2.6.37-rc7: NULL pointer dereference

On Thu, Dec 23, 2010 at 12:37:11AM +0900, Minchan Kim wrote:
> Cced linux-mm and maintainers of memcg.
>
> On Wed, Dec 22, 2010 at 9:25 PM, Thomas Meyer <[email protected]> wrote:
> > BUG: unable to handle kernel NULL pointer dereference at 00000008
> > IP: [<c04eae14>] __mem_cgroup_try_charge+0x234/0x430
> > *pde = 00000000
> > Oops: 0000 [#1]
> > last sysfs file: /sys/devices/platform/regulatory.0/uevent
> > Modules linked in: vfat fat usb_storage fuse sco bnep l2cap bluetooth cpufreq_ondemand acpi_cpufreq mperf ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables kvm_intel kvm uinput arc4 ecb snd_hda_codec_hdmi snd_hda_codec_realtek iwlagn snd_hda_intel snd_hda_codec iwlcore uvcvideo snd_hwdep mac80211 snd_seq videodev snd_seq_device snd_pcm cfg80211 snd_timer rfkill v4l1_compat wmi snd pcspkr soundcore joydev serio_raw snd_page_alloc ipv6 sha256_generic aes_i586 aes_generic cbc dm_crypt [last unloaded: scsi_wait_scan]
> > Pid: 8058, comm: swapoff Tainted: G I 2.6.37-rc7 #221 JM11-MS/Aspire 1810T
> > EIP: 0060:[<c04eae14>] EFLAGS: 00010246 CPU: 0
> > EIP is at __mem_cgroup_try_charge+0x234/0x430
> > EAX: 00000008 EBX: 00000000 ECX: f2e71f10 EDX: f2f96380
> > ESI: f3e55860 EDI: 00020000 EBP: f2e71eb4 ESP: f2e71e54
> > DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
> > Process swapoff (pid: 8058, ti=f2e70000 task=f3e55860 task.ti=f2e70000)
> > Stack:
> > f2e71e88 c0456607 26ba7c1c f3e55860 00000010 f3e55860 069d208a b2ee651d
> > 00000008 000000d0 f2f96380 00000005 01ffffff f2e71f10 00000246 ec1a64ae
> > ffffffff 00000000 27b52eae f044dc84 00000000 f2f96380 00000000 000000d0
> > Call Trace:
> > [<c0456607>] ? ktime_get_ts+0x107/0x140
> > [<c04ebb89>] ? mem_cgroup_try_charge_swapin+0x49/0xb0
> > [<c04d9b4b>] ? unuse_mm+0x1db/0x300
> > [<c04dad9a>] ? sys_swapoff+0x2aa/0x890
> > [<c047cd58>] ? audit_syscall_entry+0x218/0x240
> > [<c047d043>] ? audit_syscall_exit+0x1f3/0x220
> > [<c0403013>] ? sysenter_do_call+0x12/0x22
> > Code: 55 c8 8b 82 90 01 00 00 85 c0 74 09 8b 80 7c 03 00 00 8b 58 2c 3b 1d 54 20 a9 c0 74 61 3b 1d 4c ca a4 c0 74 6a 8d 43 08 89 45 c0 <8b> 43 08 a8 01 0f 85 73 fe ff ff 8d 4b 04 89 5d bc 8d 76 00 8b
> > EIP: [<c04eae14>] __mem_cgroup_try_charge+0x234/0x430 SS:ESP 0068:f2e71e54
> > CR2: 0000000000000008

This could be explained by a kernel without VM_BUG_ON(), where
!mm->owner goes uncaught until css_tryget() reads mem.css.flags (eight
bytes member offset on 32-bit).

Does
http://marc.info/?l=linux-mm&m=128889198016021&w=2
help?
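
The eight-byte offset mentioned above can be checked with a small user-space sketch. The struct and field names below (css32, memcg32, flags_fault_address) are hypothetical stand-ins, not the kernel's definitions: they assume a css header made of a pointer, a 4-byte reference count and a flags word, with fixed-width fields to mimic a 32-bit build, and they assume css is the first member of the memcg structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the css header as laid out on 32-bit.
 * Fixed-width fields are used so the layout is the same on any host. */
struct css32 {
    uint32_t cgroup;   /* pointer: 4 bytes on 32-bit (assumed) */
    uint32_t refcnt;   /* reference count: 4 bytes (assumed) */
    uint32_t flags;    /* flags word read by the tryget path (assumed) */
};

/* Hypothetical stand-in: css is assumed to be the first member. */
struct memcg32 {
    struct css32 css;
};

/* With this layout the flags word sits 8 bytes into the structure. */
static_assert(offsetof(struct css32, flags) == 8, "flags at offset 8");

/* Address that a read of mem->css.flags touches for a given base address. */
uint32_t flags_fault_address(uint32_t mem_base)
{
    return mem_base + (uint32_t)(offsetof(struct memcg32, css) +
                                 offsetof(struct css32, flags));
}
```

Under these assumptions a NULL base gives a faulting address of 0x8, which is consistent with the CR2 value and with EBX=00000000, EAX=00000008 in the register dump.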

2010-12-23 10:14:00

by Balbir Singh

Subject: Re: 2.6.37-rc7: NULL pointer dereference

* MinChan Kim <[email protected]> [2010-12-23 00:37:11]:

> Cced linux-mm and maintainers of memcg.
>
> On Wed, Dec 22, 2010 at 9:25 PM, Thomas Meyer <[email protected]> wrote:
> > BUG: unable to handle kernel NULL pointer dereference at 00000008
> > IP: [<c04eae14>] __mem_cgroup_try_charge+0x234/0x430
> > *pde = 00000000
> > Oops: 0000 [#1]
> > last sysfs file: /sys/devices/platform/regulatory.0/uevent
> > Modules linked in: vfat fat usb_storage fuse sco bnep l2cap bluetooth cpufreq_ondemand acpi_cpufreq mperf ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables kvm_intel kvm uinput arc4 ecb snd_hda_codec_hdmi snd_hda_codec_realtek iwlagn snd_hda_intel snd_hda_codec iwlcore uvcvideo snd_hwdep mac80211 snd_seq videodev snd_seq_device snd_pcm cfg80211 snd_timer rfkill v4l1_compat wmi snd pcspkr soundcore joydev serio_raw snd_page_alloc ipv6 sha256_generic aes_i586 aes_generic cbc dm_crypt [last unloaded: scsi_wait_scan]
> > Pid: 8058, comm: swapoff Tainted: G I 2.6.37-rc7 #221 JM11-MS/Aspire 1810T
> > EIP: 0060:[<c04eae14>] EFLAGS: 00010246 CPU: 0
> > EIP is at __mem_cgroup_try_charge+0x234/0x430
> > EAX: 00000008 EBX: 00000000 ECX: f2e71f10 EDX: f2f96380
> > ESI: f3e55860 EDI: 00020000 EBP: f2e71eb4 ESP: f2e71e54
> > DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
> > Process swapoff (pid: 8058, ti=f2e70000 task=f3e55860 task.ti=f2e70000)
> > Stack:
> > f2e71e88 c0456607 26ba7c1c f3e55860 00000010 f3e55860 069d208a b2ee651d
> > 00000008 000000d0 f2f96380 00000005 01ffffff f2e71f10 00000246 ec1a64ae
> > ffffffff 00000000 27b52eae f044dc84 00000000 f2f96380 00000000 000000d0
> > Call Trace:
> > [<c0456607>] ? ktime_get_ts+0x107/0x140
> > [<c04ebb89>] ? mem_cgroup_try_charge_swapin+0x49/0xb0
> > [<c04d9b4b>] ? unuse_mm+0x1db/0x300
> > [<c04dad9a>] ? sys_swapoff+0x2aa/0x890
> > [<c047cd58>] ? audit_syscall_entry+0x218/0x240
> > [<c047d043>] ? audit_syscall_exit+0x1f3/0x220
> > [<c0403013>] ? sysenter_do_call+0x12/0x22
> > Code: 55 c8 8b 82 90 01 00 00 85 c0 74 09 8b 80 7c 03 00 00 8b 58 2c 3b 1d 54 20 a9 c0 74 61 3b 1d 4c ca a4 c0 74 6a 8d 43 08 89 45 c0 <8b> 43 08 a8 01 0f 85 73 fe ff ff 8d 4b 04 89 5d bc 8d 76 00 8b
> > EIP: [<c04eae14>] __mem_cgroup_try_charge+0x234/0x430 SS:ESP 0068:f2e71e54
> > CR2: 0000000000000008

Thanks for the report, does this happen at bootup?

--
Three Cheers,
Balbir

2010-12-23 13:21:38

by Thomas Meyer

Subject: Re: 2.6.37-rc7: NULL pointer dereference

On 22.12.2010 at 20:06, Balbir Singh <[email protected]> wrote:

> Thanks for the report, does this happen at bootup?

I tried to manually upgrade systemd-10 on Fedora 14 to systemd-15. The above error occurred after the installation, while trying to reboot the computer. Sadly, I had to revert to systemd-10 because of SELinux policy problems.

With kind regards
Thomas-

2010-12-29 21:50:43

by Hugh Dickins

Subject: Re: 2.6.37-rc7: NULL pointer dereference

On Wed, 22 Dec 2010, Johannes Weiner wrote:
> On Thu, Dec 23, 2010 at 12:37:11AM +0900, Minchan Kim wrote:
> > On Wed, Dec 22, 2010 at 9:25 PM, Thomas Meyer <[email protected]> wrote:
> > > BUG: unable to handle kernel NULL pointer dereference at 00000008
> > > IP: [<c04eae14>] __mem_cgroup_try_charge+0x234/0x430
> > > Process swapoff (pid: 8058, ti=f2e70000 task=f3e55860 task.ti=f2e70000)
> > > Call Trace:
> > > [<c0456607>] ? ktime_get_ts+0x107/0x140
> > > [<c04ebb89>] ? mem_cgroup_try_charge_swapin+0x49/0xb0
> > > [<c04d9b4b>] ? unuse_mm+0x1db/0x300
> > > [<c04dad9a>] ? sys_swapoff+0x2aa/0x890
> > > [<c047cd58>] ? audit_syscall_entry+0x218/0x240
> > > [<c047d043>] ? audit_syscall_exit+0x1f3/0x220
> > > [<c0403013>] ? sysenter_do_call+0x12/0x22
>
> This could be explained by a kernel without VM_BUG_ON(), where
> !mm->owner goes uncaught until css_tryget() reads mem.css.flags (eight
> bytes member offset on 32-bit).
>
> Does
> http://marc.info/?l=linux-mm&m=128889198016021&w=2
> help?

I'm sure you're right, Hannes. Thanks for the prod. Sadly, Kame
and I both let the fix drift, expecting it to magick its way into
Linus's tree. We're now at rc8: I'd better change my Acked-by to
a Signed-off-by and try sending it in immediately: will do so now.

Hugh

2010-12-29 22:07:27

by Hugh Dickins

Subject: [PATCH] memcg: fix wrong VM_BUG_ON() in try_charge()'s mm->owner check

From: KAMEZAWA Hiroyuki <[email protected]>

At __mem_cgroup_try_charge(), VM_BUG_ON(!mm->owner) is checked.
But as commented in mem_cgroup_from_task(), mm->owner can be NULL
in some racy case. This check of VM_BUG_ON() is bad.

A possible story to hit this is at swapoff()->try_to_unuse(). It passes
mm_struct to mem_cgroup_try_charge_swapin() while mm->owner is NULL. If we
can't get proper mem_cgroup from swap_cgroup information, mm->owner is used
as charge target and we see NULL.

Cc: Daisuke Nishimura <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Reported-by: Hugh Dickins <[email protected]>
Reported-by: Thomas Meyer <[email protected]>
Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
Reviewed-by: Balbir Singh <[email protected]>
Signed-off-by: Hugh Dickins <[email protected]>
Cc: [email protected]
---
Sorry, I hit this on 2.6.36, and we lined up this patch early in
November, but never really pushed it: now Thomas hit it on 37-rc7.

mm/memcontrol.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)

--- 2.6.37-rc8/mm/memcontrol.c 2010-11-29 22:29:32.000000000 -0800
+++ linux/mm/memcontrol.c 2010-12-28 21:42:29.000000000 -0800
@@ -1925,19 +1925,18 @@ again:

rcu_read_lock();
p = rcu_dereference(mm->owner);
- VM_BUG_ON(!p);
/*
- * because we don't have task_lock(), "p" can exit while
- * we're here. In that case, "mem" can point to root
- * cgroup but never be NULL. (and task_struct itself is freed
- * by RCU, cgroup itself is RCU safe.) Then, we have small
- * risk here to get wrong cgroup. But such kind of mis-account
- * by race always happens because we don't have cgroup_mutex().
- * It's overkill and we allow that small race, here.
+ * Because we don't have task_lock(), "p" can exit.
+ * In that case, "mem" can point to root or p can be NULL with
+ * race with swapoff. Then, we have small risk of mis-accouning.
+ * But such kind of mis-account by race always happens because
+ * we don't have cgroup_mutex(). It's overkill and we allo that
+ * small race, here.
+ * (*) swapoff at el will charge against mm-struct not against
+ * task-struct. So, mm->owner can be NULL.
*/
mem = mem_cgroup_from_task(p);
- VM_BUG_ON(!mem);
- if (mem_cgroup_is_root(mem)) {
+ if (!mem || mem_cgroup_is_root(mem)) {
rcu_read_unlock();
goto done;
}
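
The effect of the changed condition can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: struct task, struct mem_cgroup, memcg_from_task() and charge_is_bypassed() are hypothetical stand-ins, and the only behaviour taken from the patch description is that a NULL owner yields a NULL group and that the NULL case now takes the same early exit as the root group.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for task_struct and mem_cgroup. */
struct task { int cgroup_id; };
struct mem_cgroup { bool is_root; };

static struct mem_cgroup root_memcg = { .is_root = true };
static struct mem_cgroup child_memcg = { .is_root = false };

/* Models the lookup described in the patch: a NULL task gives NULL. */
struct mem_cgroup *memcg_from_task(const struct task *p)
{
    if (!p)
        return NULL;
    return p->cgroup_id == 0 ? &root_memcg : &child_memcg;
}

/* Returns true when the charge is skipped (the "goto done" path). */
bool charge_is_bypassed(const struct task *owner)
{
    struct mem_cgroup *mem = memcg_from_task(owner);

    /* Patched condition: NULL is tested before mem is used any further. */
    return !mem || mem->is_root;
}

/* Small self-check covering the three cases. */
void check_examples(void)
{
    struct task in_root = { .cgroup_id = 0 };
    struct task in_child = { .cgroup_id = 1 };

    assert(charge_is_bypassed(NULL));       /* owner gone: skip the charge */
    assert(charge_is_bypassed(&in_root));   /* root group: skip the charge */
    assert(!charge_is_bypassed(&in_child)); /* normal group: charge it */
}
```

In this model the NULL-owner case never reaches code that dereferences mem, which is the property the patch is after.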