2008-12-15 15:30:48

by Louis Rilling

Subject: OOPS in Linux 2.6.28-rc8: NULL pointer access in real_lookup()

Hi,

While leaving two Linux 2.6.28-rc8 boxes (.config attached) running
(almost) idle over the weekend, I got the same OOPS on both machines:

[141484.875805] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[141484.876023] IP: [<0000000000000000>] 0x0
[141484.876117] PGD 6e83f067 PUD 6e83e067 PMD 0
[141484.876218] Oops: 0010 [#1] SMP
[141484.876312] last sysfs file: /sys/kernel/uevent_seqnum
[141484.876407] Dumping ftrace buffer:
[141484.876494] (ftrace buffer empty)
[141484.876578] CPU 0
[141484.876662] Modules linked in:
[141484.876750] Pid: 5258, comm: find Not tainted 2.6.28-rc8 #37
[141484.876843] RIP: 0010:[<0000000000000000>] [<0000000000000000>] 0x0
[141484.876940] RSP: 0018:ffff88007c141c90 EFLAGS: 00010282
[141484.877031] RAX: ffffffff806e8f90 RBX: ffff88006edb9070 RCX: ffffffff80a94468
[141484.877198] RDX: ffff88007c141de8 RSI: ffff88006edb9070 RDI: ffff88006ed0e300
[141484.877363] RBP: ffff88007c141cd8 R08: ffff88007a884900 R09: 0000000000000038
[141484.877527] R10: 0000000000000000 R11: 0000000000000246 R12: ffff88006ed11e70
[141484.877691] R13: ffff88006ed0e300 R14: ffff88007c141de8 R15: ffff88006ed0e3f0
[141484.877858] FS: 00007f147e0606e0(0000) GS:ffffffff80a05a80(0000) knlGS:0000000000000000
[141484.878027] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[141484.878120] CR2: 0000000000000000 CR3: 000000007c11d000 CR4: 00000000000006e0
[141484.878286] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[141484.878347] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[141484.878347] Process find (pid: 5258, threadinfo ffff88007c140000, task ffff88007a884240)
[141484.878347] Stack:
[141484.878347] ffffffff802c06a5 ffff88007c141d18 ffff88007c141d08 ffff88007ef3d280
[141484.878347] 0000000000000000 ffff88007c141de8 ffff88006ed0e300 ffff88007c141d18
[141484.878347] ffff88006e82c00e ffff88007c141d58 ffffffff802c1cbd ffff88007cc7c2c0
[141484.878347] Call Trace:
[141484.878347] [<ffffffff802c06a5>] ? do_lookup+0xdc/0x164
[141484.878347] [<ffffffff802c1cbd>] __link_path_walk+0x53a/0x69d
[141484.878347] [<ffffffff8020b668>] ? ftrace_call+0x5/0x2b
[141484.878347] [<ffffffff802c1e73>] path_walk+0x53/0x9b
[141484.878347] [<ffffffff802c2016>] do_path_lookup+0x116/0x136
[141484.878347] [<ffffffff802c2e0c>] ? getname+0x16b/0x1ad
[141484.878347] [<ffffffff802c3864>] user_path_at+0x57/0x98
[141484.878347] [<ffffffff802bbd61>] ? new_encode_dev+0x9/0x24
[141484.878347] [<ffffffff802bc213>] ? cp_new_stat+0xdb/0xf4
[141484.878347] [<ffffffff8020b668>] ? ftrace_call+0x5/0x2b
[141484.878347] [<ffffffff802bc082>] vfs_lstat_fd+0x23/0x50
[141484.878347] [<ffffffff802bc29f>] sys_newlstat+0x27/0x41
[141484.878347] [<ffffffff8020b7ec>] ? sysret_check+0x27/0x62
[141484.878347] [<ffffffff8025d6f8>] ? trace_hardirqs_on_caller+0x105/0x129
[141484.878347] [<ffffffff806b99a5>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[141484.878347] [<ffffffff802c2e0c>] ? getname+0x16b/0x1ad
[141484.878347] [<ffffffff8020b7bb>] system_call_fastpath+0x16/0x1b
[141484.878347] Code: Bad RIP value.
[141484.878347] RIP [<0000000000000000>] 0x0
[141484.878347] RSP <ffff88007c141c90>
[141484.878347] CR2: 0000000000000000
[141484.881877] ---[ end trace bf0740fca14d9788 ]---

The actual NULL pointer access is a call through a NULL lookup() operation in
real_lookup() (fs/namei.c, line 506):

result = dir->i_op->lookup(dir, dentry, nd);
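
To make the failure mode concrete, here is a minimal userspace sketch of an
indirect call through an operations table, with a guard in front of it. All
model_* names are hypothetical stand-ins, not kernel code, and the -ENOTDIR
return value is an assumption chosen for illustration. This only models the
symptom; it does not explain why lookup() was NULL here, and it is not a
proposed fix.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; all names are hypothetical. */
struct model_inode;

struct model_inode_operations {
    /* Returns 0 on success or a negative errno; may be left NULL. */
    int (*lookup)(struct model_inode *dir, const char *name);
};

struct model_inode {
    const struct model_inode_operations *i_op;
};

/* Trivial lookup() used by the operations table that provides one. */
int model_dir_lookup(struct model_inode *dir, const char *name)
{
    (void)dir;
    (void)name;
    return 0;
}

/* Operations table with a lookup() method. */
const struct model_inode_operations model_dir_ops = {
    .lookup = model_dir_lookup,
};

/* Operations table without a lookup() method (pointer left NULL). */
const struct model_inode_operations model_file_ops = {
    .lookup = NULL,
};

/*
 * The unguarded form mirrors the quoted line:
 *
 *     return dir->i_op->lookup(dir, name);
 *
 * With lookup == NULL this jumps to address 0, which matches an oops
 * showing RIP = 0 and CR2 = 0.
 *
 * The guarded form below returns an error instead of calling through NULL.
 */
int model_guarded_lookup(struct model_inode *dir, const char *name)
{
    assert(dir != NULL && dir->i_op != NULL);
    if (dir->i_op->lookup == NULL)
        return -ENOTDIR; /* assumed error code, for illustration only */
    return dir->i_op->lookup(dir, name);
}
```

Calling model_guarded_lookup() on an inode that uses model_file_ops returns
-ENOTDIR, whereas the unguarded call would fault the same way as above.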

It seems that this happened during the execution of a cron task such as man-db
or updatedb (the process is named find on my Debian), but I have not been able
to reproduce the bug yet.


Both boxes run on the same NFSroot, with local tmpfs mounts.
Contents of /proc/mounts (taken after a reboot, since cat /proc/mounts was
blocked as shown below), in case it helps:

rootfs / rootfs rw 0 0
none /sys sysfs rw 0 0
none /proc proc rw 0 0
udev /dev tmpfs rw,size=10240k,mode=755 0 0
10.4.7.254:/srv/nfsroot64 / nfs rw,vers=3,rsize=4096,wsize=4096,namlen=255,hard,nointr,proto=tcp,timeo=7,retrans=3,sec=sys,addr=10.4.7.254 0 0
tmpfs /var/run tmpfs rw 0 0
10.4.7.254:/srv/nfsroot64 /dev/.static/dev nfs ro,vers=3,rsize=4096,wsize=4096,namlen=255,hard,nointr,proto=tcp,timeo=7,retrans=3,sec=sys,addr=10.4.7.254 0 0
tmpfs /lib/init/rw tmpfs rw,nosuid,mode=755 0 0
usbfs /proc/bus/usb usbfs rw,nosuid,nodev,noexec 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
devpts /dev/pts devpts rw,nosuid,noexec,gid=5,mode=620 0 0
debugfs /debug debugfs rw 0 0
configfs /config configfs rw 0 0
none /sys/kernel/config configfs rw 0 0


Blocked tasks:

[258896.001138] SysRq : Show Blocked State
[258896.001251] task PC stack pid father
[258896.001356] events/0 D 0000000103660de6 6480 9 2
[258896.001459] ffff88007d32bd50 0000000000000046 ffff88007d32bd50 0000000000000046
[258896.001640] ffffffff80c1d580 ffffffff80c1d580 ffffffff80c1d580 ffffffff80c1d580
[258896.001822] ffffffff80c1d580 ffffffff80c1d580 ffffffff80c1a0c0 ffffffff80c1d580
[258896.002003] Call Trace:
[258896.002093] [<ffffffff8025d612>] ? trace_hardirqs_on_caller+0x1f/0x129
[258896.002191] [<ffffffff806b987a>] __down_write_nested+0x95/0xba
[258896.002285] [<ffffffff806b98aa>] __down_write+0xb/0xd
[258896.002378] [<ffffffff806b8bd9>] down_write+0x6d/0x7d
[258896.002471] [<ffffffff802ced13>] ? mark_mounts_for_expiry+0x49/0x118
[258896.002568] [<ffffffff802ced13>] mark_mounts_for_expiry+0x49/0x118
[258896.002665] [<ffffffff803a1b48>] ? nfs_expire_automounts+0x0/0x38
[258896.002759] [<ffffffff803a1b5d>] nfs_expire_automounts+0x15/0x38
[258896.002854] [<ffffffff8024df90>] run_workqueue+0xfc/0x211
[258896.002948] [<ffffffff8024df39>] ? run_workqueue+0xa5/0x211
[258896.003045] [<ffffffff8024ed2b>] worker_thread+0xe8/0xf9
[258896.003139] [<ffffffff80251dcd>] ? autoremove_wake_function+0x0/0x3d
[258896.003235] [<ffffffff8024ec43>] ? worker_thread+0x0/0xf9
[258896.003328] [<ffffffff80251c40>] kthread+0x4e/0x7e
[258896.003420] [<ffffffff8020c8d9>] child_rip+0xa/0x11
[258896.003512] [<ffffffff8020bdf4>] ? restore_args+0x0/0x30
[258896.003605] [<ffffffff80251bf2>] ? kthread+0x0/0x7e
[258896.003697] [<ffffffff8020c8cf>] ? child_rip+0x0/0x11
[258896.003788] find D 00000001036425a0 3328 5569 5568
[258896.003788] ffff88007c1c1928 0000000000000046 ffff88007c1c1898 ffffffff802c9075
[258896.003788] ffffffff80c1d580 ffffffff80c1d580 ffffffff80c1d580 ffffffff80c1d580
[258896.003788] ffffffff80c1d580 ffffffff80c1d580 ffffffff80c1a0c0 ffffffff80c1d580
[258896.003788] Call Trace:
[258896.003788] [<ffffffff802c9075>] ? d_obtain_alias+0x13a/0x159
[258896.003788] [<ffffffff802856d5>] ? time_hardirqs_on+0x12/0x26
[258896.003788] [<ffffffff806b8675>] ? __mutex_lock_common+0x208/0x315
[258896.003788] [<ffffffff806b867d>] __mutex_lock_common+0x210/0x315
[258896.003788] [<ffffffff802cd541>] ? graft_tree+0x7e/0xd2
[258896.003788] [<ffffffff802cd541>] ? graft_tree+0x7e/0xd2
[258896.003788] [<ffffffff806b883a>] mutex_lock_nested+0x3a/0x3f
[258896.003788] [<ffffffff802cd541>] graft_tree+0x7e/0xd2
[258896.003788] [<ffffffff802cd78e>] ? do_add_mount+0x33/0x110
[258896.003788] [<ffffffff802cd811>] do_add_mount+0xb6/0x110
[258896.003788] [<ffffffff803a1a68>] nfs_follow_mountpoint+0x257/0x337
[258896.003788] [<ffffffff802c837f>] ? dput+0x42/0x143
[258896.003788] [<ffffffff80393b00>] ? nfs_lookup_revalidate+0x217/0x2f2
[258896.003788] [<ffffffff8020b668>] ? ftrace_call+0x5/0x2b
[258896.003788] [<ffffffff806b9ceb>] ? _spin_unlock+0x2b/0x2f
[258896.003788] [<ffffffff8020b668>] ? ftrace_call+0x5/0x2b
[258896.003788] [<ffffffff802c29b2>] do_follow_link+0xd6/0x288
[258896.003788] [<ffffffff802c062f>] ? do_lookup+0x66/0x164
[258896.003788] [<ffffffff802c1cf8>] __link_path_walk+0x575/0x69d
[258896.003788] [<ffffffff8020b668>] ? ftrace_call+0x5/0x2b
[258896.003788] [<ffffffff802c1e73>] path_walk+0x53/0x9b
[258896.003788] [<ffffffff802c2016>] do_path_lookup+0x116/0x136
[258896.003788] [<ffffffff802c2095>] path_lookup_open+0x5f/0xa0
[258896.003788] [<ffffffff802c218d>] do_filp_open+0xb7/0x77c
[258896.003788] [<ffffffff8020b668>] ? ftrace_call+0x5/0x2b
[258896.003788] [<ffffffff806b9ceb>] ? _spin_unlock+0x2b/0x2f
[258896.003788] [<ffffffff802cbbfe>] ? alloc_fd+0x101/0x110
[258896.003788] [<ffffffff802b736a>] do_sys_open+0x58/0xdf
[258896.003788] [<ffffffff802b7424>] sys_open+0x20/0x22
[258896.003788] [<ffffffff8020b7bb>] system_call_fastpath+0x16/0x1b
[258896.003788] cat D 0000000103da00ee 3328 5599 4371
[258896.003788] ffff88007c171e08 0000000000000046 0000000000000000 ffffe20002e4aa20
[258896.003788] ffffffff80c1d580 ffffffff80c1d580 ffffffff80c1d580 ffffffff80c1d580
[258896.003788] ffffffff80c1d580 ffffffff80c1d580 ffffffff80c1a0c0 ffffffff80c1d580
[258896.003788] Call Trace:
[258896.003788] [<ffffffff8025d612>] ? trace_hardirqs_on_caller+0x1f/0x129
[258896.003788] [<ffffffff806b9947>] __down_read+0x9b/0xbf
[258896.003788] [<ffffffff806b8c59>] down_read+0x70/0x80
[258896.003788] [<ffffffff802cc9e5>] ? m_start+0x22/0x3b
[258896.003788] [<ffffffff802cc9e5>] m_start+0x22/0x3b
[258896.003788] [<ffffffff802cfcf9>] seq_read+0x105/0x334
[258896.003788] [<ffffffff802b952b>] vfs_read+0xa9/0xe3
[258896.003788] [<ffffffff802b979e>] sys_read+0x4c/0x71
[258896.003788] [<ffffffff8020b7bb>] system_call_fastpath+0x16/0x1b

Thanks!

Louis

--
Dr Louis Rilling Kerlabs
Skype: louis.rilling Batiment Germanium
Phone: (+33|0) 6 80 89 08 23 80 avenue des Buttes de Coesmes
http://www.kerlabs.com/ 35700 Rennes

