2006-11-20 17:36:38

by Sergey Vlasov

[permalink] [raw]
Subject: proc_kill_inodes() race WRT vfs_read() and other users of filp->f_ops

Hello!

proc_kill_inodes() (called by remove_proc_entry()) tries to revoke
access to the removed entry by finding all open files for that entry
and setting filp->f_op = NULL in them. However, it does so without
any locking against users of filp->f_op (file_list_lock() protects
only against concurrent file opens, but does nothing to protect
against sys_read() calls for an already open file).

One way to get a reproducible oops:

# modprobe bonding
# while true; do ip link set bond0 name bond1; ip link set bond1 name bond0; done

and try to run this program in parallel:

#include <fcntl.h>
#include <unistd.h>

int main(int argc, char **argv)
{
char buf[4096];
for (;;) {
int fd = open(argv[1], O_RDONLY);
if (fd != -1) {
while (read(fd, buf, sizeof(buf)) != -1)
;
close(fd);
}
}
}

./file_read_check /proc/net/bonding/bond0

After some warnings:

de_put: deferred delete of bond0
remove_proc_entry: bonding/bond0 busy, count=1
de_put: deferred delete of bond0
remove_proc_entry: bonding/bond0 busy, count=1
de_put: deferred delete of bond0
remove_proc_entry: bonding/bond0 busy, count=1

the kernel soon oopses:

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000008
printing eip:
c01621e8
*pde = 00000000
Oops: 0000 [#1]
SMP
Modules linked in: bonding binfmt_misc nfsd exportfs lockd nfs_acl sunrpc autofs4 ac thermal processor fan button ide_cd cdrom tsdev analog ns558 gameport psmouse parport_pc parport floppy pcspkr sg it87 hwmon_vid hwmon eeprom i2c_isa ipx p8023 usb_storage libusual snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_mixer_oss snd_ac97_bus snd_pcm intel_agp snd_timer snd skge agpgart uhci_hcd ehci_hcd soundcore i2c_i801 usbcore snd_page_alloc i2c_core nls_koi8_r nls_cp866 vfat fat nls_base ext2 mbcache dm_mod rtc xfs ata_piix libata sd_mod scsi_mod ide_disk ide_generic generic piix ide_core
CPU: 1
EIP: 0060:[<c01621e8>] Not tainted VLI
EFLAGS: 00010246 (2.6.18-std-smp-alt0.11.1 #1)
EIP is at vfs_read+0x90/0x15d
eax: 00000000 ebx: f6272a40 ecx: c02dc100 edx: 00000004
esi: 00000000 edi: bf883b04 ebp: 00001000 esp: ee7fdf84
ds: 007b es: 007b ss: 0068
Process file_read_check (pid: 29955, ti=ee7fd000 task=dfcf27a0 task.ti=ee7fd000)
Stack: 00001000 00000000 f6272a40 fffffff7 bf884bb8 ee7fd000 c0162672 ee7fdfa4
000000b7 00000000 00000000 00000003 bf883b04 c0102e17 00000003 bf883b04
00001000 bf883b04 bf884bb8 bf884b18 00000003 0000007b 0000007b 00000003
Call Trace:
[<c0162672>] sys_read+0x41/0x67
[<c0102e17>] syscall_call+0x7/0xb
DWARF2 unwinder stuck at syscall_call+0x7/0xb
Leftover inexact backtrace:
Code: 85 c0 89 c5 0f 88 e5 00 00 00 8b 0d 00 89 3e c0 ba 04 00 00 00 89 d8 ff 91 f0 00 00 00 85 c0 74 07 89 c5 e9 c7 00 00 00 8b 43 10 <8b> 70 08 85 f6 74 11 8b 44 24 1c 89 e9 89 fa 89 04 24 89 d8 ff
EIP: [<c01621e8>] vfs_read+0x90/0x15d SS:ESP 0068:ee7fdf84

The kernel crashes at this place in vfs_read():

if (file->f_op->read)
ret = file->f_op->read(file, buf, count, pos);

because file->f_op is NULL, even though it was checked to be non-NULL
before.

This was reproduced on a 2.6.18.3-based kernel, but I don't see any
changes which would solve this problem in the current version.

Does someone have an idea how to fix this?

--
Sergey Vlasov


Attachments:
(No filename) (3.29 kB)
(No filename) (189.00 B)
Download all attachments