2000-11-24 14:19:30

by Igor Yu. Zhbanov

[permalink] [raw]
Subject: Kernel Oops on locking sockets via fcntl()

Hello!

One fine day accidentally I have opened an Xserver's socket placed in /tmp
with my favourite text editor "le". I have got a message from the kernel similar
to this:

Unable to handle kernel NULL pointer dereference at virtual address 00000038
current->tss.cr3 = 0336f000, %cr3 = 0336f000
*pde = 0336d067
*pte = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c012d49f>]
EFLAGS: 00010202
... and so on.

I couldn't leave it aside. And I started to debug the Kernel by inserting printk
(it's my favourite method of kernel debugging :). Finally I found out the place
where the "Oops" has occurred. It is in function fcntl_setlk() which is placed
in the file linux/fs/locks.c. Here is a piece of code from this function:
-----------------------------------------------------------
struct file *filp;
...
if (filp->f_op->lock != NULL) { /* Oooops!!! */
error = filp->f_op->lock(filp, cmd, &file_lock);
if (error < 0)
goto out_putf;
}
error = posix_lock_file(filp, &file_lock, cmd == F_SETLKW);
-----------------------------------------------------------
As you may see the first "if" statement checks if "filp->f_op->lock" is equal
to "NULL". But in the case of sockets "filp->f_op" is equal to "NULL"!
The "filp->f_op" is the pointer to "file_operations" structure which contains
pointers to functions operating with this type of file.

This problem can be fixed in several ways:
1) We must check EVERYWHERE if "filp->f_op" is equal to "NULL" (inserting this
check to fcntl_setlk() is not enough since you will get an "Oops" when you
will try to unlock the socket).
2) We make certain the "filp->f_op" is not equal to "NULL". Thoughts about it
is described below.
3) Both 1) and 2).
4) Something else.

By the way can we lock a sockets or a pipes or another special files? Or maybe
the function should return an error in case of special files? The answer to this
question depends on the concepts of the Linux Kernel.

It seems what there are no problems with flock() but does is actually locks the
sockets?

Now some words about "filp->f_op". In many functions deal with files and inodes
we may see peace of code like this (linux/fs/ext2/inode.c):
-----------------------------------------------------
inode->i_op = NULL;
if (inode->i_ino == EXT2_ACL_IDX_INO ||
inode->i_ino == EXT2_ACL_DATA_INO)
/* Nothing to do */ ;
else if (S_ISREG(inode->i_mode))
inode->i_op = &ext2_file_inode_operations;
else if (S_ISDIR(inode->i_mode))
inode->i_op = &ext2_dir_inode_operations;
else if (S_ISLNK(inode->i_mode))
inode->i_op = &ext2_symlink_inode_operations;
else if (S_ISCHR(inode->i_mode))
inode->i_op = &chrdev_inode_operations;
else if (S_ISBLK(inode->i_mode))
inode->i_op = &blkdev_inode_operations;
else if (S_ISFIFO(inode->i_mode))
init_fifo(inode);
-----------------------------------------------------
What about "else if (S_ISSOCK(inode->i_mode)) ..."? Since there is no "S_ISSOCK"
the "inode->i_op" is still equal to "NULL" by default. Is it normal or we just
forgot about the sockets? Since the EXT2 is not the only one file system which
can contain a sockets I have found similar code in the following files:
linux/fs/coda/cnode.c
linux/fs/isofs/inode.c
linux/fs/nfs/inode.c
linux/fs/umsdos/inode.c
linux/fs/ext2/inode.c
linux/fs/ext2/namei.c
linux/fs/minix/inode.c
linux/fs/minix/namei.c
linux/fs/sysv/inode.c
linux/fs/sysv/namei.c
linux/fs/ufs/inode.c
linux/fs/ufs/namei.c

And everywhere "S_ISSOCK()" is absent. Is it normal?

Also I have found "struct file_operations socket_file_ops" structure definition
in the file "linux/net/socket.c". Should we use this structure?

That's all. Awaiting your suggestions.

P.S. Since I'm not currently subscribed to Linux Kernel Mailing List please Cc:
all replies also to [email protected] if any.


2000-11-24 14:26:10

by Andi Kleen

[permalink] [raw]
Subject: Re: Kernel Oops on locking sockets via fcntl()

On Fri, Nov 24, 2000 at 04:32:26PM +0300, Igor Yu. Zhbanov wrote:
> Hello!
>
> One fine day accidentally I have opened an Xserver's socket placed in /tmp
> with my favourite text editor "le". I have got a message from the kernel similar
> to this:


Which kernel version are you using ? It should be already fixed in all
modern kernels (newer 2.2 and 2.4)


-Andi

2000-11-27 13:29:50

by Igor Yu. Zhbanov

[permalink] [raw]
Subject: Re: Kernel Oops on locking sockets via fcntl()

On Fri, 24 Nov 2000, Andi Kleen wrote:

> On Fri, Nov 24, 2000 at 04:32:26PM +0300, Igor Yu. Zhbanov wrote:
> > Hello!
> >
> > One fine day accidentally I have opened an Xserver's socket placed in /tmp
> > with my favourite text editor "le". I have got a message from the kernel similar
> > to this:
>
>
> Which kernel version are you using ? It should be already fixed in all
> modern kernels (newer 2.2 and 2.4)
>
>
> -Andi
>

My Kernel version is 2.2.17

2000-11-27 13:53:29

by Andi Kleen

[permalink] [raw]
Subject: Re: Kernel Oops on locking sockets via fcntl()

On Mon, Nov 27, 2000 at 03:59:05PM +0300, Igor Yu. Zhbanov wrote:
> On Fri, 24 Nov 2000, Andi Kleen wrote:
>
> > On Fri, Nov 24, 2000 at 04:32:26PM +0300, Igor Yu. Zhbanov wrote:
> > > Hello!
> > >
> > > One fine day accidentally I have opened an Xserver's socket placed in /tmp
> > > with my favourite text editor "le". I have got a message from the kernel similar
> > > to this:
> >
> >
> > Which kernel version are you using ? It should be already fixed in all
> > modern kernels (newer 2.2 and 2.4)
> >
> >
> > -Andi
> >
>
> My Kernel version is 2.2.17

Oops, looks like the patch got lost. I cannot find it in 2.2.18pre23 anymore.
It came up a few weeks ago and was included in some distribution kernels
(e.g. the SuSE one), but apparently fell through the tracks in the official
2.2 kernel.

Alan? Would you consider including it still in 2.2.18 ?


-Andi


diff -urN linux.a/fs/locks.c linux.s/fs/locks.c
--- linux.a/fs/locks.c Fri Sep 1 01:01:12 2000
+++ linux.s/fs/locks.c Fri Sep 1 01:33:05 2000
@@ -345,6 +345,9 @@
if (!filp->f_dentry || !filp->f_dentry->d_inode)
goto out_putf;

+ if (!filp->f_op)
+ goto out_putf;
+
if (!flock_to_posix_lock(filp, &file_lock, &flock))
goto out_putf;

@@ -425,6 +428,8 @@
goto out_putf;
if (!(inode = dentry->d_inode))
goto out_putf;
+ if (!filp->f_op)
+ goto out_putf;

/* Don't allow mandatory locks on files that may be memory mapped
* and shared.
@@ -517,6 +522,8 @@
error = -EINVAL;
if (!filp->f_dentry || !filp->f_dentry->d_inode)
goto out_putf;
+ if (!filp->f_op)
+ goto out_putf;

if (!flock64_to_posix_lock(filp, &file_lock, &flock))
goto out_putf;
@@ -585,6 +592,8 @@
goto out_putf;
if (!(inode = dentry->d_inode))
goto out_putf;
+ if (!filp->f_op)
+ goto out_putf;

/* Don't allow mandatory locks on files that may be memory mapped
* and shared.
@@ -669,8 +678,9 @@
before = &inode->i_flock;
while ((fl = *before) != NULL) {
if ((fl->fl_flags & FL_POSIX) && fl->fl_owner == owner) {
- int (*lock)(struct file *, int, struct file_lock *);
- lock = filp->f_op->lock;
+ int (*lock)(struct file *, int, struct file_lock *) = NULL;
+ if (filp->f_op)
+ lock = filp->f_op->lock;
if (lock) {
file_lock = *fl;
file_lock.fl_type = F_UNLCK;



-Andi

2000-11-28 12:07:10

by David Miller

[permalink] [raw]
Subject: Re: Kernel Oops on locking sockets via fcntl()


This should fix the bug, there were some cases outside
of file locking (bonus points to anyone who can craft
an exploit for the ELF interpreter case :-):

--- ./fs/binfmt_elf.c.~1~ Mon Oct 30 11:34:25 2000
+++ ./fs/binfmt_elf.c Tue Nov 28 03:03:43 2000
@@ -247,7 +247,7 @@
goto out;
if (!elf_check_arch(interp_elf_ex))
goto out;
- if (!interpreter->f_op->mmap)
+ if (!interpreter->f_op || !interpreter->f_op->mmap)
goto out;

/*
@@ -364,7 +364,7 @@

do_brk(0, text_data);
retval = -ENOEXEC;
- if (!interpreter->f_op->read)
+ if (!interpreter->f_op || !interpreter->f_op->read)
goto out;
retval = interpreter->f_op->read(interpreter, addr, text_data, &offset);
if (retval < 0)
@@ -789,7 +789,7 @@

/* First of all, some simple consistency checks */
if (elf_ex.e_type != ET_EXEC || elf_ex.e_phnum > 2 ||
- !elf_check_arch(&elf_ex) || !file->f_op->mmap)
+ !elf_check_arch(&elf_ex) || !file->f_op || !file->f_op->mmap)
goto out;

/* Now read in all of the header information */
--- ./fs/dquot.c.~1~ Fri Nov 17 17:24:03 2000
+++ ./fs/dquot.c Tue Nov 28 03:05:43 2000
@@ -1474,7 +1474,7 @@
if (IS_ERR(f))
goto out_lock;
error = -EIO;
- if (!f->f_op->read && !f->f_op->write)
+ if (!f->f_op || (!f->f_op->read && !f->f_op->write))
goto out_f;
inode = f->f_dentry->d_inode;
error = -EACCES;
--- ./fs/exec.c.~1~ Thu Nov 9 20:44:59 2000
+++ ./fs/exec.c Tue Nov 28 03:15:42 2000
@@ -604,6 +604,8 @@
/* Huh? We had already checked for MAY_EXEC, WTF do we check this? */
if (!(mode & 0111)) /* with at least _one_ execute bit set */
return -EACCES;
+ if (bprm->file->f_op == NULL)
+ return -EACCES;

bprm->e_uid = current->euid;
bprm->e_gid = current->egid;
--- ./fs/locks.c.~1~ Tue Nov 7 21:04:51 2000
+++ ./fs/locks.c Tue Nov 28 03:13:34 2000
@@ -511,7 +511,8 @@
struct file_lock *fl = *thisfl_p;
int (*lock)(struct file *, int, struct file_lock *);

- if ((lock = fl->fl_file->f_op->lock) != NULL) {
+ if (fl->fl_file->f_op &&
+ (lock = fl->fl_file->f_op->lock) != NULL) {
fl->fl_type = F_UNLCK;
lock(fl->fl_file, F_SETLK, fl);
}
@@ -1355,7 +1356,7 @@
if (!flock_to_posix_lock(filp, &file_lock, &flock))
goto out_putf;

- if (filp->f_op->lock) {
+ if (filp->f_op && filp->f_op->lock) {
error = filp->f_op->lock(filp, F_GETLK, &file_lock);
if (error < 0)
goto out_putf;
@@ -1479,7 +1480,7 @@
goto out_putf;
}

- if (filp->f_op->lock != NULL) {
+ if (filp->f_op && filp->f_op->lock != NULL) {
error = filp->f_op->lock(filp, cmd, file_lock);
if (error < 0)
goto out_putf;
@@ -1520,7 +1521,7 @@
if (!flock64_to_posix_lock(filp, &file_lock, &flock))
goto out_putf;

- if (filp->f_op->lock) {
+ if (filp->f_op && filp->f_op->lock) {
error = filp->f_op->lock(filp, F_GETLK, &file_lock);
if (error < 0)
goto out_putf;
@@ -1617,7 +1618,7 @@
goto out_putf;
}

- if (filp->f_op->lock != NULL) {
+ if (filp->f_op && filp->f_op->lock != NULL) {
error = filp->f_op->lock(filp, cmd, file_lock);
if (error < 0)
goto out_putf;