2006-03-31 17:45:37

by Miklos Szeredi

Subject: [PATCH 0/10] fuse: updates

Various FUSE updates.

Sorry for the late submission. I wanted to set up a public git
repository, but have not yet got an account on kernel.org for it.

Patches 1-9 are independent of other patches.

Patch 10 is dependent on the locking patches.

If it is not too late, they should go into 2.6.17. If the locking
changes are deemed to need to simmer in -mm longer, then patch 10
should be delayed with them.

Thanks,
Miklos


2006-03-31 17:47:31

by Miklos Szeredi

Subject: [PATCH 1/10] fuse: fix oops in fuse_send_readpages()

During heavy parallel filesystem activity it was possible to Oops the
kernel. The reason is that read_cache_pages() could skip pages which
have already been inserted into the cache by another task.
Occasionally this may result in zero pages actually being sent, while
fuse_send_readpages() relies on at least one page being in the
request.

So check this corner case and just free the request instead of trying
to send it.
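
For illustration only, here is a minimal userspace sketch of the guard
described above. It is not the kernel code: struct read_batch and the
batch_*() helpers are hypothetical stand-ins for the request and its
send/put functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a batched read request (not struct fuse_req). */
struct read_batch {
	size_t num_pages;	/* pages actually added to the batch */
	int sent;		/* set once the batch has been handed off */
	int released;		/* set once the batch has been dropped unsent */
};

/* Sending relies on at least one page being present. */
static void batch_send(struct read_batch *b)
{
	assert(b->num_pages > 0);
	b->sent = 1;
}

/* Drop a batch that was never sent. */
static void batch_release(struct read_batch *b)
{
	b->released = 1;
}

/* Send the batch only if the fill step left something in it. */
static void batch_finish(struct read_batch *b)
{
	if (b->num_pages)
		batch_send(b);
	else
		batch_release(b);
}
```

Calling batch_finish() on a batch with num_pages == 0 releases it
instead of tripping the precondition in batch_send().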

Reported and tested by Konstantin Isakov.

Signed-off-by: Miklos Szeredi <[email protected]>

Index: linux/fs/fuse/file.c
===================================================================
--- linux.orig/fs/fuse/file.c 2006-03-31 18:55:11.000000000 +0200
+++ linux/fs/fuse/file.c 2006-03-31 18:55:29.000000000 +0200
@@ -397,8 +397,12 @@ static int fuse_readpages(struct file *f
return -EINTR;

err = read_cache_pages(mapping, pages, fuse_readpages_fill, &data);
- if (!err)
- fuse_send_readpages(data.req, file, inode);
+ if (!err) {
+ if (data.req->num_pages)
+ fuse_send_readpages(data.req, file, inode);
+ else
+ fuse_put_request(fc, data.req);
+ }
return err;
}

2006-03-31 17:48:46

by Miklos Szeredi

Subject: [PATCH 2/10] fuse: fix fuse_dev_poll() return value

fuse_dev_poll() returned an error value instead of a poll mask.
Luckily (or unluckily) -ENODEV does contain the POLLERR bit.

There's also a race if the filesystem is unmounted between
fuse_get_conn() and spin_lock(), in which case this event will be
missed by poll().
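
The remark about -ENODEV containing the POLLERR bit can be checked from
userspace. The sketch below assumes Linux values (ENODEV is 19, POLLERR
is 0x008). The helper name errno_looks_like_pollerr() is made up for
this example.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>

/*
 * Reinterpret a negated errno value as an unsigned poll mask and
 * report whether the POLLERR bit happens to be set in it.
 */
static int errno_looks_like_pollerr(int err)
{
	unsigned mask;

	assert(err > 0);	/* expects a positive errno constant */
	mask = (unsigned)-err;
	return (mask & POLLERR) != 0;
}
```

With those Linux values, (unsigned)-ENODEV ends in 0xed, which has bit
0x008 set, so a caller treating the return value as a mask would see
POLLERR along with several unrelated bits.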

Signed-off-by: Miklos Szeredi <[email protected]>

Index: linux/fs/fuse/dev.c
===================================================================
--- linux.orig/fs/fuse/dev.c 2006-03-31 18:55:11.000000000 +0200
+++ linux/fs/fuse/dev.c 2006-03-31 18:55:30.000000000 +0200
@@ -804,17 +804,18 @@ static ssize_t fuse_dev_write(struct fil

static unsigned fuse_dev_poll(struct file *file, poll_table *wait)
{
- struct fuse_conn *fc = fuse_get_conn(file);
unsigned mask = POLLOUT | POLLWRNORM;
-
+ struct fuse_conn *fc = fuse_get_conn(file);
if (!fc)
- return -ENODEV;
+ return POLLERR;

poll_wait(file, &fc->waitq, wait);

spin_lock(&fuse_lock);
- if (!list_empty(&fc->pending))
- mask |= POLLIN | POLLRDNORM;
+ if (!fc->connected)
+ mask = POLLERR;
+ else if (!list_empty(&fc->pending))
+ mask |= POLLIN | POLLRDNORM;
spin_unlock(&fuse_lock);

return mask;

2006-03-31 17:50:38

by Miklos Szeredi

Subject: [PATCH 3/10] fuse: add O_ASYNC support to FUSE device

From: Jeff Dike <[email protected]>

This adds asynchronous notification to FUSE - a FUSE server can
request O_ASYNC on a /dev/fuse file descriptor and receive SIGIO
when there is input available.

One subtlety - fuse_dev_fasync, which is called when O_ASYNC is
requested, does no locking, unlike the other methods. I think it's
unnecessary, as the fuse_conn.fasync list is manipulated only by
fasync_helper and kill_fasync, which provide their own locking. It
would also be wrong to use the fuse_lock, as it's a spin lock and
fasync_helper can sleep. My one concern with this is the fuse_conn
going away underneath fuse_dev_fasync - sys_fcntl takes a reference
on the file struct, so this seems not to be a problem.
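
As a rough illustration of the userspace side, a server could request
SIGIO delivery with the standard fcntl() calls. This is a sketch, not
code from any FUSE library, and enable_sigio() is a hypothetical helper
name.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Ask the kernel to send SIGIO to this process when fd has input.
 * Returns 0 on success, -1 on error (errno is set by fcntl()).
 */
static int enable_sigio(int fd)
{
	int flags;

	assert(fd >= 0);

	/* Direct the signal to the calling process. */
	if (fcntl(fd, F_SETOWN, getpid()) == -1)
		return -1;

	flags = fcntl(fd, F_GETFL);
	if (flags == -1)
		return -1;

	/* Setting O_ASYNC is what invokes the driver's fasync method. */
	if (fcntl(fd, F_SETFL, flags | O_ASYNC) == -1)
		return -1;

	return 0;
}
```

A SIGIO handler should be installed before calling this, since the
default action for SIGIO terminates the process.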

Signed-off-by: Jeff Dike <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>

Index: linux/fs/fuse/dev.c
===================================================================
--- linux.orig/fs/fuse/dev.c 2006-03-31 18:55:30.000000000 +0200
+++ linux/fs/fuse/dev.c 2006-03-31 18:55:31.000000000 +0200
@@ -317,6 +317,7 @@ static void queue_request(struct fuse_co
list_add_tail(&req->list, &fc->pending);
req->state = FUSE_REQ_PENDING;
wake_up(&fc->waitq);
+ kill_fasync(&fc->fasync, SIGIO, POLL_IN);
}

/*
@@ -901,6 +902,7 @@ void fuse_abort_conn(struct fuse_conn *f
end_requests(fc, &fc->pending);
end_requests(fc, &fc->processing);
wake_up_all(&fc->waitq);
+ kill_fasync(&fc->fasync, SIGIO, POLL_IN);
}
spin_unlock(&fuse_lock);
}
@@ -917,12 +919,24 @@ static int fuse_dev_release(struct inode
end_requests(fc, &fc->processing);
}
spin_unlock(&fuse_lock);
- if (fc)
+ if (fc) {
+ fasync_helper(-1, file, 0, &fc->fasync);
kobject_put(&fc->kobj);
+ }

return 0;
}

+static int fuse_dev_fasync(int fd, struct file *file, int on)
+{
+ struct fuse_conn *fc = fuse_get_conn(file);
+ if (!fc)
+ return -ENODEV;
+
+ /* No locking - fasync_helper does its own locking */
+ return fasync_helper(fd, file, on, &fc->fasync);
+}
+
const struct file_operations fuse_dev_operations = {
.owner = THIS_MODULE,
.llseek = no_llseek,
@@ -932,6 +946,7 @@ const struct file_operations fuse_dev_op
.writev = fuse_dev_writev,
.poll = fuse_dev_poll,
.release = fuse_dev_release,
+ .fasync = fuse_dev_fasync,
};

static struct miscdevice fuse_miscdevice = {
Index: linux/fs/fuse/fuse_i.h
===================================================================
--- linux.orig/fs/fuse/fuse_i.h 2006-03-31 18:55:11.000000000 +0200
+++ linux/fs/fuse/fuse_i.h 2006-03-31 18:55:31.000000000 +0200
@@ -318,6 +318,9 @@ struct fuse_conn {

/** kobject */
struct kobject kobj;
+
+ /** O_ASYNC requests */
+ struct fasync_struct *fasync;
};

static inline struct fuse_conn *get_fuse_conn_super(struct super_block *sb)
Index: linux/fs/fuse/inode.c
===================================================================
--- linux.orig/fs/fuse/inode.c 2006-03-31 18:55:11.000000000 +0200
+++ linux/fs/fuse/inode.c 2006-03-31 18:55:31.000000000 +0200
@@ -216,6 +216,7 @@ static void fuse_put_super(struct super_
spin_unlock(&fuse_lock);
up_write(&fc->sbput_sem);
/* Flush all readers on this fs */
+ kill_fasync(&fc->fasync, SIGIO, POLL_IN);
wake_up_all(&fc->waitq);
kobject_del(&fc->kobj);
kobject_put(&fc->kobj);

2006-03-31 17:52:48

by Miklos Szeredi

Subject: [PATCH 4/10] fuse: add O_NONBLOCK support to FUSE device

From: Jeff Dike <[email protected]>

I don't like duplicating the connected and list_empty tests in
fuse_dev_readv, but this seemed cleaner than adding the f_flags test
to request_wait.
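
From the reader's side, the effect is the usual non-blocking contract:
read() fails with EAGAIN when nothing is queued. A generic sketch
follows. It works on any descriptor that honours O_NONBLOCK, and the
names set_nonblocking() and try_read() are invented for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

enum read_status { READ_OK, READ_WOULD_BLOCK, READ_FAILED };

/* Put fd into non-blocking mode; returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags == -1)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Attempt one read without blocking. On READ_OK, *n holds the byte
 * count; READ_WOULD_BLOCK means no data was queued (EAGAIN).
 */
static enum read_status try_read(int fd, void *buf, size_t len, ssize_t *n)
{
	assert(n);
	*n = read(fd, buf, len);
	if (*n >= 0)
		return READ_OK;
	if (errno == EAGAIN)
		return READ_WOULD_BLOCK;
	return READ_FAILED;
}
```

A server would typically pair this with poll() and call try_read() only
after POLLIN is reported, treating READ_WOULD_BLOCK as a harmless race.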

Signed-off-by: Jeff Dike <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>

Index: linux/fs/fuse/dev.c
===================================================================
--- linux.orig/fs/fuse/dev.c 2006-03-31 18:55:31.000000000 +0200
+++ linux/fs/fuse/dev.c 2006-03-31 18:55:31.000000000 +0200
@@ -619,6 +619,12 @@ static ssize_t fuse_dev_readv(struct fil
err = -EPERM;
if (!fc)
goto err_unlock;
+
+ err = -EAGAIN;
+ if((file->f_flags & O_NONBLOCK) && fc->connected &&
+ list_empty(&fc->pending))
+ goto err_unlock;
+
request_wait(fc);
err = -ENODEV;
if (!fc->connected)

2006-03-31 17:54:38

by Miklos Szeredi

Subject: [PATCH 5/10] fuse: simplify locking

This is in preparation for removing the global spinlock in favor of a
per-mount one.

The only critical part is the interaction between fuse_dev_release()
and fuse_fill_super(): fuse_dev_release() must see the assignment to
file->private_data, otherwise it will leak the reference to fuse_conn.

This is ensured by the fput() operation, which will synchronize the
assignment with other CPUs that may do a final fput() soon after
this.

Also, redundant locking is removed from fuse_fill_super(), where
exclusion is already ensured by the BKL held for this function by the
VFS.
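
The publish-then-drop-reference pattern can be modelled in userspace
with C11 atomics. This is only an analogy under stated assumptions:
struct model_file and the model_*() functions are hypothetical, and the
acq_rel decrement stands in for the barrier that the kernel's
atomic_dec_and_test() provides.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace model of a refcounted file object. */
struct model_file {
	atomic_int refcount;
	void *private_data;	/* written once, before the writer's put */
};

/*
 * Drop one reference. The acq_rel ordering makes writes done before
 * this call visible to whoever observes the count reaching zero.
 * Returns 1 if this was the final reference.
 */
static int model_fput(struct model_file *f)
{
	return atomic_fetch_sub_explicit(&f->refcount, 1,
					 memory_order_acq_rel) == 1;
}

/* "Mount" side: attach the data, then drop the temporary reference. */
static int model_attach(struct model_file *f, void *conn)
{
	f->private_data = conn;
	return model_fput(f);
}

/* "Release" side: runs after the final put, so no lock is needed. */
static void *model_release(struct model_file *f)
{
	assert(atomic_load(&f->refcount) == 0);
	return f->private_data;
}
```

The point of the model is that the plain store to private_data is
ordered before the reference drop, so the thread performing the final
put reads the stored pointer without taking any lock.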

Signed-off-by: Miklos Szeredi <[email protected]>

Index: linux/fs/fuse/dev.c
===================================================================
--- linux.orig/fs/fuse/dev.c 2006-03-31 18:55:31.000000000 +0200
+++ linux/fs/fuse/dev.c 2006-03-31 18:55:31.000000000 +0200
@@ -23,13 +23,11 @@ static kmem_cache_t *fuse_req_cachep;

static struct fuse_conn *fuse_get_conn(struct file *file)
{
- struct fuse_conn *fc;
- spin_lock(&fuse_lock);
- fc = file->private_data;
- if (fc && !fc->connected)
- fc = NULL;
- spin_unlock(&fuse_lock);
- return fc;
+ /*
+ * Lockless access is OK, because file->private_data is set
+ * once during mount and is valid until the file is released.
+ */
+ return file->private_data;
}

static void fuse_request_init(struct fuse_req *req)
@@ -607,19 +605,16 @@ static ssize_t fuse_dev_readv(struct fil
unsigned long nr_segs, loff_t *off)
{
int err;
- struct fuse_conn *fc;
struct fuse_req *req;
struct fuse_in *in;
struct fuse_copy_state cs;
unsigned reqsize;
+ struct fuse_conn *fc = fuse_get_conn(file);
+ if (!fc)
+ return -EPERM;

restart:
spin_lock(&fuse_lock);
- fc = file->private_data;
- err = -EPERM;
- if (!fc)
- goto err_unlock;
-
err = -EAGAIN;
if((file->f_flags & O_NONBLOCK) && fc->connected &&
list_empty(&fc->pending))
@@ -915,17 +910,13 @@ void fuse_abort_conn(struct fuse_conn *f

static int fuse_dev_release(struct inode *inode, struct file *file)
{
- struct fuse_conn *fc;
-
- spin_lock(&fuse_lock);
- fc = file->private_data;
+ struct fuse_conn *fc = fuse_get_conn(file);
if (fc) {
+ spin_lock(&fuse_lock);
fc->connected = 0;
end_requests(fc, &fc->pending);
end_requests(fc, &fc->processing);
- }
- spin_unlock(&fuse_lock);
- if (fc) {
+ spin_unlock(&fuse_lock);
fasync_helper(-1, file, 0, &fc->fasync);
kobject_put(&fc->kobj);
}
Index: linux/fs/fuse/inode.c
===================================================================
--- linux.orig/fs/fuse/inode.c 2006-03-31 18:55:31.000000000 +0200
+++ linux/fs/fuse/inode.c 2006-03-31 18:55:31.000000000 +0200
@@ -414,37 +414,6 @@ static struct fuse_conn *new_conn(void)
return fc;
}

-static struct fuse_conn *get_conn(struct file *file, struct super_block *sb)
-{
- struct fuse_conn *fc;
- int err;
-
- err = -EINVAL;
- if (file->f_op != &fuse_dev_operations)
- goto out_err;
-
- err = -ENOMEM;
- fc = new_conn();
- if (!fc)
- goto out_err;
-
- spin_lock(&fuse_lock);
- err = -EINVAL;
- if (file->private_data)
- goto out_unlock;
-
- kobject_get(&fc->kobj);
- file->private_data = fc;
- spin_unlock(&fuse_lock);
- return fc;
-
- out_unlock:
- spin_unlock(&fuse_lock);
- kobject_put(&fc->kobj);
- out_err:
- return ERR_PTR(err);
-}
-
static struct inode *get_root_inode(struct super_block *sb, unsigned mode)
{
struct fuse_attr attr;
@@ -526,12 +495,9 @@ static void fuse_send_init(struct fuse_c

static unsigned long long conn_id(void)
{
+ /* BKL is held for ->get_sb() */
static unsigned long long ctr = 1;
- unsigned long long val;
- spin_lock(&fuse_lock);
- val = ctr++;
- spin_unlock(&fuse_lock);
- return val;
+ return ctr++;
}

static int fuse_fill_super(struct super_block *sb, void *data, int silent)
@@ -556,10 +522,17 @@ static int fuse_fill_super(struct super_
if (!file)
return -EINVAL;

- fc = get_conn(file, sb);
- fput(file);
- if (IS_ERR(fc))
- return PTR_ERR(fc);
+ if (file->f_op != &fuse_dev_operations)
+ return -EINVAL;
+
+ /* Setting file->private_data can't race with other mount()
+ instances, since BKL is held for ->get_sb() */
+ if (file->private_data)
+ return -EINVAL;
+
+ fc = new_conn();
+ if (!fc)
+ return -ENOMEM;

fc->flags = d.flags;
fc->user_id = d.user_id;
@@ -589,10 +562,16 @@ static int fuse_fill_super(struct super_
goto err_put_root;

sb->s_root = root_dentry;
- spin_lock(&fuse_lock);
fc->mounted = 1;
fc->connected = 1;
- spin_unlock(&fuse_lock);
+ kobject_get(&fc->kobj);
+ file->private_data = fc;
+ /*
+ * atomic_dec_and_test() in fput() provides the necessary
+ * memory barrier for file->private_data to be visible on all
+ * CPUs after this
+ */
+ fput(file);

fuse_send_init(fc);

@@ -601,6 +580,7 @@ static int fuse_fill_super(struct super_
err_put_root:
dput(root_dentry);
err:
+ fput(file);
kobject_put(&fc->kobj);
return err;
}

2006-03-31 17:57:00

by Miklos Szeredi

Subject: [PATCH 6/10] fuse: use a per-mount spinlock

Remove the global spinlock in favor of a per-mount one.

This patch is basically find & replace. The difficult part has
already been done by the previous patch.
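
To show the shape of the change outside the kernel, here is a small
sketch of a lock embedded in each connection object rather than shared
globally. It uses a C11 atomic_flag as a toy spinlock. The struct
model_conn type and conn_*() helpers are assumptions made for the
example, not FUSE code.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical connection object carrying its own lock. */
struct model_conn {
	atomic_flag lock;	/* per-connection lock */
	unsigned pending;	/* example state protected by the lock */
};

static void conn_init(struct model_conn *c)
{
	assert(c);
	atomic_flag_clear(&c->lock);
	c->pending = 0;
}

static void conn_lock(struct model_conn *c)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&c->lock,
						 memory_order_acquire))
		;
}

static void conn_unlock(struct model_conn *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Queue one item under this connection's own lock. */
static void conn_queue(struct model_conn *c)
{
	conn_lock(c);
	c->pending++;
	conn_unlock(c);
}
```

Because each object has its own lock, holding one connection's lock
does not stall work on another connection, which a single global lock
would.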

Signed-off-by: Miklos Szeredi <[email protected]>

Index: linux/fs/fuse/dev.c
===================================================================
--- linux.orig/fs/fuse/dev.c 2006-03-31 18:55:31.000000000 +0200
+++ linux/fs/fuse/dev.c 2006-03-31 18:55:31.000000000 +0200
@@ -1,6 +1,6 @@
/*
FUSE: Filesystem in Userspace
- Copyright (C) 2001-2005 Miklos Szeredi <[email protected]>
+ Copyright (C) 2001-2006 Miklos Szeredi <[email protected]>

This program can be distributed under the terms of the GNU GPL.
See the file COPYING.
@@ -94,11 +94,11 @@ static struct fuse_req *do_get_request(s
{
struct fuse_req *req;

- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
BUG_ON(list_empty(&fc->unused_list));
req = list_entry(fc->unused_list.next, struct fuse_req, list);
list_del_init(&req->list);
- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);
fuse_request_init(req);
req->preallocated = 1;
req->in.h.uid = current->fsuid;
@@ -124,7 +124,7 @@ struct fuse_req *fuse_get_request(struct
return do_get_request(fc);
}

-/* Must be called with fuse_lock held */
+/* Must be called with fc->lock held */
static void fuse_putback_request(struct fuse_conn *fc, struct fuse_req *req)
{
if (req->preallocated) {
@@ -143,9 +143,9 @@ static void fuse_putback_request(struct
void fuse_put_request(struct fuse_conn *fc, struct fuse_req *req)
{
if (atomic_dec_and_test(&req->count)) {
- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
fuse_putback_request(fc, req);
- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);
}
}

@@ -155,15 +155,15 @@ static void fuse_put_request_locked(stru
fuse_putback_request(fc, req);
}

-void fuse_release_background(struct fuse_req *req)
+void fuse_release_background(struct fuse_conn *fc, struct fuse_req *req)
{
iput(req->inode);
iput(req->inode2);
if (req->file)
fput(req->file);
- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
list_del(&req->bg_entry);
- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);
}

/*
@@ -182,7 +182,7 @@ void fuse_release_background(struct fuse
* interrupted and put in the background, it will return with an error
* and hence never be reset and reused.
*
- * Called with fuse_lock, unlocks it
+ * Called with fc->lock, unlocks it
*/
static void request_end(struct fuse_conn *fc, struct fuse_req *req)
{
@@ -191,14 +191,14 @@ static void request_end(struct fuse_conn
if (!req->background) {
wake_up(&req->waitq);
fuse_put_request_locked(fc, req);
- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);
} else {
void (*end) (struct fuse_conn *, struct fuse_req *) = req->end;
req->end = NULL;
- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);
down_read(&fc->sbput_sem);
if (fc->mounted)
- fuse_release_background(req);
+ fuse_release_background(fc, req);
up_read(&fc->sbput_sem);
if (end)
end(fc, req);
@@ -248,16 +248,16 @@ static void background_request(struct fu
get_file(req->file);
}

-/* Called with fuse_lock held. Releases, and then reacquires it. */
+/* Called with fc->lock held. Releases, and then reacquires it. */
static void request_wait_answer(struct fuse_conn *fc, struct fuse_req *req)
{
sigset_t oldset;

- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);
block_sigs(&oldset);
wait_event_interruptible(req->waitq, req->state == FUSE_REQ_FINISHED);
restore_sigs(&oldset);
- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
if (req->state == FUSE_REQ_FINISHED && !req->interrupted)
return;

@@ -271,9 +271,9 @@ static void request_wait_answer(struct f
locked state, there mustn't be any filesystem
operation (e.g. page fault), since that could lead
to deadlock */
- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);
wait_event(req->waitq, !req->locked);
- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
}
if (req->state == FUSE_REQ_PENDING) {
list_del(&req->list);
@@ -324,7 +324,7 @@ static void queue_request(struct fuse_co
void request_send(struct fuse_conn *fc, struct fuse_req *req)
{
req->isreply = 1;
- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
if (!fc->connected)
req->out.h.error = -ENOTCONN;
else if (fc->conn_error)
@@ -337,15 +337,15 @@ void request_send(struct fuse_conn *fc,

request_wait_answer(fc, req);
}
- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);
}

static void request_send_nowait(struct fuse_conn *fc, struct fuse_req *req)
{
- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
if (fc->connected) {
queue_request(fc, req);
- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);
} else {
req->out.h.error = -ENOTCONN;
request_end(fc, req);
@@ -361,9 +361,9 @@ void request_send_noreply(struct fuse_co
void request_send_background(struct fuse_conn *fc, struct fuse_req *req)
{
req->isreply = 1;
- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
background_request(fc, req);
- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);
request_send_nowait(fc, req);
}

@@ -372,16 +372,16 @@ void request_send_background(struct fuse
* anything that could cause a page-fault. If the request was already
* interrupted bail out.
*/
-static int lock_request(struct fuse_req *req)
+static int lock_request(struct fuse_conn *fc, struct fuse_req *req)
{
int err = 0;
if (req) {
- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
if (req->interrupted)
err = -ENOENT;
else
req->locked = 1;
- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);
}
return err;
}
@@ -391,18 +391,19 @@ static int lock_request(struct fuse_req
* requester thread is currently waiting for it to be unlocked, so
* wake it up.
*/
-static void unlock_request(struct fuse_req *req)
+static void unlock_request(struct fuse_conn *fc, struct fuse_req *req)
{
if (req) {
- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
req->locked = 0;
if (req->interrupted)
wake_up(&req->waitq);
- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);
}
}

struct fuse_copy_state {
+ struct fuse_conn *fc;
int write;
struct fuse_req *req;
const struct iovec *iov;
@@ -415,11 +416,12 @@ struct fuse_copy_state {
unsigned len;
};

-static void fuse_copy_init(struct fuse_copy_state *cs, int write,
- struct fuse_req *req, const struct iovec *iov,
- unsigned long nr_segs)
+static void fuse_copy_init(struct fuse_copy_state *cs, struct fuse_conn *fc,
+ int write, struct fuse_req *req,
+ const struct iovec *iov, unsigned long nr_segs)
{
memset(cs, 0, sizeof(*cs));
+ cs->fc = fc;
cs->write = write;
cs->req = req;
cs->iov = iov;
@@ -449,7 +451,7 @@ static int fuse_copy_fill(struct fuse_co
unsigned long offset;
int err;

- unlock_request(cs->req);
+ unlock_request(cs->fc, cs->req);
fuse_copy_finish(cs);
if (!cs->seglen) {
BUG_ON(!cs->nr_segs);
@@ -472,7 +474,7 @@ static int fuse_copy_fill(struct fuse_co
cs->seglen -= cs->len;
cs->addr += cs->len;

- return lock_request(cs->req);
+ return lock_request(cs->fc, cs->req);
}

/* Do as much copy to/from userspace buffer as we can */
@@ -584,9 +586,9 @@ static void request_wait(struct fuse_con
if (signal_pending(current))
break;

- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);
schedule();
- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
}
set_current_state(TASK_RUNNING);
remove_wait_queue(&fc->waitq, &wait);
@@ -614,7 +616,7 @@ static ssize_t fuse_dev_readv(struct fil
return -EPERM;

restart:
- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
err = -EAGAIN;
if((file->f_flags & O_NONBLOCK) && fc->connected &&
list_empty(&fc->pending))
@@ -643,14 +645,14 @@ static ssize_t fuse_dev_readv(struct fil
request_end(fc, req);
goto restart;
}
- spin_unlock(&fuse_lock);
- fuse_copy_init(&cs, 1, req, iov, nr_segs);
+ spin_unlock(&fc->lock);
+ fuse_copy_init(&cs, fc, 1, req, iov, nr_segs);
err = fuse_copy_one(&cs, &in->h, sizeof(in->h));
if (!err)
err = fuse_copy_args(&cs, in->numargs, in->argpages,
(struct fuse_arg *) in->args, 0);
fuse_copy_finish(&cs);
- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
req->locked = 0;
if (!err && req->interrupted)
err = -ENOENT;
@@ -665,12 +667,12 @@ static ssize_t fuse_dev_readv(struct fil
else {
req->state = FUSE_REQ_SENT;
list_move_tail(&req->list, &fc->processing);
- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);
}
return reqsize;

err_unlock:
- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);
return err;
}

@@ -739,7 +741,7 @@ static ssize_t fuse_dev_writev(struct fi
if (!fc)
return -ENODEV;

- fuse_copy_init(&cs, 0, NULL, iov, nr_segs);
+ fuse_copy_init(&cs, fc, 0, NULL, iov, nr_segs);
if (nbytes < sizeof(struct fuse_out_header))
return -EINVAL;

@@ -751,7 +753,7 @@ static ssize_t fuse_dev_writev(struct fi
oh.len != nbytes)
goto err_finish;

- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
err = -ENOENT;
if (!fc->connected)
goto err_unlock;
@@ -762,9 +764,9 @@ static ssize_t fuse_dev_writev(struct fi
goto err_unlock;

if (req->interrupted) {
- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);
fuse_copy_finish(&cs);
- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
request_end(fc, req);
return -ENOENT;
}
@@ -772,12 +774,12 @@ static ssize_t fuse_dev_writev(struct fi
req->out.h = oh;
req->locked = 1;
cs.req = req;
- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);

err = copy_out_args(&cs, &req->out, nbytes);
fuse_copy_finish(&cs);

- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
req->locked = 0;
if (!err) {
if (req->interrupted)
@@ -789,7 +791,7 @@ static ssize_t fuse_dev_writev(struct fi
return err ? err : nbytes;

err_unlock:
- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);
err_finish:
fuse_copy_finish(&cs);
return err;
@@ -813,12 +815,12 @@ static unsigned fuse_dev_poll(struct fil

poll_wait(file, &fc->waitq, wait);

- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
if (!fc->connected)
mask = POLLERR;
else if (!list_empty(&fc->pending))
mask |= POLLIN | POLLRDNORM;
- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);

return mask;
}
@@ -826,7 +828,7 @@ static unsigned fuse_dev_poll(struct fil
/*
* Abort all requests on the given list (pending or processing)
*
- * This function releases and reacquires fuse_lock
+ * This function releases and reacquires fc->lock
*/
static void end_requests(struct fuse_conn *fc, struct list_head *head)
{
@@ -835,7 +837,7 @@ static void end_requests(struct fuse_con
req = list_entry(head->next, struct fuse_req, list);
req->out.h.error = -ECONNABORTED;
request_end(fc, req);
- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
}
}

@@ -866,10 +868,10 @@ static void end_io_requests(struct fuse_
req->end = NULL;
/* The end function will consume this reference */
__fuse_get_request(req);
- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);
wait_event(req->waitq, !req->locked);
end(fc, req);
- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
}
}
}
@@ -896,7 +898,7 @@ static void end_io_requests(struct fuse_
*/
void fuse_abort_conn(struct fuse_conn *fc)
{
- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
if (fc->connected) {
fc->connected = 0;
end_io_requests(fc);
@@ -905,18 +907,18 @@ void fuse_abort_conn(struct fuse_conn *f
wake_up_all(&fc->waitq);
kill_fasync(&fc->fasync, SIGIO, POLL_IN);
}
- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);
}

static int fuse_dev_release(struct inode *inode, struct file *file)
{
struct fuse_conn *fc = fuse_get_conn(file);
if (fc) {
- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
fc->connected = 0;
end_requests(fc, &fc->pending);
end_requests(fc, &fc->processing);
- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);
fasync_helper(-1, file, 0, &fc->fasync);
kobject_put(&fc->kobj);
}
Index: linux/fs/fuse/fuse_i.h
===================================================================
--- linux.orig/fs/fuse/fuse_i.h 2006-03-31 18:55:31.000000000 +0200
+++ linux/fs/fuse/fuse_i.h 2006-03-31 18:55:31.000000000 +0200
@@ -1,6 +1,6 @@
/*
FUSE: Filesystem in Userspace
- Copyright (C) 2001-2005 Miklos Szeredi <[email protected]>
+ Copyright (C) 2001-2006 Miklos Szeredi <[email protected]>

This program can be distributed under the terms of the GNU GPL.
See the file COPYING.
@@ -144,7 +144,7 @@ struct fuse_req {
/*
* The following bitfields are either set once before the
* request is queued or setting/clearing them is protected by
- * fuse_lock
+ * fuse_conn->lock
*/

/** True if the request has reply */
@@ -213,6 +213,9 @@ struct fuse_req {
* unmounted.
*/
struct fuse_conn {
+ /** Lock protecting accesses to members of this structure */
+ spinlock_t lock;
+
/** The user id for this mount */
uid_t user_id;

@@ -352,21 +355,6 @@ static inline u64 get_node_id(struct ino
extern const struct file_operations fuse_dev_operations;

/**
- * This is the single global spinlock which protects FUSE's structures
- *
- * The following data is protected by this lock:
- *
- * - the private_data field of the device file
- * - the s_fs_info field of the super block
- * - unused_list, pending, processing lists in fuse_conn
- * - background list in fuse_conn
- * - the unique request ID counter reqctr in fuse_conn
- * - the sb (super_block) field in fuse_conn
- * - the file (device file) field in fuse_conn
- */
-extern spinlock_t fuse_lock;
-
-/**
* Get a filled in inode
*/
struct inode *fuse_iget(struct super_block *sb, unsigned long nodeid,
@@ -490,7 +478,7 @@ void request_send_background(struct fuse
/**
* Release inodes and file associated with background request
*/
-void fuse_release_background(struct fuse_req *req);
+void fuse_release_background(struct fuse_conn *fc, struct fuse_req *req);

/* Abort all requests */
void fuse_abort_conn(struct fuse_conn *fc);
Index: linux/fs/fuse/inode.c
===================================================================
--- linux.orig/fs/fuse/inode.c 2006-03-31 18:55:31.000000000 +0200
+++ linux/fs/fuse/inode.c 2006-03-31 18:55:31.000000000 +0200
@@ -1,6 +1,6 @@
/*
FUSE: Filesystem in Userspace
- Copyright (C) 2001-2005 Miklos Szeredi <[email protected]>
+ Copyright (C) 2001-2006 Miklos Szeredi <[email protected]>

This program can be distributed under the terms of the GNU GPL.
See the file COPYING.
@@ -22,7 +22,6 @@ MODULE_AUTHOR("Miklos Szeredi <miklos@sz
MODULE_DESCRIPTION("Filesystem in Userspace");
MODULE_LICENSE("GPL");

-spinlock_t fuse_lock;
static kmem_cache_t *fuse_inode_cachep;
static struct subsystem connections_subsys;

@@ -207,13 +206,14 @@ static void fuse_put_super(struct super_

down_write(&fc->sbput_sem);
while (!list_empty(&fc->background))
- fuse_release_background(list_entry(fc->background.next,
+ fuse_release_background(fc,
+ list_entry(fc->background.next,
struct fuse_req, bg_entry));

- spin_lock(&fuse_lock);
+ spin_lock(&fc->lock);
fc->mounted = 0;
fc->connected = 0;
- spin_unlock(&fuse_lock);
+ spin_unlock(&fc->lock);
up_write(&fc->sbput_sem);
/* Flush all readers on this fs */
kill_fasync(&fc->fasync, SIGIO, POLL_IN);
@@ -388,6 +388,7 @@ static struct fuse_conn *new_conn(void)
fc = kzalloc(sizeof(*fc), GFP_KERNEL);
if (fc) {
int i;
+ spin_lock_init(&fc->lock);
init_waitqueue_head(&fc->waitq);
INIT_LIST_HEAD(&fc->pending);
INIT_LIST_HEAD(&fc->processing);
@@ -734,7 +735,6 @@ static int __init fuse_init(void)
printk("fuse init (API version %i.%i)\n",
FUSE_KERNEL_VERSION, FUSE_KERNEL_MINOR_VERSION);

- spin_lock_init(&fuse_lock);
res = fuse_fs_init();
if (res)
goto err;

2006-03-31 17:58:41

by Miklos Szeredi

Subject: [PATCH 7/10] fuse: consolidate device errors

Return consistent error values for the case when the opened device
file has no mount associated yet.

Signed-off-by: Miklos Szeredi <[email protected]>

Index: linux/fs/fuse/dev.c
===================================================================
--- linux.orig/fs/fuse/dev.c 2006-03-31 18:55:31.000000000 +0200
+++ linux/fs/fuse/dev.c 2006-03-31 18:55:32.000000000 +0200
@@ -739,7 +739,7 @@ static ssize_t fuse_dev_writev(struct fi
struct fuse_copy_state cs;
struct fuse_conn *fc = fuse_get_conn(file);
if (!fc)
- return -ENODEV;
+ return -EPERM;

fuse_copy_init(&cs, fc, 0, NULL, iov, nr_segs);
if (nbytes < sizeof(struct fuse_out_header))
@@ -930,7 +930,7 @@ static int fuse_dev_fasync(int fd, struc
{
struct fuse_conn *fc = fuse_get_conn(file);
if (!fc)
- return -ENODEV;
+ return -EPERM;

/* No locking - fasync_helper does its own locking */
return fasync_helper(fd, file, on, &fc->fasync);

2006-03-31 17:59:55

by Miklos Szeredi

Subject: [PATCH 8/10] fuse: clean up request accounting

FUSE allocated most requests from a fixed size pool filled at mount
time. However, in some cases (release/forget) non-pool requests were
used. File locking operations aren't well served by the request pool,
since they may block indefinitely, thus exhausting the pool.

This patch removes the request pool and always allocates requests on
demand.
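
A userspace approximation of the on-demand scheme: allocate per
request, track outstanding requests with an atomic counter, and report
allocation failure as an encoded error pointer. The model_*() names are
hypothetical, and the error-pointer helpers only imitate the kernel's
ERR_PTR(), PTR_ERR() and IS_ERR().

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Imitations of the kernel's error-pointer helpers. */
static void *model_err_ptr(long err) { return (void *)(intptr_t)err; }
static long model_ptr_err(const void *p) { return (long)(intptr_t)p; }
static int model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-4095;
}

struct model_conn { atomic_int num_waiting; };
struct model_req { atomic_int count; };

/* Allocate a request on demand; returns an error pointer on failure. */
static struct model_req *model_get_req(struct model_conn *c)
{
	struct model_req *req = calloc(1, sizeof(*req));

	if (!req)
		return model_err_ptr(-ENOMEM);
	atomic_init(&req->count, 1);
	atomic_fetch_add(&c->num_waiting, 1);
	return req;
}

/* Drop a reference; the last put updates the counter and frees. */
static void model_put_req(struct model_conn *c, struct model_req *req)
{
	assert(!model_is_err(req));
	if (atomic_fetch_sub(&req->count, 1) == 1) {
		atomic_fetch_sub(&c->num_waiting, 1);
		free(req);
	}
}
```

Callers test the result with model_is_err() rather than comparing
against NULL, mirroring the !req to IS_ERR(req) conversions in the
diff.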

Signed-off-by: Miklos Szeredi <[email protected]>

Index: linux/fs/fuse/dev.c
===================================================================
--- linux.orig/fs/fuse/dev.c 2006-03-31 18:55:32.000000000 +0200
+++ linux/fs/fuse/dev.c 2006-03-31 18:55:32.000000000 +0200
@@ -72,10 +72,8 @@ static void restore_sigs(sigset_t *oldse
*/
void fuse_reset_request(struct fuse_req *req)
{
- int preallocated = req->preallocated;
BUG_ON(atomic_read(&req->count) != 1);
fuse_request_init(req);
- req->preallocated = preallocated;
}

static void __fuse_get_request(struct fuse_req *req)
@@ -90,71 +88,28 @@ static void __fuse_put_request(struct fu
atomic_dec(&req->count);
}

-static struct fuse_req *do_get_request(struct fuse_conn *fc)
+struct fuse_req *fuse_get_req(struct fuse_conn *fc)
{
- struct fuse_req *req;
+ struct fuse_req *req = fuse_request_alloc();
+ if (!req)
+ return ERR_PTR(-ENOMEM);

- spin_lock(&fc->lock);
- BUG_ON(list_empty(&fc->unused_list));
- req = list_entry(fc->unused_list.next, struct fuse_req, list);
- list_del_init(&req->list);
- spin_unlock(&fc->lock);
+ atomic_inc(&fc->num_waiting);
fuse_request_init(req);
- req->preallocated = 1;
req->in.h.uid = current->fsuid;
req->in.h.gid = current->fsgid;
req->in.h.pid = current->pid;
return req;
}

-/* This can return NULL, but only in case it's interrupted by a SIGKILL */
-struct fuse_req *fuse_get_request(struct fuse_conn *fc)
-{
- int intr;
- sigset_t oldset;
-
- atomic_inc(&fc->num_waiting);
- block_sigs(&oldset);
- intr = down_interruptible(&fc->outstanding_sem);
- restore_sigs(&oldset);
- if (intr) {
- atomic_dec(&fc->num_waiting);
- return NULL;
- }
- return do_get_request(fc);
-}
-
-/* Must be called with fc->lock held */
-static void fuse_putback_request(struct fuse_conn *fc, struct fuse_req *req)
-{
- if (req->preallocated) {
- atomic_dec(&fc->num_waiting);
- list_add(&req->list, &fc->unused_list);
- } else
- fuse_request_free(req);
-
- /* If we are in debt decrease that first */
- if (fc->outstanding_debt)
- fc->outstanding_debt--;
- else
- up(&fc->outstanding_sem);
-}
-
void fuse_put_request(struct fuse_conn *fc, struct fuse_req *req)
{
if (atomic_dec_and_test(&req->count)) {
- spin_lock(&fc->lock);
- fuse_putback_request(fc, req);
- spin_unlock(&fc->lock);
+ atomic_dec(&fc->num_waiting);
+ fuse_request_free(req);
}
}

-static void fuse_put_request_locked(struct fuse_conn *fc, struct fuse_req *req)
-{
- if (atomic_dec_and_test(&req->count))
- fuse_putback_request(fc, req);
-}
-
void fuse_release_background(struct fuse_conn *fc, struct fuse_req *req)
{
iput(req->inode);
@@ -189,9 +144,9 @@ static void request_end(struct fuse_conn
list_del(&req->list);
req->state = FUSE_REQ_FINISHED;
if (!req->background) {
- wake_up(&req->waitq);
- fuse_put_request_locked(fc, req);
spin_unlock(&fc->lock);
+ wake_up(&req->waitq);
+ fuse_put_request(fc, req);
} else {
void (*end) (struct fuse_conn *, struct fuse_req *) = req->end;
req->end = NULL;
@@ -302,16 +257,6 @@ static void queue_request(struct fuse_co
req->in.h.unique = fc->reqctr;
req->in.h.len = sizeof(struct fuse_in_header) +
len_args(req->in.numargs, (struct fuse_arg *) req->in.args);
- if (!req->preallocated) {
- /* If request is not preallocated (either FORGET or
- RELEASE), then still decrease outstanding_sem, so
- user can't open infinite number of files while not
- processing the RELEASE requests. However for
- efficiency do it without blocking, so if down()
- would block, just increase the debt instead */
- if (down_trylock(&fc->outstanding_sem))
- fc->outstanding_debt++;
- }
list_add_tail(&req->list, &fc->pending);
req->state = FUSE_REQ_PENDING;
wake_up(&fc->waitq);
Index: linux/fs/fuse/dir.c
===================================================================
--- linux.orig/fs/fuse/dir.c 2006-03-31 18:55:10.000000000 +0200
+++ linux/fs/fuse/dir.c 2006-03-31 18:55:32.000000000 +0200
@@ -117,8 +117,8 @@ static int fuse_dentry_revalidate(struct
return 0;

fc = get_fuse_conn(inode);
- req = fuse_get_request(fc);
- if (!req)
+ req = fuse_get_req(fc);
+ if (IS_ERR(req))
return 0;

fuse_lookup_init(req, entry->d_parent->d_inode, entry, &outarg);
@@ -188,9 +188,9 @@ static struct dentry *fuse_lookup(struct
if (entry->d_name.len > FUSE_NAME_MAX)
return ERR_PTR(-ENAMETOOLONG);

- req = fuse_get_request(fc);
- if (!req)
- return ERR_PTR(-EINTR);
+ req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return ERR_PTR(PTR_ERR(req));

fuse_lookup_init(req, dir, entry, &outarg);
request_send(fc, req);
@@ -244,15 +244,14 @@ static int fuse_create_open(struct inode
struct file *file;
int flags = nd->intent.open.flags - 1;

- err = -ENOSYS;
if (fc->no_create)
- goto out;
+ return -ENOSYS;

- err = -EINTR;
- req = fuse_get_request(fc);
- if (!req)
- goto out;
+ req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

+ err = -ENOMEM;
ff = fuse_file_alloc();
if (!ff)
goto out_put_request;
@@ -314,7 +313,6 @@ static int fuse_create_open(struct inode
fuse_file_free(ff);
out_put_request:
fuse_put_request(fc, req);
- out:
return err;
}

@@ -375,9 +373,9 @@ static int fuse_mknod(struct inode *dir,
{
struct fuse_mknod_in inarg;
struct fuse_conn *fc = get_fuse_conn(dir);
- struct fuse_req *req = fuse_get_request(fc);
- if (!req)
- return -EINTR;
+ struct fuse_req *req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

memset(&inarg, 0, sizeof(inarg));
inarg.mode = mode;
@@ -407,9 +405,9 @@ static int fuse_mkdir(struct inode *dir,
{
struct fuse_mkdir_in inarg;
struct fuse_conn *fc = get_fuse_conn(dir);
- struct fuse_req *req = fuse_get_request(fc);
- if (!req)
- return -EINTR;
+ struct fuse_req *req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

memset(&inarg, 0, sizeof(inarg));
inarg.mode = mode;
@@ -427,9 +425,9 @@ static int fuse_symlink(struct inode *di
{
struct fuse_conn *fc = get_fuse_conn(dir);
unsigned len = strlen(link) + 1;
- struct fuse_req *req = fuse_get_request(fc);
- if (!req)
- return -EINTR;
+ struct fuse_req *req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

req->in.h.opcode = FUSE_SYMLINK;
req->in.numargs = 2;
@@ -444,9 +442,9 @@ static int fuse_unlink(struct inode *dir
{
int err;
struct fuse_conn *fc = get_fuse_conn(dir);
- struct fuse_req *req = fuse_get_request(fc);
- if (!req)
- return -EINTR;
+ struct fuse_req *req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

req->in.h.opcode = FUSE_UNLINK;
req->in.h.nodeid = get_node_id(dir);
@@ -476,9 +474,9 @@ static int fuse_rmdir(struct inode *dir,
{
int err;
struct fuse_conn *fc = get_fuse_conn(dir);
- struct fuse_req *req = fuse_get_request(fc);
- if (!req)
- return -EINTR;
+ struct fuse_req *req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

req->in.h.opcode = FUSE_RMDIR;
req->in.h.nodeid = get_node_id(dir);
@@ -504,9 +502,9 @@ static int fuse_rename(struct inode *old
int err;
struct fuse_rename_in inarg;
struct fuse_conn *fc = get_fuse_conn(olddir);
- struct fuse_req *req = fuse_get_request(fc);
- if (!req)
- return -EINTR;
+ struct fuse_req *req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

memset(&inarg, 0, sizeof(inarg));
inarg.newdir = get_node_id(newdir);
@@ -553,9 +551,9 @@ static int fuse_link(struct dentry *entr
struct fuse_link_in inarg;
struct inode *inode = entry->d_inode;
struct fuse_conn *fc = get_fuse_conn(inode);
- struct fuse_req *req = fuse_get_request(fc);
- if (!req)
- return -EINTR;
+ struct fuse_req *req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

memset(&inarg, 0, sizeof(inarg));
inarg.oldnodeid = get_node_id(inode);
@@ -583,9 +581,9 @@ int fuse_do_getattr(struct inode *inode)
int err;
struct fuse_attr_out arg;
struct fuse_conn *fc = get_fuse_conn(inode);
- struct fuse_req *req = fuse_get_request(fc);
- if (!req)
- return -EINTR;
+ struct fuse_req *req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

req->in.h.opcode = FUSE_GETATTR;
req->in.h.nodeid = get_node_id(inode);
@@ -673,9 +671,9 @@ static int fuse_access(struct inode *ino
if (fc->no_access)
return 0;

- req = fuse_get_request(fc);
- if (!req)
- return -EINTR;
+ req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

memset(&inarg, 0, sizeof(inarg));
inarg.mask = mask;
@@ -780,9 +778,9 @@ static int fuse_readdir(struct file *fil
if (is_bad_inode(inode))
return -EIO;

- req = fuse_get_request(fc);
- if (!req)
- return -EINTR;
+ req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

page = alloc_page(GFP_KERNEL);
if (!page) {
@@ -809,11 +807,11 @@ static char *read_link(struct dentry *de
{
struct inode *inode = dentry->d_inode;
struct fuse_conn *fc = get_fuse_conn(inode);
- struct fuse_req *req = fuse_get_request(fc);
+ struct fuse_req *req = fuse_get_req(fc);
char *link;

- if (!req)
- return ERR_PTR(-EINTR);
+ if (IS_ERR(req))
+ return ERR_PTR(PTR_ERR(req));

link = (char *) __get_free_page(GFP_KERNEL);
if (!link) {
@@ -933,9 +931,9 @@ static int fuse_setattr(struct dentry *e
}
}

- req = fuse_get_request(fc);
- if (!req)
- return -EINTR;
+ req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

memset(&inarg, 0, sizeof(inarg));
iattr_to_fattr(attr, &inarg);
@@ -995,9 +993,9 @@ static int fuse_setxattr(struct dentry *
if (fc->no_setxattr)
return -EOPNOTSUPP;

- req = fuse_get_request(fc);
- if (!req)
- return -EINTR;
+ req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

memset(&inarg, 0, sizeof(inarg));
inarg.size = size;
@@ -1035,9 +1033,9 @@ static ssize_t fuse_getxattr(struct dent
if (fc->no_getxattr)
return -EOPNOTSUPP;

- req = fuse_get_request(fc);
- if (!req)
- return -EINTR;
+ req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

memset(&inarg, 0, sizeof(inarg));
inarg.size = size;
@@ -1085,9 +1083,9 @@ static ssize_t fuse_listxattr(struct den
if (fc->no_listxattr)
return -EOPNOTSUPP;

- req = fuse_get_request(fc);
- if (!req)
- return -EINTR;
+ req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

memset(&inarg, 0, sizeof(inarg));
inarg.size = size;
@@ -1131,9 +1129,9 @@ static int fuse_removexattr(struct dentr
if (fc->no_removexattr)
return -EOPNOTSUPP;

- req = fuse_get_request(fc);
- if (!req)
- return -EINTR;
+ req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

req->in.h.opcode = FUSE_REMOVEXATTR;
req->in.h.nodeid = get_node_id(inode);
Index: linux/fs/fuse/file.c
===================================================================
--- linux.orig/fs/fuse/file.c 2006-03-31 18:55:29.000000000 +0200
+++ linux/fs/fuse/file.c 2006-03-31 18:55:32.000000000 +0200
@@ -22,9 +22,9 @@ static int fuse_send_open(struct inode *
struct fuse_req *req;
int err;

- req = fuse_get_request(fc);
- if (!req)
- return -EINTR;
+ req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

memset(&inarg, 0, sizeof(inarg));
inarg.flags = file->f_flags & ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
@@ -184,9 +184,9 @@ static int fuse_flush(struct file *file)
if (fc->no_flush)
return 0;

- req = fuse_get_request(fc);
- if (!req)
- return -EINTR;
+ req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

memset(&inarg, 0, sizeof(inarg));
inarg.fh = ff->fh;
@@ -223,9 +223,9 @@ int fuse_fsync_common(struct file *file,
if ((!isdir && fc->no_fsync) || (isdir && fc->no_fsyncdir))
return 0;

- req = fuse_get_request(fc);
- if (!req)
- return -EINTR;
+ req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

memset(&inarg, 0, sizeof(inarg));
inarg.fh = ff->fh;
@@ -297,9 +297,9 @@ static int fuse_readpage(struct file *fi
if (is_bad_inode(inode))
goto out;

- err = -EINTR;
- req = fuse_get_request(fc);
- if (!req)
+ req = fuse_get_req(fc);
+ err = PTR_ERR(req);
+ if (IS_ERR(req))
goto out;

req->out.page_zeroing = 1;
@@ -368,10 +368,10 @@ static int fuse_readpages_fill(void *_da
(req->num_pages + 1) * PAGE_CACHE_SIZE > fc->max_read ||
req->pages[req->num_pages - 1]->index + 1 != page->index)) {
fuse_send_readpages(req, data->file, inode);
- data->req = req = fuse_get_request(fc);
- if (!req) {
+ data->req = req = fuse_get_req(fc);
+ if (IS_ERR(req)) {
unlock_page(page);
- return -EINTR;
+ return PTR_ERR(req);
}
}
req->pages[req->num_pages] = page;
@@ -392,9 +392,9 @@ static int fuse_readpages(struct file *f

data.file = file;
data.inode = inode;
- data.req = fuse_get_request(fc);
- if (!data.req)
- return -EINTR;
+ data.req = fuse_get_req(fc);
+ if (IS_ERR(data.req))
+ return PTR_ERR(data.req);

err = read_cache_pages(mapping, pages, fuse_readpages_fill, &data);
if (!err) {
@@ -455,9 +455,9 @@ static int fuse_commit_write(struct file
if (is_bad_inode(inode))
return -EIO;

- req = fuse_get_request(fc);
- if (!req)
- return -EINTR;
+ req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

req->num_pages = 1;
req->pages[0] = page;
@@ -532,9 +532,9 @@ static ssize_t fuse_direct_io(struct fil
if (is_bad_inode(inode))
return -EIO;

- req = fuse_get_request(fc);
- if (!req)
- return -EINTR;
+ req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

while (count) {
size_t nres;
Index: linux/fs/fuse/fuse_i.h
===================================================================
--- linux.orig/fs/fuse/fuse_i.h 2006-03-31 18:55:31.000000000 +0200
+++ linux/fs/fuse/fuse_i.h 2006-03-31 18:55:32.000000000 +0200
@@ -18,9 +18,6 @@
/** Max number of pages that can be used in a single read request */
#define FUSE_MAX_PAGES_PER_REQ 32

-/** If more requests are outstanding, then the operation will block */
-#define FUSE_MAX_OUTSTANDING 10
-
/** It could be as large as PATH_MAX, but would that have any uses? */
#define FUSE_NAME_MAX 1024

@@ -131,8 +128,8 @@ struct fuse_conn;
* A request to the client
*/
struct fuse_req {
- /** This can be on either unused_list, pending processing or
- io lists in fuse_conn */
+ /** This can be on either pending processing or io lists in
+ fuse_conn */
struct list_head list;

/** Entry on the background list */
@@ -150,9 +147,6 @@ struct fuse_req {
/** True if the request has reply */
unsigned isreply:1;

- /** The request is preallocated */
- unsigned preallocated:1;
-
/** The request was interrupted */
unsigned interrupted:1;

@@ -247,19 +241,9 @@ struct fuse_conn {
interrupted request) */
struct list_head background;

- /** Controls the maximum number of outstanding requests */
- struct semaphore outstanding_sem;
-
- /** This counts the number of outstanding requests if
- outstanding_sem would go negative */
- unsigned outstanding_debt;
-
/** RW semaphore for exclusion with fuse_put_super() */
struct rw_semaphore sbput_sem;

- /** The list of unused requests */
- struct list_head unused_list;
-
/** The next unique request id */
u64 reqctr;

@@ -452,11 +436,11 @@ void fuse_reset_request(struct fuse_req
/**
* Reserve a preallocated request
*/
-struct fuse_req *fuse_get_request(struct fuse_conn *fc);
+struct fuse_req *fuse_get_req(struct fuse_conn *fc);

/**
- * Decrement reference count of a request. If count goes to zero put
- * on unused list (preallocated) or free request (not preallocated).
+ * Decrement reference count of a request. If count goes to zero free
+ * the request.
*/
void fuse_put_request(struct fuse_conn *fc, struct fuse_req *req);

Index: linux/fs/fuse/inode.c
===================================================================
--- linux.orig/fs/fuse/inode.c 2006-03-31 18:55:31.000000000 +0200
+++ linux/fs/fuse/inode.c 2006-03-31 18:55:32.000000000 +0200
@@ -243,9 +243,9 @@ static int fuse_statfs(struct super_bloc
struct fuse_statfs_out outarg;
int err;

- req = fuse_get_request(fc);
- if (!req)
- return -EINTR;
+ req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);

memset(&outarg, 0, sizeof(outarg));
req->in.numargs = 0;
@@ -370,15 +370,7 @@ static int fuse_show_options(struct seq_

static void fuse_conn_release(struct kobject *kobj)
{
- struct fuse_conn *fc = get_fuse_conn_kobj(kobj);
-
- while (!list_empty(&fc->unused_list)) {
- struct fuse_req *req;
- req = list_entry(fc->unused_list.next, struct fuse_req, list);
- list_del(&req->list);
- fuse_request_free(req);
- }
- kfree(fc);
+ kfree(get_fuse_conn_kobj(kobj));
}

static struct fuse_conn *new_conn(void)
@@ -387,27 +379,16 @@ static struct fuse_conn *new_conn(void)

fc = kzalloc(sizeof(*fc), GFP_KERNEL);
if (fc) {
- int i;
spin_lock_init(&fc->lock);
init_waitqueue_head(&fc->waitq);
INIT_LIST_HEAD(&fc->pending);
INIT_LIST_HEAD(&fc->processing);
INIT_LIST_HEAD(&fc->io);
- INIT_LIST_HEAD(&fc->unused_list);
INIT_LIST_HEAD(&fc->background);
- sema_init(&fc->outstanding_sem, 1); /* One for INIT */
init_rwsem(&fc->sbput_sem);
kobj_set_kset_s(fc, connections_subsys);
kobject_init(&fc->kobj);
atomic_set(&fc->num_waiting, 0);
- for (i = 0; i < FUSE_MAX_OUTSTANDING; i++) {
- struct fuse_req *req = fuse_request_alloc();
- if (!req) {
- kobject_put(&fc->kobj);
- return NULL;
- }
- list_add(&req->list, &fc->unused_list);
- }
fc->bdi.ra_pages = (VM_MAX_READAHEAD * 1024) / PAGE_CACHE_SIZE;
fc->bdi.unplug_io_fn = default_unplug_io_fn;
fc->reqctr = 0;
@@ -438,7 +419,6 @@ static struct super_operations fuse_supe

static void process_init_reply(struct fuse_conn *fc, struct fuse_req *req)
{
- int i;
struct fuse_init_out *arg = &req->misc.init_out;

if (req->out.h.error || arg->major != FUSE_KERNEL_VERSION)
@@ -457,22 +437,11 @@ static void process_init_reply(struct fu
fc->minor = arg->minor;
fc->max_write = arg->minor < 5 ? 4096 : arg->max_write;
}
-
- /* After INIT reply is received other requests can go
- out. So do (FUSE_MAX_OUTSTANDING - 1) number of
- up()s on outstanding_sem. The last up() is done in
- fuse_putback_request() */
- for (i = 1; i < FUSE_MAX_OUTSTANDING; i++)
- up(&fc->outstanding_sem);
-
fuse_put_request(fc, req);
}

-static void fuse_send_init(struct fuse_conn *fc)
+static void fuse_send_init(struct fuse_conn *fc, struct fuse_req *req)
{
- /* This is called from fuse_read_super() so there's guaranteed
- to be exactly one request available */
- struct fuse_req *req = fuse_get_request(fc);
struct fuse_init_in *arg = &req->misc.init_in;

arg->major = FUSE_KERNEL_VERSION;
@@ -508,6 +477,7 @@ static int fuse_fill_super(struct super_
struct fuse_mount_data d;
struct file *file;
struct dentry *root_dentry;
+ struct fuse_req *init_req;
int err;

if (!parse_fuse_opt((char *) data, &d))
@@ -554,13 +524,17 @@ static int fuse_fill_super(struct super_
goto err;
}

+ init_req = fuse_request_alloc();
+ if (!init_req)
+ goto err_put_root;
+
err = kobject_set_name(&fc->kobj, "%llu", conn_id());
if (err)
- goto err_put_root;
+ goto err_free_req;

err = kobject_add(&fc->kobj);
if (err)
- goto err_put_root;
+ goto err_free_req;

sb->s_root = root_dentry;
fc->mounted = 1;
@@ -574,10 +548,12 @@ static int fuse_fill_super(struct super_
*/
fput(file);

- fuse_send_init(fc);
+ fuse_send_init(fc, init_req);

return 0;

+ err_free_req:
+ fuse_request_free(init_req);
err_put_root:
dput(root_dentry);
err:
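
The call sites converted above all follow one pattern: fuse_get_req()
returns either a valid request or an errno encoded in the pointer, and
the caller tests it with IS_ERR() and unpacks it with PTR_ERR(). As a
rough userspace illustration only, the sketch below imitates that
convention. Every toy_* name is hypothetical, and the 4095 threshold is
an assumption chosen for the sketch, not something this patch defines.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed size of the reserved "error" range at the top of the address space. */
#define TOY_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *toy_err_ptr(long error)
{
	assert(error < 0 && error >= -TOY_MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* Decode the errno value stored in an error pointer. */
static inline long toy_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer falls in the reserved error range. */
static inline int toy_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-TOY_MAX_ERRNO;
}

struct toy_req {
	int opcode;
};

/*
 * Hypothetical allocator: fail_with is 0 for success, or a positive
 * errno value to simulate a failure such as an interrupted wait.
 */
static struct toy_req *toy_get_req(int fail_with)
{
	struct toy_req *req;

	if (fail_with)
		return toy_err_ptr(-fail_with);

	req = calloc(1, sizeof(*req));
	if (!req)
		return toy_err_ptr(-ENOMEM);
	return req;
}

/* Caller in the style of the converted functions: propagate the errno. */
static int toy_do_operation(int fail_with)
{
	struct toy_req *req = toy_get_req(fail_with);

	if (toy_is_err(req))
		return (int)toy_ptr_err(req);

	req->opcode = 1;
	free(req);
	return 0;
}
```

The point of the convention is that the caller no longer has to guess
the reason for failure: the old code returned NULL and every caller
hard-coded -EINTR, whereas an encoded pointer can carry -ENOMEM or
-EINTR as appropriate.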

2006-03-31 18:03:25

by Miklos Szeredi

Subject: [PATCH 9/10] fuse: account background requests

The previous patch removed the limit on the number of outstanding
requests. This patch adds a much simpler limit, which is also
compatible with file locking operations.

A task may have at most one synchronous request allocated. So these
requests need not be otherwise limited.

However, the number of background requests (release, forget,
asynchronous reads, interrupted requests) can grow indefinitely. A
malicious user can exploit this to make FUSE allocate arbitrary
amounts of unswappable kernel memory, causing denial of service.

For this reason, add a limit on the number of background requests, and
block allocation of new requests until the number drops below the
limit.

Also use this mechanism to block all requests until the INIT reply is
received.
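
As an informal illustration of the accounting described above, here is
a single-threaded userspace model. It is a sketch, not the kernel code:
the toy_* names are hypothetical, the sleep on blocked_waitq is reduced
to a toy_can_allocate() check, and the fc->lock spinlock is left out.
Only the counter and flag transitions are meant to mirror the patch.

```c
#include <assert.h>

/* Same value as FUSE_MAX_BACKGROUND in this patch. */
#define TOY_MAX_BACKGROUND 10

struct toy_conn {
	unsigned num_background;	/* requests currently in the background */
	int blocked;			/* nonzero: new allocations must wait */
};

/* A new connection starts blocked until the INIT reply arrives. */
static void toy_conn_init(struct toy_conn *c)
{
	c->num_background = 0;
	c->blocked = 1;
}

/* INIT reply received: let waiters proceed. */
static void toy_init_reply(struct toy_conn *c)
{
	c->blocked = 0;
}

/* Stand-in for the wait in fuse_get_req(): may a request be allocated now? */
static int toy_can_allocate(const struct toy_conn *c)
{
	return !c->blocked;
}

/* A request moves to the background; block once the limit is reached. */
static void toy_background_start(struct toy_conn *c)
{
	c->num_background++;
	if (c->num_background == TOY_MAX_BACKGROUND)
		c->blocked = 1;
}

/* A background request finishes; unblock when leaving the limit. */
static void toy_background_end(struct toy_conn *c)
{
	assert(c->num_background > 0);
	if (c->num_background == TOY_MAX_BACKGROUND)
		c->blocked = 0;
	c->num_background--;
}
```

With this model, allocation is refused before toy_init_reply(), allowed
while fewer than ten requests are in the background, refused again when
the tenth is added, and allowed once one of them completes.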

Signed-off-by: Miklos Szeredi <[email protected]>

Index: linux/fs/fuse/dev.c
===================================================================
--- linux.orig/fs/fuse/dev.c 2006-03-31 18:55:32.000000000 +0200
+++ linux/fs/fuse/dev.c 2006-03-31 18:55:32.000000000 +0200
@@ -90,7 +90,17 @@ static void __fuse_put_request(struct fu

struct fuse_req *fuse_get_req(struct fuse_conn *fc)
{
- struct fuse_req *req = fuse_request_alloc();
+ struct fuse_req *req;
+ sigset_t oldset;
+ int err;
+
+ block_sigs(&oldset);
+ err = wait_event_interruptible(fc->blocked_waitq, !fc->blocked);
+ restore_sigs(&oldset);
+ if (err)
+ return ERR_PTR(-EINTR);
+
+ req = fuse_request_alloc();
if (!req)
return ERR_PTR(-ENOMEM);

@@ -118,6 +128,11 @@ void fuse_release_background(struct fuse
fput(req->file);
spin_lock(&fc->lock);
list_del(&req->bg_entry);
+ if (fc->num_background == FUSE_MAX_BACKGROUND) {
+ fc->blocked = 0;
+ wake_up_all(&fc->blocked_waitq);
+ }
+ fc->num_background--;
spin_unlock(&fc->lock);
}

@@ -195,6 +210,9 @@ static void background_request(struct fu
{
req->background = 1;
list_add(&req->bg_entry, &fc->background);
+ fc->num_background++;
+ if (fc->num_background == FUSE_MAX_BACKGROUND)
+ fc->blocked = 1;
if (req->inode)
req->inode = igrab(req->inode);
if (req->inode2)
@@ -288,6 +306,7 @@ void request_send(struct fuse_conn *fc,
static void request_send_nowait(struct fuse_conn *fc, struct fuse_req *req)
{
spin_lock(&fc->lock);
+ background_request(fc, req);
if (fc->connected) {
queue_request(fc, req);
spin_unlock(&fc->lock);
@@ -306,9 +325,6 @@ void request_send_noreply(struct fuse_co
void request_send_background(struct fuse_conn *fc, struct fuse_req *req)
{
req->isreply = 1;
- spin_lock(&fc->lock);
- background_request(fc, req);
- spin_unlock(&fc->lock);
request_send_nowait(fc, req);
}

Index: linux/fs/fuse/fuse_i.h
===================================================================
--- linux.orig/fs/fuse/fuse_i.h 2006-03-31 18:55:32.000000000 +0200
+++ linux/fs/fuse/fuse_i.h 2006-03-31 18:55:32.000000000 +0200
@@ -18,6 +18,9 @@
/** Max number of pages that can be used in a single read request */
#define FUSE_MAX_PAGES_PER_REQ 32

+/** Maximum number of outstanding background requests */
+#define FUSE_MAX_BACKGROUND 10
+
/** It could be as large as PATH_MAX, but would that have any uses? */
#define FUSE_NAME_MAX 1024

@@ -241,6 +244,17 @@ struct fuse_conn {
interrupted request) */
struct list_head background;

+ /** Number of requests currently in the background */
+ unsigned num_background;
+
+ /** Flag indicating if connection is blocked. This will be
+ the case before the INIT reply is received, and if there
+ are too many outstanding background requests */
+ int blocked;
+
+ /** waitq for blocked connection */
+ wait_queue_head_t blocked_waitq;
+
/** RW semaphore for exclusion with fuse_put_super() */
struct rw_semaphore sbput_sem;

Index: linux/fs/fuse/inode.c
===================================================================
--- linux.orig/fs/fuse/inode.c 2006-03-31 18:55:32.000000000 +0200
+++ linux/fs/fuse/inode.c 2006-03-31 18:55:32.000000000 +0200
@@ -381,6 +381,7 @@ static struct fuse_conn *new_conn(void)
if (fc) {
spin_lock_init(&fc->lock);
init_waitqueue_head(&fc->waitq);
+ init_waitqueue_head(&fc->blocked_waitq);
INIT_LIST_HEAD(&fc->pending);
INIT_LIST_HEAD(&fc->processing);
INIT_LIST_HEAD(&fc->io);
@@ -392,6 +393,7 @@ static struct fuse_conn *new_conn(void)
fc->bdi.ra_pages = (VM_MAX_READAHEAD * 1024) / PAGE_CACHE_SIZE;
fc->bdi.unplug_io_fn = default_unplug_io_fn;
fc->reqctr = 0;
+ fc->blocked = 1;
}
return fc;
}
@@ -438,6 +440,8 @@ static void process_init_reply(struct fu
fc->max_write = arg->minor < 5 ? 4096 : arg->max_write;
}
fuse_put_request(fc, req);
+ fc->blocked = 0;
+ wake_up_all(&fc->blocked_waitq);
}

static void fuse_send_init(struct fuse_conn *fc, struct fuse_req *req)

2006-03-31 18:04:59

by Miklos Szeredi

Subject: [PATCH 10/10] fuse: add POSIX file locking support

This patch adds POSIX file locking support to the FUSE interface.

This implementation doesn't keep any locking state in kernel, except a
dummy lock which is inserted whenever a locking operation is performed
on an inode and removed on inode cleanup. This is needed because
locks_remove_posix() assumes that if inode->i_flock list is empty
there are no locks applied and bypasses calling the ->lock() method.

This is quite efficient, since file locking is rarely used: ->lock()
will only be called on process exit if there's a chance that there
might be locks owned by that process.
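
To make the reasoning about the dummy lock concrete, here is a toy
userspace model. It is only a sketch under simplifying assumptions: the
toy_* names are hypothetical, the inode->i_flock list is reduced to a
counter, and the ->lock() call is reduced to bumping another counter.

```c
#include <assert.h>

struct toy_inode {
	int nr_local_locks;	/* stands in for the inode->i_flock list */
	int unlock_calls;	/* times the filesystem's ->lock() saw an unlock */
};

/* Stand-in for the filesystem's ->lock() method handling an unlock. */
static void toy_fs_unlock(struct toy_inode *inode)
{
	inode->unlock_calls++;
}

/*
 * Stand-in for the shortcut described above: if no lock is recorded
 * locally, assume there is nothing to release and skip the filesystem.
 */
static void toy_locks_remove_posix(struct toy_inode *inode)
{
	assert(inode);
	if (inode->nr_local_locks == 0)
		return;
	toy_fs_unlock(inode);
}

/*
 * Stand-in for a lock request that is forwarded to userspace. With
 * add_dummy set, a placeholder is recorded locally first, so the
 * later cleanup does not take the shortcut.
 */
static void toy_setlk(struct toy_inode *inode, int add_dummy)
{
	if (add_dummy && inode->nr_local_locks == 0)
		inode->nr_local_locks = 1;
	/* the real lock state lives in the userspace filesystem */
}
```

Without the placeholder the cleanup never reaches the filesystem and
the userspace lock would leak. With it, the unlock is forwarded once.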

Mandatory locking is not supported. The filesystem may enforce
mandatory locking in userspace if needed.
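
For reference, this is roughly what the application side looks like. It
is a plain POSIX fcntl() sketch, not part of the patch; the helper
names and the /tmp scratch file are assumptions made for the example.
On a FUSE mount whose server advertises FUSE_POSIX_LOCKS, F_GETLK would
be forwarded as FUSE_GETLK, F_SETLK as FUSE_SETLK, and F_SETLKW (which
sets FL_SLEEP in the kernel) as FUSE_SETLKW.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: create an already-unlinked scratch file to lock. */
static int open_scratch_file(void)
{
	char path[] = "/tmp/fuse-lock-demo-XXXXXX";
	int fd = mkstemp(path);

	if (fd >= 0)
		unlink(path);
	return fd;
}

/*
 * Apply a lock of the given type (F_RDLCK, F_WRLCK) to the whole file,
 * or drop it with F_UNLCK. A nonzero wait uses the blocking F_SETLKW.
 */
static int lock_whole_file(int fd, short type, int wait)
{
	struct flock fl;

	assert(fd >= 0);
	memset(&fl, 0, sizeof(fl));
	fl.l_type = type;
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;		/* 0 means "up to end of file" */
	return fcntl(fd, wait ? F_SETLKW : F_SETLK, &fl);
}

/*
 * Ask whether a write lock on the whole file would conflict. Returns
 * the type of a conflicting lock, F_UNLCK if none, or -1 on error.
 */
static int probe_write_lock(int fd)
{
	struct flock fl;

	memset(&fl, 0, sizeof(fl));
	fl.l_type = F_WRLCK;
	fl.l_whence = SEEK_SET;
	if (fcntl(fd, F_GETLK, &fl) == -1)
		return -1;
	return fl.l_type;
}
```

A process's own locks never conflict with its own probes, so calling
probe_write_lock() after lock_whole_file(fd, F_WRLCK, 0) in the same
process reports F_UNLCK. Observing a conflict takes a second process.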

Signed-off-by: Miklos Szeredi <[email protected]>

Index: linux/fs/fuse/file.c
===================================================================
--- linux.orig/fs/fuse/file.c 2006-03-31 18:55:43.000000000 +0200
+++ linux/fs/fuse/file.c 2006-03-31 19:03:04.000000000 +0200
@@ -1,6 +1,6 @@
/*
FUSE: Filesystem in Userspace
- Copyright (C) 2001-2005 Miklos Szeredi <[email protected]>
+ Copyright (C) 2001-2006 Miklos Szeredi <[email protected]>

This program can be distributed under the terms of the GNU GPL.
See the file COPYING.
@@ -615,6 +615,158 @@ static int fuse_set_page_dirty(struct pa
return 0;
}

+/*
+ * Need to add a dummy posix lock, so VFS doesn't optimize away the
+ * unlock from locks_remove_posix()
+ */
+static int fuse_add_dummy_lock(struct inode *inode)
+{
+ struct file_lock lock;
+ struct fuse_conn *fc = get_fuse_conn(inode);
+
+ locks_init_lock(&lock);
+ lock.fl_type = F_WRLCK;
+ lock.fl_flags = FL_POSIX;
+ lock.fl_start = 0;
+ lock.fl_end = OFFSET_MAX;
+ lock.fl_owner = (fl_owner_t) fc;
+ return __posix_lock_file(inode, &lock, NULL);
+}
+
+void fuse_remove_dummy_lock(struct inode *inode)
+{
+ struct file_lock lock;
+ struct fuse_conn *fc = get_fuse_conn(inode);
+
+ locks_init_lock(&lock);
+ lock.fl_type = F_UNLCK;
+ lock.fl_flags = FL_POSIX;
+ lock.fl_start = 0;
+ lock.fl_end = OFFSET_MAX;
+ lock.fl_owner = (fl_owner_t) fc;
+ __posix_lock_file(inode, &lock, NULL);
+}
+
+static int convert_fuse_file_lock(const struct fuse_file_lock *ffl,
+ struct file_lock *fl)
+{
+ switch (ffl->type) {
+ case F_UNLCK:
+ break;
+
+ case F_RDLCK:
+ case F_WRLCK:
+ if (ffl->start > OFFSET_MAX || ffl->end > OFFSET_MAX || ffl->end < ffl->start)
+ return -EIO;
+
+ fl->fl_start = ffl->start;
+ fl->fl_end = ffl->end;
+ fl->fl_pid = ffl->pid;
+ break;
+
+ default:
+ return -EIO;
+ }
+ fl->fl_type = ffl->type;
+ return 0;
+}
+
+static void fuse_lk_fill(struct fuse_req *req, struct file *file,
+ const struct file_lock *fl, int opcode, pid_t pid)
+{
+ struct inode *inode = file->f_dentry->d_inode;
+ struct fuse_file *ff = file->private_data;
+ struct fuse_lk_in *arg = &req->misc.lk_in;
+
+ arg->fh = ff->fh;
+ arg->owner = (unsigned long) fl->fl_owner;
+ arg->lk.start = fl->fl_start;
+ arg->lk.end = fl->fl_end;
+ arg->lk.type = fl->fl_type;
+ arg->lk.pid = pid;
+ req->in.h.opcode = opcode;
+ req->in.h.nodeid = get_node_id(inode);
+ req->inode = inode;
+ req->file = file;
+ req->in.numargs = 1;
+ req->in.args[0].size = sizeof(*arg);
+ req->in.args[0].value = arg;
+}
+
+static int fuse_getlk(struct file *file, struct file_lock *fl)
+{
+ struct inode *inode = file->f_dentry->d_inode;
+ struct fuse_conn *fc = get_fuse_conn(inode);
+ struct fuse_req *req;
+ struct fuse_lk_out outarg;
+ int err;
+
+ req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);
+
+ fuse_lk_fill(req, file, fl, FUSE_GETLK, 0);
+ req->out.numargs = 1;
+ req->out.args[0].size = sizeof(outarg);
+ req->out.args[0].value = &outarg;
+ request_send(fc, req);
+ err = req->out.h.error;
+ fuse_put_request(fc, req);
+ if (!err)
+ err = convert_fuse_file_lock(&outarg.lk, fl);
+
+ return err;
+}
+
+static int fuse_setlk(struct file *file, struct file_lock *fl)
+{
+ struct inode *inode = file->f_dentry->d_inode;
+ struct fuse_conn *fc = get_fuse_conn(inode);
+ struct fuse_req *req;
+ int opcode = (fl->fl_flags & FL_SLEEP) ? FUSE_SETLKW : FUSE_SETLK;
+ int err;
+ pid_t pid = 0;
+
+ if (fl->fl_type != F_UNLCK) {
+ pid = current->tgid;
+ err = fuse_add_dummy_lock(inode);
+ if (err)
+ return err;
+ }
+
+ req = fuse_get_req(fc);
+ if (IS_ERR(req))
+ return PTR_ERR(req);
+
+ fuse_lk_fill(req, file, fl, opcode, pid);
+ request_send(fc, req);
+ err = req->out.h.error;
+ fuse_put_request(fc, req);
+ return err;
+}
+
+static int fuse_file_lock(struct file *file, int cmd, struct file_lock *fl)
+{
+ struct inode *inode = file->f_dentry->d_inode;
+ struct fuse_conn *fc = get_fuse_conn(inode);
+ int err;
+
+ if (cmd == F_GETLK) {
+ if (fc->no_lk) {
+ if (!posix_test_lock(file, fl, fl))
+ fl->fl_type = F_UNLCK;
+ err = 0;
+ } else
+ err = fuse_getlk(file, fl);
+ } else {
+ if (fc->no_lk)
+ err = posix_lock_file_wait(file, fl);
+ else
+ err = fuse_setlk(file, fl);
+ }
+ return err;
+}
+
static const struct file_operations fuse_file_operations = {
.llseek = generic_file_llseek,
.read = generic_file_read,
@@ -624,6 +776,7 @@ static const struct file_operations fuse
.flush = fuse_flush,
.release = fuse_release,
.fsync = fuse_fsync,
+ .lock = fuse_file_lock,
.sendfile = generic_file_sendfile,
};

@@ -635,6 +788,7 @@ static const struct file_operations fuse
.flush = fuse_flush,
.release = fuse_release,
.fsync = fuse_fsync,
+ .lock = fuse_file_lock,
/* no mmap and sendfile */
};

Index: linux/fs/fuse/fuse_i.h
===================================================================
--- linux.orig/fs/fuse/fuse_i.h 2006-03-31 18:55:43.000000000 +0200
+++ linux/fs/fuse/fuse_i.h 2006-03-31 18:55:43.000000000 +0200
@@ -178,6 +178,7 @@ struct fuse_req {
struct fuse_init_in init_in;
struct fuse_init_out init_out;
struct fuse_read_in read_in;
+ struct fuse_lk_in lk_in;
} misc;

/** page vector */
@@ -302,6 +303,9 @@ struct fuse_conn {
/** Is removexattr not implemented by fs? */
unsigned no_removexattr : 1;

+ /** Are file locking primitives not implemented by fs? */
+ unsigned no_lk : 1;
+
/** Is access not implemented by fs? */
unsigned no_access : 1;

@@ -490,3 +494,8 @@ int fuse_do_getattr(struct inode *inode)
* Invalidate inode attributes
*/
void fuse_invalidate_attr(struct inode *inode);
+
+/**
+ * Remove dummy lock from inode
+ */
+void fuse_remove_dummy_lock(struct inode *inode);
Index: linux/include/linux/fuse.h
===================================================================
--- linux.orig/include/linux/fuse.h 2006-03-31 18:55:39.000000000 +0200
+++ linux/include/linux/fuse.h 2006-03-31 18:55:43.000000000 +0200
@@ -1,6 +1,6 @@
/*
FUSE: Filesystem in Userspace
- Copyright (C) 2001-2005 Miklos Szeredi <[email protected]>
+ Copyright (C) 2001-2006 Miklos Szeredi <[email protected]>

This program can be distributed under the terms of the GNU GPL.
See the file COPYING.
@@ -14,7 +14,7 @@
#define FUSE_KERNEL_VERSION 7

/** Minor version number of this interface */
-#define FUSE_KERNEL_MINOR_VERSION 6
+#define FUSE_KERNEL_MINOR_VERSION 7

/** The node ID of the root inode */
#define FUSE_ROOT_ID 1
@@ -58,6 +58,13 @@ struct fuse_kstatfs {
__u32 spare[6];
};

+struct fuse_file_lock {
+ __u64 start;
+ __u64 end;
+ __u32 type;
+ __u32 pid; /* tgid */
+};
+
/**
* Bitmasks for fuse_setattr_in.valid
*/
@@ -82,6 +89,7 @@ struct fuse_kstatfs {
* INIT request/reply flags
*/
#define FUSE_ASYNC_READ (1 << 0)
+#define FUSE_POSIX_LOCKS (1 << 1)

enum fuse_opcode {
FUSE_LOOKUP = 1,
@@ -112,8 +120,14 @@ enum fuse_opcode {
FUSE_READDIR = 28,
FUSE_RELEASEDIR = 29,
FUSE_FSYNCDIR = 30,
+ FUSE_GETLK = 31,
+ FUSE_SETLK = 32,
+ FUSE_SETLKW = 33,
FUSE_ACCESS = 34,
- FUSE_CREATE = 35
+ FUSE_CREATE = 35,
+
+ /* Keep at the end: */
+ FUSE_MAXOP
};

/* The read buffer is required to be at least 8k, but may be much larger */
@@ -247,6 +261,16 @@ struct fuse_getxattr_out {
__u32 padding;
};

+struct fuse_lk_in {
+ __u64 fh;
+ __u64 owner;
+ struct fuse_file_lock lk;
+};
+
+struct fuse_lk_out {
+ struct fuse_file_lock lk;
+};
+
struct fuse_access_in {
__u32 mask;
__u32 padding;
Index: linux/fs/fuse/inode.c
===================================================================
--- linux.orig/fs/fuse/inode.c 2006-03-31 18:55:43.000000000 +0200
+++ linux/fs/fuse/inode.c 2006-03-31 18:55:43.000000000 +0200
@@ -71,6 +71,7 @@ static struct inode *fuse_alloc_inode(st
static void fuse_destroy_inode(struct inode *inode)
{
struct fuse_inode *fi = get_fuse_inode(inode);
+ BUG_ON(inode->i_flock);
if (fi->forget_req)
fuse_request_free(fi->forget_req);
kmem_cache_free(fuse_inode_cachep, inode);
@@ -96,6 +97,8 @@ void fuse_send_forget(struct fuse_conn *

static void fuse_clear_inode(struct inode *inode)
{
+ if (inode->i_flock)
+ fuse_remove_dummy_lock(inode);
if (inode->i_sb->s_flags & MS_ACTIVE) {
struct fuse_conn *fc = get_fuse_conn(inode);
struct fuse_inode *fi = get_fuse_inode(inode);
@@ -104,6 +107,14 @@ static void fuse_clear_inode(struct inod
}
}

+static int fuse_remount_fs(struct super_block *sb, int *flags, char *data)
+{
+ if (*flags & MS_MANDLOCK)
+ return -EINVAL;
+
+ return 0;
+}
+
void fuse_change_attributes(struct inode *inode, struct fuse_attr *attr)
{
if (S_ISREG(inode->i_mode) && i_size_read(inode) != attr->size)
@@ -413,6 +424,7 @@ static struct super_operations fuse_supe
.destroy_inode = fuse_destroy_inode,
.read_inode = fuse_read_inode,
.clear_inode = fuse_clear_inode,
+ .remount_fs = fuse_remount_fs,
.put_super = fuse_put_super,
.umount_begin = fuse_umount_begin,
.statfs = fuse_statfs,
@@ -432,6 +444,8 @@ static void process_init_reply(struct fu
ra_pages = arg->max_readahead / PAGE_CACHE_SIZE;
if (arg->flags & FUSE_ASYNC_READ)
fc->async_read = 1;
+ if (!(arg->flags & FUSE_POSIX_LOCKS))
+ fc->no_lk = 1;
} else
ra_pages = fc->max_read / PAGE_CACHE_SIZE;

@@ -451,7 +465,7 @@ static void fuse_send_init(struct fuse_c
arg->major = FUSE_KERNEL_VERSION;
arg->minor = FUSE_KERNEL_MINOR_VERSION;
arg->max_readahead = fc->bdi.ra_pages * PAGE_CACHE_SIZE;
- arg->flags |= FUSE_ASYNC_READ;
+ arg->flags |= FUSE_ASYNC_READ | FUSE_POSIX_LOCKS;
req->in.h.opcode = FUSE_INIT;
req->in.numargs = 1;
req->in.args[0].size = sizeof(*arg);
@@ -484,6 +498,9 @@ static int fuse_fill_super(struct super_
struct fuse_req *init_req;
int err;

+ if (sb->s_flags & MS_MANDLOCK)
+ return -EINVAL;
+
if (!parse_fuse_opt((char *) data, &d))
return -EINVAL;