2013-04-15 15:08:54

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [PATCH 0/5] fuse: close file synchronously

On Thu, Dec 20, 2012 at 1:30 PM, Maxim Patlasov <[email protected]> wrote:
> Hi,
>
> There is a long-standing demand for syncronous behaviour of fuse_release:
>
> http://sourceforge.net/mailarchive/message.php?msg_id=19343889
> http://sourceforge.net/mailarchive/message.php?msg_id=29814693
>
> A few months ago Avati and me explained why such a feature would be useful:
>
> http://sourceforge.net/mailarchive/message.php?msg_id=29889055
> http://sourceforge.net/mailarchive/message.php?msg_id=29867423
>
> In short, the problem is that fuse_release (that's called on last user
> close(2)) sends FUSE_RELEASE to userspace and returns without waiting for
> ACK from userspace. Consequently, there is a gap when user regards the
> file released while userspace fuse is still working on it. An attempt to
> access the file from another node leads to complicated synchronization
> problems because the first node still "holds" the file.
>
> The patch-set resolves the problem by making fuse_release synchronous:
> wait for ACK from userspace for FUSE_RELEASE if the feature is ON.
>
> To keep single-threaded userspace implementations happy the patch-set
> ensures that by the time fuse_release_common calls fuse_file_put, no
> more in-flight I/O exists. Asynchronous fuse callbacks (like
> fuse_readpages_end) cannot trigger FUSE_RELEASE anymore. Hence, we'll
> never block in contexts other than close().

There are a few fput() calls outside sys_close(), all of these can
trigger FUSE_RELEASE. Most of those are OK, but for some I'm
reluctant to enable synchronous release.

For example doing a readlink() on a magic symlink under /proc
shouldn't result in a synchronous call to a fuse filesystem. Making
fput() synchronous may actually end up doing that (even if it's not
very likely).

At least for the unprivileged fuse daemon case it shouldn't be done.
If the fuse daemon can be "trusted" then enabling synchronous release
should be okay, that's why it's enabled for fuseblk.

But maybe I'm just too paranoid...

Thanks,
Miklos


2013-04-15 15:30:48

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [PATCH 0/5] fuse: close file synchronously

On Mon, Apr 15, 2013 at 5:08 PM, Miklos Szeredi <[email protected]> wrote:
> For example doing a readlink() on a magic symlink under /proc
> shouldn't result in a synchronous call to a fuse filesystem. Making
> fput() synchronous may actually end up doing that (even if it's not
> very likely).

Thinking about this a bit more. As it is it sounds wrong to rely on a
synchronous release, when in fact release is just not synchronous, as
indicated by the above example. Maybe it's the proc code that's buggy
and shouldn't do get_file/fput because everyone is assuming release
being synchronous with close(). Don't know.

Let's approach it from the other direction: what if you give back the
write lease on the first flush? It will probably work fine for 99% of
cases, since no other writes are going to happen after the first
flush. For the remaining cases you'll just have to reacquire the
lease when a write happens after the flush. I guess performance-wise
that will not be an issue, but I may be wrong.

Thanks,
Miklos

2013-04-15 18:17:50

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH 0/5] fuse: close file synchronously

On Mon, Apr 15, 2013 at 05:30:41PM +0200, Miklos Szeredi wrote:
> On Mon, Apr 15, 2013 at 5:08 PM, Miklos Szeredi <[email protected]> wrote:
> > For example doing a readlink() on a magic symlink under /proc
> > shouldn't result in a synchronous call to a fuse filesystem. Making
> > fput() synchronous may actually end up doing that (even if it's not
> > very likely).
>
> Thinking about this a bit more. As it is it sounds wrong to rely on a
> synchronous release, when in fact release is just not synchronous, as
> indicated by the above example. Maybe it's the proc code that's buggy
> and shouldn't do get_file/fput because everyone is assuming release
> being synchronous with close(). Don't know.

What the hell? ->release() is not and has never been synchronous with close().
There is any number of places where the final fput() might be called and no,
this readlink example is irrelevant - things like munmap()/dup2()/close
of a socket discarding a datagram with the last reference to struct file in
it, et sodding cetera.

Hell, another thread might be in the middle of read(2) at the moment when you
call close(). Result: the final fput() will be done when we are about to
return from that read(2).

People, ->release() is *NOT* guaranteed to be anywhere near close(2). Never
had been.

2013-04-16 09:09:06

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [PATCH 0/5] fuse: close file synchronously

On Mon, Apr 15, 2013 at 8:17 PM, Al Viro <[email protected]> wrote:
> On Mon, Apr 15, 2013 at 05:30:41PM +0200, Miklos Szeredi wrote:
>> On Mon, Apr 15, 2013 at 5:08 PM, Miklos Szeredi <[email protected]> wrote:
>> > For example doing a readlink() on a magic symlink under /proc
>> > shouldn't result in a synchronous call to a fuse filesystem. Making
>> > fput() synchronous may actually end up doing that (even if it's not
>> > very likely).
>>
>> Thinking about this a bit more. As it is it sounds wrong to rely on a
>> synchronous release, when in fact release is just not synchronous, as
>> indicated by the above example. Maybe it's the proc code that's buggy
>> and shouldn't do get_file/fput because everyone is assuming release
>> being synchronous with close(). Don't know.
>
> What the hell? ->release() is not and has never been synchronous with close().
> There is any number of places where the final fput() might be called and no,
> this readlink example is irrelevant - things like munmap()/dup2()/close
> of a socket discarding a datagram with the last reference to struct file in
> it, et sodding cetera.
>
> Hell, another thread might be in the middle of read(2) at the moment when you
> call close(). Result: the final fput() will be done when we are about to
> return from that read(2).

Apparently we do make some pains to make ->release() return before the
syscall that triggered it returns. Why is that then?

I think the difference between proc symlink and all the rest is that
everything the app does to the file descriptor is its own business.
If it just does single threaded open, read/write, close (which is what
the vast majority of apps do) then close *is* going to be synchronous
with release. At least most of the time.

Thanks,
Miklos

2013-04-16 18:22:23

by Maxim Patlasov

[permalink] [raw]
Subject: Re: [PATCH 0/5] fuse: close file synchronously

Hi Miklos,

On 4/15/13 7:08 PM, Miklos Szeredi wrote:
> On Thu, Dec 20, 2012 at 1:30 PM, Maxim Patlasov<[email protected]> wrote:
>> Hi,
>>
>> There is a long-standing demand for syncronous behaviour of fuse_release:
>>
>> http://sourceforge.net/mailarchive/message.php?msg_id=19343889
>> http://sourceforge.net/mailarchive/message.php?msg_id=29814693
>>
>> A few months ago Avati and me explained why such a feature would be useful:
>>
>> http://sourceforge.net/mailarchive/message.php?msg_id=29889055
>> http://sourceforge.net/mailarchive/message.php?msg_id=29867423
>>
>> In short, the problem is that fuse_release (that's called on last user
>> close(2)) sends FUSE_RELEASE to userspace and returns without waiting for
>> ACK from userspace. Consequently, there is a gap when user regards the
>> file released while userspace fuse is still working on it. An attempt to
>> access the file from another node leads to complicated synchronization
>> problems because the first node still "holds" the file.
>>
>> The patch-set resolves the problem by making fuse_release synchronous:
>> wait for ACK from userspace for FUSE_RELEASE if the feature is ON.
>>
>> To keep single-threaded userspace implementations happy the patch-set
>> ensures that by the time fuse_release_common calls fuse_file_put, no
>> more in-flight I/O exists. Asynchronous fuse callbacks (like
>> fuse_readpages_end) cannot trigger FUSE_RELEASE anymore. Hence, we'll
>> never block in contexts other than close().
> There are a few fput() calls outside sys_close(), all of these can
> trigger FUSE_RELEASE. Most of those are OK, but for some I'm
> reluctant to enable synchronous release.
>
> For example doing a readlink() on a magic symlink under /proc
> shouldn't result in a synchronous call to a fuse filesystem. Making
> fput() synchronous may actually end up doing that (even if it's not
> very likely).
>
> At least for the unprivileged fuse daemon case it shouldn't be done.
> If the fuse daemon can be "trusted" then enabling synchronous release
> should be okay, that's why it's enabled for fuseblk.
>
> But maybe I'm just too paranoid...

No, I don't think it's too paranoid. I suggest to put the feature under
fusermount control by adding "close_wait" mount option. This is very
simple and straightforward and let sysad to decide whether to allow the
feature for unprivileged users or not.

Btw, having read last messages on this thread, I realized that the name
of patchset is a bit misleading - it would be better to name it "process
last fput() synchronously". But the core idea still looks sensible to
me: userspace may hold a reference to a file in one way or another (e.g.
by mmap-ed region), but when all references are released the file should
be ready for reuse again (e.g. to be accessed from another node).

The patch-set was reviewed by Brian Foster and now you looked at it as
well. Is it time for me to rebase the patchset to be applied on top of
writeback-cache patches?

Thanks,
Maxim

2013-04-17 20:53:22

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [PATCH 0/5] fuse: close file synchronously

On Mon, Apr 15, 2013 at 5:30 PM, Miklos Szeredi <[email protected]> wrote:
> Let's approach it from the other direction: what if you give back the
> write lease on the first flush? It will probably work fine for 99% of
> cases, since no other writes are going to happen after the first
> flush. For the remaining cases you'll just have to reacquire the
> lease when a write happens after the flush. I guess performance-wise
> that will not be an issue, but I may be wrong.

What about this?

Thanks,
Miklos

2013-04-18 03:25:38

by Maxim Patlasov

[permalink] [raw]
Subject: Re: [PATCH 0/5] fuse: close file synchronously

On 4/17/13 1:53 PM, Miklos Szeredi wrote:
> On Mon, Apr 15, 2013 at 5:30 PM, Miklos Szeredi<[email protected]> wrote:
>> Let's approach it from the other direction: what if you give back the
>> write lease on the first flush? It will probably work fine for 99% of
>> cases, since no other writes are going to happen after the first
>> flush. For the remaining cases you'll just have to reacquire the
>> lease when a write happens after the flush. I guess performance-wise
>> that will not be an issue, but I may be wrong.
> What about this?

We'd like to do it, but we can't. Firstly because we rely on the fact
that the file cannot be modified by someone else while we hold exclusive
write lease. By the time we decide to reacquire the lease, the file can
be re-used by someone else and become completely different comparatively
with its state at the moment of first flush. Secondly, we can't sensibly
handle a case when the lease is already acquired by someone else by the
time of attempt to reacquire it.

Thanks,
Maxim