2010-10-25 11:49:38

by Filipe David Manana

Subject: question about fsync

Hi,

I've recently seen code that does the following:

- opens a file (write append mode) and writes data to it
- opens a 2nd file descriptor for that same file (write append mode as
well) and does an fsync call on this 2nd file descriptor
- closes the 2nd file descriptor
- continues writing to the 1st file descriptor
- etc

Is there a risk that some of the data might not get fsynced to disk? I
mean, is it possible that the 2nd file descriptor doesn't have all the
same metadata as the first one?

Thanks.

--
Filipe David Manana,
[email protected], [email protected]

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."


2010-10-25 12:59:17

by Theodore Ts'o

Subject: Re: question about fsync


On Oct 25, 2010, at 7:49 AM, Filipe David Manana wrote:

> - opens a file (write append mode) and writes data to it
> - opens a 2nd file descriptor for that same file (write append mode as
> well) and does an fsync call on this 2nd file descriptor
> - closes the 2nd file descriptor
> - continues writing to the 1st file descriptor
> - etc
>
> Is there a risk that some of the data might not get fsynced to disk? I
> mean, is it possible that the 2nd file descriptor doesn't have all the
> same metadata as the first one?

What do you mean by "risk"? For most Linux file systems, including
ext2, ext3, and ext4, all of the data will be sync'ed out to disk.
However, this is not guaranteed by POSIX or the Single Unix
Specification, so in theory there might exist some POSIX/SUS-compliant
implementation that would not sync out the data. I think you can count
on this behavior on most Unix- or Linux-compatible systems, but it is
not something you can rely on portably.

I have no idea what the Windows POSIX layer would do in this situation,
for example, or the Eunice emulation layer for VMS. (Back in the Perl 3
days, Larry Wall's configure script would test for Eunice and, if it
was not found, print the message "Congratulations. You're not running
Eunice." :-)

-- Ted
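
For reference, here is a minimal C sketch of the pattern being asked
about, next to a variant that sidesteps the portability question by
calling fsync() on the descriptor that did the writes. The function
names fsync_via_second_fd and append_and_sync are made up for
illustration; this is not the code Filipe saw, only one way the two
approaches might look.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: the pattern from the question. Open a second,
 * short-lived descriptor on the same path, fsync() it, and close it.
 * Per the reply above, this flushes the file's data on common Linux
 * file systems, but POSIX/SUS does not promise that it will.
 * Returns 0 on success, -1 on error. */
int fsync_via_second_fd(const char *path)
{
    int fd2 = open(path, O_WRONLY | O_APPEND);
    if (fd2 < 0)
        return -1;

    int rc = fsync(fd2);
    if (close(fd2) < 0)
        rc = -1;
    return rc;
}

/* Hypothetical helper: the conservative variant. Write the whole
 * buffer, retrying short writes and EINTR, then fsync() the very
 * descriptor that was written through.
 * Returns 0 on success, -1 on error. */
int append_and_sync(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return fsync(fd);
}
```

If portability matters, the second helper is the safer choice, since
it never depends on what fsync() does for data written through a
different descriptor.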


2010-10-25 22:10:18

by Filipe David Manana

Subject: Re: question about fsync

Theodore, thanks for your answer.

A similar question:

I have 2 threads. Both open the same file, but each using its own
(different) file descriptor.

If one thread writes data past the EOF (appends data), will the other
one be able to read it with a pread call (at an offset past the EOF it
found when it opened the file)?

I would just like to know if it's something safe to do on Linux and
all POSIX-compliant platforms (and eventually Windows). (I tested this
with non-C code and it worked.)

best regards,
Filipe David Manana

2010-10-26 22:26:17

by Theodore Ts'o

Subject: Re: question about fsync

On Mon, Oct 25, 2010 at 11:10:15PM +0100, Filipe David Manana wrote:
> Theodore, thanks for your answer.
>
> A similar question:
>
> I have 2 threads. Both open the same file, but each using its own
> (different) file descriptor.
>
> If one thread writes data past the EOF (appends data), will the other
> one be able to read it with a pread call (at an offset past the EOF it
> found when it opened the file)?
>
> I would just like to know if it's something safe to do on Linux and
> all POSIX-compliant platforms (and eventually Windows). (I tested this
> with non-C code and it worked.)

Race conditions aside, yes.

However, note that large writes are not necessarily guaranteed to be
atomic. Yes, the other thread will be able to read it with a pread
call, but if there is a race between the write() and pread(), it's
possible that pread will only read part of what has been written so
far by the write system call.

- Ted
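
As an illustration of the partial-read case described above, here is a
small C sketch of how a reader on a second descriptor might cope with
it. pread_full is a hypothetical helper, not a library function: it
keeps calling pread() until it has the requested number of bytes or
hits the end of file as currently visible, and leaves it to the caller
to decide whether a short result means "try again later".

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to len bytes starting at offset off,
 * retrying short reads and EINTR. Stops early when pread() returns 0,
 * which means end of file as visible right now; a concurrent writer
 * may simply not have finished its write yet.
 * Returns the number of bytes actually read, or -1 on error. */
ssize_t pread_full(int fd, void *buf, size_t len, off_t off)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = pread(fd, (char *)buf + done, len - done,
                          off + (off_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

A reader that expects a record of known length can compare the return
value with that length and retry after a short wait if it is smaller.
That only works if the reader has some way of knowing how long the
record should be, for example a fixed record size or a length prefix,
which is an assumption of this sketch and not something the thread
specifies.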