2006-05-01 06:21:10

by Marcelo Tosatti

Subject: Re: O_DIRECT, ext3fs, kernel 2.4.32... again

Hi Raul,

On Thu, Apr 27, 2006 at 08:32:49AM +0200, DervishD wrote:
> Hi all :)
>
> I don't know if the patch to backport O_DIRECT support for ext3
> under kernel 2.4.3x was finally accepted or not, but I'm having what
> I consider inconsistent behaviour due to O_DIRECT under ext3fs and
> kernel 2.4.32.
>
> I can understand that ext3 doesn't support O_DIRECT, and that's
> not a problem for me. In fact, if an app really needs O_DIRECT and
> the underlying filesystem doesn't support it, the app should fail, no
> more and no less.

On v2.4, nope, it doesn't.

> The problem I'm having is with dvd+rw-tools. Apart from all the
> problems regarding DVD writing, I have another problem: the open64
> call with the O_DIRECT flag succeeds, but any subsequent read
> operation fails. IMHO, if the filesystem is going to return EINVAL
> for any read/write operation over an O_DIRECT'ed filehandle, it
> should return an error when opening, too.
>
> The growisofs program tries to open a file using O_DIRECT and the
> call succeeds, so it tries to read from that filehandle and the
> result is always EINVAL.
>
> I've tried a test program, just in case the problem was memory
> alignment of the buffer, but nothing is solved (I used posix_memalign
> and some recipe I found in this list, using the st_blksize and the
> st_size of the file). The problem seems to be in the O_DIRECT flag,
> because removing it from the open call makes all work.
>
> Shouldn't ext3fs return an error when the O_DIRECT flag is used
> in the open call? Is the open call userspace only and thus only libc
> can return such error? Am I misunderstanding the entire issue and
> this is a perfectly legal behaviour (allowing the open, failing in
> the read operation)?

Your interpretation is correct. It would be nicer for open() to fail on
filesystems that don't support O_DIRECT, but unfortunately v2.4 only makes
that check later, at read/write time ;(

And it's too late to change that, IMO...
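
For reference, the behaviour described above can be checked with a small
probe along these lines. This is only a sketch: probe_o_direct is a
hypothetical helper name, and the 4096-byte alignment is an assumption
that should cover common block sizes, not something taken from this thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes O_DIRECT in <fcntl.h> on glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread).
 * Returns  1 if a direct read on `path` succeeds,
 *          0 if O_DIRECT is rejected with EINVAL, either by open()
 *            or by the first read(),
 *         -1 on any other error (errno is left set by the failing call).
 */
int probe_o_direct(const char *path)
{
    enum { ALIGN = 4096 };      /* assumed alignment and transfer size */
    void *buf = NULL;
    ssize_t n;
    int fd, result;

    fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0)
        return errno == EINVAL ? 0 : -1;

    /* Direct I/O needs an aligned buffer; posix_memalign returns 0 on success. */
    if (posix_memalign(&buf, ALIGN, ALIGN) != 0) {
        close(fd);
        return -1;
    }

    n = read(fd, buf, ALIGN);
    if (n >= 0)
        result = 1;             /* the read went through with O_DIRECT set */
    else if (errno == EINVAL)
        result = 0;             /* open() succeeded, read() was rejected */
    else
        result = -1;

    free(buf);
    close(fd);
    return result;
}
```

On the setup described in this thread one would expect the helper to
return 0 for a file on ext3, since open() succeeds and the read() is the
call that gets EINVAL.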


2006-05-01 11:23:06

by DervishD

Subject: Re: O_DIRECT, ext3fs, kernel 2.4.32... again

Hi Marcelo :)

* Marcelo Tosatti dixit:
> > Shouldn't ext3fs return an error when the O_DIRECT flag is
> > used in the open call? Is the open call userspace only and thus
> > only libc can return such error? Am I misunderstanding the entire
> > issue and this is a perfectly legal behaviour (allowing the open,
> > failing in the read operation)?
>
> Your interpretation is correct. It would be nicer for open() to
> fail on fs'es which don't support O_DIRECT, but v2.4 makes such
> check later at read/write unfortunately ;(

Oops :(

> And its too late for changing that IMO...

Probably. Anyway, since a userspace app shouldn't have to care about
which underlying filesystem a file lives on, ext3 should either:

- fail in the open() call: OK, it's too late for that.
- not check in read()/write(): I'm not sure about this.

The problem I see is that I can't tell if (given that probably
the bug cannot be fixed right now) it's better to let the userspace
app believe that O_DIRECT is honored but silently ignore it, or let
the userspace believe that O_DIRECT was honored in the open() call
and make all subsequent calls to read()/write() fail.

Myself, I would prefer to be deceived and have successful calls
even if the O_DIRECT flag was ignored instead of having successful
calls to open(O_DIRECT) but failures on subsequent read()'s, but I
must confess that I don't know what kind of scenarios need the use of
O_DIRECT and I don't know if having O_DIRECT accepted but ignored is a
good thing :(

I'm not familiar with the ext3 code, so I don't know if it's easy
to modify it so it will reject an open if O_DIRECT is specified :(((

Thanks for your answer, Marcelo :)

Raúl Núñez de Arenas Coronado

--
Linux Registered User 88736 | http://www.dervishd.net
http://www.pleyades.net & http://www.gotesdelluna.net
It's my PC and I'll cry if I want to... RAmen!

2006-05-01 21:28:24

by Nathan Scott

Subject: Re: O_DIRECT, ext3fs, kernel 2.4.32... again

On Mon, May 01, 2006 at 01:23:03PM +0200, DervishD wrote:
> Hi Marcelo :)
>
> * Marcelo Tosatti dixit:
> > > Shouldn't ext3fs return an error when the O_DIRECT flag is
> > > used in the open call? Is the open call userspace only and thus
> > > only libc can return such error? Am I misunderstanding the entire
> > > issue and this is a perfectly legal behaviour (allowing the open,
> > > failing in the read operation)?
> >
> > Your interpretation is correct. It would be nicer for open() to
> > fail on fs'es which don't support O_DIRECT, but v2.4 makes such
> > check later at read/write unfortunately ;(
>
> Oops :(

Nothing else really makes sense due to fcntl...
fcntl(fd, F_SETFL, O_DIRECT);
...can happen at any time, to enable/disable direct I/O.
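
The one-liner above is shorthand. A slightly fuller sketch of toggling the
flag on an already open descriptor might look like the following, which
reads the current status flags first so that other flags are preserved.
The helper name set_direct_io is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes O_DIRECT in <fcntl.h> on glibc */
#endif
#include <fcntl.h>

/*
 * Hypothetical helper (not from the thread): turn O_DIRECT on
 * (enable != 0) or off (enable == 0) for an open descriptor.
 * Returns 0 on success, -1 on error with errno set by fcntl().
 */
int set_direct_io(int fd, int enable)
{
    /* Fetch the current file status flags so the others are kept intact. */
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;

    if (enable)
        flags |= O_DIRECT;
    else
        flags &= ~O_DIRECT;

    return fcntl(fd, F_SETFL, flags) < 0 ? -1 : 0;
}
```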

cheers.

--
Nathan

2006-05-01 22:23:12

by be-news06

Subject: Re: O_DIRECT, ext3fs, kernel 2.4.32... again

Nathan Scott wrote:
>> > Your interpretation is correct. It would be nicer for open() to
>> > fail on fs'es which don't support O_DIRECT, but v2.4 makes such
>> > check later at read/write unfortunately ;(
>>
>> Oops :(
>
> Nothing else really make sense due to fcntl...
> fcntl(fd, F_SETFL, O_DIRECT);
> ...can happen at any time, to enable/disable direct I/O.

Actually, every time the O_DIRECT feature is enabled (open() or fcntl()) you
could fail, why not?

Gruss
Bernd

2006-05-02 17:24:18

by DervishD

Subject: Re: O_DIRECT, ext3fs, kernel 2.4.32... again

Hi Nathan :)

* Nathan Scott dixit:
> On Mon, May 01, 2006 at 01:23:03PM +0200, DervishD wrote:
> > * Marcelo Tosatti dixit:
> > > > Shouldn't ext3fs return an error when the O_DIRECT flag is
> > > > used in the open call? Is the open call userspace only and thus
> > > > only libc can return such error? Am I misunderstanding the entire
> > > > issue and this is a perfectly legal behaviour (allowing the open,
> > > > failing in the read operation)?
> > >
> > > Your interpretation is correct. It would be nicer for open() to
> > > fail on fs'es which don't support O_DIRECT, but v2.4 makes such
> > > check later at read/write unfortunately ;(
> >
> > Oops :(
>
> Nothing else really make sense due to fcntl...
> fcntl(fd, F_SETFL, O_DIRECT);
> ...can happen at any time, to enable/disable direct I/O.

I know, but that fcntl call should fail just like the open() one.
I mean, I don't find this very different, it's just another point
where the flag can be activated and so it should fail if the
underlying filesystem doesn't support it (and doesn't ignore it in
read()/write()).

Raúl Núñez de Arenas Coronado

2006-05-02 20:03:47

by Nathan Scott

Subject: Re: O_DIRECT, ext3fs, kernel 2.4.32... again

On Tue, May 02, 2006 at 07:24:11PM +0200, DervishD wrote:
> Hi Nathan :)

Hi there,

> * Nathan Scott dixit:
> > On Mon, May 01, 2006 at 01:23:03PM +0200, DervishD wrote:
> > > * Marcelo Tosatti dixit:
> > > > Your interpretation is correct. It would be nicer for open() to
> > > > fail on fs'es which don't support O_DIRECT, but v2.4 makes such
> > > > check later at read/write unfortunately ;(
> > >
> > > Oops :(
> >
> > Nothing else really make sense due to fcntl...
> > fcntl(fd, F_SETFL, O_DIRECT);
> > ...can happen at any time, to enable/disable direct I/O.
>
> I know, but that fcntl call should fail just like the open() one.
> I mean, I don't find this very different, it's just another point
> where the flag can be activated and so it should fail if the
> underlying filesystem doesn't support it (and doesn't ignore it in
> read()/write()).

Problem is there is no way to know whether the underlying fs
supports direct IO or not here (fcntl is implemented outside
the filesystem, entirely). Which is not unfixable in itself
(could use a superblock flag or something similar) but it's
way out of scope for the sort of change going into 2.4 these
days.

cheers.

--
Nathan

2006-05-03 05:27:55

by DervishD

Subject: Re: O_DIRECT, ext3fs, kernel 2.4.32... again

Hi Nathan :)

* Nathan Scott dixit:
> > > Nothing else really make sense due to fcntl...
> > > fcntl(fd, F_SETFL, O_DIRECT);
> > > ...can happen at any time, to enable/disable direct I/O.
> >
> > I know, but that fcntl call should fail just like the open() one.
> > I mean, I don't find this very different, it's just another point
> > where the flag can be activated and so it should fail if the
> > underlying filesystem doesn't support it (and doesn't ignore it
> > in read()/write()).
>
> Problem is there is no way to know whether the underlying fs
> supports direct IO or not here (fcntl is implemented outside the
> filesystem, entirely).

I thought that it was implemented per filesystem.

> Which is not unfixable in itself (could use a superblock flag or
> something similar) but it's way out of scope for the sort of change
> going into 2.4 these days.

Which approach does the 2.6 kernel use? O_DIRECT is correctly handled
for ext3 there, AFAIK :? Are the differences too large?

I know that this change would be intrusive and probably large,
but IMHO it's quite an important bug, because it prevents apps from
selectively disabling O_DIRECT: the flag is accepted by open(), so
the app has no reason to suspect it as the cause of the
read()/write() failures. In fact, it's very difficult to know that
those failures are caused by partial/buggy support of the O_DIRECT flag.

Thanks for the information! :)

Raúl Núñez de Arenas Coronado

2006-05-03 06:35:38

by Nathan Scott

Subject: Re: O_DIRECT, ext3fs, kernel 2.4.32... again

On Wed, May 03, 2006 at 07:27:52AM +0200, DervishD wrote:
> ...
> Are the differences too large?

Yep.

> I know that this change would be intrusive and probably large,
> but IMHO is a quite important bug, because it prevents apps to
> selectively disable O_DIRECT (the flag is accepted by open(), so
> there's no reason the app should bother about which caused the
> read()/write() failures. In fact, is very difficult to know that
> those failures are caused by partial/buggy support of O_DIRECT flag).

You could open for direct, do a direct read, and see if it fails.
If it fails, clear O_DIRECT on the fd via fcntl(F_SETFL) then do
regular buffered IO instead... a bit hacky, but should work fine
I think.
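
A minimal sketch of that workaround, assuming the descriptor was opened
with O_DIRECT and that the caller already uses a suitably aligned buffer
and length. The function name read_direct_or_buffered is hypothetical.
Note that EINVAL can also come from a misaligned buffer, in which case
this falls back to buffered I/O as well.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes O_DIRECT in <fcntl.h> on glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): try a read on `fd`; if it is
 * rejected with EINVAL and O_DIRECT is set on the descriptor, clear
 * O_DIRECT via fcntl(F_SETFL) and retry once as ordinary buffered I/O.
 * Returns the byte count from read(), or -1 with errno set.
 */
ssize_t read_direct_or_buffered(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);
    int flags;

    /* Success, or a failure that has nothing to do with O_DIRECT. */
    if (n >= 0 || errno != EINVAL)
        return n;

    flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;

    /* EINVAL without O_DIRECT set: nothing to fall back from. */
    if (!(flags & O_DIRECT)) {
        errno = EINVAL;
        return -1;
    }

    /* Drop O_DIRECT; later reads on this fd go through the page cache. */
    if (fcntl(fd, F_SETFL, flags & ~O_DIRECT) < 0)
        return -1;

    return read(fd, buf, len);
}
```

Once the fallback has happened the flag stays cleared, so subsequent calls
on the same descriptor take the plain read() path straight away.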

cheers.

--
Nathan