LinuxLists.cc - Re: regressions due to 64-bit ext4 directory cookies

2013-02-14 00:05:01

Subject: Re: regressions due to 64-bit ext4 directory cookies

On Wed, Feb 13, 2013 at 3:44 PM, Theodore Ts'o wrote:
>
> I suspect this would seriously screw over Gluster, though, and this
> wouldn't be a solution for NFSv3, since NFS needs long-lived directory
> cookies, and not the short-lived cookies which is all POSIX/SuSv3
> guarantees.
>

Actually this would work just fine with Gluster. Except in the case of
gluster-NFS, the native client is only acting like a router/proxy of
syscalls to the backend system. A directory opened by an application will
have a matching directory fd opened on ext4, and readdir from an app will
be translated into readdir on the matching fd on ext4. So the
app-on-glusterfs and glusterfsd-on-ext4 are essentially "moving in tandem".
As long as the offs^H^H^H^H cookies do not overflow in the transformation,
Gluster would not have a problem.

However Gluster-NFS (and NFS in general, too) will break, as we
opendir/closedir potentially on every request.

Avati


2013-02-14 21:47:47

by J. Bruce Fields

Subject: Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies

On Wed, Feb 13, 2013 at 04:05:01PM -0800, Anand Avati wrote:
> On Wed, Feb 13, 2013 at 3:44 PM, Theodore Ts'o wrote:
> >
> > I suspect this would seriously screw over Gluster, though, and this
> > wouldn't be a solution for NFSv3, since NFS needs long-lived directory
> > cookies, and not the short-lived cookies which is all POSIX/SuSv3
> > guarantees.
> >
>
> Actually this would work just fine with Gluster. Except in the case of
> gluster-NFS, the native client is only acting like a router/proxy of
> syscalls to the backend system. A directory opened by an application will
> have a matching directory fd opened on ext4, and readdir from an app will
> be translated into readdir on the matching fd on ext4. So the
> app-on-glusterfs and glusterfsd-on-ext4 are essentially "moving in tandem".
> As long as the offs^H^H^H^H cookies do not overflow in the transformation,
> Gluster would not have a problem.
>
> However Gluster-NFS (and NFS in general, too) will break, as we
> opendir/closedir potentially on every request.

Yes. And, of course, NFS cookies live forever--we have no idea when a
client will hand one back to us and expect us to do something with it.

--b.
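
The point about opendir/closedir on every request can be illustrated with a toy model. This is only a sketch, not ext4 or NFS code: `cookie_for` uses CRC32 as a stand-in for a real directory hash, and `serve_readdir` and `list_all` are hypothetical names. Each request rebuilds its view of the directory from scratch, so the cookie handed back by the client is the only state that survives between requests.

```python
import zlib


def cookie_for(name):
    """Toy stand-in for a directory hash; +1 keeps 0 free to mean 'start'."""
    return zlib.crc32(name.encode()) + 1


def serve_readdir(names, cookie, count):
    """Stateless request: 'open' the directory, return up to `count`
    (name, cookie) pairs positioned after `cookie`, then 'close' it.
    Nothing is remembered between calls."""
    entries = sorted((cookie_for(n), n) for n in names)
    return [(n, c) for c, n in entries if c > cookie][:count]


def list_all(names, page_size=2):
    """Client loop: resume each request from the last cookie received."""
    cookie, seen = 0, []
    while True:
        page = serve_readdir(names, cookie, page_size)
        if not page:
            return seen
        for name, c in page:
            seen.append(name)
            cookie = c


if __name__ == "__main__":
    print(list_all({"a", "b", "c", "d", "e"}))
```

Because the cookie is derived from the entry itself rather than from an open handle, a client can present it at any later time and the listing still resumes in the right place. That is the property NFS depends on, and it is why the cookie cannot be a short-lived, per-open value.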

2013-03-28 16:26:40

by Eric Sandeen

Subject: Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies

On 3/28/13 9:07 AM, Theodore Ts'o wrote:
> On Tue, Mar 26, 2013 at 10:48:14AM -0500, Eric Sandeen wrote:
>>> We don't have reached a conclusion so far, do we? What about the
>>> ioctl approach, but a bit differently? Would it work to specify the
>>> allowed upper bits for ext4 (for example 16 additional bit) and the
>>> remaining part for gluster? One of the mails had the calculation
>>> formula:
>>
>> I did throw together an ioctl patch last week, but I think Anand has a new
>> approach he's trying out which won't require ext4 code changes. I'll let
>> him reply when he has a moment. :)
>
> Any update about whether Gluster can address this without needing the
> ioctl patch? Or should we push the ioctl patch into ext4 for the next
> merge window?

I went ahead & sent the ioctl patches to the ext4 list; they are lightly
tested, and not tested at all w/ gluster AFAIK. Wanted to get them
out just in case we decide we want them.

Thanks,
-Eric

> Thanks,
>
> - Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

2013-03-26 15:48:28

by Eric Sandeen

Subject: Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies

On 3/26/13 10:23 AM, Bernd Schubert wrote:
> Sorry for my late reply, I had been rather busy.
>
> On 02/14/2013 01:05 AM, Anand Avati wrote:
>> On Wed, Feb 13, 2013 at 3:44 PM, Theodore Ts'o wrote:
>>>
>>> I suspect this would seriously screw over Gluster, though, and this
>>> wouldn't be a solution for NFSv3, since NFS needs long-lived directory
>>> cookies, and not the short-lived cookies which is all POSIX/SuSv3
>>> guarantees.
>>>
>>
>> Actually this would work just fine with Gluster. Except in the case of
>
> Would it really work perfectly? What about a server reboot in the middle
> of a readdir of a client?
>
>> gluster-NFS, the native client is only acting like a router/proxy of
>> syscalls to the backend system. A directory opened by an application will
>> have a matching directory fd opened on ext4, and readdir from an app will
>> be translated into readdir on the matching fd on ext4. So the
>> app-on-glusterfs and glusterfsd-on-ext4 are essentially "moving in tandem".
>> As long as the offs^H^H^H^H cookies do not overflow in the transformation,
>> Gluster would not have a problem.
>>
>> However Gluster-NFS (and NFS in general, too) will break, as we
>> opendir/closedir potentially on every request.
>
> We don't have reached a conclusion so far, do we? What about the
> ioctl approach, but a bit differently? Would it work to specify the
> allowed upper bits for ext4 (for example 16 additional bit) and the
> remaining part for gluster? One of the mails had the calculation
> formula:

I did throw together an ioctl patch last week, but I think Anand has a new
approach he's trying out which won't require ext4 code changes. I'll let
him reply when he has a moment. :)

-Eric

> final_d_off = (ext4_d_off * MAX_SERVERS) + server_idx
>
> But what is the value of MAX_SERVERS?
>
>
> Cheers,
> Bernd
>
>

2013-03-26 15:23:19

by Bernd Schubert

Subject: Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies

Sorry for my late reply, I had been rather busy.

On 02/14/2013 01:05 AM, Anand Avati wrote:
> On Wed, Feb 13, 2013 at 3:44 PM, Theodore Ts'o wrote:
>>
>> I suspect this would seriously screw over Gluster, though, and this
>> wouldn't be a solution for NFSv3, since NFS needs long-lived directory
>> cookies, and not the short-lived cookies which is all POSIX/SuSv3
>> guarantees.
>>
>
> Actually this would work just fine with Gluster. Except in the case of

Would it really work perfectly? What about a server reboot in the middle
of a readdir of a client?

> gluster-NFS, the native client is only acting like a router/proxy of
> syscalls to the backend system. A directory opened by an application will
> have a matching directory fd opened on ext4, and readdir from an app will
> be translated into readdir on the matching fd on ext4. So the
> app-on-glusterfs and glusterfsd-on-ext4 are essentially "moving in tandem".
> As long as the offs^H^H^H^H cookies do not overflow in the transformation,
> Gluster would not have a problem.
>
> However Gluster-NFS (and NFS in general, too) will break, as we
> opendir/closedir potentially on every request.

We haven't reached a conclusion so far, have we? What about the ioctl
approach, but a bit differently? Would it work to specify the allowed
upper bits for ext4 (for example 16 additional bits) and the remaining
part for gluster? One of the mails had the calculation formula:

final_d_off = (ext4_d_off * MAX_SERVERS) + server_idx

But what is the value of MAX_SERVERS?

Cheers,
Bernd
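
The formula quoted above, and the reason the value of MAX_SERVERS matters, can be worked through in a few lines. This is a sketch under stated assumptions, not Gluster's code: the function names are hypothetical, and `COOKIE_BITS = 63` assumes the combined cookie has to fit in a non-negative signed 64-bit offset.

```python
COOKIE_BITS = 63  # assumption: result must fit a non-negative signed 64-bit offset


def encode_d_off(ext4_d_off, server_idx, max_servers):
    """final_d_off = (ext4_d_off * MAX_SERVERS) + server_idx, with range checks."""
    if not 0 <= server_idx < max_servers:
        raise ValueError("server_idx out of range")
    final_d_off = ext4_d_off * max_servers + server_idx
    if final_d_off >= 1 << COOKIE_BITS:
        raise OverflowError("combined cookie does not fit in %d bits" % COOKIE_BITS)
    return final_d_off


def decode_d_off(final_d_off, max_servers):
    """Inverse of encode_d_off: returns (ext4_d_off, server_idx)."""
    return divmod(final_d_off, max_servers)


def backend_bits_available(max_servers):
    """Bits a backend cookie may use so that encoding can never overflow."""
    return COOKIE_BITS - (max_servers - 1).bit_length()


if __name__ == "__main__":
    cookie = encode_d_off(12345, 3, 16)
    print(cookie, decode_d_off(cookie, 16))
    print(backend_bits_available(16))
    try:
        encode_d_off(1 << 62, 0, 16)  # a backend cookie using the high bits
    except OverflowError as exc:
        print("overflow:", exc)
```

When MAX_SERVERS is a power of two, the multiplication is the same as a left shift, so this is equivalent to reserving the low bits for the server index and leaving the upper bits to ext4. Under the assumptions above, 16 servers leave 59 bits for the backend cookie, so a backend cookie that uses all of its high bits cannot be combined without overflow.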

2013-03-28 14:55:46

by Theodore Ts'o

Subject: Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies

On Tue, Mar 26, 2013 at 10:48:14AM -0500, Eric Sandeen wrote:
> > We don't have reached a conclusion so far, do we? What about the
> > ioctl approach, but a bit differently? Would it work to specify the
> > allowed upper bits for ext4 (for example 16 additional bit) and the
> > remaining part for gluster? One of the mails had the calculation
> > formula:
>
> I did throw together an ioctl patch last week, but I think Anand has a new
> approach he's trying out which won't require ext4 code changes. I'll let
> him reply when he has a moment. :)

Any update about whether Gluster can address this without needing the
ioctl patch? Or should we push the ioctl patch into ext4 for the next
merge window?

Thanks,

- Ted

2013-03-28 17:52:16

by Zach Brown

Subject: Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies

On Thu, Mar 28, 2013 at 10:07:44AM -0400, Theodore Ts'o wrote:
> On Tue, Mar 26, 2013 at 10:48:14AM -0500, Eric Sandeen wrote:
> > > We don't have reached a conclusion so far, do we? What about the
> > > ioctl approach, but a bit differently? Would it work to specify the
> > > allowed upper bits for ext4 (for example 16 additional bit) and the
> > > remaining part for gluster? One of the mails had the calculation
> > > formula:
> >
> > I did throw together an ioctl patch last week, but I think Anand has a new
> > approach he's trying out which won't require ext4 code changes. I'll let
> > him reply when he has a moment. :)
>
> Any update about whether Gluster can address this without needing the
> ioctl patch? Or should we push the ioctl patch into ext4 for the next
> merge window?

They're testing a work-around:

http://review.gluster.org/#change,4711

I'm not sure if they've decided that they're going to go with it, or
not.

- z