From: Anand Avati
Subject: Re: regressions due to 64-bit ext4 directory cookies
Date: Thu, 28 Mar 2013 15:20:59 -0700
To: Jeff Darcy
Cc: Eric Sandeen, linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
    Theodore Ts'o, Zach Brown, Bernd Schubert,
    linux-ext4-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
    gluster-devel-qX2TKyscuCcdnm+yROfE0A@public.gmane.org
References: <20130213224720.GE5938@thunk.org>
    <20130213230511.GW14195@fieldses.org>
    <20130213234430.GF5938@thunk.org>
    <5151BD5F.30607@itwm.fraunhofer.de>
    <5151C33E.2070008@redhat.com>
    <20130328140744.GA4989@thunk.org>
    <20130328175205.GD16651@lenny.home.zabbo.net>
    <20130328183153.GG7080@fieldses.org>
    <51549D74.1060703@redhat.com>

On Thu, Mar 28, 2013 at 3:14 PM, Anand Avati wrote:

> On Thu, Mar 28, 2013 at 12:43 PM, Jeff Darcy wrote:
>
>> On 03/28/2013 02:49 PM, Anand Avati wrote:
>> > Yes, it should, based on the theory of how ext4 was generating the
>> > 63bits. But Jeff's test finds that the experiment is not matching the
>> > theory.
>>
>> FWIW, I was able to re-run my test in between stuff related to That
>> Other Problem.  What seems to be happening is that we read correctly
>> until just after d_off 0x4000000000000000, then we suddenly wrap around
>> - not to the very first d_off we saw, but to a pretty early one (e.g.
>> 0x0041b6340689a32e).  This is all on a single brick, BTW, so it's pretty
>> easy to line up the back-end and front-end d_off values which match
>> perfectly up to this point.
>>
>> I haven't had a chance to ponder what this all means and debug it
>> further.  Hopefully I'll be able to do so soon, but I figured I'd
>> mention it in case something about those numbers rang a bell.
>>
>
> Of course, the unit tests (with artificial offsets) were done with brick
> count >= 2. You have tested with DHT subvol count=1, which was not tested,
> and sure enough, the code isn't handling it well. Just verified with the
> unit tests that brick count = 1 condition fails to return the same d_off.
>
> Posting a fixed version. Thanks for the catch!
>

Posted an updated version http://review.gluster.org/4711. This passes unit
tests for all brick counts (>= 1). Can you confirm if the "loop"ing is now
gone in your test env?

Thanks,
Avati
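
For readers following the thread, here is a minimal sketch of the kind of round-trip check discussed above: encode a back-end d_off into a front-end d_off, decode it again, and verify the same value comes back for every brick count, including 1. Everything in it is an assumption made for illustration and is not taken from the patch at review.gluster.org/4711: the names `index_bits`, `encode`, `decode`, `round_trips`, the `min_bits` parameter, and the scheme of packing the subvolume index into the low bits of a 63-bit value are all hypothetical. The `min_bits=1` case only models one way a single-brick setup could wrap just past 0x4000000000000000; it is not a statement about the actual bug.

```python
# Hypothetical d_off transform, for illustration only (not the Gluster code).
MASK63 = (1 << 63) - 1  # assumption: front-end d_off is limited to 63 bits


def index_bits(brick_count, min_bits=0):
    """Bits reserved for the subvolume index.

    min_bits models a naive floor (e.g. "always reserve at least one bit").
    With brick_count == 1 and min_bits == 0, no bits are reserved.
    """
    return max((brick_count - 1).bit_length(), min_bits)


def encode(backend_off, subvol, brick_count, min_bits=0):
    """Pack the subvolume index into the low bits of a 63-bit d_off."""
    bits = index_bits(brick_count, min_bits)
    # Any back-end bits shifted past bit 62 are silently dropped here.
    return ((backend_off << bits) | subvol) & MASK63


def decode(front_off, brick_count, min_bits=0):
    """Recover (backend_off, subvol) from a front-end d_off."""
    bits = index_bits(brick_count, min_bits)
    return front_off >> bits, front_off & ((1 << bits) - 1)


def round_trips(backend_off, subvol, brick_count, min_bits=0):
    """True if decode(encode(x)) returns the original offset and subvolume."""
    front = encode(backend_off, subvol, brick_count, min_bits)
    return decode(front, brick_count, min_bits) == (backend_off, subvol)
```

In this sketch, `round_trips(0x4000000000000000, 0, 1)` holds because a single brick reserves no index bits, while `round_trips(0x4000000000000000, 0, 1, min_bits=1)` fails because the top bit is shifted out and the decoded offset lands on a much smaller value. Small artificial offsets round-trip either way, which is why a test suite that only uses such offsets, or only brick counts >= 2, could miss the single-brick case.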

_______________________________________________
Gluster-devel mailing list
Gluster-devel-qX2TKyscuCcdnm+yROfE0A@public.gmane.org
https://lists.nongnu.org/mailman/listinfo/gluster-devel