Return-Path: Received: from mail-vx0-f174.google.com ([209.85.220.174]:36589 "EHLO mail-vx0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755276Ab1ILPD6 (ORCPT ); Mon, 12 Sep 2011 11:03:58 -0400 Received: by vxi9 with SMTP id 9so2812782vxi.19 for ; Mon, 12 Sep 2011 08:03:57 -0700 (PDT) In-Reply-To: <4E6B0E48.7050208@tonian.com> References: <1314512558-16912-1-git-send-email-gusev.vitaliy@nexenta.com> <1315337382.16274.7.camel@lade.trondhjem.org> <4E669B21.30006@nexenta.com> <1315348373.19556.22.camel@lade.trondhjem.org> <2E1EB2CF9ED1CB4AA966F0EB76EAB4430B0ED3DF@SACMVEXC2-PRD.hq.netapp.com> <2E1EB2CF9ED1CB4AA966F0EB76EAB4430B0ED4C8@SACMVEXC2-PRD.hq.netapp.com> <1315592430.17611.15.camel@lade.trondhjem.org> <4E6B0E48.7050208@tonian.com> From: Peng Tao Date: Mon, 12 Sep 2011 22:56:50 +0800 Message-ID: Subject: Re: [PATCH] nfs: fix inifinite loop at nfs4_layoutcommit_release To: Benny Halevy Cc: Trond Myklebust , tao.peng@emc.com, gusev.vitaliy@nexenta.com, gusev.vitaliy@gmail.com, linux-nfs@vger.kernel.org Content-Type: text/plain; charset=UTF-8 Sender: linux-nfs-owner@vger.kernel.org List-ID: MIME-Version: 1.0 On Sat, Sep 10, 2011 at 3:14 PM, Benny Halevy wrote: > On 2011-09-09 11:20, Trond Myklebust wrote: >> On Thu, 2011-09-08 at 23:11 -0400, tao.peng@emc.com wrote: >>> HI, Trond, >>> >>>> -----Original Message----- >>>> From: Myklebust, Trond [mailto:Trond.Myklebust@netapp.com] >>>> Sent: Friday, September 09, 2011 1:05 AM >>>> To: Peng Tao >>>> Cc: Peng, Tao; gusev.vitaliy@nexenta.com; gusev.vitaliy@gmail.com; >>>> linux-nfs@vger.kernel.org >>>> Subject: RE: [PATCH] nfs: fix inifinite loop at nfs4_layoutcommit_release >>>> >>>>> -----Original Message----- >>>>> From: Peng Tao [mailto:bergwolf@gmail.com] >>>>> Sent: Thursday, September 08, 2011 11:00 AM >>>>> To: Myklebust, Trond >>>>> Cc: tao.peng@emc.com; gusev.vitaliy@nexenta.com; >>>>> gusev.vitaliy@gmail.com; linux-nfs@vger.kernel.org >>>>> Subject: Re: [PATCH] nfs: fix inifinite loop at >>>>> nfs4_layoutcommit_release >>>>> >>>>> On Thu, Sep 8, 2011 at 8:01 PM, Myklebust, Trond >>>>> wrote: >>>>>>> -----Original Message----- >>>>> >>>>>> >>>>>> Yes, but as far as I can see, even in the blocks case there can be >>>>> multiple extents per layout segment. What if I write to one >>>>> uninitialised extent, layoutcommit, then write to another uninitialized >>>>> extent in the same layout segment and layoutcommit? In my reading of >>>>> the code, there is a chance that the second layoutcommit will fail to >>>>> pick up the layout segment, and so will fail to notify the MDS that the >>>>> second extent now contains data. >>>>> >>>>> blocklayout does not decide what to layoutcommit according to the lseg >>>>> list. Instead, it keeps track of each extent's state at the >>>>> granularity of blocksize, and encode whatever needs layoutcommitted in >>>>> the layoutcommit call. So in your above case, as long as the second >>>>> layoutcommit is issued, blocklayout will encode the newly written >>>>> extent in the second layoutcommit call, even if the lseg is not >>>>> attached to the second layoutcommit. >>>>> >>>>> But that leads to another two question: >>>>> 1. How do we protect against layoutrecall if lseg is not linked to >>>>> layoutcommit? For this one, can we just reject layoutrecall if there >>>>> is inflight layoutcommit? It will be less parallel but can guarantee >>>>> current implementation correctness. >>>>> 2. blocklayout ONLY: bl_committing may be overloaded by several >>>>> layoutcommit calls and we don't have information in >>>>> cleanup_layoutcommit() on how many entry should be removed from >>>>> bl_committing. Maybe we can add a (void*) to struct >>>>> nfs4_layoutcommit_data, so that LD can pass some private information >>>>> between encode_layoutcommit() and cleanup_layoutcommit()? >>>> >>>> 3. What is the purpose of pinning the layout segment at all if neither blocks, nor >>>> objects nor files cares? >>> I believe it is for protecting against layoutrecall. But since we are seperating lseg and LD specific layout information management, it is actually not working as expected. >>> > > The layout segments are not really in use while in LAYOUTCOMMIT. > We only need to get the stateid right with respect to concurrent layout recalls. LAYOUTCOMMIT takes lseg reference to mark them as in use so that layoutrecall cannot free them. -- Thanks, Tao