Return-Path: Received: from mx2.netapp.com ([216.240.18.37]:6014 "EHLO mx2.netapp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750883Ab1CXQs0 convert rfc822-to-8bit (ORCPT ); Thu, 24 Mar 2011 12:48:26 -0400 Subject: Re: [PATCH 11/12] NFSv4.1: layoutcommit From: Trond Myklebust To: Benny Halevy Cc: "William A. (Andy) Adamson" , Fred Isaman , NFS list In-Reply-To: <4D8B732F.8020404@panasas.com> References: <4D8B732F.8020404@panasas.com> Content-Type: text/plain; charset="UTF-8" Date: Thu, 24 Mar 2011 12:48:25 -0400 Message-ID: <1300985305.31106.6.camel@lade.trondhjem.org> Sender: linux-nfs-owner@vger.kernel.org List-ID: MIME-Version: 1.0 On Thu, 2011-03-24 at 18:37 +0200, Benny Halevy wrote: > On 2011-03-24 15:57, William A. (Andy) Adamson wrote: > >>> Only whole file layout support means that there is only one IOMODE_RW layout > >>> segment. > >>> > >>> Signed-off-by: Andy Adamson > >>> Signed-off-by: Alexandros Batsakis > >>> Signed-off-by: Boaz Harrosh > >>> Signed-off-by: Dean Hildebrand > >>> Signed-off-by: Fred Isaman > >>> Signed-off-by: Mingyang Guo > >>> Signed-off-by: Tao Guo > >>> Signed-off-by: Zhang Jingwang > >>> Tested-by: Boaz Harrosh > >>> Signed-off-by: Benny Halevy > >> > >> The code in this patch is new and different enough from the one I/we > >> signed-off originally that they don't make sense here. > > > > Hi Benny > > > > OK with me > > > >>> > >>> + /* references matched in nfs4_layoutcommit_release */ > >>> + wdata->lseg->pls_lc_cred = > >>> + get_rpccred(wdata->args.context->state->owner->so_cred); > >>> + mark_inode_dirty_sync(wdata->inode); > >>> + dprintk("%s: Set layoutcommit for inode %lu ", > >>> + __func__, wdata->inode->i_ino); > >>> + } > >>> + if (end_pos > wdata->lseg->pls_end_pos) > >>> + wdata->lseg->pls_end_pos = end_pos; > >> > >> The end_pos is essentially per inode, why maintain it per lseg? > >> How do you see this working with multiple lsegs in mind? > > > > The end-pos is per lseg, not per inode - each layoutcommit applies to > > a range of WRITES for a layoutsegment over the LAYOUTCOMMIT range. > > > > From Section 18.42.3 > > . The byte-range being committed is > > specified through the byte-range (loca_offset and loca_length). This > > byte-range MUST overlap with one or more existing layouts previously > > granted via LAYOUTGET > > > > > > Also, loca_last_write_offset MUST overlap the range > > described by loca_offset and loca_length. > > > > For the multiple lseg case: if the lsegs are merged, bookeeping > > end_pos per lseg just works. If a layoutdriver does not use merged > > lsegs, then there is a bit of work to do to walk the list of lsegs and > > determine the final end_pos for a given LAYOUTCOMMIT. If there are > > multiple non-contiguous lsegs, each used for WRITEs then multiple > > LAYOUTCOMMITs will need to be sent, otherwise the LAYOUTCOMMIT > > byte-range will not overlap as required. > > > > For the current layout types I believe that the LAYOUTCOMMIT can "merge" > multiple layout segments into a single LAYOUTCOMMIT, with a byte range > covering all segments and a last_byte_written offset which is just the maximum. > Future layout types may need this method though... Is that safe? What if I'm doing blocks and have written layout segment 1 & 3, but not layout segment 2? I don't want to have the MDS commit layout segment 2, and make the (lack of) data there visible to future readers. -- Trond Myklebust Linux NFS client maintainer NetApp Trond.Myklebust@netapp.com www.netapp.com