Return-Path:
Received: from mail-gw0-f52.google.com ([74.125.83.52]:38399 "EHLO mail-gw0-f52.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753154Ab1IMHvB (ORCPT ); Tue, 13 Sep 2011 03:51:01 -0400
Received: by gwj15 with SMTP id 15so316915gwj.11 for ; Tue, 13 Sep 2011 00:51:01 -0700 (PDT)
Message-ID: <4E6F0B61.4050907@tonian.com>
Date: Tue, 13 Sep 2011 00:50:57 -0700
From: Benny Halevy
To: Trond Myklebust
CC: Peng Tao , tao.peng@emc.com, gusev.vitaliy@nexenta.com, gusev.vitaliy@gmail.com, linux-nfs@vger.kernel.org
Subject: Re: [PATCH] nfs: fix inifinite loop at nfs4_layoutcommit_release
References: <1314512558-16912-1-git-send-email-gusev.vitaliy@nexenta.com> <1315337382.16274.7.camel@lade.trondhjem.org> <4E669B21.30006@nexenta.com> <1315348373.19556.22.camel@lade.trondhjem.org> <2E1EB2CF9ED1CB4AA966F0EB76EAB4430B0ED3DF@SACMVEXC2-PRD.hq.netapp.com> <2E1EB2CF9ED1CB4AA966F0EB76EAB4430B0ED4C8@SACMVEXC2-PRD.hq.netapp.com> <1315592430.17611.15.camel@lade.trondhjem.org> <4E6B0E48.7050208@tonian.com> <4E6E6C3B.2040605@tonian.com> <1315861851.8350.11.camel@lade.trondhjem.org>
In-Reply-To: <1315861851.8350.11.camel@lade.trondhjem.org>
Content-Type: text/plain; charset=ISO-8859-1
Sender: linux-nfs-owner@vger.kernel.org
List-ID:
MIME-Version: 1.0

On 2011-09-12 14:10, Trond Myklebust wrote:
> On Mon, 2011-09-12 at 13:31 -0700, Benny Halevy wrote:
>> On 2011-09-12 07:56, Peng Tao wrote:
>>>> The layout segments are not really in use while in LAYOUTCOMMIT.
>>>> We only need to get the stateid right with respect to concurrent layout recalls.
>>> LAYOUTCOMMIT takes lseg reference to mark them as in use so that
>>> layoutrecall cannot free them.
>>>
>>
>> And if layoutrecall would have freed layout segments during layoutcommit,
>> what is your specific concern?
>
> That layoutcommit is supposed to return NFS4ERR_BAD_LAYOUT in that case
> according to section 18.42.3 of RFC5661. I can't find anything in the
> errata that changes that requirement.
>

Right.
That tells me there is no need to strictly serialize LAYOUTCOMMITs with
CB_LAYOUTRECALL, as long as the layout stateid sent with LAYOUTCOMMIT
atomically represents the state when the operation was prepared.

That said, since we do want the LAYOUTCOMMIT to succeed, it would be
beneficial for the client to reply with NFS4ERR_DELAY to a CB_LAYOUTRECALL
received while a conflicting LAYOUTCOMMIT is in progress.

The server, on its side, should prevent a distributed deadlock by not
blocking a LAYOUTCOMMIT on an outstanding CB_LAYOUTRECALL for the same
client that sent the LAYOUTCOMMIT. I'm not sure what error would be best
to return. Maybe NFS4ERR_RECALL_CONFLICT if it would be allowed (it isn't
listed for LAYOUTCOMMIT at the moment). Just returning NFS4ERR_DELAY might
lead to a livelock situation where neither the LAYOUTCOMMIT nor the
CB_LAYOUTRECALL completes.

Benny
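
For illustration only, here is a minimal sketch of the client-side behaviour suggested above: reply NFS4ERR_DELAY to a CB_LAYOUTRECALL whose range overlaps a LAYOUTCOMMIT that is still in flight. This is not the actual Linux client code. The names struct range, struct layout_state, layoutcommit_begin(), layoutcommit_done() and cb_layoutrecall() are hypothetical, and the sketch assumes at most one LAYOUTCOMMIT is tracked per layout and that offset + length does not overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/* Status codes as numbered in RFC 5661. */
enum { NFS4_OK = 0, NFS4ERR_DELAY = 10008 };

/* Hypothetical byte range [offset, offset + length). */
struct range {
	uint64_t offset;
	uint64_t length;
};

/* Hypothetical per-layout client state; tracks a single LAYOUTCOMMIT. */
struct layout_state {
	bool commit_in_flight;
	struct range commit_range;
};

/* True if the two ranges share at least one byte (no overflow handling). */
static bool ranges_overlap(const struct range *a, const struct range *b)
{
	return a->offset < b->offset + b->length &&
	       b->offset < a->offset + a->length;
}

/* Record that a LAYOUTCOMMIT covering r has been sent. */
void layoutcommit_begin(struct layout_state *lo, struct range r)
{
	lo->commit_in_flight = true;
	lo->commit_range = r;
}

/* Record that the LAYOUTCOMMIT reply has been processed. */
void layoutcommit_done(struct layout_state *lo)
{
	lo->commit_in_flight = false;
}

/*
 * Decide the reply to a CB_LAYOUTRECALL for the given range.
 * A conflicting in-flight LAYOUTCOMMIT makes the client ask the
 * server to retry later instead of freeing the segments now.
 */
int cb_layoutrecall(const struct layout_state *lo, struct range recalled)
{
	if (lo->commit_in_flight &&
	    ranges_overlap(&lo->commit_range, &recalled))
		return NFS4ERR_DELAY;
	return NFS4_OK;
}
```

With this shape, a recall for a range that does not overlap the committed range is processed immediately, and only a conflicting one is delayed. It does nothing about the livelock concern on the server side, which still needs a policy for what to return to the LAYOUTCOMMIT.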