Return-Path: Received: from mx2.netapp.com ([216.240.18.37]:55247 "EHLO mx2.netapp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752966Ab1HAXLU convert rfc822-to-8bit (ORCPT ); Mon, 1 Aug 2011 19:11:20 -0400 Subject: Re: [PATCH v4 00/27] add block layout driver to pnfs client From: Trond Myklebust To: Andy Adamson Cc: Peng Tao , Jim Rees , Christoph Hellwig , linux-nfs@vger.kernel.org, peter honeyman Date: Mon, 01 Aug 2011 19:11:18 -0400 In-Reply-To: <0A94625F-4E4F-4A88-A0DF-63D429924F66@netapp.com> References: <1311874276-1386-1-git-send-email-rees@umich.edu> <20110729155136.GB28306@infradead.org> <20110729185415.GA23061@merit.edu> <20110729190133.GA10946@infradead.org> <20110729191341.GC23061@merit.edu> <1311988172.16078.15.camel@lade.trondhjem.org> <20110730032621.GB25188@merit.edu> <1312233006.23392.17.camel@lade.trondhjem.org> <1312238117.23392.19.camel@lade.trondhjem.org> <0A94625F-4E4F-4A88-A0DF-63D429924F66@netapp.com> Content-Type: text/plain; charset="UTF-8" Message-ID: <1312240278.4616.7.camel@lade.trondhjem.org> Sender: linux-nfs-owner@vger.kernel.org List-ID: MIME-Version: 1.0 On Mon, 2011-08-01 at 18:57 -0400, Andy Adamson wrote: > On Aug 1, 2011, at 6:35 PM, Trond Myklebust wrote: > > > On Mon, 2011-08-01 at 17:10 -0400, Trond Myklebust wrote: > >> Looking at the callback code, I see that if tbl->highest_used_slotid != > >> 0, then we BUG() while holding the backchannel's tbl->slot_tbl_lock > >> spinlock. That seems a likely candidate for the above hang. > >> > >> Andy, how we are guaranteed that tbl->highest_used_slotid won't take > >> values other than 0, and why do we commit suicide when it does? As far > >> as I can see, there is no guarantee that we call nfs4_cb_take_slot() in > >> nfs4_callback_compound(), however we appear to unconditionally call > >> nfs4_cb_free_slot() provided there is a session. > >> > >> The other strangeness would be the fact that there is nothing enforcing > >> the NFS4_SESSION_DRAINING flag. If the session is draining, then the > >> back-channel simply ignores that and goes ahead with processing the > >> callback. Is this to avoid deadlocks with the server returning > >> NFS4ERR_BACK_CHAN_BUSY when the client does a DESTROY_SESSION? > > > When NFS4_SESSION_DRAINING is set, the backchannel is drained first - and waits for current processing to complete signaled by the highest_used_slotid == -1. Any other backchannel requests that occur under the NFS4_SESSION_DRAINING flag are processed - but just the CB_SEQUENCE operation which returns NFS4ERR_DELAY. It does indeed prevent the BACK_CHAN_BUSY deadlock. > > > > > How about something like the following? > > Looks good. Nice catch. One change below and a comment inline > > -->Andy > > > > > 8<------------------------------------------------------------------------------- > > From c0c499b0ca9d0af8cbdc29c40effba38475461d9 Mon Sep 17 00:00:00 2001 > > From: Trond Myklebust > > Date: Mon, 1 Aug 2011 18:31:12 -0400 > > Subject: [PATCH] NFSv4.1: Fix the callback 'highest_used_slotid' behaviour > > > > Currently, there is no guarantee that we will call nfs4_cb_take_slot() even > > though nfs4_callback_compound() will consistently call > > nfs4_cb_free_slot() provided the cb_process_state has set the 'clp' field. > > The result is that we can trigger the BUG_ON() upon the next call to > > nfs4_cb_take_slot(). > > This patch fixes the above problem by using the slot id that was taken in > > the CB_SEQUENCE operation as a flag for whether or not we need to call > > nfs4_cb_free_slot(). > > It also fixes an atomicity problem: we need to set tbl->highest_used_slotid > > atomically with the check for NFS4_SESSION_DRAINING, otherwise we end up > > racing with the various tests in nfs4_begin_drain_session(). > > But the code below doesn't do this - it locks the backchannel slot table to check the flag, but does not set the highest_used_slot atomically with this check. It should. The patch ensures that we hold the backchannel slot table lock across both the NFS4_SESSION_DRAINING test and the validate_seqid(), and also ensures that the latter function sets tbl->highest_used_slotid on success. > > > > Signed-off-by: Trond Myklebust > > --- > > fs/nfs/callback.h | 2 +- > > fs/nfs/callback_proc.c | 18 +++++++++++++----- > > fs/nfs/callback_xdr.c | 24 +++++++----------------- > > 3 files changed, 21 insertions(+), 23 deletions(-) > > > > diff --git a/fs/nfs/callback.h b/fs/nfs/callback.h > > index b257383..07df5f1 100644 > > --- a/fs/nfs/callback.h > > +++ b/fs/nfs/callback.h > > @@ -38,6 +38,7 @@ enum nfs4_callback_opnum { > > struct cb_process_state { > > __be32 drc_status; > > struct nfs_client *clp; > > + int slotid; > > }; > > > > struct cb_compound_hdr_arg { > > @@ -166,7 +167,6 @@ extern unsigned nfs4_callback_layoutrecall( > > void *dummy, struct cb_process_state *cps); > > > > extern void nfs4_check_drain_bc_complete(struct nfs4_session *ses); > > -extern void nfs4_cb_take_slot(struct nfs_client *clp); > > > > struct cb_devicenotifyitem { > > uint32_t cbd_notify_type; > > diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c > > index 74780f9..1bd2c81 100644 > > --- a/fs/nfs/callback_proc.c > > +++ b/fs/nfs/callback_proc.c > > @@ -348,7 +348,7 @@ validate_seqid(struct nfs4_slot_table *tbl, struct cb_sequenceargs * args) > > /* Normal */ > > if (likely(args->csa_sequenceid == slot->seq_nr + 1)) { > > slot->seq_nr++; > > - return htonl(NFS4_OK); > > + goto out_ok; > > } > > > > /* Replay */ > > @@ -367,11 +367,14 @@ validate_seqid(struct nfs4_slot_table *tbl, struct cb_sequenceargs * args) > > /* Wraparound */ > > if (args->csa_sequenceid == 1 && (slot->seq_nr + 1) == 0) { > > slot->seq_nr = 1; > > - return htonl(NFS4_OK); > > + goto out_ok; > > } > > > > /* Misordered request */ > > return htonl(NFS4ERR_SEQ_MISORDERED); > > +out_ok: > > + tbl->highest_used_slotid = args->csa_slotid; > > + return htonl(NFS4_OK); > > } > > > > /* > > @@ -433,26 +436,32 @@ __be32 nfs4_callback_sequence(struct cb_sequenceargs *args, > > struct cb_sequenceres *res, > > struct cb_process_state *cps) > > { > > + struct nfs4_slot_table *tbl; > > struct nfs_client *clp; > > int i; > > __be32 status = htonl(NFS4ERR_BADSESSION); > > > > - cps->clp = NULL; > > - > > clp = nfs4_find_client_sessionid(args->csa_addr, &args->csa_sessionid); > > if (clp == NULL) > > goto out; > > > > + tbl = &clp->cl_session->bc_slot_table; > > + > > + spin_lock(&tbl->slot_tbl_lock); > > /* state manager is resetting the session */ > > if (test_bit(NFS4_SESSION_DRAINING, &clp->cl_session->session_state)) { > > status = NFS4ERR_DELAY; > > status = htonl(NFS4ERR_DELAY); Yep. I'll fix that too... > > + spin_unlock(&tbl->slot_tbl_lock); > > goto out; > > } > > > > status = validate_seqid(&clp->cl_session->bc_slot_table, args); > > + spin_unlock(&tbl->slot_tbl_lock); This is what guarantees the atomicity. Even if nfs4_begin_drain_session() sets NFS4_SESSION_DRAINING, we know that it can't test the value of tbl->highest_used_slotid without first taking the tbl->slot_tbl_lock, and we're holding that... > > if (status) > > goto out; > > > > + cps->slotid = args->csa_slotid; > -- Trond Myklebust Linux NFS client maintainer NetApp Trond.Myklebust@netapp.com www.netapp.com