Return-Path: Received: from mail-ua0-f182.google.com ([209.85.217.182]:40604 "EHLO mail-ua0-f182.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933554AbeCHS63 (ORCPT ); Thu, 8 Mar 2018 13:58:29 -0500 Received: by mail-ua0-f182.google.com with SMTP id c14so538454uak.7 for ; Thu, 08 Mar 2018 10:58:29 -0800 (PST) MIME-Version: 1.0 In-Reply-To: References: <87y46monel.fsf@notabene.neil.brown.name> <87bmyx3q3u.fsf@notabene.neil.brown.name> <8737k2354z.fsf@notabene.neil.brown.name> <87zijsu4ko.fsf@notabene.neil.brown.name> <87d1dbcyxo.fsf@notabene.neil.brown.name> From: Olga Kornievskaia Date: Thu, 8 Mar 2018 13:58:28 -0500 Message-ID: Subject: Re: [PATCH resend] NFS: Don't disconnect open-owner on NFS4ERR_BAD_SEQID To: NeilBrown Cc: Trond Myklebust , Schumaker Anna , List Linux NFS Mailing Content-Type: text/plain; charset="UTF-8" Sender: linux-nfs-owner@vger.kernel.org List-ID: On Tue, Mar 21, 2017 at 12:43 PM, Olga Kornievskaia wrote: > On Mon, Mar 20, 2017 at 7:26 PM, NeilBrown wrote: >> On Mon, Mar 20 2017, Olga Kornievskaia wrote: >> >>> Since open_owner is no longer removed then the resources are not >>> freed. They will not be freed until unmount (or reboot recovery). >> >> I don't follow your reasoning. >> When the last uses for a state owner is dropped by >> nfs4_put_state_owner() (e.g. when the last open file using it is >> closed), nfs4_put_state_owner() will add the state to >> server->state_owners_lru. >> >> nfs4_gc_state_owners() is called periodically (every time any file is >> opened I think) and it will call nfs4_remove_state_owner_locked() >> on any state which is sufficiently old enough (hasn't been used in >> 'lease-time'), and will then call nfs4_free_state_owner(). >> These two will release all resources. >> >> Does that explanation fit with your understanding? > > Thanks for the explanation. I was thinking about the rb-tree nodes. > The new open finds the same rb-tree node. I incorrectly thought that > it'd be generating a new node each time for the new open owner. I have another question about this patch. With this patch, when the new open owner is sent, it is sent with a seqid of what the old open owner had. Is this intentional or accidental? > > Thanks. > >> >> Thanks, >> NeilBrown >> >> >>> >>> There are (buggy) servers out there that might be forcing the client >>> to keep creating new open_owners. So if they are no longer released, >>> can't the client get into trouble here? >>> >>> On Sun, Dec 18, 2016 at 7:48 PM, NeilBrown wrote: >>>> >>>> When an NFS4ERR_BAD_SEQID is received the open-owner is removed from >>>> the ->state_owners rbtree so that it will no longer be used. >>>> >>>> If any stateids attached to this open-owner are still in use, and if a >>>> request using one gets an NFS4ERR_BAD_STATEID reply, this can for bad. >>>> >>>> The state is marked as needing recovery and the nfs4_state_manager() >>>> is scheduled to clean up. nfs4_state_manager() finds states to be >>>> recovered by walking the state_owners rbtree. As the open-owner is >>>> not in the rbtree, the bad state is not found so nfs4_state_manager() >>>> completes having done nothing. The request is then retried, with a >>>> predicatable result (indefinite retries). >>>> >>>> If the stateid is for a delegation, this open_owner will be used >>>> to open files when the delegation is returned. For that to work, >>>> a new open-owner needs to be presented to the server. >>>> >>>> This patch changes NFS4ERR_BAD_SEQID handling to leave the open-owner >>>> in the rbtree but updates the 'create_time' so it looks like a new >>>> open-owner. With this the indefinite retries no longer happen. >>>> >>>> Signed-off-by: NeilBrown >>>> --- >>>> >>>> Hi Trond, >>>> It appears this one got lost too. >>>> I've added a comment as I thought an explanation was needed, and >>>> renamed the function from "drop" to "reset". >>>> >>>> Thanks, >>>> NeilBrown >>>> >>>> fs/nfs/nfs4state.c | 29 +++++++++++++---------------- >>>> 1 file changed, 13 insertions(+), 16 deletions(-) >>>> >>>> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c >>>> index cf869802ff23..1d152f4470cd 100644 >>>> --- a/fs/nfs/nfs4state.c >>>> +++ b/fs/nfs/nfs4state.c >>>> @@ -494,21 +494,18 @@ nfs4_alloc_state_owner(struct nfs_server *server, >>>> } >>>> >>>> static void >>>> -nfs4_drop_state_owner(struct nfs4_state_owner *sp) >>>> -{ >>>> - struct rb_node *rb_node = &sp->so_server_node; >>>> - >>>> - if (!RB_EMPTY_NODE(rb_node)) { >>>> - struct nfs_server *server = sp->so_server; >>>> - struct nfs_client *clp = server->nfs_client; >>>> - >>>> - spin_lock(&clp->cl_lock); >>>> - if (!RB_EMPTY_NODE(rb_node)) { >>>> - rb_erase(rb_node, &server->state_owners); >>>> - RB_CLEAR_NODE(rb_node); >>>> - } >>>> - spin_unlock(&clp->cl_lock); >>>> - } >>>> +nfs4_reset_state_owner(struct nfs4_state_owner *sp) >>>> +{ >>>> + /* This state_owner is no longer usable, but must >>>> + * remain in place so that state recovery can find it >>>> + * and the opens associated with it. >>>> + * It may also be used for new 'open' request to >>>> + * return a delegation to the server. >>>> + * So update the 'create_time' so that it looks like >>>> + * a new state_owner. This will cause the server to >>>> + * request an OPEN_CONFIRM to start a new sequence. >>>> + */ >>>> + sp->so_seqid.create_time = ktime_get(); >>>> } >>>> >>>> static void nfs4_free_state_owner(struct nfs4_state_owner *sp) >>>> @@ -1113,7 +1110,7 @@ void nfs_increment_open_seqid(int status, struct nfs_seqid *seqid) >>>> >>>> sp = container_of(seqid->sequence, struct nfs4_state_owner, so_seqid); >>>> if (status == -NFS4ERR_BAD_SEQID) >>>> - nfs4_drop_state_owner(sp); >>>> + nfs4_reset_state_owner(sp); >>>> if (!nfs4_has_session(sp->so_server->nfs_client)) >>>> nfs_increment_seqid(status, seqid); >>>> } >>>> -- >>>> 2.11.0 >>>>