Subject: Re: [PATCH] NFS4: Fix v4.0 client state corruption when mount
From: "zhangxiaoxu (A)"
To: Benjamin Coddington
Date: Thu, 7 Nov 2019 10:34:23 +0800
Message-ID: <6348ccfd-4a61-5e82-fd2a-03b2c18fe220@huawei.com>
In-Reply-To: <21D6F3D9-C1B6-4F5C-98A0-87B067C6E198@redhat.com>
References: <1557115023-86769-1-git-send-email-zhangxiaoxu5@huawei.com> <21D6F3D9-C1B6-4F5C-98A0-87B067C6E198@redhat.com>
X-Mailing-List: linux-nfs@vger.kernel.org

On 2019/11/7 0:47, Benjamin Coddington wrote:
> Hi ZhangXiaoxu,
>
> I'm having a bit of trouble with this fix (which went upstream in
> f02f3755dbd14fb935d24b14650fff9ba92243b8).
>
> Since this change, my client calls SETCLIENTID/SETCLIENTID_CONFIRM twice
> in quick succession on mount, and the second SETCLIENTID_CONFIRM sent by
> the state manager can sometimes have the same verifier sent back by the
> first SETCLIENTID's response.  I think we're missing a memory barrier
> somewhere..

The two SETCLIENTID calls come from these paths:

nfs40_discover_server_trunking
  nfs4_proc_setclientid                # the first time

After nfs4_schedule_state_manager(), the state manager runs:

nfs4_run_state_manager
  nfs4_state_manager                   # 'nfs4_alloc_client' initialized the state to NFS4CLNT_LEASE_EXPIRED
    nfs4_reclaim_lease
      nfs4_establish_lease
        nfs4_init_clientid
          nfs4_proc_setclientid        # the second time

> But, I do not understand how the client was able to corrupt the state
> before this patch, and I don't understand how the patch fixes state
> corruption.
>
> Can anyone enlighten me as to how we were corrupting state here?

In 'nfs4_alloc_client', the client state is initialized to
'NFS4CLNT_LEASE_EXPIRED', so we should recover it when the client is
initialized. After the first SETCLIENTID, maybe we should clear
'NFS4CLNT_LEASE_EXPIRED'; then the state manager won't issue it again.

> Ben
>
> On 5 May 2019, at 23:57, ZhangXiaoxu wrote:
>
>> A stat command on a soft mount never returns after the server is
>> stopped.
>>
>> When a new client is allocated, its state is set to
>> NFS4CLNT_LEASE_EXPIRED.
>>
>> When the server is stopped, the state manager runs and recovers
>> according to that state. But since the state is NFS4CLNT_LEASE_EXPIRED,
>> it drains the slot table and sends other tasks to the wait queue until
>> the client has recovered. The stat command then hangs.
>>
>> When discovering server trunking, the client will renew the lease, but
>> checking the client state leads to client state corruption.
>>
>> So we need to call the state manager to recover when server IP trunking
>> is detected.
>>
>> Signed-off-by: ZhangXiaoxu
>> ---
>>  fs/nfs/nfs4state.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
>> index 3de3647..f502f1c 100644
>> --- a/fs/nfs/nfs4state.c
>> +++ b/fs/nfs/nfs4state.c
>> @@ -159,6 +159,10 @@ int nfs40_discover_server_trunking(struct nfs_client *clp,
>>          /* Sustain the lease, even if it's empty.  If the clientid4
>>           * goes stale it's of no use for trunking discovery. */
>>          nfs4_schedule_state_renewal(*result);
>> +
>> +        /* If the client state need to recover, do it. */
>> +        if (clp->cl_state)
>> +            nfs4_schedule_state_manager(clp);
>>      }
>>  out:
>>      return status;
>> --
>> 2.7.4
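
To make the "clear the flag" suggestion above concrete, here is a minimal,
untested sketch of what the tail of nfs40_discover_server_trunking() in
fs/nfs/nfs4state.c might look like. It is reconstructed from the hunk quoted
above; the placement is an assumption, not a committed fix. Only
clear_bit(), NFS4CLNT_LEASE_EXPIRED, and clp->cl_state are taken from the
existing code:

        /* Sustain the lease, even if it's empty.  If the clientid4
         * goes stale it's of no use for trunking discovery. */
        nfs4_schedule_state_renewal(*result);

        /* Hypothetical alternative: the first SETCLIENTID and
         * SETCLIENTID_CONFIRM have just succeeded, so the lease is
         * established and the NFS4CLNT_LEASE_EXPIRED bit set by
         * nfs4_alloc_client() no longer applies.  Clearing it here keeps
         * nfs4_state_manager() from entering nfs4_reclaim_lease() and
         * sending a second SETCLIENTID.
         */
        clear_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state);

The trade-off is that any other recovery bit already set in clp->cl_state
would no longer be acted on at this point, which may be why the committed
fix schedules the state manager whenever cl_state is non-zero.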