Return-Path: linux-nfs-owner@vger.kernel.org
Received: from mx1.netapp.com ([216.240.18.38]:11048 "EHLO mx1.netapp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757066Ab3KYSdX convert rfc822-to-8bit (ORCPT ); Mon, 25 Nov 2013 13:33:23 -0500
From: "Myklebust, Trond"
To: "Adamson, Andy"
CC: Linux NFS Mailing List
Subject: Re: [PATCH 1/1] NFSv4.1 fix a kswap nfs4_state_manger race
Date: Mon, 25 Nov 2013 18:33:11 +0000
Message-ID: <496E7DBC-183B-43A2-91D4-837FC092E88A@netapp.com>
References: <1385402270-14284-1-git-send-email-andros@netapp.com>
 <1385402270-14284-2-git-send-email-andros@netapp.com>
 <5B8C7A9D-CD9E-487A-AC62-B1292649835D@netapp.com>
In-Reply-To: <5B8C7A9D-CD9E-487A-AC62-B1292649835D@netapp.com>
Content-Type: text/plain; charset="Windows-1252"
MIME-Version: 1.0
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

On Nov 25, 2013, at 13:13, Myklebust, Trond wrote:

>
> On Nov 25, 2013, at 12:57, wrote:
>
>> From: Andy Adamson
>>
>> The state manager is recovering expired state and recovery OPENs are being
>> processed. If kswapd is pruning inodes at the same time, a deadlock can occur
>> when kswapd calls evict_inode on an NFSv4.1 inode with a layout, and the
>> resultant layoutreturn gets an error that the state manager is to handle,
>> causing the layoutreturn to wait on the (NFS client) cl_rpcwaitq.
>>
>> At the same time an open is waiting for the inode deletion to complete in
>> __wait_on_freeing_inode.
>>
>> If the open is either the open called by the state manager, or an open from
>> the same open owner that is holding the NFSv4.0 sequence id which causes the
>> OPEN from the state manager to wait for the sequence id on the Seqid_waitqueue,
>> then the state is deadlocked with kswapd.
>>
>> Do not handle LAYOUTRETURN errors when called from nfs4_evict_inode.
>
> Why are we waiting for recovery in LAYOUTRETURN at all? Layouts are automatically lost when the server reboots or when the lease is otherwise lost.
>
> IOW: Is there any reason why we need to special-case nfs4_evict_inode? Shouldn't we just bail out on error in _all_ cases?

BTW: Is it possible that we might have a similar problem with delegreturn? That too can be called from nfs4_evict_inode...

--
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com