Date: Fri, 25 Oct 2019 11:33:36 -0400
From: bfields@fieldses.org (J. Bruce Fields)
To: "J. Bruce Fields"
Cc: Trond Myklebust, "linux-nfs@vger.kernel.org"
Subject: Re: [PATCH v2] nfsd: Fix races between nfsd4_cb_release() and nfsd4_shutdown_callback()
Message-ID: <20191025153336.GA20283@fieldses.org>
References: <20191023214318.9350-1-trond.myklebust@hammerspace.com>
 <20191025145147.GA16053@pick.fieldses.org>
 <97f56de86f0aeafb56998023d0561bb4a6233eb8.camel@hammerspace.com>
 <20191025152119.GC16053@pick.fieldses.org>
In-Reply-To: <20191025152119.GC16053@pick.fieldses.org>
X-Mailing-List: linux-nfs@vger.kernel.org

On Fri, Oct 25, 2019 at 11:21:19AM -0400, J. Bruce Fields wrote:
> On Fri, Oct 25, 2019 at 02:55:45PM +0000, Trond Myklebust wrote:
> > On Fri, 2019-10-25 at 10:51 -0400, J. Bruce Fields wrote:
> > > On Wed, Oct 23, 2019 at 05:43:18PM -0400, Trond Myklebust wrote:
> > > > When we're destroying the client lease, and we call
> > > > nfsd4_shutdown_callback(), we must ensure that we do not return
> > > > before all outstanding callbacks have terminated and have
> > > > released their payloads.
> > >
> > > This is great, thanks!  We've seen what I'm fairly sure is the same
> > > bug from Red Hat users.  I think my blind spot was an assumption that
> > > rpc tasks wouldn't outlive rpc_shutdown_client().
> > >
> > > However, it's causing xfstests runs to hang, and I haven't worked out
> > > why yet.
> > >
> > > I'll spend some time on it this afternoon and let you know what I
> > > figure out.
> > >
> >
> > Is that happening with v2 or with v1? With v1 there is definitely a
> > hang in __destroy_client() due to the refcount leak that I believe I
> > fixed in v2.
>
> I thought I was running v2, let me double-check....

Yes, with v2 I'm getting a hang on generic/013.

I checked quickly and didn't see anything interesting in the logs,
otherwise I haven't done any digging.

--b.
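
For readers following the thread, the requirement quoted above (shutdown must not return until every outstanding callback has terminated and released its payload) can be illustrated with a minimal userspace sketch. This is not the patch under discussion and not nfsd code: it uses POSIX threads rather than kernel primitives, and struct cb_client, cb_get(), cb_put() and cb_shutdown() are hypothetical names chosen only for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-client callback state; not the nfsd struct. */
struct cb_client {
    pthread_mutex_t lock;
    pthread_cond_t  drained;       /* signalled when inflight drops to 0 */
    int             inflight;      /* callbacks started but not yet released */
    bool            shutting_down; /* set once shutdown has begun */
};

static void cb_client_init(struct cb_client *clp)
{
    pthread_mutex_init(&clp->lock, NULL);
    pthread_cond_init(&clp->drained, NULL);
    clp->inflight = 0;
    clp->shutting_down = false;
}

/* Take a reference before queueing a callback; refuse once shutdown began. */
static bool cb_get(struct cb_client *clp)
{
    bool ok;

    pthread_mutex_lock(&clp->lock);
    ok = !clp->shutting_down;
    if (ok)
        clp->inflight++;
    pthread_mutex_unlock(&clp->lock);
    return ok;
}

/* Drop the reference from the release path, after the payload is freed. */
static void cb_put(struct cb_client *clp)
{
    pthread_mutex_lock(&clp->lock);
    assert(clp->inflight > 0);     /* catch an unbalanced put */
    if (--clp->inflight == 0)
        pthread_cond_broadcast(&clp->drained);
    pthread_mutex_unlock(&clp->lock);
}

/* Block until every outstanding callback has called cb_put(). */
static void cb_shutdown(struct cb_client *clp)
{
    pthread_mutex_lock(&clp->lock);
    clp->shutting_down = true;
    while (clp->inflight > 0)
        pthread_cond_wait(&clp->drained, &clp->lock);
    pthread_mutex_unlock(&clp->lock);
}
```

In this sketch, a cb_get() with no matching cb_put() leaves cb_shutdown() blocked forever, which is the general shape of hang that a leaked reference produces. Whether the generic/013 hang reported above has that cause is not established in this message.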