Date: Sat, 15 Jan 2022 18:44:20 +1030
From: Jonathan Woithe
To: Chuck Lever III
Cc: Bruce Fields, Linux NFS Mailing List
Subject: Re: [Bug report] Recurring oops, 5.15.x, possibly during or soon after client mount
Message-ID: <20220115081420.GB8808@marvin.atrad.com.au>
References: <20220114103901.GA22009@marvin.atrad.com.au>
X-Mailing-List: linux-nfs@vger.kernel.org

Hi Chuck

Thanks for your response.

On Fri, Jan 14, 2022 at 03:18:01PM +0000, Chuck Lever III wrote:
> > Recently we migrated an NFS server from a 32-bit environment running
> > kernel 4.14.128 to a 64-bit 5.15.x kernel. The NFS configuration remained
> > unchanged between the two systems.
> >
> > On two separate occasions since the upgrade (5 Jan under 5.15.10, 14 Jan
> > under 5.15.12) the kernel has oopsed at around the time that an NFS client
> > machine is turned on for the day. On both occasions the call trace was
> > essentially identical. The full oops sequence is at the end of this email.
> > The oops was not observed when running the 4.14.128 kernel.
> >
> > Is there anything more I can provide to help track down the cause of the
> > oops?
>
> A possible culprit is 7f024fcd5c97 ("Keep read and write fds with each
> nlm_file"), which was introduced in or around v5.15. You could try a
> simple test and back the server down to v5.14.y to see if the problem
> persists.

I could do this, but only perhaps on Monday when I'm next on site. It may
take a while to get an answer though, since it seems we hit the fault only
around once every 2 weeks. Since it's a production server we are of course
limited in the things I can do.

I *may* be able to set up another system as an NFS server and hit that with
repeated mount requests. That could help reduce the time we have to wait
for an answer.

Is it worth considering a revert of 7f024fcd5c97? I guess it depends on how
many later patches depended on it.

> Otherwise, Bruce, can you have a look at this?

Regards
  jonathan
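
For reference, the repeated-mount test mentioned above could look roughly
like the sketch below. It is only an illustration: the function names, the
server name, the export path and the mount point are all hypothetical, and it
is not known whether a plain mount/umount cycle is enough to trigger the
oops (the suspected commit touches the NLM code, so a test that also takes
file locks may be needed).

```python
import subprocess
import time


def mount_cycle_commands(server, export, mountpoint):
    """Return the (mount, umount) command lines for one test cycle."""
    mount_cmd = ["mount", "-t", "nfs", f"{server}:{export}", mountpoint]
    umount_cmd = ["umount", mountpoint]
    return mount_cmd, umount_cmd


def stress_mounts(server, export, mountpoint, cycles, delay=1.0,
                  run=subprocess.run):
    """Mount and unmount an NFS export repeatedly.

    Returns the number of commands that exited with a non-zero status.
    `run` is injectable so the loop can be exercised without root.
    """
    failures = 0
    mount_cmd, umount_cmd = mount_cycle_commands(server, export, mountpoint)
    for _ in range(cycles):
        if run(mount_cmd).returncode != 0:
            # Mount failed, so there is nothing to unmount this cycle.
            failures += 1
        elif run(umount_cmd).returncode != 0:
            failures += 1
        time.sleep(delay)
    return failures
```

On a test client this would be run as root with something like
stress_mounts("testserver", "/export", "/mnt/nfstest", cycles=100), while
watching the server's kernel log for the oops.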