Date: Sun, 22 Mar 2020 04:43:52 +0100
From: Jan Psota
To: Chuck Lever
Cc: Linux NFS Mailing List, "Jason A. Donenfeld"
Subject: Re: refcount underflow in nfsd41_destroy_cb
Message-ID: <20200322044352.2ff1fbd8.jasiu@belsznica.pl>
In-Reply-To: <44C9D860-4F51-46B1-88A3-D10DDEF4BD8E@oracle.com>
X-Mailing-List: linux-nfs@vger.kernel.org

Chuck Lever wrote:
> Jan, how are you reproducing this?

It looks like it happens on the server under high NFS load, about a day
after boot (as I noticed from the "last -x" results below). After that the
system runs all right for a month, until it is rebooted for a new kernel
(not always) or something like that.

We have some NFS-rooted machines:

/systemd on / type nfs4 (rw,relatime,vers=4.2,rsize=4096,wsize=4096,namlen=255,hard,proto=tcp,timeo=10,retrans=2,sec=sys,clientaddr=192.168.1.18,local_lock=none,addr=192.168.1.1)

The server has a 10Gb Aquantia AQC107 card connected to a Mikrotik CSS326
switch. The clients, which run distcc (besides acting as workstations), are
connected over 1Gb Ethernet. The server runs Gentoo Linux on OpenRC (the
stations have systemd) with recent gcc-9.3, binutils-2.34 and glibc-2.30;
it has 32 GB RAM and an AMD Phenom II X6 1090T CPU.
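
For anyone comparing client setups, the NFS root mount options quoted above can be split into a dictionary to make individual values easy to check. This is only a minimal sketch: the helper name parse_mount_options is hypothetical, and the option string is copied from the mount line with its quoted-printable escapes decoded.

```python
def parse_mount_options(opts: str) -> dict:
    """Split a comma-separated mount option string into a dict.

    Flag options such as "hard" map to True; key=value options keep
    their value as a string.
    """
    result = {}
    for item in opts.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        result[key] = value if sep else True
    return result


# Option string as reported for the NFS-rooted clients (decoded).
OPTS = (
    "rw,relatime,vers=4.2,rsize=4096,wsize=4096,namlen=255,hard,proto=tcp,"
    "timeo=10,retrans=2,sec=sys,clientaddr=192.168.1.18,local_lock=none,"
    "addr=192.168.1.1"
)

opts = parse_mount_options(OPTS)
for name in ("vers", "rsize", "wsize", "timeo", "retrans"):
    print(f"{name}={opts[name]}")
```

Note that, per nfs(5), timeo is given in tenths of a second, so timeo=10 is a one-second initial timeout.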

/var/tmp/portage, where compilation takes place, is normally on the client's
tmpfs, but when there is not enough space to compile a huge program, I switch
it to an NFS export from the server
(/etc/exports opts: -rw,async,no_root_squash,no_subtree_check).

# "grep nfs.*destroy /var/log/messages" mixed with "last -x"
reboot   system boot  5.5.1-gentoo Mon Feb  3 00:20 - 15:22 (25+15:01)
Feb  4 17:44:39 agro kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
    rust compilation, kernel 5.5.1-gentoo
reboot   system boot  5.5.6-gentoo Fri Feb 28 15:23 - 16:25 (14+01:02)
Feb 29 13:51:49 agro kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
    rust compilation, kernel 5.5.6-gentoo
reboot   system boot  5.5.9-gentoo Fri Mar 13 16:27 - 00:04 (4+07:36)
Mar 14 18:03:49 agro kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
    libpciaccess compilation, kernel
reboot   system boot  5.6.0-rc6    Wed Mar 18 00:06 - 20:39 (2+20:32)
Mar 19 11:08:07 agro kernel:  nfsd41_destroy_cb+0x36/0x50 [nfsd]
    linux-firmware merge *
reboot   system boot  5.6.0-rc6    Fri Mar 20 20:40 - 02:40 (05:59)
Mar 20 21:43:34 agro kernel:  nfsd41_destroy_cb+0x36/0x50 [nfsd]
    zstd compilation *
reboot   system boot  5.6.0-rc6    Sat Mar 21 02:42   still running
Mar 21 17:34:43 agro kernel:  nfsd41_destroy_cb+0x36/0x50 [nfsd]
    nodejs compilation

* - I noticed the kernel fault while looking for the reason why WireGuard
refused to connect with _some_ remote peers, so I rebooted the server and it
helped.
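
To make the "time after boot" easier to read off the mixed log above, the gap between each boot and the following nfsd41_destroy_cb line can be computed directly. This is a rough sketch, not part of any tool: the boot and fault timestamps are copied from the listing, the year 2020 is assumed from the Date header because the log lines omit it, and the helper names are hypothetical. Month parsing with %b assumes an English or C locale.

```python
from datetime import datetime

YEAR = 2020  # assumption: taken from the message date; log lines carry no year

# (kernel, boot time from "last -x", nfsd41_destroy_cb timestamp from syslog)
EVENTS = [
    ("5.5.1-gentoo", "Feb 3 00:20", "Feb 4 17:44:39"),
    ("5.5.6-gentoo", "Feb 28 15:23", "Feb 29 13:51:49"),
    ("5.5.9-gentoo", "Mar 13 16:27", "Mar 14 18:03:49"),
    ("5.6.0-rc6", "Mar 18 00:06", "Mar 19 11:08:07"),
    ("5.6.0-rc6", "Mar 20 20:40", "Mar 20 21:43:34"),
    ("5.6.0-rc6", "Mar 21 02:42", "Mar 21 17:34:43"),
]


def parse(stamp: str, fmt: str) -> datetime:
    """Parse a year-less log timestamp by prepending the assumed year."""
    return datetime.strptime(f"{YEAR} {stamp}", f"%Y {fmt}")


def uptime_before_fault(boot: str, fault: str):
    """Return the time between a boot entry and the following fault line."""
    return parse(fault, "%b %d %H:%M:%S") - parse(boot, "%b %d %H:%M")


for kernel, boot, fault in EVENTS:
    print(f"{kernel:<14} boot {boot:<13} fault after {uptime_before_fault(boot, fault)}")
```

The boot times from "last -x" have minute resolution only, so each gap is accurate to within a minute at best.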