Received: by 2002:a25:824b:0:0:0:0:0 with SMTP id d11csp2499357ybn; Thu, 26 Sep 2019 12:51:59 -0700 (PDT) X-Google-Smtp-Source: APXvYqwG0DcMH8E9CmkM7DpgHUaCG+DW1z2vF+pR6aK5Rpqz3OqDxnVcRcKySVv52WD1haYHMGm7 X-Received: by 2002:a17:906:80c3:: with SMTP id a3mr4626455ejx.271.1569527519798; Thu, 26 Sep 2019 12:51:59 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1569527519; cv=none; d=google.com; s=arc-20160816; b=gCrtYpfXr4o/76EeKANJl+sHq05VMcdwLLNUZLjarZXpV2QgmI9/FqF1fDphZqWolB fpPcPa+RMnVdUflrb0q7hfN5IG4Cj+RdbKmdlDIzqdD6dkRCzk35aRkKv9lLH4Fr1ngQ dTbS5PxifhJ/nmTuj+Wr98nSv5P260Q+CO3fZXJaPqeiMNL+X59+7jNgBCmQb6JtHTZI ut/l4+gFRAHVRzmtg2vjx+btIhkWCXVPEgoyOvki+uHXOujYPyBCbAd3VSuNQVsyqU0z 5yWQweA7LuZUI//68qXEKp9Zhl9Ey3rMlJssQ2fkEF7fa2AxiFqEeTEKxEXbbRt0gGT2 Z9VA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:from:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:date; bh=GSgzXzlb1nwoltuS8aTPCKCqxtqD1j9oHwG0K5fW7F0=; b=AK3nuJZ+pgGU02kxHoqFrORV/ET69EjzBQJsrnMVqmaV5lhjzrTLDD8XKqFf7iXA+X 1PwFOItBsYkQeugtjtqzY+AGKdBNbm+0YorbcT+rMcqFrT/MorgJQEGNzk8ZKL7Do4Og Clv7QxgarIR30qeWJzor4GQCkvlMBk8dCDmunc3p4h1wYq/lW3/LalVB1qrkl9K2Vm9g FqVaCzATC4mOZpiFKV0G2vnm6kv/qePFjv2r9/l2x4WvgoGvTvGQJtNHPVnFenrjFTH9 2fhz2+0jj5tjZbEvNw11NwkFAE4fhPZDg+KhbAeIgTT4h525h7Ng+/ulACka6yxuGsh4 xRkQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-nfs-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id j6si1826638ejj.37.2019.09.26.12.51.24; Thu, 26 Sep 2019 12:51:59 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-nfs-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-nfs-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728747AbfIZTvA (ORCPT + 99 others); Thu, 26 Sep 2019 15:51:00 -0400 Received: from fieldses.org ([173.255.197.46]:60848 "EHLO fieldses.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728654AbfIZTvA (ORCPT ); Thu, 26 Sep 2019 15:51:00 -0400 Received: by fieldses.org (Postfix, from userid 2815) id 2BDCD150B; Thu, 26 Sep 2019 15:51:00 -0400 (EDT) Date: Thu, 26 Sep 2019 15:51:00 -0400 To: James Harvey Cc: Linux NFS Mailing List , bcodding@redhat.com Subject: Re: 5.3.0 Regression: rpc.nfsd v4 uninterruptible sleep for 5+ minutes w/o rpc-statd/etc Message-ID: <20190926195100.GB2849@fieldses.org> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) From: bfields@fieldses.org (J. Bruce Fields) Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org On Thu, Sep 19, 2019 at 09:17:03AM -0400, James Harvey wrote: > On Thu, Sep 19, 2019 at 9:00 AM James Harvey wrote: > > > > For a really long time (years?) if you forced NFS v4 only, you could > > mask a lot of unnecessary services. > > > > In /etc/nfs.conf, in "[nfsd] I've been able to set "vers3=n", and then > > mask the following services: > > * gssproxy > > * nfs-blkmap > > * rpc-statd > > * rpcbind (service & socket) > > > > Upgrading from 5.2.14 to 5.3.0, nfs-server.service (rpc.nfsd) has > > exactly a 5 minute delay, and sometimes longer. > > > > ... > > P.S. During a few of the earlier boots where I was seeing strace show > the delay was reading "/proc/fs/nfsd/versions", I do have this in > journalctl, when I tried in another term cat'ing it during the > rpc.nfsd delay: (it repeats a few times during the 5 minute delay) It's waiting on the nfsd_mutex. So somebody else is sitting around blocking for five minutes while holding nfsd_mutex. That lock covers a lot of stuff so that may not narrow down much who's blocking. An alt-sysrq-t dump might help. (So, hit alt-sysrq-t, or echo t to /proc/sysrq-trigger, then look in dmesg for tasks that are waiting.) There's not a whole ton of changes between 5.2 and 5.3, but I'm not seeing an obvious culprit. --b. > > INFO: task cat:3400 blocked for more than 122 seconds. > Not tainted 5.3.0-arch1-1-ARCH #1 > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > cat D 0 3400 3029 0x00000080 > Call Trace: > ? __schedule+0x27f/0x6d0 > schedule+0x43/0xd0 > schedule_preempt_disabled+0x14/0x20 > __mutex_lock.isra.0+0x27d/0x530 > write_versions+0x38/0x3c0 [nfsd] > ? _copy_from_user+0x37/0x60 > ? write_ports+0x320/0x320 [nfsd] > nfsctl_transaction_write+0x45/0x70 [nfsd] > nfsctl_transaction_read+0x3b/0x60 [nfsd] > vfs_read+0x9d/0x150 > ksys_read+0x67/0xe0 > do_syscall_64+0x5f/0x1c0 > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > RIP: 0033:0x7f47286fa155 > Code: Bad RIP value. > RSP: 002b:00007ffcd2a74438 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 > RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f47286fa155 > RDX: 0000000000020000 RSI: 00007f47282ba000 RDI: 0000000000000003 > RBP: 00007f47282ba000 R08: 00007f47282b9010 R09: 0000000000000000 > R10: 0000000000000022 R11: 0000000000000246 R12: 00007f47282ba000 > R13: 0000000000000003 R14: 0000000000000fff R15: 0000000000020000 > svc: failed to register lockdv1 RPC service (errno 110). > lockd_up: makesock failed, error=-110