Date: Thu, 6 Apr 2023 19:26:15 +0200
From: Christian Herzog
To: Bob Ciotti
Cc: Chuck Lever III, Linux NFS Mailing List, Bob Ciotti
Subject: Re: file server freezes with all nfsds stuck in D state after upgrade to Debian bookworm
Reply-To: Christian Herzog
In-Reply-To: <4F41FC87-908F-451F-8D2C-089CB7AB5919@gmail.com>
References: <4F41FC87-908F-451F-8D2C-089CB7AB5919@gmail.com>
X-Mailing-List: linux-nfs@vger.kernel.org

Dear Bob,

thanks a lot for your input.

> >>>> That was our first idea too, but we haven't found any indication that this is the case.
> >>>> The xfs file systems seem perfectly fine when all nfsds are in D state, and we can
> >>>> read from them and write to them. If xfs were to block nfs IO, this should
> >>>> affect other processes too, right?
> >>>
> >>> It's possible that the NFSD threads are waiting on I/O to a particular filesystem block. XFS is not likely to block other activity in this case.
> >>
> >> ok, good to know. So far we were under the impression that a file system would
> >> block as a whole.
> >
> > XFS tries to operate in parallel as much as it can. Maybe other filesystems aren't as capable.
> >
> > If the unresponsive block is part of a superblock or the journal (i.e., shared metadata), I would expect XFS to become unresponsive. For I/O on blocks containing file data, it is likely to have more robust behavior.
>
> Pretty sure we have seen a similar issue - never fully explained. From what I
> recall, the server gets into a low-memory state. At that point, efforts to
> coalesce writes are abandoned and each write request is processed inline
> rather than scheduled; all nfsds then pile up in D state while writes keep
> arriving faster than they can be serviced. But the back-end store (a high-end
> NetApp RAID 6 with 240 drives, also with xfs) had very little load - not too
> busy. We never fully explained it, but Chuck's point about a shared metadata
> block may be a good place to look, and whether inline writes under low memory
> could have synergy with that. IIRC, we worked around it with newer releases
> and tunables like minfree and kmem et al. that came into play to reduce - but
> not eliminate - the problem. I'm away from my reference material for a while,
> but I'll review and update if I find anything.

we'll certainly investigate this topic, but right now it's kinda hard to
imagine, since I've never seen the file server above ~10G of its 64G of RAM
(excluding page cache, of course). We're not even sure heavy writes trigger
the problem; in one case our monitoring hinted at a lot of reads leading up to
the freeze. OTOH, if our issue could be resolved by throwing a bunch of RAM
bars into the server, all the better.

thanks,
-Christian
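
PS: purely as a sketch, in case it helps anyone hitting the same thing: when
the freeze occurs, something along the lines of the following could snapshot
the kernel stacks of all D-state nfsds for later analysis. It is untested and
only illustrative; it assumes root, a kernel with /proc/<pid>/stack available,
and that the kernel threads show up with the comm name "nfsd" (which is the
standard name, but verify on your system).

#!/usr/bin/env python3
# Illustrative only: dump kernel stacks of all nfsd threads currently in
# uninterruptible sleep (D state). Needs root for /proc/<pid>/stack.
import os

def thread_state(pid):
    # Third field of /proc/<pid>/stat (after the "(comm)" part) is the
    # state letter: R, S, D, ...
    with open(f"/proc/{pid}/stat") as f:
        return f.read().rsplit(")", 1)[1].split()[0]

for pid in filter(str.isdigit, os.listdir("/proc")):
    try:
        with open(f"/proc/{pid}/comm") as f:
            name = f.read().strip()
        if name != "nfsd" or thread_state(pid) != "D":
            continue
        with open(f"/proc/{pid}/stack") as f:
            print(f"=== nfsd pid {pid} ===\n{f.read()}")
    except OSError:
        # Thread exited between listing and reading, or insufficient
        # permissions; skip it.
        continue

Running it from cron or a watchdog while the hang is in progress should show
whether the threads are all parked in the same spot (e.g. waiting on the same
filesystem block, as discussed above).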