Date: Fri, 6 Aug 2021 15:37:22 -0500 (CDT)
From: Timothy Pearson
To: Olga Kornievskaia
Cc: linux-nfs
Message-ID: <620055521.439700.1628282242689.JavaMail.zimbra@raptorengineeringinc.com>
Subject: Re: Callback slot table overflowed
X-Mailing-List: linux-nfs@vger.kernel.org

----- Original Message -----
> From: "Olga Kornievskaia"
> To: "Timothy Pearson"
> Cc: "linux-nfs"
> Sent: Friday, August 6, 2021 2:53:19 PM
> Subject: Re: Callback slot table overflowed

> On Thu, Aug 5, 2021 at 12:15 AM Timothy Pearson wrote:
>>
>> On further investigation, the working server had already been rolled back to
>> 4.19.0. Apparently the issue was insurmountable in 5.x.
>>
>> It should be simple enough to set up a test environment out of production for
>> 5.x, if you have any debug tips / would like to see any debug options compiled
>> in.
>>
>> Thanks!
>>
>> ----- Original Message -----
>> > From: "Timothy Pearson"
>> > To: "linux-nfs"
>> > Sent: Wednesday, August 4, 2021 7:04:16 PM
>> > Subject: Re: Callback slot table overflowed
>>
>> > Other information that may be helpful:
>> >
>> > All clients are using TCP
>> > arm64 clients are unaffected by the bug
>> > The armel clients use very small (4k) rsize/wsize buffers
>> > Prior to the upgrade from Debian Stretch, everything was working perfectly
>> >
>> > ----- Original Message -----
>> >> From: "Timothy Pearson"
>> >> To: "linux-nfs"
>> >> Sent: Wednesday, August 4, 2021 7:00:20 PM
>> >> Subject: Callback slot table overflowed
>> >
>> >> All,
>> >>
>> >> We've hit an odd issue after upgrading a main NFS server from Debian Stretch to
>> >> Debian Buster. In both cases the 5.13.4 kernel was used, however after the
>> >> upgrade none of our ARM thin clients can mount their root filesystems -- early
>> >> in the boot process I/O errors are returned immediately following "Callback
>> >> slot table overflowed" in the client dmesg.
>> >>
>> >> I am unable to find any useful information on this "Callback slot table
>> >> overflowed" message, and have no idea why it is only impacting our ARM (armel)
>> >> clients. Both 4.14 and 5.3 on the client side show the issue, other client
>> >> kernel versions were not tested.
>> >>
>> >> Curiously, increasing the rsize/wsize values to 65536 or higher reduces (but
>> >> does not eliminate) the number of callback overflow messages.
>> >>
>> >> The server is a ppc64el 64k page host, and none of our ppc64el or amd64 thin
>> >> clients are experiencing any problems. Nothing of interest appears in the
>> >> server message log.
>> >>
>> >> Any troubleshooting hints would be most welcome.
>
> A network trace would be useful.
>
> 5.3 should have this patch "SUNRPC: Fix up backchannel slot table
> accounting". I believe "callback slot table overflowed" is hit when
> the server sent more reqs than client can handle (i.e. doesn't have a
> free slot to handle the request). A network trace would show that.
> However you said this happens when the client is trying to mount and
> besides cb_null requests I'm not sure what could be happening.

I'll work to get a network trace out of the test environment once it's set up.
I should however clarify that this is immediately *after* mount, when the
diskless ARM device is attempting to run early startup (i.e. reading
/etc/init.d and such).

>> >>
> > > > Thank you!
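
For anyone following the thread, the condition described above (the server sending more callback requests than the client has free slots for) can be pictured with a toy model. This is only an illustrative Python sketch: the class, its method names, and the slot counts are invented for the example and are not taken from the kernel's SUNRPC backchannel code.

```python
# Toy model only: NOT the kernel's SUNRPC backchannel implementation.
# All names and numbers here are hypothetical, chosen for illustration.


class CallbackSlotTable:
    """Fixed pool of slots a client keeps for server-initiated callbacks."""

    def __init__(self, num_slots):
        self.num_slots = num_slots  # assumed capacity of the table
        self.in_use = 0             # callbacks currently being handled
        self.overflows = 0          # callbacks that arrived with no free slot

    def receive_callback(self):
        """Claim a slot for an incoming callback; return False on overflow."""
        if self.in_use >= self.num_slots:
            # This is the situation the "slot table overflowed" message
            # is believed to report: no free slot for the request.
            self.overflows += 1
            return False
        self.in_use += 1
        return True

    def complete_callback(self):
        """Release a slot once the client has finished with a callback."""
        if self.in_use == 0:
            raise RuntimeError("no callback in flight")
        self.in_use -= 1


def simulate(num_slots, burst):
    """Send `burst` callbacks back to back without completing any of them."""
    table = CallbackSlotTable(num_slots)
    accepted = sum(table.receive_callback() for _ in range(burst))
    return accepted, table.overflows
```

With a single slot, a burst of three callbacks that are not completed in between leaves two of them without a slot; with four slots the same burst fits. In the model, whether overflows occur depends only on how many callbacks are in flight at once compared with the number of slots, which is the kind of thing a network trace could confirm or rule out.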