From: Jason L Tibbitts III
To: bfields@fieldses.org (J. Bruce Fields)
Cc: linux-nfs@vger.kernel.org, km@cm4all.com, linux-kernel@vger.kernel.org
Subject: Re: Regression in 5.1.20: Reading long directory fails
References: <20190828174609.GB29148@fieldses.org>
Date: Tue, 03 Sep 2019 10:49:48 -0500
In-Reply-To: (Jason L. Tibbitts, III's message of "Wed, 28 Aug 2019 13:29:00 -0500")
X-Mailing-List: linux-nfs@vger.kernel.org

>>>>> "JLT" == Jason L Tibbitts writes:

JLT> Certainly a server reboot, or maybe even just
JLT> unmounting and remounting the filesystem or copying the data to
JLT> another filesystem would tell me that.
JLT> In any case, as soon as I
JLT> am able to mess with that server, I'll know more.

Rebooting the server did not make any difference, and now more users are
seeing the problem. At this point I'm in a state where NFS simply isn't
reliable at all, and I'm not sure what to do.

If Centos 8 were out, I'd work on moving to that just so that the server
was a little more modern. (Currently the server is Centos 7.) I guess I
could try using Fedora, or installing one of the upstream kernels, just
in case this has to do with some interaction between the client and the
old RHEL7 kernel.

I do have a packet capture of a directory listing that fails with EIO,
but I'm not sure if it's safe to simply post it, and I'm not sure what
tshark options would be useful in decoding it.

I do know that I can rsync one of the problematic directories to a
different server (running the same kernel) and it doesn't have the same
problem. What I'll try next is rsyncing to a different filesystem on the
same server, but again I'll have to wait until people log off to do
proper testing.

 - J<