From: Geert Uytterhoeven
Date: Mon, 17 Dec 2018 15:03:35 +0100
Subject: Re: NFS/TCP crashes on MIPS/RBTX4927 in v4.20-rcX (bisected)
To: trondmy@hammerspace.com
Cc: Atsushi Nemoto, Linux Kernel Mailing List, "open list:NFS, SUNRPC, AND...", linux-mips@vger.kernel.org
References: <20181205.221146.969453990167463340.anemo@mba.ocn.ne.jp> <92ce4b8c2b2d53e27ed5bc0e5af3fee4bc17b4dc.camel@hammerspace.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Trond,

On Wed, Dec 5, 2018 at 3:47 PM Geert Uytterhoeven wrote:
> On Wed, Dec 5, 2018 at 2:45 PM Trond Myklebust wrote:
> > On Wed, 2018-12-05 at 14:41 +0100, Geert Uytterhoeven wrote:
> > > On Wed, Dec 5, 2018 at 2:11 PM Atsushi Nemoto wrote:
> > > > On Tue, 4 Dec 2018 14:53:07 +0100, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > > > I found similar crashes in a report from 2006, but of course the code
> > > > > has changed too much to apply the solution proposed there
> > > > > (https://www.linux-mips.org/archives/linux-mips/2006-09/msg00169.html).
> > > > >
> > > > > Userland is Debian 8 (the last release supporting "old" MIPS).
> > > > > My kernel is based on v4.20.0-rc5, but the issue happens with v4.20-rc1,
> > > > > too.
> > > > >
> > > > > However, I noticed it works in v4.19! Hence I've bisected this, to commit
> > > > > 277e4ab7d530bf28 ("SUNRPC: Simplify TCP receive code by switching to using
> > > > > iterators").
> > > > >
> > > > > Dropping the ",tcp" part from the nfsroot parameter also fixes the issue.
> > > > >
> > > > > Given RBTX4927 is little endian, just like my arm/arm64 boards, it's probably
> > > > > not an endianness issue. Sparse didn't show anything suspicious before/after
> > > > > the guilty commit.
> > > > >
> > > > > Do you have a clue?
> > > >
> > > > If it was a cache issue, disabling i-cache or d-cache completely might
> > > > help understanding the problem.
> > > > I added TXx9 specific "icdisable" and "dcdisable" kernel options for
> > > > debugging long ago.
> > > >
> > > > I hope these options still works correctly with recent kernel but not
> > > > sure.
> > > >
> > > > Also, disabling i-cache makes your board VERY slow, of course.
> > >
> > > Thanks!
> > >
> > > When using these options, I do see a slowdown in early boot, but the issue
> > > is still there.
> > >
> > > My next guess is an unaligned access not using {get,put}_unaligned(), which
> > > doesn't seem to work on tx4927, but doesn't cause an exception neither.
> >
> > Can you try my linux-next branch on git.linux-nfs.org? It contains a
> > fixes for a hang that results from the above commit.
> >
> > git pull git://git.linux-nfs.org/projects/trondmy/linux-nfs.git linux-next
>
> Thanks for the suggestion, but unfortunately it doesn't help.

In the mean time, I tried your newer linux-next, no change.

I tried several other things:
  - remove the packed attribute (why did you add that?),
  - verify (at runtime) that all accesses to fraghdr, xid, and calldir are
    aligned,
  - enable RPC_DEBUG_DATA, nothing fishy seen at first sight.

Is anyone else seeing this on MIPS, or any other platform?
Does mounting NFS with -o nfsvers=3,tcp work on other MIPS platforms?

Thanks!

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
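
For readers following the alignment theory, the runtime check on fraghdr, xid, and calldir mentioned in the message can be illustrated in userspace C. This is only a rough sketch under stated assumptions, not the SUNRPC receive code: the helper names (is_aligned32, load_be32, frag_is_last, frag_len, self_check) and the sample bytes are hypothetical, and the byte-wise load merely stands in for what the kernel's get_unaligned() family is meant to guarantee. The only protocol facts relied on are that the RPC-over-TCP record marker is a 4-byte big-endian word whose top bit flags the last fragment, and that a calldir of 1 denotes a reply.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true if p sits on a 4-byte boundary. */
static int is_aligned32(const void *p)
{
    return ((uintptr_t)p & 3u) == 0;
}

/*
 * Alignment-safe big-endian 32-bit load. Copying byte-wise avoids
 * dereferencing a possibly misaligned uint32_t pointer.
 */
static uint32_t load_be32(const void *p)
{
    unsigned char b[4];

    memcpy(b, p, sizeof(b));
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Record marker: top bit = last fragment, low 31 bits = length. */
static int frag_is_last(uint32_t fraghdr)
{
    return (fraghdr & 0x80000000u) != 0;
}

static uint32_t frag_len(uint32_t fraghdr)
{
    return fraghdr & 0x7fffffffu;
}

/* Made-up sample: header deliberately placed at an odd offset. */
static void self_check(void)
{
    _Alignas(4) unsigned char buf[16] = {
        0x00,                   /* padding byte to misalign the header */
        0x80, 0x00, 0x00, 0x28, /* fraghdr: last fragment, 40 bytes */
        0x12, 0x34, 0x56, 0x78, /* xid */
        0x00, 0x00, 0x00, 0x01, /* calldir: 1 = reply */
    };
    const unsigned char *hdr = buf + 1;
    uint32_t frag;

    assert(is_aligned32(buf));
    assert(!is_aligned32(hdr)); /* a plain 32-bit load here may misbehave */

    frag = load_be32(hdr);
    assert(frag_is_last(frag));
    assert(frag_len(frag) == 40);
    assert(load_be32(hdr + 4) == 0x12345678u);
    assert(load_be32(hdr + 8) == 1);
}
```

Calling self_check() from a test program exercises both paths: the alignment predicate flags the odd offset, while the byte-wise load still decodes all three fields correctly. On a CPU that silently mishandles unaligned word loads, comparing such a load against a direct pointer dereference is one way to spot a discrepancy.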