Date: Thu, 2 Mar 2017 17:38:13 +0100
From: Tobias Klauser
To: Guenter Roeck
Cc: Sven Schmidt <4sschmid@informatik.uni-hamburg.de>, Sandra Loosemore,
    Arnd Bergmann, Andrew Morton, linux-kernel@vger.kernel.org,
    Ley Foon Tan, nios2-dev@lists.rocketboards.org
Subject: Re: nios2 crash/hang in mainline due to 'lib: update LZ4 compressor module'
Message-ID: <20170302163813.GE27998@distanz.ch>
References: <20170226210338.GA19476@roeck-us.net>
    <20170228155331.GC27998@distanz.ch>
    <58B5B934.5040807@codesourcery.com>
    <20170228181413.GC13455@roeck-us.net>
    <20170301185817.GA13543@bierbaron.springfield.local>
    <20170301194520.GA20160@roeck-us.net>
In-Reply-To: <20170301194520.GA20160@roeck-us.net>

On 2017-03-01 at 20:45:21 +0100, Guenter Roeck wrote:
> On Wed, Mar 01, 2017 at 07:58:17PM +0100, Sven Schmidt wrote:
> > Hi Guenter, Tobias and Sandra,
> >
> > thanks for your effort here.
> >
> > On Tue, Feb 28, 2017 at 10:14:13AM -0800, Guenter Roeck wrote:
> > > On Tue, Feb 28, 2017 at 10:53:56AM -0700, Sandra Loosemore wrote:
> > > > On 02/28/2017 08:53 AM, Tobias Klauser wrote:
> > > > >(adding Sandra Loosemore to Cc due to possible relation to gcc/binutils
> > > > >for nios2)
> > > > >
> > > > >On 2017-02-26 at 22:03:38 +0100, Guenter Roeck wrote:
> > > > >>Hi Sven,
> > > > >>
> > > > >>my qemu test for nios2 started failing with commit 4e1a33b105dd ("lib:
> > > > >>update LZ4 compressor module"). The test hangs early during boot before
> > > > >>any console output is seen. Reverting the offending patch as well as the
> > > > >>subsequent lz4 related patches fixes the problem. Disabling CONFIG_RD_LZ4
> > > > >>and with it other LZ4 options also fixes it (as does adding "return -EINVAL;"
> > > > >>at the top of the LZ4 decompression code). For reference, bisect log
> > > > >>is attached.
> > > > >>
> > > > >>I tried with buildroot toolchains using gcc 6.1.0 as well as 6.3.0
> > > > >>and binutils 2.26.1. Scripts used to run the tests are available at
> > > > >>https://github.com/groeck/linux-build-test/tree/master/rootfs/nios2.
> > > > >>Qemu is from qemu mainline or qemu v2.8 with nios2 patches applied.
> > > > >
> > > > >Looks like this is somehow related to gcc/binutils. Using GCC 4.8.3 and
> > > > >binutils 2.24.51 (both from Sourcery CodeBench Lite 2014.05) I can
> > > > >get a kernel booting on latest master branch. AFAICT, none of the
> > > > >LZ4_decompress_* functions are called during boot.
> > > >
> > >
> > It seems a bit strange that code which is not actually called causes problems like that.
> >
> Yes, it is, though it is always possible. The code isn't exactly easy to
> understand; there may be some hidden caveats such as global variables.
> It may also be that some jump target exceeds its range (though why that would
> only be seen with the LZ4 code is another question), or that the compiler gets
> confused by the forced inlines (disabling that didn't make a difference,
> though, nor did disabling -O3).
>
> > Please let me know if and how I may help you figure out what's happening, especially
> > regarding the differences between the previous LZ4 and the current implementation.
> >
>
> For my part I am all but clueless. Unless someone has an idea, we may have to
> disable LZ4 support for nios2 for the time being. Does anyone have thoughts
> on that? Of course, that would not help if the problem also affects
> recent gcc/binutils versions on other architectures.

After some further investigation, I'd say this isn't "caused" by LZ4
specifically but by a more general problem with one of the nios2
arch-specific tools involved. I manually enabled random additional CONFIG_*
options and in some cases I got the kernel to boot (with CONFIG_RD_LZ4
enabled and no return -EINVAL in place) while in others I didn't. So I'd
rather suspect this problem to be connected to the size or structure of the
generated vmlinux image.

Or could this even be a problem with qemu? Did anyone already verify this on
the 10m50 devboard? (Unfortunately I don't have any nios2 devboard available
right now, otherwise I would have done this...)

Other than that I'm also becoming all but clueless... One option I thought
of was using the QEMU monitor to dump the CPU state after the hang, but so
far I didn't manage to get it to work (hints appreciated ;)

Thanks
Tobias
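
On the QEMU monitor idea: one possible way to capture the CPU state after
the hang is to drive the monitor over a QMP socket. The following is only a
rough sketch and has not been verified against the nios2 target. It assumes
QEMU was started with an additional "-qmp unix:/tmp/qmp.sock,server,nowait"
option (the socket path is a made-up example), and the helper names
(qmp_message, hmp_request, read_reply, dump_cpu_state) are hypothetical.
Whether "info registers" returns anything useful for nios2 at that point is
an open question.

```python
import json
import socket


def qmp_message(execute, arguments=None):
    """Build one QMP request as a JSON line (bytes)."""
    msg = {"execute": execute}
    if arguments is not None:
        msg["arguments"] = arguments
    return (json.dumps(msg) + "\r\n").encode("ascii")


def hmp_request(command_line):
    """Wrap a human-monitor command (e.g. 'info registers') in a QMP request."""
    return qmp_message("human-monitor-command", {"command-line": command_line})


def read_reply(reader):
    """Skip asynchronous events until a 'return' or 'error' reply arrives."""
    while True:
        line = reader.readline()
        if not line:
            raise ConnectionError("QMP socket closed")
        msg = json.loads(line)
        if "return" in msg or "error" in msg:
            return msg


def dump_cpu_state(sock_path="/tmp/qmp.sock"):
    """Pause the guest and return the text output of 'info registers'.

    sock_path is an assumption; it must match the -qmp option given to QEMU.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        with sock.makefile("r", encoding="utf-8") as reader:
            json.loads(reader.readline())            # QMP greeting banner
            sock.sendall(qmp_message("qmp_capabilities"))
            read_reply(reader)                       # leave negotiation mode
            sock.sendall(qmp_message("stop"))        # pause the hung guest
            read_reply(reader)
            sock.sendall(hmp_request("info registers"))
            return read_reply(reader).get("return", "")
```

Calling print(dump_cpu_state()) once the guest has stopped producing output
should print the register dump, provided the monitor itself is still
responsive.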