Date: Wed, 20 Apr 2016 01:46:13 +0100 (BST)
From: "Maciej W. Rozycki"
To: Bob Tracy
cc: linux-kernel@vger.kernel.org, debian-alpha@lists.debian.org, mcree@orcon.net.nz, jay.estabrook@gmail.com, mattst88@gmail.com
Subject: Re: [BUG] machine check Oops on Alpha
In-Reply-To: <20160419235657.GA4404@gherkin.frus.com>
References: <20160417210532.GA27208@gherkin.frus.com> <20160418035848.GA28637@gherkin.frus.com> <20160418123136.GA30382@gherkin.frus.com> <20160419025243.GA32734@gherkin.frus.com> <20160419235657.GA4404@gherkin.frus.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 19 Apr 2016, Bob Tracy wrote:

> > 4.6.0-rc4 build complete, including suggested (by Alan Young) "Verbose
> > Machine Checks" option set to level 2 by default. System rebooted, and
> > now we wait... Thanks for everyone's continued patience.
>
> Within three minutes of rebooting, I got a machine check, but perhaps
> significantly, no "Oops". I'm guessing the only reason I'm seeing the
> ECC errors now (haven't seen them before) is because of the stepped-up
> debug output. Syslog output attached...

 If this is a code generation bug, which I now suspect even more strongly
than before, then the debug verbosity configuration change may well have
made the compiler behave.

 As you can see from the log the logout area pointer is not null:

machine check: LA: fffffc0000006000

(of course the mere insertion of this `printk' call may have masked the
bug, regardless of the debug verbosity change). Consequently further
information is printed -- the:

CIA machine check: vector=0x630 pc=0xfffffc00005b66ac code=0x86

line would have been printed anyway -- in fact the Oops previously
happened in an attempt to retrieve `code' to print with this line.

 I can check whether there is anything suspicious there if you send me
the original copies (i.e. those that oopsed) of
arch/alpha/kernel/irq_alpha.o and arch/alpha/kernel/core_cia.o.

> Machine has been stable since the machine check. Kernel is 4.6.0-rc4.

 Yeah, it was a correctable error after all.

  Maciej