Date: Mon, 16 Nov 2015 09:53:27 +0100 (CET)
From: Thomas Gleixner
To: Linus Torvalds
cc: Kyle Sanderson, Linux-Kernal
Subject: Re: BUG: unable to handle kernel paging request at ffffe8ff7fc00001
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, 15 Nov 2015, Linus Torvalds wrote:
> On Sun, Nov 15, 2015 at 2:28 PM, Kyle Sanderson wrote:
> > [] BUG: unable to handle kernel paging request at ffffe8ff7fc00001
> > [] IP: [] kstat_irqs+0x4f/0x90
> > [] CPU: 2 PID: 1078 Comm: usage.pl Not tainted 4.1.7-hardened-r1 #1
> > [] Hardware name: Supermicro Super Server/X10SRi-F, BIOS 1.0b 04/21/2015
> RSI: 000060f700000001
> > [] Call Trace:
> > [] [<>] kstat_irqs_usr+0x1e/0x40
>
> The code ends up being
>
>     mov 0x48(%r13),%rsi
>     mov __per_cpu_offset(,%rcx,8),%rcx
>     add (%rsi,%rcx,1),%ebx   <-- trapping instruction
>
> which is just the
>
>     sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
>
> part of kstat_irqs().
>
> Your registers being
>
>     RSI: 000060f700000001
>     RCX: ffff88087fc00000
>
> and it's RSI that makes no sense - RCX looks like a real kernel
> pointer. So it looks like it's the "desc->kstat_irqs" thing that is
> for some reason garbage.
>
> I don't see any sane possible reason this would happen, though.
> Thomas, does this look like anything you've seen before?

No. What's strange is that this does explode while reading
/proc/interrupts and it did not happen when interrupt accounting took
place. Though this looks like memory corruption and it might be an
interrupt which fired only at boot time, i.e. before the corruption
happened.

No idea how to decode that. Kyle, is that reproducible?

Thanks,

	tglx
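
For anyone following the register analysis quoted above: the trapping instruction dereferences (%rsi,%rcx,1), so the faulting address should simply be RSI + RCX. Below is a minimal sketch of that arithmetic in C, assuming plain 64-bit unsigned addition. effective_address() is a hypothetical helper name used only for illustration; it is not a kernel function.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: models the address formed by
 * the operand "(%rsi,%rcx,1)", i.e. base + index * 1.
 *
 * In the quoted oops, base is RSI (the value loaded from
 * desc->kstat_irqs) and index is RCX (the per-CPU offset).
 */
uint64_t effective_address(uint64_t base, uint64_t index)
{
    /* Unsigned 64-bit addition wraps modulo 2^64. */
    return base + index;
}
```

Calling effective_address(0x000060f700000001ULL, 0xffff88087fc00000ULL) gives 0xffffe8ff7fc00001, which is the address in the BUG line. That is consistent with the bogus RSI value being what pushes the access to an unmapped address.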