Date: Mon, 21 May 2018 09:58:03 -0700
From: "Luck, Tony"
To: Jeffrin Thalakkottoor
Cc: Borislav Petkov, Thomas Gleixner, mingo@redhat.com, hpa@zytor.com, x86@kernel.org, linux-edac@vger.kernel.org, lkml
Subject: Re: PROBLEM: mce: [Hardware Error] from dmesg -l emerg
Message-ID: <20180521165803.GA15717@agluck-desk>
References: <20180514162752.GG23049@pd.tnic> <20180520204032.GA19845@pd.tnic>
User-Agent: Mutt/1.9.4 (2018-02-28)
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, May 21, 2018 at 05:31:52PM +0530, Jeffrin Thalakkottoor wrote:
> > Ok, but please do not top-post.
>
> Ok
>
> > Looks like mcelog has trouble decoding this. Have you updated mcelog to
> > the latest version in your distro?
> .
> mcelog 153+dfsg-1

So this is looking like another case where an error is logged
during BIOS bringup, and Linux finds the error when it scans
all machine check banks during boot.

The earlier logs you sent showed a value of ee0000000040110b in
the machine check bank status register. Not sure why mcelog had
trouble with this(*).

Upper bits say: VALID OVER UC MISCV ADDRV
Low 16 (MCACOD) bits say: FILTER CACHE ERR GENERIC LEVEL=3

So BIOS did something to trigger some issues in the L3 cache
(more than once since the overflow and filter bits are both set).

I think (but am not 100% sure because I don't have an internal
decoder that knows about this specific CPU model) that the error
was a write-back to MMIO (this matches other cases where we've seen
BIOS trigger some error and leave the logs for Linux to find at boot).

It's not quite the same because the address logged for you is
160000080, where the previous cases had addresses below 4GB. But
some platforms include MMIO above 4GB, so this is still plausible.

Advice we have given before is to attempt to log a bug against the
BIOS with the vendor of your system. But the last person to try this
reported no success.

Or, you could ignore it. It appears to not have any side effects.

-Tony

(*) Can you send a snip from the raw dmesg output that starts a
couple of lines before:

... [Hardware Error]: CPU 0: Machine Check: 0 Bank: 5 ...

and continues a couple of lines past

... [Hardware Error]: PROCESSOR 0:306d4 ...

and I'll take a look at why mcelog choked.
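
For anyone who wants to reproduce the by-hand decode of the status
value in this message, here is a minimal sketch. It is not mcelog's
decoder. It assumes the architectural IA32_MCi_STATUS bit layout and
the generic "cache hierarchy" compound error code format
(000F 0001 RRRR TTLL); the function and table names are made up for
illustration. It reports the raw LL field as a number, and it also
lists bit 57 (PCC), which the list of upper bits in the message does
not mention, so treat that as something to check against the CPU
documentation rather than as a correction. The address is assumed to
be hexadecimal.

```python
# Hypothetical helper names; bit positions assume the architectural
# IA32_MCi_STATUS layout (bits 63..57 are status flags).
STATUS_FLAGS = [
    (63, "VALID"),
    (62, "OVER"),
    (61, "UC"),
    (60, "EN"),
    (59, "MISCV"),
    (58, "ADDRV"),
    (57, "PCC"),
]

# RRRR (request) and TT (transaction type) encodings of the
# cache hierarchy compound error code.
REQUESTS = ["ERR", "RD", "WR", "DRD", "DWR", "IRD",
            "PREFETCH", "EVICT", "SNOOP"]
TYPES = ["INSTRUCTION", "DATA", "GENERIC"]


def decode_mci_status(status):
    """Split a 64-bit machine check bank status value into fields."""
    flags = [name for bit, name in STATUS_FLAGS if (status >> bit) & 1]
    mcacod = status & 0xFFFF          # bits 15:0, MCA error code
    mscod = (status >> 16) & 0xFFFF   # bits 31:16, model-specific code
    result = {"flags": flags, "mcacod": mcacod, "mscod": mscod}

    # Cache hierarchy errors look like 000F 0001 RRRR TTLL.
    # Bit 12 (F, the filter bit) is ignored when matching the pattern.
    if (mcacod & 0xEF00) == 0x0100:
        rrrr = (mcacod >> 4) & 0xF
        tt = (mcacod >> 2) & 0x3
        result["cache"] = {
            "filter": bool(mcacod & 0x1000),
            "request": REQUESTS[rrrr] if rrrr < len(REQUESTS) else "RESERVED",
            "type": TYPES[tt] if tt < len(TYPES) else "RESERVED",
            "level": mcacod & 0x3,    # raw LL field value
        }
    return result


if __name__ == "__main__":
    decoded = decode_mci_status(0xEE0000000040110B)
    print(decoded["flags"])
    print(decoded["cache"])

    # Assumption: the logged address 160000080 is hexadecimal.
    addr = 0x160000080
    print("above 4GB:", addr >= (1 << 32))
```

Running it on ee0000000040110b gives the FILTER, ERR, GENERIC and
level 3 fields from the low 16 bits, and shows that 160000080 is
above the 4GB boundary.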