Date: Sun, 14 Jun 2015 15:54:06 +0800
Subject: Re: lockup when C1E and high-resolution timers enabled
From: Daniel J Blueman
To: Christoph Fritz
Cc: Linux Kernel
In-Reply-To: <1434256796.1699.5.camel@googlemail.com>

On 14 June 2015 at 12:39, Christoph Fritz wrote:
> On Sun, 2015-06-14 at 11:13 +0800, Daniel J Blueman wrote:
>> On Sunday, June 14, 2015 at 4:00:06 AM UTC+8, Christoph Fritz wrote:
>> > Hi,
>> >
>> > on following computer configuration, I do get hard lockup under heavy
>> > IO-Load (using rsync):
>> >
>> >  - CONFIG_HIGH_RES_TIMERS=y
>> >  - CPU: AMD FX(tm)-8350 Eight-Core Processor (family 0x15 model 0x2)
>> >  - Motherboard: 'GA-970A-UD3P (rev. 1.0)' AMD 970/SB950
>> >  - BIOS: C1E enabled (on 'GA-970A-UD3P' there is no disable option)
>> >  - Kernels: 4.1.0-rc6, 4.0.x, 3.16.x
>> >
>> > Tests:
>> >  - add kernel parameter "idle=halt" -> system runs fine
>> >  - disable CONFIG_HIGH_RES_TIMERS -> system runs fine
>> >  - change motherboard and disable C1E -> system runs fine
>> >  - change CPU to AMD Phenom II X6 Processor -> system runs fine
>> [..]
>>
>> C1E disconnects HyperTransport links when all cores enter C1 (halt)
>> for a period of time; this is all at the platform level, so isn't due
>> to the kernel. The AMD AGESA code which controls the setup of this
>> mechanism is updated in the F2g BIOS:
>> http://www.gigabyte.com/products/product-page.aspx?pid=4717#bios
>>
>> Did you try both BIOS releases with defaults?
>
> Yes, rechecked both versions: Same bad behaviour.
>
>> If still issues, also try with the current family 10h microcode from
>> http://www.amd64.org/microcode/amd-ucode-latest.tar.bz2
>
> Don't you mean family 15h for 'AMD FX(tm)-8350' ?
>
> already using latest microcode:

As a workaround, you can probably just disable message triggered C1E
(see the BKDG p399 [1]):

val=0x$(setpci -s 00:18.4 0xd4.l) # read D18F3xD4
val=$((val &~(1 << 13))) # clear bit13 (MTC1eEn)
setpci -d 1022:1604 0xd4.l=$(printf %x $val) # write back

The chipset setup and behaviour is quite complex, so it's likely
Gigabyte haven't done their homework. The alternative is coreboot of
course.

Thanks,
  Daniel

[1] http://support.amd.com/TechDocs/42301_15h_Mod_00h-0Fh_BKDG.pdf
-- 
Daniel J Blueman
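
The bit arithmetic in the workaround can be tried without touching any hardware. The following is only a sketch: `clear_bit` is a hypothetical helper, not part of setpci, and the register values are invented for illustration rather than read from a real D18F3xD4. It takes a hex string without the 0x prefix, as setpci prints it, and returns the same value with one bit cleared.

```shell
#!/bin/sh
# clear_bit HEX BIT
# Hypothetical helper: print HEX (no 0x prefix) with bit number BIT
# cleared, as 8 hex digits. Mirrors the read-modify-write step of the
# workaround, minus the actual setpci read and write.
clear_bit() {
    val=$((0x$1))                            # parse the hex string
    printf '%08x\n' $((val & ~(1 << $2)))    # mask out the requested bit
}

# Made-up register contents, for illustration only.
clear_bit 0000a040 13    # bit 13 (0x2000) is set here and gets cleared
clear_bit 00008040 13    # bit 13 already clear, value is unchanged
```

Both calls print 00008040. On a real system the input would come from the setpci read shown in the message, and the result would be written back the same way; that needs root and does not persist across a reboot.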