From: Andy Lutomirski
Date: Tue, 13 Jun 2017 08:54:08 -0700
Subject: Re: [PATCH v2] X86: don't report PAT on CPUs that don't support it
To: Mikulas Patocka
Cc: Andy Lutomirski, Bernhard Held, Toshi Kani, Borislav Petkov,
    Andrew Morton, Brian Gerst, Linus Torvalds, "H. Peter Anvin",
    Ingo Molnar, Thomas Gleixner, Peter Zijlstra, "Luis R. Rodriguez",
    Denys Vlasenko, Josh Poimboeuf, "linux-kernel@vger.kernel.org"
References: <3d69fb9d-651a-8266-8e00-789fedd74659@gmx.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jun 6, 2017 at 4:21 PM, Mikulas Patocka wrote:
>
>
> On Tue, 6 Jun 2017, Andy Lutomirski wrote:
>
>> On Tue, Jun 6, 2017 at 3:49 PM, Mikulas Patocka wrote:
>> >
>> >
>> > On Sun, 28 May 2017, Andy Lutomirski wrote:
>> >
>> >> On Sun, May 28, 2017 at 11:18 AM, Bernhard Held wrote:
>> >> > Hi,
>> >> >
>> >> > this patch breaks the boot of my kernel. The last message is "Booting
>> >> > the kernel.".
>> >> >
>> >> > My setup might be unusual: I'm running a Xeon E5450 (LGA 771) in a
>> >> > Gigabyte G33-DS3R board (LGA 775). The BIOS is patched with the
>> >> > microcode of the E5450 and recognizes the CPU.
>> >> >
>> >> > Please find below the dmesg of the latest kernel w/o the PAT-patch.
>> >> > I'm happy to provide more information or to test patches.
>> >>
>> >> I think this patch is bogus.  pat_enabled() sure looks like it's
>> >> supposed to return true if PAT is *enabled*, and these days PAT is
>> >> "enabled" even if there's no HW PAT support.  Even if the patch were
>> >> somehow correct, it should have been split up into two patches, one to
>> >> change pat_enabled() and one to use this_cpu_has().
>> >>
>> >> Ingo, I'd suggest reverting the patch, cc-ing stable on the revert so
>> >> -stable knows not to backport it, and starting over with the fix.
>> >> From very brief inspection, the right fix is to make sure that
>> >> pat_init(), or at least init_cache_modes(), gets called on the
>> >> affected CPUs.
>> >>
>> >> --Andy
>> >
>> > Hi
>> >
>> > Here I send the second version of the patch. It drops the change from
>> > boot_cpu_has(X86_FEATURE_PAT) to this_cpu_has(X86_FEATURE_PAT) (that
>> > caused kernel to be unbootable for some people).
>> >
>> > Another change is that setup_arch() calls init_cache_modes() if PAT is
>> > disabled, so that init_cache_modes() is always called.
>> >
>> > Mikulas
>> >
>> >
>> >
>> > From: Mikulas Patocka
>> >
>> > In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The
>> > variable is set to 1 by default and the function pat_init() sets
>> > __pat_enabled to 0 if the CPU doesn't support PAT.
>> >
>> > However, on AMD K6-3 CPU, the processor initialization code never calls
>> > pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
>> > returns true, even though the K6-3 CPU doesn't support PAT.
>> >
>> > The result of this bug is that this warning is produced when attempting to
>> > start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
>> > Another symptom of this bug is that the framebuffer driver doesn't set the
>> > K6-3 MTRR registers.
>> >
>> > This patch changes pat_enabled() so that it returns true only if pat
>> > initialization was actually done.
>>
>> Why?  Shouldn't calling init_cache_modes() be sufficient?
>>
>> --Andy
>
> See the function arch_phys_wc_add():
>
>         if (pat_enabled() || !mtrr_enabled())
>                 return 0;  /* Success!  (We don't need to do anything.) */
>         ret = mtrr_add(base, size, MTRR_TYPE_WRCOMB, true);
>
> - if pat_enabled() returns true, that function doesn't set MTRRs.
> pat_enabled() must return false on systems without PAT, so that MTRRs are
> set.

It still sounds to me like there are two bugs here that should be
treated separately.

Bug 1: A warning fires.  Have you figured out why the warning fires?

Bug 2: arch_phys_wc_add() appears to be checking the wrong condition.
How about checking the right condition?  It doesn't actually want to
know if PAT is enabled -- it wants to know if the PAT contains a
usable WC entry.  Something like pat_has_wc() would be better, I
think.

--Andy
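
To make the Bug 2 suggestion concrete, here is a rough user-space
sketch of the two conditions. It is not kernel code and not a patch:
struct cache_state, its fields, and the needs_mtrr_*() helpers are
made up for illustration, and pat_has_wc is only the name floated
above, not an existing interface. It just models the decision that
arch_phys_wc_add() makes before calling mtrr_add().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state the check consults (not a kernel type). */
struct cache_state {
	bool pat_enabled;   /* what pat_enabled() would report */
	bool pat_has_wc;    /* assumed: PAT table holds a usable WC entry */
	bool mtrr_enabled;  /* what mtrr_enabled() would report */
};

/* Check as quoted in the thread: skip the MTRR whenever PAT is "enabled". */
bool needs_mtrr_current(const struct cache_state *s)
{
	return !(s->pat_enabled || !s->mtrr_enabled);
}

/* Suggested check: skip the MTRR only if PAT can really provide WC. */
bool needs_mtrr_proposed(const struct cache_state *s)
{
	return !(s->pat_has_wc || !s->mtrr_enabled);
}

/* Walks through the K6-3 case described in the quoted patch text. */
void model_selfcheck(void)
{
	/* PAT reported as enabled, but no usable WC entry; MTRRs work. */
	struct cache_state k6_3 = {
		.pat_enabled = true,
		.pat_has_wc = false,
		.mtrr_enabled = true,
	};

	assert(!needs_mtrr_current(&k6_3));  /* MTRR is skipped: the bug */
	assert(needs_mtrr_proposed(&k6_3));  /* MTRR would be programmed */
}
```

With a check along these lines, pat_enabled() could keep its current
meaning and only the WC decision would change, which is why the two
bugs look separable to me.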