Received: by 2002:a25:c205:0:0:0:0:0 with SMTP id s5csp4792608ybf; Wed, 4 Mar 2020 10:45:46 -0800 (PST) X-Google-Smtp-Source: ADFU+vs8nNm31Po+KH6/K2ErDKVuGuOuPEykSk8j4WQkBRUdJFxQmzpBo/jU92gPUmkaYiwzRGu7 X-Received: by 2002:a05:6808:8f1:: with SMTP id d17mr2698653oic.105.1583347546644; Wed, 04 Mar 2020 10:45:46 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1583347546; cv=none; d=google.com; s=arc-20160816; b=vTxc53OhC/+yFlPQudTvAcF1MdUTmEy2ouKPErWiXivMh4YeTzC6KpAfV1gkRe78/j Em3ZJdsdhW/G5Di6jFMSK5UiPddwS4j6z4HHQtluPhc2GAO03RPWd6K2wwCD1kB2369t FG8M36faX7jlcxaWYpm/mrB6zlxd1EBn6A1Z6pW2u6uqMdb6H/+BUksajFB+aBgmMWNQ 9vOQ7RhlCBMsBbrTOzWxVoxOH1Q+vxN08uZJ/II9YlrVB7QpRWiC/nceRiPjrb7DUXMR DVB+FIdL14SIH0onDfk1rQrTeAUB6+N4JGN27pjoyg/+HP8+b9tZcJm45Su+4cP/egDt XgOA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=anA/2eYKhVa2xKIeZ2AUgPUCtD2qrV+ElGZi84rkKmY=; b=coEl7aEWVjrlO6we7I/Tjb0VJ6Ui45RBtmxRnpQhaUEu1BE0dU/gBDfcd1RnOFjHIT QFrjSOfB7WeArB5UE3CUzHKb7LOXmP1ZR2A2M46pfnc94RkpoftNeVjwXvRHDmvgPA3M JIyvvAZh3I5fz95fQM25e1mWpoHOB/SnBUayBpis60yZQ5O3BxROLyClGCaN3cVCHk20 K5Ojka6sPDfY//eLhUvJyw8awOkNHUDcxVq5JZ1ma5VMvXGvEdGBvnhHcsrK6d2AMCHs rc4iWyErsjmWlkg1J7M+aXXy4NxInzcIC+gpth6edBW5FKCjJSelby9Sq7RIXEm3m9Fb gZCg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=evFc4Gy2; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id c7si1751838oto.105.2020.03.04.10.45.34; Wed, 04 Mar 2020 10:45:46 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=evFc4Gy2; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388117AbgCDSpE (ORCPT + 99 others); Wed, 4 Mar 2020 13:45:04 -0500 Received: from mail.kernel.org ([198.145.29.99]:34620 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730021AbgCDSpE (ORCPT ); Wed, 4 Mar 2020 13:45:04 -0500 Received: from mail-wr1-f41.google.com (mail-wr1-f41.google.com [209.85.221.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 422F624679 for ; Wed, 4 Mar 2020 18:45:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1583347503; bh=Xrw/0ZXS8o5llP7/hWRgBB1gg32aOCivmOH4N2PPq28=; h=References:In-Reply-To:From:Date:Subject:To:Cc:From; b=evFc4Gy2bN812x4sk2r8Pn9zAtdOD5nizsdGRFGnPPQJjIT3Giw+WApSBY2sg4RQj iTqf7tJ5u5IGF0EXEAMZNb1VAIz5xMM6HzAydQcZPNoCREgPFvBNUAcWYJTqD+wqzD sux5tvloTufAOp3GDV77PXiGSd4cbGPBEygtj4qA= Received: by mail-wr1-f41.google.com with SMTP id t11so3753782wrw.5 for ; Wed, 04 Mar 2020 10:45:03 -0800 (PST) X-Gm-Message-State: ANhLgQ2GOQ+urtkrV42gSkYaPD2ro2mNlG4c7R62xufKuOns6vNshCc8 7dxriseACfJUVFhChUomCsmkxJ89E7c5OuduJc2/EA== X-Received: by 2002:a05:6000:110b:: with SMTP id z11mr5480355wrw.252.1583347501553; Wed, 04 Mar 2020 10:45:01 -0800 (PST) MIME-Version: 1.0 References: <20200303205445.3965393-1-nivedita@alum.mit.edu> <20200303205445.3965393-2-nivedita@alum.mit.edu> <20200304154908.GB998825@rani.riverdale.lan> In-Reply-To: <20200304154908.GB998825@rani.riverdale.lan> From: Ard Biesheuvel Date: Wed, 4 Mar 2020 19:44:50 +0100 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH 1/4] x86/mm/pat: Handle no-GBPAGES case correctly in populate_pud To: Arvind Sankar Cc: Dave Hansen , Andy Lutomirski , Peter Zijlstra , Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H. Peter Anvin" , "the arch/x86 maintainers" , linux-efi , Linux Kernel Mailing List Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, 4 Mar 2020 at 16:49, Arvind Sankar wrote: > > On Wed, Mar 04, 2020 at 09:17:44AM +0100, Ard Biesheuvel wrote: > > On Tue, 3 Mar 2020 at 21:54, Arvind Sankar wrote: > > > > > > Commit d367cef0a7f0 ("x86/mm/pat: Fix boot crash when 1GB pages are not > > > supported by the CPU") added checking for CPU support for 1G pages > > > before using them. > > > > > > However, when support is not present, nothing is done to map the > > > intermediate 1G regions and we go directly to the code that normally > > > maps the remainder after 1G mappings have been done. This code can only > > > handle mappings that fit inside a single PUD entry, but there is no > > > check, and it instead silently produces a corrupted mapping to the end > > > of the PUD entry, and no mapping beyond it, but still returns success. > > > > > > This bug is encountered on EFI machines in mixed mode (32-bit firmware > > > with 64-bit kernel), with RAM beyond 2G. The EFI support code > > > direct-maps all the RAM, so a memory range from below 1G to above 2G > > > triggers the bug and results in no mapping above 2G, and an incorrect > > > mapping in the 1G-2G range. If the kernel resides in the 1G-2G range, a > > > firmware call does not return correctly, and if it resides above 2G, we > > > end up passing addresses that are not mapped in the EFI pagetable. > > > > > > Fix this by mapping the 1G regions using 2M pages when 1G page support > > > is not available. > > > > > > Signed-off-by: Arvind Sankar > > > > I was trying to test these patches, and while they seem fine from a > > regression point of view, I can't seem to reproduce this issue and > > make it go away again by applying this patch. > > > > Do you have any detailed instructions how to reproduce this? > > > > The steps I'm following are > - build x86_64 defconfig + enable EFI_PGT_DUMP (to show the incorrect > pagetable) > - run (QEMU is 4.2.0) > $ qemu-system-x86_64 -cpu Haswell -pflash qemu/OVMF_32.fd -m 3072 -nographic \ > -kernel kernel64/arch/x86/boot/bzImage -append "earlyprintk=ttyS0,keep efi=debug nokaslr" > > The EFI memory map I get is (abbreviated to regions of interest): > ... > [ 0.253991] efi: mem10: [Conventional Memory| | | | | | | | | |WB|WT|WC|UC] range=[0x00000000053e7000-0x000000003fffbfff] (940MB) > [ 0.254424] efi: mem11: [Loader Data | | | | | | | | | |WB|WT|WC|UC] range=[0x000000003fffc000-0x000000003fffffff] (0MB) > [ 0.254991] efi: mem12: [Conventional Memory| | | | | | | | | |WB|WT|WC|UC] range=[0x0000000040000000-0x00000000bbf77fff] (1983MB) > ... > > The pagetable this produces is (abbreviated again): > ... > [ 0.272980] 0x0000000003400000-0x0000000004800000 20M ro PSE x pmd > [ 0.273327] 0x0000000004800000-0x0000000005200000 10M RW PSE NX pmd > [ 0.273987] 0x0000000005200000-0x0000000005400000 2M RW NX pte > [ 0.274343] 0x0000000005400000-0x000000003fe00000 938M RW PSE NX pmd > [ 0.274725] 0x000000003fe00000-0x0000000040000000 2M RW NX pte > [ 0.275066] 0x0000000040000000-0x0000000080000000 1G RW PSE NX pmd > [ 0.275437] 0x0000000080000000-0x00000000bbe00000 958M pmd > ... > > Note how 0x80000000-0xbbe00000 range is unmapped in the resulting > pagetable. The dump doesn't show physical addresses, but the > 0x40000000-0x80000000 range is incorrectly mapped as well, as the loop > in populate_pmd would just go over that virtual address range twice. > > while (end - start >= PMD_SIZE) { > ... > pmd = pmd_offset(pud, start); > > set_pmd(pmd, pmd_mkhuge(pfn_pmd(cpa->pfn, > canon_pgprot(pmd_pgprot)))); > > start += PMD_SIZE; > cpa->pfn += PMD_SIZE >> PAGE_SHIFT; > cur_pages += PMD_SIZE >> PAGE_SHIFT; > } I've tried a couple of different ways, but I can't seem to get my memory map organized in the way that will trigger the error.