From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Linus Torvalds, Andi Kleen, Thomas Gleixner, Josh Poimboeuf, Michal Hocko, Vlastimil Babka, Dave Hansen, David Woodhouse, Guenter Roeck
Subject: [PATCH 4.4 25/43] x86/speculation/l1tf: Change order of offset/type in swap entry
Date: Tue, 14 Aug 2018 19:18:01 +0200
Message-Id: <20180814171518.782256517@linuxfoundation.org>
X-Mailer: git-send-email 2.18.0
In-Reply-To: <20180814171517.014285600@linuxfoundation.org>
References: <20180814171517.014285600@linuxfoundation.org>
User-Agent: quilt/0.65
X-stable: review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Linus Torvalds

commit bcd11afa7adad8d720e7ba5ef58bdcd9775cf45f upstream

If pages are swapped out, the swap entry is stored in the corresponding
PTE, which has the Present bit cleared. CPUs vulnerable to L1TF speculate
on PTE entries which have the Present bit cleared and would treat the swap
entry as a physical address (PFN). To mitigate that, the upper bits of the
PTE must be set so the PTE points to non-existent memory.

The swap entry stores the type and the offset of a swapped out page in the
PTE. type is stored in bits 9-13 and offset in bits 14-63. The hardware
ignores the bits beyond the physical address space limit, so to make the
mitigation effective it is required to start 'offset' at the lowest
possible bit so that even large swap offsets do not reach into the
physical address space limit bits.

Move offset to bits 9-58 and type to bits 59-63, which are the bits that
hardware generally doesn't care about.

That, in turn, means that if you are on a desktop chip with only 40 bits
of physical addressing, now that the offset starts at bit 9, there need to
be 30 bits of offset actually *in use* until bit 39 ends up being set,
which means when inverted it will again point into existing memory.

So that's 4 terabytes of swap space (because the offset is counted in
pages, so 30 bits of offset is 42 bits of actual coverage). With bigger
physical addressing, that obviously grows further, until the limit of the
offset is hit (at 50 bits of offset - 62 bits of actual swap file
coverage).

This is a preparatory change for the actual swap entry inversion to
protect against L1TF.

[ AK: Updated description and minor tweaks.
  Split into two parts ]
[ tglx: Massaged changelog ]

Signed-off-by: Linus Torvalds
Signed-off-by: Andi Kleen
Signed-off-by: Thomas Gleixner
Tested-by: Andi Kleen
Reviewed-by: Josh Poimboeuf
Acked-by: Michal Hocko
Acked-by: Vlastimil Babka
Acked-by: Dave Hansen
Signed-off-by: David Woodhouse
Signed-off-by: Guenter Roeck
Signed-off-by: Greg Kroah-Hartman
---
 arch/x86/include/asm/pgtable_64.h |   31 ++++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 11 deletions(-)

--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -168,7 +168,7 @@ static inline int pgd_large(pgd_t pgd) {
  *
  * |     ...            | 11| 10|  9|8|7|6|5| 4| 3|2| 1|0| <- bit number
  * |     ...            |SW3|SW2|SW1|G|L|D|A|CD|WT|U| W|P| <- bit names
- * | OFFSET (14->63) | TYPE (9-13)  |0|0|X|X| X| X|X|SD|0| <- swp entry
+ * | TYPE (59-63) |  OFFSET (9-58)  |0|0|X|X| X| X|X|SD|0| <- swp entry
  *
  * G (8) is aliased and used as a PROT_NONE indicator for
  * !present ptes.  We need to start storing swap entries above
@@ -182,19 +182,28 @@ static inline int pgd_large(pgd_t pgd) {
  * Bit 7 in swp entry should be 0 because pmd_present checks not only P,
  * but also L and G.
  */
-#define SWP_TYPE_FIRST_BIT (_PAGE_BIT_PROTNONE + 1)
-#define SWP_TYPE_BITS 5
-/* Place the offset above the type: */
-#define SWP_OFFSET_FIRST_BIT (SWP_TYPE_FIRST_BIT + SWP_TYPE_BITS)
+#define SWP_TYPE_BITS		5
+
+#define SWP_OFFSET_FIRST_BIT	(_PAGE_BIT_PROTNONE + 1)
+
+/* We always extract/encode the offset by shifting it all the way up, and then down again */
+#define SWP_OFFSET_SHIFT	(SWP_OFFSET_FIRST_BIT+SWP_TYPE_BITS)
 
 #define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS)
 
-#define __swp_type(x)			(((x).val >> (SWP_TYPE_FIRST_BIT)) \
-					 & ((1U << SWP_TYPE_BITS) - 1))
-#define __swp_offset(x)			((x).val >> SWP_OFFSET_FIRST_BIT)
-#define __swp_entry(type, offset)	((swp_entry_t) { \
-					 ((type) << (SWP_TYPE_FIRST_BIT)) \
-					 | ((offset) << SWP_OFFSET_FIRST_BIT) })
+/* Extract the high bits for type */
+#define __swp_type(x) ((x).val >> (64 - SWP_TYPE_BITS))
+
+/* Shift up (to get rid of type), then down to get value */
+#define __swp_offset(x) ((x).val << SWP_TYPE_BITS >> SWP_OFFSET_SHIFT)
+
+/*
+ * Shift the offset up "too far" by TYPE bits, then down again
+ */
+#define __swp_entry(type, offset) ((swp_entry_t) { \
+	((unsigned long)(offset) << SWP_OFFSET_SHIFT >> SWP_TYPE_BITS) \
+	| ((unsigned long)(type) << (64-SWP_TYPE_BITS)) })
+
 #define __pte_to_swp_entry(pte)		((swp_entry_t) { pte_val((pte)) })
 #define __swp_entry_to_pte(x)		((pte_t) { .pte = (x).val })
 
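
For reviewers who want to check the new layout and the changelog arithmetic outside the kernel, the following is a minimal userspace sketch of the encode/decode macros from this patch. It is not kernel code: every name ending in _sketch or _SKETCH is hypothetical, uint64_t stands in for the 64-bit unsigned long, and the values 8 for _PAGE_BIT_PROTNONE and 12 for the page shift are assumptions taken from the bit diagram above and from 4 KiB pages.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-ins for kernel constants (hypothetical names). */
#define PAGE_BIT_PROTNONE_SKETCH 8   /* assumed: PROT_NONE aliases G, bit 8 */
#define PAGE_SHIFT_SKETCH        12  /* assumed: 4 KiB pages */

/* Same values and relations as in the patch. */
#define SWP_TYPE_BITS        5
#define SWP_OFFSET_FIRST_BIT (PAGE_BIT_PROTNONE_SKETCH + 1)           /* 9 */
#define SWP_OFFSET_SHIFT     (SWP_OFFSET_FIRST_BIT + SWP_TYPE_BITS)   /* 14 */

typedef struct { uint64_t val; } swp_entry_sketch_t;

/* Shift the offset up "too far" by TYPE bits, then down again, so it
 * lands in bits 9-58; the type goes into bits 59-63. */
static inline swp_entry_sketch_t swp_entry_sketch(uint64_t type, uint64_t offset)
{
	swp_entry_sketch_t e;

	e.val = (offset << SWP_OFFSET_SHIFT >> SWP_TYPE_BITS)
	      | (type << (64 - SWP_TYPE_BITS));
	return e;
}

/* Extract the high bits for type. */
static inline uint64_t swp_type_sketch(swp_entry_sketch_t e)
{
	return e.val >> (64 - SWP_TYPE_BITS);
}

/* Shift up to drop the type, then down to recover the offset. */
static inline uint64_t swp_offset_sketch(swp_entry_sketch_t e)
{
	return e.val << SWP_TYPE_BITS >> SWP_OFFSET_SHIFT;
}

/* Round-trip and changelog arithmetic checks. */
static void swp_sketch_selfcheck(void)
{
	swp_entry_sketch_t e = swp_entry_sketch(3, 0x12345);

	assert(swp_type_sketch(e) == 3);
	assert(swp_offset_sketch(e) == 0x12345);
	assert((e.val & 0x1ff) == 0);           /* bits 0-8 stay clear */

	/* Offsets below 2^30 pages never reach bit 39 ... */
	assert((swp_entry_sketch(0, (1ULL << 30) - 1).val >> 39) == 0);
	/* ... and offset 2^30 is the first one that sets it. */
	assert(swp_entry_sketch(0, 1ULL << 30).val == (1ULL << 39));
	/* 2^30 pages of 4 KiB each are 2^42 bytes, i.e. 4 terabytes. */
	assert(((1ULL << 30) << PAGE_SHIFT_SKETCH) == (1ULL << 42));
}
```

Calling swp_sketch_selfcheck() from a small test program exercises the round trip and the 40-bit physical address example from the changelog. The sketch deliberately does not model the PTE inversion, which this patch only prepares for.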