Date: Thu, 7 Feb 2019 18:57:20 +0100
From: Peter Zijlstra
To: "Luck, Tony"
Cc: Linus Torvalds, Dan Williams, Ingo Molnar, Linux List Kernel Mailing, Dave Hansen, Andy Lutomirski, Borislav Petkov, Thomas Gleixner, Rik van Riel
Subject: Re: [GIT PULL] x86/mm changes for v4.21
Message-ID: <20190207175720.GE32511@hirez.programming.kicks-ass.net>
References: <20181224231106.GA27438@gmail.com> <20190207001737.GA32096@agluck-desk> <20190207101846.GB32511@hirez.programming.kicks-ass.net> <20190207140131.GB32477@hirez.programming.kicks-ass.net> <20190207173600.GA15682@agluck-desk>
In-Reply-To: <20190207173600.GA15682@agluck-desk>

On Thu, Feb 07, 2019 at 09:36:00AM -0800, Luck, Tony wrote:
> On Thu, Feb 07, 2019 at 03:01:31PM +0100, Peter Zijlstra wrote:
> > On Thu, Feb 07, 2019 at 11:50:52AM +0000, Linus Torvalds wrote:
> > > If you re-generate the canonical address in __cpa_addr(), now we'll
> > > actually have the real virtual address around for a lot of code-paths
> > > (pte lookup etc), which was what people wanted to avoid in the first
> > > place.
> >
> > Note that it's an 'unsigned long' address, not an actual pointer, and
> > (afaict) none of the code paths use it as a pointer. This _should_ avoid
> > the CPU from following said pointer and doing a deref on it.
>
> The type doesn't matter. You want to avoid having the
> true value in the register as long as possible. Ideal
> spot would be the instruction before the TLB is flushed.
>
> The speculative issue is that any branch you encounter
> while you have the address in a register may be mispredicted.
> You might also get a bogus hit in the branch target cache
> and speculatively jump into the weeds. While there you
> could find an instruction that loads using that register, and
> even though it is speculative and the instruction won't
> retire, a machine check log will be created in a bank (no
> machine check is signalled).
>
> Once the TLB is updated, you are safe. A speculative
> access to an uncached address will not load or log anything.

Something like so then? AFAICT CLFLUSH will also #GP if you feed it crap.


diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 4f8972311a77..d3ae92ad72a6 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -230,6 +230,28 @@ static bool __cpa_pfn_in_highmap(unsigned long pfn)
 
 #endif
 
+/*
+ * Machine check recovery code needs to change cache mode of poisoned
+ * pages to UC to avoid speculative access logging another error. But
+ * passing the address of the 1:1 mapping to set_memory_uc() is a fine
+ * way to encourage a speculative access. So we cheat and flip the top
+ * bit of the address. This works fine for the code that updates the
+ * page tables. But at the end of the process we need to flush the cache
+ * and the non-canonical address causes a #GP fault when used by the
+ * CLFLUSH instruction.
+ *
+ * But in the common case we already have a canonical address. This code
+ * will fix the top bit if needed and is a no-op otherwise.
+ */
+static inline unsigned long fix_addr(unsigned long addr)
+{
+#ifdef CONFIG_X86_64
+	return (long)(addr << 1) >> 1;
+#else
+	return addr;
+#endif
+}
+
 static unsigned long __cpa_addr(struct cpa_data *cpa, unsigned long idx)
 {
 	if (cpa->flags & CPA_PAGES_ARRAY) {
@@ -313,7 +335,7 @@ void __cpa_flush_tlb(void *data)
 	unsigned int i;
 
 	for (i = 0; i < cpa->numpages; i++)
-		__flush_tlb_one_kernel(__cpa_addr(cpa, i));
+		__flush_tlb_one_kernel(fix_addr(__cpa_addr(cpa, i)));
 }
 
 static void cpa_flush(struct cpa_data *data, int cache)
@@ -347,7 +369,7 @@ static void cpa_flush(struct cpa_data *data, int cache)
 		 * Only flush present addresses:
 		 */
 		if (pte && (pte_val(*pte) & _PAGE_PRESENT))
-			clflush_cache_range_opt((void *)addr, PAGE_SIZE);
+			clflush_cache_range_opt((void *)fix_addr(addr), PAGE_SIZE);
 	}
 	mb();
 }
@@ -1627,29 +1649,6 @@ static int __change_page_attr_set_clr(struct cpa_data *cpa, int checkalias)
 	return ret;
 }
 
-/*
- * Machine check recovery code needs to change cache mode of poisoned
- * pages to UC to avoid speculative access logging another error. But
- * passing the address of the 1:1 mapping to set_memory_uc() is a fine
- * way to encourage a speculative access. So we cheat and flip the top
- * bit of the address. This works fine for the code that updates the
- * page tables. But at the end of the process we need to flush the cache
- * and the non-canonical address causes a #GP fault when used by the
- * CLFLUSH instruction.
- *
- * But in the common case we already have a canonical address. This code
- * will fix the top bit if needed and is a no-op otherwise.
- */
-static inline unsigned long make_addr_canonical_again(unsigned long addr)
-{
-#ifdef CONFIG_X86_64
-	return (long)(addr << 1) >> 1;
-#else
-	return addr;
-#endif
-}
-
-
 static int change_page_attr_set_clr(unsigned long *addr, int numpages,
 				    pgprot_t mask_set, pgprot_t mask_clr,
 				    int force_split, int in_flag,
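
For anyone who wants to convince themselves of the shift trick in fix_addr(), here is a minimal user-space sketch; it is not kernel code. It assumes a 64-bit address and a compiler that sign-extends on right shift of a signed value (GCC and Clang do; the C standard leaves this implementation-defined). hide_addr() and check_roundtrip() are hypothetical names made up for the illustration: hide_addr() models the "flip the top bit" step described in the comment, and fix_addr() mirrors the CONFIG_X86_64 branch of the patch using fixed-width types.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): model the caller "cheating"
 * by flipping bit 63, which turns a canonical address into a
 * non-canonical one.
 */
static inline uint64_t hide_addr(uint64_t addr)
{
	return addr ^ (UINT64_C(1) << 63);
}

/*
 * User-space stand-in for the patch's fix_addr() on x86-64.
 * Shifting left by one discards bit 63; the arithmetic right shift
 * then copies the old bit 62 back into bit 63. For a canonical
 * address bits 62 and 63 are equal, so this restores the original
 * and leaves an already-canonical address unchanged.
 * Assumption: signed right shift is arithmetic (true for GCC/Clang).
 */
static inline uint64_t fix_addr(uint64_t addr)
{
	return (uint64_t)((int64_t)(addr << 1) >> 1);
}

/* Hypothetical self-check: no-op on canonical input, undoes hide_addr(). */
static inline void check_roundtrip(uint64_t canonical)
{
	assert(fix_addr(canonical) == canonical);
	assert(fix_addr(hide_addr(canonical)) == canonical);
}
```

Calling check_roundtrip() with a direct-map style address such as 0xffff888000001000 or a low-half address such as 0x00007fff00001000 should pass both assertions, since in each case bit 62 already carries the value bit 63 is supposed to have.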