Subject: Re: [RFC PATCH 1/2] x86/ibpb: Skip IBPB when we switch back to same user process
From: Tim Chen
Date: Thu, 25 Jan 2018 11:32:46 -0800
To: Peter Zijlstra, Andy Lutomirski
Cc: Arjan van de Ven, LKML, KarimAllah Ahmed, Andi Kleen, Andrea Arcangeli,
    Ashok Raj, Asit Mallick, Borislav Petkov, Dan Williams, Dave Hansen,
    David Woodhouse, Greg Kroah-Hartman, "H . Peter Anvin", Ingo Molnar,
    Janakarajan Natarajan, Joerg Roedel, Jun Nakajima, Laura Abbott,
    Linus Torvalds, Masami Hiramatsu, Paolo Bonzini, Radim Krcmar,
    Thomas Gleixner, Tom Lendacky, X86 ML
References: <20180125085820.GV2228@hirez.programming.kicks-ass.net>
    <20180125092233.GE2295@hirez.programming.kicks-ass.net>
    <86541aca-8de7-163d-b620-083dddf29184@linux.intel.com>
    <20180125135055.GK2249@hirez.programming.kicks-ass.net>
    <20180125164139.GM2269@hirez.programming.kicks-ass.net>
    <20180125181852.GL2249@hirez.programming.kicks-ass.net>
In-Reply-To: <20180125181852.GL2249@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/25/2018 10:18 AM, Peter Zijlstra wrote:
> On Thu, Jan 25, 2018 at 09:04:21AM -0800, Andy Lutomirski wrote:
>> I haven't tried to fully decipher the patch, but I think the idea is
>> wrong. (I think it's the same wrong idea that Rik and I both had and
>> that I got into Linus' tree for a while...) The problem is that it's
>> not actually correct to run indefinitely in kernel mode using stale
>> cached page table data. The stale PTEs themselves are fine, but the
>> stale intermediate translations can cause the CPU to speculatively
>> load complete garbage into the TLB, and that's bad (and causes MCEs on
>> AMD CPUs).
>
> Urggh.. indeed :/
>
>> I think we only really have two choices: tlb_defer_switch_to_init_mm()
>> == true and tlb_defer_switch_to_init_mm() == false. The current
>> heuristic is to not defer if we have PCID, because loading CR3 is
>> reasonably fast.
>
> I just _really_ _really_ hate idle drivers doing leave_mm(). I don't
> suppose limiting the !IPI case to just the idle case would be correct
> either, because between waking from idle and testing our 'should I have
> invalidated' bit it can (however unlikely) speculate into stale TLB
> entries too..
>

Peter,

This patch is not ideal, as it comes with the caveats that patch 2
tries to close. I put it out here to see if it can prompt people to
come up with a better solution. Keeping active_mm around would have
been cleaner, but it looks like that runs into the issues Andy
mentioned.

The "A -> idle -> A" case would not trigger IBPB if
tlb_defer_switch_to_init_mm() is true (non-PCID), as we do not change
the mm in that case.

This patch tries to address the case where we do switch to init_mm and
back. Do you still have objections to the approach in this patch of
saving the last active mm before switching to init_mm?

Tim
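
For readers following the thread, the "A -> idle -> A" question can be illustrated with a minimal userspace sketch in C. This is not the code from the RFC patch. The names `struct cpu_state`, `last_user_mm`, `issue_ibpb()` and `switch_mm_sketch()` are hypothetical, and the barrier is only a counter. The sketch assumes the CPU really does switch to init_mm when going idle (the tlb_defer_switch_to_init_mm() == false case) and shows how remembering the last user mm lets the barrier be skipped when the same process comes back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mm_struct; only its identity matters. */
struct mm {
    int id;
};

/* Hypothetical per-CPU bookkeeping, not the kernel's real data structure. */
struct cpu_state {
    const struct mm *loaded_mm;    /* mm whose page tables are loaded now */
    const struct mm *last_user_mm; /* last non-init mm that ran on this CPU */
    unsigned long ibpb_count;      /* number of simulated barriers issued */
};

/* Stand-in for the kernel address space used while idle. */
static const struct mm init_mm = { 0 };

/* Simulated barrier: this sketch only counts how often it would be issued. */
static void issue_ibpb(struct cpu_state *cpu)
{
    cpu->ibpb_count++;
}

/*
 * Switch to 'next' and report whether a barrier was issued.
 * Switching to init_mm never issues one. Switching to a user mm issues
 * one only if it differs from the last user mm seen on this CPU.
 * Caveat: comparing bare pointers ignores lifetime problems, such as an
 * mm being freed and its address reused by a different process.
 */
bool switch_mm_sketch(struct cpu_state *cpu, const struct mm *next)
{
    bool flushed = false;

    assert(cpu != NULL && next != NULL);

    if (next != &init_mm) {
        if (cpu->last_user_mm != next) {
            issue_ibpb(cpu);
            flushed = true;
        }
        cpu->last_user_mm = next;
    }
    cpu->loaded_mm = next;
    return flushed;
}
```

With this bookkeeping, the sequence A -> init_mm -> A issues a barrier only on the first switch to A, while A -> init_mm -> B issues one again when B is scheduled.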