Date: Tue, 7 Jan 2020 15:31:02 -0800
From: Sean Christopherson
To: Tom Lendacky
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Paolo Bonzini,
 Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Brijesh Singh
Subject: Re: [PATCH v2] KVM: SVM: Override default MMIO mask if memory
 encryption is enabled
Message-ID: <20200107233102.GC16987@linux.intel.com>
References: <20200106224931.GB12879@linux.intel.com>
 <20200106233846.GC12879@linux.intel.com>
 <20200107222813.GB16987@linux.intel.com>
 <298352c6-7670-2929-9621-1124775bfaed@amd.com>
In-Reply-To: <298352c6-7670-2929-9621-1124775bfaed@amd.com>

On Tue, Jan 07, 2020 at 04:54:34PM -0600, Tom Lendacky wrote:
> On 1/7/20 4:28 PM, Sean Christopherson wrote:
> > On Tue, Jan 07, 2020 at 02:16:37PM -0600, Tom Lendacky wrote:
> >> On 1/6/20 5:38 PM, Sean Christopherson wrote:
> >>> On Mon, Jan 06, 2020 at 05:14:04PM -0600, Tom Lendacky wrote:
> >>>> On 1/6/20 4:49 PM, Sean Christopherson wrote:
> >>>>> This doesn't handle the case where x86_phys_bits _isn't_ reduced by
> >>>>> SME/SEV on a future processor, i.e. x86_phys_bits==52.
> >>>>
> >>>> Not sure I follow.  If MSR_K8_SYSCFG_MEM_ENCRYPT is set then there
> >>>> will always be a reduction in physical addressing (so I'm told).
> >>>
> >>> Hmm, I'm going off APM Vol 2, which states, or at least strongly
> >>> implies, that reducing the PA space is optional.  Section 7.10.2 is
> >>> especially clear on this:
> >>>
> >>>   In implementations where the physical address size of the processor
> >>>   is reduced when memory encryption features are enabled, software
> >>>   must ensure it is executing from addresses where these upper
> >>>   physical address bits are 0 prior to setting
> >>>   SYSCFG[MemEncryptionModEn].
> >>
> >> It's probably not likely, but given what is stated, I can modify my
> >> patch to check for x86_phys_bits == 52 and skip the call to set the
> >> mask, e.g.:
> >>
> >>   if (msr & MSR_K8_SYSCFG_MEM_ENCRYPT &&
> >>       boot_cpu_data.x86_phys_bits < 52) {
> >>
> >>> But, hopefully the other approach I have in mind actually works, as
> >>> it's significantly less special-case code and would naturally handle
> >>> either case, i.e. make this a moot point.
> >>
> >> I'll hold off on the above and wait for your patch.
> >
> > Sorry for the delay, this is a bigger mess than originally thought.  Or
> > I'm completely misunderstanding the issue, which is also a distinct
> > possibility :-)
> >
> > Due to KVM activating its L1TF mitigation irrespective of whether the
> > CPU is whitelisted as not being vulnerable to L1TF, simply using
> > x86_phys_bits to avoid colliding with the C-bit isn't sufficient, as
> > the L1TF mitigation uses those first five reserved PA bits to store the
> > MMIO GFN.  Setting BIT(x86_phys_bits) for all MMIO sptes would cause it
> > to be interpreted as a GFN bit when the L1TF mitigation is active and
> > lead to bogus MMIO.
>
> The L1TF mitigation only gets applied when:
>
>   boot_cpu_data.x86_cache_bits < 52 - shadow_nonpresent_or_rsvd_mask_len
>
> and with shadow_nonpresent_or_rsvd_mask_len = 5, that means
> boot_cpu_data.x86_cache_bits < 47.
>
> On AMD processors that support memory encryption, the x86_cache_bits
> value is not adjusted, just the x86_phys_bits.  So for AMD processors
> that have memory encryption support, this value will be at least 48 and
> therefore will not activate the L1TF mitigation.

Ah.  Hrm.  I'd prefer to clean that code up to make the interactions more
explicit, but maybe we can separate that out.

> > The only sane approach I can think of is to activate the L1TF mitigation
> > based on whether the CPU is vulnerable to L1TF, as opposed to activating
> > the mitigation purely based on the max PA of the CPU.
> > Since all CPUs that support SME/SEV are whitelisted as NO_L1TF, the
> > L1TF mitigation and C-bit should never be active at the same time.
>
> There is still the issue of setting a single bit that can conflict with
> the C-bit.  As it is today, if the C-bit were to be defined as bit 51,
> then KVM would not take a nested page fault and MMIO would be broken.

Wouldn't Paolo's patch to use the raw "cpuid_eax(0x80000008) & 0xff" for
shadow_phys_bits fix that particular collision by causing
kvm_set_mmio_spte_mask() to clear the present bit?  Or am I
misunderstanding how the PA reduction interacts with the C-bit?

AIUI, with phys_bits=48, the standard scenario is Cbit=47 and some
additional bits 46:M are reserved.  Applying that logic to phys_bits=52,
Cbit=51 and bits 50:M are reserved, so there's a collision, but it's
mostly benign because shadow_phys_bits==52, which triggers this:

	if (IS_ENABLED(CONFIG_X86_64) && shadow_phys_bits == 52)
		mask &= ~1ull;

In other words, Paolo's patch fixes the fatal bug, but unnecessarily
disables optimized MMIO page faults.  To remedy that, your idea is to
rely on the (undocumented?) behavior that there are always additional
reserved bits between the C-bit and the reduced x86_phys_bits.
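
FWIW, here's a quick userspace sketch of the two checks in play.  It's
illustrative only, not kernel code: the mask-len constant and the
shadow_phys_bits==52 escape hatch are lifted from the snippets quoted
above, and the phys_bits=52 case is the hypothetical future processor.

#include <stdint.h>
#include <stdio.h>

/* From this thread: shadow_nonpresent_or_rsvd_mask_len is 5. */
#define SHADOW_NONPRESENT_OR_RSVD_MASK_LEN	5

/*
 * The L1TF activation check quoted above: the mitigation only kicks in
 * when there are at least 5 usable bits above the cacheable PA range.
 */
static int l1tf_mitigation_active(int cache_bits)
{
	return cache_bits < 52 - SHADOW_NONPRESENT_OR_RSVD_MASK_LEN;
}

/*
 * The kvm_set_mmio_spte_mask() logic quoted above: mask the uppermost
 * PA bit (51) plus the present bit, but drop the present bit when
 * shadow_phys_bits == 52, i.e. when bit 51 isn't actually reserved.
 */
static uint64_t mmio_spte_mask(int shadow_phys_bits)
{
	uint64_t mask = (1ull << 51) | 1ull;

	if (shadow_phys_bits == 52)
		mask &= ~1ull;
	return mask;
}

int main(void)
{
	/* Today's SME/SEV parts: raw phys_bits=48 (Cbit=47), cache_bits=48. */
	printf("phys_bits=48: l1tf=%d mask=0x%016llx\n",
	       l1tf_mitigation_active(48),
	       (unsigned long long)mmio_spte_mask(48));

	/* Hypothetical part with no PA reduction: phys_bits=52, Cbit=51. */
	printf("phys_bits=52: l1tf=%d mask=0x%016llx\n",
	       l1tf_mitigation_active(52),
	       (unsigned long long)mmio_spte_mask(52));
	return 0;
}

Both cases print l1tf=0, i.e. the mitigation never fires, but only the
phys_bits=48 case keeps the present bit set (mask=0x8000000000000001 vs.
0x8000000000000000) and thus keeps optimized MMIO page faults.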