Date: Thu, 17 Oct 2019 16:38:24 -0700
From: Sean Christopherson
To: Thomas Gleixner
Cc: Paolo Bonzini, Xiaoyao Li, Fenghua Yu, Ingo Molnar, Borislav Petkov,
 H Peter Anvin, Peter Zijlstra, Andrew Morton, Dave Hansen, Radim Krcmar,
 Ashok Raj, Tony Luck, Dan Williams, Sai Praneeth Prakhya, Ravi V Shankar,
 linux-kernel, x86, kvm@vger.kernel.org
Subject: Re: [RFD] x86/split_lock: Request to Intel
Message-ID: <20191017233824.GA23654@linux.intel.com>
References: <57f40083-9063-5d41-f06d-fa1ae4c78ec6@redhat.com>
 <8808c9ac-0906-5eec-a31f-27cbec778f9c@intel.com>
 <20191017172312.GC20903@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Oct 17, 2019 at 11:31:15PM +0200, Thomas Gleixner wrote:
> On Thu, 17 Oct 2019, Sean Christopherson wrote:
> > On Thu, Oct 17, 2019 at 02:29:45PM +0200, Thomas Gleixner wrote:
> > > The more I look at this trainwreck, the less interested I am in merging any
> > > of this at all.
> > >
> > > The fact that it took Intel more than a year to figure out that the MSR is
> > > per core and not per thread is yet another proof that this industry just
> > > works by pure chance.
> > >
> > > There is a simple way out of this misery:
> > >
> > > Intel issues a microcode update which does:
> > >
> > >     1) Convert the OR logic of the AC enable bit in the TEST_CTRL MSR to
> > >        AND logic, i.e. when one thread disables AC it's automatically
> > >        disabled on the core.
> > >
> > >        Alternatively it suppresses the #AC when the current thread has it
> > >        disabled.
> > >
> > >     2) Provide a separate bit which indicates that the AC enable logic is
> > >        actually AND based or that #AC is suppressed when the current thread
> > >        has it disabled.
> > >
> > >     Which way I don't really care as long as it makes sense.
> >
> > The #AC bit doesn't use OR-logic, it's straight up shared, i.e. writes on
> > one CPU are immediately visible on its sibling CPU.
>
> That's less horrible than I read out of your initial explanation.
>
> Thankfully all of this is meticulously documented in the SDM ...

Preaching to the choir on this one...

> Though it changes the picture radically. The truly shared MSR allows
> regular software synchronization without IPIs and without an insane amount
> of corner case handling.
>
> So as you pointed out we need a per core state, which is influenced by:
>
>  1) The global enablement switch
>
>  2) Host induced #AC
>
>  3) Guest induced #AC
>
>     A) Guest has #AC handling
>
>     B) Guest has no #AC handling
>
> #1:
>
>    - OFF:   #AC is globally disabled
>
>    - ON:    #AC is globally enabled
>
>    - FORCE: same as ON but #AC is enforced on guests
>
> #2:
>
>    If the host triggers an #AC then the #AC has to be force disabled on the
>    affected core independent of the state of #1. Nothing we can do about
>    that and once the initial wave of #AC issues is fixed this should not
>    happen on production systems. That disables #3 even for the #3.A case
>    for simplicity's sake.
>
> #3:
>
>    A) Guest has #AC handling
>
>       #AC is forwarded to the guest. No further action required aside from
>       accounting
>
>    B) Guest has no #AC handling
>
>       If #AC triggers the resulting action depends on the state of #1:
>
>          - FORCE: Guest is killed with SIGBUS or whatever the virt crowd
>                   thinks is the appropriate solution
>          - ON:    #AC triggered state is recorded per vCPU and the MSR is
>                   toggled on VMENTER/VMEXIT in software from that point on.
>
> So the only interesting case is #3.B and #1.state == ON. There you need
> serialization of the state and the MSR write between the cores, but only
> when the vCPU triggered an #AC. Until then, nothing to do.

And "vCPU triggered an #AC" should include an explicit check in KVM's
emulator.

> vmenter()
> {
>         if (vcpu->ac_disable)
>                 this_core_disable_ac();
> }
>
> vmexit()
> {
>         if (vcpu->ac_disable)
>                 this_core_enable_ac();
> }
>
> this_core_dis/enable_ac() takes the global state into account and has the
> necessary serialization in place.

Overall, looks good to me.  Although Tony's mail makes it obvious we need
to sync internally...
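
For illustration only, the per-core bookkeeping described in the quoted
proposal could be modelled in plain user-space C roughly as below. This is
a sketch under stated assumptions, not kernel code: every identifier
(sld_mode, core_state, guest_ac, core_update_msr, ...) is hypothetical, the
shared MSR bit is modelled as a plain bool, the per-core counter of vCPUs
that need #AC off is one possible way to handle two siblings and is not
something the thread specifies, and the serialization the thread calls for
is only noted in a comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names throughout; none of these exist in the kernel. */
enum sld_mode { SLD_OFF, SLD_ON, SLD_FORCE };      /* global switch, #1 */
enum ac_action { AC_FORWARD_TO_GUEST, AC_KILL_GUEST, AC_DISABLE_FOR_VCPU };

struct core_state {
    bool host_forced_off;   /* case #2: the host tripped #AC on this core */
    int guest_disable_cnt;  /* running vCPUs on this core needing #AC off */
    bool msr_ac_enabled;    /* models the core-shared #AC enable bit */
};

struct vcpu {
    bool handles_ac;        /* case #3.A (true) vs. #3.B (false) */
    bool ac_disable;        /* set once a #3.B guest tripped #AC under ON */
};

/*
 * Recompute the modelled enable bit from the global and per-core state.
 * A real version would hold a per-core lock around the counter update
 * and the MSR write; that is omitted in this single-threaded model.
 */
void core_update_msr(enum sld_mode mode, struct core_state *c)
{
    c->msr_ac_enabled = mode != SLD_OFF && !c->host_forced_off &&
                        c->guest_disable_cnt == 0;
}

/* Decide how to react to a guest-induced #AC; call it outside guest mode. */
enum ac_action guest_ac(enum sld_mode mode, struct vcpu *v)
{
    assert(mode != SLD_OFF);        /* #AC cannot fire when globally off */
    if (v->handles_ac)
        return AC_FORWARD_TO_GUEST; /* #3.A */
    if (mode == SLD_FORCE)
        return AC_KILL_GUEST;       /* #3.B with FORCE */
    v->ac_disable = true;           /* #3.B with ON: remember per vCPU */
    return AC_DISABLE_FOR_VCPU;
}

/* Entering the guest: a flagged vCPU turns #AC off for the whole core. */
void vmenter(enum sld_mode mode, struct core_state *c, const struct vcpu *v)
{
    if (v->ac_disable) {
        c->guest_disable_cnt++;
        core_update_msr(mode, c);
    }
}

/* Leaving the guest: re-enable only when no sibling vCPU still needs it off. */
void vmexit(enum sld_mode mode, struct core_state *c, const struct vcpu *v)
{
    if (v->ac_disable) {
        assert(c->guest_disable_cnt > 0);
        c->guest_disable_cnt--;
        core_update_msr(mode, c);
    }
}
```

In this model, with two flagged vCPUs running on sibling threads of the same
core, the modelled bit stays clear until the second vmexit(), which is the
sibling interaction a truly shared MSR makes necessary. The flag must be set
via guest_ac() between a vmexit() and the next vmenter(), otherwise the
counter goes out of balance.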