Subject: Re: [PATCH 2/2] KVM: VMX: Extend VMX's #AC handding
To: Andy Lutomirski
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, "H.
Peter Anvin", Paolo Bonzini, Sean Christopherson, x86@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org
From: Xiaoyao Li
Message-ID: <3499ee3f-e734-50fd-1b50-f6923d1f4f76@intel.com>
Date: Fri, 31 Jan 2020 15:22:52 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

On 1/31/2020 1:16 AM, Andy Lutomirski wrote:
>
>> On Jan 30, 2020, at 8:30 AM, Xiaoyao Li wrote:
>>
>> On 1/30/2020 11:18 PM, Andy Lutomirski wrote:
>>>>> On Jan 30, 2020, at 4:24 AM, Xiaoyao Li wrote:
>>>>
>>>> There are two types of #AC that can be generated in Intel CPUs:
>>>>  1. legacy alignment check #AC;
>>>>  2. split lock #AC;
>>>>
>>>> Legacy alignment check #AC can be injected to guest if guest has enabled
>>>> alignment check.
>>>>
>>>> When host enables split lock detection, i.e., split_lock_detect!=off,
>>>> guest will receive an unexpected #AC when a split lock happens in
>>>> guest since KVM doesn't virtualize this feature to guest.
>>>>
>>>> Old guests lack a split lock #AC handler and may have split lock
>>>> bugs. To make guest survive a split lock, apply a policy similar
>>>> to the host's split lock detect configuration:
>>>>  - host split lock detect is sld_warn:
>>>>    warn that a split lock happened in guest, and disable split lock
>>>>    detect around VM-enter;
>>>>  - host split lock detect is sld_fatal:
>>>>    forward #AC to userspace. (Usually userspace dumps the #AC
>>>>    exception and kills the guest).
>>> A correct userspace implementation should, with a modern guest kernel, forward the exception. Otherwise you’re introducing a DoS into the guest if the guest kernel is fine but guest userspace is buggy.
>>
>> To prevent DoS in guest, the better solution is virtualizing and advertising this feature to guest, so guest can explicitly enable it by setting split_lock_detect=fatal, if it's a recent Linux guest.
>>
>> However, that's another topic; I'll send out the patches later.
>>
>
> Can we get a credible description of how this would work? I suggest:
>
> Intel adds and documents a new CPUID bit or core capability bit that means “split lock detection is forced on”. If this bit is set, the MSR bit controlling split lock detection is still writable, but split lock detection is on regardless of the value. Operating systems are expected to set the bit to 1 to indicate to a hypervisor, if present, that they understand that split lock detection is on.
>
> This would be an SDM-only change, but it would also be a commitment to certain behavior for future CPUs that don’t implement split locks.

That sounds like a PV solution for virtualization, so it doesn't need to be defined in the Intel SDM; it can be defined in the KVM documentation instead.

As you suggested, we can define a new bit in KVM_CPUID_FEATURES (0x40000001), KVM_FEATURE_SLD_FORCED, and either reuse MSR_TEST_CTL or use a new virtualized MSR for the guest to tell the hypervisor that it understands split lock detection is forced on.

> Can one of you Intel folks ask the architecture team about this?
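
To make the KVM_FEATURE_SLD_FORCED idea above a bit more concrete, here is a rough user-space model of the handshake. It is only a sketch, not KVM code. The bit index chosen for KVM_FEATURE_SLD_FORCED, the struct, and all sketch_* helpers are made up for illustration, and it assumes the "reuse MSR_TEST_CTL" variant, with the split lock detect control at bit 29 of that MSR. Forwarding #AC into the guest only after the guest has acknowledged is also an assumption about what the hypervisor would do with the acknowledgement.

```c
#include <stdbool.h>
#include <stdint.h>

/* KVM paravirtual feature leaf, as named in the mail. */
#define KVM_CPUID_FEATURES          0x40000001u
/* HYPOTHETICAL bit index, picked only for this sketch. */
#define KVM_FEATURE_SLD_FORCED      30
/* Split lock detect control bit in TEST_CTL (assumed: bit 29). */
#define TEST_CTL_SPLIT_LOCK_DETECT  (1ull << 29)

/* Hypothetical per-vCPU state kept by the hypervisor. */
struct sketch_vcpu {
	uint32_t kvm_features;  /* EAX of CPUID leaf KVM_CPUID_FEATURES */
	uint64_t test_ctl;      /* guest-visible TEST_CTL value */
};

/* Hypervisor advertises "split lock detection is forced on". */
static inline bool sketch_sld_forced(const struct sketch_vcpu *v)
{
	return (v->kvm_features & (1u << KVM_FEATURE_SLD_FORCED)) != 0;
}

/* The MSR bit stays writable; the write itself never turns SLD off. */
static inline void sketch_write_test_ctl(struct sketch_vcpu *v, uint64_t val)
{
	v->test_ctl = val;
}

/* When forced, detection is on regardless of the MSR value. */
static inline bool sketch_sld_active(const struct sketch_vcpu *v)
{
	return sketch_sld_forced(v) ||
	       (v->test_ctl & TEST_CTL_SPLIT_LOCK_DETECT) != 0;
}

/*
 * Assumption: a split lock #AC is forwarded into the guest only if the
 * guest set the bit to 1, i.e. told us it understands SLD is on.
 */
static inline bool sketch_forward_ac_to_guest(const struct sketch_vcpu *v)
{
	return sketch_sld_forced(v) &&
	       (v->test_ctl & TEST_CTL_SPLIT_LOCK_DETECT) != 0;
}

/* Guest side: acknowledge forced-on SLD if the feature bit is present. */
static inline bool sketch_guest_ack_sld(struct sketch_vcpu *v)
{
	if (!sketch_sld_forced(v))
		return false;
	sketch_write_test_ctl(v, v->test_ctl | TEST_CTL_SPLIT_LOCK_DETECT);
	return true;
}
```

An old guest never calls sketch_guest_ack_sld(), so in this model its split lock #AC is not forwarded into it and would be handled by the existing sld_warn/sld_fatal policy instead.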