Subject: Re: [PATCH 2/2] KVM: VMX: Extend VMX's #AC handding
To: Andy Lutomirski
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, "H. Peter Anvin", Paolo Bonzini, Sean Christopherson, x86@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org
References: <20200130121939.22383-3-xiaoyao.li@intel.com> <4A8E14B3-1914-4D0C-A43A-234717179408@amacapital.net>
In-Reply-To: <4A8E14B3-1914-4D0C-A43A-234717179408@amacapital.net>
From: Xiaoyao Li
Date: Fri, 31 Jan 2020 00:29:56 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

On 1/30/2020 11:18 PM, Andy Lutomirski wrote:
>
>
>> On Jan 30, 2020, at 4:24 AM, Xiaoyao Li wrote:
>>
>> There are two types of #AC that can be generated in Intel CPUs:
>> 1. legacy alignment check #AC;
>> 2. split lock #AC;
>>
>> Legacy alignment check #AC can be injected to guest if guest has enabled
>> alignment check.
>>
>> When host enables split lock detection, i.e., split_lock_detect!=off,
>> guest will receive an unexpected #AC when a split lock happens in
>> guest, since KVM doesn't virtualize this feature to guest.
>>
>> Old guests lack a split lock #AC handler and may have split lock
>> bugs. To let such guests survive a split lock, apply a policy similar
>> to the host's split lock detect configuration:
>> - host split lock detect is sld_warn:
>>   warn that a split lock happened in guest, and disable split lock
>>   detect around VM-enter;
>> - host split lock detect is sld_fatal:
>>   forward #AC to userspace. (Usually userspace dumps the #AC
>>   exception and kills the guest).
>
> A correct userspace implementation should, with a modern guest kernel, forward the exception. Otherwise you’re introducing a DoS into the guest if the guest kernel is fine but guest userspace is buggy.

To prevent a DoS in the guest, the better solution is to virtualize this
feature and advertise it to the guest, so that a recent Linux guest can
explicitly enable it by setting split_lock_detect=fatal.

However, that is a separate topic; I'll send out those patches later.

> What’s the intended behavior here?
>

It's for old guests. Below I quote what Paolo said in
https://lore.kernel.org/kvm/57f40083-9063-5d41-f06d-fa1ae4c78ec6@redhat.com/

"So for an old guest, as soon as the guest kernel happens to do a split
lock, it gets an unexpected #AC and crashes and burns. And then, after
much googling and gnashing of teeth, people proceed to disable split
lock detection.

(Old guests are the common case: you're a cloud provider and your
customers run old stuff; it's a workstation and you want to play that
game that requires an old version of Windows; etc.).

To save them the googling and gnashing of teeth, I guess we can do a
pr_warn_ratelimited on the first split lock encountered by a guest. (It
has to be ratelimited because userspace could create an arbitrary amount
of guests to spam the kernel logs). But the end result is the same,
split lock detection is disabled by the user."
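
For reference, the policy from the patch description can be sketched as a
small decision function. This is only an illustration under stated
assumptions, not the code from the patch: the enum ac_action values,
struct guest_state and both function names are hypothetical, and the
sketch assumes that legacy alignment checking counts as enabled only when
CR0.AM and EFLAGS.AC are both set and the guest runs at CPL 3. It also
assumes that with split lock detection off on the host, any #AC is simply
passed back to the guest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Host modes, named after the split_lock_detect values in the description. */
enum sld_mode { SLD_OFF, SLD_WARN, SLD_FATAL };

/* Hypothetical outcomes; these are not KVM identifiers. */
enum ac_action {
	AC_INJECT_TO_GUEST,       /* reflect the #AC back into the guest */
	AC_WARN_AND_DISABLE_SLD,  /* warn, then run the guest with detection off */
	AC_EXIT_TO_USERSPACE,     /* let userspace decide, usually kill the guest */
};

#define CR0_AM    (1ull << 18)  /* CR0.AM, alignment mask */
#define EFLAGS_AC (1ull << 18)  /* EFLAGS.AC, alignment check flag */

/* Hypothetical snapshot of the guest state needed for the decision. */
struct guest_state {
	uint64_t cr0;
	uint64_t rflags;
	int cpl;
};

/* Legacy #AC is only possible with CR0.AM=1, EFLAGS.AC=1 and CPL=3. */
static bool guest_alignment_check_enabled(const struct guest_state *g)
{
	return g->cpl == 3 && (g->cr0 & CR0_AM) && (g->rflags & EFLAGS_AC);
}

static enum ac_action handle_guest_ac(const struct guest_state *g,
				      enum sld_mode host_mode)
{
	/* The guest asked for alignment checks, or the host cannot raise a
	 * split lock #AC at all: hand the exception to the guest. */
	if (host_mode == SLD_OFF || guest_alignment_check_enabled(g))
		return AC_INJECT_TO_GUEST;

	/* Otherwise treat it as a split lock #AC and follow the host policy. */
	if (host_mode == SLD_WARN)
		return AC_WARN_AND_DISABLE_SLD;

	assert(host_mode == SLD_FATAL);
	return AC_EXIT_TO_USERSPACE;
}
```

In the sld_warn case the warning would have to be rate limited, as Paolo
points out above, since userspace can create an arbitrary number of guests.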