Date: Mon, 6 May 2024 09:54:46 -0700
References: <20240219074733.122080-1-weijiang.yang@intel.com> <20240219074733.122080-25-weijiang.yang@intel.com>
Subject: Re: [PATCH v10 24/27] KVM: x86: Enable CET virtualization for VMX and advertise to userspace
From: Sean Christopherson
To:
Weijiang Yang
Cc: pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com
Content-Type: text/plain; charset="us-ascii"

On Mon, May 06, 2024, Weijiang Yang wrote:
> On 5/2/2024 7:15 AM, Sean Christopherson wrote:
> > On Sun, Feb 18, 2024, Yang Weijiang wrote:
> > > @@ -696,6 +697,20 @@ void kvm_set_cpu_caps(void)
> > >  		kvm_cpu_cap_set(X86_FEATURE_INTEL_STIBP);
> > >  	if (boot_cpu_has(X86_FEATURE_AMD_SSBD))
> > >  		kvm_cpu_cap_set(X86_FEATURE_SPEC_CTRL_SSBD);
> > > +	/*
> > > +	 * Don't use boot_cpu_has() to check availability of IBT because the
> > > +	 * feature bit is cleared in boot_cpu_data when ibt=off is applied
> > > +	 * in host cmdline.
> > I'm not convinced this is a good reason to diverge from the host kernel.  E.g.
> > PCID and many other features honor the host setup, I don't see what makes IBT
> > special.
>
> This is mostly based on our user experience and the hypothesis for cloud
> computing: When we evolve host kernels, we constantly encounter issues when
> kernel IBT is on, so we have to disable kernel IBT by adding ibt=off. But we
> need to test the CET features in VM, if we just simply refer to host boot
> cpuid data, then IBT cannot be enabled in VM which makes CET features
> incomplete in guest.
>
> I guess in cloud computing, it could run into similar dilemma. In this case,
> the tenant cannot benefit the feature just because of host SW problem.

Hmm, but such issues should be found before deploying a kernel to production.

The one scenario that comes to mind where I can see someone wanting to disable
IBT would be running an out-of-tree and/or third party module.

> I know currently KVM except LA57 always honors host feature configurations,
> but in CET case, there could be divergence wrt honoring host configuration as
> long as there's no quirk for the feature.
>
> But I think the issue is still open for discussion...

I'm not totally opposed to the idea.

Somewhat off-topic, the existing LA57 code upon which the IBT check is based is
flawed, as it doesn't account for the max supported CPUID leaf.  On Intel CPUs,
that could result in a false positive due to CPUID (stupidly) returning the
value of the last implemented CPUID leaf, not zeros.  In practice, it doesn't
cause problems because CPUID.0x7 has been supported since forever, but it's
still a bug.

Hmm, actually, __kvm_cpu_cap_mask() has the exact same bug.  And that's much
less theoretical, e.g. kvm_cpu_cap_init_kvm_defined() in particular is likely
to cause problems at some point.

And I really don't like that KVM open codes calls to cpuid_() for these "raw"
features.  One option would be to add helpers to change this:

	if (cpuid_edx(7) & F(IBT))
		kvm_cpu_cap_set(X86_FEATURE_IBT);

to this:

	if (raw_cpuid_has(X86_FEATURE_IBT))
		kvm_cpu_cap_set(X86_FEATURE_IBT);

but I think we can do better, and harden the CPUID code in the process.  If we
do kvm_cpu_cap_set() _before_ kvm_cpu_cap_mask(), then incorporating the raw
host CPUID will happen automagically, as __kvm_cpu_cap_mask() will clear bits
that aren't in host CPUID.

The most obvious approach would be to simply call kvm_cpu_cap_set() before
kvm_cpu_cap_mask(), but that's more than a bit confusing, and would open the
door for potential bugs due to calling kvm_cpu_cap_set() after
kvm_cpu_cap_mask().  And detecting such bugs would be difficult, because there
are features that KVM fully emulates, i.e. _must_ be stuffed after
kvm_cpu_cap_mask().

Instead of calling kvm_cpu_cap_set() directly, we can take advantage of the
fact that the F() masks are fed into kvm_cpu_cap_mask(), i.e. are naturally
processed before the corresponding kvm_cpu_cap_mask().  If we add an array to
track which capabilities have been initialized, then F() can WARN on improper
usage.
That would allow detecting bad "raw" usage, *and* would detect (some) scenarios
where a F() is fed into the wrong leaf, e.g. if we added F(LA57) to CPUID_7_EDX
instead of CPUID_7_ECX.

#define F(name)								\
({									\
	u32 __leaf = __feature_leaf(X86_FEATURE_##name);		\
									\
	BUILD_BUG_ON(__leaf >= ARRAY_SIZE(kvm_cpu_cap_initialized));	\
	WARN_ON_ONCE(kvm_cpu_cap_initialized[__leaf]);			\
									\
	feature_bit(name);						\
})

/*
 * Raw Feature - For features that KVM supports based purely on raw host CPUID,
 * i.e. that KVM virtualizes even if the host kernel doesn't use the feature.
 * Simply force set the feature in KVM's capabilities, raw CPUID support will
 * be factored in by kvm_cpu_cap_mask().
 */
#define RAW_F(name)							\
({									\
	kvm_cpu_cap_set(X86_FEATURE_##name);				\
	F(name);							\
})

Assuming testing doesn't poke a hole in my idea, I'll post a small series.
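
For anyone who wants to poke at the ordering without building a kernel, here is
a rough userspace model of the idea.  It is only a sketch under loud
assumptions: caps[], initialized[], misuse, host_raw[], host_max_leaf,
raw_cpuid(), f(), raw_f(), cap_mask() and demo() are all hypothetical stand-ins
invented for illustration, not KVM's real data structures or helpers.  The fake
host has IBT in raw CPUID but not in the kernel's view (as with ibt=off), and
has no LA57 at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
enum { LEAF_7_ECX, LEAF_7_EDX, NR_LEAFS };

#define FEAT(leaf, bit)	((leaf) * 32 + (bit))
#define FEAT_LA57	FEAT(LEAF_7_ECX, 16)
#define FEAT_IBT	FEAT(LEAF_7_EDX, 20)
#define FEAT_LEAF(f)	((f) / 32)
#define FEAT_BIT(f)	(1u << ((f) % 32))

static uint32_t caps[NR_LEAFS];		/* models KVM's caps; all clear, e.g. ibt=off */
static bool initialized[NR_LEAFS];	/* models kvm_cpu_cap_initialized */
static unsigned int misuse;		/* counts what WARN_ON_ONCE() would flag */

/* Fake host: the max leaf and raw register values are made up for the demo. */
static unsigned int host_max_leaf = 7;
static uint32_t host_raw[NR_LEAFS] = { [LEAF_7_EDX] = 1u << 20 };

/* Report zeros for functions above the max leaf instead of trusting CPUID. */
static uint32_t raw_cpuid(unsigned int function, int leaf)
{
	return function > host_max_leaf ? 0 : host_raw[leaf];
}

/* Models F(): flag use of a feature whose leaf has already been masked. */
static uint32_t f(int feature)
{
	if (initialized[FEAT_LEAF(feature)])
		misuse++;
	return FEAT_BIT(feature);
}

/* Models RAW_F(): force-set the capability, then behave like f(). */
static uint32_t raw_f(int feature)
{
	caps[FEAT_LEAF(feature)] |= FEAT_BIT(feature);
	return f(feature);
}

/* Models kvm_cpu_cap_mask(): apply KVM's mask, then the raw host CPUID. */
static void cap_mask(int leaf, unsigned int function, uint32_t mask)
{
	caps[leaf] &= mask;
	caps[leaf] &= raw_cpuid(function, leaf);
	initialized[leaf] = true;
}

/* Runs the scenarios once; returns the number of flagged misuses. */
static unsigned int demo(void)
{
	/* Arguments are evaluated before the call, so raw_f() runs before the mask. */
	cap_mask(LEAF_7_EDX, 7, raw_f(FEAT_IBT));
	assert(caps[LEAF_7_EDX] & FEAT_BIT(FEAT_IBT));	/* raw host has IBT */

	cap_mask(LEAF_7_ECX, 7, raw_f(FEAT_LA57));
	assert(!(caps[LEAF_7_ECX] & FEAT_BIT(FEAT_LA57)));	/* raw host lacks LA57 */

	(void)f(FEAT_IBT);	/* leaf already masked: counted as misuse */

	host_max_leaf = 6;	/* pretend leaf 7 isn't implemented */
	assert(raw_cpuid(7, LEAF_7_EDX) == 0);
	return misuse;
}
```

Calling demo() once from a main() should return 1, i.e. only the late f() is
flagged.  The force-set IBT bit survives because the fake raw CPUID reports it,
and the force-set LA57 bit is cleared by the mask.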