Date: Mon, 15 Apr 2024 14:17:02 -0700
In-Reply-To: <116179545fafbf39ed01e1f0f5ac76e0467fc09a.camel@intel.com>
References: <2f1de1b7b6512280fae4ac05e77ced80a585971b.1712785629.git.isaku.yamahata@intel.com> <116179545fafbf39ed01e1f0f5ac76e0467fc09a.camel@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Subject:
Re: [PATCH v2 07/10] KVM: x86: Always populate L1 GPA for KVM_MAP_MEMORY
From: Sean Christopherson
To: Rick P Edgecombe
Cc: kvm@vger.kernel.org, Isaku Yamahata, Kai Huang, federico.parola@polito.it,
 linux-kernel@vger.kernel.org, isaku.yamahata@gmail.com, dmatlack@google.com,
 michael.roth@amd.com, pbonzini@redhat.com

On Mon, Apr 15, 2024, Rick P Edgecombe wrote:
> I wouldn't call myself much of an expert on nested, but...
>
> On Wed, 2024-04-10 at 15:07 -0700, isaku.yamahata@intel.com wrote:
> > There are several options to populate L1 GPA irrelevant to vCPU mode.
> > - Switch vCPU MMU only: This patch.
> >   Pros: Concise implementation.
> >   Cons: Heavily dependent on the KVM MMU implementation.

Con: Makes it impossible to support other MMUs/modes without extending the uAPI.

> Is switching just the MMU enough here? Won't the MTRRs and other vcpu bits be
> wrong?
>
> > - Use kvm_x86_nested_ops.get/set_state() to switch to/from guest mode.
> >   Use __get/set_sregs2() to switch to/from SMM mode.
> >   Pros: straightforward.
> >   Cons: This may cause unintended side effects.
>
> Cons make sense.
>
> > - Refactor KVM page fault handler not to pass vCPU. Pass around necessary
> >   parameters and struct kvm.
> >   Pros: The end result will have clearly no side effects.
> >   Cons: This will require big refactoring.
>
> But doesn't the fault handler need the vCPU state?

Ignoring guest MTRRs, which will hopefully soon be a non-issue, no.  There are
only six possible roots if TDP is enabled:

      1. 4-level !SMM !guest_mode
      2. 4-level  SMM !guest_mode
      3. 5-level !SMM !guest_mode
      4. 5-level  SMM !guest_mode
      5. 4-level !SMM  guest_mode
      6. 5-level !SMM  guest_mode

4-level vs. 5-level is a guest MAXPHYADDR thing, and swapping the MMU eliminates
the SMM and guest_mode issues.
If there is per-vCPU state that makes its way into the TDP page tables, then we
have problems, because it means that there is per-vCPU state in per-VM
structures that isn't accounted for.

There are a few edge cases where KVM treads carefully, e.g. if the fault is to
the vCPU's APIC-access page, but KVM manually handles those to avoid consuming
per-vCPU state.

That said, I think this option is effectively 1b, because dropping the SMM vs.
guest_mode state has the same uAPI problems as forcibly swapping the MMU, it's
just a different way of doing so.

The first question to answer is, do we want to return an error or "silently"
install mappings for !SMM, !guest_mode.  And so this option becomes relevant only
_if_ we want to unconditionally install mappings for the "base" mode.

> > - Return error on guest mode or SMM mode:  Without this patch.
> >   Pros: No additional patch.
> >   Cons: Difficult to use.
>
> Hmm... For the non-TDX use cases this is just an optimization, right? For TDX
> there shouldn't be an issue. If so, maybe this last one is not so horrible.

And the fact there are so few variables to control (MAXPHYADDR, SMM, and
guest_mode) basically invalidates the argument that returning an error makes the
ioctl() hard to use.  I can imagine it might be hard to squeeze this ioctl() into
QEMU's existing code, but I don't buy that the ioctl() itself is hard to use.

Literally the only thing userspace needs to do is set CPUID to implicitly select
between 4-level and 5-level paging.  If userspace wants to pre-map memory during
live migration, or when jump-starting the guest with pre-defined state, simply
pre-map memory before stuffing guest state.  In and of itself, that doesn't seem
difficult, e.g. at a quick glance, QEMU could add a hook somewhere in
kvm_vcpu_thread_fn() without too much trouble (though that comes with a huge
disclaimer that I only know enough about how QEMU manages vCPUs to be dangerous).
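
A minimal sketch of that ordering, written as a toy model and not as real KVM or
QEMU code.  Every name below (ToyVcpu, toy_set_cpuid(), toy_premap(),
toy_restore_state(), bring_up(), PremapError) is hypothetical, and the
"MAXPHYADDR above 48 bits selects 5-level" rule is an assumption made only for
illustration.  It models the "return an error" policy: pre-mapping succeeds only
while the vCPU is !SMM and !guest_mode, so userspace sets CPUID, pre-maps, and
only then stuffs guest state.

```python
from dataclasses import dataclass, field


class PremapError(Exception):
    """Raised by the toy pre-map call when the vCPU is not in the base mode."""


@dataclass
class ToyVcpu:
    # Hypothetical stand-in for the few variables named in the mail.
    paging_levels: int = 4
    smm: bool = False
    guest_mode: bool = False
    premapped: list = field(default_factory=list)


def toy_set_cpuid(vcpu: ToyVcpu, maxphyaddr_bits: int) -> None:
    # Assumption for illustration: MAXPHYADDR above 48 bits selects 5-level.
    vcpu.paging_levels = 5 if maxphyaddr_bits > 48 else 4


def toy_premap(vcpu: ToyVcpu, gpa: int, size: int) -> None:
    # "Return an error" policy: refuse unless !SMM and !guest_mode.
    if vcpu.smm or vcpu.guest_mode:
        raise PremapError("vCPU is in SMM or guest mode")
    vcpu.premapped.append((gpa, size))


def toy_restore_state(vcpu: ToyVcpu, smm: bool, guest_mode: bool) -> None:
    # Stands in for "stuffing guest state" (e.g. on live migration).
    vcpu.smm = smm
    vcpu.guest_mode = guest_mode


def bring_up(saved_smm, saved_guest_mode, ranges, maxphyaddr_bits=46):
    vcpu = ToyVcpu()
    toy_set_cpuid(vcpu, maxphyaddr_bits)   # 1. pick 4-level vs. 5-level
    for gpa, size in ranges:               # 2. pre-map while in the base mode
        toy_premap(vcpu, gpa, size)
    toy_restore_state(vcpu, saved_smm, saved_guest_mode)  # 3. stuff state last
    return vcpu
```

Calling toy_restore_state() with guest_mode=True before toy_premap() makes the
toy call raise PremapError, which is the failure that the ordering in bring_up()
avoids.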

I would describe the overall cons for this patch versus returning an error
differently.  Switching MMU state puts the complexity in the kernel.  Returning
an error punts any complexity to userspace.  Specifically, anything that KVM can
do regarding vCPU state to get the right MMU, userspace can do too.

Add on that silently doing things that effectively ignore guest state usually
ends badly, and I don't see a good argument for this patch (or any variant
thereof).
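
As a footnote to the root count earlier in this mail, the six TDP roots can be
enumerated mechanically.  This is only a counting sketch: tdp_roots() is a
hypothetical helper, and skipping the SMM + guest_mode combinations simply
mirrors the list in this mail, which has no such entries.

```python
from itertools import product


def tdp_roots():
    """Return (levels, smm, guest_mode) tuples in the order listed in the mail."""
    roots = []
    for guest_mode, levels, smm in product((False, True), (4, 5), (False, True)):
        if smm and guest_mode:
            continue  # no SMM + guest_mode entries appear in the mail's list
        roots.append((levels, smm, guest_mode))
    return roots


if __name__ == "__main__":
    for index, (levels, smm, guest_mode) in enumerate(tdp_roots(), start=1):
        smm_text = "SMM" if smm else "!SMM"
        mode_text = "guest_mode" if guest_mode else "!guest_mode"
        print(f"{index}. {levels}-level {smm_text} {mode_text}")
```

The only inputs are the paging level, SMM, and guest_mode, which is why the
number of variables userspace has to control stays small.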