Date: Wed, 13 Dec 2023 14:59:16 -0800
In-Reply-To: <5ca5592b21131f515e296afae006e5bb28b1fb87.camel@redhat.com>
References: <20220921003201.1441511-11-seanjc@google.com>
 <20231207010302.2240506-1-jmattson@google.com>
 <5ca5592b21131f515e296afae006e5bb28b1fb87.camel@redhat.com>
Subject: Re: [PATCH v4 10/12] KVM: x86: never write to memory from kvm_vcpu_check_block()
From: Sean Christopherson
To: Maxim Levitsky
Cc: Jim Mattson, alexandru.elisei@arm.com, anup@brainfault.org,
 aou@eecs.berkeley.edu, atishp@atishpatra.org, borntraeger@linux.ibm.com,
 chenhuacai@kernel.org, david@redhat.com, frankja@linux.ibm.com,
 imbrenda@linux.ibm.com, james.morse@arm.com, kvm-riscv@lists.infradead.org,
 kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-mips@vger.kernel.org,
 linux-riscv@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
 maz@kernel.org, oliver.upton@linux.dev, palmer@dabbelt.com,
 paul.walmsley@sifive.com, pbonzini@redhat.com, suzuki.poulose@arm.com
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Dec 14, 2023, Maxim Levitsky wrote:
> On Tue, 2023-12-12 at 07:28 -0800, Sean Christopherson wrote:
> > On Sun, Dec 10, 2023, Jim Mattson wrote:
> > > On Thu, Dec 7, 2023 at 8:21 AM Sean Christopherson wrote:
> > > > Doh. We got the less obvious cases and missed the obvious one.
> > > >
> > > > Ugh, and we also missed a related mess in kvm_guest_apic_has_interrupt(). That
> > > > thing should really be folded into vmx_has_nested_events().
> > > >
> > > > Good gravy. And vmx_interrupt_blocked() does the wrong thing because that
> > > > specifically checks if L1 interrupts are blocked.
> > > >
> > > > Compile tested only, and definitely needs to be chunked into multiple patches,
> > > > but I think something like this mess?
> > >
> > > The proposed patch does not fix the problem.
> > > In fact, it messes things up so much that I don't get any test results
> > > back.
> >
> > Drat.
> >
> > > Google has an internal K-U-T test that demonstrates the problem. I
> > > will post it soon.
> >
> > Received, I'll dig in soonish, though "soonish" unfortunately might mean
> > 2024.
> >
>
> Hi,
>
> So this is what I think:
>
> KVM does have kvm_guest_apic_has_interrupt() for this exact purpose,
> to check if nested APICv has a pending interrupt before halting.

For all intents and purposes, so was nested_ops->has_events(). I don't see any
reason to have two APIs that do the same thing, and the call to
kvm_guest_apic_has_interrupt() is wrong in that it doesn't verify that IRQs are
enabled for _L2_. That's why my preference is to fold the two together.

> However the problem is bigger - with APICv we have in essence 2 pending
> interrupt bitmaps - the PIR and the IRR, and to know if the guest has a
> pending interrupt one has in theory to copy PIR to IRR, then see if the max
> is larger than the current PPR.

Yeah, this is what my untested hack-a-patch tried to do.

> Since we don't want to write to guest memory,

The changelog is misleading/wrong. Writing guest memory is ok; what isn't safe
is blocking or sleeping, i.e. KVM must not trigger a host page fault due to
accessing a page that's been swapped out. Read vs. write doesn't matter.

So KVM can safely read and write guest memory so long as it is already mapped by
kvm_vcpu_map() (or I suppose if we wrapped an access with pagefault_disable(),
but I can't think of a sane reason to do that). E.g. nVMX can access a vCPU's
PID mapping, but synthesizing a nested VM-Exit will cause explosions on nSVM.

> and the IRR here resides in the guest memory, I guess we have to do a
> 'dry-run' version of 'vmx_complete_nested_posted_interrupt' and call it from
> kvm_guest_apic_has_interrupt().

nested_ops->has_events() is the much better fit, e.g.
the naming won't get weird and we can gate the whole thing on is_guest_mode().
Though we probably need a wrapper to handle any commonalities between nVMX and
nSVM.

> What do you think? I can prepare a patch for this.

As above, this is what I tried to do, sort of. Though it's obviously broken.
We don't need a full dry-run because KVM only needs to detect events that are
unique to L2, e.g. nVMX's preemption timer, MTF, and pending virtual interrupts
(hmm, I suspect nSVM's vNMI is broken too). Things like INIT and SMI don't
require nested virtualization awareness because the event itself is tracked for
the vCPU as a whole.
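
To make the "detect pending virtual interrupts without a full dry-run" idea
concrete, here is a minimal userspace sketch in C. It is not KVM code and has
not been tested against KVM: struct model_vcpu and the model_*() helpers are
hypothetical names invented for this example, and the state layout is an
assumption. It ORs the PIR and IRR without modifying either, treats a vector as
deliverable only when its priority class (bits 7:4) is above the PPR's class,
and bails early unless L2 is active with IRQs enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_VECTOR_WORDS 8	/* 256 vectors / 32 bits per word */

/* Hypothetical model of per-vCPU state; not KVM's real layout. */
struct model_vcpu {
	bool guest_mode;		/* true while L2 is active */
	bool l2_irqs_enabled;		/* models L2's RFLAGS.IF */
	uint8_t ppr;			/* L2's processor priority register */
	uint32_t pir[NR_VECTOR_WORDS];	/* posted, not yet synced to IRR */
	uint32_t irr[NR_VECTOR_WORDS];	/* already requested */
};

/* Highest vector set in (PIR | IRR), or -1 if none. Reads only. */
static int model_highest_pending(const struct model_vcpu *v)
{
	assert(v);

	for (int w = NR_VECTOR_WORDS - 1; w >= 0; w--) {
		uint32_t bits = v->pir[w] | v->irr[w];

		for (int b = 31; b >= 0; b--) {
			if (bits & (UINT32_C(1) << b))
				return w * 32 + b;
		}
	}
	return -1;
}

/* Would L2 have a deliverable interrupt if PIR were synced into IRR? */
static bool model_has_nested_events(const struct model_vcpu *v)
{
	int vec;

	assert(v);

	/* Only events unique to L2 matter, and only if L2 can take IRQs. */
	if (!v->guest_mode || !v->l2_irqs_enabled)
		return false;

	vec = model_highest_pending(v);

	/* Deliverable only if the vector's class beats the PPR's class. */
	return vec >= 0 && (vec >> 4) > (v->ppr >> 4);
}
```

Because the helpers take a const pointer and never store to the bitmaps, the
check can be repeated without changing state. Whether the real PID and
virtual-APIC pages can be read this way still depends on them being mapped, as
discussed above.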