Date: Wed, 3 Aug 2022 17:00:26 +0000
From: Mingwei Zhang
To: Maxim Levitsky
Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Oliver Upton
Subject: Re: [PATCH 1/5] KVM: x86: Get vmcs12 pages before checking pending interrupts
References: <20220802230718.1891356-1-mizhang@google.com> <20220802230718.1891356-2-mizhang@google.com> <060419e118445978549f0c7d800f96a9728c157c.camel@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Aug 03, 2022, Mingwei Zhang wrote:
> On Wed, Aug 03, 2022, Maxim Levitsky wrote:
> > On Tue, 2022-08-02 at 23:07 +0000, Mingwei Zhang wrote:
> > > From: Oliver Upton
> > >
> > > vmx_guest_apic_has_interrupts implicitly depends on the virtual APIC
> > > page being present + mapped into the kernel address space. However, with
> > > demand paging we break this dependency, as the KVM_REQ_GET_VMCS12_PAGES
> > > event isn't assessed before entering vcpu_block.
> > >
> > > Fix this by getting vmcs12 pages before inspecting the guest's APIC
> > > page. Note that upstream does not have this issue, as they will directly
> > > get the vmcs12 pages on vmlaunch/vmresume instead of relying on the
> > > event request mechanism. However, the upstream approach is problematic,
> > > as the vmcs12 pages will not be present if a live migration occurred
> > > before checking the virtual APIC page.
> >
> > Since this patch is intended for upstream, I don't fully understand
> > the meaning of the above paragraph.
>
> My apologies. Some of the statements need to be updated, which I should
> have done before sending. But I think the point here is that a
> get_nested_state_pages() call is missing within vcpu_block() when
> KVM_REQ_GET_NESTED_STATE_PAGES is pending.
>
> > >
> > > Signed-off-by: Oliver Upton
> > > Signed-off-by: Mingwei Zhang
> > > ---
> > >  arch/x86/kvm/x86.c | 17 +++++++++++++++++
> > >  1 file changed, 17 insertions(+)
> > >
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index 5366f884e9a7..1d3d8127aaea 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -10599,6 +10599,23 @@ static inline int vcpu_block(struct kvm_vcpu *vcpu)
> > >  {
> > >  	bool hv_timer;
> > >
> > > +	/*
> > > +	 * We must first get the vmcs12 pages before checking for interrupts
> > > +	 * that might unblock the guest if L1 is using virtual-interrupt
> > > +	 * delivery.
> > > +	 */
> > > +	if (kvm_check_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu)) {
> > > +		/*
> > > +		 * If we have to ask user-space to post-copy a page,
> > > +		 * then we have to keep trying to get all of the
> > > +		 * VMCS12 pages until we succeed.
> > > +		 */
> > > +		if (unlikely(!kvm_x86_ops.nested_ops->get_nested_state_pages(vcpu))) {
> > > +			kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
> > > +			return 0;
> > > +		}
> > > +	}
> > > +
> > >  	if (!kvm_arch_vcpu_runnable(vcpu)) {
> > >  		/*
> > >  		 * Switch to the software timer before halt-polling/blocking as
> >
> >
> > If I understand correctly, you are saying that if apic backing page is migrated in post copy
> > then 'get_nested_state_pages' will return false and thus fail?
>
> What I mean is that when the vCPU was halted and then migrated in this
> case, KVM did not call get_nested_state_pages() before getting into
> kvm_arch_vcpu_runnable(). This function checks the apic backing page and
> fails on that check and triggered the warning.
>
> > AFAIK both SVM and VMX versions of 'get_nested_state_pages' assume that this is not the case
> > for many things like MSR bitmaps and such - they always use non-atomic versions
> > of guest memory access like 'kvm_vcpu_read_guest' and 'kvm_vcpu_map' which
> > are supposed to block if they attempt to access an HVA which is not present, and then
> > userfaultfd should take over and wake them up.
>
> You are right here.
> >
> > If that still fails, nested VM entry usually fails, and/or the whole VM
> > crashes with 'KVM_EXIT_INTERNAL_ERROR'.
> >

Ah, I think I understand what you are saying. Hmm, so basically the patch
here keeps re-requesting the vmcs12 pages whenever getting them fails. But
what you are saying is that we only need to call 'get_nested_state_pages'
once; if it fails, then the VM fails to work. Let me double-check and get
back to you.

> > Anything I missed?
> >
> > Best regards,
> > Maxim Levitsky
> >
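
To make the two behaviours being compared concrete, here is a toy user-space
model in C. It is only a sketch under stated assumptions: every toy_* name,
the faults_left counter, and the -1 return in block_once() are made up for
illustration and are not KVM code. Only the check/re-arm shape of
block_retry() follows the hunk quoted above, and what the real caller does
with a return value of 0 is outside this sketch.

```c
#include <stdbool.h>

/* Toy stand-ins for the KVM request machinery; none of this is kernel code. */
#define TOY_REQ_GET_NESTED_STATE_PAGES (1u << 0)

struct toy_vcpu {
	unsigned int requests;	/* pending request bits */
	int faults_left;	/* fetch attempts that will still fail */
	int fetch_calls;	/* how many times the fetch helper ran */
};

/* Test and clear a request bit, modelled on kvm_check_request(). */
static bool toy_check_request(struct toy_vcpu *v, unsigned int req)
{
	if (!(v->requests & req))
		return false;
	v->requests &= ~req;
	return true;
}

/* Set a request bit, modelled on kvm_make_request(). */
static void toy_make_request(struct toy_vcpu *v, unsigned int req)
{
	v->requests |= req;
}

/* Hypothetical fetch: fails while faults_left > 0, then succeeds. */
static bool toy_get_nested_state_pages(struct toy_vcpu *v)
{
	v->fetch_calls++;
	if (v->faults_left > 0) {
		v->faults_left--;
		return false;
	}
	return true;
}

/* Shape of the posted hunk: on failure, re-arm the request and return 0. */
static int block_retry(struct toy_vcpu *v)
{
	if (toy_check_request(v, TOY_REQ_GET_NESTED_STATE_PAGES)) {
		if (!toy_get_nested_state_pages(v)) {
			toy_make_request(v, TOY_REQ_GET_NESTED_STATE_PAGES);
			return 0;
		}
	}
	return 1;
}

/*
 * Alternative raised in review: fetch once and treat failure as fatal.
 * The -1 is an assumed placeholder for "the VM fails to work".
 */
static int block_once(struct toy_vcpu *v)
{
	if (toy_check_request(v, TOY_REQ_GET_NESTED_STATE_PAGES)) {
		if (!toy_get_nested_state_pages(v))
			return -1;
	}
	return 1;
}
```

With the request pending and faults_left = 2, block_retry() has to be called
three times before it returns 1, and the request bit stays set in between;
block_once() gives up on the first failure and leaves the bit cleared.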