Received: by 2002:a6b:500f:0:0:0:0:0 with SMTP id e15csp330313iob; Wed, 18 May 2022 03:12:17 -0700 (PDT) X-Google-Smtp-Source: ABdhPJy+DycqnP8pCjucTgHKtUdifZfxIb6AU0MzkX3+b+aP9uXFRm0aod1Z5cnEvqVXVCtR0AY4 X-Received: by 2002:a17:903:18c:b0:15e:be98:90ef with SMTP id z12-20020a170903018c00b0015ebe9890efmr26128449plg.129.1652868736953; Wed, 18 May 2022 03:12:16 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1652868736; cv=none; d=google.com; s=arc-20160816; b=cld1nBTGggGgRCWs2D83OyEgEq1veHP9I52dGEEeuKCc3JiKVDBCKBUBxhf9S/2fk/ JzcjeFiUlKs6UtnmkmTXQ6kXjq33jfpkuUEpiaiwRHJ7iMhgXIViQt32DuuB79safhbh Uk6CZhmXXzHdffQWYJ+UUu58v3+oEpt+hkzMl6kxQ6ZNfJe0T0pg51jMxLZb7Ft5w8FB +WReK7Mvr2cQ6X/3kS6hXHMj0WuAhhzVUsgaHm0pyJc+49dRM6G6qnEShdjsIdvtnr+P XQyejXCtjAfhyS+UG7ZYjd3NfXQZl9MC+akeWRsBmJXiPAkPzF0hCBRVjd1/jct4X5f3 zn+w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:date:cc:to:from:subject :message-id:dkim-signature; bh=MFPqNngbhurlXb9z2VG6oI79AAlWQ0hNU+ijxe8hQD8=; b=YR+ep6btSzUxD7E3tYq8YOLEqBc/fFF9jaN0/g2j9eHzP4swt4Gyb8P7M0H/IqO+Cj rAke6Iaf4hB1v1iPjOYjwDZ/Zb40oiIljYOIT4NxJbaoC/hvx+mIEplEHIMn2CzzzY0x JM/+yIafZMgOnuB75D8czifNgPRhnk/NRhbQbMA5xFhG3FGotSYWx5n4LwpfeWhLUjq/ dKyn3kQXAQWPKKo9dIUPgAHGScGzraZ2SrOfno5AJ5GY2sTQvo06GLvHESbrqfspQ58M gQxuhE8QFzetnNr9o+709HfJLPbfLHWzJYZjeTNUdTMAnXsPmzC3hHQLBwEFeOIdOi9e /UlA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="DBMTr/WE"; spf=softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [23.128.96.19]) by mx.google.com with ESMTPS id q8-20020a170902f78800b00156eeb5cd6bsi2112087pln.562.2022.05.18.03.12.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 18 May 2022 03:12:16 -0700 (PDT) Received-SPF: softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) client-ip=23.128.96.19; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="DBMTr/WE"; spf=softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id CEAAC26AE8; Wed, 18 May 2022 02:50:55 -0700 (PDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234511AbiERJun (ORCPT + 99 others); Wed, 18 May 2022 05:50:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53190 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234612AbiERJum (ORCPT ); Wed, 18 May 2022 05:50:42 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id B7FA3165AF for ; Wed, 18 May 2022 02:50:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1652867439; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=MFPqNngbhurlXb9z2VG6oI79AAlWQ0hNU+ijxe8hQD8=; b=DBMTr/WENYgB8FXfPvyARQwDzWZauUw+7iSVsav4B/3qQWyXfiEfKYi1MK1x5CCJ+vMI/d uw6F7Loj9t3NdpgljjnlXX7gD3wxgHZx1TjlHBnTJS3xx1NhFGBR/ZqafOfAFKn/pDmO46 4tmEOcVTQe0aEONvgh2DRqNy29qHDu4= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-320-iUSXRWdkNa2yhFYazfRDPw-1; Wed, 18 May 2022 05:50:35 -0400 X-MC-Unique: iUSXRWdkNa2yhFYazfRDPw-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id CAD65811E76; Wed, 18 May 2022 09:50:34 +0000 (UTC) Received: from starship (unknown [10.40.192.55]) by smtp.corp.redhat.com (Postfix) with ESMTP id 23D031410F36; Wed, 18 May 2022 09:50:27 +0000 (UTC) Message-ID: <8c78939bf01a98554696add10e17b07631d97a28.camel@redhat.com> Subject: Re: [RFC PATCH v3 02/19] KVM: x86: inhibit APICv/AVIC when the guest and/or host changes apic id/base from the defaults. From: Maxim Levitsky To: Chao Gao Cc: kvm@vger.kernel.org, Wanpeng Li , Vitaly Kuznetsov , Jani Nikula , Paolo Bonzini , Tvrtko Ursulin , Rodrigo Vivi , Zhenyu Wang , Joonas Lahtinen , Tom Lendacky , Ingo Molnar , David Airlie , Thomas Gleixner , Dave Hansen , x86@kernel.org, intel-gfx@lists.freedesktop.org, Sean Christopherson , Daniel Vetter , Borislav Petkov , Joerg Roedel , linux-kernel@vger.kernel.org, Jim Mattson , Zhi Wang , Brijesh Singh , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, dri-devel@lists.freedesktop.org Date: Wed, 18 May 2022 12:50:27 +0300 In-Reply-To: <20220518082811.GA8765@gao-cwp> References: <20220427200314.276673-1-mlevitsk@redhat.com> <20220427200314.276673-3-mlevitsk@redhat.com> <20220518082811.GA8765@gao-cwp> Content-Type: text/plain; charset="UTF-8" User-Agent: Evolution 3.36.5 (3.36.5-2.fc32) MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.85 on 10.11.54.7 X-Spam-Status: No, score=-2.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,RDNS_NONE,SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, 2022-05-18 at 16:28 +0800, Chao Gao wrote: > On Wed, Apr 27, 2022 at 11:02:57PM +0300, Maxim Levitsky wrote: > > Neither of these settings should be changed by the guest and it is > > a burden to support it in the acceleration code, so just inhibit > > it instead. > > > > Also add a boolean 'apic_id_changed' to indicate if apic id ever changed. > > > > Signed-off-by: Maxim Levitsky > > --- > > arch/x86/include/asm/kvm_host.h | 3 +++ > > arch/x86/kvm/lapic.c | 25 ++++++++++++++++++++++--- > > arch/x86/kvm/lapic.h | 8 ++++++++ > > 3 files changed, 33 insertions(+), 3 deletions(-) > > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > > index 63eae00625bda..636df87542555 100644 > > --- a/arch/x86/include/asm/kvm_host.h > > +++ b/arch/x86/include/asm/kvm_host.h > > @@ -1070,6 +1070,8 @@ enum kvm_apicv_inhibit { > > APICV_INHIBIT_REASON_ABSENT, > > /* AVIC is disabled because SEV doesn't support it */ > > APICV_INHIBIT_REASON_SEV, > > + /* APIC ID and/or APIC base was changed by the guest */ > > + APICV_INHIBIT_REASON_RO_SETTINGS, > > You need to add it to check_apicv_inhibit_reasons as well. True, forgot about it. > > > }; > > > > struct kvm_arch { > > @@ -1258,6 +1260,7 @@ struct kvm_arch { > > hpa_t hv_root_tdp; > > spinlock_t hv_root_tdp_lock; > > #endif > > + bool apic_id_changed; > > What's the value of this boolean? No one reads it. I use it in later patches to kill the guest during nested VM entry if it attempts to use nested AVIC after any vCPU changed APIC ID. I mentioned this boolean in the commit description. This boolean avoids the need to go over all vCPUs and checking if they still have the initial apic id. In the future maybe we can introduce a more generic 'taint' bitmap with various flags like that, indicating that the guest did something unexpected. BTW, the other option in regard to the nested AVIC is just to ignore this issue completely. The code itself always uses vcpu_id's, thus regardless of when/how often the guest changes its apic ids, my code would just use the initial APIC ID values consistently. In this case I won't need this boolean. > > > }; > > > > struct kvm_vm_stat { > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c > > index 66b0eb0bda94e..8996675b3ef4c 100644 > > --- a/arch/x86/kvm/lapic.c > > +++ b/arch/x86/kvm/lapic.c > > @@ -2038,6 +2038,19 @@ static void apic_manage_nmi_watchdog(struct kvm_lapic *apic, u32 lvt0_val) > > } > > } > > > > +static void kvm_lapic_check_initial_apic_id(struct kvm_lapic *apic) > > +{ > > + if (kvm_apic_has_initial_apic_id(apic)) > > + return; > > + > > + pr_warn_once("APIC ID change is unsupported by KVM"); > > It is misleading because changing xAPIC ID is supported by KVM; it just > isn't compatible with APICv. Probably this pr_warn_once() should be > removed. Honestly since nobody uses this feature, I am not sure if to call this supported, I am sure that KVM has more bugs in regard of using non standard APIC ID. This warning might hopefuly make someone complain about it if this feature is actually used somewhere. > > > + > > + kvm_set_apicv_inhibit(apic->vcpu->kvm, > > + APICV_INHIBIT_REASON_RO_SETTINGS); > > The indentation here looks incorrect to me. > kvm_set_apicv_inhibit(apic->vcpu->kvm, > APICV_INHIBIT_REASON_RO_SETTINGS); True, will fix. > > > + > > + apic->vcpu->kvm->arch.apic_id_changed = true; > > +} > > + > > static int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val) > > { > > int ret = 0; > > @@ -2046,9 +2059,11 @@ static int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val) > > > > switch (reg) { > > case APIC_ID: /* Local APIC ID */ > > - if (!apic_x2apic_mode(apic)) > > + if (!apic_x2apic_mode(apic)) { > > + > > kvm_apic_set_xapic_id(apic, val >> 24); > > - else > > + kvm_lapic_check_initial_apic_id(apic); > > + } else > > ret = 1; > > break; > > > > @@ -2335,8 +2350,11 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value) > > MSR_IA32_APICBASE_BASE; > > > > if ((value & MSR_IA32_APICBASE_ENABLE) && > > - apic->base_address != APIC_DEFAULT_PHYS_BASE) > > + apic->base_address != APIC_DEFAULT_PHYS_BASE) { > > + kvm_set_apicv_inhibit(apic->vcpu->kvm, > > + APICV_INHIBIT_REASON_RO_SETTINGS); > > pr_warn_once("APIC base relocation is unsupported by KVM"); > > + } > > } > > > > void kvm_apic_update_apicv(struct kvm_vcpu *vcpu) > > @@ -2649,6 +2667,7 @@ static int kvm_apic_state_fixup(struct kvm_vcpu *vcpu, > > } > > } > > > > + kvm_lapic_check_initial_apic_id(vcpu->arch.apic); > > return 0; > > } > > > > diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h > > index 4e4f8a22754f9..b9c406d383080 100644 > > --- a/arch/x86/kvm/lapic.h > > +++ b/arch/x86/kvm/lapic.h > > @@ -252,4 +252,12 @@ static inline u8 kvm_xapic_id(struct kvm_lapic *apic) > > return kvm_lapic_get_reg(apic, APIC_ID) >> 24; > > } > > > > +static inline bool kvm_apic_has_initial_apic_id(struct kvm_lapic *apic) > > +{ > > + if (apic_x2apic_mode(apic)) > > + return true; > > I suggest warning of x2apic mode: > if (WARN_ON_ONCE(apic_x2apic_mode(apic))) > > Because it is weird that callers care about initial apic id when apic is > in x2apic mode. Yes but due to something I don't agree with, but also something that I gave up on arguing upon, KVM userspace API kind of supports setting APIC ID != initial apic id, even in x2apic mode, and disallowing it, is considered API breakage, therefore this case is possible. This case should still trigger a warning in kvm_lapic_check_initial_apic_id. Best regards, Maxim Levitsky >