Subject: Re: [PATCH v2 8/8] KVM: x86: hyper-v: Deactivate APICv only when AutoEOI feature is in use
From: Maxim Levitsky
To: Vitaly Kuznetsov, kvm@vger.kernel.org
Cc: "open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)", Jim Mattson, Joerg Roedel, Borislav Petkov, Wanpeng Li, Paolo Bonzini, Thomas Gleixner, "H. Peter Anvin", Ingo Molnar, "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)", Sean Christopherson
Date: Mon, 19 Jul 2021 12:00:02 +0300
In-Reply-To: <87wnpmzqw3.fsf@vitty.brq.redhat.com>
References: <20210713142023.106183-1-mlevitsk@redhat.com> <20210713142023.106183-9-mlevitsk@redhat.com> <87wnpmzqw3.fsf@vitty.brq.redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 2021-07-19 at 09:47 +0200, Vitaly Kuznetsov wrote:
> Maxim Levitsky writes:
>
> > On Tue, 2021-07-13 at 17:20 +0300, Maxim Levitsky wrote:
> > > From: Vitaly Kuznetsov
> > >
> > > APICV_INHIBIT_REASON_HYPERV is currently unconditionally forced upon
> > > SynIC activation as SynIC's AutoEOI is incompatible with APICv/AVIC. It is,
> > > however, possible to track whether the feature was actually used by the
> > > guest and only inhibit APICv/AVIC when needed.
> > >
> > > TLFS suggests a dedicated 'HV_DEPRECATING_AEOI_RECOMMENDED' flag to let
> > > Windows know that the AutoEOI feature should be avoided. While it's up to
> > > KVM userspace to set the flag, KVM can help a bit by exposing global
> > > APICv/AVIC enablement: in case APICv/AVIC usage is impossible, AutoEOI
> > > is still preferred.
> > > Maxim:
> > >  - added SRCU lock drop around call to kvm_request_apicv_update
> > >  - always set HV_DEPRECATING_AEOI_RECOMMENDED in kvm_get_hv_cpuid,
> > >    since this feature can be used regardless of AVIC
> > >
> > > Signed-off-by: Vitaly Kuznetsov
> > > Signed-off-by: Maxim Levitsky
> > > ---
> > >  arch/x86/include/asm/kvm_host.h |  3 +++
> > >  arch/x86/kvm/hyperv.c           | 34 +++++++++++++++++++++++++++------
> > >  2 files changed, 31 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > > index e11d64aa0bcd..f900dca58af8 100644
> > > --- a/arch/x86/include/asm/kvm_host.h
> > > +++ b/arch/x86/include/asm/kvm_host.h
> > > @@ -956,6 +956,9 @@ struct kvm_hv {
> > >  	/* How many vCPUs have VP index != vCPU index */
> > >  	atomic_t num_mismatched_vp_indexes;
> > >
> > > +	/* How many SynICs use 'AutoEOI' feature */
> > > +	atomic_t synic_auto_eoi_used;
> > > +
> > >  	struct hv_partition_assist_pg *hv_pa_pg;
> > >  	struct kvm_hv_syndbg hv_syndbg;
> > >  };
> > > diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> > > index b07592ca92f0..6bf47a583d0e 100644
> > > --- a/arch/x86/kvm/hyperv.c
> > > +++ b/arch/x86/kvm/hyperv.c
> > > @@ -85,9 +85,22 @@ static bool synic_has_vector_auto_eoi(struct kvm_vcpu_hv_synic *synic,
> > >  	return false;
> > >  }
> > >
> > > +
> > > +static void synic_toggle_avic(struct kvm_vcpu *vcpu, bool activate)
> > > +{
> > > +	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
> > > +	kvm_request_apicv_update(vcpu->kvm, activate,
> > > +				 APICV_INHIBIT_REASON_HYPERV);
> > > +	vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
> > > +}
> >
> > Well, it turns out that this patch still doesn't work (this
> > weekend I found out that all my AVIC-enabled VMs hang on reboot).
> >
> > I finally found out what prompted me back then to make the srcu lock drop
> > in synic_update_vector conditional on whether the write was done
> > by the host.
> >
> > Turns out that while KVM_SET_MSRS does take the kvm->srcu lock,
> > it stores the returned srcu index in a local variable and not
> > in vcpu->srcu_idx, thus the lock drop in synic_toggle_avic
> > doesn't work.
> >
> > So it is likely that I saw it not work and blamed
> > KVM_SET_MSRS for not taking the srcu lock, which was a wrong assumption.
> >
> > I am more inclined to fix this by just tracking whether we hold the srcu
> > lock on each vCPU manually, just as we track the srcu index anyway,
> > and then kvm_request_apicv_update can use this to drop the srcu
> > lock when needed.
>
> Would it be possible to use some magic value in 'vcpu->srcu_idx' and not
> introduce a new 'srcu_is_locked' flag?

Well, currently the index value returned from srcu_read_lock is opaque
(and we have two SRCU implementations; both, I think, return small positive
numbers, but I haven't studied them in depth).
We could ask the people who maintain SRCU to reserve a value (like -1) for this.
I'd probably first add the 'srcu_is_locked' flag, though, and then remove it
in a follow-up patch if they agree.
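The fix proposed above (tracking whether each vCPU currently holds kvm->srcu alongside the opaque index, and only dropping/reacquiring the lock when it is actually held) can be sketched as a small user-space model. All names here (struct vcpu_state, fake_srcu_*, toggle_avic) are illustrative stand-ins, not the actual KVM code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-vCPU state modeling the proposal: a 'srcu_is_locked'
 * flag tracked alongside the opaque srcu index. */
struct vcpu_state {
	int  srcu_idx;
	bool srcu_is_locked;
};

static int next_idx; /* stand-in for the opaque index srcu_read_lock() returns */

static int fake_srcu_read_lock(struct vcpu_state *v)
{
	v->srcu_is_locked = true;
	v->srcu_idx = next_idx++;
	return v->srcu_idx;
}

static void fake_srcu_read_unlock(struct vcpu_state *v)
{
	v->srcu_is_locked = false;
}

/* Drop the lock around the (elided) kvm_request_apicv_update() call only
 * if this vCPU actually holds it. A caller like KVM_SET_MSRS, which keeps
 * its index in a local variable rather than vcpu->srcu_idx, is then
 * handled correctly: nothing is dropped that was never recorded. */
static void toggle_avic(struct vcpu_state *v, bool activate)
{
	bool was_locked = v->srcu_is_locked;

	if (was_locked)
		fake_srcu_read_unlock(v);

	(void)activate; /* kvm_request_apicv_update() would run here */

	if (was_locked)
		fake_srcu_read_lock(v);
}
```

The point of the model is that toggle_avic leaves the lock state exactly as it found it, whether or not the vCPU held kvm->srcu on entry.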
> > > +
> > >  static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
> > >  				int vector)
> > >  {
> > > +	struct kvm_vcpu *vcpu = hv_synic_to_vcpu(synic);
> > > +	struct kvm_hv *hv = to_kvm_hv(vcpu->kvm);
> > > +	int auto_eoi_old, auto_eoi_new;
> > > +
> > >  	if (vector < HV_SYNIC_FIRST_VALID_VECTOR)
> > >  		return;
> > >
> > > @@ -96,10 +109,23 @@ static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
> > >  	else
> > >  		__clear_bit(vector, synic->vec_bitmap);
> > >
> > > +	auto_eoi_old = bitmap_weight(synic->auto_eoi_bitmap, 256);
> > > +
> > >  	if (synic_has_vector_auto_eoi(synic, vector))
> > >  		__set_bit(vector, synic->auto_eoi_bitmap);
> > >  	else
> > >  		__clear_bit(vector, synic->auto_eoi_bitmap);
> > > +
> > > +	auto_eoi_new = bitmap_weight(synic->auto_eoi_bitmap, 256);
> > > +
> > > +	/* Hyper-V SynIC auto EOI SINTs are not compatible with APICV */
> > > +	if (!auto_eoi_old && auto_eoi_new) {
> > > +		if (atomic_inc_return(&hv->synic_auto_eoi_used) == 1)
> > > +			synic_toggle_avic(vcpu, false);
> > > +	} else if (!auto_eoi_new && auto_eoi_old) {
> > > +		if (atomic_dec_return(&hv->synic_auto_eoi_used) == 0)
> > > +			synic_toggle_avic(vcpu, true);
> > > +	}
> > >  }
> > >
> > >  static int synic_set_sint(struct kvm_vcpu_hv_synic *synic, int sint,
> > > @@ -933,12 +959,6 @@ int kvm_hv_activate_synic(struct kvm_vcpu *vcpu, bool dont_zero_synic_pages)
> > >
> > >  	synic = to_hv_synic(vcpu);
> > >
> > > -	/*
> > > -	 * Hyper-V SynIC auto EOI SINT's are
> > > -	 * not compatible with APICV, so request
> > > -	 * to deactivate APICV permanently.
> > > -	 */
> > > -	kvm_request_apicv_update(vcpu->kvm, false, APICV_INHIBIT_REASON_HYPERV);
> > >  	synic->active = true;
> > >  	synic->dont_zero_synic_pages = dont_zero_synic_pages;
> > >  	synic->control = HV_SYNIC_CONTROL_ENABLE;
> > > @@ -2466,6 +2486,8 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
> > >  		ent->eax |= HV_X64_ENLIGHTENED_VMCS_RECOMMENDED;
> > >  	if (!cpu_smt_possible())
> > >  		ent->eax |= HV_X64_NO_NONARCH_CORESHARING;
> > > +
> > > +	ent->eax |= HV_DEPRECATING_AEOI_RECOMMENDED;
> >
> > Vitaly, I would like to hear your opinion on this change I also made to your code.
> > I think that we should always expose HV_DEPRECATING_AEOI_RECOMMENDED as a supported
> > HV CPUID bit regardless of AVIC, so that qemu could set it regardless of AVIC
> > support in the kernel, even if this is not optimal.
>
> Generally, I'm OK with the change. The meaning of the bit changes
> slightly depending on whether we expose it conditionally or not.
>
> If HV_DEPRECATING_AEOI_RECOMMENDED is always exposed, it just means that
> the patch in question is in, nothing else. The VMM has to figure out if
> APICv/AVIC is supported by some other means.
>
> If HV_DEPRECATING_AEOI_RECOMMENDED is exposed conditionally, then the VMM
> can be stupid and rely solely on this information by just copying the
> bit to user-visible CPUIDs (e.g. QEMU's 'hv-passthrough' mode).

I understand what you mean. In regard to AVIC, though, 'hv-passthrough'
doesn't work at all anyway, since it also makes QEMU enable the hv-vapic bit,
which makes the guest not use the APIC...

So it seems that not even QEMU but a management layer above it should figure
out that it wants AVIC, and then enable all the bits that 'should' make it
work, but AVIC might still not work for some reason.

Currently these are the flags that are needed to make AVIC work on Windows:

-cpu -svm,-x2apic,-hv-vapic,hv-aeoi-deprecation (I added this locally)
-global kvm-pit.lost_tick_policy=discard

Thanks,
Best regards,
	Maxim Levitsky