Received: by 2002:a5d:9c59:0:0:0:0:0 with SMTP id 25csp2181442iof; Tue, 7 Jun 2022 22:24:18 -0700 (PDT) X-Google-Smtp-Source: ABdhPJw3GS/ZOw4Vy6fya1rLO0AZ5QpLAMJJwRINOr+SamSBeFMuBm/ZQld0w/1X/s1YKdD33brE X-Received: by 2002:a05:6a00:1941:b0:50d:807d:530b with SMTP id s1-20020a056a00194100b0050d807d530bmr32983037pfk.17.1654665858648; Tue, 07 Jun 2022 22:24:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1654665858; cv=none; d=google.com; s=arc-20160816; b=gcvio/qSocbjbpcNVMuPux2v9JzNizSExQmsekQZaZlj0wZBuTPiwAXTG2HouZzNLs /k65/U/Q7QYq+Y9g3SFUGVAjJmv5tNsSr6yEEmUwzhQVS7GtfX2msfjum+GyYWpPT0Sb FgqevMVxJ6i1Hq54412K9auDOu6epfz/v1laWY7oSlax99bffUhADig/ADkTqEzbHciQ 4Tfunkf9+7sgiUaW6y9VqP46+9FA2KLiRkXJv7xqeZQMsg9VoB878lukQrZ2xkF6by5s bWxTGp0UPRaITb8rJ8xJwOoVZqWb1cWOuPX/8fJi62yaC11oM5LiPqTDAu4V5clEQy7I Zh8A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:date:cc:to:from:subject :message-id:dkim-signature; bh=Im8ECHcYmpO3jURF5NFQ6YSKCiAdvp768cL63zS5OUM=; b=SI6FAu5ASLB/4RV/2b9vobfDE82cOII1mwPeXrTOJ/xtS+wg+mLSr6GHOAkQTvWuAC yf9AZrQjCdCTnPZiCF5RFSlPHgcFK1vhZsuh3sfcn/Nflf5MvlJBEujCSFViTVFdB47T CC6MQyhpz+4X6eGwjp6+LAuCp3uQlBc9zk+EEpeZAO3DsaAaWXmpJij9RVrYodyYO+vX aXmmbhuZiaTdQpf/INVqvR/bulEbPIvntBzQfnOLG3lIqcLNKD2ryvkPUQ31eSXepR3z KLyy0x8Dq3/MtA6peAQU1V0YCdhQDXWmpFaE9EL/IoNUYdv7mE4UIw2Es1t/Ln1qCps8 o6LA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=U1NHtWIi; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [2620:137:e000::1:18]) by mx.google.com with ESMTPS id ij22-20020a170902ab5600b00153b2d16660si18045109plb.616.2022.06.07.22.24.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 07 Jun 2022 22:24:18 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) client-ip=2620:137:e000::1:18; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=U1NHtWIi; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id D658F470C55; Tue, 7 Jun 2022 21:51:06 -0700 (PDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238939AbiFGI6I (ORCPT + 99 others); Tue, 7 Jun 2022 04:58:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41562 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238831AbiFGI5u (ORCPT ); Tue, 7 Jun 2022 04:57:50 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id C444D11166 for ; Tue, 7 Jun 2022 01:57:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1654592266; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Im8ECHcYmpO3jURF5NFQ6YSKCiAdvp768cL63zS5OUM=; b=U1NHtWIiGI6oc6LhzJ9OfPM+z2lXA6hoilMAM4DWGRL9xtMsZGHgV9MPKKV4CsMpeMn32J 0wPrRiPSzRsm558oe91DaaBEYPi8ovZIo5n4n2QZG/kmWUnpuFKihf+PfgTSPpm7k6qGNR 9DzKydqwRbOg31jIKKE1Rqd0IaWTzT8= Received: from mail-qk1-f200.google.com (mail-qk1-f200.google.com [209.85.222.200]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-599-_2xxL-CMNO-BrkTzbmSlvw-1; Tue, 07 Jun 2022 04:57:45 -0400 X-MC-Unique: _2xxL-CMNO-BrkTzbmSlvw-1 Received: by mail-qk1-f200.google.com with SMTP id bp18-20020a05620a459200b006a6a7b1f0acso7703532qkb.13 for ; Tue, 07 Jun 2022 01:57:45 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:subject:from:to:cc:date:in-reply-to :references:user-agent:mime-version:content-transfer-encoding; bh=Im8ECHcYmpO3jURF5NFQ6YSKCiAdvp768cL63zS5OUM=; b=vVptD5cD9odMO/yA77FRYmBcEf0eISPwfqmnUrbvS/joT5KUENYS8Xo3D/aCayFziK UH1FagEVNH3tr7QIGyVJf+c3r5RJ9RBgNC9/Vd+//MsQsdJW6iOaAdcALvdTaxXgj8vS 1fcAILvIGS4Hfj7qrePKQ5Y8/oBlxBanZ2eeym8y0z9TjUPvn7A0FNHqNfiGI7I7WDFj 7bxqJw2jmtgo5ckWZXl2tnR+WCRx8OirrT231DHZBz9coAWNmPYO0E4P1E8KcLIiDOEj yRja3toilr2hYJVYI2cNI5ckG0FtgtvqfusnUuWK7iPPRpbnyKh/vlBpo2Hh6tsqALTQ sdSw== X-Gm-Message-State: AOAM5320EadvfURahey0BqmVTXjGu/HlvB9MlB2u2PuJnmHdI5/YxC6W FwKuKszgSRn146OFZ9fG+VI8SF+OV8JU4leTyG+bU3d/a6QSfwXUhJEvYHjxm9WVXg3LKvt0bkj ltJPjZ6DCSp0ApbU/nWZuWQsm X-Received: by 2002:a05:620a:2681:b0:67e:909f:d5ac with SMTP id c1-20020a05620a268100b0067e909fd5acmr18674848qkp.125.1654592264953; Tue, 07 Jun 2022 01:57:44 -0700 (PDT) X-Received: by 2002:a05:620a:2681:b0:67e:909f:d5ac with SMTP id c1-20020a05620a268100b0067e909fd5acmr18674844qkp.125.1654592264700; Tue, 07 Jun 2022 01:57:44 -0700 (PDT) Received: from [10.35.4.238] (bzq-82-81-161-50.red.bezeqint.net. [82.81.161.50]) by smtp.gmail.com with ESMTPSA id h12-20020a05620a13ec00b0069fc13ce20asm5775934qkl.59.2022.06.07.01.57.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 07 Jun 2022 01:57:43 -0700 (PDT) Message-ID: <4be614689a902303cef1e5e1889564f965e63baa.camel@redhat.com> Subject: Re: [PATCH v6 03/38] KVM: x86: hyper-v: Introduce TLB flush fifo From: Maxim Levitsky To: Vitaly Kuznetsov , kvm@vger.kernel.org, Paolo Bonzini Cc: Sean Christopherson , Wanpeng Li , Jim Mattson , Michael Kelley , Siddharth Chandrasekaran , Yuan Yao , linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org Date: Tue, 07 Jun 2022 11:57:40 +0300 In-Reply-To: <20220606083655.2014609-4-vkuznets@redhat.com> References: <20220606083655.2014609-1-vkuznets@redhat.com> <20220606083655.2014609-4-vkuznets@redhat.com> Content-Type: text/plain; charset="UTF-8" User-Agent: Evolution 3.40.4 (3.40.4-2.fc34) MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-3.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,RDNS_NONE,SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, 2022-06-06 at 10:36 +0200, Vitaly Kuznetsov wrote: > To allow flushing individual GVAs instead of always flushing the > whole > VPID a per-vCPU structure to pass the requests is needed. Use > standard > 'kfifo' to queue two types of entries: individual GVA (GFN + up to > 4095 > following GFNs in the lower 12 bits) and 'flush all'. Honestly I still don't think I understand why we can't just raise KVM_REQ_TLB_FLUSH_GUEST when the guest uses this interface to flush everthing, and then we won't need to touch the ring at all. But I am just curious - if there is a reason for that, then no objections from my side. > > The size of the fifo is arbitrary set to '16'. > > Note, kvm_hv_flush_tlb() only queues 'flush all' entries for now and > kvm_hv_vcpu_flush_tlb() doesn't actually read the fifo just resets > the > queue before doing full TLB flush so the functional change is very > small but the infrastructure is prepared to handle individual GVA > flush requests. > > Signed-off-by: Vitaly Kuznetsov > --- >  arch/x86/include/asm/kvm_host.h | 20 +++++++++++++++ >  arch/x86/kvm/hyperv.c           | 45 > +++++++++++++++++++++++++++++++++ >  arch/x86/kvm/hyperv.h           | 16 ++++++++++++ >  arch/x86/kvm/x86.c              |  8 +++--- >  arch/x86/kvm/x86.h              |  1 + >  5 files changed, 86 insertions(+), 4 deletions(-) > > diff --git a/arch/x86/include/asm/kvm_host.h > b/arch/x86/include/asm/kvm_host.h > index 7a6a6f47b603..cf3748be236d 100644 > --- a/arch/x86/include/asm/kvm_host.h > +++ b/arch/x86/include/asm/kvm_host.h > @@ -25,6 +25,7 @@ >  #include >  #include >  #include > +#include >   >  #include >  #include > @@ -600,6 +601,23 @@ struct kvm_vcpu_hv_synic { >         bool dont_zero_synic_pages; >  }; >   > +/* The maximum number of entries on the TLB flush fifo. */ > +#define KVM_HV_TLB_FLUSH_FIFO_SIZE (16) > +/* > + * Note: the following 'magic' entry is made up by KVM to avoid > putting > + * anything besides GVA on the TLB flush fifo. It is theoretically > possible > + * to observe a request to flush 4095 PFNs starting from > 0xfffffffffffff000 > + * which will look identical. KVM's action to 'flush everything' > instead of > + * flushing these particular addresses is, however, fully legitimate > as > + * flushing more than requested is always OK. > + */ > +#define KVM_HV_TLB_FLUSHALL_ENTRY  ((u64)-1) > + > +struct kvm_vcpu_hv_tlb_flush_fifo { > +       spinlock_t write_lock; > +       DECLARE_KFIFO(entries, u64, KVM_HV_TLB_FLUSH_FIFO_SIZE); > +}; > + >  /* Hyper-V per vcpu emulation context */ >  struct kvm_vcpu_hv { >         struct kvm_vcpu *vcpu; > @@ -619,6 +637,8 @@ struct kvm_vcpu_hv { >                 u32 enlightenments_ebx; /* > HYPERV_CPUID_ENLIGHTMENT_INFO.EBX */ >                 u32 syndbg_cap_eax; /* > HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES.EAX */ >         } cpuid_cache; > + > +       struct kvm_vcpu_hv_tlb_flush_fifo tlb_flush_fifo; >  }; >   >  /* Xen HVM per vcpu emulation context */ > diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c > index b402ad059eb9..c8b22bf67577 100644 > --- a/arch/x86/kvm/hyperv.c > +++ b/arch/x86/kvm/hyperv.c > @@ -29,6 +29,7 @@ >  #include >  #include >  #include > +#include >  #include >   >  #include > @@ -954,6 +955,9 @@ static int kvm_hv_vcpu_init(struct kvm_vcpu > *vcpu) >   >         hv_vcpu->vp_index = vcpu->vcpu_idx; >   > +       INIT_KFIFO(hv_vcpu->tlb_flush_fifo.entries); > +       spin_lock_init(&hv_vcpu->tlb_flush_fifo.write_lock); > + >         return 0; >  } >   > @@ -1789,6 +1793,35 @@ static u64 kvm_get_sparse_vp_set(struct kvm > *kvm, struct kvm_hv_hcall *hc, >                               var_cnt * sizeof(*sparse_banks)); >  } >   > +static void hv_tlb_flush_enqueue(struct kvm_vcpu *vcpu) > +{ > +       struct kvm_vcpu_hv_tlb_flush_fifo *tlb_flush_fifo; > +       struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu); > +       u64 entry = KVM_HV_TLB_FLUSHALL_ENTRY; > + > +       if (!hv_vcpu) > +               return; > + > +       tlb_flush_fifo = &hv_vcpu->tlb_flush_fifo; > + > +       kfifo_in_spinlocked(&tlb_flush_fifo->entries, &entry, 1, > &tlb_flush_fifo->write_lock); Tiny nitpick: the 'tlb_flush_fifo' isn't really neede here I think, but probably will be needed in later patches, so feel free to ignore. > +} > + > +void kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu) > +{ > +       struct kvm_vcpu_hv_tlb_flush_fifo *tlb_flush_fifo; > +       struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu); > + > +       kvm_vcpu_flush_tlb_guest(vcpu); > + > +       if (!hv_vcpu) > +               return; > + > +       tlb_flush_fifo = &hv_vcpu->tlb_flush_fifo; > + > +       kfifo_reset_out(&tlb_flush_fifo->entries); Same here. > +} > + >  static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct > kvm_hv_hcall *hc) >  { >         struct kvm *kvm = vcpu->kvm; > @@ -1797,6 +1830,8 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu > *vcpu, struct kvm_hv_hcall *hc) >         DECLARE_BITMAP(vcpu_mask, KVM_MAX_VCPUS); >         u64 valid_bank_mask; >         u64 sparse_banks[KVM_HV_MAX_SPARSE_VCPU_SET_BITS]; > +       struct kvm_vcpu *v; > +       unsigned long i; >         bool all_cpus; >   >         /* > @@ -1876,10 +1911,20 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu > *vcpu, struct kvm_hv_hcall *hc) >          * analyze it here, flush TLB regardless of the specified > address space. >          */ >         if (all_cpus) { > +               kvm_for_each_vcpu(i, v, kvm) > +                       hv_tlb_flush_enqueue(v); > + >                 kvm_make_all_cpus_request(kvm, KVM_REQ_HV_TLB_FLUSH); >         } else { >                 sparse_set_to_vcpu_mask(kvm, sparse_banks, > valid_bank_mask, vcpu_mask); >   > +               for_each_set_bit(i, vcpu_mask, KVM_MAX_VCPUS) { > +                       v = kvm_get_vcpu(kvm, i); > +                       if (!v) > +                               continue; > +                       hv_tlb_flush_enqueue(v); > +               } > + >                 kvm_make_vcpus_request_mask(kvm, > KVM_REQ_HV_TLB_FLUSH, vcpu_mask); >         } >   > diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h > index da2737f2a956..e5b32266ff7d 100644 > --- a/arch/x86/kvm/hyperv.h > +++ b/arch/x86/kvm/hyperv.h > @@ -147,4 +147,20 @@ int kvm_vm_ioctl_hv_eventfd(struct kvm *kvm, > struct kvm_hyperv_eventfd *args); >  int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 > *cpuid, >                      struct kvm_cpuid_entry2 __user *entries); >   > + > +static inline void kvm_hv_vcpu_empty_flush_tlb(struct kvm_vcpu > *vcpu) > +{ > +       struct kvm_vcpu_hv_tlb_flush_fifo *tlb_flush_fifo; > +       struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu); > + > +       if (!hv_vcpu || !kvm_check_request(KVM_REQ_HV_TLB_FLUSH, > vcpu)) > +               return; > + > +       tlb_flush_fifo = &hv_vcpu->tlb_flush_fifo; > + > +       kfifo_reset_out(&tlb_flush_fifo->entries); > +} > +void kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu); > + > + >  #endif > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 80cd3eb5e7de..805db43c2829 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -3320,7 +3320,7 @@ static void kvm_vcpu_flush_tlb_all(struct > kvm_vcpu *vcpu) >         static_call(kvm_x86_flush_tlb_all)(vcpu); >  } >   > -static void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu) > +void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu) >  { >         ++vcpu->stat.tlb_flush; >   > @@ -3355,14 +3355,14 @@ void > kvm_service_local_tlb_flush_requests(struct kvm_vcpu *vcpu) >  { >         if (kvm_check_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu)) { >                 kvm_vcpu_flush_tlb_current(vcpu); > -               kvm_clear_request(KVM_REQ_HV_TLB_FLUSH, vcpu); > +               kvm_hv_vcpu_empty_flush_tlb(vcpu); >         } >   >         if (kvm_check_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu)) { >                 kvm_vcpu_flush_tlb_guest(vcpu); > -               kvm_clear_request(KVM_REQ_HV_TLB_FLUSH, vcpu); > +               kvm_hv_vcpu_empty_flush_tlb(vcpu); >         } else if (kvm_check_request(KVM_REQ_HV_TLB_FLUSH, vcpu)) { > -               kvm_vcpu_flush_tlb_guest(vcpu); > +               kvm_hv_vcpu_flush_tlb(vcpu); >         } >  } >  EXPORT_SYMBOL_GPL(kvm_service_local_tlb_flush_requests); > diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h > index 501b884b8cc4..9f7989f2c6d4 100644 > --- a/arch/x86/kvm/x86.h > +++ b/arch/x86/kvm/x86.h > @@ -79,6 +79,7 @@ static inline unsigned int > __shrink_ple_window(unsigned int val, >   >  #define MSR_IA32_CR_PAT_DEFAULT  0x0007040600070406ULL >   > +void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu); >  void kvm_service_local_tlb_flush_requests(struct kvm_vcpu *vcpu); >  int kvm_check_nested_events(struct kvm_vcpu *vcpu); >   Best regards, Maxim Levitsky