Received: by 2002:a25:c205:0:0:0:0:0 with SMTP id s5csp2434707ybf; Mon, 2 Mar 2020 08:33:27 -0800 (PST) X-Google-Smtp-Source: ADFU+vtqJ7GvYv9UbDEZyAexr8X1KbDUCwD9Xzy+5mNNxftMErLcG/LD4VXOdmOCKUplwTKfBesL X-Received: by 2002:aca:db43:: with SMTP id s64mr223617oig.144.1583166806965; Mon, 02 Mar 2020 08:33:26 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1583166806; cv=none; d=google.com; s=arc-20160816; b=DP/RI06SuDKjeDo0FMLSV40M+qWhUoFjfS6/bxHi1UFMNWMlpRWftm/VAjbAxaJ9Hx eLgLz4ZsDEr/xmS17iLpPUQ7yNSQwjDZThLV6ZlsBK3raf2Pfx5mrGVAhR3U+wDTbm3b 5TlmaEHD0w18EfwiotVApjFuLtgBYmzt/vNsVEoXc0ODOIoM5dXYM+wMZApxRQR3TfjJ aY0iLRumQsvm6VaeAMt9OycTmt9eZ7gIg+M349cjpi4i/afwYHxpsFecW3qSpSWoonDc FwWmPID2fMxgLvev9ln4VPqrJxpRm6IwIhJ/Nspk/hqYz2FDobLf20CZ36PkodyWWo87 NTVw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:dkim-signature; bh=X9rcUbisKxxmFlt7qW9aNoQqeqD9jDa6T4zDyj5oRUY=; b=WpK3iVjjo1YVRzNMf9ehth9fgMDpL/2fle7X5286TrV+dtE8+YzAIvCT27ygWswcAm Gk4HiV24+OX5oGvDG4CmdRR/zZ/TP4R2e/yE7JZSIDSf1Ul2VY1rQvMuUB56FVpVPROt eNEPkdHZdv0X74SvoIrqwveErb5ltQ7MO7u86HvF28+4trBuTQyVTMI/FfdHKXklFrjG objNd7q7M8ZJU1Wwmkt/YjbDfaPxgdQa54z8R8E3+2n4maqmuUQB0HuX3DQAG7QOZAo+ Zgy5XAOzzQHrJxJfUzeUBCQT6zQfOQ68B9S6FYvKD/v0HqcSZeSo3BteYEfuZhY0KwcO KRRA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="Vf/1v0TF"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 71si7363768otm.111.2020.03.02.08.33.14; Mon, 02 Mar 2020 08:33:26 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="Vf/1v0TF"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727101AbgCBQc7 (ORCPT + 99 others); Mon, 2 Mar 2020 11:32:59 -0500 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:47558 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727072AbgCBQc5 (ORCPT ); Mon, 2 Mar 2020 11:32:57 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1583166775; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=X9rcUbisKxxmFlt7qW9aNoQqeqD9jDa6T4zDyj5oRUY=; b=Vf/1v0TFP383X4KR7W/gVj0w8uIQO1c+No/hYLlVSfB18Q70rVuI/asAIAQDdU86JFXUck YNrKu5HriHdzw67GFUV8B2L4r56D5TB7Bs0JI1si8XuhYt12UZAig8qoB30twevok6YgFw wLEgxbcWAApy77mWCDFGidf9amuxq/M= Received: from mail-wr1-f70.google.com (mail-wr1-f70.google.com [209.85.221.70]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-426-GBPppWQfNWCZUPKurUpT8g-1; Mon, 02 Mar 2020 11:32:53 -0500 X-MC-Unique: GBPppWQfNWCZUPKurUpT8g-1 Received: by mail-wr1-f70.google.com with SMTP id f10so6026894wrv.1 for ; Mon, 02 Mar 2020 08:32:52 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=X9rcUbisKxxmFlt7qW9aNoQqeqD9jDa6T4zDyj5oRUY=; b=AfeVKu2Vz6BywDo7doGhMsJVJta8wQc+hPzdHhLe7MNxakwKz9YImajcErW+GEeLvU qbYDJW2T6ZNg5kC1WShyk6jhuhC50KX7m9kued/aFOtS5STl8wZYJOEsOpPiDbcNeZJW I5eOeFynhKnZmrlMTenA2pIPUAD6vCc9r4HlxP78QzTIZrHpenNepDDUUyq+wyhGWr2P qUYiVb7/8qYXMBzckkCmqlWxR/1I3ERaXbww0ilegOJtMepWaG6QCRW5IiQpkxSRZLcK 3CteIKaEbI3gwe+vCZP7iCBQyCCvS4M162+W8FRVyMtoRs8D0ToBB4jqf2Fowip+e/uL XJRQ== X-Gm-Message-State: ANhLgQ21o5Eb4bAtZn3eRzVEVYp4FvbpZTER0e+eH5kN0VrNca3HYFLc aA+h5l42sdxxNrHU1mLqE4jB4+kfXPgaOUU+QX4ev7ekq6vRURLlrCu+KjPhVPzJL8afWCYszXd ocoIYyIFX0dfdPg9uWFaSaNY8 X-Received: by 2002:a5d:4443:: with SMTP id x3mr360686wrr.379.1583166771753; Mon, 02 Mar 2020 08:32:51 -0800 (PST) X-Received: by 2002:a5d:4443:: with SMTP id x3mr360656wrr.379.1583166771449; Mon, 02 Mar 2020 08:32:51 -0800 (PST) Received: from [192.168.178.40] ([151.30.85.6]) by smtp.gmail.com with ESMTPSA id s15sm973084wrr.45.2020.03.02.08.32.50 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 02 Mar 2020 08:32:50 -0800 (PST) Subject: Re: [PATCH v3] KVM: LAPIC: Recalculate apic map in batch To: Wanpeng Li , linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel References: <1582684862-10880-1-git-send-email-wanpengli@tencent.com> From: Paolo Bonzini Message-ID: <441ae2d3-cece-2c74-900c-22c7fa7ff373@redhat.com> Date: Mon, 2 Mar 2020 17:32:50 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.4.1 MIME-Version: 1.0 In-Reply-To: <1582684862-10880-1-git-send-email-wanpengli@tencent.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 26/02/20 03:41, Wanpeng Li wrote: > From: Wanpeng Li > > In the vCPU reset and set APIC_BASE MSR path, the apic map will be recalculated > several times, each time it will consume 10+ us observed by ftrace in my > non-overcommit environment since the expensive memory allocate/mutex/rcu etc > operations. This patch optimizes it by recaluating apic map in batch, I hope > this can benefit the serverless scenario which can frequently create/destroy > VMs. > > Before patch: > > kvm_lapic_reset ~27us > > After patch: > > kvm_lapic_reset ~14us > > Observed by ftrace, improve ~48%. > > Signed-off-by: Wanpeng Li > --- > v2 -> v3: > * move apic_map_dirty to kvm_arch > * add the suggestions from Paolo > > v1 -> v2: > * add apic_map_dirty to kvm_lapic > * error condition in kvm_apic_set_state, do recalcuate unconditionally > > arch/x86/include/asm/kvm_host.h | 1 + > arch/x86/kvm/lapic.c | 46 ++++++++++++++++++++++++++++++++--------- > arch/x86/kvm/lapic.h | 1 + > arch/x86/kvm/x86.c | 1 + > 4 files changed, 39 insertions(+), 10 deletions(-) > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > index 40a0c0f..4380ed1 100644 > --- a/arch/x86/include/asm/kvm_host.h > +++ b/arch/x86/include/asm/kvm_host.h > @@ -920,6 +920,7 @@ struct kvm_arch { > atomic_t vapics_in_nmi_mode; > struct mutex apic_map_lock; > struct kvm_apic_map *apic_map; > + bool apic_map_dirty; > > bool apic_access_page_done; > unsigned long apicv_inhibit_reasons; > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c > index afcd30d..de832aa 100644 > --- a/arch/x86/kvm/lapic.c > +++ b/arch/x86/kvm/lapic.c > @@ -164,14 +164,28 @@ static void kvm_apic_map_free(struct rcu_head *rcu) > kvfree(map); > } > > -static void recalculate_apic_map(struct kvm *kvm) > +void kvm_recalculate_apic_map(struct kvm *kvm) > { > struct kvm_apic_map *new, *old = NULL; > struct kvm_vcpu *vcpu; > int i; > u32 max_id = 255; /* enough space for any xAPIC ID */ > > + if (!kvm->arch.apic_map_dirty) { > + /* > + * Read kvm->arch.apic_map_dirty before > + * kvm->arch.apic_map > + */ > + smp_rmb(); > + return; > + } > + > mutex_lock(&kvm->arch.apic_map_lock); > + if (!kvm->arch.apic_map_dirty) { > + /* Someone else has updated the map. */ > + mutex_unlock(&kvm->arch.apic_map_lock); > + return; > + } > > kvm_for_each_vcpu(i, vcpu, kvm) > if (kvm_apic_present(vcpu)) > @@ -236,6 +250,12 @@ static void recalculate_apic_map(struct kvm *kvm) > old = rcu_dereference_protected(kvm->arch.apic_map, > lockdep_is_held(&kvm->arch.apic_map_lock)); > rcu_assign_pointer(kvm->arch.apic_map, new); > + /* > + * Write kvm->arch.apic_map before > + * clearing apic->apic_map_dirty > + */ > + smp_wmb(); > + kvm->arch.apic_map_dirty = false; > mutex_unlock(&kvm->arch.apic_map_lock); > > if (old) > @@ -257,20 +277,20 @@ static inline void apic_set_spiv(struct kvm_lapic *apic, u32 val) > else > static_key_slow_inc(&apic_sw_disabled.key); > > - recalculate_apic_map(apic->vcpu->kvm); > + apic->vcpu->kvm->arch.apic_map_dirty = true; > } > } > > static inline void kvm_apic_set_xapic_id(struct kvm_lapic *apic, u8 id) > { > kvm_lapic_set_reg(apic, APIC_ID, id << 24); > - recalculate_apic_map(apic->vcpu->kvm); > + apic->vcpu->kvm->arch.apic_map_dirty = true; > } > > static inline void kvm_apic_set_ldr(struct kvm_lapic *apic, u32 id) > { > kvm_lapic_set_reg(apic, APIC_LDR, id); > - recalculate_apic_map(apic->vcpu->kvm); > + apic->vcpu->kvm->arch.apic_map_dirty = true; > } > > static inline u32 kvm_apic_calc_x2apic_ldr(u32 id) > @@ -286,7 +306,7 @@ static inline void kvm_apic_set_x2apic_id(struct kvm_lapic *apic, u32 id) > > kvm_lapic_set_reg(apic, APIC_ID, id); > kvm_lapic_set_reg(apic, APIC_LDR, ldr); > - recalculate_apic_map(apic->vcpu->kvm); > + apic->vcpu->kvm->arch.apic_map_dirty = true; > } > > static inline int apic_lvt_enabled(struct kvm_lapic *apic, int lvt_type) > @@ -1912,7 +1932,7 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val) > case APIC_DFR: > if (!apic_x2apic_mode(apic)) { > kvm_lapic_set_reg(apic, APIC_DFR, val | 0x0FFFFFFF); > - recalculate_apic_map(apic->vcpu->kvm); > + apic->vcpu->kvm->arch.apic_map_dirty = true; > } else > ret = 1; > break; > @@ -2018,6 +2038,8 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val) > break; > } > > + kvm_recalculate_apic_map(apic->vcpu->kvm); > + > return ret; > } > EXPORT_SYMBOL_GPL(kvm_lapic_reg_write); > @@ -2166,7 +2188,7 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value) > static_key_slow_dec_deferred(&apic_hw_disabled); > } else { > static_key_slow_inc(&apic_hw_disabled.key); > - recalculate_apic_map(vcpu->kvm); > + vcpu->kvm->arch.apic_map_dirty = true; > } > } > > @@ -2207,6 +2229,7 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event) > if (!apic) > return; > > + vcpu->kvm->arch.apic_map_dirty = false; > /* Stop the timer in case it's a reset to an active apic */ > hrtimer_cancel(&apic->lapic_timer.timer); > > @@ -2258,6 +2281,8 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event) > > vcpu->arch.apic_arb_prio = 0; > vcpu->arch.apic_attention = 0; > + > + kvm_recalculate_apic_map(vcpu->kvm); > } > > /* > @@ -2479,17 +2504,18 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s) > struct kvm_lapic *apic = vcpu->arch.apic; > int r; > > - > kvm_lapic_set_base(vcpu, vcpu->arch.apic_base); > /* set SPIV separately to get count of SW disabled APICs right */ > apic_set_spiv(apic, *((u32 *)(s->regs + APIC_SPIV))); > > r = kvm_apic_state_fixup(vcpu, s, true); > - if (r) > + if (r) { > + kvm_recalculate_apic_map(vcpu->kvm); > return r; > + } > memcpy(vcpu->arch.apic->regs, s->regs, sizeof(*s)); > > - recalculate_apic_map(vcpu->kvm); > + kvm_recalculate_apic_map(vcpu->kvm); > kvm_apic_set_version(vcpu); > > apic_update_ppr(apic); > diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h > index ec6fbfe..7581bc2 100644 > --- a/arch/x86/kvm/lapic.h > +++ b/arch/x86/kvm/lapic.h > @@ -78,6 +78,7 @@ void kvm_lapic_set_tpr(struct kvm_vcpu *vcpu, unsigned long cr8); > void kvm_lapic_set_eoi(struct kvm_vcpu *vcpu); > void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value); > u64 kvm_lapic_get_base(struct kvm_vcpu *vcpu); > +void kvm_recalculate_apic_map(struct kvm *kvm); > void kvm_apic_set_version(struct kvm_vcpu *vcpu); > int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val); > int kvm_lapic_reg_read(struct kvm_lapic *apic, u32 offset, int len, > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 79bc995..d3802a2 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -350,6 +350,7 @@ int kvm_set_apic_base(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > } > > kvm_lapic_set_base(vcpu, msr_info->data); > + kvm_recalculate_apic_map(vcpu->kvm); > return 0; > } > EXPORT_SYMBOL_GPL(kvm_set_apic_base); > Queued, thanks. Paolo