Date: Mon, 4 Apr 2022 17:57:28 +0000
From: Sean Christopherson
To: Zeng Guang
Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
    "kvm@vger.kernel.org", Dave Hansen, "Luck, Tony", Kan Liang,
    Thomas Gleixner, Ingo Molnar, Borislav Petkov, "H. Peter Anvin",
    Kim Phillips, Jarkko Sakkinen, Jethro Beekman, "Huang, Kai",
    "x86@kernel.org", "linux-kernel@vger.kernel.org", "Hu, Robert",
    "Gao, Chao"
Subject: Re: [PATCH v7 8/8] KVM: VMX: enable IPI virtualization

On Sun, Apr 03, 2022, Zeng Guang wrote:
> 
> On 4/1/2022 10:37 AM, Sean Christopherson wrote:
> > > @@ -4219,14 +4226,21 @@ static void vmx_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
> > >  	pin_controls_set(vmx, vmx_pin_based_exec_ctrl(vmx));
> > >  	if (cpu_has_secondary_exec_ctrls()) {
> > > -		if (kvm_vcpu_apicv_active(vcpu))
> > > +		if (kvm_vcpu_apicv_active(vcpu)) {
> > >  			secondary_exec_controls_setbit(vmx,
> > >  				      SECONDARY_EXEC_APIC_REGISTER_VIRT |
> > >  				      SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY);
> > > -		else
> > > +			if (enable_ipiv)
> > > +				tertiary_exec_controls_setbit(vmx,
> > > +						TERTIARY_EXEC_IPI_VIRT);
> > > +		} else {
> > >  			secondary_exec_controls_clearbit(vmx,
> > >  					SECONDARY_EXEC_APIC_REGISTER_VIRT |
> > >  					SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY);
> > > +			if (enable_ipiv)
> > > +				tertiary_exec_controls_clearbit(vmx,
> > > +						TERTIARY_EXEC_IPI_VIRT);
> > Oof.  The existing code is kludgy.
> > We should never reach this point without
> > enable_apicv=true, and enable_apicv should be forced off if APICv isn't supported,
> > let alone secondary exec being supported.
> > 
> > Unless I'm missing something, throw a prep patch earlier in the series to drop
> > the cpu_has_secondary_exec_ctrls() check, that will clean this code up a smidge.
> 
> The cpu_has_secondary_exec_ctrls() check can avoid a wrong VMCS write in case
> of a mistaken invocation.

KVM has far bigger problems on buggy invocation, and in that case the resulting
printk + WARN from the failed VMWRITE is a good thing.

> > > +
> > > +	if (!pages)
> > > +		return -ENOMEM;
> > > +
> > > +	kvm_vmx->pid_table = (void *)page_address(pages);
> > > +	kvm_vmx->pid_last_index = kvm_vmx->kvm.arch.max_vcpu_id - 1;
> > No need to cache pid_last_index, it's only used in one place (initializing the
> > VMCS field).  The allocation/free paths can use max_vcpu_id directly.  Actually,
> 
> In the previous design, we didn't forbid changing max_vcpu_id after vCPU
> creation, or for other purposes in the future. Thus it's safe to decouple them
> and make the ipiv usage independent. If we can be sure that max_vcpu_id won't
> be modified, we can totally remove pid_last_index and use max_vcpu_id directly,
> even for initializing the VMCS field.

max_vcpu_id absolutely needs to be constant after the first vCPU is created.

> > > @@ -7123,6 +7176,22 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
> > >  			goto free_vmcs;
> > >  	}
> > > +	/*
> > > +	 * Allocate PID-table and program this vCPU's PID-table
> > > +	 * entry if IPI virtualization can be enabled.
> > Please wrap comments at 80 chars.  But I'd just drop this one entirely, the code
> > is self-explanatory once the allocation and setting of the vCPU's entry are split.
> > 
> > > +	 */
> > > +	if (vmx_can_use_ipiv(vcpu->kvm)) {
> > > +		struct kvm_vmx *kvm_vmx = to_kvm_vmx(vcpu->kvm);
> > > +
> > > +		mutex_lock(&vcpu->kvm->lock);
> > > +		err = vmx_alloc_pid_table(kvm_vmx);
> > > +		mutex_unlock(&vcpu->kvm->lock);
> > This belongs in vmx_vm_init(), doing it in vCPU creation is a remnant of the
> > dynamic resize approach that's no longer needed.
> 
> We cannot allocate the PID table in vmx_vm_init() as userspace has no chance to
> set max_vcpu_ids at this stage. That's the reason we do it in vCPU creation
> instead.

Ah, right.  Hrm.  And that's going to be a recurring problem if we try to use the
dynamic kvm->max_vcpu_ids to reduce other kernel allocations.

Argh, and even kvm_arch_vcpu_precreate() isn't protected by kvm->lock.

Taking kvm->lock isn't problematic per se, I just hate doing it so deep in a
per-vCPU flow like this.

A really gross hack/idea would be to make this 64-bit only and steal the upper
32 bits of @type in kvm_create_vm() for the max ID.

I think my first choice would be to move kvm_arch_vcpu_precreate() under
kvm->lock.  None of the architectures that have a non-nop implementation (s390,
arm64 and x86) do significant work, so holding kvm->lock shouldn't harm
performance.  s390 has to acquire kvm->lock in its implementation, so we could
drop that.  And looking at arm64, I believe its logic should also be done under
kvm->lock.

It'll mean adding yet another kvm_x86_ops, but I like that more than burying the
code deep in vCPU creation.

Paolo, any thoughts on this?
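
For illustration only, the "precreate under the VM lock" idea can be modelled
in user space roughly as below.  This is a sketch, not kernel code and not a
proposed patch: struct model_vm and every model_*() function are hypothetical
names, a pthread mutex stands in for kvm->lock, and calloc() stands in for the
page allocation.  It assumes the max ID is frozen once the first vCPU exists,
and that the generic creation path takes the lock once so the arch hook does
not have to.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for struct kvm; not the kernel's layout. */
struct model_vm {
	pthread_mutex_t lock;		/* plays the role of kvm->lock */
	unsigned int max_vcpu_ids;	/* settable until the first vCPU exists */
	unsigned int created_vcpus;
	uint64_t *pid_table;		/* one 64-bit entry per possible vCPU ID */
};

static void model_vm_init(struct model_vm *vm, unsigned int default_max)
{
	pthread_mutex_init(&vm->lock, NULL);
	vm->max_vcpu_ids = default_max;
	vm->created_vcpus = 0;
	vm->pid_table = NULL;
}

/* Models userspace setting the max ID; refused once any vCPU exists. */
static int model_set_max_vcpu_ids(struct model_vm *vm, unsigned int max)
{
	int ret = 0;

	pthread_mutex_lock(&vm->lock);
	if (vm->created_vcpus || max == 0)
		ret = -EINVAL;
	else
		vm->max_vcpu_ids = max;
	pthread_mutex_unlock(&vm->lock);
	return ret;
}

/* Arch hook.  The caller already holds vm->lock, so no locking here. */
static int model_vcpu_precreate(struct model_vm *vm)
{
	assert(vm->max_vcpu_ids > 0);

	if (vm->pid_table)
		return 0;	/* already allocated for an earlier vCPU */

	vm->pid_table = calloc(vm->max_vcpu_ids, sizeof(*vm->pid_table));
	return vm->pid_table ? 0 : -ENOMEM;
}

/* Generic vCPU creation: take the lock once, then call the arch hook. */
static int model_create_vcpu(struct model_vm *vm, unsigned int id)
{
	int ret;

	pthread_mutex_lock(&vm->lock);
	if (id >= vm->max_vcpu_ids) {
		ret = -EINVAL;
		goto out;
	}
	ret = model_vcpu_precreate(vm);
	if (!ret)
		vm->created_vcpus++;
out:
	pthread_mutex_unlock(&vm->lock);
	return ret;
}

static void model_vm_destroy(struct model_vm *vm)
{
	free(vm->pid_table);
	pthread_mutex_destroy(&vm->lock);
}
```

In this model the table is sized exactly once, from whatever max ID userspace
configured before the first model_create_vcpu() call, and a later attempt to
change the max ID fails with -EINVAL instead of leaving the table mis-sized.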