Message-ID: <60879468-c54f-e7f1-2123-ba4cf4128ac3@intel.com>
Date: Sun, 3 Apr 2022 18:17:37 +0800
Subject: Re: [PATCH v7 7/8] KVM: x86: Allow userspace set maximum VCPU id for VM
To: Sean Christopherson
Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
 "kvm@vger.kernel.org", Dave Hansen, "Luck, Tony", Kan Liang, Thomas Gleixner,
 Ingo Molnar, Borislav Petkov, "H. Peter Anvin", Kim Phillips, Jarkko Sakkinen,
 Jethro Beekman, "Huang, Kai", "x86@kernel.org", "linux-kernel@vger.kernel.org",
 "Hu, Robert", "Gao, Chao"
References: <20220304080725.18135-1-guang.zeng@intel.com>
 <20220304080725.18135-8-guang.zeng@intel.com>
From: Zeng Guang
X-Mailing-List: linux-kernel@vger.kernel.org

On 4/1/2022 10:01 AM, Sean Christopherson wrote:
> On Fri, Mar 04, 2022, Zeng Guang wrote:
>> Introduce new max_vcpu_id in KVM for x86 architecture. Userspace
>> can assign maximum possible vcpu id for current VM session using
>> KVM_CAP_MAX_VCPU_ID of KVM_ENABLE_CAP ioctl().
>>
>> This is done for x86 only because the sole use case is to guide
>> memory allocation for PID-pointer table, a structure needed to
>> enable VMX IPI.
>>
>> By default, max_vcpu_id set as KVM_MAX_VCPU_IDS.
>>
>> Suggested-by: Sean Christopherson
>> Reviewed-by: Maxim Levitsky
>> Signed-off-by: Zeng Guang
>> ---
>>   arch/x86/include/asm/kvm_host.h |  6 ++++++
>>   arch/x86/kvm/x86.c              | 11 +++++++++++
> The new behavior needs to be documented in api.rst.

OK. I will prepare the documentation for it.

>>   2 files changed, 17 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 6dcccb304775..db16aebd946c 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -1233,6 +1233,12 @@ struct kvm_arch {
>>   	hpa_t	hv_root_tdp;
>>   	spinlock_t hv_root_tdp_lock;
>>   #endif
>> +	/*
>> +	 * VM-scope maximum vCPU ID.
>> +	 * Used to determine the size of structures
>> +	 * that increase along with the maximum vCPU ID, in which case, using
>> +	 * the global KVM_MAX_VCPU_IDS may lead to significant memory waste.
>> +	 */
>> +	u32 max_vcpu_id;
> This should be max_vcpu_ids.  I agree that it _should_ be max_vcpu_id, but KVM's API
> for this is awful and we're stuck with the plural name.
>
>>   };
>>
>>   struct kvm_vm_stat {
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 4f6fe9974cb5..ca17cc452bd3 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -5994,6 +5994,13 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>>   		kvm->arch.exit_on_emulation_error = cap->args[0];
>>   		r = 0;
>>   		break;
>> +	case KVM_CAP_MAX_VCPU_ID:
> I think it makes sense to change kvm_vm_ioctl_check_extension() to return the
> current max, it is a VM-scoped ioctl after all.

kvm_vm_ioctl_check_extension() can return kvm->arch.max_vcpu_ids, as it
reflects the runtime capability supported on the current VM. I will
change it.

> Amusingly, I think we also need a capability to enumerate that KVM_CAP_MAX_VCPU_ID
> is writable.

IIUC, KVM_CAP_* capabilities are intrinsically writable. KVM will return
-EINVAL if the capability is not implemented.

>> +		if (cap->args[0] <= KVM_MAX_VCPU_IDS) {
>> +			kvm->arch.max_vcpu_id = cap->args[0];
> This needs to be rejected if kvm->created_vcpus > 0, and that check needs to be
> done under kvm_lock, otherwise userspace can bump the max ID after KVM allocates
> per-VM structures and trigger buffer overflow.

Is it necessary to use kvm_lock? There seems to be no use case for
calling it from multiple threads.

>> +			r = 0;
>> +		} else
> If-elif-else statements need curly braces for all paths if any path needs braces.
> Probably a moot point for this patch due to the above changes.
>
>> +			r = -E2BIG;
> This should be -EINVAL, not -E2BIG.
>
> E.g.
>
> 	case KVM_CAP_MAX_VCPU_ID:
> 		r = -EINVAL;
> 		if (cap->args[0] > KVM_MAX_VCPU_IDS)
> 			break;
>
> 		mutex_lock(&kvm->lock);
> 		if (!kvm->created_vcpus) {
> 			kvm->arch.max_vcpu_id = cap->args[0];
> 			r = 0;
> 		}
> 		mutex_unlock(&kvm->lock);
> 		break;
>
>
>> +		break;
>>   	default:
>>   		r = -EINVAL;
>>   		break;
>> @@ -11067,6 +11074,9 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>>   	struct page *page;
>>   	int r;
>>
>> +	if (vcpu->vcpu_id >= vcpu->kvm->arch.max_vcpu_id)
>> +		return -E2BIG;
> Same here, it should be -EINVAL.
>
>> +
>>   	vcpu->arch.last_vmentry_cpu = -1;
>>   	vcpu->arch.regs_avail = ~0;
>>   	vcpu->arch.regs_dirty = ~0;
>> @@ -11589,6 +11599,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>>   	spin_lock_init(&kvm->arch.hv_root_tdp_lock);
>>   	kvm->arch.hv_root_tdp = INVALID_PAGE;
>>   #endif
>> +	kvm->arch.max_vcpu_id = KVM_MAX_VCPU_IDS;
>>
>>   	INIT_DELAYED_WORK(&kvm->arch.kvmclock_update_work, kvmclock_update_fn);
>>   	INIT_DELAYED_WORK(&kvm->arch.kvmclock_sync_work, kvmclock_sync_fn);
>> --
>> 2.27.0
>>
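
For reference, the ordering constraint raised in the review (reject a new
maximum once any vCPU exists, and do the check and the store under the same
lock) can be modelled in plain userspace C. This is only a sketch under
stated assumptions, not kernel code: struct model_vm, model_set_max_vcpu_ids(),
model_create_vcpu() and the MODEL_MAX_VCPU_IDS value are hypothetical
stand-ins, a pthread mutex takes the place of kvm->lock, and the real vCPU
creation path is more involved than the single function shown here.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Placeholder limit for this model only; the value is not taken from the patch. */
#define MODEL_MAX_VCPU_IDS 4096u

/* Hypothetical stand-in for the few per-VM fields discussed in the thread. */
struct model_vm {
	pthread_mutex_t lock;		/* plays the role of kvm->lock */
	unsigned int created_vcpus;	/* number of vCPUs created so far */
	uint32_t max_vcpu_ids;		/* per-VM limit on vCPU IDs */
};

#define MODEL_VM_INIT {					\
	.lock = PTHREAD_MUTEX_INITIALIZER,		\
	.created_vcpus = 0,				\
	.max_vcpu_ids = MODEL_MAX_VCPU_IDS,		\
}

/*
 * Models the suggested capability handler: out-of-range values and any
 * attempt after the first vCPU exists both fail with -EINVAL.
 */
int model_set_max_vcpu_ids(struct model_vm *vm, uint64_t val)
{
	int r = -EINVAL;

	assert(vm);
	if (val > MODEL_MAX_VCPU_IDS)
		return r;

	/* Check and store under one lock so vCPU creation cannot interleave. */
	pthread_mutex_lock(&vm->lock);
	if (!vm->created_vcpus) {
		vm->max_vcpu_ids = (uint32_t)val;
		r = 0;
	}
	pthread_mutex_unlock(&vm->lock);
	return r;
}

/* Simplified vCPU creation: IDs at or above the per-VM limit are rejected. */
int model_create_vcpu(struct model_vm *vm, uint32_t id)
{
	int r = -EINVAL;

	assert(vm);
	pthread_mutex_lock(&vm->lock);
	if (id < vm->max_vcpu_ids) {
		vm->created_vcpus++;
		r = 0;
	}
	pthread_mutex_unlock(&vm->lock);
	return r;
}
```

In this model, once model_create_vcpu() has succeeded, any later
model_set_max_vcpu_ids() call fails, so structures sized from the limit at
creation time cannot be outgrown afterwards. Without the shared lock, a
concurrent caller could raise the limit between the created_vcpus check and
the store.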