From: Erdem Aktas
Date: Wed, 23 Mar 2022 13:17:45 -0700
Subject: Re: [RFC PATCH v5 073/104] KVM: TDX: track LP tdx vcpu run and teardown vcpus on destroying the guest TD
To: Isaku Yamahata
Cc: "Yamahata, Isaku", "open list:KERNEL VIRTUAL MACHINE (KVM)", open list, Paolo Bonzini, Jim Mattson, Connor Kuehl, Sean Christopherson
In-Reply-To: <20220323190812.GH1964605@ls.amr.corp.intel.com>
References: <6e096d8509ef40ce3e25c1e132643e772641241b.1646422845.git.isaku.yamahata@intel.com>
	<20220323190812.GH1964605@ls.amr.corp.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

So tdh_vp_flush() should always succeed while the VM is being torn down.
Thanks, Isaku, for the explanation; I think it would be great to add the
error message.

-Erdem

On Wed, Mar 23, 2022 at 12:08 PM Isaku Yamahata wrote:
>
> On Tue, Mar 22, 2022 at 05:54:45PM -0700,
> Erdem Aktas wrote:
>
> > On Fri, Mar 4, 2022 at 11:50 AM wrote:
> > > diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> > > index a6b1a8ce888d..690298fb99c7 100644
> > > --- a/arch/x86/kvm/vmx/tdx.c
> > > +++ b/arch/x86/kvm/vmx/tdx.c
> > > @@ -48,6 +48,14 @@ struct tdx_capabilities tdx_caps;
> > >  static DEFINE_MUTEX(tdx_lock);
> > >  static struct mutex *tdx_mng_key_config_lock;
> > >
> > > +/*
> > > + * A per-CPU list of TD vCPUs associated with a given CPU.  Used when a CPU
> > > + * is brought down to invoke TDH_VP_FLUSH on the appropriate TD vCPUs.
> > > + * Protected by interrupt mask.  This list is manipulated in process context
> > > + * of vcpu and IPI callback.  See tdx_flush_vp_on_cpu().
> > > + */
> > > +static DEFINE_PER_CPU(struct list_head, associated_tdvcpus);
> > > +
> > >  static u64 hkid_mask __ro_after_init;
> > >  static u8 hkid_start_pos __ro_after_init;
> > >
> > > @@ -87,6 +95,8 @@ static inline bool is_td_finalized(struct kvm_tdx *kvm_tdx)
> > >
> > >  static inline void tdx_disassociate_vp(struct kvm_vcpu *vcpu)
> > >  {
> > > +	list_del(&to_tdx(vcpu)->cpu_list);
> > > +
> > >  	/*
> > >  	 * Ensure tdx->cpu_list is updated before setting vcpu->cpu to -1;
> > >  	 * otherwise, a different CPU can see vcpu->cpu = -1 and add the vCPU
> > > @@ -97,6 +107,22 @@ static inline void tdx_disassociate_vp(struct kvm_vcpu *vcpu)
> > >  	vcpu->cpu = -1;
> > >  }
> > >
> > > +void tdx_hardware_enable(void)
> > > +{
> > > +	INIT_LIST_HEAD(&per_cpu(associated_tdvcpus, raw_smp_processor_id()));
> > > +}
> > > +
> > > +void tdx_hardware_disable(void)
> > > +{
> > > +	int cpu = raw_smp_processor_id();
> > > +	struct list_head *tdvcpus = &per_cpu(associated_tdvcpus, cpu);
> > > +	struct vcpu_tdx *tdx, *tmp;
> > > +
> > > +	/* Safe variant needed as tdx_disassociate_vp() deletes the entry. */
> > > +	list_for_each_entry_safe(tdx, tmp, tdvcpus, cpu_list)
> > > +		tdx_disassociate_vp(&tdx->vcpu);
> > > +}
> > > +
> > >  static void tdx_clear_page(unsigned long page)
> > >  {
> > >  	const void *zero_page = (const void *) __va(page_to_phys(ZERO_PAGE(0)));
> > > @@ -230,9 +256,11 @@ void tdx_mmu_prezap(struct kvm *kvm)
> > >  	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
> > >  	cpumask_var_t packages;
> > >  	bool cpumask_allocated;
> > > +	struct kvm_vcpu *vcpu;
> > >  	u64 err;
> > >  	int ret;
> > >  	int i;
> > > +	unsigned long j;
> > >
> > >  	if (!is_hkid_assigned(kvm_tdx))
> > >  		return;
> > > @@ -248,6 +276,17 @@ void tdx_mmu_prezap(struct kvm *kvm)
> > >  		return;
> > >  	}
> > >
> > > +	kvm_for_each_vcpu(j, vcpu, kvm)
> > > +		tdx_flush_vp_on_cpu(vcpu);
> > > +
> > > +	mutex_lock(&tdx_lock);
> > > +	err = tdh_mng_vpflushdone(kvm_tdx->tdr.pa);
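
A note on the hunks above: the per-CPU list is described as "protected by
interrupt mask" because it is touched both from a vCPU's process context
and from the IPI callback behind tdx_flush_vp_on_cpu(). As a minimal
sketch of what the association side could look like under that scheme
(tdx_associate_vp() is a hypothetical name; the real helper is not shown
in these hunks):

	static void tdx_associate_vp(struct kvm_vcpu *vcpu, int cpu)
	{
		unsigned long flags;

		/*
		 * Masking local interrupts keeps the IPI callback that
		 * walks associated_tdvcpus from interleaving with this
		 * update; the list is per-CPU, so no lock is needed.
		 */
		local_irq_save(flags);
		list_add(&to_tdx(vcpu)->cpu_list,
			 &per_cpu(associated_tdvcpus, cpu));
		local_irq_restore(flags);
	}
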
> >
> > Hi Isaku,
>
> Hi.
>
> > I am wondering about the impact of failures in these functions. Is
> > there any other function which recovers from failures here?
> > When I look at the tdx_flush_vp function, it seems like it can fail
> > due to task migration, so tdx_flush_vp_on_cpu might also fail, and if
> > it fails, tdh_mng_vpflushdone returns an error. Since tdx_vm_teardown
> > does not return any error, how can the VMM free the keyid used in
> > this TD? Will it be forever in the "used" state?
> > Also, if tdx_vm_teardown fails, kvm_tdx->hkid is never set to -1,
> > which will prevent tdx_vcpu_free from freeing and reclaiming the
> > resources allocated for the vcpu.
>
> mmu_prezap() is called via the release callback of the mmu notifier,
> when the last mmu reference of this process is dropped. That is after
> all kvm vcpu fds and the kvm vm fd were closed, so the vcpu will never
> run, but we still hold the kvm_vcpu structures. There is no race between
> tdh_vp_flush()/tdh_mng_vpflushdone() here and process migration;
> tdh_vp_flush()/tdh_mng_vpflushdone() should succeed.
>
> The cpuid check in tdx_flush_vp() is for vcpu_load(), which may race
> with process migration.
>
> Anyway, what if one of those TDX seamcalls fails? The HKID is leaked
> and will never be used again, because there is no good way to free and
> reuse an HKID safely. Such a failure is due to an unknown issue and is
> probably a bug.
>
> One mitigation is to add pr_err() when an HKID leak happens. I'll add
> such a message on the next respin.
>
> thanks,
> --
> Isaku Yamahata
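
As a rough illustration of the pr_err() mitigation mentioned above (a
sketch only, assuming tdx_lock is dropped right after the seamcall; the
actual change in the next respin may differ), tdx_mmu_prezap() could
report the leak instead of returning silently:

	mutex_lock(&tdx_lock);
	err = tdh_mng_vpflushdone(kvm_tdx->tdr.pa);
	mutex_unlock(&tdx_lock);
	if (WARN_ON_ONCE(err)) {
		/* The HKID cannot be freed safely at this point. */
		pr_err("tdh_mng_vpflushdone() failed, leaking HKID %d\n",
		       kvm_tdx->hkid);
		return;
	}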