Date: Thu, 16 Nov 2023 14:35:33 +0800
Subject: Re: [PATCH v17 059/116] KVM: TDX: Create initial guest memory
From: Binbin Wu
To: isaku.yamahata@intel.com
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, isaku.yamahata@gmail.com,
 Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar,
 David Matlack, Kai Huang, Zhi Wang, chen.bo@intel.com, hang.yuan@intel.com,
 tina.zhang@intel.com, gkirkpatrick@google.com

On 11/7/2023 10:56 PM, isaku.yamahata@intel.com wrote:
> From: Isaku Yamahata
>
> Because the guest memory is protected in TDX, the creation of the initial
> guest memory requires a dedicated TDX module API, tdh_mem_page_add, instead
> of directly copying the memory contents into the guest memory as in the case
> of the default VM type. The KVM MMU page fault handler callback,
> private_page_add, handles it.
>
> Define a new subcommand, KVM_TDX_INIT_MEM_REGION, of the VM-scoped
> KVM_MEMORY_ENCRYPT_OP. It assigns the guest page, copies the initial
> memory contents into the guest memory, and encrypts the guest memory. At
> the same time, it optionally extends the memory measurement of the TDX
> guest. It calls the KVM MMU page fault (EPT-violation) handler to trigger
> the callbacks for it.
>
> Reported-by: gkirkpatrick@google.com
> Signed-off-by: Isaku Yamahata
>
> ---
> v15 -> v16:
> - add a check that nr_pages isn't too large, using
>   (nr_pages << PAGE_SHIFT) >> PAGE_SHIFT
>
> v14 -> v15:
> - add a check to tdx_init_mem_region() whether the TD is finalized or not
> - return -EAGAIN on partial population
> ---
>  arch/x86/include/uapi/asm/kvm.h       |   9 ++
>  arch/x86/kvm/mmu/mmu.c                |   1 +
>  arch/x86/kvm/vmx/tdx.c                | 167 +++++++++++++++++++++++++-
>  arch/x86/kvm/vmx/tdx.h                |   2 +
>  tools/arch/x86/include/uapi/asm/kvm.h |   9 ++
>  5 files changed, 185 insertions(+), 3 deletions(-)
>
[...]
>
> +static int tdx_sept_page_add(struct kvm *kvm, gfn_t gfn,
> +			     enum pg_level level, kvm_pfn_t pfn)

The function name is a bit confusing to me. Judging by the name alone, I
would relate it to a SEPT table page rather than a normal private page.
The same applies to tdx_sept_page_aug(), though that one is less confusing
since there is no SEAMCALL to AUG a SEPT table page.

> +{
> +	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
> +	hpa_t hpa = pfn_to_hpa(pfn);
> +	gpa_t gpa = gfn_to_gpa(gfn);
> +	struct tdx_module_args out;
> +	hpa_t source_pa;
> +	bool measure;
> +	u64 err;
> +
> +	/*
> +	 * KVM_INIT_MEM_REGION, tdx_init_mem_region(), supports only 4K page
> +	 * because tdh_mem_page_add() supports only 4K page.
> +	 */
> +	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
> +		return -EINVAL;
> +
> +	/*
> +	 * In case of TDP MMU, fault handler can run concurrently. Note
> +	 * 'source_pa' is a TD scope variable, meaning if there are multiple
> +	 * threads reaching here with all needing to access 'source_pa', it
> +	 * will break. However fortunately this won't happen, because below
> +	 * TDH_MEM_PAGE_ADD code path is only used when VM is being created
> +	 * before it is running, using KVM_TDX_INIT_MEM_REGION ioctl (which
> +	 * always uses vcpu 0's page table and protected by vcpu->mutex).
> +	 */
> +	if (KVM_BUG_ON(kvm_tdx->source_pa == INVALID_PAGE, kvm)) {
> +		tdx_unpin(kvm, pfn);
> +		return -EINVAL;
> +	}
> +
> +	source_pa = kvm_tdx->source_pa & ~KVM_TDX_MEASURE_MEMORY_REGION;
> +	measure = kvm_tdx->source_pa & KVM_TDX_MEASURE_MEMORY_REGION;
> +	kvm_tdx->source_pa = INVALID_PAGE;
> +
> +	do {
> +		err = tdh_mem_page_add(kvm_tdx->tdr_pa, gpa, hpa, source_pa,
> +				       &out);
> +		/*
> +		 * This path is executed during populating initial guest memory
> +		 * image. i.e. before running any vcpu. Race is rare.
> +		 */
> +	} while (unlikely(err == TDX_ERROR_SEPT_BUSY));
> +	if (KVM_BUG_ON(err, kvm)) {
> +		pr_tdx_error(TDH_MEM_PAGE_ADD, err, &out);
> +		tdx_unpin(kvm, pfn);
> +		return -EIO;
> +	} else if (measure)
> +		tdx_measure_page(kvm_tdx, gpa);
> +
> +	return 0;
> +
> +}
> +
[...]