Received: by 2002:a05:6358:9144:b0:117:f937:c515 with SMTP id r4csp2661824rwr; Fri, 21 Apr 2023 12:01:44 -0700 (PDT) X-Google-Smtp-Source: AKy350aXAiAopTo+5wpbrcPFiclSVxsDed4mtup5nkz3NTjOkZurOSlYXWujWVt7WcGivBKuX5+e X-Received: by 2002:a17:90a:198f:b0:236:1ec1:6d30 with SMTP id 15-20020a17090a198f00b002361ec16d30mr5002847pji.3.1682103703967; Fri, 21 Apr 2023 12:01:43 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1682103703; cv=none; d=google.com; s=arc-20160816; b=vK2LHXmgRMJ482+MtKfB/XTlZCBf2l1NUtNkykna7QPmKwULPIk6Xkb9eER9069AQZ r8Qz6sS60x7SixKhWqKyOuRaynNiZYznLxtmQNlIDu0baKtrjwcrx2wLbvuRlDSubYEK AF+qAT4GX3UwJrxwO+P9bKObKQvD0jzy65OctayRmiVqBN29IlspdANSPfTpKtkGuSsa HzlbdyoBs7PlIwrcP/r/s3RkIuHdqSl9Kd1AD2/ph1EnVBbJHsUk0LJeTjiEaeNgA0zw TK5c28BabamkCTcXoawTB1Jw8XbKXZA2An293GiiD0vMQ1L8J7mL6ERK4VxjDxajf8RO nkhQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=/IIUhUlmRecN66Z6gik+9JylRPrSHv9qYt9esJpbqSI=; b=hEDDfo9OW1eX41I0LH6QJz27A7X9rkllZOe3/ZAnSyeb/pel/1P/IOUXa4GAaRTaJl 4TrsS2ABMhfLurNZhBE26pzOM6c0YJ80hcIKZcMta3RSFp/i6uQRZkGE1N58K6IN7S6m WgvfdsVptMStIWziCEiNIkQoydtHbhxVgGq7VmgvicFX03yDfVRuCt49LSm44LPiVLpo cK+5eHJmqzkm4ZXYEIkYi1gEx6n2EATVdC/Oekxzyu/Q27lcmwqFT2kvvX4+bmczDTPK Goj4EWEeDpBIjAv8HoHn1IM6L0UuTGOO5sFUMNN/HKFsi0ev3aJBs74dvXrhya9Zd2Pz sS/Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@rivosinc-com.20221208.gappssmtp.com header.s=20221208 header.b="THDL4+u/"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id b12-20020a17090ae38c00b002496b783923si5022620pjz.181.2023.04.21.12.01.31; Fri, 21 Apr 2023 12:01:43 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@rivosinc-com.20221208.gappssmtp.com header.s=20221208 header.b="THDL4+u/"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232646AbjDUSuX (ORCPT + 99 others); Fri, 21 Apr 2023 14:50:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54264 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229582AbjDUSuV (ORCPT ); Fri, 21 Apr 2023 14:50:21 -0400 Received: from mail-pl1-x630.google.com (mail-pl1-x630.google.com [IPv6:2607:f8b0:4864:20::630]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B1A6B188 for ; Fri, 21 Apr 2023 11:50:19 -0700 (PDT) Received: by mail-pl1-x630.google.com with SMTP id d9443c01a7336-1a526aa3dd5so27770785ad.3 for ; Fri, 21 Apr 2023 11:50:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20221208.gappssmtp.com; s=20221208; t=1682103019; x=1684695019; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=/IIUhUlmRecN66Z6gik+9JylRPrSHv9qYt9esJpbqSI=; b=THDL4+u/vOfwLaOnCW6eh/CjgCzPbUBKAHWf1LgoVb5urObrbbDHtE6LnFnJmkcGbJ X8sScn53VOwmiufg6b6vh6lFIqrfOMJmRlqDY7muic2fAlWCCsOBcmJbCjOoIvNUxnTw CCi7AskMw6mhK+++me+sRjwikLpeFgIwFV0nF0mmjg428f8Rammes3zg4WwGMq8g786W u8Qe//FfgFBTmm/8czZ4ckylnzNPDOezoP2aS60VK1m3/oMe/h5ZP8lwmqiqiNZdCveU hRoUCo88MTFudnwN/fMDvgNm+b5CVsDTOcg14bYbIUZFekjOGitMEoCJ5+oTpJJu4z/U HanA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1682103019; x=1684695019; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=/IIUhUlmRecN66Z6gik+9JylRPrSHv9qYt9esJpbqSI=; b=PJMXBty2c47tB+H7ef/QqHbevprOuV9dqAhQttidcQFPAVG9YhUSfJkD/sRTdYGfaZ M1v2Xh6eUnkL0o6aLxwL0Fb4QvpNLDHpi3v99p+DUO8gWh8HijevyzE101zlpnEimla4 QGvZUtBoKtqc6EF0qmTIeIMjBW5ZnhzMq2i2fc4pQQn3nDm0XMIp8Hz0SlJ7J8mFOsT6 tsRL4Ad9O+2wKyeT7Ru0Ei7qaT9+0EykeinN9Whcv7e9k2oE8y23kDJfY6YkMYhvjdcy 0dFfcVJ566uT+xx3gGGWWcl9CiM+WruUNE6gz0Dga/rrhzAJt+nD2KEhDXpRvgj+l7vU Fl3w== X-Gm-Message-State: AAQBX9d8RSNnwm1pF/ne0MbS/uvRP5abHD88W0nFkjKpQ9TJtY9g8Z0T bzZ7T+L1YFQjhNJb9cUw9ecCn/V2IAHbbqSLkeF4dQ== X-Received: by 2002:a17:902:8501:b0:1a1:f95a:24f2 with SMTP id bj1-20020a170902850100b001a1f95a24f2mr5252822plb.19.1682103019136; Fri, 21 Apr 2023 11:50:19 -0700 (PDT) MIME-Version: 1.0 References: <20230419221716.3603068-1-atishp@rivosinc.com> <20230419221716.3603068-11-atishp@rivosinc.com> In-Reply-To: From: Atish Kumar Patra Date: Sat, 22 Apr 2023 00:20:08 +0530 Message-ID: Subject: Re: [RFC 10/48] RISC-V: KVM: Implement static memory region measurement To: Sean Christopherson Cc: linux-kernel@vger.kernel.org, Alexandre Ghiti , Andrew Jones , Andrew Morton , Anup Patel , Atish Patra , =?UTF-8?B?QmrDtnJuIFTDtnBlbA==?= , Suzuki K Poulose , Will Deacon , Marc Zyngier , linux-coco@lists.linux.dev, Dylan Reid , abrestic@rivosinc.com, Samuel Ortiz , Christoph Hellwig , Conor Dooley , Greg Kroah-Hartman , Guo Ren , Heiko Stuebner , Jiri Slaby , kvm-riscv@lists.infradead.org, kvm@vger.kernel.org, linux-mm@kvack.org, linux-riscv@lists.infradead.org, Mayuresh Chitale , Palmer Dabbelt , Paolo Bonzini , Paul Walmsley , Rajnesh Kanwal , Uladzislau Rezki Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Apr 20, 2023 at 8:47=E2=80=AFPM Sean Christopherson wrote: > > On Wed, Apr 19, 2023, Atish Patra wrote: > > +int kvm_riscv_cove_vm_measure_pages(struct kvm *kvm, struct kvm_riscv_= cove_measure_region *mr) > > +{ > > + struct kvm_cove_tvm_context *tvmc =3D kvm->arch.tvmc; > > + int rc =3D 0, idx, num_pages; > > + struct kvm_riscv_cove_mem_region *conf; > > + struct page *pinned_page, *conf_page; > > + struct kvm_riscv_cove_page *cpage; > > + > > + if (!tvmc) > > + return -EFAULT; > > + > > + if (tvmc->finalized_done) { > > + kvm_err("measured_mr pages can not be added after finaliz= e\n"); > > + return -EINVAL; > > + } > > + > > + num_pages =3D bytes_to_pages(mr->size); > > + conf =3D &tvmc->confidential_region; > > + > > + if (!IS_ALIGNED(mr->userspace_addr, PAGE_SIZE) || > > + !IS_ALIGNED(mr->gpa, PAGE_SIZE) || !mr->size || > > + !cove_is_within_region(conf->gpa, conf->npages << PAGE_SHIFT,= mr->gpa, mr->size)) > > + return -EINVAL; > > + > > + idx =3D srcu_read_lock(&kvm->srcu); > > + > > + /*TODO: Iterate one page at a time as pinning multiple pages fail= with unmapped panic > > + * with a virtual address range belonging to vmalloc region for s= ome reason. > > I've no idea what code you had, but I suspect the fact that vmalloc'd mem= ory isn't > guaranteed to be physically contiguous is relevant to the panic. > Ahh. That makes sense. Thanks. > > + */ > > + while (num_pages) { > > + if (signal_pending(current)) { > > + rc =3D -ERESTARTSYS; > > + break; > > + } > > + > > + if (need_resched()) > > + cond_resched(); > > + > > + rc =3D get_user_pages_fast(mr->userspace_addr, 1, 0, &pin= ned_page); > > + if (rc < 0) { > > + kvm_err("Pinning the userpsace addr %lx failed\n"= , mr->userspace_addr); > > + break; > > + } > > + > > + /* Enough pages are not available to be pinned */ > > + if (rc !=3D 1) { > > + rc =3D -ENOMEM; > > + break; > > + } > > + conf_page =3D alloc_page(GFP_KERNEL | __GFP_ZERO); > > + if (!conf_page) { > > + rc =3D -ENOMEM; > > + break; > > + } > > + > > + rc =3D cove_convert_pages(page_to_phys(conf_page), 1, tru= e); > > + if (rc) > > + break; > > + > > + /*TODO: Support other pages sizes */ > > + rc =3D sbi_covh_add_measured_pages(tvmc->tvm_guest_id, pa= ge_to_phys(pinned_page), > > + page_to_phys(conf_page),= SBI_COVE_PAGE_4K, > > + 1, mr->gpa); > > + if (rc) > > + break; > > + > > + /* Unpin the page now */ > > + put_page(pinned_page); > > + > > + cpage =3D kmalloc(sizeof(*cpage), GFP_KERNEL_ACCOUNT); > > + if (!cpage) { > > + rc =3D -ENOMEM; > > + break; > > + } > > + > > + cpage->page =3D conf_page; > > + cpage->npages =3D 1; > > + cpage->gpa =3D mr->gpa; > > + cpage->hva =3D mr->userspace_addr; > > Snapshotting the userspace address for the _source_ page can't possibly b= e useful. > Yeah. Currently, the hva in the kvm_riscv_cove_page is not used anywhere in the code. We can remove it. > > + cpage->is_mapped =3D true; > > + INIT_LIST_HEAD(&cpage->link); > > + list_add(&cpage->link, &tvmc->measured_pages); > > + > > + mr->userspace_addr +=3D PAGE_SIZE; > > + mr->gpa +=3D PAGE_SIZE; > > + num_pages--; > > + conf_page =3D NULL; > > + > > + continue; > > + } > > + srcu_read_unlock(&kvm->srcu, idx); > > + > > + if (rc < 0) { > > + /* We don't to need unpin pages as it is allocated by the= hypervisor itself */ > > This comment makes no sense. The above code is doing all of the allocati= on and > pinning, which strongly suggests that KVM is the hypervisor. But this co= mment > implies that KVM is not the hypervisor. > I mean to say here the conf_page is allocated in the kernel using alloc_page. So no pinning/unpinning is required. It seems the comment is probably misleading & confusing at best. I will remove it. > And "pinned_page" is cleared unpinned in the loop after the page is added= +measured, > which looks to be the same model as TDX where "pinned_page" is the source= and > "conf_page" is gifted to the hypervisor. But on failure, e.g. when alloc= ating > "conf_page", that reference is not put. > Thanks. Will fix it. > > + cove_delete_page_list(kvm, &tvmc->measured_pages, false); > > + /* Free the last allocated page for which conversion/meas= urement failed */ > > + kfree(conf_page); > > Assuming my guesses about how the architecture works are correct, this is= broken > if sbi_covh_add_measured_pages() fails. The page has already been gifted = to the Yeah. The last conf_page should be reclaimed as well if measured_pages fail at any point in the loop. All other allocated ones would be reclaimed as a part of cove_delete_page_l= ist. > TSM by cove_convert_pages(), but there is no call to sbi_covh_tsm_reclaim= _pages(), > which I'm guessing is necesary to transition the page back to a state whe= re it can > be safely used by the host.