Date: Fri, 19 Apr 2024 11:20:21 +0100
Subject: Re: [PATCH v2 17/43] arm64: RME: Allow VMM to set RIPAS
From: Suzuki K Poulose
To: Steven Price, kvm@vger.kernel.org, kvmarm@lists.linux.dev
Cc: Catalin Marinas, Marc Zyngier, Will Deacon, James Morse, Oliver Upton,
    Zenghui Yu, linux-arm-kernel@lists.infradead.org,
    linux-kernel@vger.kernel.org, Joey Gouly, Alexandru Elisei,
    Christoffer Dall, Fuad Tabba,
    linux-coco@lists.linux.dev, Ganapatrao Kulkarni
References: <20240412084056.1733704-1-steven.price@arm.com>
    <20240412084309.1733783-1-steven.price@arm.com>
    <20240412084309.1733783-18-steven.price@arm.com>

On 19/04/2024 10:34, Suzuki K Poulose wrote:
> On 12/04/2024 09:42, Steven Price wrote:
>> Each page within the protected region of the realm guest can be marked
>> as either RAM or EMPTY. Allow the VMM to control this before the guest
>> has started and provide the equivalent functions to change this (with
>> the guest's approval) at runtime.
>>
>> When transitioning from RIPAS RAM (1) to RIPAS EMPTY (0) the memory is
>> unmapped from the guest and undelegated allowing the memory to be reused
>> by the host. When transitioning to RIPAS RAM the actual population of
>> the leaf RTTs is done later on stage 2 fault, however it may be
>> necessary to allocate additional RTTs to represent the range requested.
>
> minor nit: To give a bit more context:
>
> "however it may be necessary to allocate additional RTTs in order for
> the RMM to track the RIPAS for the requested range".
>
>>
>> When freeing a block mapping it is necessary to temporarily unfold the
>> RTT which requires delegating an extra page to the RMM, this page can
>> then be recovered once the contents of the block mapping have been
>> freed. A spare, delegated page (spare_page) is used for this purpose.
>>
>> Signed-off-by: Steven Price
>> ---
>>   arch/arm64/include/asm/kvm_rme.h |  16 ++
>>   arch/arm64/kvm/mmu.c             |   8 +-
>>   arch/arm64/kvm/rme.c             | 390 +++++++++++++++++++++++++++++++
>>   3 files changed, 411 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
>> index 915e76068b00..cc8f81cfc3c0 100644
>> --- a/arch/arm64/include/asm/kvm_rme.h
>> +++ b/arch/arm64/include/asm/kvm_rme.h
>> @@ -96,6 +96,14 @@ void kvm_realm_destroy_rtts(struct kvm *kvm, u32 ia_bits);
>>   int kvm_create_rec(struct kvm_vcpu *vcpu);
>>   void kvm_destroy_rec(struct kvm_vcpu *vcpu);
>> +void kvm_realm_unmap_range(struct kvm *kvm,
>> +               unsigned long ipa,
>> +               u64 size,
>> +               bool unmap_private);
>> +int realm_set_ipa_state(struct kvm_vcpu *vcpu,
>> +            unsigned long addr, unsigned long end,
>> +            unsigned long ripas);
>> +
>>   #define RME_RTT_BLOCK_LEVEL    2
>>   #define RME_RTT_MAX_LEVEL    3
>> @@ -114,4 +122,12 @@ static inline unsigned long rme_rtt_level_mapsize(int level)
>>       return (1UL << RME_RTT_LEVEL_SHIFT(level));
>>   }
>> +static inline bool realm_is_addr_protected(struct realm *realm,
>> +                       unsigned long addr)
>> +{
>> +    unsigned int ia_bits = realm->ia_bits;
>> +
>> +    return !(addr & ~(BIT(ia_bits - 1) - 1));
>> +}
>> +
>>   #endif
>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>> index 46f0c4e80ace..8a7b5449697f 100644
>> --- a/arch/arm64/kvm/mmu.c
>> +++ b/arch/arm64/kvm/mmu.c
>> @@ -310,6 +310,7 @@ static void invalidate_icache_guest_page(void *va, size_t size)
>>    * @start: The intermediate physical base address of the range to unmap
>>    * @size:  The size of the area to unmap
>>    * @may_block: Whether or not we are permitted to block
>> + * @only_shared: If true then protected mappings should not be unmapped
>>    *
>>    * Clear a range of stage-2 mappings, lowering the various ref-counts.  Must
>>    * be called while holding mmu_lock (unless for freeing the stage2 pgd before
>> @@ -317,7 +318,7 @@ static void invalidate_icache_guest_page(void *va, size_t size)
>>    * with things behind our backs.
>>    */
>>   static void __unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size,
>> -                 bool may_block)
>> +                 bool may_block, bool only_shared)
>>   {
>>       struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
>>       phys_addr_t end = start + size;
>> @@ -330,7 +331,7 @@ static void __unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64
>>   static void unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size)
>>   {
>> -    __unmap_stage2_range(mmu, start, size, true);
>> +    __unmap_stage2_range(mmu, start, size, true, false);
>>   }
>>   static void stage2_flush_memslot(struct kvm *kvm,
>> @@ -1771,7 +1772,8 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
>>       __unmap_stage2_range(&kvm->arch.mmu, range->start << PAGE_SHIFT,
>>                    (range->end - range->start) << PAGE_SHIFT,
>> -                 range->may_block);
>> +                 range->may_block,
>> +                 range->only_shared);
>>       return false;
>>   }
>> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
>> index 629a095bea61..9e5983c51393 100644
>> --- a/arch/arm64/kvm/rme.c
>> +++ b/arch/arm64/kvm/rme.c
>> @@ -79,6 +79,12 @@ static phys_addr_t __alloc_delegated_page(struct realm *realm,
>>       return phys;
>>   }
>> +static phys_addr_t alloc_delegated_page(struct realm *realm,
>> +                    struct kvm_mmu_memory_cache *mc)
>> +{
>> +    return __alloc_delegated_page(realm, mc, GFP_KERNEL);
>> +}
>> +
>>   static void free_delegated_page(struct realm *realm, phys_addr_t phys)
>>   {
>>       if (realm->spare_page == PHYS_ADDR_MAX) {
>> @@ -94,6 +100,151 @@ static void free_delegated_page(struct realm *realm, phys_addr_t phys)
>>       free_page((unsigned long)phys_to_virt(phys));
>>   }
>> +static int realm_rtt_create(struct realm *realm,
>> +                unsigned long addr,
>> +                int level,
>> +                phys_addr_t phys)
>> +{
>> +    addr = ALIGN_DOWN(addr, rme_rtt_level_mapsize(level - 1));
>> +    return rmi_rtt_create(virt_to_phys(realm->rd), phys, addr, level);
>> +}
>> +
>> +static int realm_rtt_fold(struct realm *realm,
>> +              unsigned long addr,
>> +              int level,
>> +              phys_addr_t *rtt_granule)
>> +{
>> +    unsigned long out_rtt;
>> +    int ret;
>> +
>> +    ret = rmi_rtt_fold(virt_to_phys(realm->rd), addr, level, &out_rtt);
>> +
>> +    if (RMI_RETURN_STATUS(ret) == RMI_SUCCESS && rtt_granule)
>> +        *rtt_granule = out_rtt;
>> +
>> +    return ret;
>> +}
>> +
>> +static int realm_destroy_protected(struct realm *realm,
>> +                   unsigned long ipa,
>> +                   unsigned long *next_addr)
>> +{
>> +    unsigned long rd = virt_to_phys(realm->rd);
>> +    unsigned long addr;
>> +    phys_addr_t rtt;
>> +    int ret;
>> +
>> +loop:
>> +    ret = rmi_data_destroy(rd, ipa, &addr, next_addr);
>> +    if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
>> +        if (*next_addr > ipa)
>> +            return 0; /* UNASSIGNED */
>> +        rtt = alloc_delegated_page(realm, NULL);
>> +        if (WARN_ON(rtt == PHYS_ADDR_MAX))
>> +            return -1;
>> +        /* ASSIGNED - ipa is mapped as a block, so split */
>> +        ret = realm_rtt_create(realm, ipa,
>> +                       RMI_RETURN_INDEX(ret) + 1, rtt);
>
> Could we not go all the way to L3 (rather than 1 level deeper) and try
> again? That way, we are covered for block mappings at L1 (1G).
>
>> +        if (WARN_ON(ret)) {
>> +            free_delegated_page(realm, rtt);
>> +            return -1;
>> +        }
>> +        /* retry */
>> +        goto loop;
>> +    } else if (WARN_ON(ret)) {
>> +        return -1;
>> +    }
>> +    ret = rmi_granule_undelegate(addr);
>> +
>> +    /*
>> +     * If the undelegate fails then something has gone seriously
>> +     * wrong: take an extra reference to just leak the page
>> +     */
>> +    if (WARN_ON(ret))
>> +        get_page(phys_to_page(addr));
>> +
>> +    return 0;
>> +}
>> +
>> +static void realm_unmap_range_shared(struct kvm *kvm,
>> +                     int level,
>> +                     unsigned long start,
>> +                     unsigned long end)
>> +{
>> +    struct realm *realm = &kvm->arch.realm;
>> +    unsigned long rd = virt_to_phys(realm->rd);
>> +    ssize_t map_size = rme_rtt_level_mapsize(level);
>> +    unsigned long next_addr, addr;
>> +    unsigned long shared_bit = BIT(realm->ia_bits - 1);
>> +
>> +    if (WARN_ON(level > RME_RTT_MAX_LEVEL))
>> +        return;
>> +
>> +    start |= shared_bit;
>> +    end |= shared_bit;
>> +
>> +    for (addr = start; addr < end; addr = next_addr) {
>> +        unsigned long align_addr = ALIGN(addr, map_size);
>> +        int ret;
>> +
>> +        next_addr = ALIGN(addr + 1, map_size);
>> +
>> +        if (align_addr != addr || next_addr > end) {
>> +            /* Need to recurse deeper */
>> +            if (addr < align_addr)
>> +                next_addr = align_addr;
>> +            realm_unmap_range_shared(kvm, level + 1, addr,
>> +                         min(next_addr, end));
>> +            continue;
>> +        }
>> +
>> +        ret = rmi_rtt_unmap_unprotected(rd, addr, level, &next_addr);
>
> minor nit: We could potentially use rmi_rtt_destroy() to tear down
> shared mappings without unmapping them individually, if the range
> is big enough. All such optimisations could come later though.
>
>> +        switch (RMI_RETURN_STATUS(ret)) {
>> +        case RMI_SUCCESS:
>> +            break;
>> +        case RMI_ERROR_RTT:
>> +            if (next_addr == addr) {
>
> At this point we have a block aligned address, but the mapping is
> further deep. Given that we start from the top and work downwards, we
> implicitly handle the case of block mappings. Not sure if that needs
> to be in a comment here.
>
>> +                next_addr = ALIGN(addr + 1, map_size);
>
> Reset to the "actual next" as it was overwritten by the RMI call.
>
>> +                realm_unmap_range_shared(kvm, level + 1, addr,
>> +                             next_addr);
>> +            }
>> +            break;
>> +        default:
>> +            WARN_ON(1);
>> +        }
>> +    }
>> +}
>> +
>> +static void realm_unmap_range_private(struct kvm *kvm,
>> +                      unsigned long start,
>> +                      unsigned long end)
>> +{
>> +    struct realm *realm = &kvm->arch.realm;
>> +    ssize_t map_size = RME_PAGE_SIZE;
>> +    unsigned long next_addr, addr;
>> +
>> +    for (addr = start; addr < end; addr = next_addr) {
>> +        int ret;
>> +
>> +        next_addr = ALIGN(addr + 1, map_size);
>> +
>> +        ret = realm_destroy_protected(realm, addr, &next_addr);
>> +
>> +        if (WARN_ON(ret))
>> +            break;
>> +    }
>> +}
>> +
>> +static void realm_unmap_range(struct kvm *kvm,
>> +                  unsigned long start,
>> +                  unsigned long end,
>> +                  bool unmap_private)
>> +{
>> +    realm_unmap_range_shared(kvm, RME_RTT_MAX_LEVEL - 1, start, end);
>
> minor nit: We already have a helper to find a suitable start level
> (defined below), maybe we could use that? And even do the rtt_destroy
> optimisation for the unprotected range.
>
>> +    if (unmap_private)
>> +        realm_unmap_range_private(kvm, start, end);
>> +}
>> +
>>   u32 kvm_realm_ipa_limit(void)
>>   {
>>       return u64_get_bits(rmm_feat_reg0, RMI_FEATURE_REGISTER_0_S2SZ);
>> @@ -190,6 +341,30 @@ static int realm_rtt_destroy(struct realm *realm, unsigned long addr,
>>       return ret;
>>   }
>> +static int realm_create_rtt_levels(struct realm *realm,
>> +                   unsigned long ipa,
>> +                   int level,
>> +                   int max_level,
>> +                   struct kvm_mmu_memory_cache *mc)
>> +{
>> +    if (WARN_ON(level == max_level))
>> +        return 0;
>> +
>> +    while (level++ < max_level) {
>> +        phys_addr_t rtt = alloc_delegated_page(realm, mc);
>> +
>> +        if (rtt == PHYS_ADDR_MAX)
>> +            return -ENOMEM;
>> +
>> +        if (realm_rtt_create(realm, ipa, level, rtt)) {
>> +            free_delegated_page(realm, rtt);
>> +            return -ENXIO;
>> +        }
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>>   static int realm_tear_down_rtt_level(struct realm *realm, int level,
>>                        unsigned long start, unsigned long end)
>>   {
>> @@ -265,6 +440,68 @@ static int realm_tear_down_rtt_range(struct realm *realm,
>>                        start, end);
>>   }
>> +/*
>> + * Returns 0 on successful fold, a negative value on error, a positive value if
>> + * we were not able to fold all tables at this level.
>> + */
>> +static int realm_fold_rtt_level(struct realm *realm, int level,
>> +                unsigned long start, unsigned long end)
>> +{
>> +    int not_folded = 0;
>> +    ssize_t map_size;
>> +    unsigned long addr, next_addr;
>> +
>> +    if (WARN_ON(level > RME_RTT_MAX_LEVEL))
>> +        return -EINVAL;
>> +
>> +    map_size = rme_rtt_level_mapsize(level - 1);
>> +
>> +    for (addr = start; addr < end; addr = next_addr) {
>> +        phys_addr_t rtt_granule;
>> +        int ret;
>> +        unsigned long align_addr = ALIGN(addr, map_size);
>> +
>> +        next_addr = ALIGN(addr + 1, map_size);
>> +
>> +        ret = realm_rtt_fold(realm, align_addr, level, &rtt_granule);
>> +
>> +        switch (RMI_RETURN_STATUS(ret)) {
>> +        case RMI_SUCCESS:
>> +            if (!WARN_ON(rmi_granule_undelegate(rtt_granule)))
>> +                free_page((unsigned long)phys_to_virt(rtt_granule));
>
> minor nit: Do we need a wrapper function for things like this, and
> leaking the page if undelegate fails, something like
> rme_reclaim_delegated_page()?
>
>
>> +            break;
>> +        case RMI_ERROR_RTT:
>> +            if (level == RME_RTT_MAX_LEVEL ||
>> +                RMI_RETURN_INDEX(ret) < level) {
>> +                not_folded++;
>> +                break;
>> +            }
>> +            /* Recurse a level deeper */
>> +            ret = realm_fold_rtt_level(realm,
>> +                           level + 1,
>> +                           addr,
>> +                           next_addr);
>> +            if (ret < 0)
>> +                return ret;
>> +            else if (ret == 0)
>> +                /* Try again at this level */
>> +                next_addr = addr;
>> +            break;
>> +        default:
>> +            return -ENXIO;
>> +        }
>> +    }
>> +
>> +    return not_folded;
>> +}
>> +
>> +static int realm_fold_rtt_range(struct realm *realm,
>> +                unsigned long start, unsigned long end)
>> +{
>> +    return realm_fold_rtt_level(realm, get_start_level(realm) + 1,
>> +                    start, end);
>> +}
>> +
>>   static void ensure_spare_page(struct realm *realm)
>>   {
>>       phys_addr_t tmp_rtt;
>> @@ -295,6 +532,147 @@ void kvm_realm_destroy_rtts(struct kvm *kvm, u32 ia_bits)
>>       WARN_ON(realm_tear_down_rtt_range(realm, 0, (1UL << ia_bits)));
>>   }
>> +void kvm_realm_unmap_range(struct kvm *kvm, unsigned long ipa, u64 size,
>> +               bool unmap_private)
>> +{
>> +    unsigned long end = ipa + size;
>> +    struct realm *realm = &kvm->arch.realm;
>> +
>> +    end = min(BIT(realm->ia_bits - 1), end);
>> +
>> +    ensure_spare_page(realm);
>> +
>> +    realm_unmap_range(kvm, ipa, end, unmap_private);
>> +
>> +    realm_fold_rtt_range(realm, ipa, end);
>
> Shouldn't this be:
>
>     if (unmap_private)
>         realm_fold_rtt_range(realm, ipa, end);
>
> Also it is fine to reclaim RTTs from the protected space, not the
> unprotected half, as long as we use RTT_DESTROY in the unmap_shared
> routine.

Thinking about this a bit more, we could:

1. Rename this to realm_reclaim_rtts_range()

2. Use "FOLD" vs "DESTROY" depending on the state of the Realm. If the
   realm is DYING (or we add a state in kvm_pgtable_stage2_destroy() to
   indicate that the stage2 can now be "destroyed"), use DESTROY wherever
   it is safe to do so.

Suzuki
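
P.S. Completely untested, but something along the lines of the sketch
below is what I had in mind for (2). realm_is_dying() is only a
placeholder for however we end up flagging that the stage2 can be torn
down; realm_tear_down_rtt_range() and realm_fold_rtt_level() are the
helpers already in this patch:

static int realm_reclaim_rtts_range(struct realm *realm,
				    unsigned long start, unsigned long end)
{
	/*
	 * Placeholder check: once we know the stage2 can be destroyed,
	 * DESTROY the RTTs outright instead of folding them one by one.
	 */
	if (realm_is_dying(realm))
		return realm_tear_down_rtt_range(realm, start, end);

	/* Live realm: FOLD, reclaiming only fully unmapped tables. */
	return realm_fold_rtt_level(realm, get_start_level(realm) + 1,
				    start, end);
}

Callers then wouldn't need to care which of the two mechanisms is used.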