2022-02-25 04:10:03

by Kalesh Singh

[permalink] [raw]
Subject: [PATCH v4 2/8] KVM: arm64: Introduce pkvm_alloc_private_va_range()

pkvm_hyp_alloc_private_va_range() can be used to reserve private VA ranges
in the pKVM nVHE hypervisor (). Also update __pkvm_create_private_mapping()
to allow specifying an alignment for the private VA mapping.

These will be used to implement stack guard pages for pKVM nVHE hypervisor
(in a subsequent patch in the series).

Credits to Quentin Perret <[email protected]> for the idea of moving
private VA allocation out of __pkvm_create_private_mapping()

Signed-off-by: Kalesh Singh <[email protected]>
---

Changes in v4:
- Handle null ptr in pkvm_alloc_private_va_range() and replace
IS_ERR_OR_NULL checks in callers with IS_ERR checks, per Fuad
- Fix kernel-doc comments format, per Fuad
- Format __pkvm_create_private_mapping() prototype args (< 80 col), per Fuad

Changes in v3:
- Handle null ptr in IS_ERR_OR_NULL checks, per Mark

Changes in v2:
- Allow specifying an alignment for the private VA allocations, per Marc

arch/arm64/kvm/hyp/include/nvhe/mm.h | 3 +-
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 5 ++-
arch/arm64/kvm/hyp/nvhe/mm.c | 60 +++++++++++++++++-----------
arch/arm64/kvm/mmu.c | 2 +-
4 files changed, 43 insertions(+), 27 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/nvhe/mm.h b/arch/arm64/kvm/hyp/include/nvhe/mm.h
index 2d08510c6cc1..76d6fcf1d4f4 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/mm.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/mm.h
@@ -20,7 +20,8 @@ int pkvm_cpu_set_vector(enum arm64_hyp_spectre_vector slot);
int pkvm_create_mappings(void *from, void *to, enum kvm_pgtable_prot prot);
int pkvm_create_mappings_locked(void *from, void *to, enum kvm_pgtable_prot prot);
unsigned long __pkvm_create_private_mapping(phys_addr_t phys, size_t size,
- enum kvm_pgtable_prot prot);
+ size_t align, enum kvm_pgtable_prot prot);
+unsigned long pkvm_alloc_private_va_range(size_t size, size_t align);

static inline void hyp_vmemmap_range(phys_addr_t phys, unsigned long size,
unsigned long *start, unsigned long *end)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 5e2197db0d32..96b2312a0f1d 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -158,9 +158,10 @@ static void handle___pkvm_create_private_mapping(struct kvm_cpu_context *host_ct
{
DECLARE_REG(phys_addr_t, phys, host_ctxt, 1);
DECLARE_REG(size_t, size, host_ctxt, 2);
- DECLARE_REG(enum kvm_pgtable_prot, prot, host_ctxt, 3);
+ DECLARE_REG(size_t, align, host_ctxt, 3);
+ DECLARE_REG(enum kvm_pgtable_prot, prot, host_ctxt, 4);

- cpu_reg(host_ctxt, 1) = __pkvm_create_private_mapping(phys, size, prot);
+ cpu_reg(host_ctxt, 1) = __pkvm_create_private_mapping(phys, size, align, prot);
}

static void handle___pkvm_prot_finalize(struct kvm_cpu_context *host_ctxt)
diff --git a/arch/arm64/kvm/hyp/nvhe/mm.c b/arch/arm64/kvm/hyp/nvhe/mm.c
index 526a7d6fa86f..e6355180aa49 100644
--- a/arch/arm64/kvm/hyp/nvhe/mm.c
+++ b/arch/arm64/kvm/hyp/nvhe/mm.c
@@ -37,38 +37,53 @@ static int __pkvm_create_mappings(unsigned long start, unsigned long size,
return err;
}

-unsigned long __pkvm_create_private_mapping(phys_addr_t phys, size_t size,
- enum kvm_pgtable_prot prot)
+/**
+ * pkvm_alloc_private_va_range - Allocates a private VA range.
+ * @size: The size of the VA range to reserve.
+ * @align: The required alignment for the allocation.
+ *
+ * The private VA range is allocated above __io_map_base.
+ */
+unsigned long pkvm_alloc_private_va_range(size_t size, size_t align)
{
- unsigned long addr;
- int err;
+ unsigned long base, addr;

hyp_spin_lock(&pkvm_pgd_lock);

- size = PAGE_ALIGN(size + offset_in_page(phys));
- addr = __io_map_base;
- __io_map_base += size;
+ addr = ALIGN(__io_map_base, align);
+
+ /* The allocated size is always a multiple of PAGE_SIZE */
+ base = addr + PAGE_ALIGN(size);

/* Are we overflowing on the vmemmap ? */
- if (__io_map_base > __hyp_vmemmap) {
- __io_map_base -= size;
+ if (!addr || base > __hyp_vmemmap)
addr = (unsigned long)ERR_PTR(-ENOMEM);
- goto out;
- }
-
- err = kvm_pgtable_hyp_map(&pkvm_pgtable, addr, size, phys, prot);
- if (err) {
- addr = (unsigned long)ERR_PTR(err);
- goto out;
- }
+ else
+ __io_map_base = base;

- addr = addr + offset_in_page(phys);
-out:
hyp_spin_unlock(&pkvm_pgd_lock);

return addr;
}

+unsigned long __pkvm_create_private_mapping(phys_addr_t phys, size_t size,
+ size_t align, enum kvm_pgtable_prot prot)
+{
+ unsigned long addr;
+ int err;
+
+ size += offset_in_page(phys);
+ addr = pkvm_alloc_private_va_range(size, align);
+ if (IS_ERR((void *)addr))
+ return addr;
+
+ err = __pkvm_create_mappings(addr, size, phys, prot);
+ if (err)
+ return (unsigned long)ERR_PTR(err);
+
+ return addr + offset_in_page(phys);
+}
+
int pkvm_create_mappings_locked(void *from, void *to, enum kvm_pgtable_prot prot)
{
unsigned long start = (unsigned long)from;
@@ -152,10 +167,9 @@ int hyp_map_vectors(void)
return 0;

phys = __hyp_pa(__bp_harden_hyp_vecs);
- bp_base = (void *)__pkvm_create_private_mapping(phys,
- __BP_HARDEN_HYP_VECS_SZ,
- PAGE_HYP_EXEC);
- if (IS_ERR_OR_NULL(bp_base))
+ bp_base = (void *)__pkvm_create_private_mapping(phys, __BP_HARDEN_HYP_VECS_SZ,
+ PAGE_SIZE, PAGE_HYP_EXEC);
+ if (IS_ERR(bp_base))
return PTR_ERR(bp_base);

__hyp_bp_vect_base = bp_base;
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index a4868a6fa1c3..433c49766671 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -506,7 +506,7 @@ int __create_hyp_private_mapping(phys_addr_t phys_addr, size_t size,

if (!kvm_host_owns_hyp_mappings()) {
addr = kvm_call_hyp_nvhe(__pkvm_create_private_mapping,
- phys_addr, size, prot);
+ phys_addr, size, align, prot);
if (IS_ERR((void *)addr))
return PTR_ERR((void *)addr);
*haddr = addr;
--
2.35.1.574.g5d30c73bfb-goog


2022-03-02 10:04:40

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v4 2/8] KVM: arm64: Introduce pkvm_alloc_private_va_range()

On Fri, 25 Feb 2022 03:34:47 +0000,
Kalesh Singh <[email protected]> wrote:
>
> pkvm_hyp_alloc_private_va_range() can be used to reserve private VA ranges
> in the pKVM nVHE hypervisor (). Also update __pkvm_create_private_mapping()
> to allow specifying an alignment for the private VA mapping.
>
> These will be used to implement stack guard pages for pKVM nVHE hypervisor
> (in a subsequent patch in the series).
>
> Credits to Quentin Perret <[email protected]> for the idea of moving
> private VA allocation out of __pkvm_create_private_mapping()
>
> Signed-off-by: Kalesh Singh <[email protected]>
> ---
>
> Changes in v4:
> - Handle null ptr in pkvm_alloc_private_va_range() and replace
> IS_ERR_OR_NULL checks in callers with IS_ERR checks, per Fuad
> - Fix kernel-doc comments format, per Fuad
> - Format __pkvm_create_private_mapping() prototype args (< 80 col), per Fuad
>
> Changes in v3:
> - Handle null ptr in IS_ERR_OR_NULL checks, per Mark
>
> Changes in v2:
> - Allow specifying an alignment for the private VA allocations, per Marc

I probably badly expressed my earlier concern.

Yes, an alignment is necessary. But how often do we want an alignment
that isn't naturally aligned to the size of the allocation (i.e. the
power of 2 >= the size of the allocation)? This is what the rest of
the kernel does (get_order() and co), and I thing we should follow
this.

This applies to both this patch and the previous one.

Thanks,

M.

--
Without deviation from the norm, progress is not possible.

2022-03-02 22:31:19

by Kalesh Singh

[permalink] [raw]
Subject: Re: [PATCH v4 2/8] KVM: arm64: Introduce pkvm_alloc_private_va_range()

On Tue, Mar 1, 2022 at 11:46 PM Marc Zyngier <[email protected]> wrote:
>
> On Fri, 25 Feb 2022 03:34:47 +0000,
> Kalesh Singh <[email protected]> wrote:
> >
> > pkvm_hyp_alloc_private_va_range() can be used to reserve private VA ranges
> > in the pKVM nVHE hypervisor (). Also update __pkvm_create_private_mapping()
> > to allow specifying an alignment for the private VA mapping.
> >
> > These will be used to implement stack guard pages for pKVM nVHE hypervisor
> > (in a subsequent patch in the series).
> >
> > Credits to Quentin Perret <[email protected]> for the idea of moving
> > private VA allocation out of __pkvm_create_private_mapping()
> >
> > Signed-off-by: Kalesh Singh <[email protected]>
> > ---
> >
> > Changes in v4:
> > - Handle null ptr in pkvm_alloc_private_va_range() and replace
> > IS_ERR_OR_NULL checks in callers with IS_ERR checks, per Fuad
> > - Fix kernel-doc comments format, per Fuad
> > - Format __pkvm_create_private_mapping() prototype args (< 80 col), per Fuad
> >
> > Changes in v3:
> > - Handle null ptr in IS_ERR_OR_NULL checks, per Mark
> >
> > Changes in v2:
> > - Allow specifying an alignment for the private VA allocations, per Marc
>
> I probably badly expressed my earlier concern.
>
> Yes, an alignment is necessary. But how often do we want an alignment
> that isn't naturally aligned to the size of the allocation (i.e. the
> power of 2 >= the size of the allocation)? This is what the rest of
> the kernel does (get_order() and co), and I thing we should follow
> this.

Hi Marc,

Thanks for clarifying. I think making the alignment implicitly based
on the size here will create unnecessary holes where PAGE_SIZE
alignment would be ok and potentially overflow the private VA space
earlier. Is it not a concern?

- Kalesh
>
> This applies to both this patch and the previous one.
>
> Thanks,
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.
>
> --
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>

2022-03-03 18:17:22

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v4 2/8] KVM: arm64: Introduce pkvm_alloc_private_va_range()

Hi Kalesh,

On Wed, 02 Mar 2022 17:24:53 +0000,
Kalesh Singh <[email protected]> wrote:
>
> On Tue, Mar 1, 2022 at 11:46 PM Marc Zyngier <[email protected]> wrote:
> >
> > On Fri, 25 Feb 2022 03:34:47 +0000,
> > Kalesh Singh <[email protected]> wrote:
> > >
> > > pkvm_hyp_alloc_private_va_range() can be used to reserve private VA ranges
> > > in the pKVM nVHE hypervisor (). Also update __pkvm_create_private_mapping()
> > > to allow specifying an alignment for the private VA mapping.
> > >
> > > These will be used to implement stack guard pages for pKVM nVHE hypervisor
> > > (in a subsequent patch in the series).
> > >
> > > Credits to Quentin Perret <[email protected]> for the idea of moving
> > > private VA allocation out of __pkvm_create_private_mapping()
> > >
> > > Signed-off-by: Kalesh Singh <[email protected]>
> > > ---
> > >
> > > Changes in v4:
> > > - Handle null ptr in pkvm_alloc_private_va_range() and replace
> > > IS_ERR_OR_NULL checks in callers with IS_ERR checks, per Fuad
> > > - Fix kernel-doc comments format, per Fuad
> > > - Format __pkvm_create_private_mapping() prototype args (< 80 col), per Fuad
> > >
> > > Changes in v3:
> > > - Handle null ptr in IS_ERR_OR_NULL checks, per Mark
> > >
> > > Changes in v2:
> > > - Allow specifying an alignment for the private VA allocations, per Marc
> >
> > I probably badly expressed my earlier concern.
> >
> > Yes, an alignment is necessary. But how often do we want an alignment
> > that isn't naturally aligned to the size of the allocation (i.e. the
> > power of 2 >= the size of the allocation)? This is what the rest of
> > the kernel does (get_order() and co), and I thing we should follow
> > this.
>
> Hi Marc,
>
> Thanks for clarifying. I think making the alignment implicitly based
> on the size here will create unnecessary holes where PAGE_SIZE
> alignment would be ok and potentially overflow the private VA space
> earlier. Is it not a concern?

I don't think we should worry too much about this. Even when building
the kernel with a very small VA space (commonly 39 bits), we still
have a quarter of that reserved for private EL2 mappings. That's
pretty big.

We will use a bit more of the memory that is set aside for EL2 page
tables, but this shouldn't be a problem either.

Thanks,

M.

--
Without deviation from the norm, progress is not possible.

2022-03-03 18:26:54

by Kalesh Singh

[permalink] [raw]
Subject: Re: [PATCH v4 2/8] KVM: arm64: Introduce pkvm_alloc_private_va_range()

On Thu, Mar 3, 2022 at 9:29 AM Marc Zyngier <[email protected]> wrote:
>
> Hi Kalesh,
>
> On Wed, 02 Mar 2022 17:24:53 +0000,
> Kalesh Singh <[email protected]> wrote:
> >
> > On Tue, Mar 1, 2022 at 11:46 PM Marc Zyngier <[email protected]> wrote:
> > >
> > > On Fri, 25 Feb 2022 03:34:47 +0000,
> > > Kalesh Singh <[email protected]> wrote:
> > > >
> > > > pkvm_hyp_alloc_private_va_range() can be used to reserve private VA ranges
> > > > in the pKVM nVHE hypervisor (). Also update __pkvm_create_private_mapping()
> > > > to allow specifying an alignment for the private VA mapping.
> > > >
> > > > These will be used to implement stack guard pages for pKVM nVHE hypervisor
> > > > (in a subsequent patch in the series).
> > > >
> > > > Credits to Quentin Perret <[email protected]> for the idea of moving
> > > > private VA allocation out of __pkvm_create_private_mapping()
> > > >
> > > > Signed-off-by: Kalesh Singh <[email protected]>
> > > > ---
> > > >
> > > > Changes in v4:
> > > > - Handle null ptr in pkvm_alloc_private_va_range() and replace
> > > > IS_ERR_OR_NULL checks in callers with IS_ERR checks, per Fuad
> > > > - Fix kernel-doc comments format, per Fuad
> > > > - Format __pkvm_create_private_mapping() prototype args (< 80 col), per Fuad
> > > >
> > > > Changes in v3:
> > > > - Handle null ptr in IS_ERR_OR_NULL checks, per Mark
> > > >
> > > > Changes in v2:
> > > > - Allow specifying an alignment for the private VA allocations, per Marc
> > >
> > > I probably badly expressed my earlier concern.
> > >
> > > Yes, an alignment is necessary. But how often do we want an alignment
> > > that isn't naturally aligned to the size of the allocation (i.e. the
> > > power of 2 >= the size of the allocation)? This is what the rest of
> > > the kernel does (get_order() and co), and I thing we should follow
> > > this.
> >
> > Hi Marc,
> >
> > Thanks for clarifying. I think making the alignment implicitly based
> > on the size here will create unnecessary holes where PAGE_SIZE
> > alignment would be ok and potentially overflow the private VA space
> > earlier. Is it not a concern?
>
> I don't think we should worry too much about this. Even when building
> the kernel with a very small VA space (commonly 39 bits), we still
> have a quarter of that reserved for private EL2 mappings. That's
> pretty big.
>
> We will use a bit more of the memory that is set aside for EL2 page
> tables, but this shouldn't be a problem either.

Hi Marc,

Thanks for the explanations. I'll update as suggested in the next version.

- Kalesh

>
> Thanks,
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.