2022-06-16 16:19:27

by Quentin Perret

Subject: [PATCH] KVM: arm64: Prevent kmemleak from accessing pKVM memory

Commit a7259df76702 ("memblock: make memblock_find_in_range method
private") changed the API using which memory is reserved for the pKVM
hypervisor. However, it seems that memblock_phys_alloc() differs
from the original API in terms of kmemleak semantics -- the old one
excluded the reserved regions from kmemleak scans when the new one
doesn't seem to. Unfortunately, when protected KVM is enabled, all
kernel accesses to pKVM-private memory result in a fatal exception,
which can now happen because of kmemleak scans:

$ echo scan > /sys/kernel/debug/kmemleak
[ 34.991354] kvm [304]: nVHE hyp BUG at: [<ffff800008fa3750>] __kvm_nvhe_handle_host_mem_abort+0x270/0x290!
[ 34.991580] kvm [304]: Hyp Offset: 0xfffe8be807e00000
[ 34.991813] Kernel panic - not syncing: HYP panic:
[ 34.991813] PS:600003c9 PC:0000f418011a3750 ESR:00000000f2000800
[ 34.991813] FAR:ffff000439200000 HPFAR:0000000004792000 PAR:0000000000000000
[ 34.991813] VCPU:0000000000000000
[ 34.993660] CPU: 0 PID: 304 Comm: bash Not tainted 5.19.0-rc2 #102
[ 34.994059] Hardware name: linux,dummy-virt (DT)
[ 34.994452] Call trace:
[ 34.994641] dump_backtrace.part.0+0xcc/0xe0
[ 34.994932] show_stack+0x18/0x6c
[ 34.995094] dump_stack_lvl+0x68/0x84
[ 34.995276] dump_stack+0x18/0x34
[ 34.995484] panic+0x16c/0x354
[ 34.995673] __hyp_pgtable_total_pages+0x0/0x60
[ 34.995933] scan_block+0x74/0x12c
[ 34.996129] scan_gray_list+0xd8/0x19c
[ 34.996332] kmemleak_scan+0x2c8/0x580
[ 34.996535] kmemleak_write+0x340/0x4a0
[ 34.996744] full_proxy_write+0x60/0xbc
[ 34.996967] vfs_write+0xc4/0x2b0
[ 34.997136] ksys_write+0x68/0xf4
[ 34.997311] __arm64_sys_write+0x20/0x2c
[ 34.997532] invoke_syscall+0x48/0x114
[ 34.997779] el0_svc_common.constprop.0+0x44/0xec
[ 34.998029] do_el0_svc+0x2c/0xc0
[ 34.998205] el0_svc+0x2c/0x84
[ 34.998421] el0t_64_sync_handler+0xf4/0x100
[ 34.998653] el0t_64_sync+0x18c/0x190
[ 34.999252] SMP: stopping secondary CPUs
[ 35.000034] Kernel Offset: disabled
[ 35.000261] CPU features: 0x800,00007831,00001086
[ 35.000642] Memory Limit: none
[ 35.001329] ---[ end Kernel panic - not syncing: HYP panic:
[ 35.001329] PS:600003c9 PC:0000f418011a3750 ESR:00000000f2000800
[ 35.001329] FAR:ffff000439200000 HPFAR:0000000004792000 PAR:0000000000000000
[ 35.001329] VCPU:0000000000000000 ]---

Fix this by explicitly excluding the hypervisor's memory pool from
kmemleak like we already do for the hyp BSS.

Cc: Mike Rapoport <[email protected]>
Fixes: a7259df76702 ("memblock: make memblock_find_in_range method private")
Signed-off-by: Quentin Perret <[email protected]>
---
An alternative could be to actually exclude memory allocated using
memblock_phys_alloc_range() from kmemleak scans to revert back to the
old behaviour. But nobody else has complained about this AFAIK, so I'd
be inclined to keep this local to pKVM. No strong opinion.
---
arch/arm64/kvm/arm.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 400bb0fe2745..28765bd22efb 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -2110,11 +2110,11 @@ static int finalize_hyp_mode(void)
 		return 0;

 	/*
-	 * Exclude HYP BSS from kmemleak so that it doesn't get peeked
-	 * at, which would end badly once the section is inaccessible.
-	 * None of other sections should ever be introspected.
+	 * Exclude HYP sections from kmemleak so that they don't get peeked
+	 * at, which would end badly once inaccessible.
 	 */
 	kmemleak_free_part(__hyp_bss_start, __hyp_bss_end - __hyp_bss_start);
+	kmemleak_free_part(__va(hyp_mem_base), hyp_mem_size);
 	return pkvm_drop_host_privileges();
 }

--
2.36.1.476.g0c4daa206d-goog
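
For readers unfamiliar with kmemleak_free_part(), here is a minimal userspace sketch of the idea the patch relies on: once a range is carved out of a tracked object, a later scan no longer reads it. This is a toy model and not the kernel implementation. Every toy_* name, the fixed-size table, and the addresses used are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the set of objects a leak scanner walks (hypothetical). */
#define TOY_MAX_OBJS 8

struct toy_obj {
	uintptr_t start;
	size_t size;
};

static struct toy_obj toy_objs[TOY_MAX_OBJS];
static size_t toy_nr_objs;

/* Register [start, start + size) as scannable. */
static void toy_alloc(uintptr_t start, size_t size)
{
	assert(toy_nr_objs < TOY_MAX_OBJS);
	toy_objs[toy_nr_objs++] = (struct toy_obj){ start, size };
}

/*
 * Remove [start, start + size) from the tracked object that fully
 * contains it, keeping whatever is left on either side of the hole.
 * Ranges that are not fully inside one object are ignored.
 */
static void toy_free_part(uintptr_t start, size_t size)
{
	uintptr_t end = start + size;

	for (size_t i = 0; i < toy_nr_objs; i++) {
		struct toy_obj o = toy_objs[i];
		uintptr_t o_end = o.start + o.size;

		if (start < o.start || end > o_end)
			continue;

		/* Drop the original object by moving the last one into its slot. */
		toy_objs[i] = toy_objs[--toy_nr_objs];

		/* Re-add the head and tail that surround the hole, if any. */
		if (start > o.start)
			toy_alloc(o.start, start - o.start);
		if (o_end > end)
			toy_alloc(end, o_end - end);
		return;
	}
}

/* Would a scan read the byte at addr? Returns 1 if so, 0 otherwise. */
static int toy_scans(uintptr_t addr)
{
	for (size_t i = 0; i < toy_nr_objs; i++)
		if (addr >= toy_objs[i].start &&
		    addr - toy_objs[i].start < toy_objs[i].size)
			return 1;
	return 0;
}
```

In this model, calling toy_alloc(0x1000, 0x3000) and then toy_free_part(0x2000, 0x1000) leaves two tracked pieces, and toy_scans() returns 0 for any address inside the removed range. That mirrors the intent of the patch: the hypervisor pool stops being something the scanner dereferences.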


2022-06-16 17:52:50

by Catalin Marinas

Subject: Re: [PATCH] KVM: arm64: Prevent kmemleak from accessing pKVM memory

On Thu, Jun 16, 2022 at 04:11:34PM +0000, Quentin Perret wrote:
> Commit a7259df76702 ("memblock: make memblock_find_in_range method
> private") changed the API using which memory is reserved for the pKVM
> hypervisor. However, it seems that memblock_phys_alloc() differs
> from the original API in terms of kmemleak semantics -- the old one
> excluded the reserved regions from kmemleak scans when the new one
> doesn't seem to. Unfortunately, when protected KVM is enabled, all
> kernel accesses to pKVM-private memory result in a fatal exception,
> which can now happen because of kmemleak scans:
>
> $ echo scan > /sys/kernel/debug/kmemleak
> [ 34.991354] kvm [304]: nVHE hyp BUG at: [<ffff800008fa3750>] __kvm_nvhe_handle_host_mem_abort+0x270/0x290!
> [ 34.991580] kvm [304]: Hyp Offset: 0xfffe8be807e00000
> [ 34.991813] Kernel panic - not syncing: HYP panic:
> [ 34.991813] PS:600003c9 PC:0000f418011a3750 ESR:00000000f2000800
> [ 34.991813] FAR:ffff000439200000 HPFAR:0000000004792000 PAR:0000000000000000
> [ 34.991813] VCPU:0000000000000000
> [ 34.993660] CPU: 0 PID: 304 Comm: bash Not tainted 5.19.0-rc2 #102
> [ 34.994059] Hardware name: linux,dummy-virt (DT)
> [ 34.994452] Call trace:
> [ 34.994641] dump_backtrace.part.0+0xcc/0xe0
> [ 34.994932] show_stack+0x18/0x6c
> [ 34.995094] dump_stack_lvl+0x68/0x84
> [ 34.995276] dump_stack+0x18/0x34
> [ 34.995484] panic+0x16c/0x354
> [ 34.995673] __hyp_pgtable_total_pages+0x0/0x60
> [ 34.995933] scan_block+0x74/0x12c
> [ 34.996129] scan_gray_list+0xd8/0x19c
> [ 34.996332] kmemleak_scan+0x2c8/0x580
> [ 34.996535] kmemleak_write+0x340/0x4a0
> [ 34.996744] full_proxy_write+0x60/0xbc
> [ 34.996967] vfs_write+0xc4/0x2b0
> [ 34.997136] ksys_write+0x68/0xf4
> [ 34.997311] __arm64_sys_write+0x20/0x2c
> [ 34.997532] invoke_syscall+0x48/0x114
> [ 34.997779] el0_svc_common.constprop.0+0x44/0xec
> [ 34.998029] do_el0_svc+0x2c/0xc0
> [ 34.998205] el0_svc+0x2c/0x84
> [ 34.998421] el0t_64_sync_handler+0xf4/0x100
> [ 34.998653] el0t_64_sync+0x18c/0x190
> [ 34.999252] SMP: stopping secondary CPUs
> [ 35.000034] Kernel Offset: disabled
> [ 35.000261] CPU features: 0x800,00007831,00001086
> [ 35.000642] Memory Limit: none
> [ 35.001329] ---[ end Kernel panic - not syncing: HYP panic:
> [ 35.001329] PS:600003c9 PC:0000f418011a3750 ESR:00000000f2000800
> [ 35.001329] FAR:ffff000439200000 HPFAR:0000000004792000 PAR:0000000000000000
> [ 35.001329] VCPU:0000000000000000 ]---
>
> Fix this by explicitly excluding the hypervisor's memory pool from
> kmemleak like we already do for the hyp BSS.
>
> Cc: Mike Rapoport <[email protected]>
> Fixes: a7259df76702 ("memblock: make memblock_find_in_range method private")
> Signed-off-by: Quentin Perret <[email protected]>
> ---
> An alternative could be to actually exclude memory allocated using
> memblock_phys_alloc_range() from kmemleak scans to revert back to the
> old behaviour. But nobody else has complained about this AFAIK, so I'd
> be inclined to keep this local to pKVM. No strong opinion.

This works for me, I haven't heard anyone else complaining.

Acked-by: Catalin Marinas <[email protected]>

2022-06-17 08:38:59

by Mike Rapoport

Subject: Re: [PATCH] KVM: arm64: Prevent kmemleak from accessing pKVM memory

On Thu, Jun 16, 2022 at 04:11:34PM +0000, Quentin Perret wrote:
> Commit a7259df76702 ("memblock: make memblock_find_in_range method
> private") changed the API using which memory is reserved for the pKVM
> hypervisor. However, it seems that memblock_phys_alloc() differs
> from the original API in terms of kmemleak semantics -- the old one
> excluded the reserved regions from kmemleak scans when the new one
> doesn't seem to. Unfortunately, when protected KVM is enabled, all

I'd rather say that memblock_find_in_range() didn't inform kmemleak about
the reserved regions, while memblock_phys_alloc() does.

> kernel accesses to pKVM-private memory result in a fatal exception,
> which can now happen because of kmemleak scans:
>
> $ echo scan > /sys/kernel/debug/kmemleak
> [ 34.991354] kvm [304]: nVHE hyp BUG at: [<ffff800008fa3750>] __kvm_nvhe_handle_host_mem_abort+0x270/0x290!
> [ 34.991580] kvm [304]: Hyp Offset: 0xfffe8be807e00000
> [ 34.991813] Kernel panic - not syncing: HYP panic:
> [ 34.991813] PS:600003c9 PC:0000f418011a3750 ESR:00000000f2000800
> [ 34.991813] FAR:ffff000439200000 HPFAR:0000000004792000 PAR:0000000000000000
> [ 34.991813] VCPU:0000000000000000
> [ 34.993660] CPU: 0 PID: 304 Comm: bash Not tainted 5.19.0-rc2 #102
> [ 34.994059] Hardware name: linux,dummy-virt (DT)
> [ 34.994452] Call trace:
> [ 34.994641] dump_backtrace.part.0+0xcc/0xe0
> [ 34.994932] show_stack+0x18/0x6c
> [ 34.995094] dump_stack_lvl+0x68/0x84
> [ 34.995276] dump_stack+0x18/0x34
> [ 34.995484] panic+0x16c/0x354
> [ 34.995673] __hyp_pgtable_total_pages+0x0/0x60
> [ 34.995933] scan_block+0x74/0x12c
> [ 34.996129] scan_gray_list+0xd8/0x19c
> [ 34.996332] kmemleak_scan+0x2c8/0x580
> [ 34.996535] kmemleak_write+0x340/0x4a0
> [ 34.996744] full_proxy_write+0x60/0xbc
> [ 34.996967] vfs_write+0xc4/0x2b0
> [ 34.997136] ksys_write+0x68/0xf4
> [ 34.997311] __arm64_sys_write+0x20/0x2c
> [ 34.997532] invoke_syscall+0x48/0x114
> [ 34.997779] el0_svc_common.constprop.0+0x44/0xec
> [ 34.998029] do_el0_svc+0x2c/0xc0
> [ 34.998205] el0_svc+0x2c/0x84
> [ 34.998421] el0t_64_sync_handler+0xf4/0x100
> [ 34.998653] el0t_64_sync+0x18c/0x190
> [ 34.999252] SMP: stopping secondary CPUs
> [ 35.000034] Kernel Offset: disabled
> [ 35.000261] CPU features: 0x800,00007831,00001086
> [ 35.000642] Memory Limit: none
> [ 35.001329] ---[ end Kernel panic - not syncing: HYP panic:
> [ 35.001329] PS:600003c9 PC:0000f418011a3750 ESR:00000000f2000800
> [ 35.001329] FAR:ffff000439200000 HPFAR:0000000004792000 PAR:0000000000000000
> [ 35.001329] VCPU:0000000000000000 ]---
>
> Fix this by explicitly excluding the hypervisor's memory pool from
> kmemleak like we already do for the hyp BSS.
>
> Cc: Mike Rapoport <[email protected]>
> Fixes: a7259df76702 ("memblock: make memblock_find_in_range method private")
> Signed-off-by: Quentin Perret <[email protected]>
> ---
> An alternative could be to actually exclude memory allocated using
> memblock_phys_alloc_range() from kmemleak scans to revert back to the
> old behaviour.

This would be wrong because memblock_phys_alloc() does allocate memory, and
unless there is a good reason to exclude it, kmemleak should know about it.

> But nobody else has complained about this AFAIK, so I'd be inclined to
> keep this local to pKVM. No strong opinion.

Yes, please :)
An alternative to excluding this memory from kmemleak is to allocate it
using

memblock_phys_alloc_range(size, align, 0, MEMBLOCK_ALLOC_NOLEAKTRACE)

then it won't be added to kmemleak in the first place.
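
A rough userspace sketch of that alternative, assuming the allocator treats a special "end" value as a request to skip leak tracking. This is only a model of the behaviour described above, not memblock code. The toy_* names, the bump allocator, the starting address, and the sentinel values are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinels, loosely modelled on the MEMBLOCK_ALLOC_* idea. */
#define TOY_ALLOC_ACCESSIBLE   ((uintptr_t)0)
#define TOY_ALLOC_NOLEAKTRACE  ((uintptr_t)1)

static uintptr_t toy_next_free = 0x100000; /* bump-allocator cursor */
static size_t toy_tracked_bytes;           /* bytes reported to the tracker */

/*
 * Allocate size bytes at or above start, aligned to align (a power of
 * two). The region is reported to the toy leak tracker unless the
 * caller passes TOY_ALLOC_NOLEAKTRACE as end.
 */
static uintptr_t toy_phys_alloc_range(size_t size, size_t align,
				      uintptr_t start, uintptr_t end)
{
	uintptr_t mask = (uintptr_t)align - 1;
	uintptr_t base = toy_next_free > start ? toy_next_free : start;

	assert(align != 0 && (align & (align - 1)) == 0);

	/* Round up to the requested alignment and advance the cursor. */
	base = (base + mask) & ~mask;
	toy_next_free = base + size;

	/* Only tell the tracker about regions that should be scanned. */
	if (end != TOY_ALLOC_NOLEAKTRACE)
		toy_tracked_bytes += size;

	return base;
}
```

The design difference from the patch is where the decision is made: here the region is never registered, so nothing has to be removed from the tracker later.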

> ---
> arch/arm64/kvm/arm.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 400bb0fe2745..28765bd22efb 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -2110,11 +2110,11 @@ static int finalize_hyp_mode(void)
>  		return 0;
>
>  	/*
> -	 * Exclude HYP BSS from kmemleak so that it doesn't get peeked
> -	 * at, which would end badly once the section is inaccessible.
> -	 * None of other sections should ever be introspected.
> +	 * Exclude HYP sections from kmemleak so that they don't get peeked
> +	 * at, which would end badly once inaccessible.
>  	 */
>  	kmemleak_free_part(__hyp_bss_start, __hyp_bss_end - __hyp_bss_start);
> +	kmemleak_free_part(__va(hyp_mem_base), hyp_mem_size);
>  	return pkvm_drop_host_privileges();
>  }
>
> --
> 2.36.1.476.g0c4daa206d-goog
>

--
Sincerely yours,
Mike.

2022-06-17 08:50:58

by Quentin Perret

Subject: Re: [PATCH] KVM: arm64: Prevent kmemleak from accessing pKVM memory

On Friday 17 Jun 2022 at 11:38:14 (+0300), Mike Rapoport wrote:
> On Fri, Jun 17, 2022 at 09:21:31AM +0100, Marc Zyngier wrote:
> > On Thu, 16 Jun 2022 16:11:34 +0000, Quentin Perret wrote:
> > > Commit a7259df76702 ("memblock: make memblock_find_in_range method
> > > private") changed the API using which memory is reserved for the pKVM
> > > hypervisor. However, it seems that memblock_phys_alloc() differs
> > > from the original API in terms of kmemleak semantics -- the old one
> > > excluded the reserved regions from kmemleak scans when the new one
> > > doesn't seem to. Unfortunately, when protected KVM is enabled, all
> > > kernel accesses to pKVM-private memory result in a fatal exception,
> > > which can now happen because of kmemleak scans:
> > >
> > > [...]
> >
> > Applied to fixes, thanks!
> >
> > [1/1] KVM: arm64: Prevent kmemleak from accessing pKVM memory
> > commit: 9e5afa8a537f742bccc2cd91bc0bef4b6483ee98
>
> I'd really like to update the changelog to this:
>
> Commit a7259df76702 ("memblock: make memblock_find_in_range method
> private") changed the API using which memory is reserved for the pKVM
> hypervisor. However, memblock_phys_alloc() differs from the original API in
> terms of kmemleak semantics -- the old one didn't report the reserved
> regions to kmemleak while the new one does. Unfortunately, when protected
> KVM is enabled, all kernel accesses to pKVM-private memory result in a
> fatal exception, which can now happen because of kmemleak scans:
>
> $ echo scan > /sys/kernel/debug/kmemleak
> [ 34.991354] kvm [304]: nVHE hyp BUG at: [<ffff800008fa3750>] __kvm_nvhe_handle_host_mem_abort+0x270/0x290!
> ...
>
> Fix this by explicitly excluding the hypervisor's memory pool from
> kmemleak like we already do for the hyp BSS.

Looks good to me, thanks.

Quentin

2022-06-17 09:08:46

by Marc Zyngier

Subject: Re: [PATCH] KVM: arm64: Prevent kmemleak from accessing pKVM memory

On 2022-06-17 09:45, Quentin Perret wrote:
> On Friday 17 Jun 2022 at 11:38:14 (+0300), Mike Rapoport wrote:
>> On Fri, Jun 17, 2022 at 09:21:31AM +0100, Marc Zyngier wrote:
>> > On Thu, 16 Jun 2022 16:11:34 +0000, Quentin Perret wrote:
>> > > Commit a7259df76702 ("memblock: make memblock_find_in_range method
>> > > private") changed the API using which memory is reserved for the pKVM
>> > > hypervisor. However, it seems that memblock_phys_alloc() differs
>> > > from the original API in terms of kmemleak semantics -- the old one
>> > > excluded the reserved regions from kmemleak scans when the new one
>> > > doesn't seem to. Unfortunately, when protected KVM is enabled, all
>> > > kernel accesses to pKVM-private memory result in a fatal exception,
>> > > which can now happen because of kmemleak scans:
>> > >
>> > > [...]
>> >
>> > Applied to fixes, thanks!
>> >
>> > [1/1] KVM: arm64: Prevent kmemleak from accessing pKVM memory
>> > commit: 9e5afa8a537f742bccc2cd91bc0bef4b6483ee98
>>
>> I'd really like to update the changelog to this:
>>
>> Commit a7259df76702 ("memblock: make memblock_find_in_range method
>> private") changed the API using which memory is reserved for the pKVM
>> hypervisor. However, memblock_phys_alloc() differs from the original API in
>> terms of kmemleak semantics -- the old one didn't report the reserved
>> regions to kmemleak while the new one does. Unfortunately, when protected
>> KVM is enabled, all kernel accesses to pKVM-private memory result in a
>> fatal exception, which can now happen because of kmemleak scans:
>>
>> $ echo scan > /sys/kernel/debug/kmemleak
>> [ 34.991354] kvm [304]: nVHE hyp BUG at: [<ffff800008fa3750>] __kvm_nvhe_handle_host_mem_abort+0x270/0x290!
>> ...
>>
>> Fix this by explicitly excluding the hypervisor's memory pool from
>> kmemleak like we already do for the hyp BSS.
>
> Looks good to me, thanks.

Now updated. Thanks,

M.
--
Jazz is not dead. It just smells funny...

2022-06-17 09:11:20

by Marc Zyngier

Subject: Re: [PATCH] KVM: arm64: Prevent kmemleak from accessing pKVM memory

On Thu, 16 Jun 2022 16:11:34 +0000, Quentin Perret wrote:
> Commit a7259df76702 ("memblock: make memblock_find_in_range method
> private") changed the API using which memory is reserved for the pKVM
> hypervisor. However, it seems that memblock_phys_alloc() differs
> from the original API in terms of kmemleak semantics -- the old one
> excluded the reserved regions from kmemleak scans when the new one
> doesn't seem to. Unfortunately, when protected KVM is enabled, all
> kernel accesses to pKVM-private memory result in a fatal exception,
> which can now happen because of kmemleak scans:
>
> [...]

Applied to fixes, thanks!

[1/1] KVM: arm64: Prevent kmemleak from accessing pKVM memory
commit: 9e5afa8a537f742bccc2cd91bc0bef4b6483ee98

Cheers,

M.
--
Marc Zyngier <[email protected]>

2022-06-17 09:13:27

by Mike Rapoport

Subject: Re: [PATCH] KVM: arm64: Prevent kmemleak from accessing pKVM memory

On Fri, Jun 17, 2022 at 09:21:31AM +0100, Marc Zyngier wrote:
> On Thu, 16 Jun 2022 16:11:34 +0000, Quentin Perret wrote:
> > Commit a7259df76702 ("memblock: make memblock_find_in_range method
> > private") changed the API using which memory is reserved for the pKVM
> > hypervisor. However, it seems that memblock_phys_alloc() differs
> > from the original API in terms of kmemleak semantics -- the old one
> > excluded the reserved regions from kmemleak scans when the new one
> > doesn't seem to. Unfortunately, when protected KVM is enabled, all
> > kernel accesses to pKVM-private memory result in a fatal exception,
> > which can now happen because of kmemleak scans:
> >
> > [...]
>
> Applied to fixes, thanks!
>
> [1/1] KVM: arm64: Prevent kmemleak from accessing pKVM memory
> commit: 9e5afa8a537f742bccc2cd91bc0bef4b6483ee98

I'd really like to update the changelog to this:

Commit a7259df76702 ("memblock: make memblock_find_in_range method
private") changed the API using which memory is reserved for the pKVM
hypervisor. However, memblock_phys_alloc() differs from the original API in
terms of kmemleak semantics -- the old one didn't report the reserved
regions to kmemleak while the new one does. Unfortunately, when protected
KVM is enabled, all kernel accesses to pKVM-private memory result in a
fatal exception, which can now happen because of kmemleak scans:

$ echo scan > /sys/kernel/debug/kmemleak
[ 34.991354] kvm [304]: nVHE hyp BUG at: [<ffff800008fa3750>] __kvm_nvhe_handle_host_mem_abort+0x270/0x290!
...

Fix this by explicitly excluding the hypervisor's memory pool from
kmemleak like we already do for the hyp BSS.


> Cheers,
>
> M.
> --
> Marc Zyngier <[email protected]>
>

--
Sincerely yours,
Mike.