2023-07-03 19:20:33

by John Hubbard

[permalink] [raw]
Subject: [PATCH] mm: riscv: fix an unsafe pte read in huge_pte_alloc()

The WARN_ON_ONCE() statement in riscv's huge_pte_alloc() is susceptible
to false positives, because the pte is read twice at the C language
level, locklessly, within the same conditional statement. Depending on
compiler behavior, this can lead to generated machine code that actually
reads the pte just once, or twice. Reading twice will expose the code to
changing pte values and cause incorrect behavior.

In [1], similar code actually caused a kernel crash on 64-bit x86, when
using clang to build the kernel, but only after the conversion from *pte
reads, to ptep_get(pte). The latter uses READ_ONCE(), which forced a
double read of *pte.

Rather than waiting for the upcoming ptep_get() conversion, just convert
this part of the code now, but in a way that avoids the above problem:
take a single snapshot of the pte before using it in the WARN
conditional.

As expected, this preparatory step does not actually change the
generated code ("make mm/hugetlbpage.s"), on riscv64, when using a gcc
12.2 cross compiler.

[1] https://lore.kernel.org/[email protected]

Suggested-by: James Houghton <[email protected]>
Cc: Ryan Roberts <[email protected]>
Signed-off-by: John Hubbard <[email protected]>
---
arch/riscv/mm/hugetlbpage.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
index 542883b3b49b..96225a8533ad 100644
--- a/arch/riscv/mm/hugetlbpage.c
+++ b/arch/riscv/mm/hugetlbpage.c
@@ -73,7 +73,11 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
}

out:
- WARN_ON_ONCE(pte && pte_present(*pte) && !pte_huge(*pte));
+ if (pte) {
+ pte_t pteval = ptep_get_lockless(pte);
+
+ WARN_ON_ONCE(pte_present(pteval) && !pte_huge(pteval));
+ }
return pte;
}


base-commit: 0a8d6c9c7128a93689fba384cdd7f72b0ce19abd
--
2.41.0



2023-07-04 06:20:26

by Andrew Jones

[permalink] [raw]
Subject: Re: [PATCH] mm: riscv: fix an unsafe pte read in huge_pte_alloc()

On Mon, Jul 03, 2023 at 12:00:44PM -0700, John Hubbard wrote:
> The WARN_ON_ONCE() statement in riscv's huge_pte_alloc() is susceptible
> to false positives, because the pte is read twice at the C language
> level, locklessly, within the same conditional statement. Depending on
> compiler behavior, this can lead to generated machine code that actually
> reads the pte just once, or twice. Reading twice will expose the code to
> changing pte values and cause incorrect behavior.
>
> In [1], similar code actually caused a kernel crash on 64-bit x86, when
> using clang to build the kernel, but only after the conversion from *pte
> reads, to ptep_get(pte). The latter uses READ_ONCE(), which forced a
> double read of *pte.
>
> Rather than waiting for the upcoming ptep_get() conversion, just convert
> this part of the code now, but in a way that avoids the above problem:
> take a single snapshot of the pte before using it in the WARN
> conditional.
>
> As expected, this preparatory step does not actually change the
> generated code ("make mm/hugetlbpage.s"), on riscv64, when using a gcc
> 12.2 cross compiler.
>
> [1] https://lore.kernel.org/[email protected]
>
> Suggested-by: James Houghton <[email protected]>
> Cc: Ryan Roberts <[email protected]>
> Signed-off-by: John Hubbard <[email protected]>
> ---
> arch/riscv/mm/hugetlbpage.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
> index 542883b3b49b..96225a8533ad 100644
> --- a/arch/riscv/mm/hugetlbpage.c
> +++ b/arch/riscv/mm/hugetlbpage.c
> @@ -73,7 +73,11 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
> }
>
> out:
> - WARN_ON_ONCE(pte && pte_present(*pte) && !pte_huge(*pte));
> + if (pte) {
> + pte_t pteval = ptep_get_lockless(pte);

I think ptep_get_lockless() on riscv (even riscv32) will always just be
ptep_get(), since pte_t is unsigned long, which can be read atomically.

> +
> + WARN_ON_ONCE(pte_present(pteval) && !pte_huge(pteval));

Ensuring we only read the pte once is good though.

Reviewed-by: Andrew Jones <[email protected]>

Thanks,
drew


> + }
> return pte;
> }
>
>
> base-commit: 0a8d6c9c7128a93689fba384cdd7f72b0ce19abd
> --
> 2.41.0
>

2023-07-04 07:17:36

by Ryan Roberts

[permalink] [raw]
Subject: Re: [PATCH] mm: riscv: fix an unsafe pte read in huge_pte_alloc()

On 03/07/2023 20:00, John Hubbard wrote:
> The WARN_ON_ONCE() statement in riscv's huge_pte_alloc() is susceptible
> to false positives, because the pte is read twice at the C language
> level, locklessly, within the same conditional statement. Depending on
> compiler behavior, this can lead to generated machine code that actually
> reads the pte just once, or twice. Reading twice will expose the code to
> changing pte values and cause incorrect behavior.
>
> In [1], similar code actually caused a kernel crash on 64-bit x86, when
> using clang to build the kernel, but only after the conversion from *pte
> reads, to ptep_get(pte). The latter uses READ_ONCE(), which forced a
> double read of *pte.
>
> Rather than waiting for the upcoming ptep_get() conversion, just convert

I'm not sure there is any upcoming ptep_get() conversion for riscv? Not from me
at least - my focus was on the generic code to suficiently encapsulate it as an
enabler for some follow on arm64 changes.

> this part of the code now, but in a way that avoids the above problem:
> take a single snapshot of the pte before using it in the WARN
> conditional.
>
> As expected, this preparatory step does not actually change the
> generated code ("make mm/hugetlbpage.s"), on riscv64, when using a gcc
> 12.2 cross compiler.
>
> [1] https://lore.kernel.org/[email protected]
>
> Suggested-by: James Houghton <[email protected]>
> Cc: Ryan Roberts <[email protected]>
> Signed-off-by: John Hubbard <[email protected]>

Reviewed-by: Ryan Roberts <[email protected]>

> ---
> arch/riscv/mm/hugetlbpage.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
> index 542883b3b49b..96225a8533ad 100644
> --- a/arch/riscv/mm/hugetlbpage.c
> +++ b/arch/riscv/mm/hugetlbpage.c
> @@ -73,7 +73,11 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
> }
>
> out:
> - WARN_ON_ONCE(pte && pte_present(*pte) && !pte_huge(*pte));
> + if (pte) {
> + pte_t pteval = ptep_get_lockless(pte);
> +
> + WARN_ON_ONCE(pte_present(pteval) && !pte_huge(pteval));
> + }
> return pte;
> }
>
>
> base-commit: 0a8d6c9c7128a93689fba384cdd7f72b0ce19abd


2023-07-05 23:44:27

by Palmer Dabbelt

[permalink] [raw]
Subject: Re: [PATCH] mm: riscv: fix an unsafe pte read in huge_pte_alloc()


On Mon, 03 Jul 2023 12:00:44 -0700, John Hubbard wrote:
> The WARN_ON_ONCE() statement in riscv's huge_pte_alloc() is susceptible
> to false positives, because the pte is read twice at the C language
> level, locklessly, within the same conditional statement. Depending on
> compiler behavior, this can lead to generated machine code that actually
> reads the pte just once, or twice. Reading twice will expose the code to
> changing pte values and cause incorrect behavior.
>
> [...]

Applied, thanks!

[1/1] mm: riscv: fix an unsafe pte read in huge_pte_alloc()
https://git.kernel.org/palmer/c/62ba41d27612

Best regards,
--
Palmer Dabbelt <[email protected]>


Subject: Re: [PATCH] mm: riscv: fix an unsafe pte read in huge_pte_alloc()

Hello:

This patch was applied to riscv/linux.git (for-next)
by Palmer Dabbelt <[email protected]>:

On Mon, 3 Jul 2023 12:00:44 -0700 you wrote:
> The WARN_ON_ONCE() statement in riscv's huge_pte_alloc() is susceptible
> to false positives, because the pte is read twice at the C language
> level, locklessly, within the same conditional statement. Depending on
> compiler behavior, this can lead to generated machine code that actually
> reads the pte just once, or twice. Reading twice will expose the code to
> changing pte values and cause incorrect behavior.
>
> [...]

Here is the summary with links:
- mm: riscv: fix an unsafe pte read in huge_pte_alloc()
https://git.kernel.org/riscv/c/62ba41d27612

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html