2022-10-22 11:54:20

by Peter Zijlstra

Subject: [PATCH 07/13] mm/gup: Fix the lockless PMD access

On architectures where the PTE/PMD is larger than the native word size
(i386-PAE for example), READ_ONCE() can do the wrong thing. Use
pmdp_get_lockless() just like we use ptep_get_lockless().
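
For anyone wondering what the lockless variant amounts to on such
configurations, here is a minimal, hedged user-space sketch of the
split-read-with-retry technique (illustrative only, not the kernel
code; the real ptep_get_lockless() on PAE also relies on GUP-fast
running with local interrupts disabled to hold off concurrent
teardown):

/*
 * Toy illustration: a 64-bit entry stored as two 32-bit halves that
 * the CPU can only load one half at a time.  Read low, then high,
 * then re-check low; retry if the low half changed in between, so a
 * torn value is never returned (assuming the writer changes the low
 * half whenever it updates the entry).
 */
#include <stdint.h>
#include <stdatomic.h>

struct entry64 {
        _Atomic uint32_t low;
        _Atomic uint32_t high;
};

static uint64_t entry_get_lockless(struct entry64 *e)
{
        uint32_t lo, hi;

        do {
                lo = atomic_load_explicit(&e->low, memory_order_relaxed);
                atomic_thread_fence(memory_order_acquire);  /* smp_rmb() analogue */
                hi = atomic_load_explicit(&e->high, memory_order_relaxed);
                atomic_thread_fence(memory_order_acquire);  /* smp_rmb() analogue */
        } while (lo != atomic_load_explicit(&e->low, memory_order_relaxed));

        return ((uint64_t)hi << 32) | lo;
}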

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
---
 kernel/events/core.c | 2 +-
 mm/gup.c             | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7186,7 +7186,7 @@ static u64 perf_get_pgtable_size(struct
 		return pud_leaf_size(pud);
 
 	pmdp = pmd_offset_lockless(pudp, pud, addr);
-	pmd = READ_ONCE(*pmdp);
+	pmd = pmdp_get_lockless(pmdp);
 	if (!pmd_present(pmd))
 		return 0;

--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2507,7 +2507,7 @@ static int gup_pmd_range(pud_t *pudp, pu
 
 	pmdp = pmd_offset_lockless(pudp, pud, addr);
 	do {
-		pmd_t pmd = READ_ONCE(*pmdp);
+		pmd_t pmd = pmdp_get_lockless(pmdp);
 
 		next = pmd_addr_end(addr, end);
 		if (!pmd_present(pmd))



2022-10-23 00:58:40

by Hugh Dickins

Subject: Re: [PATCH 07/13] mm/gup: Fix the lockless PMD access

On Sat, 22 Oct 2022, Peter Zijlstra wrote:

> On architectures where the PTE/PMD is larger than the native word size
> (i386-PAE for example), READ_ONCE() can do the wrong thing. Use
> pmdp_get_lockless() just like we use ptep_get_lockless().

I thought that was something Will Deacon put a lot of effort
into handling around 5.8 and 5.9: see "strong prevailing wind" in
include/asm-generic/rwonce.h, formerly in include/linux/compiler.h.

Was it too optimistic? Did the wind drop?

I'm interested in the answer, but I've certainly no objection
to making this all more obviously robust - thanks.

Hugh

>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> ---
> kernel/events/core.c | 2 +-
> mm/gup.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -7186,7 +7186,7 @@ static u64 perf_get_pgtable_size(struct
> return pud_leaf_size(pud);
>
> pmdp = pmd_offset_lockless(pudp, pud, addr);
> - pmd = READ_ONCE(*pmdp);
> + pmd = pmdp_get_lockless(pmdp);
> if (!pmd_present(pmd))
> return 0;
>
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2507,7 +2507,7 @@ static int gup_pmd_range(pud_t *pudp, pu
>
> pmdp = pmd_offset_lockless(pudp, pud, addr);
> do {
> - pmd_t pmd = READ_ONCE(*pmdp);
> + pmd_t pmd = pmdp_get_lockless(pmdp);
>
> next = pmd_addr_end(addr, end);
> if (!pmd_present(pmd))

2022-10-24 07:57:30

by Peter Zijlstra

Subject: Re: [PATCH 07/13] mm/gup: Fix the lockless PMD access

On Sat, Oct 22, 2022 at 05:42:18PM -0700, Hugh Dickins wrote:
> On Sat, 22 Oct 2022, Peter Zijlstra wrote:
>
> > On architectures where the PTE/PMD is larger than the native word size
> > (i386-PAE for example), READ_ONCE() can do the wrong thing. Use
> > pmdp_get_lockless() just like we use ptep_get_lockless().
>
> I thought that was something Will Deacon put a lot of effort
> into handling around 5.8 and 5.9: see "strong prevailing wind" in
> include/asm-generic/rwonce.h, formerly in include/linux/compiler.h.
>
> Was it too optimistic? Did the wind drop?
>
> I'm interested in the answer, but I've certainly no objection
> to making this all more obviously robust - thanks.

READ_ONCE() can't do what the hardware can't do. There is absolutely no
way i386 can do an atomic 64bit load without resorting to cmpxchg8b.

Also see the comment that goes with compiletime_assert_rwonce_type(). It
explicitly allows 64bit because there's just too much stuff that does
that (and there's actually 32bit hardware that *can* do it).

But it's still very wrong.
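
For reference, an illustrative sketch (not kernel code, and the helper
name is made up) of what "resorting to cmpxchg8b" means in practice: a
compare-and-swap of 0 for 0 returns the current value atomically, at
the price of turning every read into a locked write to the cache line,
which is one reason the page-table walkers prefer the split read with
a retry instead.

/*
 * Sketch only: with plain integer registers (no x87/SSE, which the
 * kernel cannot casually use), a single-copy atomic 64-bit load on
 * 32-bit x86 comes down to cmpxchg8b.  GCC emits lock cmpxchg8b for
 * the 8-byte __sync builtin when targeting i586 or later.
 */
#include <stdint.h>

static uint64_t atomic_load_u64_32bit(uint64_t *p)
{
        /*
         * Compare *p with 0 and, only if it was 0, store 0 back: the
         * value is never changed, but the returned old value is read
         * atomically.  Every call is still a locked cycle on the
         * cache line, which is why this is avoided for page-table reads.
         */
        return __sync_val_compare_and_swap(p, (uint64_t)0, (uint64_t)0);
}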

2022-10-25 04:21:54

by Hugh Dickins

Subject: Re: [PATCH 07/13] mm/gup: Fix the lockless PMD access

On Mon, 24 Oct 2022, Peter Zijlstra wrote:
> On Sat, Oct 22, 2022 at 05:42:18PM -0700, Hugh Dickins wrote:
> > On Sat, 22 Oct 2022, Peter Zijlstra wrote:
> >
> > > On architectures where the PTE/PMD is larger than the native word size
> > > (i386-PAE for example), READ_ONCE() can do the wrong thing. Use
> > > pmdp_get_lockless() just like we use ptep_get_lockless().
> >
> > I thought that was something Will Deacon put a lot of effort
> > into handling around 5.8 and 5.9: see "strong prevailing wind" in
> > include/asm-generic/rwonce.h, formerly in include/linux/compiler.h.
> >
> > Was it too optimistic? Did the wind drop?
> >
> > I'm interested in the answer, but I've certainly no objection
> > to making this all more obviously robust - thanks.
>
> READ_ONCE() can't do what the hardware can't do. There is absolutely no
> way i386 can do an atomic 64bit load without resorting to cmpxchg8b.

Right.

>
> Also see the comment that goes with compiletime_assert_rwonce_type(). It
> explicitly allows 64bit because there's just too much stuff that does
> that (and there's actually 32bit hardware that *can* do it).

Yes, the "strong prevailing wind" comment. I think I've never read that
carefully enough, until you redirected me back there: it is in fact
quite clear, that it's only *atomic* in the Armv7 + LPAE case; but
READ_ONCEy (READ_EACH_HALF_ONCE I guess) for other 64-on-32 cases.
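
For anyone else reading along, the shape of that check in a simplified
sketch (the real compiletime_assert_rwonce_type() in
include/asm-generic/rwonce.h differs in detail, and the macro name
here is made up):

/*
 * Simplified sketch of the size check being discussed: READ_ONCE() is
 * limited to native-word scalars, with a deliberate carve-out for
 * 64-bit accesses on 32-bit builds.  Only Armv7 + LPAE makes that
 * 64-bit load single-copy atomic; elsewhere each half is read once.
 */
#define assert_rwonce_size(x)                                           \
        _Static_assert(sizeof(x) == 1 || sizeof(x) == 2 ||              \
                       sizeof(x) == 4 || sizeof(x) == sizeof(long) ||   \
                       sizeof(x) == sizeof(long long),                  \
                       "Unsupported access size for READ_ONCE().")

That carve-out is why a 64-bit READ_ONCE() compiles cleanly on
i386-PAE even though the load can tear there.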

>
> But it's still very wrong.

Somewhat clearer to me now, thanks.

Hugh