2019-02-14 11:17:13

by Michael Ellerman

[permalink] [raw]
Subject: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()

In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
rather than just checking that the value is non-zero, e.g.:

static inline int pgd_present(pgd_t pgd)
{
- return !pgd_none(pgd);
+ return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
}

Unfortunately this is broken on big endian, as the result of the
bitwise && is truncated to int, which is always zero because
_PAGE_PRESENT is 0x8000000000000000ul. This means pgd_present() and
pud_present() are always false at compile time, and the compiler
elides the subsequent code.

Remarkably with that bug present we are still able to boot and run
with few noticeable effects. However under some work loads we are able
to trigger a warning in the ext4 code:

WARNING: CPU: 11 PID: 29593 at fs/ext4/inode.c:3927 .ext4_set_page_dirty+0x70/0xb0
CPU: 11 PID: 29593 Comm: debugedit Not tainted 4.20.0-rc1 #1
...
NIP .ext4_set_page_dirty+0x70/0xb0
LR .set_page_dirty+0xa0/0x150
Call Trace:
.set_page_dirty+0xa0/0x150
.unmap_page_range+0xbf0/0xe10
.unmap_vmas+0x84/0x130
.unmap_region+0xe8/0x190
.__do_munmap+0x2f0/0x510
.__vm_munmap+0x80/0x110
.__se_sys_munmap+0x14/0x30
system_call+0x5c/0x70

The fix is simple, we need to convert the result of the bitwise && to
an int before returning it.

Thanks to Jan Kara and Aneesh for help with debugging.

Fixes: da7ad366b497 ("powerpc/mm/book3s: Update pmd_present to look at _PAGE_PRESENT bit")
Cc: [email protected] # v4.20+
Reported-by: Erhard F. <[email protected]>
Reviewed-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
---
arch/powerpc/include/asm/book3s/64/pgtable.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index c9bfe526ca9d..d8c8d7c9df15 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -904,7 +904,7 @@ static inline int pud_none(pud_t pud)

static inline int pud_present(pud_t pud)
{
- return (pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
+ return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
}

extern struct page *pud_page(pud_t pud);
@@ -951,7 +951,7 @@ static inline int pgd_none(pgd_t pgd)

static inline int pgd_present(pgd_t pgd)
{
- return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
+ return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
}

static inline pte_t pgd_pte(pgd_t pgd)
--
2.20.1



2019-02-15 00:54:28

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()

On Thu 14-02-19 17:23:39, Michael Ellerman wrote:
> In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
> rather than just checking that the value is non-zero, e.g.:
>
> static inline int pgd_present(pgd_t pgd)
> {
> - return !pgd_none(pgd);
> + return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> }
>
> Unfortunately this is broken on big endian, as the result of the
> bitwise && is truncated to int, which is always zero because
> _PAGE_PRESENT is 0x8000000000000000ul. This means pgd_present() and
> pud_present() are always false at compile time, and the compiler
> elides the subsequent code.
>
> Remarkably with that bug present we are still able to boot and run
> with few noticeable effects. However under some work loads we are able
> to trigger a warning in the ext4 code:

Wow, good catch. I wouldn't believe there are so few bad effects from
such a major breakage! :)

Honza

>
> WARNING: CPU: 11 PID: 29593 at fs/ext4/inode.c:3927 .ext4_set_page_dirty+0x70/0xb0
> CPU: 11 PID: 29593 Comm: debugedit Not tainted 4.20.0-rc1 #1
> ...
> NIP .ext4_set_page_dirty+0x70/0xb0
> LR .set_page_dirty+0xa0/0x150
> Call Trace:
> .set_page_dirty+0xa0/0x150
> .unmap_page_range+0xbf0/0xe10
> .unmap_vmas+0x84/0x130
> .unmap_region+0xe8/0x190
> .__do_munmap+0x2f0/0x510
> .__vm_munmap+0x80/0x110
> .__se_sys_munmap+0x14/0x30
> system_call+0x5c/0x70
>
> The fix is simple, we need to convert the result of the bitwise && to
> an int before returning it.
>
> Thanks to Jan Kara and Aneesh for help with debugging.
>
> Fixes: da7ad366b497 ("powerpc/mm/book3s: Update pmd_present to look at _PAGE_PRESENT bit")
> Cc: [email protected] # v4.20+
> Reported-by: Erhard F. <[email protected]>
> Reviewed-by: Aneesh Kumar K.V <[email protected]>
> Signed-off-by: Michael Ellerman <[email protected]>
> ---
> arch/powerpc/include/asm/book3s/64/pgtable.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index c9bfe526ca9d..d8c8d7c9df15 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -904,7 +904,7 @@ static inline int pud_none(pud_t pud)
>
> static inline int pud_present(pud_t pud)
> {
> - return (pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
> + return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
> }
>
> extern struct page *pud_page(pud_t pud);
> @@ -951,7 +951,7 @@ static inline int pgd_none(pgd_t pgd)
>
> static inline int pgd_present(pgd_t pgd)
> {
> - return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> + return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> }
>
> static inline pte_t pgd_pte(pgd_t pgd)
> --
> 2.20.1
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2019-02-16 12:59:49

by Balbir Singh

[permalink] [raw]
Subject: Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()

On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
> In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
> rather than just checking that the value is non-zero, e.g.:
>
> static inline int pgd_present(pgd_t pgd)
> {
> - return !pgd_none(pgd);
> + return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> }
>
> Unfortunately this is broken on big endian, as the result of the
> bitwise && is truncated to int, which is always zero because

Not sure why that should happen, why is the result an int? What
causes the casting of pgd_t & be64 to be truncated to an int.

> _PAGE_PRESENT is 0x8000000000000000ul. This means pgd_present() and
> pud_present() are always false at compile time, and the compiler
> elides the subsequent code.
>
> Remarkably with that bug present we are still able to boot and run
> with few noticeable effects. However under some work loads we are able
> to trigger a warning in the ext4 code:
>
> WARNING: CPU: 11 PID: 29593 at fs/ext4/inode.c:3927 .ext4_set_page_dirty+0x70/0xb0
> CPU: 11 PID: 29593 Comm: debugedit Not tainted 4.20.0-rc1 #1
> ...
> NIP .ext4_set_page_dirty+0x70/0xb0
> LR .set_page_dirty+0xa0/0x150
> Call Trace:
> .set_page_dirty+0xa0/0x150
> .unmap_page_range+0xbf0/0xe10
> .unmap_vmas+0x84/0x130
> .unmap_region+0xe8/0x190
> .__do_munmap+0x2f0/0x510
> .__vm_munmap+0x80/0x110
> .__se_sys_munmap+0x14/0x30
> system_call+0x5c/0x70
>
> The fix is simple, we need to convert the result of the bitwise && to
> an int before returning it.
>
> Thanks to Jan Kara and Aneesh for help with debugging.
>
> Fixes: da7ad366b497 ("powerpc/mm/book3s: Update pmd_present to look at _PAGE_PRESENT bit")
> Cc: [email protected] # v4.20+
> Reported-by: Erhard F. <[email protected]>
> Reviewed-by: Aneesh Kumar K.V <[email protected]>
> Signed-off-by: Michael Ellerman <[email protected]>
> ---
> arch/powerpc/include/asm/book3s/64/pgtable.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index c9bfe526ca9d..d8c8d7c9df15 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -904,7 +904,7 @@ static inline int pud_none(pud_t pud)
>
> static inline int pud_present(pud_t pud)
> {
> - return (pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
> + return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
> }
>
> extern struct page *pud_page(pud_t pud);
> @@ -951,7 +951,7 @@ static inline int pgd_none(pgd_t pgd)
>
> static inline int pgd_present(pgd_t pgd)
> {
> - return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> + return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> }
>

Care to put a big FAT warning, so that we don't repeat this again
(as in authors planning on changing these bits).

Balbir Singh.


2019-02-16 13:27:44

by Andreas Schwab

[permalink] [raw]
Subject: Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()

On Feb 14 2019, Michael Ellerman <[email protected]> wrote:

> The fix is simple, we need to convert the result of the bitwise && to
> an int before returning it.

Alternatively, the return type could be changed to bool, so that the
compiler does the right thing by itself.

Andreas.

--
Andreas Schwab, [email protected]
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."

2019-02-17 04:40:05

by Segher Boessenkool

[permalink] [raw]
Subject: Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()

Hi all,

On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
> On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
> > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
> > rather than just checking that the value is non-zero, e.g.:
> >
> > static inline int pgd_present(pgd_t pgd)
> > {
> > - return !pgd_none(pgd);
> > + return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> > }
> >
> > Unfortunately this is broken on big endian, as the result of the
> > bitwise && is truncated to int, which is always zero because

(Bitwise "&" of course).

> Not sure why that should happen, why is the result an int? What
> causes the casting of pgd_t & be64 to be truncated to an int.

Yes, it's not obvious as written... It's simply that the return type of
pgd_present is int. So it is truncated _after_ the bitwise and.


Segher

2019-02-17 08:33:13

by Balbir Singh

[permalink] [raw]
Subject: Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()

On Sat, Feb 16, 2019 at 08:22:12AM -0600, Segher Boessenkool wrote:
> Hi all,
>
> On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
> > On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
> > > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
> > > rather than just checking that the value is non-zero, e.g.:
> > >
> > > static inline int pgd_present(pgd_t pgd)
> > > {
> > > - return !pgd_none(pgd);
> > > + return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> > > }
> > >
> > > Unfortunately this is broken on big endian, as the result of the
> > > bitwise && is truncated to int, which is always zero because
>
> (Bitwise "&" of course).
>
> > Not sure why that should happen, why is the result an int? What
> > causes the casting of pgd_t & be64 to be truncated to an int.
>
> Yes, it's not obvious as written... It's simply that the return type of
> pgd_present is int. So it is truncated _after_ the bitwise and.
>

Thanks, I am surprised the compiler does not complain about the truncation
of bits. I wonder if we are missing -Wconversion

Balbir

2019-02-17 08:41:43

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()

Andreas Schwab <[email protected]> writes:

> On Feb 14 2019, Michael Ellerman <[email protected]> wrote:
>
>> The fix is simple, we need to convert the result of the bitwise && to
>> an int before returning it.
>
> Alternatively, the return type could be changed to bool, so that the
> compiler does the right thing by itself.

Yes that would be preferable. All other architectures return an int so
I wasn't game to switch to bool for a fix. But I don't see why it should
matter so I'll do a patch using bool for next.

cheers

2019-02-17 08:42:04

by Michael Ellerman

[permalink] [raw]
Subject: Re: powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()

On Thu, 2019-02-14 at 06:23:39 UTC, Michael Ellerman wrote:
> In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
> rather than just checking that the value is non-zero, e.g.:
>
> static inline int pgd_present(pgd_t pgd)
> {
> - return !pgd_none(pgd);
> + return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> }
>
> Unfortunately this is broken on big endian, as the result of the
> bitwise && is truncated to int, which is always zero because
> _PAGE_PRESENT is 0x8000000000000000ul. This means pgd_present() and
> pud_present() are always false at compile time, and the compiler
> elides the subsequent code.
>
> Remarkably with that bug present we are still able to boot and run
> with few noticeable effects. However under some work loads we are able
> to trigger a warning in the ext4 code:
>
> WARNING: CPU: 11 PID: 29593 at fs/ext4/inode.c:3927 .ext4_set_page_dirty+0x70/0xb0
> CPU: 11 PID: 29593 Comm: debugedit Not tainted 4.20.0-rc1 #1
> ...
> NIP .ext4_set_page_dirty+0x70/0xb0
> LR .set_page_dirty+0xa0/0x150
> Call Trace:
> .set_page_dirty+0xa0/0x150
> .unmap_page_range+0xbf0/0xe10
> .unmap_vmas+0x84/0x130
> .unmap_region+0xe8/0x190
> .__do_munmap+0x2f0/0x510
> .__vm_munmap+0x80/0x110
> .__se_sys_munmap+0x14/0x30
> system_call+0x5c/0x70
>
> The fix is simple, we need to convert the result of the bitwise && to
> an int before returning it.
>
> Thanks to Jan Kara and Aneesh for help with debugging.
>
> Fixes: da7ad366b497 ("powerpc/mm/book3s: Update pmd_present to look at _PAGE_PRESENT bit")
> Cc: [email protected] # v4.20+
> Reported-by: Erhard F. <[email protected]>
> Reviewed-by: Aneesh Kumar K.V <[email protected]>
> Signed-off-by: Michael Ellerman <[email protected]>

Applied to powerpc fixes.

https://git.kernel.org/powerpc/c/a58007621be33e9f7c7bed5d5ff8ecb9

cheers

2019-02-17 08:42:37

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()

Balbir Singh <[email protected]> writes:
> On Sat, Feb 16, 2019 at 08:22:12AM -0600, Segher Boessenkool wrote:
>> Hi all,
>>
>> On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
>> > On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
>> > > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
>> > > rather than just checking that the value is non-zero, e.g.:
>> > >
>> > > static inline int pgd_present(pgd_t pgd)
>> > > {
>> > > - return !pgd_none(pgd);
>> > > + return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
>> > > }
>> > >
>> > > Unfortunately this is broken on big endian, as the result of the
>> > > bitwise && is truncated to int, which is always zero because
>>
>> (Bitwise "&" of course).
>>
>> > Not sure why that should happen, why is the result an int? What
>> > causes the casting of pgd_t & be64 to be truncated to an int.
>>
>> Yes, it's not obvious as written... It's simply that the return type of
>> pgd_present is int. So it is truncated _after_ the bitwise and.
>>
>
> Thanks, I am surprised the compiler does not complain about the truncation
> of bits. I wonder if we are missing -Wconversion

Good luck with that :)

What I should start doing is building with it enabled and then comparing
the output before and after commits to make sure we're not introducing
new cases.

cheers

2019-02-17 08:43:20

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()

Segher Boessenkool <[email protected]> writes:

> Hi all,
>
> On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
>> On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
>> > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
>> > rather than just checking that the value is non-zero, e.g.:
>> >
>> > static inline int pgd_present(pgd_t pgd)
>> > {
>> > - return !pgd_none(pgd);
>> > + return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
>> > }
>> >
>> > Unfortunately this is broken on big endian, as the result of the
>> > bitwise && is truncated to int, which is always zero because
>
> (Bitwise "&" of course).

Thanks, I fixed that up.

cheers

2019-02-17 21:56:37

by Balbir Singh

[permalink] [raw]
Subject: Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()

On Sun, Feb 17, 2019 at 07:34:20PM +1100, Michael Ellerman wrote:
> Balbir Singh <[email protected]> writes:
> > On Sat, Feb 16, 2019 at 08:22:12AM -0600, Segher Boessenkool wrote:
> >> Hi all,
> >>
> >> On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
> >> > On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
> >> > > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
> >> > > rather than just checking that the value is non-zero, e.g.:
> >> > >
> >> > > static inline int pgd_present(pgd_t pgd)
> >> > > {
> >> > > - return !pgd_none(pgd);
> >> > > + return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> >> > > }
> >> > >
> >> > > Unfortunately this is broken on big endian, as the result of the
> >> > > bitwise && is truncated to int, which is always zero because
> >>
> >> (Bitwise "&" of course).
> >>
> >> > Not sure why that should happen, why is the result an int? What
> >> > causes the casting of pgd_t & be64 to be truncated to an int.
> >>
> >> Yes, it's not obvious as written... It's simply that the return type of
> >> pgd_present is int. So it is truncated _after_ the bitwise and.
> >>
> >
> > Thanks, I am surprised the compiler does not complain about the truncation
> > of bits. I wonder if we are missing -Wconversion
>
> Good luck with that :)
>
> What I should start doing is building with it enabled and then comparing
> the output before and after commits to make sure we're not introducing
> new cases.
>

Fair enough, my point was that the compiler can help out. I'll see what
-Wconversion finds on my local build :)

Balbir Singh.

2019-02-18 01:10:02

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()

Balbir Singh <[email protected]> writes:
> On Sun, Feb 17, 2019 at 07:34:20PM +1100, Michael Ellerman wrote:
>> Balbir Singh <[email protected]> writes:
>> > On Sat, Feb 16, 2019 at 08:22:12AM -0600, Segher Boessenkool wrote:
>> >> On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
>> >> > On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
>> >> > > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
>> >> > > rather than just checking that the value is non-zero, e.g.:
>> >> > >
>> >> > > static inline int pgd_present(pgd_t pgd)
>> >> > > {
>> >> > > - return !pgd_none(pgd);
>> >> > > + return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
>> >> > > }
>> >> > >
>> >> > > Unfortunately this is broken on big endian, as the result of the
>> >> > > bitwise && is truncated to int, which is always zero because
>> >>
>> >> (Bitwise "&" of course).
>> >>
>> >> > Not sure why that should happen, why is the result an int? What
>> >> > causes the casting of pgd_t & be64 to be truncated to an int.
>> >>
>> >> Yes, it's not obvious as written... It's simply that the return type of
>> >> pgd_present is int. So it is truncated _after_ the bitwise and.
>> >>
>> >
>> > Thanks, I am surprised the compiler does not complain about the truncation
>> > of bits. I wonder if we are missing -Wconversion
>>
>> Good luck with that :)
>>
>> What I should start doing is building with it enabled and then comparing
>> the output before and after commits to make sure we're not introducing
>> new cases.
>
> Fair enough, my point was that the compiler can help out. I'll see what
> -Wconversion finds on my local build :)

I get about 43MB of warnings here :)

cheers

2019-02-19 12:03:49

by Balbir Singh

[permalink] [raw]
Subject: Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()

On Mon, Feb 18, 2019 at 11:49:18AM +1100, Michael Ellerman wrote:
> Balbir Singh <[email protected]> writes:
> > On Sun, Feb 17, 2019 at 07:34:20PM +1100, Michael Ellerman wrote:
> >> Balbir Singh <[email protected]> writes:
> >> > On Sat, Feb 16, 2019 at 08:22:12AM -0600, Segher Boessenkool wrote:
> >> >> On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
> >> >> > On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
> >> >> > > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
> >> >> > > rather than just checking that the value is non-zero, e.g.:
> >> >> > >
> >> >> > > static inline int pgd_present(pgd_t pgd)
> >> >> > > {
> >> >> > > - return !pgd_none(pgd);
> >> >> > > + return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> >> >> > > }
> >> >> > >
> >> >> > > Unfortunately this is broken on big endian, as the result of the
> >> >> > > bitwise && is truncated to int, which is always zero because
> >> >>
> >> >> (Bitwise "&" of course).
> >> >>
> >> >> > Not sure why that should happen, why is the result an int? What
> >> >> > causes the casting of pgd_t & be64 to be truncated to an int.
> >> >>
> >> >> Yes, it's not obvious as written... It's simply that the return type of
> >> >> pgd_present is int. So it is truncated _after_ the bitwise and.
> >> >>
> >> >
> >> > Thanks, I am surprised the compiler does not complain about the truncation
> >> > of bits. I wonder if we are missing -Wconversion
> >>
> >> Good luck with that :)
> >>
> >> What I should start doing is building with it enabled and then comparing
> >> the output before and after commits to make sure we're not introducing
> >> new cases.
> >
> > Fair enough, my point was that the compiler can help out. I'll see what
> > -Wconversion finds on my local build :)
>
> I get about 43MB of warnings here :)
>

I got about 181M with a failed build :(, but the warnings pointed to some cases
that can be a good project for cleanup

For example

1.
static inline long regs_return_value(struct pt_regs *regs)
{
if (is_syscall_success(regs))
return regs->gpr[3];
else
return -regs->gpr[3];
}

In the case of is_syscall_success() returning false, we should ensure that
regs->gpr[3] is negative and capped within a certain limit, but it might
be an expensive check

2.
static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
unsigned int index, unsigned int hidx)
{
hpte_slot_array[index] = (hidx << 1) | 0x1;
}

hidx is 3 bits, but the argument is unsigned int. The caller probably does a
hidx & 0x7, but it's not clear from the code

3. hash__pmd_bad (pmd_bad) and hash__pud_bad (pud_bad) have issues similar to what was found,
but since the the page table indices are below 32, the macros are safe :)

And a few more, but I am not sure why I spent time looking at possible issues,
may be I am being stupid or overly pessimistic :)

Balbir



2019-02-19 20:19:13

by Segher Boessenkool

[permalink] [raw]
Subject: Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()

On Mon, Feb 18, 2019 at 11:49:18AM +1100, Michael Ellerman wrote:
> Balbir Singh <[email protected]> writes:
> > Fair enough, my point was that the compiler can help out. I'll see what
> > -Wconversion finds on my local build :)
>
> I get about 43MB of warnings here :)

Yes, -Wconversion complains about a lot of things that are idiomatic C.
There is a reason -Wconversion is not in -Wall or -Wextra.


Segher

2019-02-20 11:20:42

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()

Segher Boessenkool <[email protected]> writes:
> On Mon, Feb 18, 2019 at 11:49:18AM +1100, Michael Ellerman wrote:
>> Balbir Singh <[email protected]> writes:
>> > Fair enough, my point was that the compiler can help out. I'll see what
>> > -Wconversion finds on my local build :)
>>
>> I get about 43MB of warnings here :)
>
> Yes, -Wconversion complains about a lot of things that are idiomatic C.
> There is a reason -Wconversion is not in -Wall or -Wextra.

Actually a lot of those go away when I add -Wno-sign-conversion.

And what's left seems mostly reasonable, they all indicate the
possibility of a bug I think.

In fact this works and would have caught the bug:

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index d8c8d7c9df15..3114e3f368e2 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -904,7 +904,12 @@ static inline int pud_none(pud_t pud)

static inline int pud_present(pud_t pud)
{
+ __diag_push();
+ __diag_warn(GCC, 8, "-Wconversion", "ulong -> int");
+
return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
+
+ __diag_pop();
}

extern struct page *pud_page(pud_t pud);



Obviously we're not going to instrument every function like that. But we
could start instrumenting particular files.

cheers

2019-02-20 14:53:32

by Segher Boessenkool

[permalink] [raw]
Subject: Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()

On Wed, Feb 20, 2019 at 10:18:38PM +1100, Michael Ellerman wrote:
> Segher Boessenkool <[email protected]> writes:
> > On Mon, Feb 18, 2019 at 11:49:18AM +1100, Michael Ellerman wrote:
> >> Balbir Singh <[email protected]> writes:
> >> > Fair enough, my point was that the compiler can help out. I'll see what
> >> > -Wconversion finds on my local build :)
> >>
> >> I get about 43MB of warnings here :)
> >
> > Yes, -Wconversion complains about a lot of things that are idiomatic C.
> > There is a reason -Wconversion is not in -Wall or -Wextra.
>
> Actually a lot of those go away when I add -Wno-sign-conversion.
>
> And what's left seems mostly reasonable, they all indicate the
> possibility of a bug I think.
>
> In fact this works and would have caught the bug:
>
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index d8c8d7c9df15..3114e3f368e2 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -904,7 +904,12 @@ static inline int pud_none(pud_t pud)
>
> static inline int pud_present(pud_t pud)
> {
> + __diag_push();
> + __diag_warn(GCC, 8, "-Wconversion", "ulong -> int");
> +
> return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
> +
> + __diag_pop();
> }
>
> extern struct page *pud_page(pud_t pud);
>
>
>
> Obviously we're not going to instrument every function like that. But we
> could start instrumenting particular files.

So you want to instrument the functions that you know are buggy, using some
weird incantations to catch only those errors you already know about?

(I am worried this does not scale, in many dimensions).


Segher