2021-08-23 07:55:13

by Christophe Leroy

Subject: [PATCH] powerpc/64: Avoid link stack corruption in kexec_wait()

Use bcl 20,31,+4 instead of bl in order to preserve link stack.

See commit c974809a26a1 ("powerpc/vdso: Avoid link stack corruption
in __get_datapage()") for details.

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/kernel/misc_64.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index 4b761a18a74d..613509907166 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -255,7 +255,7 @@ _GLOBAL(scom970_write)
  * Physical (hardware) cpu id should be in r3.
  */
 _GLOBAL(kexec_wait)
-	bl	1f
+	bcl	20,31,1f
 1:	mflr	r5
 	addi	r5,r5,kexec_flag-1b

--
2.25.0
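
For readers less familiar with the idiom being patched: the branch-and-link to the very next instruction is there only to get the run-time address of label 1 into LR, and the addi then adds the link-time distance between kexec_flag and that label. The arithmetic can be sketched as below. The function name and all addresses are made up for illustration and are not taken from the kernel.

```python
def runtime_address(lr, link_label, link_symbol):
    """Run-time address of a symbol, given LR = run-time address of the label."""
    # (link_symbol - link_label) is the constant the assembler folds into addi.
    return lr + (link_symbol - link_label)


# Hypothetical values: label "1" linked at 0xc000000000001004, kexec_flag
# linked at 0xc000000000001040, but the code is executing at 0x1004.
addr = runtime_address(0x1004, 0xC000000000001004, 0xC000000000001040)
print(hex(addr))
```

The result depends only on LR and the distance between the two symbols, so the sequence keeps working if the code runs at an address other than the one it was linked at.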


2021-08-31 06:19:55

by Daniel Axtens

Subject: Re: [PATCH] powerpc/64: Avoid link stack corruption in kexec_wait()

Hi Christophe,

> Use bcl 20,31,+4 instead of bl in order to preserve link stack.
>
> See commit c974809a26a1 ("powerpc/vdso: Avoid link stack corruption
> in __get_datapage()") for details.

From my understanding of that commit message, the change helps to keep
the link stack correctly balanced which is helpful for performance,
rather than for correctness. If I understand correctly, kexec_wait is
not in a hot path - rather it is where CPUs spin while waiting for
kexec. Is there any benefit in using the more complicated opcode in this
situation?
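
As a rough illustration of the balance argument, here is a toy model of a return-address predictor. It is a sketch under assumptions, not a description of how any particular POWER core works: the LinkStack class, the simulate() helper and all addresses are hypothetical. It assumes that bl pushes a predicted return address, that blr pops one, and that the bcl 20,31,+4 form pushes nothing.

```python
class LinkStack:
    """Toy return-address predictor; an assumed model, not real hardware."""

    def __init__(self):
        self.entries = []
        self.mispredicts = 0

    def bl(self, return_addr):
        # A plain branch-and-link is treated as a call: push the return address.
        self.entries.append(return_addr)

    def bcl_20_31(self, return_addr):
        # The special form only sets LR; the predictor does not push anything.
        pass

    def blr(self, actual_target):
        # A return pops the predicted target and compares it with the real one.
        predicted = self.entries.pop() if self.entries else None
        if predicted != actual_target:
            self.mispredicts += 1


def simulate(get_pc_insn):
    """Count mispredicted returns for main -> f -> g, where g reads its own PC."""
    stack = LinkStack()
    stack.bl(0x104)             # main calls f; f will return to 0x104
    stack.bl(0x204)             # f calls g; g will return to 0x204
    if get_pc_insn == "bl":
        stack.bl(0x304)         # g: "bl 1f" pushes, but no blr ever pops it
    else:
        stack.bcl_20_31(0x304)  # g: "bcl 20,31,+4" leaves the stack alone
    stack.blr(0x204)            # g returns to f
    stack.blr(0x104)            # f returns to main
    return stack.mispredicts


print(simulate("bl"), simulate("bcl"))
```

In this model the stray push from bl makes both enclosing returns mispredict, while the bcl variant mispredicts none. The results stay correct either way, which matches the reading that this is about performance rather than correctness.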

> Signed-off-by: Christophe Leroy <[email protected]>
> ---
> arch/powerpc/kernel/misc_64.S | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
> index 4b761a18a74d..613509907166 100644
> --- a/arch/powerpc/kernel/misc_64.S
> +++ b/arch/powerpc/kernel/misc_64.S
> @@ -255,7 +255,7 @@ _GLOBAL(scom970_write)
> * Physical (hardware) cpu id should be in r3.
> */
> _GLOBAL(kexec_wait)
> - bl 1f
> + bcl 20,31,1f
> 1: mflr r5

Would it be better to create a macro of some sort to wrap this unusual
special form so that the meaning is more clear?

Kind regards,
Daniel

> addi r5,r5,kexec_flag-1b
>
> --
> 2.25.0

2021-08-31 08:55:59

by Christophe Leroy

Subject: Re: [PATCH] powerpc/64: Avoid link stack corruption in kexec_wait()



On 31/08/2021 at 08:17, Daniel Axtens wrote:
> Hi Christophe,
>
>> Use bcl 20,31,+4 instead of bl in order to preserve link stack.
>>
>> See commit c974809a26a1 ("powerpc/vdso: Avoid link stack corruption
>> in __get_datapage()") for details.
>
> From my understanding of that commit message, the change helps to keep
> the link stack correctly balanced which is helpful for performance,
> rather than for correctness. If I understand correctly, kexec_wait is
> not in a hot path - rather it is where CPUs spin while waiting for
> kexec. Is there any benefit in using the more complicated opcode in this
> situation?

AFAICS the main benefit is to keep things consistent across the kernel and not have to wonder "is it a
hot path or not? If it is, I use bcl 20,31; if it is not, I use bl". The best way to keep things in
order is to always use the right instruction.

>
>> Signed-off-by: Christophe Leroy <[email protected]>
>> ---
>> arch/powerpc/kernel/misc_64.S | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
>> index 4b761a18a74d..613509907166 100644
>> --- a/arch/powerpc/kernel/misc_64.S
>> +++ b/arch/powerpc/kernel/misc_64.S
>> @@ -255,7 +255,7 @@ _GLOBAL(scom970_write)
>> * Physical (hardware) cpu id should be in r3.
>> */
>> _GLOBAL(kexec_wait)
>> - bl 1f
>> + bcl 20,31,1f
>> 1: mflr r5
>
> Would it be better to create a macro of some sort to wrap this unusual
> special form so that the meaning is more clear?

Not sure. I think people working with assembly will easily recognise that form, whereas an obscure
macro is always puzzling.

I like macros when they save you from repeating the same sequence of several instructions again and
again, but here it is a single, quite simple instruction, which is not worth a macro in my mind.

Christophe

2021-08-31 12:44:40

by Daniel Axtens

Subject: Re: [PATCH] powerpc/64: Avoid link stack corruption in kexec_wait()

Christophe Leroy <[email protected]> writes:

> On 31/08/2021 at 08:17, Daniel Axtens wrote:
>> Hi Christophe,
>>
>>> Use bcl 20,31,+4 instead of bl in order to preserve link stack.
>>>
>>> See commit c974809a26a1 ("powerpc/vdso: Avoid link stack corruption
>>> in __get_datapage()") for details.
>>
>> From my understanding of that commit message, the change helps to keep
>> the link stack correctly balanced which is helpful for performance,
>> rather than for correctness. If I understand correctly, kexec_wait is
>> not in a hot path - rather it is where CPUs spin while waiting for
>> kexec. Is there any benefit in using the more complicated opcode in this
>> situation?
>
> AFAICS the main benefit is to keep things consistent across the kernel and not have to wonder "is it a
> hot path or not? If it is, I use bcl 20,31; if it is not, I use bl". The best way to keep things in
> order is to always use the right instruction.

Yeah, Nick Piggin convinced me of this offline as well.

>
>>
>>> Signed-off-by: Christophe Leroy <[email protected]>
>>> ---
>>> arch/powerpc/kernel/misc_64.S | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
>>> index 4b761a18a74d..613509907166 100644
>>> --- a/arch/powerpc/kernel/misc_64.S
>>> +++ b/arch/powerpc/kernel/misc_64.S
>>> @@ -255,7 +255,7 @@ _GLOBAL(scom970_write)
>>> * Physical (hardware) cpu id should be in r3.
>>> */
>>> _GLOBAL(kexec_wait)
>>> - bl 1f
>>> + bcl 20,31,1f
>>> 1: mflr r5
>>
>> Would it be better to create a macro of some sort to wrap this unusual
>> special form so that the meaning is more clear?
>
> Not sure. I think people working with assembly will easily recognise that form, whereas an obscure
> macro is always puzzling.
>
> I like macros when they save you from repeating the same sequence of several instructions again and
> again, but here it is a single, quite simple instruction, which is not worth a macro in my mind.
>


Sure - I was mostly thinking specifically of the bcl; mflr situation but
I agree that for the single instruction it's not needed.

In short, I am convinced, and so:
Reviewed-by: Daniel Axtens <[email protected]>

Kind regards,
Daniel

> Christophe