By delaying the setting of MSR_RI, a 1% improvement is obtained on the
null_syscall selftest on an mpc8321.
Without this patch:
root@vgoippro:~# ./null_syscall
1134.33 ns 378.11 cycles
With this patch:
root@vgoippro:~# ./null_syscall
1121.85 ns 373.95 cycles
The drawback is that a machine check during that period
would be unrecoverable, but as only main memory is accessed
during that period, it shouldn't be a concern.
Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/kernel/head_32.S | 2 --
1 file changed, 2 deletions(-)
diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/head_32.S
index 146385b1c2da..ea28a6ab56ec 100644
--- a/arch/powerpc/kernel/head_32.S
+++ b/arch/powerpc/kernel/head_32.S
@@ -282,8 +282,6 @@ __secondary_hold_acknowledge:
stw r1,GPR1(r11); \
stw r1,0(r11); \
tovirt(r1,r11); /* set new kernel sp */ \
- li r10,MSR_KERNEL & ~(MSR_IR|MSR_DR); /* can take exceptions */ \
- MTMSRD(r10); /* (except for mach check in rtas) */ \
stw r0,GPR0(r11); \
lis r10,STACK_FRAME_REGS_MARKER@ha; /* exception frame marker */ \
addi r10,r10,STACK_FRAME_REGS_MARKER@l; \
--
2.13.3
Christophe Leroy <[email protected]> writes:
> By delaying the setting of MSR_RI, a 1% improvement is obtained on the
> null_syscall selftest on an mpc8321.
>
> Without this patch:
>
> root@vgoippro:~# ./null_syscall
> 1134.33 ns 378.11 cycles
>
> With this patch:
>
> root@vgoippro:~# ./null_syscall
> 1121.85 ns 373.95 cycles
>
> The drawback is that a machine check during that period
> would be unrecoverable, but as only main memory is accessed
> during that period, it shouldn't be a concern.
On 64-bit server CPUs accessing main memory can cause a UE
(Uncorrectable Error) which can trigger a machine check.
So it may still be a concern; it depends on how paranoid you are.
> diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/head_32.S
> index 146385b1c2da..ea28a6ab56ec 100644
> --- a/arch/powerpc/kernel/head_32.S
> +++ b/arch/powerpc/kernel/head_32.S
> @@ -282,8 +282,6 @@ __secondary_hold_acknowledge:
> stw r1,GPR1(r11); \
> stw r1,0(r11); \
> tovirt(r1,r11); /* set new kernel sp */ \
> - li r10,MSR_KERNEL & ~(MSR_IR|MSR_DR); /* can take exceptions */ \
> - MTMSRD(r10); /* (except for mach check in rtas) */ \
> stw r0,GPR0(r11); \
> lis r10,STACK_FRAME_REGS_MARKER@ha; /* exception frame marker */ \
> addi r10,r10,STACK_FRAME_REGS_MARKER@l; \
Where does RI get enabled? I don't see it anywhere obvious.
cheers
On 01/02/2019 at 12:10, Michael Ellerman wrote:
> Christophe Leroy <[email protected]> writes:
>
>> By delaying the setting of MSR_RI, a 1% improvement is obtained on the
>> null_syscall selftest on an mpc8321.
>>
>> Without this patch:
>>
>> root@vgoippro:~# ./null_syscall
>> 1134.33 ns 378.11 cycles
>>
>> With this patch:
>>
>> root@vgoippro:~# ./null_syscall
>> 1121.85 ns 373.95 cycles
>>
>> The drawback is that a machine check during that period
>> would be unrecoverable, but as only main memory is accessed
>> during that period, it shouldn't be a concern.
>
> On 64-bit server CPUs accessing main memory can cause a UE
> (Uncorrectable Error) which can trigger a machine check.
>
> So it may still be a concern; it depends on how paranoid you are.
>
>> diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/head_32.S
>> index 146385b1c2da..ea28a6ab56ec 100644
>> --- a/arch/powerpc/kernel/head_32.S
>> +++ b/arch/powerpc/kernel/head_32.S
>> @@ -282,8 +282,6 @@ __secondary_hold_acknowledge:
>> stw r1,GPR1(r11); \
>> stw r1,0(r11); \
>> tovirt(r1,r11); /* set new kernel sp */ \
>> - li r10,MSR_KERNEL & ~(MSR_IR|MSR_DR); /* can take exceptions */ \
>> - MTMSRD(r10); /* (except for mach check in rtas) */ \
>> stw r0,GPR0(r11); \
>> lis r10,STACK_FRAME_REGS_MARKER@ha; /* exception frame marker */ \
>> addi r10,r10,STACK_FRAME_REGS_MARKER@l; \
>
> Where does RI get enabled? I don't see it anywhere obvious.
MSR_RI is part of MSR_KERNEL; it then gets enabled when the MMU is
re-enabled on the way into the exception handler.
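For reference, on classic 32-bit MSR_KERNEL boils down to roughly the
following (the exact bits are config-dependent, see
arch/powerpc/include/asm/reg.h for the authoritative definition):

#define MSR_KERNEL	(MSR_ME | MSR_RI | MSR_IR | MSR_DR)

so MSR_RI is indeed part of the mask loaded into r10 below.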
#define EXC_XFER_TEMPLATE(n, hdlr, trap, copyee, tfer, ret) \
li r10,trap; \
stw r10,_TRAP(r11); \
li r10,MSR_KERNEL; \
copyee(r10, r9); \
bl tfer; \
i##n: \
.long hdlr; \
.long ret
where tfer = transfer_to_handler.
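For completeness, the callers are wrappers along these lines (quoted
from memory, the trap number handling may differ slightly):

#define EXC_XFER_LITE(n, hdlr)		\
	EXC_XFER_TEMPLATE(n, hdlr, n+1, NOCOPY, transfer_to_handler, \
			  ret_from_except)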
In transfer_to_handler (kernel/entry_32.S) you have:
transfer_to_handler_cont:
3:
mflr r9
lwz r11,0(r9) /* virtual address of handler */
lwz r9,4(r9) /* where to go when done */
[...]
mtspr SPRN_SRR0,r11
mtspr SPRN_SRR1,r10
mtlr r9
SYNC
RFI /* jump to handler, enable MMU */
So MSR_RI is restored above, since r10 contains MSR_KERNEL [ | MSR_EE ].
Christophe
>
> cheers
>
On 01/02/2019 at 12:51, Christophe Leroy wrote:
>
>
> On 01/02/2019 at 12:10, Michael Ellerman wrote:
>> Christophe Leroy <[email protected]> writes:
>>
>>> By delaying the setting of MSR_RI, a 1% improvement is obtained on the
>>> null_syscall selftest on an mpc8321.
>>>
>>> Without this patch:
>>>
>>> root@vgoippro:~# ./null_syscall
>>> 1134.33 ns 378.11 cycles
>>>
>>> With this patch:
>>>
>>> root@vgoippro:~# ./null_syscall
>>> 1121.85 ns 373.95 cycles
>>>
>>> The drawback is that a machine check during that period
>>> would be unrecoverable, but as only main memory is accessed
>>> during that period, it shouldn't be a concern.
>>
>> On 64-bit server CPUs accessing main memory can cause a UE
>> (Uncorrectable Error) which can trigger a machine check.
>>
>> So it may still be a concern; it depends on how paranoid you are.
>>
>>> diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/head_32.S
>>> index 146385b1c2da..ea28a6ab56ec 100644
>>> --- a/arch/powerpc/kernel/head_32.S
>>> +++ b/arch/powerpc/kernel/head_32.S
>>> @@ -282,8 +282,6 @@ __secondary_hold_acknowledge:
>>> stw r1,GPR1(r11); \
>>> stw r1,0(r11); \
>>> tovirt(r1,r11); /* set new kernel sp */ \
>>> - li r10,MSR_KERNEL & ~(MSR_IR|MSR_DR); /* can take exceptions */ \
>>> - MTMSRD(r10); /* (except for mach check in rtas) */ \
>>> stw r0,GPR0(r11); \
>>> lis r10,STACK_FRAME_REGS_MARKER@ha; /* exception frame marker */ \
>>> addi r10,r10,STACK_FRAME_REGS_MARKER@l; \
>>
>> Where does RI get enabled? I don't see it anywhere obvious.
>
> MSR_RI is part of MSR_KERNEL; it then gets enabled when the MMU is
> re-enabled on the way into the exception handler.
>
> #define EXC_XFER_TEMPLATE(n, hdlr, trap, copyee, tfer, ret) \
> li r10,trap; \
> stw r10,_TRAP(r11); \
> li r10,MSR_KERNEL; \
> copyee(r10, r9); \
> bl tfer; \
> i##n: \
> .long hdlr; \
> .long ret
>
> where tfer = transfer_to_handler.
>
> In transfer_to_handler (kernel/entry_32.S) you have:
>
> transfer_to_handler_cont:
> 3:
> mflr r9
> lwz r11,0(r9) /* virtual address of handler */
> lwz r9,4(r9) /* where to go when done */
> [...]
> mtspr SPRN_SRR0,r11
> mtspr SPRN_SRR1,r10
> mtlr r9
> SYNC
> RFI /* jump to handler, enable MMU */
>
> So MSR_RI is restored above, since r10 contains MSR_KERNEL [ | MSR_EE ].
>
It looks like fast_exception_return, which is called at least by the
hash page handlers, expects MSR_RI to be set. Although it works well on
the 603 (because it doesn't hash), it would most likely not work on
others.
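For reference, fast_exception_return in entry_32.S starts with a
recoverability check along these lines (on non-4xx/BOOKE):

	andi.	r10,r9,MSR_RI		/* check for recoverable interrupt */
	beq	1f			/* if not, we've got problems */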
This 1% improvement is not worth it; I give up for now.