2024-04-30 15:46:37

by Siarhei Volkau

[permalink] [raw]
Subject: [PATCH v2] MIPS: Take in account load hazards for HI/LO restoring

MIPS CPUs usually have 1 to 4 cycles load hazards, thus doing load
and right after move to HI/LO will usually stall the pipeline for
significant amount of time. Let's take it into account and separate
loads and mthi/lo in instruction sequence.

The patch uses t6 and t7 registers as temporaries in addition to t8.

The patch tries to deal with SmartMIPS, but I know little about and
haven't tested it.

Changes in v2:
- clear separation of actions for SmartMIPS and pre-MIPSR6.

Signed-off-by: Siarhei Volkau <[email protected]>
---
arch/mips/include/asm/stackframe.h | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/mips/include/asm/stackframe.h b/arch/mips/include/asm/stackframe.h
index a8705aef47e1..a13431379073 100644
--- a/arch/mips/include/asm/stackframe.h
+++ b/arch/mips/include/asm/stackframe.h
@@ -308,17 +308,12 @@
jal octeon_mult_restore
#endif
#ifdef CONFIG_CPU_HAS_SMARTMIPS
- LONG_L $24, PT_ACX(sp)
- mtlhx $24
- LONG_L $24, PT_HI(sp)
- mtlhx $24
+ LONG_L $14, PT_ACX(sp)
LONG_L $24, PT_LO(sp)
- mtlhx $24
+ LONG_L $15, PT_HI(sp)
#elif !defined(CONFIG_CPU_MIPSR6)
LONG_L $24, PT_LO(sp)
- mtlo $24
- LONG_L $24, PT_HI(sp)
- mthi $24
+ LONG_L $15, PT_HI(sp)
#endif
#ifdef CONFIG_32BIT
cfi_ld $8, PT_R8, \docfi
@@ -327,6 +322,14 @@
cfi_ld $10, PT_R10, \docfi
cfi_ld $11, PT_R11, \docfi
cfi_ld $12, PT_R12, \docfi
+#ifdef CONFIG_CPU_HAS_SMARTMIPS
+ mtlhx $14
+ mtlhx $15
+ mtlhx $24
+#elif !defined(CONFIG_CPU_MIPSR6)
+ mtlo $24
+ mthi $15
+#endif
cfi_ld $13, PT_R13, \docfi
cfi_ld $14, PT_R14, \docfi
cfi_ld $15, PT_R15, \docfi
--
2.44.0



2024-05-03 12:55:46

by Thomas Bogendoerfer

[permalink] [raw]
Subject: Re: [PATCH v2] MIPS: Take in account load hazards for HI/LO restoring

On Tue, Apr 30, 2024 at 06:45:58PM +0300, Siarhei Volkau wrote:
> MIPS CPUs usually have 1 to 4 cycles load hazards, thus doing load
> and right after move to HI/LO will usually stall the pipeline for
> significant amount of time. Let's take it into account and separate
> loads and mthi/lo in instruction sequence.
>
> The patch uses t6 and t7 registers as temporaries in addition to t8.
>
> The patch tries to deal with SmartMIPS, but I know little about and
> haven't tested it.
>
> Changes in v2:
> - clear separation of actions for SmartMIPS and pre-MIPSR6.
>
> Signed-off-by: Siarhei Volkau <[email protected]>
> ---
> arch/mips/include/asm/stackframe.h | 19 +++++++++++--------
> 1 file changed, 11 insertions(+), 8 deletions(-)
>

applied to mips-next.

Thomas.

--
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea. [ RFC1925, 2.3 ]