2023-10-12 16:22:14

by Siarhei Volkau

Subject: [PATCH] MIPS: Take in account load hazards for HI/LO restoring

MIPS CPUs usually have a load hazard of 1 to 4 cycles, so a load
followed immediately by a move to HI/LO will usually stall the pipeline
for a significant amount of time. Take this into account and separate
the loads from mthi/mtlo in the instruction sequence.
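
As a rough illustration of the reordering, here is a simplified sketch of
the non-SmartMIPS path. It reuses the macros and registers from the diff
below, but it is not the literal patched code, and the number of restores
placed between the loads and the moves is only illustrative:

```
		/* Before: each move consumes the value loaded just before it */
		LONG_L	$24, PT_LO(sp)
		mtlo	$24			/* may stall on the load of $24 */
		LONG_L	$24, PT_HI(sp)
		mthi	$24			/* may stall again */

		/* After: both loads issue first, unrelated restores fill the gap */
		LONG_L	$24, PT_LO(sp)
		LONG_L	$15, PT_HI(sp)
		cfi_ld	$10, PT_R10, \docfi
		cfi_ld	$11, PT_R11, \docfi
		cfi_ld	$12, PT_R12, \docfi
		mtlo	$24			/* loaded several instructions ago */
		mthi	$15
		cfi_ld	$15, PT_R15, \docfi	/* temporary restored only after use */
```

The temporaries must be restored from the stack frame only after the
moves, since they are overwritten by the HI/LO values until then.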

The patch uses the t6 and t7 registers ($14 and $15) as temporaries in
addition to t8 ($24).

The patch also tries to handle SmartMIPS, but I know little about it and
haven't tested that path.

Signed-off-by: Siarhei Volkau <[email protected]>
---
arch/mips/include/asm/stackframe.h | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/arch/mips/include/asm/stackframe.h b/arch/mips/include/asm/stackframe.h
index a8705aef47e1..3821d91b00fd 100644
--- a/arch/mips/include/asm/stackframe.h
+++ b/arch/mips/include/asm/stackframe.h
@@ -308,17 +308,11 @@
jal octeon_mult_restore
#endif
#ifdef CONFIG_CPU_HAS_SMARTMIPS
- LONG_L $24, PT_ACX(sp)
- mtlhx $24
- LONG_L $24, PT_HI(sp)
- mtlhx $24
- LONG_L $24, PT_LO(sp)
- mtlhx $24
-#elif !defined(CONFIG_CPU_MIPSR6)
+ LONG_L $14, PT_ACX(sp)
+#endif
+#if defined(CONFIG_CPU_HAS_SMARTMIPS) || !defined(CONFIG_CPU_MIPSR6)
LONG_L $24, PT_LO(sp)
- mtlo $24
- LONG_L $24, PT_HI(sp)
- mthi $24
+ LONG_L $15, PT_HI(sp)
#endif
#ifdef CONFIG_32BIT
cfi_ld $8, PT_R8, \docfi
@@ -327,6 +321,14 @@
cfi_ld $10, PT_R10, \docfi
cfi_ld $11, PT_R11, \docfi
cfi_ld $12, PT_R12, \docfi
+#ifdef CONFIG_CPU_HAS_SMARTMIPS
+ mtlhx $14
+ mtlhx $15
+ mtlhx $24
+#elif !defined(CONFIG_CPU_MIPSR6)
+ mtlo $24
+ mthi $15
+#endif
cfi_ld $13, PT_R13, \docfi
cfi_ld $14, PT_R14, \docfi
cfi_ld $15, PT_R15, \docfi
--
2.41.0


2024-04-29 06:42:50

by Siarhei Volkau

Subject: Re: [PATCH] MIPS: Take in account load hazards for HI/LO restoring

Ping. The patch looks abandoned.

Paul, could you recommend the right people or lists for this?

On Thu, Oct 12, 2023 at 19:21, Siarhei Volkau <[email protected]> wrote:
>
> MIPS CPUs usually have 1 to 4 cycles load hazards, thus doing load
> and right after move to HI/LO will usually stall the pipeline for
> significant amount of time. Let's take it into account and separate
> loads and mthi/lo in instruction sequence.
>
> The patch uses t6 and t7 registers as temporaries in addition to t8.
>
> The patch tries to deal with SmartMIPS, but I know little about and
> haven't tested it.
>
> Signed-off-by: Siarhei Volkau <[email protected]>
> ---
> arch/mips/include/asm/stackframe.h | 22 ++++++++++++----------
> 1 file changed, 12 insertions(+), 10 deletions(-)
>
> diff --git a/arch/mips/include/asm/stackframe.h b/arch/mips/include/asm/stackframe.h
> index a8705aef47e1..3821d91b00fd 100644
> --- a/arch/mips/include/asm/stackframe.h
> +++ b/arch/mips/include/asm/stackframe.h
> @@ -308,17 +308,11 @@
> jal octeon_mult_restore
> #endif
> #ifdef CONFIG_CPU_HAS_SMARTMIPS
> - LONG_L $24, PT_ACX(sp)
> - mtlhx $24
> - LONG_L $24, PT_HI(sp)
> - mtlhx $24
> - LONG_L $24, PT_LO(sp)
> - mtlhx $24
> -#elif !defined(CONFIG_CPU_MIPSR6)
> + LONG_L $14, PT_ACX(sp)
> +#endif
> +#if defined(CONFIG_CPU_HAS_SMARTMIPS) || !defined(CONFIG_CPU_MIPSR6)
> LONG_L $24, PT_LO(sp)
> - mtlo $24
> - LONG_L $24, PT_HI(sp)
> - mthi $24
> + LONG_L $15, PT_HI(sp)
> #endif
> #ifdef CONFIG_32BIT
> cfi_ld $8, PT_R8, \docfi
> @@ -327,6 +321,14 @@
> cfi_ld $10, PT_R10, \docfi
> cfi_ld $11, PT_R11, \docfi
> cfi_ld $12, PT_R12, \docfi
> +#ifdef CONFIG_CPU_HAS_SMARTMIPS
> + mtlhx $14
> + mtlhx $15
> + mtlhx $24
> +#elif !defined(CONFIG_CPU_MIPSR6)
> + mtlo $24
> + mthi $15
> +#endif
> cfi_ld $13, PT_R13, \docfi
> cfi_ld $14, PT_R14, \docfi
> cfi_ld $15, PT_R15, \docfi
> --
> 2.41.0
>

2024-04-29 08:26:31

by Thomas Bogendoerfer

Subject: Re: [PATCH] MIPS: Take in account load hazards for HI/LO restoring

On Thu, Oct 12, 2023 at 07:20:27PM +0300, Siarhei Volkau wrote:
> MIPS CPUs usually have 1 to 4 cycles load hazards, thus doing load
> and right after move to HI/LO will usually stall the pipeline for
> significant amount of time. Let's take it into account and separate
> loads and mthi/lo in instruction sequence.
>
> The patch uses t6 and t7 registers as temporaries in addition to t8.
>
> The patch tries to deal with SmartMIPS, but I know little about and
> haven't tested it.
>
> Signed-off-by: Siarhei Volkau <[email protected]>
> ---
> arch/mips/include/asm/stackframe.h | 22 ++++++++++++----------
> 1 file changed, 12 insertions(+), 10 deletions(-)
>
> diff --git a/arch/mips/include/asm/stackframe.h b/arch/mips/include/asm/stackframe.h
> index a8705aef47e1..3821d91b00fd 100644
> --- a/arch/mips/include/asm/stackframe.h
> +++ b/arch/mips/include/asm/stackframe.h
> @@ -308,17 +308,11 @@
> jal octeon_mult_restore
> #endif
> #ifdef CONFIG_CPU_HAS_SMARTMIPS
> - LONG_L $24, PT_ACX(sp)
> - mtlhx $24
> - LONG_L $24, PT_HI(sp)
> - mtlhx $24
> - LONG_L $24, PT_LO(sp)
> - mtlhx $24
> -#elif !defined(CONFIG_CPU_MIPSR6)
> + LONG_L $14, PT_ACX(sp)
> +#endif
> +#if defined(CONFIG_CPU_HAS_SMARTMIPS) || !defined(CONFIG_CPU_MIPSR6)

Isn't that just #ifndef CONFIG_CPU_MIPSR6?

Thomas.

--
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea. [ RFC1925, 2.3 ]

2024-04-29 08:30:56

by Thomas Bogendoerfer

Subject: Re: [PATCH] MIPS: Take in account load hazards for HI/LO restoring

On Mon, Apr 29, 2024 at 10:18:57AM +0200, Thomas Bogendoerfer wrote:
> On Thu, Oct 12, 2023 at 07:20:27PM +0300, Siarhei Volkau wrote:
> > MIPS CPUs usually have 1 to 4 cycles load hazards, thus doing load
> > and right after move to HI/LO will usually stall the pipeline for
> > significant amount of time. Let's take it into account and separate
> > loads and mthi/lo in instruction sequence.
> >
> > The patch uses t6 and t7 registers as temporaries in addition to t8.
> >
> > The patch tries to deal with SmartMIPS, but I know little about and
> > haven't tested it.
> >
> > Signed-off-by: Siarhei Volkau <[email protected]>
> > ---
> > arch/mips/include/asm/stackframe.h | 22 ++++++++++++----------
> > 1 file changed, 12 insertions(+), 10 deletions(-)
> >
> > diff --git a/arch/mips/include/asm/stackframe.h b/arch/mips/include/asm/stackframe.h
> > index a8705aef47e1..3821d91b00fd 100644
> > --- a/arch/mips/include/asm/stackframe.h
> > +++ b/arch/mips/include/asm/stackframe.h
> > @@ -308,17 +308,11 @@
> > jal octeon_mult_restore
> > #endif
> > #ifdef CONFIG_CPU_HAS_SMARTMIPS
> > - LONG_L $24, PT_ACX(sp)
> > - mtlhx $24
> > - LONG_L $24, PT_HI(sp)
> > - mtlhx $24
> > - LONG_L $24, PT_LO(sp)
> > - mtlhx $24
> > -#elif !defined(CONFIG_CPU_MIPSR6)
> > + LONG_L $14, PT_ACX(sp)
> > +#endif
> > +#if defined(CONFIG_CPU_HAS_SMARTMIPS) || !defined(CONFIG_CPU_MIPSR6)
>
> isn't that just #ifndef CONFIG_CPU_MIPSR6 ?

and if yes, I'd prefer to have the same structure as for the moves to
the registers later, like

#ifdef CONFIG_CPU_HAS_SMARTMIPS
. do the SMARTMIPS things
#elif !defined(CONFIG_CPU_MIPSR6)
. do normal hi/lo
#endif

That way it's clearer what's happening depending on the selected
options.
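
Applied to the load block of the posted patch, that structure might look
like the following untested sketch. The register choices are simply
carried over from the patch and are not a reviewed revision:

```
#ifdef CONFIG_CPU_HAS_SMARTMIPS
		LONG_L	$14, PT_ACX(sp)
		LONG_L	$15, PT_HI(sp)
		LONG_L	$24, PT_LO(sp)
#elif !defined(CONFIG_CPU_MIPSR6)
		LONG_L	$24, PT_LO(sp)
		LONG_L	$15, PT_HI(sp)
#endif
```

This mirrors the #ifdef / #elif layout of the later mtlhx and mtlo/mthi
block, at the cost of repeating the two HI/LO loads in both branches.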

Thomas.
