2017-04-12 11:46:42

by Gautham R Shenoy

[permalink] [raw]
Subject: [PATCH 0/3] powernv:stop: Some fixes for handling deep stop

From: "Gautham R. Shenoy" <[email protected]>

Hi,

This patchset contains three fixes required to get a deep stop state
that can lose the Hypervisor state to work correctly.

The first patch in the series uses the correct value for the
IDLE_THREAD_BITS on POWER8 which has 8 threads per core and on POWER9
which has 4 threads per core.

The second patch decouples restoring Timebase from restoring per-core
spr state as the current code assumes that if the timebase is not lost
then neither is per-core state. This was true on POWER8, but no longer
true on POWER9.

The third patch in the series sets the UPRT bit in LPCR on wakeup from
a deep stop if we are running in radix mode, without which the kernel
crashes once we switch to virtual mode.

These patches are on top of the patches for fixing CPU-Hotplug on
POWER9 DD1.0 (https://lkml.org/lkml/2017/3/22/472) and Nicholas
Piggin's idle fixes and changes for POWER8 and POWER9
(https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-March/155608.html)

Gautham R. Shenoy (3):
powernv:idle: Use correct IDLE_THREAD_BITS in POWER8 vs POWER9
powernv:idle: Decouple TB restore & Per-core SPRs restore
powernv:idle: Set LPCR_UPRT on wakeup from deep-stop

arch/powerpc/include/asm/cpuidle.h | 3 ++-
arch/powerpc/kernel/idle_book3s.S | 29 +++++++++++++++++++++++------
arch/powerpc/platforms/powernv/idle.c | 5 ++++-
3 files changed, 29 insertions(+), 8 deletions(-)

--
1.9.4


2017-04-12 11:46:46

by Gautham R Shenoy

[permalink] [raw]
Subject: [PATCH 1/3] powernv:idle: Use correct IDLE_THREAD_BITS in POWER8/9

From: "Gautham R. Shenoy" <[email protected]>

This patch ensures that POWER8 and POWER9 processors use the correct
value of IDLE_THREAD_BITS as POWER8 has 8 threads per core and hence
the IDLE_THREAD_BITS should be 0xFF while POWER9 has only 4 threads
per core and hence the IDLE_THREAD_BITS should be 0xF.

Signed-off-by: Gautham R. Shenoy <[email protected]>
---
arch/powerpc/include/asm/cpuidle.h | 3 ++-
arch/powerpc/kernel/idle_book3s.S | 9 ++++++---
arch/powerpc/platforms/powernv/idle.c | 5 ++++-
3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h
index 52586f9..fece6ca 100644
--- a/arch/powerpc/include/asm/cpuidle.h
+++ b/arch/powerpc/include/asm/cpuidle.h
@@ -34,7 +34,8 @@
#define PNV_CORE_IDLE_THREAD_WINKLE_BITS_SHIFT 8
#define PNV_CORE_IDLE_THREAD_WINKLE_BITS 0x0000FF00

-#define PNV_CORE_IDLE_THREAD_BITS 0x000000FF
+#define PNV_CORE_IDLE_4THREAD_BITS 0x0000000F
+#define PNV_CORE_IDLE_8THREAD_BITS 0x000000FF

/*
* ============================ NOTE =================================
diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index 2b13fe2..9b747e9 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -223,7 +223,7 @@ lwarx_loop1:
add r15,r15,r5 /* Add if winkle */
andc r15,r15,r7 /* Clear thread bit */

- andi. r9,r15,PNV_CORE_IDLE_THREAD_BITS
+ andi. r9,r15,PNV_CORE_IDLE_8THREAD_BITS

/*
* If cr0 = 0, then current thread is the last thread of the core entering
@@ -582,8 +582,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
stwcx. r15,0,r14
bne- 1b
isync
-
- andi. r9,r15,PNV_CORE_IDLE_THREAD_BITS
+BEGIN_FTR_SECTION
+ andi. r9,r15,PNV_CORE_IDLE_4THREAD_BITS
+FTR_SECTION_ELSE
+ andi. r9,r15,PNV_CORE_IDLE_8THREAD_BITS
+ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
cmpwi cr2,r9,0

/*
diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index 445f30a..d46920b 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -112,7 +112,10 @@ static void pnv_alloc_idle_core_states(void)
size_t paca_ptr_array_size;

core_idle_state = kmalloc_node(sizeof(u32), GFP_KERNEL, node);
- *core_idle_state = PNV_CORE_IDLE_THREAD_BITS;
+ if (cpu_has_feature(CPU_FTR_ARCH_300))
+ *core_idle_state = PNV_CORE_IDLE_4THREAD_BITS;
+ else
+ *core_idle_state = PNV_CORE_IDLE_8THREAD_BITS;
paca_ptr_array_size = (threads_per_core *
sizeof(struct paca_struct *));

--
1.9.4

2017-04-12 11:46:59

by Gautham R Shenoy

[permalink] [raw]
Subject: [PATCH 3/3] powernv:idle: Set LPCR_UPRT on wakeup from deep-stop

From: "Gautham R. Shenoy" <[email protected]>

On wakeup from a deep-stop used for CPU-Hotplug, we invoke
cur_cpu_spec->cpu_restore() which would set sane default values to
various SPRs including LPCR.

On POWER9, the cpu_restore_power9() call would would restore LPCR to a
sane value that is set at early boot time, thereby clearing LPCR_UPRT.

However, LPCR_UPRT is required to be set if we are running in Radix
mode. If this is not set we will end up with a crash when we enable
IR,DR.

To fix this, after returning from cur_cpu_spec->cpu_restore() in the
idle exit path, set LPCR_UPRT if we are running in Radix mode.

Cc: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Gautham R. Shenoy <[email protected]>
---
arch/powerpc/kernel/idle_book3s.S | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index 6a9bd28..39a9b63 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -804,6 +804,19 @@ no_segments:
#endif
mtctr r12
bctrl
+/*
+ * cur_cpu_spec->cpu_restore would restore LPCR to a
+ * sane value that is set at early boot time,
+ * thereby clearing LPCR_UPRT.
+ * LPCR_UPRT is required if we are running in Radix mode.
+ * Set it here if that be the case.
+ */
+BEGIN_MMU_FTR_SECTION
+ mfspr r3, SPRN_LPCR
+ LOAD_REG_IMMEDIATE(r4, LPCR_UPRT)
+ or r3, r3, r4
+ mtspr SPRN_LPCR, r3
+END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_RADIX)

hypervisor_state_restored:

--
1.9.4

2017-04-12 11:46:53

by Gautham R Shenoy

[permalink] [raw]
Subject: [PATCH 2/3] powernv:idle: Decouple TB restore & Per-core SPRs restore

From: "Gautham R. Shenoy" <[email protected]>

The idle-exit code assumes that if Timebase is not lost, then neither
are the per-core hypervisor resources lost. This was true on POWER8
where fast-sleep lost only TB but not per-core resources, and winkle
lost both.

This assumption is not true for POWER9 however, since there can be
states which do not lose timebase but can lose per-core SPRs.

Hence check if we need to restore the per-core hypervisor state even
if timebase is not lost.

Signed-off-by: Gautham R. Shenoy <[email protected]>
---
arch/powerpc/kernel/idle_book3s.S | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index 9b747e9..6a9bd28 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -723,13 +723,14 @@ timebase_resync:
* Use cr3 which indicates that we are waking up with atleast partial
* hypervisor state loss to determine if TIMEBASE RESYNC is needed.
*/
- ble cr3,clear_lock
+ ble cr3,.Ltb_resynced
/* Time base re-sync */
bl opal_resync_timebase;
/*
- * If waking up from sleep, per core state is not lost, skip to
- * clear_lock.
+ * If waking up from sleep (POWER8), per core state
+ * is not lost, skip to clear_lock.
*/
+.Ltb_resynced:
blt cr4,clear_lock

/*
--
1.9.4

2017-04-13 04:00:16

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: [PATCH 3/3] powernv:idle: Set LPCR_UPRT on wakeup from deep-stop

"Gautham R. Shenoy" <[email protected]> writes:

> From: "Gautham R. Shenoy" <[email protected]>
>
> On wakeup from a deep-stop used for CPU-Hotplug, we invoke
> cur_cpu_spec->cpu_restore() which would set sane default values to
> various SPRs including LPCR.
>
> On POWER9, the cpu_restore_power9() call would would restore LPCR to a
> sane value that is set at early boot time, thereby clearing LPCR_UPRT.
>
> However, LPCR_UPRT is required to be set if we are running in Radix
> mode. If this is not set we will end up with a crash when we enable
> IR,DR.
>
> To fix this, after returning from cur_cpu_spec->cpu_restore() in the
> idle exit path, set LPCR_UPRT if we are running in Radix mode.
>
> Cc: Aneesh Kumar K.V <[email protected]>
> Signed-off-by: Gautham R. Shenoy <[email protected]>
> ---
> arch/powerpc/kernel/idle_book3s.S | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
> index 6a9bd28..39a9b63 100644
> --- a/arch/powerpc/kernel/idle_book3s.S
> +++ b/arch/powerpc/kernel/idle_book3s.S
> @@ -804,6 +804,19 @@ no_segments:
> #endif
> mtctr r12
> bctrl
> +/*
> + * cur_cpu_spec->cpu_restore would restore LPCR to a
> + * sane value that is set at early boot time,
> + * thereby clearing LPCR_UPRT.
> + * LPCR_UPRT is required if we are running in Radix mode.
> + * Set it here if that be the case.
> + */
> +BEGIN_MMU_FTR_SECTION
> + mfspr r3, SPRN_LPCR
> + LOAD_REG_IMMEDIATE(r4, LPCR_UPRT)
> + or r3, r3, r4
> + mtspr SPRN_LPCR, r3
> +END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_RADIX)

What about LPCR_HR ?

>
> hypervisor_state_restored:
>
> --
> 1.9.4

-aneesh

2017-04-13 04:12:42

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: [PATCH 3/3] powernv:idle: Set LPCR_UPRT on wakeup from deep-stop

On Thu, 2017-04-13 at 09:28 +0530, Aneesh Kumar K.V wrote:
> >   #endif
> >        mtctr   r12
> >        bctrl
> > +/*
> > + * cur_cpu_spec->cpu_restore would restore LPCR to a
> > + * sane value that is set at early boot time,
> > + * thereby clearing LPCR_UPRT.
> > + * LPCR_UPRT is required if we are running in Radix mode.
> > + * Set it here if that be the case.
> > + */
> > +BEGIN_MMU_FTR_SECTION
> > +     mfspr   r3, SPRN_LPCR
> > +     LOAD_REG_IMMEDIATE(r4, LPCR_UPRT)
> > +     or      r3, r3, r4
> > +     mtspr   SPRN_LPCR, r3
> > +END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_RADIX)

We are probably better off saving the value somewhere during boot
and just "blasting" it whole back.

Cheers
Ben.

2017-04-13 06:27:42

by Michael Neuling

[permalink] [raw]
Subject: Re: [PATCH 3/3] powernv:idle: Set LPCR_UPRT on wakeup from deep-stop

On Thu, 2017-04-13 at 14:12 +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2017-04-13 at 09:28 +0530, Aneesh Kumar K.V wrote:
> > >   #endif
> > >        mtctr   r12
> > >        bctrl
> > > +/*
> > > + * cur_cpu_spec->cpu_restore would restore LPCR to a
> > > + * sane value that is set at early boot time,
> > > + * thereby clearing LPCR_UPRT.
> > > + * LPCR_UPRT is required if we are running in Radix mode.
> > > + * Set it here if that be the case.
> > > + */
> > > +BEGIN_MMU_FTR_SECTION
> > > +     mfspr   r3, SPRN_LPCR
> > > +     LOAD_REG_IMMEDIATE(r4, LPCR_UPRT)
> > > +     or      r3, r3, r4
> > > +     mtspr   SPRN_LPCR, r3
> > > +END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_RADIX)
>
> We are probably better off saving the value somewhere during boot
> and just "blasting" it whole back.

We seem to touch LPCR in a bunch of places these days. Not sure when "sometimes
during boot" should actually be.

Mikey

2017-04-13 06:36:25

by Michael Neuling

[permalink] [raw]
Subject: Re: [PATCH 1/3] powernv:idle: Use correct IDLE_THREAD_BITS in POWER8/9

On Wed, 2017-04-12 at 17:16 +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <[email protected]>
>
> This patch ensures that POWER8 and POWER9 processors use the correct
> value of IDLE_THREAD_BITS as POWER8 has 8 threads per core and hence
> the IDLE_THREAD_BITS should be 0xFF while POWER9 has only 4 threads
> per core and hence the IDLE_THREAD_BITS should be 0xF.

Why don't we derive this from the device tree rather than hard wiring it per cpu
type?

Mikey

>
> Signed-off-by: Gautham R. Shenoy <[email protected]>
> ---
>  arch/powerpc/include/asm/cpuidle.h    | 3 ++-
>  arch/powerpc/kernel/idle_book3s.S     | 9 ++++++---
>  arch/powerpc/platforms/powernv/idle.c | 5 ++++-
>  3 files changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/cpuidle.h
> b/arch/powerpc/include/asm/cpuidle.h
> index 52586f9..fece6ca 100644
> --- a/arch/powerpc/include/asm/cpuidle.h
> +++ b/arch/powerpc/include/asm/cpuidle.h
> @@ -34,7 +34,8 @@
>  #define PNV_CORE_IDLE_THREAD_WINKLE_BITS_SHIFT 8
>  #define PNV_CORE_IDLE_THREAD_WINKLE_BITS 0x0000FF00
>  
> -#define PNV_CORE_IDLE_THREAD_BITS        0x000000FF
> +#define PNV_CORE_IDLE_4THREAD_BITS        0x0000000F
> +#define PNV_CORE_IDLE_8THREAD_BITS        0x000000FF
>  
>  /*
>   * ============================ NOTE =================================
> diff --git a/arch/powerpc/kernel/idle_book3s.S
> b/arch/powerpc/kernel/idle_book3s.S
> index 2b13fe2..9b747e9 100644
> --- a/arch/powerpc/kernel/idle_book3s.S
> +++ b/arch/powerpc/kernel/idle_book3s.S
> @@ -223,7 +223,7 @@ lwarx_loop1:
>   add r15,r15,r5 /* Add if winkle */
>   andc r15,r15,r7 /* Clear thread bit */
>  
> - andi. r9,r15,PNV_CORE_IDLE_THREAD_BITS
> + andi. r9,r15,PNV_CORE_IDLE_8THREAD_BITS
>  
>  /*
>   * If cr0 = 0, then current thread is the last thread of the core entering
> @@ -582,8 +582,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
>   stwcx. r15,0,r14
>   bne- 1b
>   isync
> -
> - andi. r9,r15,PNV_CORE_IDLE_THREAD_BITS
> +BEGIN_FTR_SECTION
> + andi. r9,r15,PNV_CORE_IDLE_4THREAD_BITS
> +FTR_SECTION_ELSE
> + andi. r9,r15,PNV_CORE_IDLE_8THREAD_BITS
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
>   cmpwi cr2,r9,0
>  
>   /*
> diff --git a/arch/powerpc/platforms/powernv/idle.c
> b/arch/powerpc/platforms/powernv/idle.c
> index 445f30a..d46920b 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -112,7 +112,10 @@ static void pnv_alloc_idle_core_states(void)
>   size_t paca_ptr_array_size;
>  
>   core_idle_state = kmalloc_node(sizeof(u32), GFP_KERNEL,
> node);
> - *core_idle_state = PNV_CORE_IDLE_THREAD_BITS;
> + if (cpu_has_feature(CPU_FTR_ARCH_300))
> + *core_idle_state = PNV_CORE_IDLE_4THREAD_BITS;
> + else
> + *core_idle_state = PNV_CORE_IDLE_8THREAD_BITS;
>   paca_ptr_array_size = (threads_per_core *
>          sizeof(struct paca_struct *));
>  

2017-04-13 06:55:50

by Michael Neuling

[permalink] [raw]
Subject: Re: [PATCH 2/3] powernv:idle: Decouple TB restore & Per-core SPRs restore

On Wed, 2017-04-12 at 17:16 +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <[email protected]>
>
> The idle-exit code assumes that if Timebase is not lost, then neither
> are the per-core hypervisor resources lost.

Double negative! How about:

The idle-exit code assumes that if the timebase is restored, then the
per-core hypervisor resources are also restored.

> This was true on POWER8
> where fast-sleep lost only TB but not per-core resources, and winkle
> lost both.
>
> This assumption is not true for POWER9 however, since there can be
> states which do not lose timebase but can lose per-core SPRs.
>
> Hence check if we need to restore the per-core hypervisor state even
> if timebase is not lost.

I think I understand what you're doing, just seems awkwardly worded.

Is this actually what the patch is doing? It seem to be just changing one
branch.

Mikey

>
> Signed-off-by: Gautham R. Shenoy <[email protected]>
> ---
>  arch/powerpc/kernel/idle_book3s.S | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/kernel/idle_book3s.S
> b/arch/powerpc/kernel/idle_book3s.S
> index 9b747e9..6a9bd28 100644
> --- a/arch/powerpc/kernel/idle_book3s.S
> +++ b/arch/powerpc/kernel/idle_book3s.S
> @@ -723,13 +723,14 @@ timebase_resync:
>    * Use cr3 which indicates that we are waking up with atleast partial
>    * hypervisor state loss to determine if TIMEBASE RESYNC is needed.
>    */
> - ble cr3,clear_lock
> + ble cr3,.Ltb_resynced
>   /* Time base re-sync */
>   bl opal_resync_timebase;
>   /*
> -  * If waking up from sleep, per core state is not lost, skip to
> -  * clear_lock.
> +  * If waking up from sleep (POWER8), per core state
> +  * is not lost, skip to clear_lock.
>    */
> +.Ltb_resynced:
>   blt cr4,clear_lock
>  
>   /*

2017-04-13 07:18:58

by Nicholas Piggin

[permalink] [raw]
Subject: Re: [PATCH 3/3] powernv:idle: Set LPCR_UPRT on wakeup from deep-stop

On Thu, 13 Apr 2017 16:27:34 +1000
Michael Neuling <[email protected]> wrote:

> On Thu, 2017-04-13 at 14:12 +1000, Benjamin Herrenschmidt wrote:
> > On Thu, 2017-04-13 at 09:28 +0530, Aneesh Kumar K.V wrote:
> > > >   #endif
> > > >        mtctr   r12
> > > >        bctrl
> > > > +/*
> > > > + * cur_cpu_spec->cpu_restore would restore LPCR to a
> > > > + * sane value that is set at early boot time,
> > > > + * thereby clearing LPCR_UPRT.
> > > > + * LPCR_UPRT is required if we are running in Radix mode.
> > > > + * Set it here if that be the case.
> > > > + */
> > > > +BEGIN_MMU_FTR_SECTION
> > > > +     mfspr   r3, SPRN_LPCR
> > > > +     LOAD_REG_IMMEDIATE(r4, LPCR_UPRT)
> > > > +     or      r3, r3, r4
> > > > +     mtspr   SPRN_LPCR, r3
> > > > +END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_RADIX)
> >
> > We are probably better off saving the value somewhere during boot
> > and just "blasting" it whole back.
>
> We seem to touch LPCR in a bunch of places these days. Not sure when "sometimes
> during boot" should actually be.

In the short term, what if we just save LPCR and restore it after calling
cpu_restore? As you say there are a lot of things that touch LPCR we're
not catching here.

2017-04-13 10:00:56

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH 1/3] powernv:idle: Use correct IDLE_THREAD_BITS in POWER8/9

Michael Neuling <[email protected]> writes:

> On Wed, 2017-04-12 at 17:16 +0530, Gautham R. Shenoy wrote:
>> From: "Gautham R. Shenoy" <[email protected]>
>>
>> This patch ensures that POWER8 and POWER9 processors use the correct
>> value of IDLE_THREAD_BITS as POWER8 has 8 threads per core and hence
>> the IDLE_THREAD_BITS should be 0xFF while POWER9 has only 4 threads
>> per core and hence the IDLE_THREAD_BITS should be 0xF.
>
> Why don't we derive this from the device tree rather than hard wiring it per cpu
> type?

Right.

In fact we already have threads_per_core which is exactly that.

cheers

2017-04-13 10:05:05

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH 3/3] powernv:idle: Set LPCR_UPRT on wakeup from deep-stop

Nicholas Piggin <[email protected]> writes:

> On Thu, 13 Apr 2017 16:27:34 +1000
> Michael Neuling <[email protected]> wrote:
>
>> On Thu, 2017-04-13 at 14:12 +1000, Benjamin Herrenschmidt wrote:
>> > On Thu, 2017-04-13 at 09:28 +0530, Aneesh Kumar K.V wrote:
>> > > >   #endif
>> > > >        mtctr   r12
>> > > >        bctrl
>> > > > +/*
>> > > > + * cur_cpu_spec->cpu_restore would restore LPCR to a
>> > > > + * sane value that is set at early boot time,
>> > > > + * thereby clearing LPCR_UPRT.
>> > > > + * LPCR_UPRT is required if we are running in Radix mode.
>> > > > + * Set it here if that be the case.
>> > > > + */
>> > > > +BEGIN_MMU_FTR_SECTION
>> > > > +     mfspr   r3, SPRN_LPCR
>> > > > +     LOAD_REG_IMMEDIATE(r4, LPCR_UPRT)
>> > > > +     or      r3, r3, r4
>> > > > +     mtspr   SPRN_LPCR, r3
>> > > > +END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_RADIX)
>> >
>> > We are probably better off saving the value somewhere during boot
>> > and just "blasting" it whole back.
>>
>> We seem to touch LPCR in a bunch of places these days. Not sure when "sometimes
>> during boot" should actually be.
>
> In the short term, what if we just save LPCR and restore it after calling
> cpu_restore? As you say there are a lot of things that touch LPCR we're
> not catching here.

Yeah can we save it on the way down and restore that value on the way
back up?

The real problem here is that cpu_restore() does not "restore" anything,
it programs a set of fixed values.

We should probably rework it so that it actually does a save/restore to
avoid more bugs like this.

cheers

2017-04-13 11:35:24

by Gautham R Shenoy

[permalink] [raw]
Subject: Re: [PATCH 1/3] powernv:idle: Use correct IDLE_THREAD_BITS in POWER8/9

On Thu, Apr 13, 2017 at 08:00:47PM +1000, Michael Ellerman wrote:
> Michael Neuling <[email protected]> writes:
>
> > On Wed, 2017-04-12 at 17:16 +0530, Gautham R. Shenoy wrote:
> >> From: "Gautham R. Shenoy" <[email protected]>
> >>
> >> This patch ensures that POWER8 and POWER9 processors use the correct
> >> value of IDLE_THREAD_BITS as POWER8 has 8 threads per core and hence
> >> the IDLE_THREAD_BITS should be 0xFF while POWER9 has only 4 threads
> >> per core and hence the IDLE_THREAD_BITS should be 0xF.
> >
> > Why don't we derive this from the device tree rather than hard wiring it per cpu
> > type?
>
> Right.
>
> In fact we already have threads_per_core which is exactly that.

Ok. I will convert IDLE_THREAD_BITS to a variable instead of a
macro so that the variable holds the value
(1 << threads_per_core) - 1.



>
> cheers
>

2017-04-13 11:52:08

by Gautham R Shenoy

[permalink] [raw]
Subject: Re: [PATCH 2/3] powernv:idle: Decouple TB restore & Per-core SPRs restore

On Thu, Apr 13, 2017 at 04:55:45PM +1000, Michael Neuling wrote:
> On Wed, 2017-04-12 at 17:16 +0530, Gautham R. Shenoy wrote:
> > From: "Gautham R. Shenoy" <[email protected]>
> >
> > The idle-exit code assumes that if Timebase is not lost, then neither
> > are the per-core hypervisor resources lost.
>
> Double negative! How about:
>
> The idle-exit code assumes that if the timebase is restored, then the
> per-core hypervisor resources are also restored.

Ok. To be more explicit, this was the intention:

On POWER8, in case of
- nap: both timebase and hypervisor state is retained.
- fast-sleep: timebase is lost. But the hypervisor state is retained.
- winkle: timebase and hypervisor state is lost.

Hence, the current code assumes that if the timebase value is
retained, then so is the hypervisor state. Thus, the current code
doesn't restore per-core hypervisor state if this is the case.

But that is no longer true on POWER9 where we do have stop states
where timebase value is retained, but the hypervisor state is lost. So
we have to ensure that the per-core hypervisor state gets restored in
such cases.

>
> > This was true on POWER8
> > where fast-sleep lost only TB but not per-core resources, and winkle
> > lost both.
> >
> > This assumption is not true for POWER9 however, since there can be
> > states which do not lose timebase but can lose per-core SPRs.
> >
> > Hence check if we need to restore the per-core hypervisor state even
> > if timebase is not lost.
>
> I think I understand what you're doing, just seems awkwardly worded.

Will fix the wording.

>
> Is this actually what the patch is doing? It seem to be just changing one
> branch.
>

Yes, we change the branch to .Ltb_resynced which will ensure in the
case when timebase is retained, we explicitly check if we are waking
up from a deep stop that loses per-core hypervisor state (indicated by
cr4 being eq or gt).

> Mikey
>
> >
> > Signed-off-by: Gautham R. Shenoy <[email protected]>
> > ---
> > ?arch/powerpc/kernel/idle_book3s.S | 7 ++++---
> > ?1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/powerpc/kernel/idle_book3s.S
> > b/arch/powerpc/kernel/idle_book3s.S
> > index 9b747e9..6a9bd28 100644
> > --- a/arch/powerpc/kernel/idle_book3s.S
> > +++ b/arch/powerpc/kernel/idle_book3s.S
> > @@ -723,13 +723,14 @@ timebase_resync:
> > ? ?* Use cr3 which indicates that we are waking up with atleast partial
> > ? ?* hypervisor state loss to determine if TIMEBASE RESYNC is needed.
> > ? ?*/
> > - ble cr3,clear_lock
> > + ble cr3,.Ltb_resynced
> > ? /* Time base re-sync */
> > ? bl opal_resync_timebase;
> > ? /*
> > - ?* If waking up from sleep, per core state is not lost, skip to
> > - ?* clear_lock.
> > + ?* If waking up from sleep (POWER8), per core state
> > + ?* is not lost, skip to clear_lock.
> > ? ?*/
> > +.Ltb_resynced:
> > ? blt cr4,clear_lock
> > ?
> > ? /*

2017-04-13 11:54:46

by Gautham R Shenoy

[permalink] [raw]
Subject: Re: [PATCH 3/3] powernv:idle: Set LPCR_UPRT on wakeup from deep-stop

On Thu, Apr 13, 2017 at 05:18:17PM +1000, Nicholas Piggin wrote:
> On Thu, 13 Apr 2017 16:27:34 +1000
> Michael Neuling <[email protected]> wrote:
>
> > On Thu, 2017-04-13 at 14:12 +1000, Benjamin Herrenschmidt wrote:
> > > On Thu, 2017-04-13 at 09:28 +0530, Aneesh Kumar K.V wrote:
> > > > > ? #endif
> > > > > ???????mtctr???r12
> > > > > ???????bctrl
> > > > > +/*
> > > > > + * cur_cpu_spec->cpu_restore would restore LPCR to a
> > > > > + * sane value that is set at early boot time,
> > > > > + * thereby clearing LPCR_UPRT.
> > > > > + * LPCR_UPRT is required if we are running in Radix mode.
> > > > > + * Set it here if that be the case.
> > > > > + */
> > > > > +BEGIN_MMU_FTR_SECTION
> > > > > +?????mfspr???r3, SPRN_LPCR
> > > > > +?????LOAD_REG_IMMEDIATE(r4, LPCR_UPRT)
> > > > > +?????or??????r3, r3, r4
> > > > > +?????mtspr???SPRN_LPCR, r3
> > > > > +END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_RADIX)
> > >
> > > We are probably better off saving the value somewhere during boot
> > > and just "blasting" it whole back.
> >
> > We seem to touch LPCR in a bunch of places these days. Not sure when "sometimes
> > during boot" should actually be.
>
> In the short term, what if we just save LPCR and restore it after calling
> cpu_restore? As you say there are a lot of things that touch LPCR we're
> not catching here.

In that case can we skip calling cpu_restore in the idle_exit path
altogether and simply restore LPCR to the value that the thread had
before executing stop ?

>

--
Thanks and Regards
gautham.

2017-04-13 12:09:09

by Nicholas Piggin

[permalink] [raw]
Subject: Re: [PATCH 3/3] powernv:idle: Set LPCR_UPRT on wakeup from deep-stop

On Thu, 13 Apr 2017 17:24:34 +0530
Gautham R Shenoy <[email protected]> wrote:

> On Thu, Apr 13, 2017 at 05:18:17PM +1000, Nicholas Piggin wrote:
> > On Thu, 13 Apr 2017 16:27:34 +1000
> > Michael Neuling <[email protected]> wrote:
> >
> > > On Thu, 2017-04-13 at 14:12 +1000, Benjamin Herrenschmidt wrote:
> > > > On Thu, 2017-04-13 at 09:28 +0530, Aneesh Kumar K.V wrote:
> > > > > >   #endif
> > > > > >        mtctr   r12
> > > > > >        bctrl
> > > > > > +/*
> > > > > > + * cur_cpu_spec->cpu_restore would restore LPCR to a
> > > > > > + * sane value that is set at early boot time,
> > > > > > + * thereby clearing LPCR_UPRT.
> > > > > > + * LPCR_UPRT is required if we are running in Radix mode.
> > > > > > + * Set it here if that be the case.
> > > > > > + */
> > > > > > +BEGIN_MMU_FTR_SECTION
> > > > > > +     mfspr   r3, SPRN_LPCR
> > > > > > +     LOAD_REG_IMMEDIATE(r4, LPCR_UPRT)
> > > > > > +     or      r3, r3, r4
> > > > > > +     mtspr   SPRN_LPCR, r3
> > > > > > +END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_RADIX)
> > > >
> > > > We are probably better off saving the value somewhere during boot
> > > > and just "blasting" it whole back.
> > >
> > > We seem to touch LPCR in a bunch of places these days. Not sure when "sometimes
> > > during boot" should actually be.
> >
> > In the short term, what if we just save LPCR and restore it after calling
> > cpu_restore? As you say there are a lot of things that touch LPCR we're
> > not catching here.
>
> In that case can we skip calling cpu_restore in the idle_exit path
> altogether and simply restore LPCR to the value that the thread had
> before executing stop ?

Good question. For a minimal fix I would keep calling cpu_restore. I'd
like to get rid of it if we can though, but that might take a bit more
work.