From: "Gautham R. Shenoy" <[email protected]>
Hi,
This is the fourth iteration of the patchset to enable exploitation of
stop4 idle state on POWER9 via cpuidle.
The earlier version can be found here :
[v3]: https://lkml.org/lkml/2017/7/21/209
[v2]: https://lkml.org/lkml/2017/7/19/152
[v1]: https://lkml.org/lkml/2017/7/18/691
The changes across the versions are as follows:
v3-->v4:
- Modified the subject line to be consistent with the convention. No changes to code.
v2-->v3:
- Use a structure instead of an array for the stop sprs save area.
- Name the offsets into the paca->stop_sprs as STOP_XXX instead of PACA_XXX.
- Add comments in the assembly code explaining why saving/restoring
is not needed on POWER8.
- Program the LPCR during platform idle entry/exit on both POWER8 and POWER9
as suggested by Nicholas Piggin.
v1 --> v2:
- Move the LPCR manipulations for CPU-Hotplug into
arch/powerpc/platforms/powernv/idle.c as per Nicholas Piggin's
suggestion.
====================== Description ===========================
The stop4 idle state on POWER9 is a deep idle state which loses
hypervisor resources, but whose latency is low enough that it can be
exposed via cpuidle.
Until now, the deep idle states which lose hypervisor resources (eg:
winkle) were only exposed via CPU-Hotplug. Hence currently on wakeup
from such states, barring a few SPRs which need to be restored to
their older value, rest of the SPRS are reinitialized to their values
corresponding to that at boot time. When stop4 is used in the context
of cpuidle, we want these additional SPRs to be restored to their
older value, to ensure that the context on the CPU coming back from
idle is same as it was before going idle.
Additionally, the CPU which is in stop4 while idling can be woken up
by the decrementer interrupts. So we need to ensure that the LPCR is
programmed with PECE1 bit cleared via the stop-api only for the
CPU-Hotplug case and not for cpuidle.
The two patches in the series address this problem.
Gautham R. Shenoy (2):
powernv/powerpc:Save/Restore additional SPRs for stop4 cpuidle
powernv/powerpc: Clear PECE1 in LPCR via stop-api only on Hotplug
Gautham R. Shenoy (2):
powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle
powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug
arch/powerpc/include/asm/cpuidle.h | 11 ++++++
arch/powerpc/include/asm/paca.h | 7 ++++
arch/powerpc/kernel/asm-offsets.c | 8 +++++
arch/powerpc/kernel/idle_book3s.S | 65 +++++++++++++++++++++++++++++++++--
arch/powerpc/platforms/powernv/idle.c | 34 +++++++++++++++++-
arch/powerpc/platforms/powernv/smp.c | 8 -----
6 files changed, 122 insertions(+), 11 deletions(-)
--
1.9.4
From: "Gautham R. Shenoy" <[email protected]>
Currently we use the stop-api provided by the firmware to program the
SLW engine to restore the values of hypervisor resources that get lost
on deeper idle states (such as winkle). Since the deep states were
only used for CPU-Hotplug on POWER8 systems, we would program the LPCR
to have the PECE1 bit since Hotplugged CPUs shouldn't be spuriously
woken up by decrementer.
On POWER9, some of the deep platform idle states such as stop4 can be
used in cpuidle as well. In this case, we want the CPU in stop4 to be
woken up by the decrementer when some timer on the CPU expires.
In this patch, we program the stop-api for LPCR with PECE1
bit cleared only when we are offlining the CPU and set it
back once the CPU is online.
Signed-off-by: Gautham R. Shenoy <[email protected]>
---
arch/powerpc/platforms/powernv/idle.c | 34 +++++++++++++++++++++++++++++++++-
arch/powerpc/platforms/powernv/smp.c | 8 --------
2 files changed, 33 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index 2abee07..a1296e7 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -68,7 +68,7 @@ static int pnv_save_sprs_for_deep_states(void)
* all cpus at boot. Get these reg values of current cpu and use the
* same across all cpus.
*/
- uint64_t lpcr_val = mfspr(SPRN_LPCR) & ~(u64)LPCR_PECE1;
+ uint64_t lpcr_val = mfspr(SPRN_LPCR);
uint64_t hid0_val = mfspr(SPRN_HID0);
uint64_t hid1_val = mfspr(SPRN_HID1);
uint64_t hid4_val = mfspr(SPRN_HID4);
@@ -355,6 +355,14 @@ void power9_idle(void)
}
#ifdef CONFIG_HOTPLUG_CPU
+static void pnv_program_cpu_hotplug_lpcr(unsigned int cpu, u64 lpcr_val)
+{
+ u64 pir = get_hard_smp_processor_id(cpu);
+
+ mtspr(SPRN_LPCR, lpcr_val);
+ opal_slw_set_reg(pir, SPRN_LPCR, lpcr_val);
+}
+
/*
* pnv_cpu_offline: A function that puts the CPU into the deepest
* available platform idle state on a CPU-Offline.
@@ -364,6 +372,20 @@ unsigned long pnv_cpu_offline(unsigned int cpu)
{
unsigned long srr1;
u32 idle_states = pnv_get_supported_cpuidle_states();
+ u64 lpcr_val;
+
+ /*
+ * We don't want to take decrementer interrupts while we are
+ * offline, so clear LPCR:PECE1. We keep PECE2 (and
+ * LPCR_PECE_HVEE on P9) enabled as to let IPIs in.
+ *
+ * If the CPU gets woken up by a special wakeup, ensure that
+ * the SLW engine sets LPCR with decrementer bit cleared, else
+ * the CPU will come back to the kernel due to a spurious
+ * wakeup.
+ */
+ lpcr_val = mfspr(SPRN_LPCR) & ~(u64)LPCR_PECE1;
+ pnv_program_cpu_hotplug_lpcr(cpu, lpcr_val);
__ppc64_runlatch_off();
@@ -394,6 +416,16 @@ unsigned long pnv_cpu_offline(unsigned int cpu)
__ppc64_runlatch_on();
+ /*
+ * Re-enable decrementer interrupts in LPCR.
+ *
+ * Further, we want stop states to be woken up by decrementer
+ * for non-hotplug cases. So program the LPCR via stop api as
+ * well.
+ */
+ lpcr_val = mfspr(SPRN_LPCR) | (u64)LPCR_PECE1;
+ pnv_program_cpu_hotplug_lpcr(cpu, lpcr_val);
+
return srr1;
}
#endif
diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index 40dae96..536b07b 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -164,12 +164,6 @@ static void pnv_smp_cpu_kill_self(void)
if (cpu_has_feature(CPU_FTR_ARCH_207S))
wmask = SRR1_WAKEMASK_P8;
- /* We don't want to take decrementer interrupts while we are offline,
- * so clear LPCR:PECE1. We keep PECE2 (and LPCR_PECE_HVEE on P9)
- * enabled as to let IPIs in.
- */
- mtspr(SPRN_LPCR, mfspr(SPRN_LPCR) & ~(u64)LPCR_PECE1);
-
while (!generic_check_cpu_restart(cpu)) {
/*
* Clear IPI flag, since we don't handle IPIs while
@@ -219,8 +213,6 @@ static void pnv_smp_cpu_kill_self(void)
}
- /* Re-enable decrementer interrupts */
- mtspr(SPRN_LPCR, mfspr(SPRN_LPCR) | LPCR_PECE1);
DBG("CPU%d coming online...\n", cpu);
}
--
1.9.4
From: "Gautham R. Shenoy" <[email protected]>
The stop4 idle state on POWER9 is a deep idle state which loses
hypervisor resources, but whose latency is low enough that it can be
exposed via cpuidle.
Until now, the deep idle states which lose hypervisor resources (eg:
winkle) were only exposed via CPU-Hotplug. Hence currently on wakeup
from such states, barring a few SPRs which need to be restored to
their older value, rest of the SPRS are reinitialized to their values
corresponding to that at boot time.
When stop4 is used in the context of cpuidle, we want these additional
SPRs to be restored to their older value, to ensure that the context
on the CPU coming back from idle is same as it was before going idle.
In this patch, we define a SPR save area in PACA (since we have used
up the volatile register space in the stack) and on POWER9, we restore
SPRN_PID, SPRN_LDBAR, SPRN_FSCR, SPRN_HFSCR, SPRN_MMCRA, SPRN_MMCR1,
SPRN_MMCR2 to the values they had before entering stop.
Signed-off-by: Gautham R. Shenoy <[email protected]>
---
arch/powerpc/include/asm/cpuidle.h | 11 +++++++
arch/powerpc/include/asm/paca.h | 7 ++++
arch/powerpc/kernel/asm-offsets.c | 8 +++++
arch/powerpc/kernel/idle_book3s.S | 65 ++++++++++++++++++++++++++++++++++++--
4 files changed, 89 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h
index 52586f9..8a174cb 100644
--- a/arch/powerpc/include/asm/cpuidle.h
+++ b/arch/powerpc/include/asm/cpuidle.h
@@ -67,6 +67,17 @@
#define ERR_DEEP_STATE_ESL_MISMATCH -2
#ifndef __ASSEMBLY__
+/* Additional SPRs that need to be saved/restored during stop */
+struct stop_sprs {
+ u64 pid;
+ u64 ldbar;
+ u64 fscr;
+ u64 hfscr;
+ u64 mmcr1;
+ u64 mmcr2;
+ u64 mmcra;
+};
+
extern u32 pnv_fastsleep_workaround_at_entry[];
extern u32 pnv_fastsleep_workaround_at_exit[];
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index dc88a31..04b60af 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -31,6 +31,7 @@
#endif
#include <asm/accounting.h>
#include <asm/hmi.h>
+#include <asm/cpuidle.h>
register struct paca_struct *local_paca asm("r13");
@@ -183,6 +184,12 @@ struct paca_struct {
struct paca_struct **thread_sibling_pacas;
/* The PSSCR value that the kernel requested before going to stop */
u64 requested_psscr;
+
+ /*
+ * Save area for additional SPRs that need to be
+ * saved/restored during cpuidle stop.
+ */
+ struct stop_sprs stop_sprs;
#endif
#ifdef CONFIG_PPC_STD_MMU_64
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 6e95c2c..8cfb20e 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -746,6 +746,14 @@ int main(void)
OFFSET(PACA_SUBCORE_SIBLING_MASK, paca_struct, subcore_sibling_mask);
OFFSET(PACA_SIBLING_PACA_PTRS, paca_struct, thread_sibling_pacas);
OFFSET(PACA_REQ_PSSCR, paca_struct, requested_psscr);
+#define STOP_SPR(x, f) OFFSET(x, paca_struct, stop_sprs.f)
+ STOP_SPR(STOP_PID, pid);
+ STOP_SPR(STOP_LDBAR, ldbar);
+ STOP_SPR(STOP_FSCR, fscr);
+ STOP_SPR(STOP_HFSCR, hfscr);
+ STOP_SPR(STOP_MMCR1, mmcr1);
+ STOP_SPR(STOP_MMCR2, mmcr2);
+ STOP_SPR(STOP_MMCRA, mmcra);
#endif
DEFINE(PPC_DBELL_SERVER, PPC_DBELL_SERVER);
diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index 516ebef..4621568 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -85,7 +85,61 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
std r3,_WORT(r1)
mfspr r3,SPRN_WORC
std r3,_WORC(r1)
+/*
+ * On POWER9, there are idle states such as stop4, invoked via cpuidle,
+ * that lose hypervisor resources. In such cases, we need to save
+ * additional SPRs before entering those idle states so that they can
+ * be restored to their older values on wakeup from the idle state.
+ *
+ * On POWER8, the only such deep idle state is winkle which is used
+ * only in the context of CPU-Hotplug, where these additional SPRs are
+ * reinitiazed to a sane value. Hence there is no need to save/restore
+ * these SPRs.
+ */
+BEGIN_FTR_SECTION
+ blr
+END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300)
+
+power9_save_additional_sprs:
+ mfspr r3, SPRN_PID
+ mfspr r4, SPRN_LDBAR
+ std r3, STOP_PID(r13)
+ std r4, STOP_LDBAR(r13)
+ mfspr r3, SPRN_FSCR
+ mfspr r4, SPRN_HFSCR
+ std r3, STOP_FSCR(r13)
+ std r4, STOP_HFSCR(r13)
+
+ mfspr r3, SPRN_MMCRA
+ mfspr r4, SPRN_MMCR1
+ std r3, STOP_MMCRA(r13)
+ std r4, STOP_MMCR1(r13)
+
+ mfspr r3, SPRN_MMCR2
+ std r3, STOP_MMCR2(r13)
+ blr
+
+power9_restore_additional_sprs:
+ ld r3,_LPCR(r1)
+ ld r4, STOP_PID(r13)
+ mtspr SPRN_LPCR,r3
+ mtspr SPRN_PID, r4
+
+ ld r3, STOP_LDBAR(r13)
+ ld r4, STOP_FSCR(r13)
+ mtspr SPRN_LDBAR, r3
+ mtspr SPRN_FSCR, r4
+
+ ld r3, STOP_HFSCR(r13)
+ ld r4, STOP_MMCRA(r13)
+ mtspr SPRN_HFSCR, r3
+ mtspr SPRN_MMCRA, r4
+ /* We have already restored PACA_MMCR0 */
+ ld r3, STOP_MMCR1(r13)
+ ld r4, STOP_MMCR2(r13)
+ mtspr SPRN_MMCR1, r3
+ mtspr SPRN_MMCR2, r4
blr
/*
@@ -803,9 +857,16 @@ no_segments:
mtctr r12
bctrl
+/*
+ * On POWER9, we can come here on wakeup from a cpuidle stop state.
+ * Hence restore the additional SPRs to the saved value.
+ *
+ * On POWER8, we come here only on winkle. Since winkle is used
+ * only in the case of CPU-Hotplug, we don't need to restore
+ * the additional SPRs.
+ */
BEGIN_FTR_SECTION
- ld r4,_LPCR(r1)
- mtspr SPRN_LPCR,r4
+ bl power9_restore_additional_sprs
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
hypervisor_state_restored:
--
1.9.4
On Mon, 7 Aug 2017 11:12:42 +0530
"Gautham R. Shenoy" <[email protected]> wrote:
> From: "Gautham R. Shenoy" <[email protected]>
>
> Hi,
>
> This is the fourth iteration of the patchset to enable exploitation of
> stop4 idle state on POWER9 via cpuidle.
These look good to me, thanks. I think the SPR saving is moving in the
right direction too. We should get this enabled stream as soon as possible.
For both patches,
Reviewed-by: Nicholas Piggin <[email protected]>