From: "Gautham R. Shenoy" <[email protected]>
Hi,
This is the second iteration of the patchset to use the psscr_val and
psscr_mask provided by the firmware for each of the stop states.
The previous version can be found here:
https://lkml.org/lkml/2016/9/29/45
The main changes in this version are:
1) Add a helper function in powernv-cpuidle.c to help initialize the
powernv_states cpuidle table.
2) Handle the older firmware which populates only the Requested Level
(RL) fields of the psscr and psscr_mask in the device tree. This
patchset ensures that in the presence of the older firmware, the
other fields of PSSCR are initalized to sane default values.
Synopsis
==========
In the current implementation, the code for ISA v3.0 stop
implementation has a couple of shortcomings.
a) The code hand-codes the values for ESL,EC,TR,MTL bits of PSSCR and
uses only the RL field from the firmware. While this is not
incorrect, since the hand-coded values are legitimate, it is not a
very flexible design since the firmware has the capability to
communicate these values via the "ibm,cpu-idle-state-psscr" and
"ibm,cpu-idle-state-psscr-mask" properties. In case where the
firmware provides values for these fields that is different from
the hand-coded values, the current code will not work as intended.
b) Due to issue a), the current code assumes that ESL=EC=1 for all the
stop states and hence the wakeup from the stop instruction will
happen at 0x100, the system-reset vector. However, the ISA v3.0
allows the ESL=EC=0 behaviour where the corresponding stop-state
loses no state and wakes up from the subsequent instruction. The
current code doesn't handle this case.
This patch series addresses these issues.
The first patch in the series renames the existing
IDLE_STATE_ENTER_SEQ macro to IDLE_STATE_ENTER_SEQ_NORET. It reuses
the name IDLE_STATE_ENTER_SEQ for entering into stop-states which wake
up at the subsequent instruction.
The second patch adds a helper function in cpuidle-powernv.c for
initializing entries of the powernv_states[] table that is passed to
the cpu-idle core. This eliminates some of the code duplication in the
function that discovers and initializes the stop states.
The third patch in the series fixes issues a) and b) by ensuring that
the psscr-value and the psscr-mask provided by the firmware are what
will be used to set a particular stop state. It also adds support for
handling wake-up from stop states which were entered with ESL=EC=0.
The third patch also handles the older firmware which sets only the
Requested Level (RL) field in the psscr and psscr-mask exposed in the
device tree. In the presence of such older firmware, this patch will
set the default sane values for for remaining PSSCR fields (i.e PSLL,
MTL, ESL, EC, and TR).
The skiboot patch populates all the relevant fields in the PSSCR
values and the mask for all the stop states can be found here:
https://lists.ozlabs.org/pipermail/skiboot/2016-September/004869.html
The patches are based on top of
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git fixes
Gautham R. Shenoy (3):
powernv:idle: Add IDLE_STATE_ENTER_SEQ_NORET macro
cpuidle:powernv: Add helper function to populate powernv idle states.
powernv: Pass PSSCR value and mask to power9_idle_stop
arch/powerpc/include/asm/cpuidle.h | 42 +++++++++++-
arch/powerpc/include/asm/processor.h | 3 +-
arch/powerpc/kernel/exceptions-64s.S | 6 +-
arch/powerpc/kernel/idle_book3s.S | 41 +++++++-----
arch/powerpc/platforms/powernv/idle.c | 81 +++++++++++++++++++----
arch/powerpc/platforms/powernv/powernv.h | 3 +-
arch/powerpc/platforms/powernv/smp.c | 14 ++--
drivers/cpuidle/cpuidle-powernv.c | 110 ++++++++++++++++++++-----------
include/linux/cpuidle.h | 1 +
9 files changed, 220 insertions(+), 81 deletions(-)
--
1.9.4
From: "Gautham R. Shenoy" <[email protected]>
In the current code for powernv_add_idle_states, there is a lot of code
duplication while initializing an idle state in powernv_states table.
Add an inline helper function to populate the powernv_states[] table for
a given idle state. Invoke this for populating the "Nap", "Fastsleep"
and the stop states in powernv_add_idle_states.
Signed-off-by: Gautham R. Shenoy <[email protected]>
---
drivers/cpuidle/cpuidle-powernv.c | 82 +++++++++++++++++++++++----------------
include/linux/cpuidle.h | 1 +
2 files changed, 49 insertions(+), 34 deletions(-)
diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
index 7fe442c..11b22b9 100644
--- a/drivers/cpuidle/cpuidle-powernv.c
+++ b/drivers/cpuidle/cpuidle-powernv.c
@@ -167,6 +167,28 @@ static int powernv_cpuidle_driver_init(void)
return 0;
}
+static inline void add_powernv_state(int index, const char *name,
+ unsigned int flags,
+ int (*idle_fn)(struct cpuidle_device *,
+ struct cpuidle_driver *,
+ int),
+ unsigned int target_residency,
+ unsigned int exit_latency,
+ u64 psscr_val)
+{
+ strncpy(powernv_states[index].name, name, CPUIDLE_NAME_LEN);
+ strncpy(powernv_states[index].desc, name, CPUIDLE_NAME_LEN);
+ powernv_states[index].flags = flags;
+ powernv_states[index].target_residency = target_residency;
+ powernv_states[index].exit_latency = exit_latency;
+ powernv_states[index].enter = idle_fn;
+
+ if (idle_fn != stop_loop)
+ return;
+
+ stop_psscr_table[index] = psscr_val;
+}
+
static int powernv_add_idle_states(void)
{
struct device_node *power_mgt;
@@ -236,6 +258,7 @@ static int powernv_add_idle_states(void)
"ibm,cpu-idle-state-residency-ns", residency_ns, dt_idle_states);
for (i = 0; i < dt_idle_states; i++) {
+ unsigned int exit_latency, target_residency;
/*
* If an idle state has exit latency beyond
* POWERNV_THRESHOLD_LATENCY_NS then don't use it
@@ -244,27 +267,30 @@ static int powernv_add_idle_states(void)
if (latency_ns[i] > POWERNV_THRESHOLD_LATENCY_NS)
continue;
+ exit_latency = ((unsigned int)latency_ns[i]) / 1000;
+ target_residency = (!rc) ? ((unsigned int)residency_ns[i]) : 0;
+ /*
+ * Firmware passes residency values in ns.
+ * cpuidle expects it in us.
+ */
+ target_residency /= 1000;
+
/*
* Cpuidle accepts exit_latency and target_residency in us.
* Use default target_residency values if f/w does not expose it.
*/
if (flags[i] & OPAL_PM_NAP_ENABLED) {
+ target_residency = 100;
/* Add NAP state */
- strcpy(powernv_states[nr_idle_states].name, "Nap");
- strcpy(powernv_states[nr_idle_states].desc, "Nap");
- powernv_states[nr_idle_states].flags = 0;
- powernv_states[nr_idle_states].target_residency = 100;
- powernv_states[nr_idle_states].enter = nap_loop;
+ add_powernv_state(nr_idle_states, "Nap",
+ CPUIDLE_FLAG_NONE, nap_loop,
+ target_residency, exit_latency, 0);
} else if ((flags[i] & OPAL_PM_STOP_INST_FAST) &&
!(flags[i] & OPAL_PM_TIMEBASE_STOP)) {
- strncpy(powernv_states[nr_idle_states].name,
- names[i], CPUIDLE_NAME_LEN);
- strncpy(powernv_states[nr_idle_states].desc,
- names[i], CPUIDLE_NAME_LEN);
- powernv_states[nr_idle_states].flags = 0;
-
- powernv_states[nr_idle_states].enter = stop_loop;
- stop_psscr_table[nr_idle_states] = psscr_val[i];
+ add_powernv_state(nr_idle_states, names[i],
+ CPUIDLE_FLAG_NONE, stop_loop,
+ target_residency, exit_latency,
+ psscr_val[i]);
}
/*
@@ -274,32 +300,20 @@ static int powernv_add_idle_states(void)
#ifdef CONFIG_TICK_ONESHOT
if (flags[i] & OPAL_PM_SLEEP_ENABLED ||
flags[i] & OPAL_PM_SLEEP_ENABLED_ER1) {
+ target_residency = 300000;
/* Add FASTSLEEP state */
- strcpy(powernv_states[nr_idle_states].name, "FastSleep");
- strcpy(powernv_states[nr_idle_states].desc, "FastSleep");
- powernv_states[nr_idle_states].flags = CPUIDLE_FLAG_TIMER_STOP;
- powernv_states[nr_idle_states].target_residency = 300000;
- powernv_states[nr_idle_states].enter = fastsleep_loop;
+ add_powernv_state(nr_idle_states, "FastSleep",
+ CPUIDLE_FLAG_TIMER_STOP,
+ fastsleep_loop,
+ target_residency, exit_latency, 0);
} else if ((flags[i] & OPAL_PM_STOP_INST_DEEP) &&
(flags[i] & OPAL_PM_TIMEBASE_STOP)) {
- strncpy(powernv_states[nr_idle_states].name,
- names[i], CPUIDLE_NAME_LEN);
- strncpy(powernv_states[nr_idle_states].desc,
- names[i], CPUIDLE_NAME_LEN);
-
- powernv_states[nr_idle_states].flags = CPUIDLE_FLAG_TIMER_STOP;
- powernv_states[nr_idle_states].enter = stop_loop;
- stop_psscr_table[nr_idle_states] = psscr_val[i];
+ add_powernv_state(nr_idle_states, names[i],
+ CPUIDLE_FLAG_TIMER_STOP, stop_loop,
+ target_residency, exit_latency,
+ psscr_val[i]);
}
#endif
- powernv_states[nr_idle_states].exit_latency =
- ((unsigned int)latency_ns[i]) / 1000;
-
- if (!rc) {
- powernv_states[nr_idle_states].target_residency =
- ((unsigned int)residency_ns[i]) / 1000;
- }
-
nr_idle_states++;
}
out:
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index bb31373..c4e10f8 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -62,6 +62,7 @@ struct cpuidle_state {
};
/* Idle State Flags */
+#define CPUIDLE_FLAG_NONE (0x00)
#define CPUIDLE_FLAG_COUPLED (0x02) /* state applies to multiple cpus */
#define CPUIDLE_FLAG_TIMER_STOP (0x04) /* timer is stopped on this state */
--
1.9.4
From: "Gautham R. Shenoy" <[email protected]>
The power9_idle_stop method currently takes only the requested stop
level as a parameter and picks up the rest of the PSSCR bits from a
hand-coded macro. This is not a very flexible design, especially when
the firmware has the capability to communicate the psscr value and the
mask associated with a particular stop state via device tree.
This patch modifies the power9_idle_stop API to take as parameters the
PSSCR value and the PSSCR mask corresponding to the stop state that
needs to be set. These PSSCR value and mask are respectively obtained
by parsing the "ibm,cpu-idle-state-psscr" and
"ibm,cpu-idle-state-psscr-mask" fields from the device tree.
In addition to this, the patch adds support for handling stop states
for which ESL and EC bits in the PSSCR are zero. As per the
architecture, a wakeup from these stop states resumes execution from
the subsequent instruction as opposed to waking up at the System
Vector.
The older firmware sets only the Requested Level (RL) field in the
psscr and psscr-mask exposed in the device tree. For older firmware
where psscr-mask=0xf, this patch will set the default sane values that
the set for for remaining PSSCR fields (i.e PSLL, MTL, ESL, EC, and
TR).
This skiboot patch that exports fully populated PSSCR values and the
mask for all the stop states can be found here:
https://lists.ozlabs.org/pipermail/skiboot/2016-September/004869.html
Signed-off-by: Gautham R. Shenoy <[email protected]>
---
arch/powerpc/include/asm/cpuidle.h | 37 +++++++++++++++
arch/powerpc/include/asm/processor.h | 3 +-
arch/powerpc/kernel/idle_book3s.S | 31 +++++++-----
arch/powerpc/platforms/powernv/idle.c | 81 ++++++++++++++++++++++++++------
arch/powerpc/platforms/powernv/powernv.h | 3 +-
arch/powerpc/platforms/powernv/smp.c | 14 +++---
drivers/cpuidle/cpuidle-powernv.c | 40 +++++++++++-----
7 files changed, 165 insertions(+), 44 deletions(-)
diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h
index 0a3255b..41b5d27 100644
--- a/arch/powerpc/include/asm/cpuidle.h
+++ b/arch/powerpc/include/asm/cpuidle.h
@@ -10,11 +10,48 @@
#define PNV_CORE_IDLE_LOCK_BIT 0x100
#define PNV_CORE_IDLE_THREAD_BITS 0x0FF
+/*
+ * By default we set the ESL and EC bits in the PSSCR.
+ *
+ * The MTL and PSLL are set to the maximum value possible as per the
+ * ISA, i.e 15.
+ *
+ * The Transition Rate is set to the Maximum value 3.
+ */
+#define PSSCR_HV_DEFAULT_VAL (PSSCR_ESL | PSSCR_EC | \
+ PSSCR_PSLL_MASK | PSSCR_TR_MASK | \
+ PSSCR_MTL_MASK)
+
+#define PSSCR_HV_DEFAULT_MASK (PSSCR_ESL | PSSCR_EC | \
+ PSSCR_PSLL_MASK | PSSCR_TR_MASK | \
+ PSSCR_MTL_MASK | PSSCR_RL_MASK)
+
#ifndef __ASSEMBLY__
extern u32 pnv_fastsleep_workaround_at_entry[];
extern u32 pnv_fastsleep_workaround_at_exit[];
extern u64 pnv_first_deep_stop_state;
+
+/*
+ * Older firmware only populates the RL field in psscr_val and
+ * psscr_mask.
+ *
+ * Set the remaining fields to the default value in case of the
+ * older firmware.
+ */
+static inline u64 compute_psscr_val(u64 psscr_val, u64 psscr_mask)
+{
+ if (psscr_mask == 0xf)
+ return psscr_val | PSSCR_HV_DEFAULT_VAL;
+ return psscr_val;
+}
+
+static inline u64 compute_psscr_mask(u64 psscr_mask)
+{
+ if (psscr_mask == 0xf)
+ return PSSCR_HV_DEFAULT_MASK;
+ return psscr_mask;
+}
#endif
#endif
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index c07c31b..422becd 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -458,7 +458,8 @@ static inline unsigned long get_clean_sp(unsigned long sp, int is_32)
extern unsigned long power7_nap(int check_irq);
extern unsigned long power7_sleep(void);
extern unsigned long power7_winkle(void);
-extern unsigned long power9_idle_stop(unsigned long stop_level);
+extern unsigned long power9_idle_stop(unsigned long stop_psscr_val,
+ unsigned long stop_psscr_mask);
extern void flush_instruction_cache(void);
extern void hard_reset_now(void);
diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index be90e2f..37ee533 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -40,9 +40,7 @@
#define _WORC GPR11
#define _PTCR GPR12
-#define PSSCR_HV_TEMPLATE PSSCR_ESL | PSSCR_EC | \
- PSSCR_PSLL_MASK | PSSCR_TR_MASK | \
- PSSCR_MTL_MASK
+#define PSSCR_EC_ESL_MASK_SHIFTED (PSSCR_EC | PSSCR_ESL) >> 16
.text
@@ -264,7 +262,7 @@ enter_winkle:
IDLE_STATE_ENTER_SEQ_NORET(PPC_WINKLE)
/*
- * r3 - requested stop state
+ * r3 - PSSCR value corresponding to the requested stop state.
*/
power_enter_stop:
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
@@ -274,9 +272,19 @@ power_enter_stop:
stb r4,HSTATE_HWTHREAD_STATE(r13)
#endif
/*
+ * Check if we are executing the lite variant with ESL=EC=0
+ */
+ andis. r4, r3, PSSCR_EC_ESL_MASK_SHIFTED
+ andi. r3, r3, PSSCR_RL_MASK /* r3 = requested stop state */
+ cmpdi r4, 0
+ bne 1f
+ IDLE_STATE_ENTER_SEQ(PPC_STOP)
+ li r3,0 /* Since we didn't lose state, return 0 */
+ b pnv_wakeup_noloss
+/*
* Check if the requested state is a deep idle state.
*/
- LOAD_REG_ADDRBASE(r5,pnv_first_deep_stop_state)
+1: LOAD_REG_ADDRBASE(r5,pnv_first_deep_stop_state)
ld r4,ADDROFF(pnv_first_deep_stop_state)(r5)
cmpd r3,r4
bge 2f
@@ -353,16 +361,17 @@ ALT_FTR_SECTION_END_NESTED_IFSET(CPU_FTR_ARCH_207S, 66); \
ld r3,ORIG_GPR3(r1); /* Restore original r3 */ \
20: nop;
-
/*
- * r3 - requested stop state
+ * r3 - The PSSCR value corresponding to the stop state.
+ * r4 - The PSSCR mask corrresonding to the stop state.
*/
_GLOBAL(power9_idle_stop)
- LOAD_REG_IMMEDIATE(r4, PSSCR_HV_TEMPLATE)
- or r4,r4,r3
- mtspr SPRN_PSSCR, r4
- li r4, 1
+ mfspr r5, SPRN_PSSCR
+ andc r5, r5, r4
+ or r3, r3, r5
+ mtspr SPRN_PSSCR, r3
LOAD_REG_ADDR(r5,power_enter_stop)
+ li r4, 1
b pnv_powersave_common
/* No return */
/*
diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index 479c256..663c6ef 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -237,15 +237,21 @@ static DEVICE_ATTR(fastsleep_workaround_applyonce, 0600,
show_fastsleep_workaround_applyonce,
store_fastsleep_workaround_applyonce);
+/*
+ * The default stop state that will be used by ppc_md.power_save
+ * function on platforms that support stop instruction.
+ */
+u64 pnv_default_stop_val;
+u64 pnv_default_stop_mask;
/*
* Used for ppc_md.power_save which needs a function with no parameters
*/
static void power9_idle(void)
{
- /* Requesting stop state 0 */
- power9_idle_stop(0);
+ power9_idle_stop(pnv_default_stop_val, pnv_default_stop_mask);
}
+
/*
* First deep stop state. Used to figure out when to save/restore
* hypervisor context.
@@ -253,9 +259,11 @@ static void power9_idle(void)
u64 pnv_first_deep_stop_state = MAX_STOP_STATE;
/*
- * Deepest stop idle state. Used when a cpu is offlined
+ * psscr value and mask of the deepest stop idle state.
+ * Used when a cpu is offlined.
*/
-u64 pnv_deepest_stop_state;
+u64 pnv_deepest_stop_psscr_val;
+u64 pnv_deepest_stop_psscr_mask;
/*
* Power ISA 3.0 idle initialization.
@@ -302,28 +310,58 @@ static int __init pnv_arch300_idle_init(struct device_node *np, u32 *flags,
int dt_idle_states)
{
u64 *psscr_val = NULL;
+ u64 *psscr_mask = NULL;
+ u32 *residency_ns = NULL;
+ u64 max_residency_ns = 0;
int rc = 0, i;
+ bool default_stop_found = false;
- psscr_val = kcalloc(dt_idle_states, sizeof(*psscr_val),
- GFP_KERNEL);
- if (!psscr_val) {
+ psscr_val = kcalloc(dt_idle_states, sizeof(*psscr_val), GFP_KERNEL);
+ psscr_mask = kcalloc(dt_idle_states, sizeof(*psscr_mask), GFP_KERNEL);
+ residency_ns = kcalloc(dt_idle_states, sizeof(*residency_ns),
+ GFP_KERNEL);
+
+ if (!psscr_val || !psscr_mask || !residency_ns) {
rc = -1;
goto out;
}
+
if (of_property_read_u64_array(np,
"ibm,cpu-idle-state-psscr",
psscr_val, dt_idle_states)) {
- pr_warn("cpuidle-powernv: missing ibm,cpu-idle-states-psscr in DT\n");
+ pr_warn("cpuidle-powernv: missing ibm,cpu-idle-state-psscr in DT\n");
+ rc = -1;
+ goto out;
+ }
+
+ if (of_property_read_u64_array(np,
+ "ibm,cpu-idle-state-psscr-mask",
+ psscr_mask, dt_idle_states)) {
+ pr_warn("cpuidle-powernv: missing ibm,cpu-idle-state-psscr-mask in DT\n");
+ rc = -1;
+ goto out;
+ }
+
+ if (of_property_read_u32_array(np,
+ "ibm,cpu-idle-state-residency-ns",
+ residency_ns, dt_idle_states)) {
+ pr_warn("cpuidle-powernv: missing ibm,cpu-idle-state-residency-ns in DT\n");
rc = -1;
goto out;
}
/*
- * Set pnv_first_deep_stop_state and pnv_deepest_stop_state.
+ * Set pnv_first_deep_stop_state, pnv_deepest_stop_psscr_{val,mask},
+ * and the pnv_default_stop_{val,mask}.
+ *
* pnv_first_deep_stop_state should be set to the first stop
* level to cause hypervisor state loss.
- * pnv_deepest_stop_state should be set to the deepest stop
- * stop state.
+ *
+ * pnv_deepest_stop_{val,mask} should be set to values corresponding to
+ * the deepest stop state.
+ *
+ * pnv_default_stop_{val,mask} should be set to values corresponding to
+ * the shallowest (OPAL_PM_STOP_INST_FAST) loss-less stop state.
*/
pnv_first_deep_stop_state = MAX_STOP_STATE;
for (i = 0; i < dt_idle_states; i++) {
@@ -333,12 +371,27 @@ static int __init pnv_arch300_idle_init(struct device_node *np, u32 *flags,
(pnv_first_deep_stop_state > psscr_rl))
pnv_first_deep_stop_state = psscr_rl;
- if (pnv_deepest_stop_state < psscr_rl)
- pnv_deepest_stop_state = psscr_rl;
- }
+ if (max_residency_ns < residency_ns[i]) {
+ max_residency_ns = residency_ns[i];
+ pnv_deepest_stop_psscr_val =
+ compute_psscr_val(psscr_val[i], psscr_mask[i]);
+ pnv_deepest_stop_psscr_mask =
+ compute_psscr_mask(psscr_mask[i]);
+ }
+ if (!default_stop_found &&
+ (flags[i] & OPAL_PM_STOP_INST_FAST)) {
+ pnv_default_stop_val =
+ compute_psscr_val(psscr_val[i], psscr_mask[i]);
+ pnv_default_stop_mask =
+ compute_psscr_mask(psscr_mask[i]);
+ default_stop_found = true;
+ }
+ }
out:
kfree(psscr_val);
+ kfree(psscr_mask);
+ kfree(residency_ns);
return rc;
}
diff --git a/arch/powerpc/platforms/powernv/powernv.h b/arch/powerpc/platforms/powernv/powernv.h
index da7c843..6130522 100644
--- a/arch/powerpc/platforms/powernv/powernv.h
+++ b/arch/powerpc/platforms/powernv/powernv.h
@@ -18,7 +18,8 @@ static inline void pnv_pci_shutdown(void) { }
#endif
extern u32 pnv_get_supported_cpuidle_states(void);
-extern u64 pnv_deepest_stop_state;
+extern u64 pnv_deepest_stop_psscr_val;
+extern u64 pnv_deepest_stop_psscr_mask;
extern void pnv_lpc_init(void);
diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index c789258..1c6405f 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -182,15 +182,17 @@ static void pnv_smp_cpu_kill_self(void)
ppc64_runlatch_off();
- if (cpu_has_feature(CPU_FTR_ARCH_300))
- srr1 = power9_idle_stop(pnv_deepest_stop_state);
- else if (idle_states & OPAL_PM_WINKLE_ENABLED)
+ if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+ srr1 = power9_idle_stop(pnv_deepest_stop_psscr_val,
+ pnv_deepest_stop_psscr_mask);
+ } else if (idle_states & OPAL_PM_WINKLE_ENABLED) {
srr1 = power7_winkle();
- else if ((idle_states & OPAL_PM_SLEEP_ENABLED) ||
- (idle_states & OPAL_PM_SLEEP_ENABLED_ER1))
+ } else if ((idle_states & OPAL_PM_SLEEP_ENABLED) ||
+ (idle_states & OPAL_PM_SLEEP_ENABLED_ER1)) {
srr1 = power7_sleep();
- else
+ } else {
srr1 = power7_nap(1);
+ }
ppc64_runlatch_on();
diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
index 11b22b9..0be0d97 100644
--- a/drivers/cpuidle/cpuidle-powernv.c
+++ b/drivers/cpuidle/cpuidle-powernv.c
@@ -19,6 +19,7 @@
#include <asm/firmware.h>
#include <asm/opal.h>
#include <asm/runlatch.h>
+#include <asm/cpuidle.h>
#define POWERNV_THRESHOLD_LATENCY_NS 200000
@@ -30,7 +31,12 @@ struct cpuidle_driver powernv_idle_driver = {
static int max_idle_state;
static struct cpuidle_state *cpuidle_state_table;
-static u64 stop_psscr_table[CPUIDLE_STATE_MAX];
+struct stop_psscr_table {
+ u64 val;
+ u64 mask;
+};
+
+static struct stop_psscr_table stop_psscr_table[CPUIDLE_STATE_MAX];
static u64 snooze_timeout;
static bool snooze_timeout_en;
@@ -102,7 +108,8 @@ static int stop_loop(struct cpuidle_device *dev,
int index)
{
ppc64_runlatch_off();
- power9_idle_stop(stop_psscr_table[index]);
+ power9_idle_stop(stop_psscr_table[index].val,
+ stop_psscr_table[index].mask);
ppc64_runlatch_on();
return index;
}
@@ -174,7 +181,7 @@ static inline void add_powernv_state(int index, const char *name,
int),
unsigned int target_residency,
unsigned int exit_latency,
- u64 psscr_val)
+ u64 psscr_val, u64 psscr_mask)
{
strncpy(powernv_states[index].name, name, CPUIDLE_NAME_LEN);
strncpy(powernv_states[index].desc, name, CPUIDLE_NAME_LEN);
@@ -186,7 +193,9 @@ static inline void add_powernv_state(int index, const char *name,
if (idle_fn != stop_loop)
return;
- stop_psscr_table[index] = psscr_val;
+ stop_psscr_table[index].val = compute_psscr_val(psscr_val,
+ psscr_mask);
+ stop_psscr_table[index].mask = compute_psscr_mask(psscr_mask);
}
static int powernv_add_idle_states(void)
@@ -198,6 +207,7 @@ static int powernv_add_idle_states(void)
u32 residency_ns[CPUIDLE_STATE_MAX];
u32 flags[CPUIDLE_STATE_MAX];
u64 psscr_val[CPUIDLE_STATE_MAX];
+ u64 psscr_mask[CPUIDLE_STATE_MAX];
const char *names[CPUIDLE_STATE_MAX];
int i, rc;
@@ -245,15 +255,23 @@ static int powernv_add_idle_states(void)
/*
* If the idle states use stop instruction, probe for psscr values
- * which are necessary to specify required stop level.
+ * and psscr mask which are necessary to specify required stop level.
*/
- if (flags[0] & (OPAL_PM_STOP_INST_FAST | OPAL_PM_STOP_INST_DEEP))
+ if (flags[0] & (OPAL_PM_STOP_INST_FAST | OPAL_PM_STOP_INST_DEEP)) {
if (of_property_read_u64_array(power_mgt,
"ibm,cpu-idle-state-psscr", psscr_val, dt_idle_states)) {
- pr_warn("cpuidle-powernv: missing ibm,cpu-idle-states-psscr in DT\n");
+ pr_warn("cpuidle-powernv: missing ibm,cpu-idle-state-psscr in DT\n");
goto out;
}
+ if (of_property_read_u64_array(power_mgt,
+ "ibm,cpu-idle-state-psscr-mask",
+ psscr_mask, dt_idle_states)) {
+ pr_warn("cpuidle-powernv:Missing ibm,cpu-idle-state-psscr-mask in DT\n");
+ goto out;
+ }
+ }
+
rc = of_property_read_u32_array(power_mgt,
"ibm,cpu-idle-state-residency-ns", residency_ns, dt_idle_states);
@@ -284,13 +302,13 @@ static int powernv_add_idle_states(void)
/* Add NAP state */
add_powernv_state(nr_idle_states, "Nap",
CPUIDLE_FLAG_NONE, nap_loop,
- target_residency, exit_latency, 0);
+ target_residency, exit_latency, 0, 0);
} else if ((flags[i] & OPAL_PM_STOP_INST_FAST) &&
!(flags[i] & OPAL_PM_TIMEBASE_STOP)) {
add_powernv_state(nr_idle_states, names[i],
CPUIDLE_FLAG_NONE, stop_loop,
target_residency, exit_latency,
- psscr_val[i]);
+ psscr_val[i], psscr_mask[i]);
}
/*
@@ -305,13 +323,13 @@ static int powernv_add_idle_states(void)
add_powernv_state(nr_idle_states, "FastSleep",
CPUIDLE_FLAG_TIMER_STOP,
fastsleep_loop,
- target_residency, exit_latency, 0);
+ target_residency, exit_latency, 0, 0);
} else if ((flags[i] & OPAL_PM_STOP_INST_DEEP) &&
(flags[i] & OPAL_PM_TIMEBASE_STOP)) {
add_powernv_state(nr_idle_states, names[i],
CPUIDLE_FLAG_TIMER_STOP, stop_loop,
target_residency, exit_latency,
- psscr_val[i]);
+ psscr_val[i], psscr_mask[i]);
}
#endif
nr_idle_states++;
--
1.9.4
From: "Gautham R. Shenoy" <[email protected]>
Currently all the low-power idle states are expected to wake up
at reset vector 0x100. Which is why the macro IDLE_STATE_ENTER_SEQ
that puts the CPU to an idle state and never returns.
On ISA_300, when the ESL and EC bits in the PSSCR are zero, the
CPU is expected to wake up at the next instruction of the idle
instruction.
This patch adds a new macro named IDLE_STATE_ENTER_SEQ_NORET for the
no-return variant and reuses the name IDLE_STATE_ENTER_SEQ
for a variant that allows resuming operation at the instruction next
to the idle-instruction.
Signed-off-by: Gautham R. Shenoy <[email protected]>
---
arch/powerpc/include/asm/cpuidle.h | 5 ++++-
arch/powerpc/kernel/exceptions-64s.S | 6 +++---
arch/powerpc/kernel/idle_book3s.S | 10 +++++-----
3 files changed, 12 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h
index 3919332..0a3255b 100644
--- a/arch/powerpc/include/asm/cpuidle.h
+++ b/arch/powerpc/include/asm/cpuidle.h
@@ -21,7 +21,7 @@
/* Idle state entry routines */
#ifdef CONFIG_PPC_P7_NAP
-#define IDLE_STATE_ENTER_SEQ(IDLE_INST) \
+#define IDLE_STATE_ENTER_SEQ(IDLE_INST) \
/* Magic NAP/SLEEP/WINKLE mode enter sequence */ \
std r0,0(r1); \
ptesync; \
@@ -29,6 +29,9 @@
1: cmpd cr0,r0,r0; \
bne 1b; \
IDLE_INST; \
+
+#define IDLE_STATE_ENTER_SEQ_NORET(IDLE_INST) \
+ IDLE_STATE_ENTER_SEQ(IDLE_INST) \
b .
#endif /* CONFIG_PPC_P7_NAP */
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index f129408..7a71cdf 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -366,12 +366,12 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
lbz r3,PACA_THREAD_IDLE_STATE(r13)
cmpwi r3,PNV_THREAD_NAP
bgt 10f
- IDLE_STATE_ENTER_SEQ(PPC_NAP)
+ IDLE_STATE_ENTER_SEQ_NORET(PPC_NAP)
/* No return */
10:
cmpwi r3,PNV_THREAD_SLEEP
bgt 2f
- IDLE_STATE_ENTER_SEQ(PPC_SLEEP)
+ IDLE_STATE_ENTER_SEQ_NORET(PPC_SLEEP)
/* No return */
2:
@@ -385,7 +385,7 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
*/
ori r13,r13,1
SET_PACA(r13)
- IDLE_STATE_ENTER_SEQ(PPC_WINKLE)
+ IDLE_STATE_ENTER_SEQ_NORET(PPC_WINKLE)
/* No return */
4:
#endif
diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index 72dac0b..be90e2f 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -205,7 +205,7 @@ pnv_enter_arch207_idle_mode:
stb r3,PACA_THREAD_IDLE_STATE(r13)
cmpwi cr3,r3,PNV_THREAD_SLEEP
bge cr3,2f
- IDLE_STATE_ENTER_SEQ(PPC_NAP)
+ IDLE_STATE_ENTER_SEQ_NORET(PPC_NAP)
/* No return */
2:
/* Sleep or winkle */
@@ -239,7 +239,7 @@ pnv_fastsleep_workaround_at_entry:
common_enter: /* common code for all the threads entering sleep or winkle */
bgt cr3,enter_winkle
- IDLE_STATE_ENTER_SEQ(PPC_SLEEP)
+ IDLE_STATE_ENTER_SEQ_NORET(PPC_SLEEP)
fastsleep_workaround_at_entry:
ori r15,r15,PNV_CORE_IDLE_LOCK_BIT
@@ -261,7 +261,7 @@ fastsleep_workaround_at_entry:
enter_winkle:
bl save_sprs_to_stack
- IDLE_STATE_ENTER_SEQ(PPC_WINKLE)
+ IDLE_STATE_ENTER_SEQ_NORET(PPC_WINKLE)
/*
* r3 - requested stop state
@@ -280,7 +280,7 @@ power_enter_stop:
ld r4,ADDROFF(pnv_first_deep_stop_state)(r5)
cmpd r3,r4
bge 2f
- IDLE_STATE_ENTER_SEQ(PPC_STOP)
+ IDLE_STATE_ENTER_SEQ_NORET(PPC_STOP)
2:
/*
* Entering deep idle state.
@@ -302,7 +302,7 @@ lwarx_loop_stop:
bl save_sprs_to_stack
- IDLE_STATE_ENTER_SEQ(PPC_STOP)
+ IDLE_STATE_ENTER_SEQ_NORET(PPC_STOP)
_GLOBAL(power7_idle)
/* Now check if user or arch enabled NAP mode */
--
1.9.4
On Thursday, October 27, 2016 02:05:04 PM Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <[email protected]>
>
> Hi,
>
> This is the second iteration of the patchset to use the psscr_val and
> psscr_mask provided by the firmware for each of the stop states.
>
> The previous version can be found here:
> https://lkml.org/lkml/2016/9/29/45
>
> The main changes in this version are:
>
> 1) Add a helper function in powernv-cpuidle.c to help initialize the
> powernv_states cpuidle table.
>
> 2) Handle the older firmware which populates only the Requested Level
> (RL) fields of the psscr and psscr_mask in the device tree. This
> patchset ensures that in the presence of the older firmware, the
> other fields of PSSCR are initalized to sane default values.
>
>
> Synopsis
> ==========
> In the current implementation, the code for ISA v3.0 stop
> implementation has a couple of shortcomings.
>
> a) The code hand-codes the values for ESL,EC,TR,MTL bits of PSSCR and
> uses only the RL field from the firmware. While this is not
> incorrect, since the hand-coded values are legitimate, it is not a
> very flexible design since the firmware has the capability to
> communicate these values via the "ibm,cpu-idle-state-psscr" and
> "ibm,cpu-idle-state-psscr-mask" properties. In case where the
> firmware provides values for these fields that is different from
> the hand-coded values, the current code will not work as intended.
>
> b) Due to issue a), the current code assumes that ESL=EC=1 for all the
> stop states and hence the wakeup from the stop instruction will
> happen at 0x100, the system-reset vector. However, the ISA v3.0
> allows the ESL=EC=0 behaviour where the corresponding stop-state
> loses no state and wakes up from the subsequent instruction. The
> current code doesn't handle this case.
>
> This patch series addresses these issues.
>
> The first patch in the series renames the existing
> IDLE_STATE_ENTER_SEQ macro to IDLE_STATE_ENTER_SEQ_NORET. It reuses
> the name IDLE_STATE_ENTER_SEQ for entering into stop-states which wake
> up at the subsequent instruction.
>
> The second patch adds a helper function in cpuidle-powernv.c for
> initializing entries of the powernv_states[] table that is passed to
> the cpu-idle core. This eliminates some of the code duplication in the
> function that discovers and initializes the stop states.
>
> The third patch in the series fixes issues a) and b) by ensuring that
> the psscr-value and the psscr-mask provided by the firmware are what
> will be used to set a particular stop state. It also adds support for
> handling wake-up from stop states which were entered with ESL=EC=0.
>
> The third patch also handles the older firmware which sets only the
> Requested Level (RL) field in the psscr and psscr-mask exposed in the
> device tree. In the presence of such older firmware, this patch will
> set the default sane values for for remaining PSSCR fields (i.e PSLL,
> MTL, ESL, EC, and TR).
>
> The skiboot patch populates all the relevant fields in the PSSCR
> values and the mask for all the stop states can be found here:
> https://lists.ozlabs.org/pipermail/skiboot/2016-September/004869.html
>
> The patches are based on top of
> git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git fixes
>
> Gautham R. Shenoy (3):
> powernv:idle: Add IDLE_STATE_ENTER_SEQ_NORET macro
> cpuidle:powernv: Add helper function to populate powernv idle states.
> powernv: Pass PSSCR value and mask to power9_idle_stop
OK
I need someone to review this for me.
Thanks,
Rafael
On Thu, Oct 27, 2016 at 7:35 PM, Gautham R. Shenoy
<[email protected]> wrote:
> From: "Gautham R. Shenoy" <[email protected]>
>
> In the current code for powernv_add_idle_states, there is a lot of code
> duplication while initializing an idle state in powernv_states table.
>
> Add an inline helper function to populate the powernv_states[] table for
> a given idle state. Invoke this for populating the "Nap", "Fastsleep"
> and the stop states in powernv_add_idle_states.
>
> Signed-off-by: Gautham R. Shenoy <[email protected]>
> ---
> drivers/cpuidle/cpuidle-powernv.c | 82 +++++++++++++++++++++++----------------
> include/linux/cpuidle.h | 1 +
> 2 files changed, 49 insertions(+), 34 deletions(-)
>
> diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
> index 7fe442c..11b22b9 100644
> --- a/drivers/cpuidle/cpuidle-powernv.c
> +++ b/drivers/cpuidle/cpuidle-powernv.c
> @@ -167,6 +167,28 @@ static int powernv_cpuidle_driver_init(void)
> return 0;
> }
>
> +static inline void add_powernv_state(int index, const char *name,
> + unsigned int flags,
> + int (*idle_fn)(struct cpuidle_device *,
> + struct cpuidle_driver *,
> + int),
> + unsigned int target_residency,
> + unsigned int exit_latency,
> + u64 psscr_val)
> +{
> + strncpy(powernv_states[index].name, name, CPUIDLE_NAME_LEN);
> + strncpy(powernv_states[index].desc, name, CPUIDLE_NAME_LEN);
If the supplied name is equal to CPUIDLE_NAME_LEN then strncpy() won't
terminate the string. The least annoying fix is to memset() the whole
structure to zero and set n to CPUIDLE_NAME_LEN - 1.
> + powernv_states[index].flags = flags;
> + powernv_states[index].target_residency = target_residency;
> + powernv_states[index].exit_latency = exit_latency;
> + powernv_states[index].enter = idle_fn;
> +
> + if (idle_fn != stop_loop)
> + return;
This can probably be deleted. The nap/sleep loops don't look at the
psscr setting and you've passed in a dummy value of zero anyway.
> +
> + stop_psscr_table[index] = psscr_val;
> +}
> +
> static int powernv_add_idle_states(void)
> {
> struct device_node *power_mgt;
> @@ -236,6 +258,7 @@ static int powernv_add_idle_states(void)
> "ibm,cpu-idle-state-residency-ns", residency_ns, dt_idle_states);
>
> for (i = 0; i < dt_idle_states; i++) {
> + unsigned int exit_latency, target_residency;
> /*
> * If an idle state has exit latency beyond
> * POWERNV_THRESHOLD_LATENCY_NS then don't use it
> @@ -244,27 +267,30 @@ static int powernv_add_idle_states(void)
> if (latency_ns[i] > POWERNV_THRESHOLD_LATENCY_NS)
> continue;
>
> + exit_latency = ((unsigned int)latency_ns[i]) / 1000;
> + target_residency = (!rc) ? ((unsigned int)residency_ns[i]) : 0;
> + /*
> + * Firmware passes residency values in ns.
> + * cpuidle expects it in us.
> + */
> + target_residency /= 1000;
> +
> /*
> * Cpuidle accepts exit_latency and target_residency in us.
This comment says the same thing as the one above.
> * Use default target_residency values if f/w does not expose it.
> */
> if (flags[i] & OPAL_PM_NAP_ENABLED) {
> + target_residency = 100;
Hmm, the above comment says that we should only use the default value
for target_residency if firmware hasn't provided a value. Is there a
reason we always use the hard coded value?
> /* Add NAP state */
> - strcpy(powernv_states[nr_idle_states].name, "Nap");
> - strcpy(powernv_states[nr_idle_states].desc, "Nap");
> - powernv_states[nr_idle_states].flags = 0;
> - powernv_states[nr_idle_states].target_residency = 100;
> - powernv_states[nr_idle_states].enter = nap_loop;
> + add_powernv_state(nr_idle_states, "Nap",
> + CPUIDLE_FLAG_NONE, nap_loop,
> + target_residency, exit_latency, 0);
> } else if ((flags[i] & OPAL_PM_STOP_INST_FAST) &&
> !(flags[i] & OPAL_PM_TIMEBASE_STOP)) {
> - strncpy(powernv_states[nr_idle_states].name,
> - names[i], CPUIDLE_NAME_LEN);
> - strncpy(powernv_states[nr_idle_states].desc,
> - names[i], CPUIDLE_NAME_LEN);
> - powernv_states[nr_idle_states].flags = 0;
> -
> - powernv_states[nr_idle_states].enter = stop_loop;
> - stop_psscr_table[nr_idle_states] = psscr_val[i];
> + add_powernv_state(nr_idle_states, names[i],
> + CPUIDLE_FLAG_NONE, stop_loop,
> + target_residency, exit_latency,
> + psscr_val[i]);
> }
>
> /*
> @@ -274,32 +300,20 @@ static int powernv_add_idle_states(void)
> #ifdef CONFIG_TICK_ONESHOT
> if (flags[i] & OPAL_PM_SLEEP_ENABLED ||
> flags[i] & OPAL_PM_SLEEP_ENABLED_ER1) {
> + target_residency = 300000;
Same comment as above.
> /* Add FASTSLEEP state */
> - strcpy(powernv_states[nr_idle_states].name, "FastSleep");
> - strcpy(powernv_states[nr_idle_states].desc, "FastSleep");
> - powernv_states[nr_idle_states].flags = CPUIDLE_FLAG_TIMER_STOP;
> - powernv_states[nr_idle_states].target_residency = 300000;
> - powernv_states[nr_idle_states].enter = fastsleep_loop;
> + add_powernv_state(nr_idle_states, "FastSleep",
> + CPUIDLE_FLAG_TIMER_STOP,
> + fastsleep_loop,
> + target_residency, exit_latency, 0);
> } else if ((flags[i] & OPAL_PM_STOP_INST_DEEP) &&
> (flags[i] & OPAL_PM_TIMEBASE_STOP)) {
> - strncpy(powernv_states[nr_idle_states].name,
> - names[i], CPUIDLE_NAME_LEN);
> - strncpy(powernv_states[nr_idle_states].desc,
> - names[i], CPUIDLE_NAME_LEN);
> -
> - powernv_states[nr_idle_states].flags = CPUIDLE_FLAG_TIMER_STOP;
> - powernv_states[nr_idle_states].enter = stop_loop;
> - stop_psscr_table[nr_idle_states] = psscr_val[i];
> + add_powernv_state(nr_idle_states, names[i],
> + CPUIDLE_FLAG_TIMER_STOP, stop_loop,
> + target_residency, exit_latency,
> + psscr_val[i]);
> }
> #endif
> - powernv_states[nr_idle_states].exit_latency =
> - ((unsigned int)latency_ns[i]) / 1000;
> -
> - if (!rc) {
> - powernv_states[nr_idle_states].target_residency =
> - ((unsigned int)residency_ns[i]) / 1000;
> - }
> -
> nr_idle_states++;
> }
> out:
> diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
> index bb31373..c4e10f8 100644
> --- a/include/linux/cpuidle.h
> +++ b/include/linux/cpuidle.h
> @@ -62,6 +62,7 @@ struct cpuidle_state {
> };
>
> /* Idle State Flags */
> +#define CPUIDLE_FLAG_NONE (0x00)
> #define CPUIDLE_FLAG_COUPLED (0x02) /* state applies to multiple cpus */
> #define CPUIDLE_FLAG_TIMER_STOP (0x04) /* timer is stopped on this state */
>
> --
> 1.9.4
>
Looks good otherwise.
Reviewed-by: Oliver O'Halloran <[email protected]>
On Tue, Nov 01, 2016 at 07:32:58PM +1100, Oliver O'Halloran wrote:
> On Thu, Oct 27, 2016 at 7:35 PM, Gautham R. Shenoy
> <[email protected]> wrote:
> > From: "Gautham R. Shenoy" <[email protected]>
> >
> > In the current code for powernv_add_idle_states, there is a lot of code
> > duplication while initializing an idle state in powernv_states table.
> >
> > Add an inline helper function to populate the powernv_states[] table for
> > a given idle state. Invoke this for populating the "Nap", "Fastsleep"
> > and the stop states in powernv_add_idle_states.
> >
> > Signed-off-by: Gautham R. Shenoy <[email protected]>
> > ---
> > drivers/cpuidle/cpuidle-powernv.c | 82 +++++++++++++++++++++++----------------
> > include/linux/cpuidle.h | 1 +
> > 2 files changed, 49 insertions(+), 34 deletions(-)
> >
> > diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
> > index 7fe442c..11b22b9 100644
> > --- a/drivers/cpuidle/cpuidle-powernv.c
> > +++ b/drivers/cpuidle/cpuidle-powernv.c
> > @@ -167,6 +167,28 @@ static int powernv_cpuidle_driver_init(void)
> > return 0;
> > }
> >
> > +static inline void add_powernv_state(int index, const char *name,
> > + unsigned int flags,
> > + int (*idle_fn)(struct cpuidle_device *,
> > + struct cpuidle_driver *,
> > + int),
> > + unsigned int target_residency,
> > + unsigned int exit_latency,
> > + u64 psscr_val)
> > +{
> > + strncpy(powernv_states[index].name, name, CPUIDLE_NAME_LEN);
> > + strncpy(powernv_states[index].desc, name, CPUIDLE_NAME_LEN);
>
> If the supplied name is equal to CPUIDLE_NAME_LEN then strncpy() won't
> terminate the string. The least annoying fix is to memset() the whole
> structure to zero and set n to CPUIDLE_NAME_LEN - 1.
Or he could use strlcpy() instead of strncpy().
Paul.
Hi Oliver,
Thanks for reviewing the patch!
On Tue, Nov 01, 2016 at 07:32:58PM +1100, Oliver O'Halloran wrote:
> On Thu, Oct 27, 2016 at 7:35 PM, Gautham R. Shenoy
> <[email protected]> wrote:
> > From: "Gautham R. Shenoy" <[email protected]>
> >
> > In the current code for powernv_add_idle_states, there is a lot of code
> > duplication while initializing an idle state in powernv_states table.
> >
> > Add an inline helper function to populate the powernv_states[] table for
> > a given idle state. Invoke this for populating the "Nap", "Fastsleep"
> > and the stop states in powernv_add_idle_states.
> >
> > Signed-off-by: Gautham R. Shenoy <[email protected]>
> > ---
> > drivers/cpuidle/cpuidle-powernv.c | 82 +++++++++++++++++++++++----------------
> > include/linux/cpuidle.h | 1 +
> > 2 files changed, 49 insertions(+), 34 deletions(-)
> >
> > diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
> > index 7fe442c..11b22b9 100644
> > --- a/drivers/cpuidle/cpuidle-powernv.c
> > +++ b/drivers/cpuidle/cpuidle-powernv.c
> > @@ -167,6 +167,28 @@ static int powernv_cpuidle_driver_init(void)
> > return 0;
> > }
> >
> > +static inline void add_powernv_state(int index, const char *name,
> > + unsigned int flags,
> > + int (*idle_fn)(struct cpuidle_device *,
> > + struct cpuidle_driver *,
> > + int),
> > + unsigned int target_residency,
> > + unsigned int exit_latency,
> > + u64 psscr_val)
> > +{
> > + strncpy(powernv_states[index].name, name, CPUIDLE_NAME_LEN);
> > + strncpy(powernv_states[index].desc, name, CPUIDLE_NAME_LEN);
>
> If the supplied name is equal to CPUIDLE_NAME_LEN then strncpy() won't
> terminate the string. The least annoying fix is to memset() the whole
> structure to zero and set n to CPUIDLE_NAME_LEN - 1.
I will change this to use strlcpy as Paul suggested in the reply.
>
> > + powernv_states[index].flags = flags;
> > + powernv_states[index].target_residency = target_residency;
> > + powernv_states[index].exit_latency = exit_latency;
> > + powernv_states[index].enter = idle_fn;
> > +
> > + if (idle_fn != stop_loop)
> > + return;
>
> This can probably be deleted. The nap/sleep loops don't look at the
> psscr setting and you've passed in a dummy value of zero anyway.
The subsequent patch adds the missing bits to the psscr_val if we are
running on the older firmware. But in any case, as you rightly pointed
out we don't use psscr_table in nap/sleep loops. So this can go.
>
> > +
> > + stop_psscr_table[index] = psscr_val;
> > +}
> > +
> > static int powernv_add_idle_states(void)
> > {
> > struct device_node *power_mgt;
> > @@ -236,6 +258,7 @@ static int powernv_add_idle_states(void)
> > "ibm,cpu-idle-state-residency-ns", residency_ns, dt_idle_states);
> >
> > for (i = 0; i < dt_idle_states; i++) {
> > + unsigned int exit_latency, target_residency;
> > /*
> > * If an idle state has exit latency beyond
> > * POWERNV_THRESHOLD_LATENCY_NS then don't use it
> > @@ -244,27 +267,30 @@ static int powernv_add_idle_states(void)
> > if (latency_ns[i] > POWERNV_THRESHOLD_LATENCY_NS)
> > continue;
> >
> > + exit_latency = ((unsigned int)latency_ns[i]) / 1000;
> > + target_residency = (!rc) ? ((unsigned int)residency_ns[i]) : 0;
> > + /*
> > + * Firmware passes residency values in ns.
> > + * cpuidle expects it in us.
> > + */
> > + target_residency /= 1000;
> > +
> > /*
> > * Cpuidle accepts exit_latency and target_residency in us.
>
> This comment says the same thing as the one above.
Will clean this up to reflect the state of the code.
>
> > * Use default target_residency values if f/w does not expose it.
> > */
> > if (flags[i] & OPAL_PM_NAP_ENABLED) {
> > + target_residency = 100;
>
> Hmm, the above comment says that we should only use the default value
> for target_residency if firmware hasn't provided a value. Is there a
> reason we always use the hard coded value?
Ah, good catch! I should be setting target_residency to the default
value only if it is not set by the firmware. Will fix this.
>
> > /* Add NAP state */
> > - strcpy(powernv_states[nr_idle_states].name, "Nap");
> > - strcpy(powernv_states[nr_idle_states].desc, "Nap");
> > - powernv_states[nr_idle_states].flags = 0;
> > - powernv_states[nr_idle_states].target_residency = 100;
> > - powernv_states[nr_idle_states].enter = nap_loop;
> > + add_powernv_state(nr_idle_states, "Nap",
> > + CPUIDLE_FLAG_NONE, nap_loop,
> > + target_residency, exit_latency, 0);
> > } else if ((flags[i] & OPAL_PM_STOP_INST_FAST) &&
> > !(flags[i] & OPAL_PM_TIMEBASE_STOP)) {
> > - strncpy(powernv_states[nr_idle_states].name,
> > - names[i], CPUIDLE_NAME_LEN);
> > - strncpy(powernv_states[nr_idle_states].desc,
> > - names[i], CPUIDLE_NAME_LEN);
> > - powernv_states[nr_idle_states].flags = 0;
> > -
> > - powernv_states[nr_idle_states].enter = stop_loop;
> > - stop_psscr_table[nr_idle_states] = psscr_val[i];
> > + add_powernv_state(nr_idle_states, names[i],
> > + CPUIDLE_FLAG_NONE, stop_loop,
> > + target_residency, exit_latency,
> > + psscr_val[i]);
> > }
> >
> > /*
> > @@ -274,32 +300,20 @@ static int powernv_add_idle_states(void)
> > #ifdef CONFIG_TICK_ONESHOT
> > if (flags[i] & OPAL_PM_SLEEP_ENABLED ||
> > flags[i] & OPAL_PM_SLEEP_ENABLED_ER1) {
> > + target_residency = 300000;
>
> Same comment as above.
>
> > /* Add FASTSLEEP state */
> > - strcpy(powernv_states[nr_idle_states].name, "FastSleep");
> > - strcpy(powernv_states[nr_idle_states].desc, "FastSleep");
> > - powernv_states[nr_idle_states].flags = CPUIDLE_FLAG_TIMER_STOP;
> > - powernv_states[nr_idle_states].target_residency = 300000;
> > - powernv_states[nr_idle_states].enter = fastsleep_loop;
> > + add_powernv_state(nr_idle_states, "FastSleep",
> > + CPUIDLE_FLAG_TIMER_STOP,
> > + fastsleep_loop,
> > + target_residency, exit_latency, 0);
> > } else if ((flags[i] & OPAL_PM_STOP_INST_DEEP) &&
> > (flags[i] & OPAL_PM_TIMEBASE_STOP)) {
> > - strncpy(powernv_states[nr_idle_states].name,
> > - names[i], CPUIDLE_NAME_LEN);
> > - strncpy(powernv_states[nr_idle_states].desc,
> > - names[i], CPUIDLE_NAME_LEN);
> > -
> > - powernv_states[nr_idle_states].flags = CPUIDLE_FLAG_TIMER_STOP;
> > - powernv_states[nr_idle_states].enter = stop_loop;
> > - stop_psscr_table[nr_idle_states] = psscr_val[i];
> > + add_powernv_state(nr_idle_states, names[i],
> > + CPUIDLE_FLAG_TIMER_STOP, stop_loop,
> > + target_residency, exit_latency,
> > + psscr_val[i]);
> > }
> > #endif
> > - powernv_states[nr_idle_states].exit_latency =
> > - ((unsigned int)latency_ns[i]) / 1000;
> > -
> > - if (!rc) {
> > - powernv_states[nr_idle_states].target_residency =
> > - ((unsigned int)residency_ns[i]) / 1000;
> > - }
> > -
> > nr_idle_states++;
> > }
> > out:
> > diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
> > index bb31373..c4e10f8 100644
> > --- a/include/linux/cpuidle.h
> > +++ b/include/linux/cpuidle.h
> > @@ -62,6 +62,7 @@ struct cpuidle_state {
> > };
> >
> > /* Idle State Flags */
> > +#define CPUIDLE_FLAG_NONE (0x00)
> > #define CPUIDLE_FLAG_COUPLED (0x02) /* state applies to multiple cpus */
> > #define CPUIDLE_FLAG_TIMER_STOP (0x04) /* timer is stopped on this state */
> >
> > --
> > 1.9.4
> >
>
> Looks good otherwise.
>
> Reviewed-by: Oliver O'Halloran <[email protected]>
Thanks again!
>
--
Regards
gautham.