2020-07-17 04:11:26

by Ravi Bangoria

Subject: [PATCH v4 00/10] powerpc/watchpoint: Enable 2nd DAWR on baremetal and powervm

The previous series[1] added basic infrastructure support for more than
one watchpoint on Book3S powerpc. This series actually enables the 2nd
DAWR for baremetal and powervm. KVM guests are still not supported.

v3: https://lore.kernel.org/lkml/[email protected]

v3->v4:
- v3 patch #2 is split into two v4 patches: #2 and #3
- A few other minor nits suggested by Jordan Niethe
- Rebased to powerpc/next

[1]: https://lore.kernel.org/linuxppc-dev/[email protected]/

Ravi Bangoria (10):
powerpc/watchpoint: Fix 512 byte boundary limit
powerpc/watchpoint: Fix DAWR exception constraint
powerpc/watchpoint: Fix DAWR exception for CACHEOP
powerpc/watchpoint: Enable watchpoint functionality on power10 guest
powerpc/dt_cpu_ftrs: Add feature for 2nd DAWR
powerpc/watchpoint: Set CPU_FTR_DAWR1 based on pa-features bit
powerpc/watchpoint: Rename current H_SET_MODE DAWR macro
powerpc/watchpoint: Guest support for 2nd DAWR hcall
powerpc/watchpoint: Return available watchpoints dynamically
powerpc/watchpoint: Remove 512 byte boundary

arch/powerpc/include/asm/cputable.h | 13 ++-
arch/powerpc/include/asm/hvcall.h | 3 +-
arch/powerpc/include/asm/hw_breakpoint.h | 5 +-
arch/powerpc/include/asm/machdep.h | 2 +-
arch/powerpc/include/asm/plpar_wrappers.h | 7 +-
arch/powerpc/kernel/dawr.c | 2 +-
arch/powerpc/kernel/dt_cpu_ftrs.c | 7 ++
arch/powerpc/kernel/hw_breakpoint.c | 98 +++++++++++++++--------
arch/powerpc/kernel/prom.c | 2 +
arch/powerpc/kvm/book3s_hv.c | 2 +-
arch/powerpc/platforms/pseries/setup.c | 7 +-
11 files changed, 101 insertions(+), 47 deletions(-)

--
2.26.2


2020-07-17 04:11:45

by Ravi Bangoria

Subject: [PATCH v4 01/10] powerpc/watchpoint: Fix 512 byte boundary limit

Milton Miller reported that we are aligning the start and end addresses
to the wrong size, SZ_512M. It should be SZ_512. Fix that.

While making this change I also found a case where the ALIGN()
comparison fails. Within a given aligned block, ALIGN() of two addresses
does not match when the start address points to the first byte of the
block and the end address points to any other byte in it. That is not
true for ALIGN_DOWN(): ALIGN_DOWN() of any two addresses within that
block always points to its first byte. So use ALIGN_DOWN() instead of
ALIGN().
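
To illustrate the failing comparison, here is a minimal userspace sketch
(not kernel code; the ALIGN()/ALIGN_DOWN() macros are re-derived here only
for demonstration, and 512/520 are arbitrary example addresses for a range
that stays inside one 512-byte block):

#include <stdio.h>

#define ALIGN_DOWN(x, a)	((x) & ~((unsigned long)(a) - 1))
#define ALIGN(x, a)		(((x) + (a) - 1) & ~((unsigned long)(a) - 1))

int main(void)
{
	unsigned long start_addr = 512, end_addr = 520;

	/* ALIGN(): 512 vs 1024 -> false mismatch, a valid range is rejected */
	printf("ALIGN:      %lu %lu\n",
	       ALIGN(start_addr, 512), ALIGN(end_addr - 1, 512));

	/* ALIGN_DOWN(): 512 vs 512 -> both map to the block's first byte */
	printf("ALIGN_DOWN: %lu %lu\n",
	       ALIGN_DOWN(start_addr, 512), ALIGN_DOWN(end_addr - 1, 512));
	return 0;
}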

Fixes: e68ef121c1f4 ("powerpc/watchpoint: Use builtin ALIGN*() macros")
Reported-by: Milton Miller <[email protected]>
Signed-off-by: Ravi Bangoria <[email protected]>
Tested-by: Jordan Niethe <[email protected]>
---
arch/powerpc/kernel/hw_breakpoint.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index 0000daf0e1da..031e6defc08e 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -419,7 +419,7 @@ static int hw_breakpoint_validate_len(struct arch_hw_breakpoint *hw)
if (dawr_enabled()) {
max_len = DAWR_MAX_LEN;
/* DAWR region can't cross 512 bytes boundary */
- if (ALIGN(start_addr, SZ_512M) != ALIGN(end_addr - 1, SZ_512M))
+ if (ALIGN_DOWN(start_addr, SZ_512) != ALIGN_DOWN(end_addr - 1, SZ_512))
return -EINVAL;
} else if (IS_ENABLED(CONFIG_PPC_8xx)) {
/* 8xx can setup a range without limitation */
--
2.26.2

2020-07-17 04:12:00

by Ravi Bangoria

Subject: [PATCH v4 04/10] powerpc/watchpoint: Enable watchpoint functionality on power10 guest

CPU_FTR_DAWR is enabled by default for the host via CPU_FTRS_DT_CPU_BASE
(controlled by CONFIG_PPC_DT_CPU_FTRS). But the cpu-features device-tree
node is not PAPR compatible and thus not yet used by kvm or pHyp
guests. Enable watchpoint functionality on power10 guests (both kvm
and powervm) by adding CPU_FTR_DAWR to CPU_FTRS_POWER10. Note that
this change does not enable 2nd DAWR support.

Signed-off-by: Ravi Bangoria <[email protected]>
---
arch/powerpc/include/asm/cputable.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
index bac2252c839e..e506d429b1af 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -478,7 +478,7 @@ static inline void cpu_feature_keys_init(void) { }
CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_ARCH_207S | \
CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | CPU_FTR_PKEY | \
- CPU_FTR_ARCH_31)
+ CPU_FTR_ARCH_31 | CPU_FTR_DAWR)
#define CPU_FTRS_CELL (CPU_FTR_LWSYNC | \
CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \
--
2.26.2

2020-07-17 04:12:03

by Ravi Bangoria

Subject: [PATCH v4 05/10] powerpc/dt_cpu_ftrs: Add feature for 2nd DAWR

Add a new device-tree feature for the 2nd DAWR. If this feature is
present, the 2nd DAWR is supported; otherwise it is not.

Signed-off-by: Ravi Bangoria <[email protected]>
---
arch/powerpc/include/asm/cputable.h | 7 +++++--
arch/powerpc/kernel/dt_cpu_ftrs.c | 7 +++++++
2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
index e506d429b1af..3445c86e1f6f 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -214,6 +214,7 @@ static inline void cpu_feature_keys_init(void) { }
#define CPU_FTR_P9_TLBIE_ERAT_BUG LONG_ASM_CONST(0x0001000000000000)
#define CPU_FTR_P9_RADIX_PREFETCH_BUG LONG_ASM_CONST(0x0002000000000000)
#define CPU_FTR_ARCH_31 LONG_ASM_CONST(0x0004000000000000)
+#define CPU_FTR_DAWR1 LONG_ASM_CONST(0x0008000000000000)

#ifndef __ASSEMBLY__

@@ -497,14 +498,16 @@ static inline void cpu_feature_keys_init(void) { }
#define CPU_FTRS_POSSIBLE \
(CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | CPU_FTRS_POWER8 | \
CPU_FTR_ALTIVEC_COMP | CPU_FTR_VSX_COMP | CPU_FTRS_POWER9 | \
- CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
+ CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
+ CPU_FTR_DAWR1)
#else
#define CPU_FTRS_POSSIBLE \
(CPU_FTRS_PPC970 | CPU_FTRS_POWER5 | \
CPU_FTRS_POWER6 | CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | \
CPU_FTRS_POWER8 | CPU_FTRS_CELL | CPU_FTRS_PA6T | \
CPU_FTR_VSX_COMP | CPU_FTR_ALTIVEC_COMP | CPU_FTRS_POWER9 | \
- CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
+ CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
+ CPU_FTR_DAWR1)
#endif /* CONFIG_CPU_LITTLE_ENDIAN */
#endif
#else
diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
index ac650c233cd9..c78cd3596ec4 100644
--- a/arch/powerpc/kernel/dt_cpu_ftrs.c
+++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
@@ -574,6 +574,12 @@ static int __init feat_enable_mma(struct dt_cpu_feature *f)
return 1;
}

+static int __init feat_enable_debug_facilities_v31(struct dt_cpu_feature *f)
+{
+ cur_cpu_spec->cpu_features |= CPU_FTR_DAWR1;
+ return 1;
+}
+
struct dt_cpu_feature_match {
const char *name;
int (*enable)(struct dt_cpu_feature *f);
@@ -649,6 +655,7 @@ static struct dt_cpu_feature_match __initdata
{"wait-v3", feat_enable, 0},
{"prefix-instructions", feat_enable, 0},
{"matrix-multiply-assist", feat_enable_mma, 0},
+ {"debug-facilities-v31", feat_enable_debug_facilities_v31, 0},
};

static bool __initdata using_dt_cpu_ftrs;
--
2.26.2

2020-07-17 04:12:47

by Ravi Bangoria

Subject: [PATCH v4 02/10] powerpc/watchpoint: Fix DAWR exception constraint

Pedro Miraglia Franco de Carvalho noticed that on p8/p9, the DAR value
is inconsistent across different types of load/store. For byte, word,
etc. load/stores, DAR is set to the address of the first byte of
overlap between the watch range and the real access. But for quadword
load/stores it is sometimes set to the address of the first byte of the
real access and sometimes to the address of the first byte of overlap.
This issue has been fixed in p10: in p10 (ISA 3.1), DAR is always set
to the address of the first byte of overlap. Commit 27985b2a640e
("powerpc/watchpoint: Don't ignore extraneous exceptions blindly")
wrongly assumes that DAR is set to the address of the first byte of
overlap for all load/stores on p8/p9 as well. Fix that. With the fix,
we now rely on the 'ea' provided by analyse_instr(). If analyse_instr()
fails, generate the event unconditionally on p8/p9; on p10, generate
the event only if DAR is within a DAWR range.

Note: 8xx is not affected.
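
To make the new behaviour concrete, here is a condensed, self-contained
model of the decision taken when the instruction cannot be analysed
(illustrative only, not the kernel code; the two booleans stand in for
cpu_has_feature(CPU_FTR_ARCH_31) and dar_in_hw_range()):

#include <stdbool.h>
#include <stdio.h>

/* instr could not be read/analysed: decide whether to report the event */
static bool report_when_instr_unknown(bool is_p10, bool dar_in_hw_range)
{
	if (is_p10 && !dar_in_hw_range)
		return false;	/* p10: DAR is reliable, so trust it */
	return true;		/* p8/p9: DAR may be inconsistent, report anyway */
}

int main(void)
{
	/* prints 1: p8/p9 report unconditionally */
	printf("%d\n", report_when_instr_unknown(false, false));
	/* prints 0: p10 reports only if DAR falls within a DAWR range */
	printf("%d\n", report_when_instr_unknown(true, false));
	return 0;
}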

Fixes: 27985b2a640e ("powerpc/watchpoint: Don't ignore extraneous exceptions blindly")
Fixes: 74c6881019b7 ("powerpc/watchpoint: Prepare handler to handle more than one watchpoint")
Reported-by: Pedro Miraglia Franco de Carvalho <[email protected]>
Signed-off-by: Ravi Bangoria <[email protected]>
---
arch/powerpc/kernel/hw_breakpoint.c | 72 ++++++++++++++++-------------
1 file changed, 41 insertions(+), 31 deletions(-)

diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index 031e6defc08e..a971e22aea81 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -498,11 +498,11 @@ static bool dar_in_user_range(unsigned long dar, struct arch_hw_breakpoint *info
return ((info->address <= dar) && (dar - info->address < info->len));
}

-static bool dar_user_range_overlaps(unsigned long dar, int size,
- struct arch_hw_breakpoint *info)
+static bool ea_user_range_overlaps(unsigned long ea, int size,
+ struct arch_hw_breakpoint *info)
{
- return ((dar < info->address + info->len) &&
- (dar + size > info->address));
+ return ((ea < info->address + info->len) &&
+ (ea + size > info->address));
}

static bool dar_in_hw_range(unsigned long dar, struct arch_hw_breakpoint *info)
@@ -515,20 +515,22 @@ static bool dar_in_hw_range(unsigned long dar, struct arch_hw_breakpoint *info)
return ((hw_start_addr <= dar) && (hw_end_addr > dar));
}

-static bool dar_hw_range_overlaps(unsigned long dar, int size,
- struct arch_hw_breakpoint *info)
+static bool ea_hw_range_overlaps(unsigned long ea, int size,
+ struct arch_hw_breakpoint *info)
{
unsigned long hw_start_addr, hw_end_addr;

hw_start_addr = ALIGN_DOWN(info->address, HW_BREAKPOINT_SIZE);
hw_end_addr = ALIGN(info->address + info->len, HW_BREAKPOINT_SIZE);

- return ((dar < hw_end_addr) && (dar + size > hw_start_addr));
+ return ((ea < hw_end_addr) && (ea + size > hw_start_addr));
}

/*
* If hw has multiple DAWR registers, we also need to check all
* dawrx constraint bits to confirm this is _really_ a valid event.
+ * If type is UNKNOWN, but privilege level matches, consider it as
+ * a positive match.
*/
static bool check_dawrx_constraints(struct pt_regs *regs, int type,
struct arch_hw_breakpoint *info)
@@ -553,7 +555,8 @@ static bool check_dawrx_constraints(struct pt_regs *regs, int type,
* including extraneous exception. Otherwise return false.
*/
static bool check_constraints(struct pt_regs *regs, struct ppc_inst instr,
- int type, int size, struct arch_hw_breakpoint *info)
+ unsigned long ea, int type, int size,
+ struct arch_hw_breakpoint *info)
{
bool in_user_range = dar_in_user_range(regs->dar, info);
bool dawrx_constraints;
@@ -569,22 +572,27 @@ static bool check_constraints(struct pt_regs *regs, struct ppc_inst instr,
}

if (unlikely(ppc_inst_equal(instr, ppc_inst(0)))) {
- if (in_user_range)
- return true;
+ if (cpu_has_feature(CPU_FTR_ARCH_31) &&
+ !dar_in_hw_range(regs->dar, info))
+ return false;

- if (dar_in_hw_range(regs->dar, info)) {
- info->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ;
- return true;
- }
- return false;
+ return true;
}

dawrx_constraints = check_dawrx_constraints(regs, type, info);

- if (dar_user_range_overlaps(regs->dar, size, info))
+ if (type == UNKNOWN) {
+ if (cpu_has_feature(CPU_FTR_ARCH_31) &&
+ !dar_in_hw_range(regs->dar, info))
+ return false;
+
return dawrx_constraints;
+ }

- if (dar_hw_range_overlaps(regs->dar, size, info)) {
+ if (ea_user_range_overlaps(ea, size, info))
+ return dawrx_constraints;
+
+ if (ea_hw_range_overlaps(ea, size, info)) {
if (dawrx_constraints) {
info->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ;
return true;
@@ -594,7 +602,7 @@ static bool check_constraints(struct pt_regs *regs, struct ppc_inst instr,
}

static void get_instr_detail(struct pt_regs *regs, struct ppc_inst *instr,
- int *type, int *size, bool *larx_stcx)
+ int *type, int *size, unsigned long *ea)
{
struct instruction_op op;

@@ -602,16 +610,18 @@ static void get_instr_detail(struct pt_regs *regs, struct ppc_inst *instr,
return;

analyse_instr(&op, regs, *instr);
-
- /*
- * Set size = 8 if analyse_instr() fails. If it's a userspace
- * watchpoint(valid or extraneous), we can notify user about it.
- * If it's a kernel watchpoint, instruction emulation will fail
- * in stepping_handler() and watchpoint will be disabled.
- */
*type = GETTYPE(op.type);
- *size = !(*type == UNKNOWN) ? GETSIZE(op.type) : 8;
- *larx_stcx = (*type == LARX || *type == STCX);
+ *ea = op.ea;
+#ifdef __powerpc64__
+ if (!(regs->msr & MSR_64BIT))
+ *ea &= 0xffffffffUL;
+#endif
+ *size = GETSIZE(op.type);
+}
+
+static bool is_larx_stcx_instr(int type)
+{
+ return type == LARX || type == STCX;
}

/*
@@ -678,7 +688,7 @@ int hw_breakpoint_handler(struct die_args *args)
struct ppc_inst instr = ppc_inst(0);
int type = 0;
int size = 0;
- bool larx_stcx = false;
+ unsigned long ea;

/* Disable breakpoints during exception handling */
hw_breakpoint_disable();
@@ -692,7 +702,7 @@ int hw_breakpoint_handler(struct die_args *args)
rcu_read_lock();

if (!IS_ENABLED(CONFIG_PPC_8xx))
- get_instr_detail(regs, &instr, &type, &size, &larx_stcx);
+ get_instr_detail(regs, &instr, &type, &size, &ea);

for (i = 0; i < nr_wp_slots(); i++) {
bp[i] = __this_cpu_read(bp_per_reg[i]);
@@ -702,7 +712,7 @@ int hw_breakpoint_handler(struct die_args *args)
info[i] = counter_arch_bp(bp[i]);
info[i]->type &= ~HW_BRK_TYPE_EXTRANEOUS_IRQ;

- if (check_constraints(regs, instr, type, size, info[i])) {
+ if (check_constraints(regs, instr, ea, type, size, info[i])) {
if (!IS_ENABLED(CONFIG_PPC_8xx) &&
ppc_inst_equal(instr, ppc_inst(0))) {
handler_error(bp[i], info[i]);
@@ -744,7 +754,7 @@ int hw_breakpoint_handler(struct die_args *args)
}

if (!IS_ENABLED(CONFIG_PPC_8xx)) {
- if (larx_stcx) {
+ if (is_larx_stcx_instr(type)) {
for (i = 0; i < nr_wp_slots(); i++) {
if (!hit[i])
continue;
--
2.26.2

2020-07-17 04:13:24

by Ravi Bangoria

Subject: [PATCH v4 09/10] powerpc/watchpoint: Return available watchpoints dynamically

So far, Book3S powerpc has supported only one watchpoint. Power10
introduces a 2nd DAWR. Enable 2nd DAWR support for Power10.
Availability of the 2nd DAWR depends on CPU_FTR_DAWR1.

Signed-off-by: Ravi Bangoria <[email protected]>
---
arch/powerpc/include/asm/cputable.h | 4 +++-
arch/powerpc/include/asm/hw_breakpoint.h | 5 +++--
2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
index 3445c86e1f6f..36a0851a7a9b 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -633,7 +633,9 @@ enum {
* Maximum number of hw breakpoint supported on powerpc. Number of
* breakpoints supported by actual hw might be less than this.
*/
-#define HBP_NUM_MAX 1
+#define HBP_NUM_MAX 2
+#define HBP_NUM_ONE 1
+#define HBP_NUM_TWO 2

#endif /* !__ASSEMBLY__ */

diff --git a/arch/powerpc/include/asm/hw_breakpoint.h b/arch/powerpc/include/asm/hw_breakpoint.h
index cb424799da0d..d4eab1694bcd 100644
--- a/arch/powerpc/include/asm/hw_breakpoint.h
+++ b/arch/powerpc/include/asm/hw_breakpoint.h
@@ -5,10 +5,11 @@
* Copyright 2010, IBM Corporation.
* Author: K.Prasad <[email protected]>
*/
-
#ifndef _PPC_BOOK3S_64_HW_BREAKPOINT_H
#define _PPC_BOOK3S_64_HW_BREAKPOINT_H

+#include <asm/cpu_has_feature.h>
+
#ifdef __KERNEL__
struct arch_hw_breakpoint {
unsigned long address;
@@ -46,7 +47,7 @@ struct arch_hw_breakpoint {

static inline int nr_wp_slots(void)
{
- return HBP_NUM_MAX;
+ return cpu_has_feature(CPU_FTR_DAWR1) ? HBP_NUM_TWO : HBP_NUM_ONE;
}

#ifdef CONFIG_HAVE_HW_BREAKPOINT
--
2.26.2

2020-07-17 04:13:33

by Ravi Bangoria

Subject: [PATCH v4 03/10] powerpc/watchpoint: Fix DAWR exception for CACHEOP

The 'ea' returned by analyse_instr() needs to be aligned down to the
cache block size for CACHEOP instructions. analyse_instr() does not set
the size for CACHEOP, so the size also needs to be calculated manually.
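
For example, dcbz affects the whole cache block, so the effective address
has to be rounded down to the block start before the overlap check. A tiny
standalone sketch (the 0x1234 address and the 128-byte block size are
assumed example values, not queried from the kernel):

#include <stdio.h>

int main(void)
{
	unsigned long ea = 0x1234;	/* 'ea' from analyse_instr() (example) */
	int size = 128;			/* assumed L1 dcache block size */

	ea &= ~((unsigned long)size - 1);	/* dcbz touches the whole block */
	printf("block start = 0x%lx, size = %d\n", ea, size);	/* 0x1200, 128 */
	return 0;
}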

Fixes: 27985b2a640e ("powerpc/watchpoint: Don't ignore extraneous exceptions blindly")
Fixes: 74c6881019b7 ("powerpc/watchpoint: Prepare handler to handle more than one watchpoint")
Signed-off-by: Ravi Bangoria <[email protected]>
---
arch/powerpc/kernel/hw_breakpoint.c | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index a971e22aea81..c55e67bab271 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -538,7 +538,12 @@ static bool check_dawrx_constraints(struct pt_regs *regs, int type,
if (OP_IS_LOAD(type) && !(info->type & HW_BRK_TYPE_READ))
return false;

- if (OP_IS_STORE(type) && !(info->type & HW_BRK_TYPE_WRITE))
+ /*
+ * The Cache Management instructions other than dcbz never
+ * cause a match. i.e. if type is CACHEOP, the instruction
+ * is dcbz, and dcbz is treated as Store.
+ */
+ if ((OP_IS_STORE(type) || type == CACHEOP) && !(info->type & HW_BRK_TYPE_WRITE))
return false;

if (is_kernel_addr(regs->nip) && !(info->type & HW_BRK_TYPE_KERNEL))
@@ -601,6 +606,15 @@ static bool check_constraints(struct pt_regs *regs, struct ppc_inst instr,
return false;
}

+static int cache_op_size(void)
+{
+#ifdef __powerpc64__
+ return ppc64_caches.l1d.block_size;
+#else
+ return L1_CACHE_BYTES;
+#endif
+}
+
static void get_instr_detail(struct pt_regs *regs, struct ppc_inst *instr,
int *type, int *size, unsigned long *ea)
{
@@ -616,7 +630,12 @@ static void get_instr_detail(struct pt_regs *regs, struct ppc_inst *instr,
if (!(regs->msr & MSR_64BIT))
*ea &= 0xffffffffUL;
#endif
+
*size = GETSIZE(op.type);
+ if (*type == CACHEOP) {
+ *size = cache_op_size();
+ *ea &= ~(*size - 1);
+ }
}

static bool is_larx_stcx_instr(int type)
--
2.26.2

2020-07-17 04:14:34

by Ravi Bangoria

Subject: [PATCH v4 06/10] powerpc/watchpoint: Set CPU_FTR_DAWR1 based on pa-features bit

As per PAPR, bit 0 of byte 64 in the pa-features property indicates the
availability of the 2nd DAWR registers, i.e. if this bit is set, the
2nd DAWR is present; otherwise it is not. The host generally uses
"cpu-features", which masks "pa-features". But "cpu-features" is still
not used for guests, and thus this change is mostly applicable to
guests only.

Signed-off-by: Ravi Bangoria <[email protected]>
---
arch/powerpc/kernel/prom.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 9cc49f265c86..c76c09b97bc8 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -175,6 +175,8 @@ static struct ibm_pa_feature {
*/
{ .pabyte = 22, .pabit = 0, .cpu_features = CPU_FTR_TM_COMP,
.cpu_user_ftrs2 = PPC_FEATURE2_HTM_COMP | PPC_FEATURE2_HTM_NOSC_COMP },
+
+ { .pabyte = 64, .pabit = 0, .cpu_features = CPU_FTR_DAWR1 },
};

static void __init scan_features(unsigned long node, const unsigned char *ftrs,
--
2.26.2

2020-07-17 04:14:34

by Ravi Bangoria

Subject: [PATCH v4 07/10] powerpc/watchpoint: Rename current H_SET_MODE DAWR macro

The current H_SET_MODE hcall macro name for setting/resetting DAWR0 is
H_SET_MODE_RESOURCE_SET_DAWR. Add the suffix 0 to the macro name as well.

Signed-off-by: Ravi Bangoria <[email protected]>
---
arch/powerpc/include/asm/hvcall.h | 2 +-
arch/powerpc/include/asm/plpar_wrappers.h | 2 +-
arch/powerpc/kvm/book3s_hv.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index 43486e773bd6..b785e9f0071c 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -355,7 +355,7 @@

/* Values for 2nd argument to H_SET_MODE */
#define H_SET_MODE_RESOURCE_SET_CIABR 1
-#define H_SET_MODE_RESOURCE_SET_DAWR 2
+#define H_SET_MODE_RESOURCE_SET_DAWR0 2
#define H_SET_MODE_RESOURCE_ADDR_TRANS_MODE 3
#define H_SET_MODE_RESOURCE_LE 4

diff --git a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h
index 4293c5d2ddf4..d12c3680d946 100644
--- a/arch/powerpc/include/asm/plpar_wrappers.h
+++ b/arch/powerpc/include/asm/plpar_wrappers.h
@@ -312,7 +312,7 @@ static inline long plpar_set_ciabr(unsigned long ciabr)

static inline long plpar_set_watchpoint0(unsigned long dawr0, unsigned long dawrx0)
{
- return plpar_set_mode(0, H_SET_MODE_RESOURCE_SET_DAWR, dawr0, dawrx0);
+ return plpar_set_mode(0, H_SET_MODE_RESOURCE_SET_DAWR0, dawr0, dawrx0);
}

static inline long plpar_signal_sys_reset(long cpu)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 6bf66649ab92..7ad692c2d7c7 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -764,7 +764,7 @@ static int kvmppc_h_set_mode(struct kvm_vcpu *vcpu, unsigned long mflags,
return H_P3;
vcpu->arch.ciabr = value1;
return H_SUCCESS;
- case H_SET_MODE_RESOURCE_SET_DAWR:
+ case H_SET_MODE_RESOURCE_SET_DAWR0:
if (!kvmppc_power8_compatible(vcpu))
return H_P2;
if (!ppc_breakpoint_available())
--
2.26.2

2020-07-17 04:15:19

by Ravi Bangoria

Subject: [PATCH v4 10/10] powerpc/watchpoint: Remove 512 byte boundary

Power10 has removed the 512-byte boundary from the match criteria, i.e.
the watch range can now cross a 512-byte boundary.

Signed-off-by: Ravi Bangoria <[email protected]>
---
arch/powerpc/kernel/hw_breakpoint.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index c55e67bab271..1f4a1efa0074 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -418,8 +418,9 @@ static int hw_breakpoint_validate_len(struct arch_hw_breakpoint *hw)

if (dawr_enabled()) {
max_len = DAWR_MAX_LEN;
- /* DAWR region can't cross 512 bytes boundary */
- if (ALIGN_DOWN(start_addr, SZ_512) != ALIGN_DOWN(end_addr - 1, SZ_512))
+ /* DAWR region can't cross 512 bytes boundary on p10 predecessors */
+ if (!cpu_has_feature(CPU_FTR_ARCH_31) &&
+ (ALIGN_DOWN(start_addr, SZ_512) != ALIGN_DOWN(end_addr - 1, SZ_512)))
return -EINVAL;
} else if (IS_ENABLED(CONFIG_PPC_8xx)) {
/* 8xx can setup a range without limitation */
--
2.26.2

2020-07-17 04:15:36

by Ravi Bangoria

Subject: [PATCH v4 08/10] powerpc/watchpoint: Guest support for 2nd DAWR hcall

The 2nd DAWR can be set/unset using the H_SET_MODE hcall with resource
value 5. Enable powervm guest support with that. This has no effect on
kvm guests because kvm will return an error if a guest makes the hcall
with resource value 5.

Signed-off-by: Ravi Bangoria <[email protected]>
---
arch/powerpc/include/asm/hvcall.h | 1 +
arch/powerpc/include/asm/machdep.h | 2 +-
arch/powerpc/include/asm/plpar_wrappers.h | 5 +++++
arch/powerpc/kernel/dawr.c | 2 +-
arch/powerpc/platforms/pseries/setup.c | 7 +++++--
5 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index b785e9f0071c..33793444144c 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -358,6 +358,7 @@
#define H_SET_MODE_RESOURCE_SET_DAWR0 2
#define H_SET_MODE_RESOURCE_ADDR_TRANS_MODE 3
#define H_SET_MODE_RESOURCE_LE 4
+#define H_SET_MODE_RESOURCE_SET_DAWR1 5

/* Values for argument to H_SIGNAL_SYS_RESET */
#define H_SIGNAL_SYS_RESET_ALL -1
diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 7bcb64444a39..a90b892f0bfe 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -131,7 +131,7 @@ struct machdep_calls {
unsigned long dabrx);

/* Set DAWR for this platform, leave empty for default implementation */
- int (*set_dawr)(unsigned long dawr,
+ int (*set_dawr)(int nr, unsigned long dawr,
unsigned long dawrx);

#ifdef CONFIG_PPC32 /* XXX for now */
diff --git a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h
index d12c3680d946..ece84a430701 100644
--- a/arch/powerpc/include/asm/plpar_wrappers.h
+++ b/arch/powerpc/include/asm/plpar_wrappers.h
@@ -315,6 +315,11 @@ static inline long plpar_set_watchpoint0(unsigned long dawr0, unsigned long dawr
return plpar_set_mode(0, H_SET_MODE_RESOURCE_SET_DAWR0, dawr0, dawrx0);
}

+static inline long plpar_set_watchpoint1(unsigned long dawr1, unsigned long dawrx1)
+{
+ return plpar_set_mode(0, H_SET_MODE_RESOURCE_SET_DAWR1, dawr1, dawrx1);
+}
+
static inline long plpar_signal_sys_reset(long cpu)
{
return plpar_hcall_norets(H_SIGNAL_SYS_RESET, cpu);
diff --git a/arch/powerpc/kernel/dawr.c b/arch/powerpc/kernel/dawr.c
index 500f52fa4711..cdc2dccb987d 100644
--- a/arch/powerpc/kernel/dawr.c
+++ b/arch/powerpc/kernel/dawr.c
@@ -37,7 +37,7 @@ int set_dawr(int nr, struct arch_hw_breakpoint *brk)
dawrx |= (mrd & 0x3f) << (63 - 53);

if (ppc_md.set_dawr)
- return ppc_md.set_dawr(dawr, dawrx);
+ return ppc_md.set_dawr(nr, dawr, dawrx);

if (nr == 0) {
mtspr(SPRN_DAWR0, dawr);
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 2db8469e475f..d516ee8eb7fc 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -831,12 +831,15 @@ static int pseries_set_xdabr(unsigned long dabr, unsigned long dabrx)
return plpar_hcall_norets(H_SET_XDABR, dabr, dabrx);
}

-static int pseries_set_dawr(unsigned long dawr, unsigned long dawrx)
+static int pseries_set_dawr(int nr, unsigned long dawr, unsigned long dawrx)
{
/* PAPR says we can't set HYP */
dawrx &= ~DAWRX_HYP;

- return plpar_set_watchpoint0(dawr, dawrx);
+ if (nr == 0)
+ return plpar_set_watchpoint0(dawr, dawrx);
+ else
+ return plpar_set_watchpoint1(dawr, dawrx);
}

#define CMO_CHARACTERISTICS_TOKEN 44
--
2.26.2

2020-07-17 04:25:17

by Jordan Niethe

Subject: Re: [PATCH v4 04/10] powerpc/watchpoint: Enable watchpoint functionality on power10 guest

On Fri, Jul 17, 2020 at 2:10 PM Ravi Bangoria
<[email protected]> wrote:
>
> CPU_FTR_DAWR is by default enabled for host via CPU_FTRS_DT_CPU_BASE
> (controlled by CONFIG_PPC_DT_CPU_FTRS). But cpu-features device-tree
> node is not PAPR compatible and thus not yet used by kvm or pHyp
> guests. Enable watchpoint functionality on power10 guest (both kvm
> and powervm) by adding CPU_FTR_DAWR to CPU_FTRS_POWER10. Note that
> this change does not enable 2nd DAWR support.
>
> Signed-off-by: Ravi Bangoria <[email protected]>
I ran the ptrace-hwbreak selftest successfully within a power10 kvm guest.
Tested-by: Jordan Niethe <[email protected]>
> ---
> arch/powerpc/include/asm/cputable.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
> index bac2252c839e..e506d429b1af 100644
> --- a/arch/powerpc/include/asm/cputable.h
> +++ b/arch/powerpc/include/asm/cputable.h
> @@ -478,7 +478,7 @@ static inline void cpu_feature_keys_init(void) { }
> CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
> CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_ARCH_207S | \
> CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | CPU_FTR_PKEY | \
> - CPU_FTR_ARCH_31)
> + CPU_FTR_ARCH_31 | CPU_FTR_DAWR)
> #define CPU_FTRS_CELL (CPU_FTR_LWSYNC | \
> CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
> CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \
> --
> 2.26.2
>

2020-07-17 05:47:01

by Jordan Niethe

Subject: Re: [PATCH v4 05/10] powerpc/dt_cpu_ftrs: Add feature for 2nd DAWR

On Fri, Jul 17, 2020 at 2:10 PM Ravi Bangoria
<[email protected]> wrote:
>
> Add new device-tree feature for 2nd DAWR. If this feature is present,
> 2nd DAWR is supported, otherwise not.
>
> Signed-off-by: Ravi Bangoria <[email protected]>
> ---
> arch/powerpc/include/asm/cputable.h | 7 +++++--
> arch/powerpc/kernel/dt_cpu_ftrs.c | 7 +++++++
> 2 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
> index e506d429b1af..3445c86e1f6f 100644
> --- a/arch/powerpc/include/asm/cputable.h
> +++ b/arch/powerpc/include/asm/cputable.h
> @@ -214,6 +214,7 @@ static inline void cpu_feature_keys_init(void) { }
> #define CPU_FTR_P9_TLBIE_ERAT_BUG LONG_ASM_CONST(0x0001000000000000)
> #define CPU_FTR_P9_RADIX_PREFETCH_BUG LONG_ASM_CONST(0x0002000000000000)
> #define CPU_FTR_ARCH_31 LONG_ASM_CONST(0x0004000000000000)
> +#define CPU_FTR_DAWR1 LONG_ASM_CONST(0x0008000000000000)
>
> #ifndef __ASSEMBLY__
>
> @@ -497,14 +498,16 @@ static inline void cpu_feature_keys_init(void) { }
> #define CPU_FTRS_POSSIBLE \
> (CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | CPU_FTRS_POWER8 | \
> CPU_FTR_ALTIVEC_COMP | CPU_FTR_VSX_COMP | CPU_FTRS_POWER9 | \
> - CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
> + CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
> + CPU_FTR_DAWR1)
> #else
> #define CPU_FTRS_POSSIBLE \
> (CPU_FTRS_PPC970 | CPU_FTRS_POWER5 | \
> CPU_FTRS_POWER6 | CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | \
> CPU_FTRS_POWER8 | CPU_FTRS_CELL | CPU_FTRS_PA6T | \
> CPU_FTR_VSX_COMP | CPU_FTR_ALTIVEC_COMP | CPU_FTRS_POWER9 | \
> - CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
> + CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
> + CPU_FTR_DAWR1)
Instead of putting CPU_FTR_DAWR1 into CPU_FTRS_POSSIBLE should it go
into CPU_FTRS_POWER10?
Then it will be picked up by CPU_FTRS_POSSIBLE.
> #endif /* CONFIG_CPU_LITTLE_ENDIAN */
> #endif
> #else
> diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
> index ac650c233cd9..c78cd3596ec4 100644
> --- a/arch/powerpc/kernel/dt_cpu_ftrs.c
> +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
> @@ -574,6 +574,12 @@ static int __init feat_enable_mma(struct dt_cpu_feature *f)
> return 1;
> }
>
> +static int __init feat_enable_debug_facilities_v31(struct dt_cpu_feature *f)
> +{
> + cur_cpu_spec->cpu_features |= CPU_FTR_DAWR1;
> + return 1;
> +}
> +
> struct dt_cpu_feature_match {
> const char *name;
> int (*enable)(struct dt_cpu_feature *f);
> @@ -649,6 +655,7 @@ static struct dt_cpu_feature_match __initdata
> {"wait-v3", feat_enable, 0},
> {"prefix-instructions", feat_enable, 0},
> {"matrix-multiply-assist", feat_enable_mma, 0},
> + {"debug-facilities-v31", feat_enable_debug_facilities_v31, 0},
Since all feat_enable_debug_facilities_v31() does is set
CPU_FTR_DAWR1, if you just have:
{"debug-facilities-v31", feat_enable, CPU_FTR_DAWR1},
I think cpufeatures_process_feature() should set it for you at this point:
if (m->enable(f)) {
	cur_cpu_spec->cpu_features |= m->cpu_ftr_bit_mask;
	break;
}

> };
>
> static bool __initdata using_dt_cpu_ftrs;
> --
> 2.26.2
>

2020-07-20 01:44:24

by Jordan Niethe

Subject: Re: [PATCH v4 06/10] powerpc/watchpoint: Set CPU_FTR_DAWR1 based on pa-features bit

On Fri, Jul 17, 2020 at 2:10 PM Ravi Bangoria
<[email protected]> wrote:
>
> As per the PAPR, bit 0 of byte 64 in pa-features property indicates
> availability of 2nd DAWR registers. i.e. If this bit is set, 2nd
> DAWR is present, otherwise not. Host generally uses "cpu-features",
> which masks "pa-features". But "cpu-features" are still not used for
> guests and thus this change is mostly applicable for guests only.
>
> Signed-off-by: Ravi Bangoria <[email protected]>
I checked that those PAPR values are correct, and running a powernv
kernel in p10 mambo (using p10 skiboot) with dt_cpu_ftrs=off does set
the CPU_FTR_DAWR1 bit.
Tested-by: Jordan Niethe <[email protected]>
> ---
> arch/powerpc/kernel/prom.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
> index 9cc49f265c86..c76c09b97bc8 100644
> --- a/arch/powerpc/kernel/prom.c
> +++ b/arch/powerpc/kernel/prom.c
> @@ -175,6 +175,8 @@ static struct ibm_pa_feature {
> */
> { .pabyte = 22, .pabit = 0, .cpu_features = CPU_FTR_TM_COMP,
> .cpu_user_ftrs2 = PPC_FEATURE2_HTM_COMP | PPC_FEATURE2_HTM_NOSC_COMP },
> +
> + { .pabyte = 64, .pabit = 0, .cpu_features = CPU_FTR_DAWR1 },
> };
>
> static void __init scan_features(unsigned long node, const unsigned char *ftrs,
> --
> 2.26.2
>

2020-07-20 01:56:50

by Jordan Niethe

Subject: Re: [PATCH v4 07/10] powerpc/watchpoint: Rename current H_SET_MODE DAWR macro

On Fri, Jul 17, 2020 at 2:11 PM Ravi Bangoria
<[email protected]> wrote:
>
> Current H_SET_MODE hcall macro name for setting/resetting DAWR0 is
> H_SET_MODE_RESOURCE_SET_DAWR. Add suffix 0 to macro name as well.
>
> Signed-off-by: Ravi Bangoria <[email protected]>
Reviewed-by: Jordan Niethe <[email protected]>
> ---
> arch/powerpc/include/asm/hvcall.h | 2 +-
> arch/powerpc/include/asm/plpar_wrappers.h | 2 +-
> arch/powerpc/kvm/book3s_hv.c | 2 +-
> 3 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
> index 43486e773bd6..b785e9f0071c 100644
> --- a/arch/powerpc/include/asm/hvcall.h
> +++ b/arch/powerpc/include/asm/hvcall.h
> @@ -355,7 +355,7 @@
>
> /* Values for 2nd argument to H_SET_MODE */
> #define H_SET_MODE_RESOURCE_SET_CIABR 1
> -#define H_SET_MODE_RESOURCE_SET_DAWR 2
> +#define H_SET_MODE_RESOURCE_SET_DAWR0 2
> #define H_SET_MODE_RESOURCE_ADDR_TRANS_MODE 3
> #define H_SET_MODE_RESOURCE_LE 4
>
> diff --git a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h
> index 4293c5d2ddf4..d12c3680d946 100644
> --- a/arch/powerpc/include/asm/plpar_wrappers.h
> +++ b/arch/powerpc/include/asm/plpar_wrappers.h
> @@ -312,7 +312,7 @@ static inline long plpar_set_ciabr(unsigned long ciabr)
>
> static inline long plpar_set_watchpoint0(unsigned long dawr0, unsigned long dawrx0)
> {
> - return plpar_set_mode(0, H_SET_MODE_RESOURCE_SET_DAWR, dawr0, dawrx0);
> + return plpar_set_mode(0, H_SET_MODE_RESOURCE_SET_DAWR0, dawr0, dawrx0);
> }
>
> static inline long plpar_signal_sys_reset(long cpu)
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 6bf66649ab92..7ad692c2d7c7 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -764,7 +764,7 @@ static int kvmppc_h_set_mode(struct kvm_vcpu *vcpu, unsigned long mflags,
> return H_P3;
> vcpu->arch.ciabr = value1;
> return H_SUCCESS;
> - case H_SET_MODE_RESOURCE_SET_DAWR:
> + case H_SET_MODE_RESOURCE_SET_DAWR0:
> if (!kvmppc_power8_compatible(vcpu))
> return H_P2;
> if (!ppc_breakpoint_available())
> --
> 2.26.2
>

2020-07-20 03:46:53

by Jordan Niethe

Subject: Re: [PATCH v4 09/10] powerpc/watchpoint: Return available watchpoints dynamically

On Fri, Jul 17, 2020 at 2:11 PM Ravi Bangoria
<[email protected]> wrote:
>
> So far Book3S Powerpc supported only one watchpoint. Power10 is
> introducing 2nd DAWR. Enable 2nd DAWR support for Power10.
> Availability of 2nd DAWR will depend on CPU_FTR_DAWR1.
>
> Signed-off-by: Ravi Bangoria <[email protected]>
> ---
> arch/powerpc/include/asm/cputable.h | 4 +++-
> arch/powerpc/include/asm/hw_breakpoint.h | 5 +++--
> 2 files changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
> index 3445c86e1f6f..36a0851a7a9b 100644
> --- a/arch/powerpc/include/asm/cputable.h
> +++ b/arch/powerpc/include/asm/cputable.h
> @@ -633,7 +633,9 @@ enum {
> * Maximum number of hw breakpoint supported on powerpc. Number of
> * breakpoints supported by actual hw might be less than this.
> */
> -#define HBP_NUM_MAX 1
> +#define HBP_NUM_MAX 2
> +#define HBP_NUM_ONE 1
> +#define HBP_NUM_TWO 2
I wonder if these defines are necessary - do they have any advantage
over just using the literal?
>
> #endif /* !__ASSEMBLY__ */
>
> diff --git a/arch/powerpc/include/asm/hw_breakpoint.h b/arch/powerpc/include/asm/hw_breakpoint.h
> index cb424799da0d..d4eab1694bcd 100644
> --- a/arch/powerpc/include/asm/hw_breakpoint.h
> +++ b/arch/powerpc/include/asm/hw_breakpoint.h
> @@ -5,10 +5,11 @@
> * Copyright 2010, IBM Corporation.
> * Author: K.Prasad <[email protected]>
> */
> -
Was removing this line deliberate?
> #ifndef _PPC_BOOK3S_64_HW_BREAKPOINT_H
> #define _PPC_BOOK3S_64_HW_BREAKPOINT_H
>
> +#include <asm/cpu_has_feature.h>
> +
> #ifdef __KERNEL__
> struct arch_hw_breakpoint {
> unsigned long address;
> @@ -46,7 +47,7 @@ struct arch_hw_breakpoint {
>
> static inline int nr_wp_slots(void)
> {
> - return HBP_NUM_MAX;
> + return cpu_has_feature(CPU_FTR_DAWR1) ? HBP_NUM_TWO : HBP_NUM_ONE;
So it'd be something like:
+ return cpu_has_feature(CPU_FTR_DAWR1) ? HBP_NUM_MAX : 1;
But thinking that there might be more slots added in the future, it
may be better to make the number of slots a variable that is set
during the init and then have this function return that.
> }
>
> #ifdef CONFIG_HAVE_HW_BREAKPOINT
> --
> 2.26.2
>

2020-07-20 06:58:19

by Jordan Niethe

Subject: Re: [PATCH v4 10/10] powerpc/watchpoint: Remove 512 byte boundary

On Fri, Jul 17, 2020 at 2:11 PM Ravi Bangoria
<[email protected]> wrote:
>
> Power10 has removed 512 bytes boundary from match criteria. i.e. The watch
> range can cross 512 bytes boundary.
It looks like this change is not mentioned in ISA v3.1 Book III 9.4
Data Address Watchpoint. It could be useful to mention that in the
commit message.
Also I wonder if we could add a test for this to the ptrace-hwbreak selftest?

>
> Signed-off-by: Ravi Bangoria <[email protected]>
> ---
> arch/powerpc/kernel/hw_breakpoint.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
> index c55e67bab271..1f4a1efa0074 100644
> --- a/arch/powerpc/kernel/hw_breakpoint.c
> +++ b/arch/powerpc/kernel/hw_breakpoint.c
> @@ -418,8 +418,9 @@ static int hw_breakpoint_validate_len(struct arch_hw_breakpoint *hw)
>
> if (dawr_enabled()) {
> max_len = DAWR_MAX_LEN;
> - /* DAWR region can't cross 512 bytes boundary */
> - if (ALIGN_DOWN(start_addr, SZ_512) != ALIGN_DOWN(end_addr - 1, SZ_512))
> + /* DAWR region can't cross 512 bytes boundary on p10 predecessors */
> + if (!cpu_has_feature(CPU_FTR_ARCH_31) &&
> + (ALIGN_DOWN(start_addr, SZ_512) != ALIGN_DOWN(end_addr - 1, SZ_512)))
> return -EINVAL;
> } else if (IS_ENABLED(CONFIG_PPC_8xx)) {
> /* 8xx can setup a range without limitation */
> --
> 2.26.2
>

2020-07-21 03:26:07

by Ravi Bangoria

Subject: Re: [PATCH v4 10/10] powerpc/watchpoint: Remove 512 byte boundary

Hi Jordan,

On 7/20/20 12:24 PM, Jordan Niethe wrote:
> On Fri, Jul 17, 2020 at 2:11 PM Ravi Bangoria
> <[email protected]> wrote:
>>
>> Power10 has removed 512 bytes boundary from match criteria. i.e. The watch
>> range can cross 512 bytes boundary.
> It looks like this change is not mentioned in ISA v3.1 Book III 9.4
> Data Address Watchpoint. It could be useful to mention that in the
> commit message.

Yes, ISA 3.1 Book III 9.4 has a documentation mistake and hopefully it
will be fixed in the next version of the ISA. However, this is mentioned
in the ISA 3.1 change log:

Multiple DEAW:
Added a second Data Address Watchpoint. [H]DAR is
set to the first byte of overlap. 512B boundary is
removed.

I'll mention this in the commit description.

> Also I wonder if could add a test for this to the ptrace-hwbreak selftest?

Yes, I already have a selftest for this in perf-hwbreak. Will send that soon.

Thanks,
Ravi

2020-07-21 03:59:48

by Ravi Bangoria

Subject: Re: [PATCH v4 09/10] powerpc/watchpoint: Return available watchpoints dynamically



On 7/20/20 9:12 AM, Jordan Niethe wrote:
> On Fri, Jul 17, 2020 at 2:11 PM Ravi Bangoria
> <[email protected]> wrote:
>>
>> So far Book3S Powerpc supported only one watchpoint. Power10 is
>> introducing 2nd DAWR. Enable 2nd DAWR support for Power10.
>> Availability of 2nd DAWR will depend on CPU_FTR_DAWR1.
>>
>> Signed-off-by: Ravi Bangoria <[email protected]>
>> ---
>> arch/powerpc/include/asm/cputable.h | 4 +++-
>> arch/powerpc/include/asm/hw_breakpoint.h | 5 +++--
>> 2 files changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
>> index 3445c86e1f6f..36a0851a7a9b 100644
>> --- a/arch/powerpc/include/asm/cputable.h
>> +++ b/arch/powerpc/include/asm/cputable.h
>> @@ -633,7 +633,9 @@ enum {
>> * Maximum number of hw breakpoint supported on powerpc. Number of
>> * breakpoints supported by actual hw might be less than this.
>> */
>> -#define HBP_NUM_MAX 1
>> +#define HBP_NUM_MAX 2
>> +#define HBP_NUM_ONE 1
>> +#define HBP_NUM_TWO 2
> I wonder if these defines are necessary - has it any advantage over
> just using the literal?

No, not really. Initially I had something like:

#define HBP_NUM_MAX 2
#define HBP_NUM_P8_P9 1
#define HBP_NUM_P10 2

But then I thought that was also not right. So I made it _ONE and _TWO.
Now that the function that decides the number of watchpoints dynamically
(nr_wp_slots) is in a different file, I thought keeping it like this
would make it easier to figure out why _MAX is 2.

>>
>> #endif /* !__ASSEMBLY__ */
>>
>> diff --git a/arch/powerpc/include/asm/hw_breakpoint.h b/arch/powerpc/include/asm/hw_breakpoint.h
>> index cb424799da0d..d4eab1694bcd 100644
>> --- a/arch/powerpc/include/asm/hw_breakpoint.h
>> +++ b/arch/powerpc/include/asm/hw_breakpoint.h
>> @@ -5,10 +5,11 @@
>> * Copyright 2010, IBM Corporation.
>> * Author: K.Prasad <[email protected]>
>> */
>> -
> Was removing this line deliberate?

Nah. Will remove that hunk.

>> #ifndef _PPC_BOOK3S_64_HW_BREAKPOINT_H
>> #define _PPC_BOOK3S_64_HW_BREAKPOINT_H
>>
>> +#include <asm/cpu_has_feature.h>
>> +
>> #ifdef __KERNEL__
>> struct arch_hw_breakpoint {
>> unsigned long address;
>> @@ -46,7 +47,7 @@ struct arch_hw_breakpoint {
>>
>> static inline int nr_wp_slots(void)
>> {
>> - return HBP_NUM_MAX;
>> + return cpu_has_feature(CPU_FTR_DAWR1) ? HBP_NUM_TWO : HBP_NUM_ONE;
> So it'd be something like:
> + return cpu_has_feature(CPU_FTR_DAWR1) ? HBP_NUM_MAX : 1;
> But thinking that there might be more slots added in the future, it
> may be better to make the number of slots a variable that is set
> during the init and then have this function return that.

Not sure I follow. What do you mean by making the number of slots a
variable that is set during init?

Thanks,
Ravi

2020-07-21 04:45:03

by Jordan Niethe

Subject: Re: [PATCH v4 09/10] powerpc/watchpoint: Return available watchpoints dynamically

On Tue, Jul 21, 2020 at 1:57 PM Ravi Bangoria
<[email protected]> wrote:
>
>
>
> On 7/20/20 9:12 AM, Jordan Niethe wrote:
> > On Fri, Jul 17, 2020 at 2:11 PM Ravi Bangoria
> > <[email protected]> wrote:
> >>
> >> So far Book3S Powerpc supported only one watchpoint. Power10 is
> >> introducing 2nd DAWR. Enable 2nd DAWR support for Power10.
> >> Availability of 2nd DAWR will depend on CPU_FTR_DAWR1.
> >>
> >> Signed-off-by: Ravi Bangoria <[email protected]>
> >> ---
> >> arch/powerpc/include/asm/cputable.h | 4 +++-
> >> arch/powerpc/include/asm/hw_breakpoint.h | 5 +++--
> >> 2 files changed, 6 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
> >> index 3445c86e1f6f..36a0851a7a9b 100644
> >> --- a/arch/powerpc/include/asm/cputable.h
> >> +++ b/arch/powerpc/include/asm/cputable.h
> >> @@ -633,7 +633,9 @@ enum {
> >> * Maximum number of hw breakpoint supported on powerpc. Number of
> >> * breakpoints supported by actual hw might be less than this.
> >> */
> >> -#define HBP_NUM_MAX 1
> >> +#define HBP_NUM_MAX 2
> >> +#define HBP_NUM_ONE 1
> >> +#define HBP_NUM_TWO 2
> > I wonder if these defines are necessary - has it any advantage over
> > just using the literal?
>
> No, not really. Initially I had something like:
>
> #define HBP_NUM_MAX 2
> #define HBP_NUM_P8_P9 1
> #define HBP_NUM_P10 2
>
> But then I thought it's also not right. So I made it _ONE and _TWO.
> Now the function that decides nr watchpoints dynamically (nr_wp_slots)
> is in different file, I thought to keep it like this so it would be
> easier to figure out why _MAX is 2.
>
> >>
> >> #endif /* !__ASSEMBLY__ */
> >>
> >> diff --git a/arch/powerpc/include/asm/hw_breakpoint.h b/arch/powerpc/include/asm/hw_breakpoint.h
> >> index cb424799da0d..d4eab1694bcd 100644
> >> --- a/arch/powerpc/include/asm/hw_breakpoint.h
> >> +++ b/arch/powerpc/include/asm/hw_breakpoint.h
> >> @@ -5,10 +5,11 @@
> >> * Copyright 2010, IBM Corporation.
> >> * Author: K.Prasad <[email protected]>
> >> */
> >> -
> > Was removing this line deliberate?
>
> Nah. Will remove that hunk.
>
> >> #ifndef _PPC_BOOK3S_64_HW_BREAKPOINT_H
> >> #define _PPC_BOOK3S_64_HW_BREAKPOINT_H
> >>
> >> +#include <asm/cpu_has_feature.h>
> >> +
> >> #ifdef __KERNEL__
> >> struct arch_hw_breakpoint {
> >> unsigned long address;
> >> @@ -46,7 +47,7 @@ struct arch_hw_breakpoint {
> >>
> >> static inline int nr_wp_slots(void)
> >> {
> >> - return HBP_NUM_MAX;
> >> + return cpu_has_feature(CPU_FTR_DAWR1) ? HBP_NUM_TWO : HBP_NUM_ONE;
> > So it'd be something like:
> > + return cpu_has_feature(CPU_FTR_DAWR1) ? HBP_NUM_MAX : 1;
> > But thinking that there might be more slots added in the future, it
> > may be better to make the number of slots a variable that is set
> > during the init and then have this function return that.
>
> Not sure I follow. What do you mean by setting number of slots a
> variable that is set during the init?
Sorry I was unclear there.
I was just looking and saw arm also has a variable number of hw breakpoints.
If we did something like how they handle it, it might look something like:

static int num_wp_slots __ro_after_init;

int nr_wp_slots(void) {
	return num_wp_slots;
}

static int __init arch_hw_breakpoint_init(void) {
	num_wp_slots = work out how many wp_slots
}
arch_initcall(arch_hw_breakpoint_init);

Then we wouldn't have to recalculate every time nr_wp_slots() is called.
In the future, if more watchpoints are added, nr_wp_slots() will get more
complicated. But it's just an idea, feel free to ignore it.

>
> Thanks,
> Ravi

2020-07-21 07:54:29

by Ravi Bangoria

Subject: Re: [PATCH v4 05/10] powerpc/dt_cpu_ftrs: Add feature for 2nd DAWR



On 7/17/20 11:14 AM, Jordan Niethe wrote:
> On Fri, Jul 17, 2020 at 2:10 PM Ravi Bangoria
> <[email protected]> wrote:
>>
>> Add new device-tree feature for 2nd DAWR. If this feature is present,
>> 2nd DAWR is supported, otherwise not.
>>
>> Signed-off-by: Ravi Bangoria <[email protected]>
>> ---
>> arch/powerpc/include/asm/cputable.h | 7 +++++--
>> arch/powerpc/kernel/dt_cpu_ftrs.c | 7 +++++++
>> 2 files changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
>> index e506d429b1af..3445c86e1f6f 100644
>> --- a/arch/powerpc/include/asm/cputable.h
>> +++ b/arch/powerpc/include/asm/cputable.h
>> @@ -214,6 +214,7 @@ static inline void cpu_feature_keys_init(void) { }
>> #define CPU_FTR_P9_TLBIE_ERAT_BUG LONG_ASM_CONST(0x0001000000000000)
>> #define CPU_FTR_P9_RADIX_PREFETCH_BUG LONG_ASM_CONST(0x0002000000000000)
>> #define CPU_FTR_ARCH_31 LONG_ASM_CONST(0x0004000000000000)
>> +#define CPU_FTR_DAWR1 LONG_ASM_CONST(0x0008000000000000)
>>
>> #ifndef __ASSEMBLY__
>>
>> @@ -497,14 +498,16 @@ static inline void cpu_feature_keys_init(void) { }
>> #define CPU_FTRS_POSSIBLE \
>> (CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | CPU_FTRS_POWER8 | \
>> CPU_FTR_ALTIVEC_COMP | CPU_FTR_VSX_COMP | CPU_FTRS_POWER9 | \
>> - CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
>> + CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
>> + CPU_FTR_DAWR1)
>> #else
>> #define CPU_FTRS_POSSIBLE \
>> (CPU_FTRS_PPC970 | CPU_FTRS_POWER5 | \
>> CPU_FTRS_POWER6 | CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | \
>> CPU_FTRS_POWER8 | CPU_FTRS_CELL | CPU_FTRS_PA6T | \
>> CPU_FTR_VSX_COMP | CPU_FTR_ALTIVEC_COMP | CPU_FTRS_POWER9 | \
>> - CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
>> + CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
>> + CPU_FTR_DAWR1)
> Instead of putting CPU_FTR_DAWR1 into CPU_FTRS_POSSIBLE should it go
> into CPU_FTRS_POWER10?
> Then it will be picked up by CPU_FTRS_POSSIBLE.

I remember a discussion about this with Mikey and we decided to do it
this way. Obviously, the purpose is to make CPU_FTR_DAWR1 independent of
CPU_FTRS_POWER10 because DAWR1 is an optional feature in p10. I fear
including CPU_FTR_DAWR1 in CPU_FTRS_POWER10 can make it forcefully enabled
even when the device-tree property is not present or the pa-feature bit is not set,
because we do:

{ /* 3.1-compliant processor, i.e. Power10 "architected" mode */
	.pvr_mask = 0xffffffff,
	.pvr_value = 0x0f000006,
	.cpu_name = "POWER10 (architected)",
	.cpu_features = CPU_FTRS_POWER10,

>> #endif /* CONFIG_CPU_LITTLE_ENDIAN */
>> #endif
>> #else
>> diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
>> index ac650c233cd9..c78cd3596ec4 100644
>> --- a/arch/powerpc/kernel/dt_cpu_ftrs.c
>> +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
>> @@ -574,6 +574,12 @@ static int __init feat_enable_mma(struct dt_cpu_feature *f)
>> return 1;
>> }
>>
>> +static int __init feat_enable_debug_facilities_v31(struct dt_cpu_feature *f)
>> +{
>> + cur_cpu_spec->cpu_features |= CPU_FTR_DAWR1;
>> + return 1;
>> +}
>> +
>> struct dt_cpu_feature_match {
>> const char *name;
>> int (*enable)(struct dt_cpu_feature *f);
>> @@ -649,6 +655,7 @@ static struct dt_cpu_feature_match __initdata
>> {"wait-v3", feat_enable, 0},
>> {"prefix-instructions", feat_enable, 0},
>> {"matrix-multiply-assist", feat_enable_mma, 0},
>> + {"debug-facilities-v31", feat_enable_debug_facilities_v31, 0},
> Since all feat_enable_debug_facilities_v31() does is set
> CPU_FTR_DAWR1, if you just have:
> {"debug-facilities-v31", feat_enable, CPU_FTR_DAWR1},
> I think cpufeatures_process_feature() should set it in for you at this point:
> if (m->enable(f)) {
> cur_cpu_spec->cpu_features |= m->cpu_ftr_bit_mask;
> break;
> }

Yes, that seems a better option.

Thanks,
Ravi

2020-07-21 08:17:45

by Ravi Bangoria

Subject: Re: [PATCH v4 09/10] powerpc/watchpoint: Return available watchpoints dynamically


>>>> @@ -46,7 +47,7 @@ struct arch_hw_breakpoint {
>>>>
>>>> static inline int nr_wp_slots(void)
>>>> {
>>>> - return HBP_NUM_MAX;
>>>> + return cpu_has_feature(CPU_FTR_DAWR1) ? HBP_NUM_TWO : HBP_NUM_ONE;
>>> So it'd be something like:
>>> + return cpu_has_feature(CPU_FTR_DAWR1) ? HBP_NUM_MAX : 1;
>>> But thinking that there might be more slots added in the future, it
>>> may be better to make the number of slots a variable that is set
>>> during the init and then have this function return that.
>>
>> Not sure I follow. What do you mean by setting number of slots a
>> variable that is set during the init?
> Sorry I was unclear there.
> I was just looking and saw arm also has a variable number of hw breakpoints.
> If we did something like how they handle it, it might look something like:
>
> static int num_wp_slots __ro_after_init;
>
> int nr_wp_slots(void) {
> return num_wp_slots;
> }
>
> static int __init arch_hw_breakpoint_init(void) {
> num_wp_slots = work out how many wp_slots
> }
> arch_initcall(arch_hw_breakpoint_init);
>
> Then we wouldn't have to calculate everytime nr_wp_slots() is called.
> In the future if more wp's are added nr_wp_slots() will get more complicated.
> But just an idea, feel free to ignore.

OK, I got the idea. But the ARM arch_hw_breakpoint_init() is much more
complex compared to our nr_wp_slots(). I don't see any benefit in making
our code like ARM's.

Thanks for the idea though :)
Ravi

2020-07-21 11:30:18

by Michael Ellerman

Subject: Re: [PATCH v4 05/10] powerpc/dt_cpu_ftrs: Add feature for 2nd DAWR

Ravi Bangoria <[email protected]> writes:
> On 7/17/20 11:14 AM, Jordan Niethe wrote:
>> On Fri, Jul 17, 2020 at 2:10 PM Ravi Bangoria
>> <[email protected]> wrote:
>>>
>>> Add new device-tree feature for 2nd DAWR. If this feature is present,
>>> 2nd DAWR is supported, otherwise not.
>>>
>>> Signed-off-by: Ravi Bangoria <[email protected]>
>>> ---
>>> arch/powerpc/include/asm/cputable.h | 7 +++++--
>>> arch/powerpc/kernel/dt_cpu_ftrs.c | 7 +++++++
>>> 2 files changed, 12 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
>>> index e506d429b1af..3445c86e1f6f 100644
>>> --- a/arch/powerpc/include/asm/cputable.h
>>> +++ b/arch/powerpc/include/asm/cputable.h
>>> @@ -214,6 +214,7 @@ static inline void cpu_feature_keys_init(void) { }
>>> #define CPU_FTR_P9_TLBIE_ERAT_BUG LONG_ASM_CONST(0x0001000000000000)
>>> #define CPU_FTR_P9_RADIX_PREFETCH_BUG LONG_ASM_CONST(0x0002000000000000)
>>> #define CPU_FTR_ARCH_31 LONG_ASM_CONST(0x0004000000000000)
>>> +#define CPU_FTR_DAWR1 LONG_ASM_CONST(0x0008000000000000)
>>>
>>> #ifndef __ASSEMBLY__
>>>
>>> @@ -497,14 +498,16 @@ static inline void cpu_feature_keys_init(void) { }
>>> #define CPU_FTRS_POSSIBLE \
>>> (CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | CPU_FTRS_POWER8 | \
>>> CPU_FTR_ALTIVEC_COMP | CPU_FTR_VSX_COMP | CPU_FTRS_POWER9 | \
>>> - CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
>>> + CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
>>> + CPU_FTR_DAWR1)
>>> #else
>>> #define CPU_FTRS_POSSIBLE \
>>> (CPU_FTRS_PPC970 | CPU_FTRS_POWER5 | \
>>> CPU_FTRS_POWER6 | CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | \
>>> CPU_FTRS_POWER8 | CPU_FTRS_CELL | CPU_FTRS_PA6T | \
>>> CPU_FTR_VSX_COMP | CPU_FTR_ALTIVEC_COMP | CPU_FTRS_POWER9 | \
>>> - CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
>>> + CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
>>> + CPU_FTR_DAWR1)

>> Instead of putting CPU_FTR_DAWR1 into CPU_FTRS_POSSIBLE should it go
>> into CPU_FTRS_POWER10?
>> Then it will be picked up by CPU_FTRS_POSSIBLE.
>
> I remember a discussion about this with Mikey and we decided to do it
> this way. Obviously, the purpose is to make CPU_FTR_DAWR1 independent of
> CPU_FTRS_POWER10 because DAWR1 is an optional feature in p10. I fear
> including CPU_FTR_DAWR1 in CPU_FTRS_POWER10 can make it forcefully enabled
> even when device-tree property is not present or pa-feature bit it not set,
> because we do:
>
> { /* 3.1-compliant processor, i.e. Power10 "architected" mode */
> .pvr_mask = 0xffffffff,
> .pvr_value = 0x0f000006,
> .cpu_name = "POWER10 (architected)",
> .cpu_features = CPU_FTRS_POWER10,

The pa-features logic will turn it off if the feature bit is not set.

So you should be able to put it in CPU_FTRS_POWER10.

See for example CPU_FTR_NOEXECUTE.

cheers

2020-07-21 11:38:16

by Michael Ellerman

Subject: Re: [PATCH v4 09/10] powerpc/watchpoint: Return available watchpoints dynamically

Ravi Bangoria <[email protected]> writes:
> On 7/20/20 9:12 AM, Jordan Niethe wrote:
>> On Fri, Jul 17, 2020 at 2:11 PM Ravi Bangoria
>> <[email protected]> wrote:
>>>
>>> So far Book3S Powerpc supported only one watchpoint. Power10 is
>>> introducing 2nd DAWR. Enable 2nd DAWR support for Power10.
>>> Availability of 2nd DAWR will depend on CPU_FTR_DAWR1.
>>>
>>> Signed-off-by: Ravi Bangoria <[email protected]>
>>> ---
>>> arch/powerpc/include/asm/cputable.h | 4 +++-
>>> arch/powerpc/include/asm/hw_breakpoint.h | 5 +++--
>>> 2 files changed, 6 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
>>> index 3445c86e1f6f..36a0851a7a9b 100644
>>> --- a/arch/powerpc/include/asm/cputable.h
>>> +++ b/arch/powerpc/include/asm/cputable.h
>>> @@ -633,7 +633,9 @@ enum {
>>> * Maximum number of hw breakpoint supported on powerpc. Number of
>>> * breakpoints supported by actual hw might be less than this.
>>> */
>>> -#define HBP_NUM_MAX 1
>>> +#define HBP_NUM_MAX 2
>>> +#define HBP_NUM_ONE 1
>>> +#define HBP_NUM_TWO 2

>> I wonder if these defines are necessary - has it any advantage over
>> just using the literal?
>
> No, not really. Initially I had something like:
>
> #define HBP_NUM_MAX 2
> #define HBP_NUM_P8_P9 1
> #define HBP_NUM_P10 2
>
> But then I thought it's also not right. So I made it _ONE and _TWO.
> Now the function that decides nr watchpoints dynamically (nr_wp_slots)
> is in different file, I thought to keep it like this so it would be
> easier to figure out why _MAX is 2.

I don't think it makes anything clearer.

I had to stare at it thinking there was some sort of mapping or
indirection going on, before I realised it's just literally the number
of breakpoints.

So please just do:

static inline int nr_wp_slots(void)
{
	return cpu_has_feature(CPU_FTR_DAWR1) ? 2 : 1;
}

If you think HBP_NUM_MAX needs explanation then do that with a comment,
it can refer to nr_wp_slots() if that's helpful.
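
Something like this is all I mean (a sketch; the comment wording is
just an example):

/*
 * HBP_NUM_MAX is the compile-time maximum number of watchpoints we
 * can ever have. How many are actually usable on a given CPU is
 * decided at runtime by nr_wp_slots().
 */
#define HBP_NUM_MAX	2

static inline int nr_wp_slots(void)
{
	return cpu_has_feature(CPU_FTR_DAWR1) ? 2 : 1;
}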

cheers

2020-07-21 13:34:48

by Ravi Bangoria

[permalink] [raw]
Subject: Re: [PATCH v4 09/10] powerpc/watchpoint: Return available watchpoints dynamically



On 7/21/20 5:06 PM, Michael Ellerman wrote:
> Ravi Bangoria <[email protected]> writes:
>> On 7/20/20 9:12 AM, Jordan Niethe wrote:
>>> On Fri, Jul 17, 2020 at 2:11 PM Ravi Bangoria
>>> <[email protected]> wrote:
>>>>
>>>> So far, Book3S powerpc has supported only one watchpoint. Power10
>>>> introduces a 2nd DAWR. Enable 2nd DAWR support for Power10.
>>>> Availability of the 2nd DAWR will depend on CPU_FTR_DAWR1.
>>>>
>>>> Signed-off-by: Ravi Bangoria <[email protected]>
>>>> ---
>>>> arch/powerpc/include/asm/cputable.h | 4 +++-
>>>> arch/powerpc/include/asm/hw_breakpoint.h | 5 +++--
>>>> 2 files changed, 6 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
>>>> index 3445c86e1f6f..36a0851a7a9b 100644
>>>> --- a/arch/powerpc/include/asm/cputable.h
>>>> +++ b/arch/powerpc/include/asm/cputable.h
>>>> @@ -633,7 +633,9 @@ enum {
>>>> * Maximum number of hw breakpoint supported on powerpc. Number of
>>>> * breakpoints supported by actual hw might be less than this.
>>>> */
>>>> -#define HBP_NUM_MAX 1
>>>> +#define HBP_NUM_MAX 2
>>>> +#define HBP_NUM_ONE 1
>>>> +#define HBP_NUM_TWO 2
>
>>> I wonder if these defines are necessary - is there any advantage over
>>> just using the literal?
>>
>> No, not really. Initially I had something like:
>>
>> #define HBP_NUM_MAX 2
>> #define HBP_NUM_P8_P9 1
>> #define HBP_NUM_P10 2
>>
>> But then I thought that's also not right, so I made it _ONE and _TWO.
>> Since the function that decides the number of watchpoints dynamically
>> (nr_wp_slots) is in a different file, I thought keeping it like this
>> would make it easier to figure out why _MAX is 2.
>
> I don't think it makes anything clearer.
>
> I had to stare at it thinking there was some sort of mapping or
> indirection going on, before I realised it's just literally the number
> of breakpoints.
>
> So please just do:
>
> static inline int nr_wp_slots(void)
> {
> 	return cpu_has_feature(CPU_FTR_DAWR1) ? 2 : 1;
> }
>
> If you think HBP_NUM_MAX needs explanation then do that with a comment,
> it can refer to nr_wp_slots() if that's helpful.

Agreed. By adding a comment, we can remove those macros. Will change it.

Thanks,
Ravi

2020-07-21 13:45:07

by Ravi Bangoria

[permalink] [raw]
Subject: Re: [PATCH v4 05/10] powerpc/dt_cpu_ftrs: Add feature for 2nd DAWR



On 7/21/20 4:59 PM, Michael Ellerman wrote:
> Ravi Bangoria <[email protected]> writes:
>> On 7/17/20 11:14 AM, Jordan Niethe wrote:
>>> On Fri, Jul 17, 2020 at 2:10 PM Ravi Bangoria
>>> <[email protected]> wrote:
>>>>
>>>> Add a new device-tree feature for the 2nd DAWR. If this feature is
>>>> present, the 2nd DAWR is supported; otherwise it is not.
>>>>
>>>> Signed-off-by: Ravi Bangoria <[email protected]>
>>>> ---
>>>> arch/powerpc/include/asm/cputable.h | 7 +++++--
>>>> arch/powerpc/kernel/dt_cpu_ftrs.c | 7 +++++++
>>>> 2 files changed, 12 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
>>>> index e506d429b1af..3445c86e1f6f 100644
>>>> --- a/arch/powerpc/include/asm/cputable.h
>>>> +++ b/arch/powerpc/include/asm/cputable.h
>>>> @@ -214,6 +214,7 @@ static inline void cpu_feature_keys_init(void) { }
>>>> #define CPU_FTR_P9_TLBIE_ERAT_BUG LONG_ASM_CONST(0x0001000000000000)
>>>> #define CPU_FTR_P9_RADIX_PREFETCH_BUG LONG_ASM_CONST(0x0002000000000000)
>>>> #define CPU_FTR_ARCH_31 LONG_ASM_CONST(0x0004000000000000)
>>>> +#define CPU_FTR_DAWR1 LONG_ASM_CONST(0x0008000000000000)
>>>>
>>>> #ifndef __ASSEMBLY__
>>>>
>>>> @@ -497,14 +498,16 @@ static inline void cpu_feature_keys_init(void) { }
>>>> #define CPU_FTRS_POSSIBLE \
>>>> (CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | CPU_FTRS_POWER8 | \
>>>> CPU_FTR_ALTIVEC_COMP | CPU_FTR_VSX_COMP | CPU_FTRS_POWER9 | \
>>>> - CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
>>>> + CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
>>>> + CPU_FTR_DAWR1)
>>>> #else
>>>> #define CPU_FTRS_POSSIBLE \
>>>> (CPU_FTRS_PPC970 | CPU_FTRS_POWER5 | \
>>>> CPU_FTRS_POWER6 | CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | \
>>>> CPU_FTRS_POWER8 | CPU_FTRS_CELL | CPU_FTRS_PA6T | \
>>>> CPU_FTR_VSX_COMP | CPU_FTR_ALTIVEC_COMP | CPU_FTRS_POWER9 | \
>>>> - CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
>>>> + CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
>>>> + CPU_FTR_DAWR1)
>
>>> Instead of putting CPU_FTR_DAWR1 into CPU_FTRS_POSSIBLE should it go
>>> into CPU_FTRS_POWER10?
>>> Then it will be picked up by CPU_FTRS_POSSIBLE.
>>
>> I remember a discussion about this with Mikey and we decided to do it
>> this way. Obviously, the purpose is to make CPU_FTR_DAWR1 independent of
>> CPU_FTRS_POWER10 because DAWR1 is an optional feature in p10. I fear
>> including CPU_FTR_DAWR1 in CPU_FTRS_POWER10 could forcibly enable it
>> even when the device-tree property is not present or the pa-feature bit is not set,
>> because we do:
>>
>> 	{ /* 3.1-compliant processor, i.e. Power10 "architected" mode */
>> 		.pvr_mask	= 0xffffffff,
>> 		.pvr_value	= 0x0f000006,
>> 		.cpu_name	= "POWER10 (architected)",
>> 		.cpu_features	= CPU_FTRS_POWER10,
>
> The pa-features logic will turn it off if the feature bit is not set.
>
> So you should be able to put it in CPU_FTRS_POWER10.
>
> See for example CPU_FTR_NOEXECUTE.

Ah ok. scan_features() clears the feature if the bit is not set in
pa-features. So it should work fine for powervm. I'll verify that the
same thing happens in the baremetal case, where we use cpu-features,
not pa-features. If it works on baremetal as well, I will put it in
CPU_FTRS_POWER10.
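
i.e. the core of scan_features() boils down to something like this
(paraphrased, not the literal source; pa_feature_bit_set() is just my
shorthand for the bit test it does inline):

	/* for each entry fp in ibm_pa_features[] */
	if (pa_feature_bit_set(pa_ftrs, fp->pabyte, fp->pabit))
		cur_cpu_spec->cpu_features |= fp->cpu_features;
	else
		cur_cpu_spec->cpu_features &= ~fp->cpu_features;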

Thanks for the clarification,
Ravi

2020-07-21 14:10:06

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH v4 05/10] powerpc/dt_cpu_ftrs: Add feature for 2nd DAWR

Ravi Bangoria <[email protected]> writes:
> On 7/21/20 4:59 PM, Michael Ellerman wrote:
>> Ravi Bangoria <[email protected]> writes:
>>> On 7/17/20 11:14 AM, Jordan Niethe wrote:
>>>> On Fri, Jul 17, 2020 at 2:10 PM Ravi Bangoria
>>>> <[email protected]> wrote:
>>>>>
>>>>> Add a new device-tree feature for the 2nd DAWR. If this feature is
>>>>> present, the 2nd DAWR is supported; otherwise it is not.
>>>>>
>>>>> Signed-off-by: Ravi Bangoria <[email protected]>
>>>>> ---
>>>>> arch/powerpc/include/asm/cputable.h | 7 +++++--
>>>>> arch/powerpc/kernel/dt_cpu_ftrs.c | 7 +++++++
>>>>> 2 files changed, 12 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
>>>>> index e506d429b1af..3445c86e1f6f 100644
>>>>> --- a/arch/powerpc/include/asm/cputable.h
>>>>> +++ b/arch/powerpc/include/asm/cputable.h
>>>>> @@ -214,6 +214,7 @@ static inline void cpu_feature_keys_init(void) { }
>>>>> #define CPU_FTR_P9_TLBIE_ERAT_BUG LONG_ASM_CONST(0x0001000000000000)
>>>>> #define CPU_FTR_P9_RADIX_PREFETCH_BUG LONG_ASM_CONST(0x0002000000000000)
>>>>> #define CPU_FTR_ARCH_31 LONG_ASM_CONST(0x0004000000000000)
>>>>> +#define CPU_FTR_DAWR1 LONG_ASM_CONST(0x0008000000000000)
>>>>>
>>>>> #ifndef __ASSEMBLY__
>>>>>
>>>>> @@ -497,14 +498,16 @@ static inline void cpu_feature_keys_init(void) { }
>>>>> #define CPU_FTRS_POSSIBLE \
>>>>> (CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | CPU_FTRS_POWER8 | \
>>>>> CPU_FTR_ALTIVEC_COMP | CPU_FTR_VSX_COMP | CPU_FTRS_POWER9 | \
>>>>> - CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
>>>>> + CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
>>>>> + CPU_FTR_DAWR1)
>>>>> #else
>>>>> #define CPU_FTRS_POSSIBLE \
>>>>> (CPU_FTRS_PPC970 | CPU_FTRS_POWER5 | \
>>>>> CPU_FTRS_POWER6 | CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | \
>>>>> CPU_FTRS_POWER8 | CPU_FTRS_CELL | CPU_FTRS_PA6T | \
>>>>> CPU_FTR_VSX_COMP | CPU_FTR_ALTIVEC_COMP | CPU_FTRS_POWER9 | \
>>>>> - CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
>>>>> + CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
>>>>> + CPU_FTR_DAWR1)
>>
>>>> Instead of putting CPU_FTR_DAWR1 into CPU_FTRS_POSSIBLE should it go
>>>> into CPU_FTRS_POWER10?
>>>> Then it will be picked up by CPU_FTRS_POSSIBLE.
>>>
>>> I remember a discussion about this with Mikey and we decided to do it
>>> this way. Obviously, the purpose is to make CPU_FTR_DAWR1 independent of
>>> CPU_FTRS_POWER10 because DAWR1 is an optional feature in p10. I fear
>>> including CPU_FTR_DAWR1 in CPU_FTRS_POWER10 could forcibly enable it
>>> even when the device-tree property is not present or the pa-feature bit is not set,
>>> because we do:
>>>
>>> 	{ /* 3.1-compliant processor, i.e. Power10 "architected" mode */
>>> 		.pvr_mask	= 0xffffffff,
>>> 		.pvr_value	= 0x0f000006,
>>> 		.cpu_name	= "POWER10 (architected)",
>>> 		.cpu_features	= CPU_FTRS_POWER10,
>>
>> The pa-features logic will turn it off if the feature bit is not set.
>>
>> So you should be able to put it in CPU_FTRS_POWER10.
>>
>> See for example CPU_FTR_NOEXECUTE.
>
> Ah ok. scan_features() clears the feature if the bit is not set in
> pa-features. So it should work fine for powervm. I'll verify that the
> same thing happens in the baremetal case, where we use cpu-features,
> not pa-features. If it works on baremetal as well, I will put it in
> CPU_FTRS_POWER10.

When we use DT CPU features we don't use CPU_FTRS_POWER10 at all.

We construct a cpu_spec from scratch with just the base set of features:

static struct cpu_spec __initdata base_cpu_spec = {
	.cpu_name	= NULL,
	.cpu_features	= CPU_FTRS_DT_CPU_BASE,


And then individual features are enabled via the device tree flags.
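
The per-feature enablement is driven by the match table in
dt_cpu_ftrs.c, roughly like this (the entry name and handler below are
illustrative; the real ones are added by the dt_cpu_ftrs patch in this
series):

static int __init feat_enable_dawr1(struct dt_cpu_feature *f)
{
	cur_cpu_spec->cpu_features |= CPU_FTR_DAWR1;
	return 1;
}

static struct dt_cpu_feature_match __initdata dt_cpu_feature_match_table[] = {
	...
	{ "debug-facilities-v31", feat_enable_dawr1, 0 },	/* name illustrative */
	...
};

So whether or not the bit is in CPU_FTRS_POWER10 makes no difference
on baremetal; it's the device tree flag that decides.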

cheers

2020-07-21 14:20:14

by Ravi Bangoria

[permalink] [raw]
Subject: Re: [PATCH v4 05/10] powerpc/dt_cpu_ftrs: Add feature for 2nd DAWR



On 7/21/20 7:37 PM, Michael Ellerman wrote:
> Ravi Bangoria <[email protected]> writes:
>> On 7/21/20 4:59 PM, Michael Ellerman wrote:
>>> Ravi Bangoria <[email protected]> writes:
>>>> On 7/17/20 11:14 AM, Jordan Niethe wrote:
>>>>> On Fri, Jul 17, 2020 at 2:10 PM Ravi Bangoria
>>>>> <[email protected]> wrote:
>>>>>>
>>>>>> Add a new device-tree feature for the 2nd DAWR. If this feature is
>>>>>> present, the 2nd DAWR is supported; otherwise it is not.
>>>>>>
>>>>>> Signed-off-by: Ravi Bangoria <[email protected]>
>>>>>> ---
>>>>>> arch/powerpc/include/asm/cputable.h | 7 +++++--
>>>>>> arch/powerpc/kernel/dt_cpu_ftrs.c | 7 +++++++
>>>>>> 2 files changed, 12 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
>>>>>> index e506d429b1af..3445c86e1f6f 100644
>>>>>> --- a/arch/powerpc/include/asm/cputable.h
>>>>>> +++ b/arch/powerpc/include/asm/cputable.h
>>>>>> @@ -214,6 +214,7 @@ static inline void cpu_feature_keys_init(void) { }
>>>>>> #define CPU_FTR_P9_TLBIE_ERAT_BUG LONG_ASM_CONST(0x0001000000000000)
>>>>>> #define CPU_FTR_P9_RADIX_PREFETCH_BUG LONG_ASM_CONST(0x0002000000000000)
>>>>>> #define CPU_FTR_ARCH_31 LONG_ASM_CONST(0x0004000000000000)
>>>>>> +#define CPU_FTR_DAWR1 LONG_ASM_CONST(0x0008000000000000)
>>>>>>
>>>>>> #ifndef __ASSEMBLY__
>>>>>>
>>>>>> @@ -497,14 +498,16 @@ static inline void cpu_feature_keys_init(void) { }
>>>>>> #define CPU_FTRS_POSSIBLE \
>>>>>> (CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | CPU_FTRS_POWER8 | \
>>>>>> CPU_FTR_ALTIVEC_COMP | CPU_FTR_VSX_COMP | CPU_FTRS_POWER9 | \
>>>>>> - CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
>>>>>> + CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
>>>>>> + CPU_FTR_DAWR1)
>>>>>> #else
>>>>>> #define CPU_FTRS_POSSIBLE \
>>>>>> (CPU_FTRS_PPC970 | CPU_FTRS_POWER5 | \
>>>>>> CPU_FTRS_POWER6 | CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | \
>>>>>> CPU_FTRS_POWER8 | CPU_FTRS_CELL | CPU_FTRS_PA6T | \
>>>>>> CPU_FTR_VSX_COMP | CPU_FTR_ALTIVEC_COMP | CPU_FTRS_POWER9 | \
>>>>>> - CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
>>>>>> + CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
>>>>>> + CPU_FTR_DAWR1)
>>>
>>>>> Instead of putting CPU_FTR_DAWR1 into CPU_FTRS_POSSIBLE should it go
>>>>> into CPU_FTRS_POWER10?
>>>>> Then it will be picked up by CPU_FTRS_POSSIBLE.
>>>>
>>>> I remember a discussion about this with Mikey and we decided to do it
>>>> this way. Obviously, the purpose is to make CPU_FTR_DAWR1 independent of
>>>> CPU_FTRS_POWER10 because DAWR1 is an optional feature in p10. I fear
>>>> including CPU_FTR_DAWR1 in CPU_FTRS_POWER10 could forcibly enable it
>>>> even when the device-tree property is not present or the pa-feature bit is not set,
>>>> because we do:
>>>>
>>>> 	{ /* 3.1-compliant processor, i.e. Power10 "architected" mode */
>>>> 		.pvr_mask	= 0xffffffff,
>>>> 		.pvr_value	= 0x0f000006,
>>>> 		.cpu_name	= "POWER10 (architected)",
>>>> 		.cpu_features	= CPU_FTRS_POWER10,
>>>
>>> The pa-features logic will turn it off if the feature bit is not set.
>>>
>>> So you should be able to put it in CPU_FTRS_POWER10.
>>>
>>> See for example CPU_FTR_NOEXECUTE.
>>
>> Ah ok. scan_features() clears the feature if the bit is not set in
>> pa-features. So it should work fine for powervm. I'll verify that the
>> same thing happens in the baremetal case, where we use cpu-features,
>> not pa-features. If it works on baremetal as well, I will put it in
>> CPU_FTRS_POWER10.
>
> When we use DT CPU features we don't use CPU_FTRS_POWER10 at all.
>
> We construct a cpu_spec from scratch with just the base set of features:
>
> static struct cpu_spec __initdata base_cpu_spec = {
> 	.cpu_name	= NULL,
> 	.cpu_features	= CPU_FTRS_DT_CPU_BASE,
>
>
> And then individual features are enabled via the device tree flags.

Ah good. I was under the wrong impression that we use cpu_specs[] in all
cases. Thanks, mpe, for explaining in detail :)

Ravi