2023-02-15 09:24:05

by Qing Zhang

Subject: [PATCH v3 0/3] LoongArch: Add hardware breakpoints/watchpoints support

Use the perf framework to manage hardware instruction and data breakpoints.
LoongArch defines hardware watchpoints for instruction fetch and for
load/store operations. Once software has configured a watchpoint, the
processor monitors the addresses of instruction fetches or load/store
accesses and raises a watch exception when an access meets the conditions
set by that watchpoint.
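
The match condition can be sketched in userspace C. The comparison below
mirrors the one patch 2/3 uses in hw_breakpoint_thread_switch(), where bits
set in the mask are ignored. The name watch_hit is hypothetical, and this
only models the rule as the series uses it; it is not a statement about the
exact hardware behaviour.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the address match: every bit that is set in
 * `mask` is ignored, all remaining bits of the access address must
 * equal the configured address.
 */
bool watch_hit(uint64_t access, uint64_t addr, uint64_t mask)
{
	return (access & ~mask) == (addr & ~mask);
}
```

With a mask of 0x7, for instance, any access inside the aligned 8-byte
window containing addr counts as a hit under this model.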

The instruction-fetch watchpoints and the load/store watchpoints each have
one register for the overall configuration of all watchpoints, one register
recording the status of all watchpoints, and four registers for configuring
each watchpoint individually.
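
The four per-watchpoint registers follow a regular numbering in the CSR
definitions that patch 1/3 touches in asm/loongarch.h (IB0ADDR at 0x390,
DB0ADDR at 0x310, a stride of 8 per watchpoint). A small sketch of that
arithmetic, with hypothetical names (wp_csr, wp_kind, wp_field) that do not
appear in the series:

```c
#include <stdint.h>

/* Hypothetical names; the values come from the LOONGARCH_CSR_* defines. */
enum wp_kind  { WP_INST = 0, WP_DATA = 1 };
enum wp_field { WP_ADDR = 0, WP_MASK = 1, WP_CTRL = 2, WP_ASID = 3 };

#define IB_BASE   0x390u	/* LOONGARCH_CSR_IB0ADDR */
#define DB_BASE   0x310u	/* LOONGARCH_CSR_DB0ADDR */
#define WP_STRIDE 8u		/* distance between watchpoint n and n + 1 */

/* Return the CSR number of one configuration register of watchpoint n. */
uint32_t wp_csr(enum wp_kind kind, unsigned int n, enum wp_field field)
{
	uint32_t base = (kind == WP_INST) ? IB_BASE : DB_BASE;

	return base + WP_STRIDE * n + (uint32_t)field;
}
```

For example, wp_csr(WP_DATA, 7, WP_ASID) yields 0x34b, which matches
LOONGARCH_CSR_DB7ASID.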

The watchpoint-related control and status registers are described in:
https://github.com/loongson/LoongArch-Documentation/releases/download/2022.08.12/LoongArch-Vol1-v1.02-CN.pdf

The initial version has the following limitation:
- no support for virtualization

This can be tested with, for example:
1. samples/hw_breakpoint and register_wide_hw_breakpoint()
2. ptrace(PTRACE_SINGLESTEP/..., pid, NULL, NULL)
3. ptrace(PTRACE_SETREGSET, tid, ..., ...)
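
As an illustration of item 2, a minimal userspace sketch that single-steps
a forked child might look like the following. The function name
count_single_steps is hypothetical and this is not the test the author
ran; it only uses the standard ptrace(2) requests PTRACE_TRACEME and
PTRACE_SINGLESTEP.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child, single-step it up to max_steps times and return how many
 * SIGTRAP stops were observed, or -1 if the setup failed.
 */
int count_single_steps(int max_steps)
{
	int status, steps = 0;
	pid_t pid = fork();

	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: ask to be traced, then stop until the parent resumes us. */
		if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
			_exit(1);
		raise(SIGSTOP);
		_exit(0);
	}

	/* Parent: wait for the child's initial SIGSTOP. */
	if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status)) {
		kill(pid, SIGKILL);
		waitpid(pid, NULL, 0);
		return -1;
	}

	while (steps < max_steps) {
		if (ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL) < 0)
			break;
		if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
			break;
		if (WSTOPSIG(status) == SIGTRAP)
			steps++;
	}

	/* Tear the child down regardless of how far we got. */
	kill(pid, SIGKILL);
	waitpid(pid, NULL, 0);
	return steps;
}
```

On a kernel with working single-step support, each successful
PTRACE_SINGLESTEP should be followed by one SIGTRAP stop, so a call such as
count_single_steps(5) would be expected to count up to five stops.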

TODO:
- Add hardware breakpoints/watchpoints for gdb, kgdb

Changes v1 -> v2:
Suggested by Huacai:
- Use irqentry_enter()/irqentry_exit() instead of
exception_enter()/exception_exit().
- Add ptrace interface to expose hw-breakpoints to debuggers.
- Use 2022-2023 as the copyright years.
- Some other changes.

Changes v2 -> v3:
Suggested by Jinyang:
- Make get_num_brps/wrps and decode/encode_ctrl_reg return their
results directly.
- Remove irrelevant content from the first patch.
- Use macros to avoid using magic values directly.
- Add a new argument indicating whether the target is a breakpoint or a
watchpoint, to avoid function coupling.
- Remove redundant tab.
- Change the return type of breakpoint_handler/watchpoint_handler to void.
- Some other changes.

Qing Zhang (3):
LoongArch: Add hardware breakpoints/watchpoints support
LoongArch: Add ptrace single step support
LoongArch: ptrace: expose hardware breakpoints to debuggers

arch/loongarch/Kconfig | 1 +
arch/loongarch/include/asm/hw_breakpoint.h | 145 ++++++
arch/loongarch/include/asm/loongarch.h | 32 +-
arch/loongarch/include/asm/processor.h | 14 +-
arch/loongarch/include/asm/ptrace.h | 2 +
arch/loongarch/include/asm/switch_to.h | 1 +
arch/loongarch/include/uapi/asm/ptrace.h | 9 +
arch/loongarch/kernel/Makefile | 1 +
arch/loongarch/kernel/hw_breakpoint.c | 544 +++++++++++++++++++++
arch/loongarch/kernel/process.c | 7 +
arch/loongarch/kernel/ptrace.c | 484 ++++++++++++++++++
arch/loongarch/kernel/traps.c | 24 +-
include/uapi/linux/elf.h | 2 +
13 files changed, 1245 insertions(+), 21 deletions(-)
create mode 100644 arch/loongarch/include/asm/hw_breakpoint.h
create mode 100644 arch/loongarch/kernel/hw_breakpoint.c

--
2.36.0



2023-02-15 09:24:09

by Qing Zhang

Subject: [PATCH v3 2/3] LoongArch: Add ptrace single step support

Use the generic ptrace_resume code for PTRACE_SYSCALL, PTRACE_CONT,
PTRACE_KILL and PTRACE_SINGLESTEP. This requires defining
arch_has_single_step() in asm/ptrace.h and implementing the
user_enable_single_step() and user_disable_single_step() functions.

LoongArch has no dedicated hardware single-step register. Single-stepping
instead reuses the instruction-fetch watchpoint (FWPS): setting the mask bit
specifies that the next instruction must trigger the watch exception.
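
The decision that do_watch() takes for a single-stepped task in this patch
can be modelled as a pure function. This is only a sketch: the enum and the
function name single_step_action are hypothetical, and the CSR writes are
reduced to labels that repeat the constants used in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical labels for the three branches taken in do_watch(). */
enum step_action {
	STEP_WRITE_FWPS_AND_LLBCTL,	/* write 0x10000 to FWPS, 0x4 to LLBCTL */
	STEP_WRITE_FWPS,		/* write 0x10000 to FWPS only */
	STEP_SEND_SIGTRAP,		/* force_sig(SIGTRAP) */
};

/*
 * llbit:   bit 0 of LLBCTL as read in the handler
 * pc:      regs->csr_era at the time of the exception
 * step_pc: thread.single_step, recorded by user_enable_single_step()
 */
enum step_action single_step_action(bool llbit, uint64_t pc, uint64_t step_pc)
{
	if (llbit)
		return STEP_WRITE_FWPS_AND_LLBCTL;
	if (pc == step_pc)
		return STEP_WRITE_FWPS;
	return STEP_SEND_SIGTRAP;
}
```

In this model SIGTRAP is delivered only when the LL bit is clear and the
exception PC differs from the address recorded when stepping was enabled.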

Signed-off-by: Qing Zhang <[email protected]>
---
arch/loongarch/include/asm/processor.h | 3 ++
arch/loongarch/include/asm/ptrace.h | 2 +
arch/loongarch/kernel/hw_breakpoint.c | 35 +++++++++++--
arch/loongarch/kernel/ptrace.c | 68 ++++++++++++++++++++++++++
arch/loongarch/kernel/traps.c | 20 ++++++--
5 files changed, 120 insertions(+), 8 deletions(-)

diff --git a/arch/loongarch/include/asm/processor.h b/arch/loongarch/include/asm/processor.h
index db060c5a976f..3ea0f1910c23 100644
--- a/arch/loongarch/include/asm/processor.h
+++ b/arch/loongarch/include/asm/processor.h
@@ -131,6 +131,9 @@ struct thread_struct {
struct perf_event *hbp_break[LOONGARCH_MAX_BRP];
struct perf_event *hbp_watch[LOONGARCH_MAX_WRP];

+ /* Used by ptrace single_step */
+ unsigned long single_step;
+
/*
* FPU & vector registers, must be at last because
* they are conditionally copied at fork().
diff --git a/arch/loongarch/include/asm/ptrace.h b/arch/loongarch/include/asm/ptrace.h
index 58596c4f8a0f..66a0e6c480a3 100644
--- a/arch/loongarch/include/asm/ptrace.h
+++ b/arch/loongarch/include/asm/ptrace.h
@@ -150,4 +150,6 @@ static inline void user_stack_pointer_set(struct pt_regs *regs,
regs->regs[3] = val;
}

+#define arch_has_single_step() (1)
+
#endif /* _ASM_PTRACE_H */
diff --git a/arch/loongarch/kernel/hw_breakpoint.c b/arch/loongarch/kernel/hw_breakpoint.c
index 6431cd319c32..94967b887d92 100644
--- a/arch/loongarch/kernel/hw_breakpoint.c
+++ b/arch/loongarch/kernel/hw_breakpoint.c
@@ -153,6 +153,22 @@ static int hw_breakpoint_slot_setup(struct perf_event **slots, int max_slots,
*/
void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
{
+ int i;
+ struct thread_struct *t = &tsk->thread;
+
+ for (i = 0; i < LOONGARCH_MAX_BRP; i++) {
+ if (t->hbp_break[i]) {
+ unregister_hw_breakpoint(t->hbp_break[i]);
+ t->hbp_break[i] = NULL;
+ }
+ }
+
+ for (i = 0; i < LOONGARCH_MAX_WRP; i++) {
+ if (t->hbp_watch[i]) {
+ unregister_hw_breakpoint(t->hbp_watch[i]);
+ t->hbp_watch[i] = NULL;
+ }
+ }
}

void ptrace_hw_copy_thread(struct task_struct *tsk)
@@ -498,11 +514,20 @@ arch_initcall(arch_hw_breakpoint_init);
void hw_breakpoint_thread_switch(struct task_struct *next)
{
struct pt_regs *regs = task_pt_regs(next);
-
- /* Update breakpoints */
- update_bp_registers(regs, 1, 0);
- /* Update watchpoints */
- update_bp_registers(regs, 1, 1);
+ u64 addr, mask;
+
+ if (test_bit(TIF_SINGLESTEP, &task_thread_info(next)->flags)) {
+ addr = read_wb_reg(CSR_CFG_ADDR, 0, 0);
+ mask = read_wb_reg(CSR_CFG_MASK, 0, 0);
+ if ((task_pt_regs(next)->csr_era & ~mask) == (addr & ~mask))
+ csr_write32(0x10000, LOONGARCH_CSR_FWPS);
+ regs->csr_prmd |= CSR_PRMD_PWE;
+ } else {
+ /* Update breakpoints */
+ update_bp_registers(regs, 1, 0);
+ /* Update watchpoints */
+ update_bp_registers(regs, 1, 1);
+ }
}

void hw_breakpoint_pmu_read(struct perf_event *bp)
diff --git a/arch/loongarch/kernel/ptrace.c b/arch/loongarch/kernel/ptrace.c
index bee4194177fd..52a3ee4366f4 100644
--- a/arch/loongarch/kernel/ptrace.c
+++ b/arch/loongarch/kernel/ptrace.c
@@ -20,6 +20,7 @@
#include <linux/context_tracking.h>
#include <linux/elf.h>
#include <linux/errno.h>
+#include <linux/hw_breakpoint.h>
#include <linux/mm.h>
#include <linux/ptrace.h>
#include <linux/regset.h>
@@ -30,6 +31,7 @@
#include <linux/stddef.h>
#include <linux/seccomp.h>
#include <linux/uaccess.h>
+#include <linux/thread_info.h>

#include <asm/byteorder.h>
#include <asm/cpu.h>
@@ -39,6 +41,7 @@
#include <asm/page.h>
#include <asm/pgtable.h>
#include <asm/processor.h>
+#include <asm/ptrace.h>
#include <asm/reg.h>
#include <asm/syscall.h>

@@ -541,3 +544,68 @@ long arch_ptrace(struct task_struct *child, long request,

return ret;
}
+
+void ptrace_triggered(struct perf_event *bp,
+ struct perf_sample_data *data, struct pt_regs *regs)
+{
+ struct perf_event_attr attr;
+
+ attr = bp->attr;
+ attr.disabled = true;
+ modify_user_hw_breakpoint(bp, &attr);
+}
+
+static int set_single_step(struct task_struct *tsk, unsigned long addr)
+{
+ struct thread_struct *thread = &tsk->thread;
+ struct perf_event *bp;
+ struct perf_event_attr attr;
+ struct arch_hw_breakpoint *info;
+
+ bp = thread->hbp_break[0];
+ if (!bp) {
+ ptrace_breakpoint_init(&attr);
+
+ attr.bp_addr = addr;
+ attr.bp_len = HW_BREAKPOINT_LEN_8;
+ attr.bp_type = HW_BREAKPOINT_X;
+
+ bp = register_user_hw_breakpoint(&attr, ptrace_triggered,
+ NULL, tsk);
+ if (IS_ERR(bp))
+ return PTR_ERR(bp);
+
+ thread->hbp_break[0] = bp;
+ } else {
+ int err;
+
+ attr = bp->attr;
+ attr.bp_addr = addr;
+ /* reenable breakpoint */
+ attr.disabled = false;
+ err = modify_user_hw_breakpoint(bp, &attr);
+ if (unlikely(err))
+ return err;
+
+ csr_write64(attr.bp_addr, LOONGARCH_CSR_IB0ADDR);
+ }
+ info = counter_arch_bp(bp);
+ info->mask = 0xffffffffffff;
+
+ return 0;
+}
+
+/* ptrace API */
+void user_enable_single_step(struct task_struct *task)
+{
+ struct thread_info *ti = task_thread_info(task);
+
+ set_single_step(task, task_pt_regs(task)->csr_era);
+ task->thread.single_step = task_pt_regs(task)->csr_era;
+ set_ti_thread_flag(ti, TIF_SINGLESTEP);
+}
+
+void user_disable_single_step(struct task_struct *task)
+{
+ clear_tsk_thread_flag(task, TIF_SINGLESTEP);
+}
diff --git a/arch/loongarch/kernel/traps.c b/arch/loongarch/kernel/traps.c
index 2b133079e0f3..56d7d076153c 100644
--- a/arch/loongarch/kernel/traps.c
+++ b/arch/loongarch/kernel/traps.c
@@ -511,9 +511,23 @@ asmlinkage void noinstr do_watch(struct pt_regs *regs)
#ifdef CONFIG_HAVE_HW_BREAKPOINT
irqentry_state_t state = irqentry_enter(regs);

- breakpoint_handler(regs);
- watchpoint_handler(regs);
- force_sig(SIGTRAP);
+ if (test_tsk_thread_flag(current, TIF_SINGLESTEP)) {
+ int llbit = (csr_read32(LOONGARCH_CSR_LLBCTL) & 0x1);
+ unsigned long pc = regs->csr_era;
+
+ if (llbit) {
+ csr_write32(0x10000, LOONGARCH_CSR_FWPS);
+ csr_write32(0x4, LOONGARCH_CSR_LLBCTL);
+ } else if (pc == current->thread.single_step) {
+ csr_write32(0x10000, LOONGARCH_CSR_FWPS);
+ } else {
+ force_sig(SIGTRAP);
+ }
+ } else {
+ breakpoint_handler(regs);
+ watchpoint_handler(regs);
+ force_sig(SIGTRAP);
+ }

irqentry_exit(regs, state);
#endif
--
2.36.0


2023-02-15 09:24:20

by Qing Zhang

Subject: [PATCH v3 1/3] LoongArch: Add hardware breakpoints/watchpoints support

Use the perf framework to manage hardware instruction and data breakpoints.
LoongArch defines hardware watchpoints for instruction fetch and for
load/store operations. Once software has configured a watchpoint, the
processor monitors the addresses of instruction fetches or load/store
accesses and raises a watch exception when an access meets the conditions
set by that watchpoint.

The instruction-fetch watchpoints and the load/store watchpoints each have
one register for the overall configuration of all watchpoints, one register
recording the status of all watchpoints, and four registers for configuring
each watchpoint individually.
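
The control-register layout used by encode_ctrl_reg()/decode_ctrl_reg() in
this patch (type at bit 8, length code at bit 10, two bits each) can be
tried out in userspace with a sketch like the one below. struct wp_ctrl and
the wp_* helpers are hypothetical stand-ins for the kernel definitions, and
the length codes are the LOONGARCH_BREAKPOINT_LEN_* values from the new
header.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct arch_hw_breakpoint_ctrl. */
struct wp_ctrl {
	uint32_t len;	/* 2-bit length code */
	uint32_t type;	/* 2-bit load/store flags */
};

#define WP_TYPE_SHIFT 8
#define WP_LEN_SHIFT  10
#define WP_FIELD_MASK 0x3u

/* Pack the type and length code into their control-register positions. */
uint32_t wp_encode_ctrl(struct wp_ctrl c)
{
	return ((c.len & WP_FIELD_MASK) << WP_LEN_SHIFT) |
	       ((c.type & WP_FIELD_MASK) << WP_TYPE_SHIFT);
}

/* Inverse of wp_encode_ctrl(). */
struct wp_ctrl wp_decode_ctrl(uint32_t reg)
{
	struct wp_ctrl c;

	c.type = (reg >> WP_TYPE_SHIFT) & WP_FIELD_MASK;
	c.len = (reg >> WP_LEN_SHIFT) & WP_FIELD_MASK;
	return c;
}

/* Length codes: 0b11 = 1 byte, 0b10 = 2, 0b01 = 4, 0b00 = 8. */
unsigned int wp_len_bytes(uint32_t code)
{
	return 8u >> (code & WP_FIELD_MASK);
}
```

Encoding and then decoding a value should return the original type and
length code, and wp_len_bytes() gives the same byte counts as get_hbp_len()
in the patch.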

Signed-off-by: Qing Zhang <[email protected]>
---
arch/loongarch/Kconfig | 1 +
arch/loongarch/include/asm/hw_breakpoint.h | 145 ++++++
arch/loongarch/include/asm/loongarch.h | 32 +-
arch/loongarch/include/asm/processor.h | 11 +-
arch/loongarch/include/asm/switch_to.h | 1 +
arch/loongarch/kernel/Makefile | 1 +
arch/loongarch/kernel/hw_breakpoint.c | 519 +++++++++++++++++++++
arch/loongarch/kernel/process.c | 7 +
arch/loongarch/kernel/traps.c | 10 +-
9 files changed, 706 insertions(+), 21 deletions(-)
create mode 100644 arch/loongarch/include/asm/hw_breakpoint.h
create mode 100644 arch/loongarch/kernel/hw_breakpoint.c

diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 54bd3dbde1f2..e3eba2eb4b44 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -100,6 +100,7 @@ config LOONGARCH
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FUNCTION_TRACER
select HAVE_GENERIC_VDSO
+ select HAVE_HW_BREAKPOINT if PERF_EVENTS
select HAVE_IOREMAP_PROT
select HAVE_IRQ_EXIT_ON_IRQ_STACK
select HAVE_IRQ_TIME_ACCOUNTING
diff --git a/arch/loongarch/include/asm/hw_breakpoint.h b/arch/loongarch/include/asm/hw_breakpoint.h
new file mode 100644
index 000000000000..c70b56559e00
--- /dev/null
+++ b/arch/loongarch/include/asm/hw_breakpoint.h
@@ -0,0 +1,145 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2022-2023 Loongson Technology Corporation Limited
+ */
+#ifndef __ASM_HW_BREAKPOINT_H
+#define __ASM_HW_BREAKPOINT_H
+
+#include <asm/loongarch.h>
+
+#ifdef __KERNEL__
+
+/* Breakpoint */
+#define LOONGARCH_BREAKPOINT_EXECUTE (0 << 0)
+
+/* Watchpoints */
+#define LOONGARCH_BREAKPOINT_LOAD (1 << 0)
+#define LOONGARCH_BREAKPOINT_STORE (1 << 1)
+
+struct arch_hw_breakpoint_ctrl {
+ u32 __reserved1 : 28,
+ len : 2,
+ type : 2;
+};
+
+struct arch_hw_breakpoint {
+ u64 address;
+ u64 mask;
+ struct arch_hw_breakpoint_ctrl ctrl;
+};
+
+/* Lengths */
+#define LOONGARCH_BREAKPOINT_LEN_1 0b11
+#define LOONGARCH_BREAKPOINT_LEN_2 0b10
+#define LOONGARCH_BREAKPOINT_LEN_4 0b01
+#define LOONGARCH_BREAKPOINT_LEN_8 0b00
+
+/*
+ * Limits.
+ * Changing these will require modifications to the register accessors.
+ */
+#define LOONGARCH_MAX_BRP 8
+#define LOONGARCH_MAX_WRP 8
+
+/* Virtual debug register bases. */
+#define CSR_CFG_ADDR 0
+#define CSR_CFG_MASK (CSR_CFG_ADDR + LOONGARCH_MAX_BRP)
+#define CSR_CFG_CTRL (CSR_CFG_MASK + LOONGARCH_MAX_BRP)
+#define CSR_CFG_ASID (CSR_CFG_CTRL + LOONGARCH_MAX_WRP)
+
+/* Debug register names. */
+#define LOONGARCH_CSR_NAME_ADDR ADDR
+#define LOONGARCH_CSR_NAME_MASK MASK
+#define LOONGARCH_CSR_NAME_CTRL CTRL
+#define LOONGARCH_CSR_NAME_ASID ASID
+
+/* Accessor macros for the debug registers. */
+#define LOONGARCH_CSR_WATCH_READ(N, REG, T, VAL) \
+do { \
+ if (T == 0) \
+ VAL = csr_read64(LOONGARCH_CSR_##IB##N##REG); \
+ else \
+ VAL = csr_read64(LOONGARCH_CSR_##DB##N##REG); \
+} while (0)
+
+#define LOONGARCH_CSR_WATCH_WRITE(N, REG, T, VAL) \
+do { \
+ if (T == 0) \
+ csr_write64(VAL, LOONGARCH_CSR_##IB##N##REG); \
+ else \
+ csr_write64(VAL, LOONGARCH_CSR_##DB##N##REG); \
+} while (0)
+
+/* Exact number */
+#define CSR_FWPC_NUM 0x3f
+#define CSR_MWPC_NUM 0x3f
+
+#define CTRL_PLV_ENABLE 0x1e
+
+#define MWPnCFG3_LoadEn 8
+#define MWPnCFG3_StoreEn 9
+
+#define MWPnCFG3_Type_mask 0x3
+#define MWPnCFG3_Size_mask 0x3
+
+static inline u32 encode_ctrl_reg(struct arch_hw_breakpoint_ctrl ctrl)
+{
+ return (ctrl.len << 10) | (ctrl.type << 8);
+}
+
+static inline void decode_ctrl_reg(u32 reg, struct arch_hw_breakpoint_ctrl *ctrl)
+{
+ reg >>= 8;
+ ctrl->type = reg & MWPnCFG3_Type_mask;
+ reg >>= 2;
+ ctrl->len = reg & MWPnCFG3_Size_mask;
+}
+
+struct task_struct;
+struct notifier_block;
+struct perf_event_attr;
+struct perf_event;
+
+extern int arch_bp_generic_fields(struct arch_hw_breakpoint_ctrl ctrl,
+ int *gen_len, int *gen_type, int *offset);
+extern int arch_check_bp_in_kernelspace(struct arch_hw_breakpoint *hw);
+extern int hw_breakpoint_arch_parse(struct perf_event *bp,
+ const struct perf_event_attr *attr,
+ struct arch_hw_breakpoint *hw);
+extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
+ unsigned long val, void *data);
+
+extern int arch_install_hw_breakpoint(struct perf_event *bp);
+extern void arch_uninstall_hw_breakpoint(struct perf_event *bp);
+extern void hw_breakpoint_pmu_read(struct perf_event *bp);
+extern int hw_breakpoint_slots(int type);
+
+void breakpoint_handler(struct pt_regs *regs);
+void watchpoint_handler(struct pt_regs *regs);
+
+#ifdef CONFIG_HAVE_HW_BREAKPOINT
+extern void hw_breakpoint_thread_switch(struct task_struct *next);
+extern void ptrace_hw_copy_thread(struct task_struct *task);
+#else
+static inline void hw_breakpoint_thread_switch(struct task_struct *next)
+{
+}
+static inline void ptrace_hw_copy_thread(struct task_struct *task)
+{
+}
+#endif
+
+/* Determine number of BRP registers available. */
+static inline int get_num_brps(void)
+{
+ return csr_read64(LOONGARCH_CSR_FWPC) & CSR_FWPC_NUM;
+}
+
+/* Determine number of WRP registers available. */
+static inline int get_num_wrps(void)
+{
+ return csr_read64(LOONGARCH_CSR_MWPC) & CSR_MWPC_NUM;
+}
+
+#endif /* __KERNEL__ */
+#endif /* __ASM_HW_BREAKPOINT_H */
diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h
index 7f8d57a61c8b..e9aed583a064 100644
--- a/arch/loongarch/include/asm/loongarch.h
+++ b/arch/loongarch/include/asm/loongarch.h
@@ -970,42 +970,42 @@ static __always_inline void iocsr_write64(u64 val, u32 reg)

#define LOONGARCH_CSR_DB0ADDR 0x310 /* data breakpoint 0 address */
#define LOONGARCH_CSR_DB0MASK 0x311 /* data breakpoint 0 mask */
-#define LOONGARCH_CSR_DB0CTL 0x312 /* data breakpoint 0 control */
+#define LOONGARCH_CSR_DB0CTRL 0x312 /* data breakpoint 0 control */
#define LOONGARCH_CSR_DB0ASID 0x313 /* data breakpoint 0 asid */

#define LOONGARCH_CSR_DB1ADDR 0x318 /* data breakpoint 1 address */
#define LOONGARCH_CSR_DB1MASK 0x319 /* data breakpoint 1 mask */
-#define LOONGARCH_CSR_DB1CTL 0x31a /* data breakpoint 1 control */
+#define LOONGARCH_CSR_DB1CTRL 0x31a /* data breakpoint 1 control */
#define LOONGARCH_CSR_DB1ASID 0x31b /* data breakpoint 1 asid */

#define LOONGARCH_CSR_DB2ADDR 0x320 /* data breakpoint 2 address */
#define LOONGARCH_CSR_DB2MASK 0x321 /* data breakpoint 2 mask */
-#define LOONGARCH_CSR_DB2CTL 0x322 /* data breakpoint 2 control */
+#define LOONGARCH_CSR_DB2CTRL 0x322 /* data breakpoint 2 control */
#define LOONGARCH_CSR_DB2ASID 0x323 /* data breakpoint 2 asid */

#define LOONGARCH_CSR_DB3ADDR 0x328 /* data breakpoint 3 address */
#define LOONGARCH_CSR_DB3MASK 0x329 /* data breakpoint 3 mask */
-#define LOONGARCH_CSR_DB3CTL 0x32a /* data breakpoint 3 control */
+#define LOONGARCH_CSR_DB3CTRL 0x32a /* data breakpoint 3 control */
#define LOONGARCH_CSR_DB3ASID 0x32b /* data breakpoint 3 asid */

#define LOONGARCH_CSR_DB4ADDR 0x330 /* data breakpoint 4 address */
#define LOONGARCH_CSR_DB4MASK 0x331 /* data breakpoint 4 maks */
-#define LOONGARCH_CSR_DB4CTL 0x332 /* data breakpoint 4 control */
+#define LOONGARCH_CSR_DB4CTRL 0x332 /* data breakpoint 4 control */
#define LOONGARCH_CSR_DB4ASID 0x333 /* data breakpoint 4 asid */

#define LOONGARCH_CSR_DB5ADDR 0x338 /* data breakpoint 5 address */
#define LOONGARCH_CSR_DB5MASK 0x339 /* data breakpoint 5 mask */
-#define LOONGARCH_CSR_DB5CTL 0x33a /* data breakpoint 5 control */
+#define LOONGARCH_CSR_DB5CTRL 0x33a /* data breakpoint 5 control */
#define LOONGARCH_CSR_DB5ASID 0x33b /* data breakpoint 5 asid */

#define LOONGARCH_CSR_DB6ADDR 0x340 /* data breakpoint 6 address */
#define LOONGARCH_CSR_DB6MASK 0x341 /* data breakpoint 6 mask */
-#define LOONGARCH_CSR_DB6CTL 0x342 /* data breakpoint 6 control */
+#define LOONGARCH_CSR_DB6CTRL 0x342 /* data breakpoint 6 control */
#define LOONGARCH_CSR_DB6ASID 0x343 /* data breakpoint 6 asid */

#define LOONGARCH_CSR_DB7ADDR 0x348 /* data breakpoint 7 address */
#define LOONGARCH_CSR_DB7MASK 0x349 /* data breakpoint 7 mask */
-#define LOONGARCH_CSR_DB7CTL 0x34a /* data breakpoint 7 control */
+#define LOONGARCH_CSR_DB7CTRL 0x34a /* data breakpoint 7 control */
#define LOONGARCH_CSR_DB7ASID 0x34b /* data breakpoint 7 asid */

#define LOONGARCH_CSR_FWPC 0x380 /* instruction breakpoint config */
@@ -1013,42 +1013,42 @@ static __always_inline void iocsr_write64(u64 val, u32 reg)

#define LOONGARCH_CSR_IB0ADDR 0x390 /* inst breakpoint 0 address */
#define LOONGARCH_CSR_IB0MASK 0x391 /* inst breakpoint 0 mask */
-#define LOONGARCH_CSR_IB0CTL 0x392 /* inst breakpoint 0 control */
+#define LOONGARCH_CSR_IB0CTRL 0x392 /* inst breakpoint 0 control */
#define LOONGARCH_CSR_IB0ASID 0x393 /* inst breakpoint 0 asid */

#define LOONGARCH_CSR_IB1ADDR 0x398 /* inst breakpoint 1 address */
#define LOONGARCH_CSR_IB1MASK 0x399 /* inst breakpoint 1 mask */
-#define LOONGARCH_CSR_IB1CTL 0x39a /* inst breakpoint 1 control */
+#define LOONGARCH_CSR_IB1CTRL 0x39a /* inst breakpoint 1 control */
#define LOONGARCH_CSR_IB1ASID 0x39b /* inst breakpoint 1 asid */

#define LOONGARCH_CSR_IB2ADDR 0x3a0 /* inst breakpoint 2 address */
#define LOONGARCH_CSR_IB2MASK 0x3a1 /* inst breakpoint 2 mask */
-#define LOONGARCH_CSR_IB2CTL 0x3a2 /* inst breakpoint 2 control */
+#define LOONGARCH_CSR_IB2CTRL 0x3a2 /* inst breakpoint 2 control */
#define LOONGARCH_CSR_IB2ASID 0x3a3 /* inst breakpoint 2 asid */

#define LOONGARCH_CSR_IB3ADDR 0x3a8 /* inst breakpoint 3 address */
#define LOONGARCH_CSR_IB3MASK 0x3a9 /* breakpoint 3 mask */
-#define LOONGARCH_CSR_IB3CTL 0x3aa /* inst breakpoint 3 control */
+#define LOONGARCH_CSR_IB3CTRL 0x3aa /* inst breakpoint 3 control */
#define LOONGARCH_CSR_IB3ASID 0x3ab /* inst breakpoint 3 asid */

#define LOONGARCH_CSR_IB4ADDR 0x3b0 /* inst breakpoint 4 address */
#define LOONGARCH_CSR_IB4MASK 0x3b1 /* inst breakpoint 4 mask */
-#define LOONGARCH_CSR_IB4CTL 0x3b2 /* inst breakpoint 4 control */
+#define LOONGARCH_CSR_IB4CTRL 0x3b2 /* inst breakpoint 4 control */
#define LOONGARCH_CSR_IB4ASID 0x3b3 /* inst breakpoint 4 asid */

#define LOONGARCH_CSR_IB5ADDR 0x3b8 /* inst breakpoint 5 address */
#define LOONGARCH_CSR_IB5MASK 0x3b9 /* inst breakpoint 5 mask */
-#define LOONGARCH_CSR_IB5CTL 0x3ba /* inst breakpoint 5 control */
+#define LOONGARCH_CSR_IB5CTRL 0x3ba /* inst breakpoint 5 control */
#define LOONGARCH_CSR_IB5ASID 0x3bb /* inst breakpoint 5 asid */

#define LOONGARCH_CSR_IB6ADDR 0x3c0 /* inst breakpoint 6 address */
#define LOONGARCH_CSR_IB6MASK 0x3c1 /* inst breakpoint 6 mask */
-#define LOONGARCH_CSR_IB6CTL 0x3c2 /* inst breakpoint 6 control */
+#define LOONGARCH_CSR_IB6CTRL 0x3c2 /* inst breakpoint 6 control */
#define LOONGARCH_CSR_IB6ASID 0x3c3 /* inst breakpoint 6 asid */

#define LOONGARCH_CSR_IB7ADDR 0x3c8 /* inst breakpoint 7 address */
#define LOONGARCH_CSR_IB7MASK 0x3c9 /* inst breakpoint 7 mask */
-#define LOONGARCH_CSR_IB7CTL 0x3ca /* inst breakpoint 7 control */
+#define LOONGARCH_CSR_IB7CTRL 0x3ca /* inst breakpoint 7 control */
#define LOONGARCH_CSR_IB7ASID 0x3cb /* inst breakpoint 7 asid */

#define LOONGARCH_CSR_DEBUG 0x500 /* debug config */
diff --git a/arch/loongarch/include/asm/processor.h b/arch/loongarch/include/asm/processor.h
index 7184f1dc61f2..db060c5a976f 100644
--- a/arch/loongarch/include/asm/processor.h
+++ b/arch/loongarch/include/asm/processor.h
@@ -11,6 +11,7 @@

#include <asm/cpu.h>
#include <asm/cpu-info.h>
+#include <asm/hw_breakpoint.h>
#include <asm/loongarch.h>
#include <asm/vdso/processor.h>
#include <uapi/asm/ptrace.h>
@@ -126,6 +127,10 @@ struct thread_struct {
unsigned long error_code;
struct loongarch_vdso_info *vdso;

+ /* Hardware breakpoints pinned to this task. */
+ struct perf_event *hbp_break[LOONGARCH_MAX_BRP];
+ struct perf_event *hbp_watch[LOONGARCH_MAX_WRP];
+
/*
* FPU & vector registers, must be at last because
* they are conditionally copied at fork().
@@ -164,6 +169,8 @@ struct thread_struct {
*/ \
.trap_nr = 0, \
.error_code = 0, \
+ .hbp_break = {0}, \
+ .hbp_watch = {0}, \
/* \
* FPU & vector registers \
*/ \
@@ -184,10 +191,6 @@ extern unsigned long boot_option_idle_override;
*/
extern void start_thread(struct pt_regs *regs, unsigned long pc, unsigned long sp);

-static inline void flush_thread(void)
-{
-}
-
unsigned long __get_wchan(struct task_struct *p);

#define __KSTK_TOS(tsk) ((unsigned long)task_stack_page(tsk) + \
diff --git a/arch/loongarch/include/asm/switch_to.h b/arch/loongarch/include/asm/switch_to.h
index 43a5ab162d38..24e3094bebab 100644
--- a/arch/loongarch/include/asm/switch_to.h
+++ b/arch/loongarch/include/asm/switch_to.h
@@ -34,6 +34,7 @@ extern asmlinkage struct task_struct *__switch_to(struct task_struct *prev,
#define switch_to(prev, next, last) \
do { \
lose_fpu_inatomic(1, prev); \
+ hw_breakpoint_thread_switch(next); \
(last) = __switch_to(prev, next, task_thread_info(next), \
__builtin_return_address(0), __builtin_frame_address(0)); \
} while (0)
diff --git a/arch/loongarch/kernel/Makefile b/arch/loongarch/kernel/Makefile
index 45c78aea63ce..006c892f9be3 100644
--- a/arch/loongarch/kernel/Makefile
+++ b/arch/loongarch/kernel/Makefile
@@ -46,6 +46,7 @@ obj-$(CONFIG_UNWINDER_GUESS) += unwind_guess.o
obj-$(CONFIG_UNWINDER_PROLOGUE) += unwind_prologue.o

obj-$(CONFIG_PERF_EVENTS) += perf_event.o perf_regs.o
+obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o

obj-$(CONFIG_KPROBES) += kprobes.o kprobes_trampoline.o

diff --git a/arch/loongarch/kernel/hw_breakpoint.c b/arch/loongarch/kernel/hw_breakpoint.c
new file mode 100644
index 000000000000..6431cd319c32
--- /dev/null
+++ b/arch/loongarch/kernel/hw_breakpoint.c
@@ -0,0 +1,519 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2022-2023 Loongson Technology Corporation Limited
+ */
+#define pr_fmt(fmt) "hw-breakpoint: " fmt
+
+#include <linux/hw_breakpoint.h>
+#include <linux/kprobes.h>
+#include <linux/perf_event.h>
+
+#include <asm/hw_breakpoint.h>
+
+/* Breakpoint currently in use for each BRP. */
+static DEFINE_PER_CPU(struct perf_event *, bp_on_reg[LOONGARCH_MAX_BRP]);
+
+/* Watchpoint currently in use for each WRP. */
+static DEFINE_PER_CPU(struct perf_event *, wp_on_reg[LOONGARCH_MAX_WRP]);
+
+/* Number of BRP/WRP registers on this CPU. */
+static int core_num_brps;
+static int core_num_wrps;
+
+int hw_breakpoint_slots(int type)
+{
+ /*
+ * We can be called early, so don't rely on
+ * our static variables being initialised.
+ */
+ switch (type) {
+ case TYPE_INST:
+ return get_num_brps();
+ case TYPE_DATA:
+ return get_num_wrps();
+ default:
+ pr_warn("unknown slot type: %d\n", type);
+ return 0;
+ }
+}
+
+#define READ_WB_REG_CASE(OFF, N, REG, T, VAL) \
+ case (OFF + N): \
+ LOONGARCH_CSR_WATCH_READ(N, REG, T, VAL); \
+ break
+
+#define WRITE_WB_REG_CASE(OFF, N, REG, T, VAL) \
+ case (OFF + N): \
+ LOONGARCH_CSR_WATCH_WRITE(N, REG, T, VAL); \
+ break
+
+#define GEN_READ_WB_REG_CASES(OFF, REG, T, VAL) \
+ READ_WB_REG_CASE(OFF, 0, REG, T, VAL); \
+ READ_WB_REG_CASE(OFF, 1, REG, T, VAL); \
+ READ_WB_REG_CASE(OFF, 2, REG, T, VAL); \
+ READ_WB_REG_CASE(OFF, 3, REG, T, VAL); \
+ READ_WB_REG_CASE(OFF, 4, REG, T, VAL); \
+ READ_WB_REG_CASE(OFF, 5, REG, T, VAL); \
+ READ_WB_REG_CASE(OFF, 6, REG, T, VAL); \
+ READ_WB_REG_CASE(OFF, 7, REG, T, VAL);
+
+#define GEN_WRITE_WB_REG_CASES(OFF, REG, T, VAL) \
+ WRITE_WB_REG_CASE(OFF, 0, REG, T, VAL); \
+ WRITE_WB_REG_CASE(OFF, 1, REG, T, VAL); \
+ WRITE_WB_REG_CASE(OFF, 2, REG, T, VAL); \
+ WRITE_WB_REG_CASE(OFF, 3, REG, T, VAL); \
+ WRITE_WB_REG_CASE(OFF, 4, REG, T, VAL); \
+ WRITE_WB_REG_CASE(OFF, 5, REG, T, VAL); \
+ WRITE_WB_REG_CASE(OFF, 6, REG, T, VAL); \
+ WRITE_WB_REG_CASE(OFF, 7, REG, T, VAL);
+
+static u64 read_wb_reg(int reg, int n, int t)
+{
+ u64 val = 0;
+
+ switch (reg + n) {
+ GEN_READ_WB_REG_CASES(CSR_CFG_ADDR, ADDR, t, val);
+ GEN_READ_WB_REG_CASES(CSR_CFG_MASK, MASK, t, val);
+ GEN_READ_WB_REG_CASES(CSR_CFG_CTRL, CTRL, t, val);
+ GEN_READ_WB_REG_CASES(CSR_CFG_ASID, ASID, t, val);
+ default:
+ pr_warn("attempt to read from unknown breakpoint register %d\n", n);
+ }
+
+ return val;
+}
+NOKPROBE_SYMBOL(read_wb_reg);
+
+static void write_wb_reg(int reg, int n, int t, u64 val)
+{
+ switch (reg + n) {
+ GEN_WRITE_WB_REG_CASES(CSR_CFG_ADDR, ADDR, t, val);
+ GEN_WRITE_WB_REG_CASES(CSR_CFG_MASK, MASK, t, val);
+ GEN_WRITE_WB_REG_CASES(CSR_CFG_CTRL, CTRL, t, val);
+ GEN_WRITE_WB_REG_CASES(CSR_CFG_ASID, ASID, t, val);
+ default:
+ pr_warn("attempt to write to unknown breakpoint register %d\n", n);
+ }
+}
+NOKPROBE_SYMBOL(write_wb_reg);
+
+enum hw_breakpoint_ops {
+ HW_BREAKPOINT_INSTALL,
+ HW_BREAKPOINT_UNINSTALL,
+};
+
+/**
+ * hw_breakpoint_slot_setup - Find and setup a perf slot according to
+ * operations
+ *
+ * @slots: pointer to array of slots
+ * @max_slots: max number of slots
+ * @bp: perf_event to setup
+ * @ops: operation to be carried out on the slot
+ *
+ * Return:
+ * slot index on success
+ * -ENOSPC if no slot is available/matches
+ * -EINVAL on wrong operations parameter
+ */
+
+static int hw_breakpoint_slot_setup(struct perf_event **slots, int max_slots,
+ struct perf_event *bp,
+ enum hw_breakpoint_ops ops)
+{
+ int i;
+ struct perf_event **slot;
+
+ for (i = 0; i < max_slots; ++i) {
+ slot = &slots[i];
+ switch (ops) {
+ case HW_BREAKPOINT_INSTALL:
+ if (!*slot) {
+ *slot = bp;
+ return i;
+ }
+ break;
+ case HW_BREAKPOINT_UNINSTALL:
+ if (*slot == bp) {
+ *slot = NULL;
+ return i;
+ }
+ break;
+ default:
+ pr_warn_once("Unhandled hw breakpoint ops %d\n", ops);
+ return -EINVAL;
+ }
+ }
+ return -ENOSPC;
+}
+
+/*
+ * Unregister breakpoints from this task and reset the pointers in
+ * the thread_struct.
+ */
+void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
+{
+}
+
+void ptrace_hw_copy_thread(struct task_struct *tsk)
+{
+ memset(&tsk->thread.hbp_break, 0, sizeof(tsk->thread.hbp_break));
+ memset(&tsk->thread.hbp_watch, 0, sizeof(tsk->thread.hbp_watch));
+}
+
+static int hw_breakpoint_control(struct perf_event *bp,
+ enum hw_breakpoint_ops ops)
+{
+ struct arch_hw_breakpoint *info = counter_arch_bp(bp);
+ struct perf_event **slots;
+ int i, max_slots, enable;
+ u32 ctrl;
+
+ if (info->ctrl.type == LOONGARCH_BREAKPOINT_EXECUTE) {
+ /* Breakpoint */
+ slots = this_cpu_ptr(bp_on_reg);
+ max_slots = core_num_brps;
+ } else {
+ /* Watchpoint */
+ slots = this_cpu_ptr(wp_on_reg);
+ max_slots = core_num_wrps;
+ }
+
+ i = hw_breakpoint_slot_setup(slots, max_slots, bp, ops);
+
+ if (WARN_ONCE(i < 0, "Can't find any breakpoint slot"))
+ return i;
+
+ switch (ops) {
+ case HW_BREAKPOINT_INSTALL:
+ /* Set the FWPnCFG/MWPnCFG 1~4 register. */
+ write_wb_reg(CSR_CFG_ADDR, i, 0, info->address);
+ write_wb_reg(CSR_CFG_ADDR, i, 1, info->address);
+ write_wb_reg(CSR_CFG_MASK, i, 0, info->mask);
+ write_wb_reg(CSR_CFG_MASK, i, 1, info->mask);
+ write_wb_reg(CSR_CFG_ASID, i, 0, 0);
+ write_wb_reg(CSR_CFG_ASID, i, 1, 0);
+ if (info->ctrl.type == LOONGARCH_BREAKPOINT_EXECUTE) {
+ write_wb_reg(CSR_CFG_CTRL, i, 0, CTRL_PLV_ENABLE);
+ } else {
+ ctrl = encode_ctrl_reg(info->ctrl);
+ write_wb_reg(CSR_CFG_CTRL, i, 1, ctrl | CTRL_PLV_ENABLE |
+ 1 << MWPnCFG3_LoadEn | 1 << MWPnCFG3_StoreEn);
+ }
+ enable = csr_read64(LOONGARCH_CSR_CRMD);
+ csr_write64(CSR_CRMD_WE | enable, LOONGARCH_CSR_CRMD);
+ break;
+ case HW_BREAKPOINT_UNINSTALL:
+ /* Reset the FWPnCFG/MWPnCFG 1~4 register. */
+ write_wb_reg(CSR_CFG_ADDR, i, 0, 0);
+ write_wb_reg(CSR_CFG_ADDR, i, 1, 0);
+ write_wb_reg(CSR_CFG_MASK, i, 0, 0);
+ write_wb_reg(CSR_CFG_MASK, i, 1, 0);
+ write_wb_reg(CSR_CFG_CTRL, i, 0, 0);
+ write_wb_reg(CSR_CFG_CTRL, i, 1, 0);
+ write_wb_reg(CSR_CFG_ASID, i, 0, 0);
+ write_wb_reg(CSR_CFG_ASID, i, 1, 0);
+ break;
+ }
+
+ return 0;
+}
+
+/*
+ * Install a perf counter breakpoint.
+ */
+int arch_install_hw_breakpoint(struct perf_event *bp)
+{
+ return hw_breakpoint_control(bp, HW_BREAKPOINT_INSTALL);
+}
+
+void arch_uninstall_hw_breakpoint(struct perf_event *bp)
+{
+ hw_breakpoint_control(bp, HW_BREAKPOINT_UNINSTALL);
+}
+
+static int get_hbp_len(u8 hbp_len)
+{
+ unsigned int len_in_bytes = 0;
+
+ switch (hbp_len) {
+ case LOONGARCH_BREAKPOINT_LEN_1:
+ len_in_bytes = 1;
+ break;
+ case LOONGARCH_BREAKPOINT_LEN_2:
+ len_in_bytes = 2;
+ break;
+ case LOONGARCH_BREAKPOINT_LEN_4:
+ len_in_bytes = 4;
+ break;
+ case LOONGARCH_BREAKPOINT_LEN_8:
+ len_in_bytes = 8;
+ break;
+ }
+
+ return len_in_bytes;
+}
+
+/*
+ * Check whether bp virtual address is in kernel space.
+ */
+int arch_check_bp_in_kernelspace(struct arch_hw_breakpoint *hw)
+{
+ unsigned int len;
+ unsigned long va;
+
+ va = hw->address;
+ len = get_hbp_len(hw->ctrl.len);
+
+ return (va >= TASK_SIZE) && ((va + len - 1) >= TASK_SIZE);
+}
+
+/*
+ * Extract generic type and length encodings from an arch_hw_breakpoint_ctrl.
+ * Hopefully this will disappear when ptrace can bypass the conversion
+ * to generic breakpoint descriptions.
+ */
+int arch_bp_generic_fields(struct arch_hw_breakpoint_ctrl ctrl,
+ int *gen_len, int *gen_type, int *offset)
+{
+ /* Type */
+ switch (ctrl.type) {
+ case LOONGARCH_BREAKPOINT_EXECUTE:
+ *gen_type = HW_BREAKPOINT_X;
+ break;
+ case LOONGARCH_BREAKPOINT_LOAD:
+ *gen_type = HW_BREAKPOINT_R;
+ break;
+ case LOONGARCH_BREAKPOINT_STORE:
+ *gen_type = HW_BREAKPOINT_W;
+ break;
+ case LOONGARCH_BREAKPOINT_LOAD | LOONGARCH_BREAKPOINT_STORE:
+ *gen_type = HW_BREAKPOINT_RW;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ if (!ctrl.len)
+ return -EINVAL;
+ *offset = __ffs(ctrl.len);
+
+ /* Len */
+ switch (ctrl.len) {
+ case LOONGARCH_BREAKPOINT_LEN_1:
+ *gen_len = HW_BREAKPOINT_LEN_1;
+ break;
+ case LOONGARCH_BREAKPOINT_LEN_2:
+ *gen_len = HW_BREAKPOINT_LEN_2;
+ break;
+ case LOONGARCH_BREAKPOINT_LEN_4:
+ *gen_len = HW_BREAKPOINT_LEN_4;
+ break;
+ case LOONGARCH_BREAKPOINT_LEN_8:
+ *gen_len = HW_BREAKPOINT_LEN_8;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+/*
+ * Construct an arch_hw_breakpoint from a perf_event.
+ */
+static int arch_build_bp_info(struct perf_event *bp,
+ const struct perf_event_attr *attr,
+ struct arch_hw_breakpoint *hw)
+{
+ /* Type */
+ switch (attr->bp_type) {
+ case HW_BREAKPOINT_X:
+ hw->ctrl.type = LOONGARCH_BREAKPOINT_EXECUTE;
+ break;
+ case HW_BREAKPOINT_R:
+ hw->ctrl.type = LOONGARCH_BREAKPOINT_LOAD;
+ break;
+ case HW_BREAKPOINT_W:
+ hw->ctrl.type = LOONGARCH_BREAKPOINT_STORE;
+ break;
+ case HW_BREAKPOINT_RW:
+ hw->ctrl.type = LOONGARCH_BREAKPOINT_LOAD | LOONGARCH_BREAKPOINT_STORE;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ /* Len */
+ switch (attr->bp_len) {
+ case HW_BREAKPOINT_LEN_1:
+ hw->ctrl.len = LOONGARCH_BREAKPOINT_LEN_1;
+ break;
+ case HW_BREAKPOINT_LEN_2:
+ hw->ctrl.len = LOONGARCH_BREAKPOINT_LEN_2;
+ break;
+ case HW_BREAKPOINT_LEN_4:
+ hw->ctrl.len = LOONGARCH_BREAKPOINT_LEN_4;
+ break;
+ case HW_BREAKPOINT_LEN_8:
+ hw->ctrl.len = LOONGARCH_BREAKPOINT_LEN_8;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ /* Address */
+ hw->address = attr->bp_addr;
+
+ return 0;
+}
+
+/*
+ * Validate the arch-specific HW Breakpoint register settings.
+ */
+int hw_breakpoint_arch_parse(struct perf_event *bp,
+ const struct perf_event_attr *attr,
+ struct arch_hw_breakpoint *hw)
+{
+ int ret;
+ u64 alignment_mask = 0, offset;
+
+ /* Build the arch_hw_breakpoint. */
+ ret = arch_build_bp_info(bp, attr, hw);
+ if (ret)
+ return ret;
+
+ if (hw->ctrl.type != LOONGARCH_BREAKPOINT_EXECUTE)
+ alignment_mask = 0x7;
+ offset = hw->address & alignment_mask;
+
+ hw->address &= ~alignment_mask;
+ hw->ctrl.len <<= offset;
+ return 0;
+}
+
+static void update_bp_registers(struct pt_regs *regs, int enable, int type)
+{
+ int i, max_slots;
+ u32 ctrl;
+ struct perf_event **slots;
+ struct arch_hw_breakpoint *info;
+
+ if (type == 0) {
+ slots = this_cpu_ptr(bp_on_reg);
+ max_slots = core_num_brps;
+ } else if (type == 1) {
+ slots = this_cpu_ptr(wp_on_reg);
+ max_slots = core_num_wrps;
+ } else {
+ return;
+ }
+
+ for (i = 0; i < max_slots; ++i) {
+ if (!slots[i])
+ continue;
+
+ info = counter_arch_bp(slots[i]);
+ if (enable) {
+ if ((info->ctrl.type == LOONGARCH_BREAKPOINT_EXECUTE) && (type == 0)) {
+ write_wb_reg(CSR_CFG_CTRL, i, 0, CTRL_PLV_ENABLE);
+ write_wb_reg(CSR_CFG_CTRL, i, 0, CTRL_PLV_ENABLE);
+ } else {
+ ctrl = read_wb_reg(CSR_CFG_CTRL, i, 1);
+ if (info->ctrl.type == LOONGARCH_BREAKPOINT_LOAD)
+ ctrl |= 0x1 << MWPnCFG3_LoadEn;
+ if (info->ctrl.type == LOONGARCH_BREAKPOINT_STORE)
+ ctrl |= 0x1 << MWPnCFG3_StoreEn;
+ write_wb_reg(CSR_CFG_CTRL, i, 1, ctrl);
+ }
+ regs->csr_prmd |= CSR_PRMD_PWE;
+ } else {
+ if ((info->ctrl.type == LOONGARCH_BREAKPOINT_EXECUTE) && (type == 0)) {
+ write_wb_reg(CSR_CFG_CTRL, i, 0, 0);
+ } else {
+ ctrl = read_wb_reg(CSR_CFG_CTRL, i, 1);
+ if (info->ctrl.type == LOONGARCH_BREAKPOINT_LOAD)
+ ctrl &= ~(0x1 << MWPnCFG3_LoadEn);
+ if (info->ctrl.type == LOONGARCH_BREAKPOINT_STORE)
+ ctrl &= ~(0x1 << MWPnCFG3_StoreEn);
+ write_wb_reg(CSR_CFG_CTRL, i, 1, ctrl);
+ }
+ regs->csr_prmd &= ~CSR_PRMD_PWE;
+ }
+ }
+}
+NOKPROBE_SYMBOL(update_bp_registers);
+
+/*
+ * Debug exception handlers.
+ */
+void breakpoint_handler(struct pt_regs *regs)
+{
+ int i;
+ struct perf_event *bp, **slots;
+
+ slots = this_cpu_ptr(bp_on_reg);
+
+ for (i = 0; i < core_num_brps; ++i) {
+ bp = slots[i];
+ if (bp == NULL)
+ continue;
+ perf_bp_event(bp, regs);
+ }
+ update_bp_registers(regs, 0, 0);
+}
+NOKPROBE_SYMBOL(breakpoint_handler);
+
+void watchpoint_handler(struct pt_regs *regs)
+{
+ int i;
+ struct perf_event *wp, **slots;
+
+ slots = this_cpu_ptr(wp_on_reg);
+
+ for (i = 0; i < core_num_wrps; ++i) {
+ wp = slots[i];
+ if (wp == NULL)
+ continue;
+ perf_bp_event(wp, regs);
+ }
+ update_bp_registers(regs, 0, 1);
+}
+NOKPROBE_SYMBOL(watchpoint_handler);
+
+/*
+ * One-time initialisation.
+ */
+static int __init arch_hw_breakpoint_init(void)
+{
+ core_num_brps = get_num_brps();
+ core_num_wrps = get_num_wrps();
+
+ pr_info("found %d breakpoint and %d watchpoint registers.\n",
+ core_num_brps, core_num_wrps);
+ return 0;
+}
+arch_initcall(arch_hw_breakpoint_init);
+
+void hw_breakpoint_thread_switch(struct task_struct *next)
+{
+ struct pt_regs *regs = task_pt_regs(next);
+
+ /* Update breakpoints */
+ update_bp_registers(regs, 1, 0);
+ /* Update watchpoints */
+ update_bp_registers(regs, 1, 1);
+}
+
+void hw_breakpoint_pmu_read(struct perf_event *bp)
+{
+}
+
+/*
+ * Dummy function to register with die_notifier.
+ */
+int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
+ unsigned long val, void *data)
+{
+ return NOTIFY_DONE;
+}
diff --git a/arch/loongarch/kernel/process.c b/arch/loongarch/kernel/process.c
index 9073fab1a487..6b53781c0b59 100644
--- a/arch/loongarch/kernel/process.c
+++ b/arch/loongarch/kernel/process.c
@@ -37,6 +37,7 @@
#include <asm/cpu.h>
#include <asm/elf.h>
#include <asm/fpu.h>
+#include <linux/hw_breakpoint.h>
#include <asm/io.h>
#include <asm/irq.h>
#include <asm/irq_regs.h>
@@ -100,6 +101,11 @@ void exit_thread(struct task_struct *tsk)
{
}

+void flush_thread(void)
+{
+ flush_ptrace_hw_breakpoint(current);
+}
+
int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
{
/*
@@ -186,6 +192,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
if (clone_flags & CLONE_SETTLS)
childregs->regs[2] = tls;

+ ptrace_hw_copy_thread(p);
out:
clear_tsk_thread_flag(p, TIF_USEDFPU);
clear_tsk_thread_flag(p, TIF_USEDSIMD);
diff --git a/arch/loongarch/kernel/traps.c b/arch/loongarch/kernel/traps.c
index 4aa3901c5623..2b133079e0f3 100644
--- a/arch/loongarch/kernel/traps.c
+++ b/arch/loongarch/kernel/traps.c
@@ -508,7 +508,15 @@ asmlinkage void noinstr do_bp(struct pt_regs *regs)

asmlinkage void noinstr do_watch(struct pt_regs *regs)
{
- pr_warn("Hardware watch point handler not implemented!\n");
+#ifdef CONFIG_HAVE_HW_BREAKPOINT
+ irqentry_state_t state = irqentry_enter(regs);
+
+ breakpoint_handler(regs);
+ watchpoint_handler(regs);
+ force_sig(SIGTRAP);
+
+ irqentry_exit(regs, state);
+#endif
}

asmlinkage void noinstr do_ri(struct pt_regs *regs)
--
2.36.0
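
The disable path of update_bp_registers() clears individual enable bits in the watchpoint control word. A minimal sketch of that idiom follows. The helper names and the bit position are hypothetical (the real MWPnCFG3_LoadEn/MWPnCFG3_StoreEn values are defined elsewhere in the series), and the arithmetic is modeled with unsigned 32-bit values. It illustrates that `~0x1 << n` parses as `(~0x1) << n`, so a single-bit clear needs the mask parenthesized before it is inverted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit position, for illustration only. */
#define EX_LOAD_EN_BIT 8

/* Clear exactly one bit: build the single-bit mask first, then invert it. */
static inline uint32_t clear_bit32(uint32_t value, unsigned int bit)
{
	assert(bit < 32);
	return value & ~(UINT32_C(1) << bit);
}

/*
 * Models "value &= ~0x1 << bit": the inversion happens before the shift,
 * so the mask clears the requested bit and every bit below it.
 */
static inline uint32_t clear_bit32_unparenthesized(uint32_t value,
						   unsigned int bit)
{
	assert(bit < 32);
	return value & (uint32_t)(~UINT32_C(1) << bit);
}
```

With a control word of 0x3FF, clear_bit32(0x3FF, EX_LOAD_EN_BIT) leaves the low eight bits intact, while the unparenthesized form also wipes them.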


2023-02-15 09:24:22

by Qing Zhang

Subject: [PATCH v3 3/3] LoongArch: ptrace: expose hardware breakpoints to debuggers

Implement a regset-based ptrace interface that exposes hardware
breakpoints to user-space debuggers, allowing them to query and set
instruction and data breakpoints.

Signed-off-by: Qing Zhang <[email protected]>
---
arch/loongarch/include/uapi/asm/ptrace.h | 9 +
arch/loongarch/kernel/ptrace.c | 416 +++++++++++++++++++++++
include/uapi/linux/elf.h | 2 +
3 files changed, 427 insertions(+)

diff --git a/arch/loongarch/include/uapi/asm/ptrace.h b/arch/loongarch/include/uapi/asm/ptrace.h
index 46eb40932bb1..7ac4a0e44570 100644
--- a/arch/loongarch/include/uapi/asm/ptrace.h
+++ b/arch/loongarch/include/uapi/asm/ptrace.h
@@ -56,6 +56,15 @@ struct user_lasx_state {
uint64_t vregs[32*4];
};

+struct user_watch_state {
+ uint16_t dbg_info;
+ struct {
+ uint64_t addr;
+ uint64_t mask;
+ uint32_t ctrl;
+ } dbg_regs[8];
+};
+
#define PTRACE_SYSEMU 0x1f
#define PTRACE_SYSEMU_SINGLESTEP 0x20

diff --git a/arch/loongarch/kernel/ptrace.c b/arch/loongarch/kernel/ptrace.c
index 52a3ee4366f4..7b0ec3b29212 100644
--- a/arch/loongarch/kernel/ptrace.c
+++ b/arch/loongarch/kernel/ptrace.c
@@ -22,6 +22,7 @@
#include <linux/errno.h>
#include <linux/hw_breakpoint.h>
#include <linux/mm.h>
+#include <linux/nospec.h>
#include <linux/ptrace.h>
#include <linux/regset.h>
#include <linux/sched.h>
@@ -333,6 +334,399 @@ static int simd_set(struct task_struct *target,

#endif /* CONFIG_CPU_HAS_LSX */

+#ifdef CONFIG_HAVE_HW_BREAKPOINT
+
+/*
+ * Handle hitting a HW-breakpoint.
+ */
+static void ptrace_hbptriggered(struct perf_event *bp,
+ struct perf_sample_data *data,
+ struct pt_regs *regs)
+{
+ int i;
+ struct arch_hw_breakpoint *bkpt = counter_arch_bp(bp);
+
+ for (i = 0; i < LOONGARCH_MAX_BRP; ++i)
+ if (current->thread.hbp_break[i] == bp)
+ break;
+
+ for (i = 0; i < LOONGARCH_MAX_WRP; ++i)
+ if (current->thread.hbp_watch[i] == bp)
+ break;
+
+ force_sig_ptrace_errno_trap(i, (void __user *)bkpt->address);
+}
+
+static struct perf_event *ptrace_hbp_get_event(unsigned int note_type,
+ struct task_struct *tsk,
+ unsigned long idx)
+{
+ struct perf_event *bp = ERR_PTR(-EINVAL);
+
+ switch (note_type) {
+ case NT_LOONGARCH_HW_BREAK:
+ if (idx >= LOONGARCH_MAX_BRP)
+ goto out;
+ idx = array_index_nospec(idx, LOONGARCH_MAX_BRP);
+ bp = tsk->thread.hbp_break[idx];
+ break;
+ case NT_LOONGARCH_HW_WATCH:
+ if (idx >= LOONGARCH_MAX_WRP)
+ goto out;
+ idx = array_index_nospec(idx, LOONGARCH_MAX_WRP);
+ bp = tsk->thread.hbp_watch[idx];
+ break;
+ }
+
+out:
+ return bp;
+}
+
+static int ptrace_hbp_set_event(unsigned int note_type,
+ struct task_struct *tsk,
+ unsigned long idx,
+ struct perf_event *bp)
+{
+ int err = -EINVAL;
+
+ switch (note_type) {
+ case NT_LOONGARCH_HW_BREAK:
+ if (idx >= LOONGARCH_MAX_BRP)
+ goto out;
+ idx = array_index_nospec(idx, LOONGARCH_MAX_BRP);
+ tsk->thread.hbp_break[idx] = bp;
+ err = 0;
+ break;
+ case NT_LOONGARCH_HW_WATCH:
+ if (idx >= LOONGARCH_MAX_WRP)
+ goto out;
+ idx = array_index_nospec(idx, LOONGARCH_MAX_WRP);
+ tsk->thread.hbp_watch[idx] = bp;
+ err = 0;
+ break;
+ }
+
+out:
+ return err;
+}
+
+static struct perf_event *ptrace_hbp_create(unsigned int note_type,
+ struct task_struct *tsk,
+ unsigned long idx)
+{
+ struct perf_event *bp;
+ struct perf_event_attr attr;
+ int err, type;
+
+ switch (note_type) {
+ case NT_LOONGARCH_HW_BREAK:
+ type = HW_BREAKPOINT_X;
+ break;
+ case NT_LOONGARCH_HW_WATCH:
+ type = HW_BREAKPOINT_RW;
+ break;
+ default:
+ return ERR_PTR(-EINVAL);
+ }
+
+ ptrace_breakpoint_init(&attr);
+
+ /*
+ * Initialise fields to sane defaults
+ * (i.e. values that will pass validation).
+ */
+ attr.bp_addr = 0;
+ attr.bp_len = HW_BREAKPOINT_LEN_4;
+ attr.bp_type = type;
+ attr.disabled = 1;
+
+ bp = register_user_hw_breakpoint(&attr, ptrace_hbptriggered, NULL, tsk);
+ if (IS_ERR(bp))
+ return bp;
+
+ err = ptrace_hbp_set_event(note_type, tsk, idx, bp);
+ if (err)
+ return ERR_PTR(err);
+
+ return bp;
+}
+
+static int ptrace_hbp_fill_attr_ctrl(unsigned int note_type,
+ struct arch_hw_breakpoint_ctrl ctrl,
+ struct perf_event_attr *attr)
+{
+ int err, len, type, offset;
+
+ err = arch_bp_generic_fields(ctrl, &len, &type, &offset);
+ if (err)
+ return err;
+
+ switch (note_type) {
+ case NT_LOONGARCH_HW_BREAK:
+ if ((type & HW_BREAKPOINT_X) != type)
+ return -EINVAL;
+ break;
+ case NT_LOONGARCH_HW_WATCH:
+ if ((type & HW_BREAKPOINT_RW) != type)
+ return -EINVAL;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ attr->bp_len = len;
+ attr->bp_type = type;
+ attr->bp_addr += offset;
+
+ return 0;
+}
+
+static int ptrace_hbp_get_resource_info(unsigned int note_type, u16 *info)
+{
+ u8 num;
+ u16 reg = 0;
+
+ switch (note_type) {
+ case NT_LOONGARCH_HW_BREAK:
+ num = hw_breakpoint_slots(TYPE_INST);
+ break;
+ case NT_LOONGARCH_HW_WATCH:
+ num = hw_breakpoint_slots(TYPE_DATA);
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ reg |= num;
+
+ *info = reg;
+ return 0;
+}
+
+static int ptrace_hbp_get_ctrl(unsigned int note_type,
+ struct task_struct *tsk,
+ unsigned long idx,
+ u32 *ctrl)
+{
+ struct perf_event *bp = ptrace_hbp_get_event(note_type, tsk, idx);
+
+ if (IS_ERR(bp))
+ return PTR_ERR(bp);
+
+ *ctrl = bp ? encode_ctrl_reg(counter_arch_bp(bp)->ctrl) : 0;
+ return 0;
+}
+
+static int ptrace_hbp_get_mask(unsigned int note_type,
+ struct task_struct *tsk,
+ unsigned long idx,
+ u64 *mask)
+{
+ struct perf_event *bp = ptrace_hbp_get_event(note_type, tsk, idx);
+
+ if (IS_ERR(bp))
+ return PTR_ERR(bp);
+
+ *mask = bp ? counter_arch_bp(bp)->mask : 0;
+ return 0;
+}
+
+static int ptrace_hbp_get_addr(unsigned int note_type,
+ struct task_struct *tsk,
+ unsigned long idx,
+ u64 *addr)
+{
+ struct perf_event *bp = ptrace_hbp_get_event(note_type, tsk, idx);
+
+ if (IS_ERR(bp))
+ return PTR_ERR(bp);
+
+ *addr = bp ? counter_arch_bp(bp)->address : 0;
+ return 0;
+}
+
+static struct perf_event *ptrace_hbp_get_initialised_bp(unsigned int note_type,
+ struct task_struct *tsk,
+ unsigned long idx)
+{
+ struct perf_event *bp = ptrace_hbp_get_event(note_type, tsk, idx);
+
+ if (!bp)
+ bp = ptrace_hbp_create(note_type, tsk, idx);
+
+ return bp;
+}
+
+static int ptrace_hbp_set_ctrl(unsigned int note_type,
+ struct task_struct *tsk,
+ unsigned long idx,
+ u32 uctrl)
+{
+ int err;
+ struct perf_event *bp;
+ struct perf_event_attr attr;
+ struct arch_hw_breakpoint_ctrl ctrl;
+
+ bp = ptrace_hbp_get_initialised_bp(note_type, tsk, idx);
+ if (IS_ERR(bp)) {
+ err = PTR_ERR(bp);
+ return err;
+ }
+
+ attr = bp->attr;
+ decode_ctrl_reg(uctrl, &ctrl);
+ err = ptrace_hbp_fill_attr_ctrl(note_type, ctrl, &attr);
+ if (err)
+ return err;
+
+ return modify_user_hw_breakpoint(bp, &attr);
+}
+
+static int ptrace_hbp_set_mask(unsigned int note_type,
+ struct task_struct *tsk,
+ unsigned long idx,
+ u64 mask)
+{
+ int err;
+ struct perf_event *bp;
+ struct perf_event_attr attr;
+ struct arch_hw_breakpoint *info;
+
+ bp = ptrace_hbp_get_initialised_bp(note_type, tsk, idx);
+ if (IS_ERR(bp)) {
+ err = PTR_ERR(bp);
+ return err;
+ }
+
+ attr = bp->attr;
+ info = counter_arch_bp(bp);
+ info->mask = mask;
+ err = modify_user_hw_breakpoint(bp, &attr);
+ return err;
+}
+
+static int ptrace_hbp_set_addr(unsigned int note_type,
+ struct task_struct *tsk,
+ unsigned long idx,
+ u64 addr)
+{
+ int err;
+ struct perf_event *bp;
+ struct perf_event_attr attr;
+
+ bp = ptrace_hbp_get_initialised_bp(note_type, tsk, idx);
+ if (IS_ERR(bp)) {
+ err = PTR_ERR(bp);
+ return err;
+ }
+
+ attr = bp->attr;
+ attr.bp_addr = addr;
+ err = modify_user_hw_breakpoint(bp, &attr);
+ return err;
+}
+
+#define PTRACE_HBP_ADDR_SZ sizeof(u64)
+#define PTRACE_HBP_MASK_SZ sizeof(u64)
+#define PTRACE_HBP_CTRL_SZ sizeof(u32)
+
+static int hw_break_get(struct task_struct *target,
+ const struct user_regset *regset,
+ struct membuf to)
+{
+ unsigned int note_type = regset->core_note_type;
+ int ret, idx = 0;
+ u16 info;
+ u32 ctrl;
+ u64 addr, mask;
+
+ /* Resource info */
+ ret = ptrace_hbp_get_resource_info(note_type, &info);
+ if (ret)
+ return ret;
+
+ membuf_write(&to, &info, sizeof(info));
+ /* (address, mask, ctrl) registers */
+ while (to.left) {
+ ret = ptrace_hbp_get_addr(note_type, target, idx, &addr);
+ if (ret)
+ return ret;
+
+ ret = ptrace_hbp_get_mask(note_type, target, idx, &mask);
+ if (ret)
+ return ret;
+
+ ret = ptrace_hbp_get_ctrl(note_type, target, idx, &ctrl);
+ if (ret)
+ return ret;
+
+ membuf_store(&to, addr);
+ membuf_store(&to, mask);
+ membuf_store(&to, ctrl);
+ idx++;
+ }
+ return 0;
+}
+
+static int hw_break_set(struct task_struct *target,
+ const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ const void *kbuf, const void __user *ubuf)
+{
+ unsigned int note_type = regset->core_note_type;
+ int ret, idx = 0, offset, limit;
+ u32 ctrl;
+ u64 addr, mask;
+
+ /* Resource info */
+ offset = offsetof(struct user_watch_state, dbg_regs);
+ user_regset_copyin_ignore(&pos, &count, &kbuf, &ubuf, 0, offset);
+
+ /* (address, mask, ctrl) registers */
+ limit = regset->n * regset->size;
+ while (count && offset < limit) {
+ if (count < PTRACE_HBP_ADDR_SZ)
+ return -EINVAL;
+
+ ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &addr,
+ offset, offset + PTRACE_HBP_ADDR_SZ);
+ if (ret)
+ return ret;
+
+ ret = ptrace_hbp_set_addr(note_type, target, idx, addr);
+ if (ret)
+ return ret;
+ offset += PTRACE_HBP_ADDR_SZ;
+
+ if (!count)
+ break;
+
+ ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &mask,
+ offset, offset + PTRACE_HBP_MASK_SZ);
+ if (ret)
+ return ret;
+
+ ret = ptrace_hbp_set_mask(note_type, target, idx, mask);
+ if (ret)
+ return ret;
+ offset += PTRACE_HBP_MASK_SZ;
+
+ ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &ctrl,
+ offset, offset + PTRACE_HBP_CTRL_SZ);
+ if (ret)
+ return ret;
+
+ ret = ptrace_hbp_set_ctrl(note_type, target, idx, ctrl);
+ if (ret)
+ return ret;
+ offset += PTRACE_HBP_CTRL_SZ;
+ idx++;
+ }
+
+ return 0;
+}
+#endif
+
struct pt_regs_offset {
const char *name;
int offset;
@@ -412,6 +806,10 @@ enum loongarch_regset {
#ifdef CONFIG_CPU_HAS_LASX
REGSET_LASX,
#endif
+#ifdef CONFIG_HAVE_HW_BREAKPOINT
+ REGSET_HW_BREAK,
+ REGSET_HW_WATCH,
+#endif
};

static const struct user_regset loongarch64_regsets[] = {
@@ -459,6 +857,24 @@ static const struct user_regset loongarch64_regsets[] = {
.set = simd_set,
},
#endif
+#ifdef CONFIG_HAVE_HW_BREAKPOINT
+ [REGSET_HW_BREAK] = {
+ .core_note_type = NT_LOONGARCH_HW_BREAK,
+ .n = sizeof(struct user_watch_state) / sizeof(u32),
+ .size = sizeof(u32),
+ .align = sizeof(u32),
+ .regset_get = hw_break_get,
+ .set = hw_break_set,
+ },
+ [REGSET_HW_WATCH] = {
+ .core_note_type = NT_LOONGARCH_HW_WATCH,
+ .n = sizeof(struct user_watch_state) / sizeof(u32),
+ .size = sizeof(u32),
+ .align = sizeof(u32),
+ .regset_get = hw_break_get,
+ .set = hw_break_set,
+ },
+#endif
};

static const struct user_regset_view user_loongarch64_view = {
diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
index 4c6a8fa5e7ed..3cf66946a0bb 100644
--- a/include/uapi/linux/elf.h
+++ b/include/uapi/linux/elf.h
@@ -444,6 +444,8 @@ typedef struct elf64_shdr {
#define NT_LOONGARCH_LSX 0xa02 /* LoongArch Loongson SIMD Extension registers */
#define NT_LOONGARCH_LASX 0xa03 /* LoongArch Loongson Advanced SIMD Extension registers */
#define NT_LOONGARCH_LBT 0xa04 /* LoongArch Loongson Binary Translation registers */
+#define NT_LOONGARCH_HW_BREAK 0xa05 /* LoongArch hardware breakpoint registers */
+#define NT_LOONGARCH_HW_WATCH 0xa06 /* LoongArch hardware watchpoint registers */

/* Note types with note name "GNU" */
#define NT_GNU_PROPERTY_TYPE_0 5
--
2.36.0
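
A debugger that talks to this regset needs to know the shape of the buffer. The sketch below copies struct user_watch_state from the patch and computes a few layout numbers. The helper names are hypothetical, and the figures in the note afterwards assume an LP64 ABI in which uint64_t is 8-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copied from the uapi header added by the patch. */
struct user_watch_state {
	uint16_t dbg_info;
	struct {
		uint64_t addr;
		uint64_t mask;
		uint32_t ctrl;
	} dbg_regs[8];
};

/* Byte offset of the first (addr, mask, ctrl) entry within the struct. */
static inline size_t watch_regs_offset(void)
{
	return offsetof(struct user_watch_state, dbg_regs);
}

/* Size of one entry as the compiler lays it out, padding included. */
static inline size_t watch_entry_size(void)
{
	return sizeof(((struct user_watch_state *)0)->dbg_regs[0]);
}

/* Sum of the three field sizes, i.e. one entry without padding. */
static inline size_t watch_entry_payload(void)
{
	return sizeof(uint64_t) + sizeof(uint64_t) + sizeof(uint32_t);
}

/* Number of 32-bit words, mirroring the regset's ".n" expression. */
static inline size_t watch_regset_words(void)
{
	return sizeof(struct user_watch_state) / sizeof(uint32_t);
}
```

Under the stated ABI assumption the entries start at byte 8 and each occupies 24 bytes, while hw_break_set() advances by PTRACE_HBP_ADDR_SZ + PTRACE_HBP_MASK_SZ + PTRACE_HBP_CTRL_SZ = 20 bytes per entry, so a debugger author may want to check which stride the kernel actually expects before relying on the struct directly.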


2023-02-16 02:07:55

by Jinyang He

Subject: Re: [PATCH v3 2/3] LoongArch: Add ptrace single step support


On 2023-02-15 17:23, Qing Zhang wrote:
> Use the generic ptrace_resume code for PTRACE_SYSCALL, PTRACE_CONT,
> PTRACE_KILL and PTRACE_SINGLESTEP. This implies defining
> arch_has_single_step in and implementing the
> user_enable_single_step and user_disable_single_step functions.
>
> LoongArch has no hardware single-step register. The hardware single-step
> function multiplexes the fetch instruction watchpoint (FWPS) and specifies that
> the next instruction must trigger the watch exception by setting the mask bit.
>
> Signed-off-by: Qing Zhang <[email protected]>
> ---
> arch/loongarch/include/asm/processor.h | 3 ++
> arch/loongarch/include/asm/ptrace.h | 2 +
> arch/loongarch/kernel/hw_breakpoint.c | 35 +++++++++++--
> arch/loongarch/kernel/ptrace.c | 68 ++++++++++++++++++++++++++
> arch/loongarch/kernel/traps.c | 20 ++++++--
> 5 files changed, 120 insertions(+), 8 deletions(-)
>
> diff --git a/arch/loongarch/include/asm/processor.h b/arch/loongarch/include/asm/processor.h
> index db060c5a976f..3ea0f1910c23 100644
> --- a/arch/loongarch/include/asm/processor.h
> +++ b/arch/loongarch/include/asm/processor.h
> @@ -131,6 +131,9 @@ struct thread_struct {
> struct perf_event *hbp_break[LOONGARCH_MAX_BRP];
> struct perf_event *hbp_watch[LOONGARCH_MAX_WRP];
>
> + /* Used by ptrace single_step */
> + unsigned long single_step;
> +
> /*
> * FPU & vector registers, must be at last because
> * they are conditionally copied at fork().
> diff --git a/arch/loongarch/include/asm/ptrace.h b/arch/loongarch/include/asm/ptrace.h
> index 58596c4f8a0f..66a0e6c480a3 100644
> --- a/arch/loongarch/include/asm/ptrace.h
> +++ b/arch/loongarch/include/asm/ptrace.h
> @@ -150,4 +150,6 @@ static inline void user_stack_pointer_set(struct pt_regs *regs,
> regs->regs[3] = val;
> }
>
> +#define arch_has_single_step() (1)
> +
> #endif /* _ASM_PTRACE_H */
> diff --git a/arch/loongarch/kernel/hw_breakpoint.c b/arch/loongarch/kernel/hw_breakpoint.c
> index 6431cd319c32..94967b887d92 100644
> --- a/arch/loongarch/kernel/hw_breakpoint.c
> +++ b/arch/loongarch/kernel/hw_breakpoint.c
> @@ -153,6 +153,22 @@ static int hw_breakpoint_slot_setup(struct perf_event **slots, int max_slots,
> */
> void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
> {
> + int i;
> + struct thread_struct *t = &tsk->thread;
> +
> + for (i = 0; i < LOONGARCH_MAX_BRP; i++) {
> + if (t->hbp_break[i]) {
> + unregister_hw_breakpoint(t->hbp_break[i]);
> + t->hbp_break[i] = NULL;
> + }
> + }
> +
> + for (i = 0; i < LOONGARCH_MAX_WRP; i++) {
> + if (t->hbp_watch[i]) {
> + unregister_hw_breakpoint(t->hbp_watch[i]);
> + t->hbp_watch[i] = NULL;
> + }
> + }
> }
>
> void ptrace_hw_copy_thread(struct task_struct *tsk)
> @@ -498,11 +514,20 @@ arch_initcall(arch_hw_breakpoint_init);
> void hw_breakpoint_thread_switch(struct task_struct *next)
> {
> struct pt_regs *regs = task_pt_regs(next);
> -
> - /* Update breakpoints */
> - update_bp_registers(regs, 1, 0);
> - /* Update watchpoints */
> - update_bp_registers(regs, 1, 1);
> + u64 addr, mask;
> +
> + if (test_bit(TIF_SINGLESTEP, &task_thread_info(next)->flags)) {
> + addr = read_wb_reg(CSR_CFG_ADDR, 0, 0);
> + mask = read_wb_reg(CSR_CFG_MASK, 0, 0);
> + if ((task_pt_regs(next)->csr_era & ~mask) == (addr & ~mask))
> + csr_write32(0x10000, LOONGARCH_CSR_FWPS);
> + regs->csr_prmd |= CSR_PRMD_PWE;
> + } else {
> + /* Update breakpoints */
> + update_bp_registers(regs, 1, 0);
> + /* Update watchpoints */
> + update_bp_registers(regs, 1, 1);
> + }
> }
>
> void hw_breakpoint_pmu_read(struct perf_event *bp)
> diff --git a/arch/loongarch/kernel/ptrace.c b/arch/loongarch/kernel/ptrace.c
> index bee4194177fd..52a3ee4366f4 100644
> --- a/arch/loongarch/kernel/ptrace.c
> +++ b/arch/loongarch/kernel/ptrace.c
> @@ -20,6 +20,7 @@
> #include <linux/context_tracking.h>
> #include <linux/elf.h>
> #include <linux/errno.h>
> +#include <linux/hw_breakpoint.h>
> #include <linux/mm.h>
> #include <linux/ptrace.h>
> #include <linux/regset.h>
> @@ -30,6 +31,7 @@
> #include <linux/stddef.h>
> #include <linux/seccomp.h>
> #include <linux/uaccess.h>
> +#include <linux/thread_info.h>
>
> #include <asm/byteorder.h>
> #include <asm/cpu.h>
> @@ -39,6 +41,7 @@
> #include <asm/page.h>
> #include <asm/pgtable.h>
> #include <asm/processor.h>
> +#include <asm/ptrace.h>
> #include <asm/reg.h>
> #include <asm/syscall.h>
>
> @@ -541,3 +544,68 @@ long arch_ptrace(struct task_struct *child, long request,
>
> return ret;
> }
> +
> +void ptrace_triggered(struct perf_event *bp,
> + struct perf_sample_data *data, struct pt_regs *regs)
> +{
> + struct perf_event_attr attr;
> +
> + attr = bp->attr;
> + attr.disabled = true;
> + modify_user_hw_breakpoint(bp, &attr);
> +}
> +
> +static int set_single_step(struct task_struct *tsk, unsigned long addr)
> +{
> + struct thread_struct *thread = &tsk->thread;
> + struct perf_event *bp;
> + struct perf_event_attr attr;
> + struct arch_hw_breakpoint *info;
> +
> + bp = thread->hbp_break[0];
> + if (!bp) {
> + ptrace_breakpoint_init(&attr);
> +
> + attr.bp_addr = addr;
> + attr.bp_len = HW_BREAKPOINT_LEN_8;
> + attr.bp_type = HW_BREAKPOINT_X;
> +
> + bp = register_user_hw_breakpoint(&attr, ptrace_triggered,
> + NULL, tsk);
> + if (IS_ERR(bp))
> + return PTR_ERR(bp);
> +
> + thread->hbp_break[0] = bp;
> + } else {
> + int err;
> +
> + attr = bp->attr;
> + attr.bp_addr = addr;
> + /* reenable breakpoint */
> + attr.disabled = false;
> + err = modify_user_hw_breakpoint(bp, &attr);
> + if (unlikely(err))
> + return err;
> +
> + csr_write64(attr.bp_addr, LOONGARCH_CSR_IB0ADDR);
> + }
> + info = counter_arch_bp(bp);
> + info->mask = 0xffffffffffff;
> +
> + return 0;
> +}
> +
> +/* ptrace API */
> +void user_enable_single_step(struct task_struct *task)
> +{
> + struct thread_info *ti = task_thread_info(task);
> +
> + set_single_step(task, task_pt_regs(task)->csr_era);
> + task->thread.single_step = task_pt_regs(task)->csr_era;
> + set_ti_thread_flag(ti, TIF_SINGLESTEP);
> +}
> +
> +void user_disable_single_step(struct task_struct *task)
> +{
> + clear_tsk_thread_flag(task, TIF_SINGLESTEP);
> +}
> diff --git a/arch/loongarch/kernel/traps.c b/arch/loongarch/kernel/traps.c
> index 2b133079e0f3..56d7d076153c 100644
> --- a/arch/loongarch/kernel/traps.c
> +++ b/arch/loongarch/kernel/traps.c
> @@ -511,9 +511,23 @@ asmlinkage void noinstr do_watch(struct pt_regs *regs)
> #ifdef CONFIG_HAVE_HW_BREAKPOINT
> irqentry_state_t state = irqentry_enter(regs);
>
> - breakpoint_handler(regs);
> - watchpoint_handler(regs);
> - force_sig(SIGTRAP);
> + if (test_tsk_thread_flag(current, TIF_SINGLESTEP)) {
> + int llbit = (csr_read32(LOONGARCH_CSR_LLBCTL) & 0x1);
> + unsigned long pc = regs->csr_era;
> +
> + if (llbit) {

Hi, Qing,


One thing should be noted here: when an LL-SC sequence is encountered, it
is regarded as a single instruction. So do not clear the llbit, and set
CSR.FWPS.Skip again, until the LL-SC sequence has finished executing.

> + csr_write32(0x10000, LOONGARCH_CSR_FWPS);
> + csr_write32(0x4, LOONGARCH_CSR_LLBCTL);
> + } else if (pc == current->thread.single_step) {
Note here as well: the 3A5000 has a strange hardware issue where certain
insns, such as fld.d/fst.d, are occasionally not skipped when
CSR.FWPS.Skip is set. Singlestep therefore needs to compare csr_era with
the single_step value that was set last time, since in most cases they
should not be equal.


As for this condition, some unusual insns may cause bugs. For example,
"b 0", or "jr rd" where rd equals its own PC, means the single step can
never stop. These insns are so unusual that we did not consider them in
OW. However, I think we should handle this case upstream for robustness.

It would also be better to use macros rather than bare numbers when
invoking csr_write*.


Thanks,

Jinyang

> + csr_write32(0x10000, LOONGARCH_CSR_FWPS);
> + } else {
> + force_sig(SIGTRAP);
> + }
> + } else {
> + breakpoint_handler(regs);
> + watchpoint_handler(regs);
> + force_sig(SIGTRAP);
> + }
>
> irqentry_exit(regs, state);
> #endif
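
The three-way decision discussed in this review can be modeled as a pure function, which makes the self-branch corner case easy to see. This is only a sketch: the enum and function names are hypothetical, and the real handler writes CSR.FWPS and CSR.LLBCTL or calls force_sig() rather than returning a value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome names for the single-step branch of do_watch(). */
enum step_action {
	STEP_RETRY_SKIP,	/* set the skip bit again and resume the task */
	STEP_DELIVER_TRAP	/* report SIGTRAP to the tracer */
};

/*
 * llbit:        an LL-SC sequence is still in flight
 * pc:           exception return address (csr_era) at this trap
 * last_step_pc: address recorded when single-stepping was enabled
 */
static inline enum step_action single_step_action(bool llbit, uint64_t pc,
						  uint64_t last_step_pc)
{
	if (llbit)
		return STEP_RETRY_SKIP;	/* treat LL-SC as one instruction */
	if (pc == last_step_pc)
		return STEP_RETRY_SKIP;	/* the insn was not actually skipped */
	return STEP_DELIVER_TRAP;
}
```

An instruction that branches to itself keeps pc equal to last_step_pc after it has really executed, so this model returns STEP_RETRY_SKIP indefinitely, which is the robustness concern raised above.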


2023-02-16 03:13:50

by Youling Tang

Subject: Re: [PATCH v3 2/3] LoongArch: Add ptrace single step support



On 02/16/2023 10:07 AM, Jinyang He wrote:
>
> On 2023-02-15 17:23, Qing Zhang wrote:
>> Use the generic ptrace_resume code for PTRACE_SYSCALL, PTRACE_CONT,
>> PTRACE_KILL and PTRACE_SINGLESTEP. This implies defining
>> arch_has_single_step in and implementing the
>> user_enable_single_step and user_disable_single_step functions.
>>
>> LoongArch has no hardware single-step register. The hardware single-step
>> function multiplexes the fetch instruction watchpoint (FWPS) and specifies
>> that the next instruction must trigger the watch exception by setting the
>> mask bit.
>>
>> Signed-off-by: Qing Zhang <[email protected]>
>> ---
>> arch/loongarch/include/asm/processor.h | 3 ++
>> arch/loongarch/include/asm/ptrace.h | 2 +
>> arch/loongarch/kernel/hw_breakpoint.c | 35 +++++++++++--
>> arch/loongarch/kernel/ptrace.c | 68 ++++++++++++++++++++++++++
>> arch/loongarch/kernel/traps.c | 20 ++++++--
>> 5 files changed, 120 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/loongarch/include/asm/processor.h
>> b/arch/loongarch/include/asm/processor.h
>> index db060c5a976f..3ea0f1910c23 100644
>> --- a/arch/loongarch/include/asm/processor.h
>> +++ b/arch/loongarch/include/asm/processor.h
>> @@ -131,6 +131,9 @@ struct thread_struct {
>> struct perf_event *hbp_break[LOONGARCH_MAX_BRP];
>> struct perf_event *hbp_watch[LOONGARCH_MAX_WRP];
>> + /* Used by ptrace single_step */
>> + unsigned long single_step;
>> +
>> /*
>> * FPU & vector registers, must be at last because
>> * they are conditionally copied at fork().
>> diff --git a/arch/loongarch/include/asm/ptrace.h
>> b/arch/loongarch/include/asm/ptrace.h
>> index 58596c4f8a0f..66a0e6c480a3 100644
>> --- a/arch/loongarch/include/asm/ptrace.h
>> +++ b/arch/loongarch/include/asm/ptrace.h
>> @@ -150,4 +150,6 @@ static inline void user_stack_pointer_set(struct
>> pt_regs *regs,
>> regs->regs[3] = val;
>> }
>> +#define arch_has_single_step() (1)
>> +
>> #endif /* _ASM_PTRACE_H */
>> diff --git a/arch/loongarch/kernel/hw_breakpoint.c
>> b/arch/loongarch/kernel/hw_breakpoint.c
>> index 6431cd319c32..94967b887d92 100644
>> --- a/arch/loongarch/kernel/hw_breakpoint.c
>> +++ b/arch/loongarch/kernel/hw_breakpoint.c
>> @@ -153,6 +153,22 @@ static int hw_breakpoint_slot_setup(struct
>> perf_event **slots, int max_slots,
>> */
>> void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
>> {
>> + int i;
>> + struct thread_struct *t = &tsk->thread;
>> +
>> + for (i = 0; i < LOONGARCH_MAX_BRP; i++) {
>> + if (t->hbp_break[i]) {
>> + unregister_hw_breakpoint(t->hbp_break[i]);
>> + t->hbp_break[i] = NULL;
>> + }
>> + }
>> +
>> + for (i = 0; i < LOONGARCH_MAX_WRP; i++) {
>> + if (t->hbp_watch[i]) {
>> + unregister_hw_breakpoint(t->hbp_watch[i]);
>> + t->hbp_watch[i] = NULL;
>> + }
>> + }
>> }
>> void ptrace_hw_copy_thread(struct task_struct *tsk)
>> @@ -498,11 +514,20 @@ arch_initcall(arch_hw_breakpoint_init);
>> void hw_breakpoint_thread_switch(struct task_struct *next)
>> {
>> struct pt_regs *regs = task_pt_regs(next);
>> -
>> - /* Update breakpoints */
>> - update_bp_registers(regs, 1, 0);
>> - /* Update watchpoints */
>> - update_bp_registers(regs, 1, 1);
>> + u64 addr, mask;
>> +
>> + if (test_bit(TIF_SINGLESTEP, &task_thread_info(next)->flags)) {
>> + addr = read_wb_reg(CSR_CFG_ADDR, 0, 0);
>> + mask = read_wb_reg(CSR_CFG_MASK, 0, 0);
>> + if ((task_pt_regs(next)->csr_era & ~mask) == (addr & ~mask))
>> + csr_write32(0x10000, LOONGARCH_CSR_FWPS);
>> + regs->csr_prmd |= CSR_PRMD_PWE;
>> + } else {
>> + /* Update breakpoints */
>> + update_bp_registers(regs, 1, 0);
>> + /* Update watchpoints */
>> + update_bp_registers(regs, 1, 1);
>> + }
>> }
>> void hw_breakpoint_pmu_read(struct perf_event *bp)
>> diff --git a/arch/loongarch/kernel/ptrace.c
>> b/arch/loongarch/kernel/ptrace.c
>> index bee4194177fd..52a3ee4366f4 100644
>> --- a/arch/loongarch/kernel/ptrace.c
>> +++ b/arch/loongarch/kernel/ptrace.c
>> @@ -20,6 +20,7 @@
>> #include <linux/context_tracking.h>
>> #include <linux/elf.h>
>> #include <linux/errno.h>
>> +#include <linux/hw_breakpoint.h>
>> #include <linux/mm.h>
>> #include <linux/ptrace.h>
>> #include <linux/regset.h>
>> @@ -30,6 +31,7 @@
>> #include <linux/stddef.h>
>> #include <linux/seccomp.h>
>> #include <linux/uaccess.h>
>> +#include <linux/thread_info.h>
>> #include <asm/byteorder.h>
>> #include <asm/cpu.h>
>> @@ -39,6 +41,7 @@
>> #include <asm/page.h>
>> #include <asm/pgtable.h>
>> #include <asm/processor.h>
>> +#include <asm/ptrace.h>
>> #include <asm/reg.h>
>> #include <asm/syscall.h>
>> @@ -541,3 +544,68 @@ long arch_ptrace(struct task_struct *child,
>> long request,
>> return ret;
>> }
>> +
>> +void ptrace_triggered(struct perf_event *bp,
>> + struct perf_sample_data *data, struct pt_regs *regs)
>> +{
>> + struct perf_event_attr attr;
>> +
>> + attr = bp->attr;
>> + attr.disabled = true;
>> + modify_user_hw_breakpoint(bp, &attr);
>> +}
>> +
>> +static int set_single_step(struct task_struct *tsk, unsigned long addr)
>> +{
>> + struct thread_struct *thread = &tsk->thread;
>> + struct perf_event *bp;
>> + struct perf_event_attr attr;
>> + struct arch_hw_breakpoint *info;
>> +
>> + bp = thread->hbp_break[0];
>> + if (!bp) {
>> + ptrace_breakpoint_init(&attr);
>> +
>> + attr.bp_addr = addr;
>> + attr.bp_len = HW_BREAKPOINT_LEN_8;
>> + attr.bp_type = HW_BREAKPOINT_X;
>> +
>> + bp = register_user_hw_breakpoint(&attr, ptrace_triggered,
>> + NULL, tsk);
>> + if (IS_ERR(bp))
>> + return PTR_ERR(bp);
>> +
>> + thread->hbp_break[0] = bp;
>> + } else {
>> + int err;
>> +
>> + attr = bp->attr;
>> + attr.bp_addr = addr;
>> + /* reenable breakpoint */
>> + attr.disabled = false;
>> + err = modify_user_hw_breakpoint(bp, &attr);
>> + if (unlikely(err))
>> + return err;
>> +
>> + csr_write64(attr.bp_addr, LOONGARCH_CSR_IB0ADDR);
>> + }
>> + info = counter_arch_bp(bp);
>> + info->mask = 0xffffffffffff;
>> +
>> + return 0;
>> +}
>> +
>> +/* ptrace API */
>> +void user_enable_single_step(struct task_struct *task)
>> +{
>> + struct thread_info *ti = task_thread_info(task);
>> +
>> + set_single_step(task, task_pt_regs(task)->csr_era);
>> + task->thread.single_step = task_pt_regs(task)->csr_era;
>> + set_ti_thread_flag(ti, TIF_SINGLESTEP);
>> +}
>> +
>> +void user_disable_single_step(struct task_struct *task)
>> +{
>> + clear_tsk_thread_flag(task, TIF_SINGLESTEP);
>> +}
>> diff --git a/arch/loongarch/kernel/traps.c
>> b/arch/loongarch/kernel/traps.c
>> index 2b133079e0f3..56d7d076153c 100644
>> --- a/arch/loongarch/kernel/traps.c
>> +++ b/arch/loongarch/kernel/traps.c
>> @@ -511,9 +511,23 @@ asmlinkage void noinstr do_watch(struct pt_regs
>> *regs)
>> #ifdef CONFIG_HAVE_HW_BREAKPOINT
>> irqentry_state_t state = irqentry_enter(regs);
>> - breakpoint_handler(regs);
>> - watchpoint_handler(regs);
>> - force_sig(SIGTRAP);
>> + if (test_tsk_thread_flag(current, TIF_SINGLESTEP)) {
>> + int llbit = (csr_read32(LOONGARCH_CSR_LLBCTL) & 0x1);
>> + unsigned long pc = regs->csr_era;
>> +
>> + if (llbit) {
>
> Hi, Qing,
>
>
> It should be noted here. When the ll-sc combo is encountered, it is
> regarded as a single instruction. So do not clear llbit and reset
> CSR.FWPS.Skip until the ll-sc execution is completed.
>
>> + csr_write32(0x10000, LOONGARCH_CSR_FWPS);
>> + csr_write32(0x4, LOONGARCH_CSR_LLBCTL);
>> + } else if (pc == current->thread.single_step) {
> Note here as well. The 3A5000 has a strange hardware issue: certain
> insns, such as fld.d/fst.d, are occasionally not skipped when
> CSR.FWPS.Skip is set. Singlestep therefore needs to compare csr_era
> with the single_step value that was set last time, as in most cases
> they should not be equal.

BTW, I would prefer to separate this special handling from this patch
(for example, add two more patches to this series for the special
handling of LL-SC and of instructions such as FLD.D/FST.D), and to add
corresponding test cases that describe the phenomenon and the reason.
That would help everyone understand the code.
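
As a rough idea of what such a test could start from in user space, here
is a minimal sketch. It only uses the standard ptrace(2) requests; the
helper name count_single_steps() and the step budget are made up for
illustration, and it does not yet target LL-SC or FLD.D/FST.D
specifically.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: single-step a forked child for at most max_steps
 * resumes. Returns the number of SIGTRAP stops seen, or -1 on error.
 */
long count_single_steps(long max_steps)
{
	int status;
	long i, traps = 0;
	pid_t pid;

	assert(max_steps >= 0);

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: request tracing, then stop until the parent resumes us. */
		if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
			_exit(1);
		raise(SIGSTOP);
		_exit(0);
	}

	/* Parent: the first stop must be the child's SIGSTOP. */
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	if (!WIFSTOPPED(status))
		return -1;	/* child already gone, e.g. PTRACE_TRACEME failed */

	for (i = 0; i < max_steps; i++) {
		if (ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL) < 0)
			break;	/* e.g. no single-step support on this kernel */
		if (waitpid(pid, &status, 0) < 0)
			return -1;
		if (WIFEXITED(status) || WIFSIGNALED(status))
			return traps;	/* child finished and was reaped */
		if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP)
			traps++;
	}

	/* Budget exhausted or ptrace failed: kill and reap the child. */
	kill(pid, SIGKILL);
	waitpid(pid, &status, 0);
	return traps;
}
```

A real test case would replace the child's body with the instruction
sequence under test and check where each SIGTRAP stop lands.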

>
>
> And for this condition expression, some potentially strange insns may
> cause bugs. For example, "b 0" or "jr rd" where rd is equal to its PC
> will make it impossible to stop the singlestep. These insns are so
> strange that we did not consider them in OW. However, I think we should
> consider this case for robustness in upstream.
>

I don't know if there will be instructions like "b 0" or "jr rd (rd =
pc)" in the executable file after linking?

Youling.

> It is better to use macros to avoid direct numbers when csr_write* is invoked.
>
>
> Thanks,
>
> Jinyang
>
>> + csr_write32(0x10000, LOONGARCH_CSR_FWPS);
>> + } else {
>> + force_sig(SIGTRAP);
>> + }
>> + } else {
>> + breakpoint_handler(regs);
>> + watchpoint_handler(regs);
>> + force_sig(SIGTRAP);
>> + }
>> irqentry_exit(regs, state);
>> #endif
>


2023-02-16 06:44:42

by Jinyang He

Subject: Re: [PATCH v3 2/3] LoongArch: Add ptrace single step support


On 2023-02-16 11:13, Youling Tang wrote:
>
>
> On 02/16/2023 10:07 AM, Jinyang He wrote:
>>
>> On 2023-02-15 17:23, Qing Zhang wrote:
>>> Use the generic ptrace_resume code for PTRACE_SYSCALL, PTRACE_CONT,
>>> PTRACE_KILL and PTRACE_SINGLESTEP. This implies defining
>>> arch_has_single_step in <asm/ptrace.h> and implementing the
>>> user_enable_single_step and user_disable_single_step functions.
>>>
>>> LoongArch has no hardware single-step register. The hardware
>>> single-step function multiplexes the instruction fetch watchpoint
>>> (FWPS) and specifies that the next instruction must trigger the watch
>>> exception by setting the mask bit.
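
For readers following along: the "mask bit" trick relies on a masked
address comparison, the same one the hw_breakpoint_thread_switch() hunk
further down performs. A minimal userspace sketch, assuming that a set
mask bit means "ignore this address bit" (as the `& ~mask` in the patch
suggests); masked_match() and SINGLE_STEP_MASK are made-up names:

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask value the quoted set_single_step() stores in info->mask. */
#define SINGLE_STEP_MASK 0xffffffffffffULL

/*
 * Hypothetical helper mirroring the comparison in the quoted
 * hw_breakpoint_thread_switch(): bits set in mask are ignored.
 */
bool masked_match(uint64_t pc, uint64_t addr, uint64_t mask)
{
	return (pc & ~mask) == (addr & ~mask);
}
```

With SINGLE_STEP_MASK, addresses that differ only in their low 48 bits
still match, so whatever instruction is fetched next raises the watch
exception; with a zero mask only the exact address matches.
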
>>>
>>> Signed-off-by: Qing Zhang <[email protected]>
>>> ---
>>>   arch/loongarch/include/asm/processor.h |  3 ++
>>>   arch/loongarch/include/asm/ptrace.h    |  2 +
>>>   arch/loongarch/kernel/hw_breakpoint.c  | 35 +++++++++++--
>>>   arch/loongarch/kernel/ptrace.c         | 68
>>> ++++++++++++++++++++++++++
>>>   arch/loongarch/kernel/traps.c          | 20 ++++++--
>>>   5 files changed, 120 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/arch/loongarch/include/asm/processor.h
>>> b/arch/loongarch/include/asm/processor.h
>>> index db060c5a976f..3ea0f1910c23 100644
>>> --- a/arch/loongarch/include/asm/processor.h
>>> +++ b/arch/loongarch/include/asm/processor.h
>>> @@ -131,6 +131,9 @@ struct thread_struct {
>>>       struct perf_event    *hbp_break[LOONGARCH_MAX_BRP];
>>>       struct perf_event    *hbp_watch[LOONGARCH_MAX_WRP];
>>>   +    /* Used by ptrace single_step */
>>> +    unsigned long single_step;
>>> +
>>>       /*
>>>        * FPU & vector registers, must be at last because
>>>        * they are conditionally copied at fork().
>>> diff --git a/arch/loongarch/include/asm/ptrace.h
>>> b/arch/loongarch/include/asm/ptrace.h
>>> index 58596c4f8a0f..66a0e6c480a3 100644
>>> --- a/arch/loongarch/include/asm/ptrace.h
>>> +++ b/arch/loongarch/include/asm/ptrace.h
>>> @@ -150,4 +150,6 @@ static inline void user_stack_pointer_set(struct
>>> pt_regs *regs,
>>>       regs->regs[3] = val;
>>>   }
>>>   +#define arch_has_single_step()        (1)
>>> +
>>>   #endif /* _ASM_PTRACE_H */
>>> diff --git a/arch/loongarch/kernel/hw_breakpoint.c
>>> b/arch/loongarch/kernel/hw_breakpoint.c
>>> index 6431cd319c32..94967b887d92 100644
>>> --- a/arch/loongarch/kernel/hw_breakpoint.c
>>> +++ b/arch/loongarch/kernel/hw_breakpoint.c
>>> @@ -153,6 +153,22 @@ static int hw_breakpoint_slot_setup(struct
>>> perf_event **slots, int max_slots,
>>>    */
>>>   void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
>>>   {
>>> +    int i;
>>> +    struct thread_struct *t = &tsk->thread;
>>> +
>>> +    for (i = 0; i < LOONGARCH_MAX_BRP; i++) {
>>> +        if (t->hbp_break[i]) {
>>> +            unregister_hw_breakpoint(t->hbp_break[i]);
>>> +            t->hbp_break[i] = NULL;
>>> +        }
>>> +    }
>>> +
>>> +    for (i = 0; i < LOONGARCH_MAX_WRP; i++) {
>>> +        if (t->hbp_watch[i]) {
>>> +            unregister_hw_breakpoint(t->hbp_watch[i]);
>>> +            t->hbp_watch[i] = NULL;
>>> +        }
>>> +    }
>>>   }
>>>     void ptrace_hw_copy_thread(struct task_struct *tsk)
>>> @@ -498,11 +514,20 @@ arch_initcall(arch_hw_breakpoint_init);
>>>   void hw_breakpoint_thread_switch(struct task_struct *next)
>>>   {
>>>       struct pt_regs *regs = task_pt_regs(next);
>>> -
>>> -    /* Update breakpoints */
>>> -    update_bp_registers(regs, 1, 0);
>>> -    /* Update watchpoints */
>>> -    update_bp_registers(regs, 1, 1);
>>> +    u64 addr, mask;
>>> +
>>> +    if (test_bit(TIF_SINGLESTEP, &task_thread_info(next)->flags)) {
>>> +        addr = read_wb_reg(CSR_CFG_ADDR, 0, 0);
>>> +        mask = read_wb_reg(CSR_CFG_MASK, 0, 0);
>>> +        if ((task_pt_regs(next)->csr_era & ~mask) == (addr & ~mask))
>>> +            csr_write32(0x10000, LOONGARCH_CSR_FWPS);
>>> +        regs->csr_prmd |= CSR_PRMD_PWE;
>>> +    } else {
>>> +        /* Update breakpoints */
>>> +        update_bp_registers(regs, 1, 0);
>>> +        /* Update watchpoints */
>>> +        update_bp_registers(regs, 1, 1);
>>> +    }
>>>   }
>>>     void hw_breakpoint_pmu_read(struct perf_event *bp)
>>> diff --git a/arch/loongarch/kernel/ptrace.c
>>> b/arch/loongarch/kernel/ptrace.c
>>> index bee4194177fd..52a3ee4366f4 100644
>>> --- a/arch/loongarch/kernel/ptrace.c
>>> +++ b/arch/loongarch/kernel/ptrace.c
>>> @@ -20,6 +20,7 @@
>>>   #include <linux/context_tracking.h>
>>>   #include <linux/elf.h>
>>>   #include <linux/errno.h>
>>> +#include <linux/hw_breakpoint.h>
>>>   #include <linux/mm.h>
>>>   #include <linux/ptrace.h>
>>>   #include <linux/regset.h>
>>> @@ -30,6 +31,7 @@
>>>   #include <linux/stddef.h>
>>>   #include <linux/seccomp.h>
>>>   #include <linux/uaccess.h>
>>> +#include <linux/thread_info.h>
>>>     #include <asm/byteorder.h>
>>>   #include <asm/cpu.h>
>>> @@ -39,6 +41,7 @@
>>>   #include <asm/page.h>
>>>   #include <asm/pgtable.h>
>>>   #include <asm/processor.h>
>>> +#include <asm/ptrace.h>
>>>   #include <asm/reg.h>
>>>   #include <asm/syscall.h>
>>>   @@ -541,3 +544,68 @@ long arch_ptrace(struct task_struct *child,
>>> long request,
>>>         return ret;
>>>   }
>>> +
>>> +void ptrace_triggered(struct perf_event *bp,
>>> +              struct perf_sample_data *data, struct pt_regs *regs)
>>> +{
>>> +    struct perf_event_attr attr;
>>> +
>>> +    attr = bp->attr;
>>> +    attr.disabled = true;
>>> +    modify_user_hw_breakpoint(bp, &attr);
>>> +}
>>> +
>>> +static int set_single_step(struct task_struct *tsk, unsigned long
>>> addr)
>>> +{
>>> +    struct thread_struct *thread = &tsk->thread;
>>> +    struct perf_event *bp;
>>> +    struct perf_event_attr attr;
>>> +    struct arch_hw_breakpoint *info;
>>> +
>>> +    bp = thread->hbp_break[0];
>>> +    if (!bp) {
>>> +        ptrace_breakpoint_init(&attr);
>>> +
>>> +        attr.bp_addr = addr;
>>> +        attr.bp_len = HW_BREAKPOINT_LEN_8;
>>> +        attr.bp_type = HW_BREAKPOINT_X;
>>> +
>>> +        bp = register_user_hw_breakpoint(&attr, ptrace_triggered,
>>> +                         NULL, tsk);
>>> +        if (IS_ERR(bp))
>>> +            return PTR_ERR(bp);
>>> +
>>> +        thread->hbp_break[0] = bp;
>>> +    } else {
>>> +        int err;
>>> +
>>> +        attr = bp->attr;
>>> +        attr.bp_addr = addr;
>>> +        /* reenable breakpoint */
>>> +        attr.disabled = false;
>>> +        err = modify_user_hw_breakpoint(bp, &attr);
>>> +        if (unlikely(err))
>>> +            return err;
>>> +
>>> +        csr_write64(attr.bp_addr, LOONGARCH_CSR_IB0ADDR);
>>> +    }
>>> +    info = counter_arch_bp(bp);
>>> +    info->mask = 0xffffffffffff;
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +/* ptrace API */
>>> +void user_enable_single_step(struct task_struct *task)
>>> +{
>>> +    struct thread_info *ti = task_thread_info(task);
>>> +
>>> +    set_single_step(task, task_pt_regs(task)->csr_era);
>>> +    task->thread.single_step = task_pt_regs(task)->csr_era;
>>> +    set_ti_thread_flag(ti, TIF_SINGLESTEP);
>>> +}
>>> +
>>> +void user_disable_single_step(struct task_struct *task)
>>> +{
>>> +    clear_tsk_thread_flag(task, TIF_SINGLESTEP);
>>> +}
>>> diff --git a/arch/loongarch/kernel/traps.c
>>> b/arch/loongarch/kernel/traps.c
>>> index 2b133079e0f3..56d7d076153c 100644
>>> --- a/arch/loongarch/kernel/traps.c
>>> +++ b/arch/loongarch/kernel/traps.c
>>> @@ -511,9 +511,23 @@ asmlinkage void noinstr do_watch(struct pt_regs
>>> *regs)
>>>   #ifdef CONFIG_HAVE_HW_BREAKPOINT
>>>       irqentry_state_t state = irqentry_enter(regs);
>>>   -    breakpoint_handler(regs);
>>> -    watchpoint_handler(regs);
>>> -    force_sig(SIGTRAP);
>>> +    if (test_tsk_thread_flag(current, TIF_SINGLESTEP)) {
>>> +        int llbit = (csr_read32(LOONGARCH_CSR_LLBCTL) & 0x1);
>>> +        unsigned long pc = regs->csr_era;
>>> +
>>> +        if (llbit) {
>>
>> Hi, Qing,
>>
>>
>> It should be noted here. When the ll-sc combo is encountered, it is
>> regarded as a single instruction. So do not clear llbit and reset
>> CSR.FWPS.Skip until the ll-sc execution is completed.
>>
>>> +            csr_write32(0x10000, LOONGARCH_CSR_FWPS);
>>> +            csr_write32(0x4, LOONGARCH_CSR_LLBCTL);
>>> +        } else if (pc == current->thread.single_step) {
>> Note here as well. The 3A5000 has a strange hardware issue: certain
>> insns, such as fld.d/fst.d, are occasionally not skipped when
>> CSR.FWPS.Skip is set. Singlestep therefore needs to compare csr_era
>> with the single_step value that was set last time, as in most cases
>> they should not be equal.
>
> BTW, I would prefer to separate this special handling from this patch
> (for example, add two more patches to this series for the special
> handling of LL-SC and of instructions such as FLD.D/FST.D), and to add
> corresponding test cases that describe the phenomenon and the reason.
> That would help everyone understand the code.

Agreed.


>
>>
>>
>> And for this condition expression, some potentially strange insns may
>> cause bugs. For example, "b 0" or "jr rd" where rd is equal to its PC
>> will make it impossible to stop the singlestep. These insns are so
>> strange that we did not consider them in OW. However, I think we should
>> consider this case for robustness in upstream.
>>
>
> I don't know if there will be instructions like "b 0" or "jr rd (rd =
> pc)" in the executable file after linking?

That is why I called them strange, but it could be fatal if it did happen. :-)
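
To make the failure mode concrete, here is a small userspace model of
the TIF_SINGLESTEP branch in the quoted do_watch() hunk. It is only a
sketch: the names are made up, and it assumes ideal hardware that always
honours Skip, so the only way to see era equal to the armed PC is a
branch to itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum step_action { STEP_SKIP_AGAIN, STEP_SEND_SIGTRAP };

/* Userspace model of the TIF_SINGLESTEP branch in the quoted do_watch(). */
enum step_action single_step_decide(bool llbit, uint64_t era, uint64_t armed_pc)
{
	if (llbit)
		return STEP_SKIP_AGAIN;		/* inside an LL-SC sequence */
	if (era == armed_pc)
		return STEP_SKIP_AGAIN;		/* assumed "insn was not skipped" */
	return STEP_SEND_SIGTRAP;
}

/*
 * Model one step over a branch at pc that jumps to target. Returns the
 * number of watch exceptions taken before SIGTRAP would be sent, or -1
 * if none is sent within limit exceptions.
 */
int traps_until_sigtrap(uint64_t pc, uint64_t target, int limit)
{
	uint64_t cur = pc;
	int traps;

	assert(limit > 0);
	for (traps = 1; traps <= limit; traps++) {
		/* The insn at cur runs; the next fetch raises the exception. */
		cur = (cur == pc) ? target : cur + 4;
		if (single_step_decide(false, cur, pc) == STEP_SEND_SIGTRAP)
			return traps;
	}
	return -1;
}
```

In this model an ordinary branch (target != pc) gets its SIGTRAP on the
first exception, while "b 0" (target == pc) never does, however large
the limit.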


>
> Youling.
>
>> It is better to use macros to avoid direct numbers when csr_write* is
>> invoked.
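
For instance, something along these lines. The macro names are only
placeholders made up for this sketch, and the bit positions are simply
read off the literals in the quoted hunk (0x1, 0x4 and 0x10000):

```c
#include <stdint.h>

/* Placeholder names; values taken from the literals in the quoted patch. */
#define LLBCTL_ROLLB	(UINT32_C(1) << 0)	/* 0x1: llbit state read in do_watch() */
#define LLBCTL_KLO	(UINT32_C(1) << 2)	/* 0x4: value written to CSR.LLBCTL */
#define FWPS_SKIP	(UINT32_C(1) << 16)	/* 0x10000: CSR.FWPS.Skip */

/* Compile-time check that the macros match the magic numbers they replace. */
_Static_assert(LLBCTL_ROLLB == 0x1, "ROLLB mismatch");
_Static_assert(LLBCTL_KLO == 0x4, "KLO mismatch");
_Static_assert(FWPS_SKIP == 0x10000, "Skip mismatch");

/* Returns nonzero if the llbit is set in a raw CSR.LLBCTL value. */
int llbit_is_set(uint32_t llbctl)
{
	return (llbctl & LLBCTL_ROLLB) != 0;
}
```

The csr_write32() calls would then take FWPS_SKIP and LLBCTL_KLO (or
whatever names the header ends up with) instead of the raw numbers.
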
>>
>>
>> Thanks,
>>
>> Jinyang
>>
>>> +            csr_write32(0x10000, LOONGARCH_CSR_FWPS);
>>> +        } else {
>>> +            force_sig(SIGTRAP);
>>> +        }
>>> +    } else {
>>> +        breakpoint_handler(regs);
>>> +        watchpoint_handler(regs);
>>> +        force_sig(SIGTRAP);
>>> +    }
>>>         irqentry_exit(regs, state);
>>>   #endif
>>
>


2023-02-16 06:57:58

by WANG Xuerui

Subject: Re: [PATCH v3 2/3] LoongArch: Add ptrace single step support

On 2023/2/16 11:13, Youling Tang wrote:
>
>
> On 02/16/2023 10:07 AM, Jinyang He wrote:
>>
>> On 2023-02-15 17:23, Qing Zhang wrote:
>>> Use the generic ptrace_resume code for PTRACE_SYSCALL, PTRACE_CONT,
>>> PTRACE_KILL and PTRACE_SINGLESTEP. This implies defining
>>> arch_has_single_step in <asm/ptrace.h> and implementing the
>>> user_enable_single_step and user_disable_single_step functions.
>>>
>>> LoongArch has no hardware single-step register. The hardware
>>> single-step function multiplexes the instruction fetch watchpoint
>>> (FWPS) and specifies that the next instruction must trigger the watch
>>> exception by setting the mask bit.
>>>
>>> Signed-off-by: Qing Zhang <[email protected]>
>>> ---
>>>   arch/loongarch/include/asm/processor.h |  3 ++
>>>   arch/loongarch/include/asm/ptrace.h    |  2 +
>>>   arch/loongarch/kernel/hw_breakpoint.c  | 35 +++++++++++--
>>>   arch/loongarch/kernel/ptrace.c         | 68 ++++++++++++++++++++++++++
>>>   arch/loongarch/kernel/traps.c          | 20 ++++++--
>>>   5 files changed, 120 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/arch/loongarch/include/asm/processor.h
>>> b/arch/loongarch/include/asm/processor.h
>>> index db060c5a976f..3ea0f1910c23 100644
>>> --- a/arch/loongarch/include/asm/processor.h
>>> +++ b/arch/loongarch/include/asm/processor.h
>>> @@ -131,6 +131,9 @@ struct thread_struct {
>>>       struct perf_event    *hbp_break[LOONGARCH_MAX_BRP];
>>>       struct perf_event    *hbp_watch[LOONGARCH_MAX_WRP];
>>>   +    /* Used by ptrace single_step */
>>> +    unsigned long single_step;
>>> +
>>>       /*
>>>        * FPU & vector registers, must be at last because
>>>        * they are conditionally copied at fork().
>>> diff --git a/arch/loongarch/include/asm/ptrace.h
>>> b/arch/loongarch/include/asm/ptrace.h
>>> index 58596c4f8a0f..66a0e6c480a3 100644
>>> --- a/arch/loongarch/include/asm/ptrace.h
>>> +++ b/arch/loongarch/include/asm/ptrace.h
>>> @@ -150,4 +150,6 @@ static inline void user_stack_pointer_set(struct
>>> pt_regs *regs,
>>>       regs->regs[3] = val;
>>>   }
>>>   +#define arch_has_single_step()        (1)
>>> +
>>>   #endif /* _ASM_PTRACE_H */
>>> diff --git a/arch/loongarch/kernel/hw_breakpoint.c
>>> b/arch/loongarch/kernel/hw_breakpoint.c
>>> index 6431cd319c32..94967b887d92 100644
>>> --- a/arch/loongarch/kernel/hw_breakpoint.c
>>> +++ b/arch/loongarch/kernel/hw_breakpoint.c
>>> @@ -153,6 +153,22 @@ static int hw_breakpoint_slot_setup(struct
>>> perf_event **slots, int max_slots,
>>>    */
>>>   void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
>>>   {
>>> +    int i;
>>> +    struct thread_struct *t = &tsk->thread;
>>> +
>>> +    for (i = 0; i < LOONGARCH_MAX_BRP; i++) {
>>> +        if (t->hbp_break[i]) {
>>> +            unregister_hw_breakpoint(t->hbp_break[i]);
>>> +            t->hbp_break[i] = NULL;
>>> +        }
>>> +    }
>>> +
>>> +    for (i = 0; i < LOONGARCH_MAX_WRP; i++) {
>>> +        if (t->hbp_watch[i]) {
>>> +            unregister_hw_breakpoint(t->hbp_watch[i]);
>>> +            t->hbp_watch[i] = NULL;
>>> +        }
>>> +    }
>>>   }
>>>     void ptrace_hw_copy_thread(struct task_struct *tsk)
>>> @@ -498,11 +514,20 @@ arch_initcall(arch_hw_breakpoint_init);
>>>   void hw_breakpoint_thread_switch(struct task_struct *next)
>>>   {
>>>       struct pt_regs *regs = task_pt_regs(next);
>>> -
>>> -    /* Update breakpoints */
>>> -    update_bp_registers(regs, 1, 0);
>>> -    /* Update watchpoints */
>>> -    update_bp_registers(regs, 1, 1);
>>> +    u64 addr, mask;
>>> +
>>> +    if (test_bit(TIF_SINGLESTEP, &task_thread_info(next)->flags)) {
>>> +        addr = read_wb_reg(CSR_CFG_ADDR, 0, 0);
>>> +        mask = read_wb_reg(CSR_CFG_MASK, 0, 0);
>>> +        if ((task_pt_regs(next)->csr_era & ~mask) == (addr & ~mask))
>>> +            csr_write32(0x10000, LOONGARCH_CSR_FWPS);
>>> +        regs->csr_prmd |= CSR_PRMD_PWE;
>>> +    } else {
>>> +        /* Update breakpoints */
>>> +        update_bp_registers(regs, 1, 0);
>>> +        /* Update watchpoints */
>>> +        update_bp_registers(regs, 1, 1);
>>> +    }
>>>   }
>>>     void hw_breakpoint_pmu_read(struct perf_event *bp)
>>> diff --git a/arch/loongarch/kernel/ptrace.c
>>> b/arch/loongarch/kernel/ptrace.c
>>> index bee4194177fd..52a3ee4366f4 100644
>>> --- a/arch/loongarch/kernel/ptrace.c
>>> +++ b/arch/loongarch/kernel/ptrace.c
>>> @@ -20,6 +20,7 @@
>>>   #include <linux/context_tracking.h>
>>>   #include <linux/elf.h>
>>>   #include <linux/errno.h>
>>> +#include <linux/hw_breakpoint.h>
>>>   #include <linux/mm.h>
>>>   #include <linux/ptrace.h>
>>>   #include <linux/regset.h>
>>> @@ -30,6 +31,7 @@
>>>   #include <linux/stddef.h>
>>>   #include <linux/seccomp.h>
>>>   #include <linux/uaccess.h>
>>> +#include <linux/thread_info.h>
>>>     #include <asm/byteorder.h>
>>>   #include <asm/cpu.h>
>>> @@ -39,6 +41,7 @@
>>>   #include <asm/page.h>
>>>   #include <asm/pgtable.h>
>>>   #include <asm/processor.h>
>>> +#include <asm/ptrace.h>
>>>   #include <asm/reg.h>
>>>   #include <asm/syscall.h>
>>>   @@ -541,3 +544,68 @@ long arch_ptrace(struct task_struct *child,
>>> long request,
>>>         return ret;
>>>   }
>>> +
>>> +void ptrace_triggered(struct perf_event *bp,
>>> +              struct perf_sample_data *data, struct pt_regs *regs)
>>> +{
>>> +    struct perf_event_attr attr;
>>> +
>>> +    attr = bp->attr;
>>> +    attr.disabled = true;
>>> +    modify_user_hw_breakpoint(bp, &attr);
>>> +}
>>> +
>>> +static int set_single_step(struct task_struct *tsk, unsigned long addr)
>>> +{
>>> +    struct thread_struct *thread = &tsk->thread;
>>> +    struct perf_event *bp;
>>> +    struct perf_event_attr attr;
>>> +    struct arch_hw_breakpoint *info;
>>> +
>>> +    bp = thread->hbp_break[0];
>>> +    if (!bp) {
>>> +        ptrace_breakpoint_init(&attr);
>>> +
>>> +        attr.bp_addr = addr;
>>> +        attr.bp_len = HW_BREAKPOINT_LEN_8;
>>> +        attr.bp_type = HW_BREAKPOINT_X;
>>> +
>>> +        bp = register_user_hw_breakpoint(&attr, ptrace_triggered,
>>> +                         NULL, tsk);
>>> +        if (IS_ERR(bp))
>>> +            return PTR_ERR(bp);
>>> +
>>> +        thread->hbp_break[0] = bp;
>>> +    } else {
>>> +        int err;
>>> +
>>> +        attr = bp->attr;
>>> +        attr.bp_addr = addr;
>>> +        /* reenable breakpoint */
>>> +        attr.disabled = false;
>>> +        err = modify_user_hw_breakpoint(bp, &attr);
>>> +        if (unlikely(err))
>>> +            return err;
>>> +
>>> +        csr_write64(attr.bp_addr, LOONGARCH_CSR_IB0ADDR);
>>> +    }
>>> +    info = counter_arch_bp(bp);
>>> +    info->mask = 0xffffffffffff;
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +/* ptrace API */
>>> +void user_enable_single_step(struct task_struct *task)
>>> +{
>>> +    struct thread_info *ti = task_thread_info(task);
>>> +
>>> +    set_single_step(task, task_pt_regs(task)->csr_era);
>>> +    task->thread.single_step = task_pt_regs(task)->csr_era;
>>> +    set_ti_thread_flag(ti, TIF_SINGLESTEP);
>>> +}
>>> +
>>> +void user_disable_single_step(struct task_struct *task)
>>> +{
>>> +    clear_tsk_thread_flag(task, TIF_SINGLESTEP);
>>> +}
>>> diff --git a/arch/loongarch/kernel/traps.c
>>> b/arch/loongarch/kernel/traps.c
>>> index 2b133079e0f3..56d7d076153c 100644
>>> --- a/arch/loongarch/kernel/traps.c
>>> +++ b/arch/loongarch/kernel/traps.c
>>> @@ -511,9 +511,23 @@ asmlinkage void noinstr do_watch(struct pt_regs
>>> *regs)
>>>   #ifdef CONFIG_HAVE_HW_BREAKPOINT
>>>       irqentry_state_t state = irqentry_enter(regs);
>>>   -    breakpoint_handler(regs);
>>> -    watchpoint_handler(regs);
>>> -    force_sig(SIGTRAP);
>>> +    if (test_tsk_thread_flag(current, TIF_SINGLESTEP)) {
>>> +        int llbit = (csr_read32(LOONGARCH_CSR_LLBCTL) & 0x1);
>>> +        unsigned long pc = regs->csr_era;
>>> +
>>> +        if (llbit) {
>>
>> Hi, Qing,
>>
>>
>> It should be noted here. When the ll-sc combo is encountered, it is
>> regarded as a single instruction. So do not clear llbit and reset
>> CSR.FWPS.Skip until the ll-sc execution is completed.
>>
>>> +            csr_write32(0x10000, LOONGARCH_CSR_FWPS);
>>> +            csr_write32(0x4, LOONGARCH_CSR_LLBCTL);
>>> +        } else if (pc == current->thread.single_step) {
>> Note here as well. The 3A5000 has a strange hardware issue: certain
>> insns, such as fld.d/fst.d, are occasionally not skipped when
>> CSR.FWPS.Skip is set. Singlestep therefore needs to compare csr_era
>> with the single_step value that was set last time, as in most cases
>> they should not be equal.
>
> BTW, I would prefer to separate this special handling from this patch
> (for example, add two more patches to this series for the special
> handling of LL-SC and of instructions such as FLD.D/FST.D), and to add
> corresponding test cases that describe the phenomenon and the reason.
> That would help everyone understand the code.
>
>>
>>
>> And for this condition expression, some potentially strange insns may
>> cause bugs. For example, "b 0" or "jr rd" where rd is equal to its PC
>> will make it impossible to stop the singlestep. These insns are so
>> strange that we did not consider them in OW. However, I think we should
>> consider this case for robustness in upstream.
>>
>
> I don't know if there will be instructions like "b 0" or "jr rd (rd =
> pc)" in the executable file after linking?
FWIW `jirl A, A, 0` is apparently ubiquitous in the old world:

$ objdump -d ~/loongnix-sysroot/usr/lib/loongarch64-linux-gnu/libc_nonshared.a \
    | grep -E 'jirl\s+\$(..), \$\1'
  20:   4c000021        jirl            $ra, $ra, 0
  24:   4c000021        jirl            $ra, $ra, 0
   c:   4c000021        jirl            $ra, $ra, 0

One of the occurrences is in __stack_chk_fail_local, guess how popular
it is among the whole old world...
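
For anyone who wants to double-check the encoding rather than trust the
grep, here is a small sketch that decodes the word from the objdump
output above. It assumes the JIRL layout from the ISA manual (opcode in
bits 31:26, offs16 in 25:10, rj in 9:5, rd in 4:0); the struct and
function names are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define JIRL_OPCODE 0x13u	/* 0b010011 in bits 31:26 */

struct jirl_fields {
	unsigned rd;		/* link register */
	unsigned rj;		/* base register */
	int32_t offset;		/* byte offset: sign-extended offs16 << 2 */
};

/* Decode insn into out; returns false if insn is not a JIRL. */
bool decode_jirl(uint32_t insn, struct jirl_fields *out)
{
	assert(out != NULL);
	if ((insn >> 26) != JIRL_OPCODE)
		return false;
	out->rd = insn & 0x1f;
	out->rj = (insn >> 5) & 0x1f;
	out->offset = (int32_t)(int16_t)((insn >> 10) & 0xffff) * 4;
	return true;
}

/* True for `jirl A, A, 0`, the form shown in the output above. */
bool is_jirl_a_a_0(uint32_t insn)
{
	struct jirl_fields f;

	return decode_jirl(insn, &f) && f.rd == f.rj && f.offset == 0;
}
```

For 0x4c000021 this yields rd = rj = 1 (that is, $ra) with a zero
offset, which agrees with the disassembly.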

But fortunately a quick check against my Gentoo sysroot yielded nothing,
so perhaps the new world as a whole isn't going to be affected. I've
checked some of the *older* (i.e. before 2022) new world sysroots I have
at hand and they seemed not affected either.

As for `b 0`, they are kinda already signifying some panic/abort/halt
condition, at least in a few projects that I've studied/ported.

But IMO the kernel shouldn't get DoS'd even if userland gets stuck in
things like this, so some kind of workaround should still be necessary.

--
WANG "xen0n" Xuerui

Linux/LoongArch mailing list: https://lore.kernel.org/loongarch/


2023-02-17 02:36:31

by Qing Zhang

Subject: Re: [PATCH v3 2/3] LoongArch: Add ptrace single step support

Hi, folks,
On 2023/2/16 2:57 PM, WANG Xuerui wrote:
> On 2023/2/16 11:13, Youling Tang wrote:
>>
>>
>> On 02/16/2023 10:07 AM, Jinyang He wrote:
>>>
>>> On 2023-02-15 17:23, Qing Zhang wrote:
>>>> Use the generic ptrace_resume code for PTRACE_SYSCALL, PTRACE_CONT,
>>>> PTRACE_KILL and PTRACE_SINGLESTEP. This implies defining
>>>> arch_has_single_step in <asm/ptrace.h> and implementing the
>>>> user_enable_single_step and user_disable_single_step functions.
>>>>
>>>> LoongArch has no hardware single-step register. The hardware
>>>> single-step function multiplexes the instruction fetch watchpoint
>>>> (FWPS) and specifies that the next instruction must trigger the watch
>>>> exception by setting the mask bit.
>>>>
>>>> Signed-off-by: Qing Zhang <[email protected]>
>>>> ---
>>>>   arch/loongarch/include/asm/processor.h |  3 ++
>>>>   arch/loongarch/include/asm/ptrace.h    |  2 +
>>>>   arch/loongarch/kernel/hw_breakpoint.c  | 35 +++++++++++--
>>>>   arch/loongarch/kernel/ptrace.c         | 68
>>>> ++++++++++++++++++++++++++
>>>>   arch/loongarch/kernel/traps.c          | 20 ++++++--
>>>>   5 files changed, 120 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/arch/loongarch/include/asm/processor.h
>>>> b/arch/loongarch/include/asm/processor.h
>>>> index db060c5a976f..3ea0f1910c23 100644
>>>> --- a/arch/loongarch/include/asm/processor.h
>>>> +++ b/arch/loongarch/include/asm/processor.h
>>>> @@ -131,6 +131,9 @@ struct thread_struct {
>>>>       struct perf_event    *hbp_break[LOONGARCH_MAX_BRP];
>>>>       struct perf_event    *hbp_watch[LOONGARCH_MAX_WRP];
>>>>   +    /* Used by ptrace single_step */
>>>> +    unsigned long single_step;
>>>> +
>>>>       /*
>>>>        * FPU & vector registers, must be at last because
>>>>        * they are conditionally copied at fork().
>>>> diff --git a/arch/loongarch/include/asm/ptrace.h
>>>> b/arch/loongarch/include/asm/ptrace.h
>>>> index 58596c4f8a0f..66a0e6c480a3 100644
>>>> --- a/arch/loongarch/include/asm/ptrace.h
>>>> +++ b/arch/loongarch/include/asm/ptrace.h
>>>> @@ -150,4 +150,6 @@ static inline void user_stack_pointer_set(struct
>>>> pt_regs *regs,
>>>>       regs->regs[3] = val;
>>>>   }
>>>>   +#define arch_has_single_step()        (1)
>>>> +
>>>>   #endif /* _ASM_PTRACE_H */
>>>> diff --git a/arch/loongarch/kernel/hw_breakpoint.c
>>>> b/arch/loongarch/kernel/hw_breakpoint.c
>>>> index 6431cd319c32..94967b887d92 100644
>>>> --- a/arch/loongarch/kernel/hw_breakpoint.c
>>>> +++ b/arch/loongarch/kernel/hw_breakpoint.c
>>>> @@ -153,6 +153,22 @@ static int hw_breakpoint_slot_setup(struct
>>>> perf_event **slots, int max_slots,
>>>>    */
>>>>   void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
>>>>   {
>>>> +    int i;
>>>> +    struct thread_struct *t = &tsk->thread;
>>>> +
>>>> +    for (i = 0; i < LOONGARCH_MAX_BRP; i++) {
>>>> +        if (t->hbp_break[i]) {
>>>> +            unregister_hw_breakpoint(t->hbp_break[i]);
>>>> +            t->hbp_break[i] = NULL;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    for (i = 0; i < LOONGARCH_MAX_WRP; i++) {
>>>> +        if (t->hbp_watch[i]) {
>>>> +            unregister_hw_breakpoint(t->hbp_watch[i]);
>>>> +            t->hbp_watch[i] = NULL;
>>>> +        }
>>>> +    }
>>>>   }
>>>>     void ptrace_hw_copy_thread(struct task_struct *tsk)
>>>> @@ -498,11 +514,20 @@ arch_initcall(arch_hw_breakpoint_init);
>>>>   void hw_breakpoint_thread_switch(struct task_struct *next)
>>>>   {
>>>>       struct pt_regs *regs = task_pt_regs(next);
>>>> -
>>>> -    /* Update breakpoints */
>>>> -    update_bp_registers(regs, 1, 0);
>>>> -    /* Update watchpoints */
>>>> -    update_bp_registers(regs, 1, 1);
>>>> +    u64 addr, mask;
>>>> +
>>>> +    if (test_bit(TIF_SINGLESTEP, &task_thread_info(next)->flags)) {
>>>> +        addr = read_wb_reg(CSR_CFG_ADDR, 0, 0);
>>>> +        mask = read_wb_reg(CSR_CFG_MASK, 0, 0);
>>>> +        if ((task_pt_regs(next)->csr_era & ~mask) == (addr & ~mask))
>>>> +            csr_write32(0x10000, LOONGARCH_CSR_FWPS);
>>>> +        regs->csr_prmd |= CSR_PRMD_PWE;
>>>> +    } else {
>>>> +        /* Update breakpoints */
>>>> +        update_bp_registers(regs, 1, 0);
>>>> +        /* Update watchpoints */
>>>> +        update_bp_registers(regs, 1, 1);
>>>> +    }
>>>>   }
>>>>     void hw_breakpoint_pmu_read(struct perf_event *bp)
>>>> diff --git a/arch/loongarch/kernel/ptrace.c
>>>> b/arch/loongarch/kernel/ptrace.c
>>>> index bee4194177fd..52a3ee4366f4 100644
>>>> --- a/arch/loongarch/kernel/ptrace.c
>>>> +++ b/arch/loongarch/kernel/ptrace.c
>>>> @@ -20,6 +20,7 @@
>>>>   #include <linux/context_tracking.h>
>>>>   #include <linux/elf.h>
>>>>   #include <linux/errno.h>
>>>> +#include <linux/hw_breakpoint.h>
>>>>   #include <linux/mm.h>
>>>>   #include <linux/ptrace.h>
>>>>   #include <linux/regset.h>
>>>> @@ -30,6 +31,7 @@
>>>>   #include <linux/stddef.h>
>>>>   #include <linux/seccomp.h>
>>>>   #include <linux/uaccess.h>
>>>> +#include <linux/thread_info.h>
>>>>     #include <asm/byteorder.h>
>>>>   #include <asm/cpu.h>
>>>> @@ -39,6 +41,7 @@
>>>>   #include <asm/page.h>
>>>>   #include <asm/pgtable.h>
>>>>   #include <asm/processor.h>
>>>> +#include <asm/ptrace.h>
>>>>   #include <asm/reg.h>
>>>>   #include <asm/syscall.h>
>>>>   @@ -541,3 +544,68 @@ long arch_ptrace(struct task_struct *child,
>>>> long request,
>>>>         return ret;
>>>>   }
>>>> +
>>>> +void ptrace_triggered(struct perf_event *bp,
>>>> +              struct perf_sample_data *data, struct pt_regs *regs)
>>>> +{
>>>> +    struct perf_event_attr attr;
>>>> +
>>>> +    attr = bp->attr;
>>>> +    attr.disabled = true;
>>>> +    modify_user_hw_breakpoint(bp, &attr);
>>>> +}
>>>> +
>>>> +static int set_single_step(struct task_struct *tsk, unsigned long
>>>> addr)
>>>> +{
>>>> +    struct thread_struct *thread = &tsk->thread;
>>>> +    struct perf_event *bp;
>>>> +    struct perf_event_attr attr;
>>>> +    struct arch_hw_breakpoint *info;
>>>> +
>>>> +    bp = thread->hbp_break[0];
>>>> +    if (!bp) {
>>>> +        ptrace_breakpoint_init(&attr);
>>>> +
>>>> +        attr.bp_addr = addr;
>>>> +        attr.bp_len = HW_BREAKPOINT_LEN_8;
>>>> +        attr.bp_type = HW_BREAKPOINT_X;
>>>> +
>>>> +        bp = register_user_hw_breakpoint(&attr, ptrace_triggered,
>>>> +                         NULL, tsk);
>>>> +        if (IS_ERR(bp))
>>>> +            return PTR_ERR(bp);
>>>> +
>>>> +        thread->hbp_break[0] = bp;
>>>> +    } else {
>>>> +        int err;
>>>> +
>>>> +        attr = bp->attr;
>>>> +        attr.bp_addr = addr;
>>>> +        /* reenable breakpoint */
>>>> +        attr.disabled = false;
>>>> +        err = modify_user_hw_breakpoint(bp, &attr);
>>>> +        if (unlikely(err))
>>>> +            return err;
>>>> +
>>>> +        csr_write64(attr.bp_addr, LOONGARCH_CSR_IB0ADDR);
>>>> +    }
>>>> +    info = counter_arch_bp(bp);
>>>> +    info->mask = 0xffffffffffff;
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +/* ptrace API */
>>>> +void user_enable_single_step(struct task_struct *task)
>>>> +{
>>>> +    struct thread_info *ti = task_thread_info(task);
>>>> +
>>>> +    set_single_step(task, task_pt_regs(task)->csr_era);
>>>> +    task->thread.single_step = task_pt_regs(task)->csr_era;
>>>> +    set_ti_thread_flag(ti, TIF_SINGLESTEP);
>>>> +}
>>>> +
>>>> +void user_disable_single_step(struct task_struct *task)
>>>> +{
>>>> +    clear_tsk_thread_flag(task, TIF_SINGLESTEP);
>>>> +}
>>>> diff --git a/arch/loongarch/kernel/traps.c
>>>> b/arch/loongarch/kernel/traps.c
>>>> index 2b133079e0f3..56d7d076153c 100644
>>>> --- a/arch/loongarch/kernel/traps.c
>>>> +++ b/arch/loongarch/kernel/traps.c
>>>> @@ -511,9 +511,23 @@ asmlinkage void noinstr do_watch(struct pt_regs
>>>> *regs)
>>>>   #ifdef CONFIG_HAVE_HW_BREAKPOINT
>>>>       irqentry_state_t state = irqentry_enter(regs);
>>>>   -    breakpoint_handler(regs);
>>>> -    watchpoint_handler(regs);
>>>> -    force_sig(SIGTRAP);
>>>> +    if (test_tsk_thread_flag(current, TIF_SINGLESTEP)) {
>>>> +        int llbit = (csr_read32(LOONGARCH_CSR_LLBCTL) & 0x1);
>>>> +        unsigned long pc = regs->csr_era;
>>>> +
>>>> +        if (llbit) {
>>>
>>> Hi, Qing,
>>>
>>>
>>> It should be noted here. When the ll-sc combo is encountered, it is
>>> regarded as a single instruction. So do not clear llbit and reset
>>> CSR.FWPS.Skip until the ll-sc execution is completed.
>>>
>>>> +            csr_write32(0x10000, LOONGARCH_CSR_FWPS);
>>>> +            csr_write32(0x4, LOONGARCH_CSR_LLBCTL);
>>>> +        } else if (pc == current->thread.single_step) {
>>> Note here as well: the 3A5000 has a strange hardware issue where
>>> certain insns, such as fld.d/fst.d, are occasionally not skipped when
>>> CSR.FWPS.Skip is set. Singlestep therefore needs to compare csr_era
>>> with the value recorded when singlestep was last set; in most cases
>>> they should not be equal.
>>
>> BTW, I prefer to separate this special processing from this patch (for
>> example, add two patches to this series for the special handling of
>> instructions such as LL-SC and FLD.D/FST.D), and to add corresponding
>> test cases that describe the phenomenon and the reason. This would
>> help everyone understand the code.
>>
To keep the patch functionally complete, I think it is better to add
comments than to split it.
>>>
>>>
>>> As for this condition expression, some potentially strange insns may
>>> cause bugs. For example, "b 0" or "jr rd" where rd is equal to its PC
>>> will make it impossible to stop the singlestep. These insns are so
>>> strange that we did not consider them in OW. However, I think we should
>>> consider this case for robustness in upstream.
>>>
>>
>> Would instructions like "b 0" or "jr rd (rd = pc)" actually appear in
>> the executable file after linking?
> FWIW `jirl A, A, 0` is apparently ubiquitous in the old world:
>
> $ objdump -d ~/loongnix-sysroot/usr/lib/loongarch64-linux-gnu/libc_nonshared.a \
>     | grep -E 'jirl\s+\$(..), \$\1'
>   20:   4c000021        jirl            $ra, $ra, 0
>   24:   4c000021        jirl            $ra, $ra, 0
>    c:   4c000021        jirl            $ra, $ra, 0
>
> One of the occurrences is in __stack_chk_fail_local, guess how popular
> it is among the whole old world...
>
> But fortunately a quick check against my Gentoo sysroot yielded nothing,
> so perhaps the new world as a whole isn't going to be affected. I've
> checked some of the *older* (i.e. before 2022) new world sysroots I have
> at hand and they seemed not affected either.
>
> As for `b 0`, they are kinda already signifying some panic/abort/halt
> condition, at least in a few projects that I've studied/ported.
>
> But IMO the kernel shouldn't get DoS'd even if userland gets stuck in
> things like this, so some kind of workaround should still be necessary.
>
Got it, the fix is already in v4.
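
For anyone following along, the hang discussed above can be modeled in
user space roughly as below. This is only a sketch and not the v4 code:
step_decide(), steps_until_trap(), the enum and the check_self flag are
hypothetical names, and the final "report SIGTRAP" branch is assumed,
since it is not visible in the hunk quoted above.

```c
#include <stdbool.h>

/* Hypothetical model of what the single-step path decides per exception. */
enum step_action {
	STEP_SKIP_LLSC,		/* inside ll/sc: set Skip, keep going silently */
	STEP_SKIP_RETRY,	/* pc did not advance: set Skip and retry */
	STEP_TRAP,		/* assumed: report SIGTRAP to the tracer */
};

/*
 * llbit:        bit 0 of LLBCTL was set when the exception was taken
 * pc:           regs->csr_era at the exception
 * last_step_pc: pc recorded when single-step was last armed
 * self_branch:  insn at pc jumps to pc itself ("b 0", "jr rd", rd == pc)
 * check_self:   apply an extra self-branch check (assumed mitigation)
 */
static enum step_action step_decide(bool llbit, unsigned long pc,
				    unsigned long last_step_pc,
				    bool self_branch, bool check_self)
{
	if (llbit)
		return STEP_SKIP_LLSC;
	/* Equal pc normally means the insn was not skipped by hardware. */
	if (pc == last_step_pc && !(check_self && self_branch))
		return STEP_SKIP_RETRY;
	return STEP_TRAP;
}

/*
 * Step an insn that branches to itself and return the number of
 * exceptions taken before a trap is reported, or -1 if none is
 * reported within 'limit' exceptions.
 */
static int steps_until_trap(bool check_self, int limit)
{
	unsigned long pc = 0x1000, last_step_pc = 0x1000;

	for (int i = 1; i <= limit; i++) {
		if (step_decide(false, pc, last_step_pc, true,
				check_self) == STEP_TRAP)
			return i;
		/* Self-branch: pc is unchanged after executing the insn. */
	}
	return -1;
}
```

With only the pc comparison, a self-branching insn looks the same as the
"insn was not skipped" case every time, so the tracer is never notified.
Telling the two cases apart needs some extra information about the insn
at pc, which is what the check_self flag stands in for here.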

Thanks,
-Qing