2009-03-19 23:48:34

by K.Prasad

[permalink] [raw]
Subject: [Patch 01/11] Introducing generic hardware breakpoint handler interfaces

This patch introduces two new files hw_breakpoint.[ch] which defines the
generic interfaces to use hardware breakpoint infrastructure of the system.

Signed-off-by: K.Prasad <[email protected]>
Signed-off-by: Alan Stern <[email protected]>
---
arch/Kconfig | 3
include/asm-generic/hw_breakpoint.h | 140 +++++++++++++
kernel/Makefile | 1
kernel/hw_breakpoint.c | 361 ++++++++++++++++++++++++++++++++++++
4 files changed, 505 insertions(+)

Index: linux-2.6-tip.hbkpt/kernel/hw_breakpoint.c
===================================================================
--- /dev/null
+++ linux-2.6-tip.hbkpt/kernel/hw_breakpoint.c
@@ -0,0 +1,361 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2007 Alan Stern
+ * Copyright (C) IBM Corporation, 2009
+ */
+
+/*
+ * HW_breakpoint: a unified kernel/user-space hardware breakpoint facility,
+ * using the CPU's debug registers.
+ *
+ * This file contains the arch-independent routines. It is not meant
+ * to be compiled as a standalone source file; rather it should be
+ * #include'd by the arch-specific implementation.
+ */
+
+#include <linux/irqflags.h>
+#include <linux/kallsyms.h>
+#include <linux/notifier.h>
+#include <linux/kprobes.h>
+#include <linux/kdebug.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/percpu.h>
+#include <linux/mutex.h>
+#include <linux/sched.h>
+#include <linux/init.h>
+#include <linux/smp.h>
+
+#include <asm/hw_breakpoint.h>
+#include <asm/processor.h>
+#include <asm/debugreg.h>
+
+/* Array of kernel-space breakpoint structures */
+struct hw_breakpoint *hbkpt_kernel[HB_NUM];
+/*
+ * Kernel breakpoints grow downwards, starting from HB_NUM
+ * 'hbkpt_kernel_pos' denotes lowest numbered breakpoint register occupied for
+ * kernel-space request
+ */
+unsigned int hbkpt_kernel_pos;
+
+/* An array containing refcount of threads using a given bkpt register */
+unsigned int hbkpt_user_max_refcount[HB_NUM];
+
+/* One higher than the highest counted user-space breakpoint register */
+unsigned int hbkpt_user_max;
+
+struct task_struct *last_debugged_task;
+
+/*
+ * Install the debug register values for a new thread.
+ */
+void switch_to_thread_hw_breakpoint(struct task_struct *tsk)
+{
+ /* Set the debug register */
+ arch_install_thread_hbkpt(tsk);
+ last_debugged_task = current;
+
+ put_cpu_no_resched();
+}
+
+/*
+ * Install the debug register values for just the kernel, no thread.
+ */
+void switch_to_none_hw_breakpoint(void)
+{
+ arch_install_none();
+ put_cpu_no_resched();
+}
+
+/*
+ * Load the debug registers during startup of a CPU.
+ */
+void load_debug_registers(void)
+{
+ int i;
+ unsigned long flags;
+
+ /* Prevent IPIs for new kernel breakpoint updates */
+ local_irq_save(flags);
+
+ for (i = hbkpt_kernel_pos; i < HB_NUM; i++)
+ if (hbkpt_kernel[i])
+ on_each_cpu(arch_install_kernel_hbkpt,
+ (void *)hbkpt_kernel[i], 0);
+ if (current->thread.dr7)
+ arch_install_thread_hbkpt(current);
+
+ local_irq_restore(flags);
+}
+
+/*
+ * Erase all the hardware breakpoint info associated with a thread.
+ *
+ * If tsk != current then tsk must not be usable (for example, a
+ * child being cleaned up from a failed fork).
+ */
+void flush_thread_hw_breakpoint(struct task_struct *tsk)
+{
+ int i;
+ struct thread_struct *thread = &(tsk->thread);
+
+ mutex_lock(&hw_breakpoint_mutex);
+
+ /* Let the breakpoints know they are being uninstalled */
+
+ /* The thread no longer has any breakpoints associated with it */
+ clear_tsk_thread_flag(tsk, TIF_DEBUG);
+ for (i = 0; i < HB_NUM; i++) {
+ if (thread->hbkpt[i]) {
+ hbkpt_user_max_refcount[i]--;
+ if (!hbkpt_user_max_refcount[i])
+ hbkpt_user_max--;
+ kfree(thread->hbkpt[i]);
+ thread->hbkpt[i] = NULL;
+ }
+ }
+ thread->hbkpt_num_installed = 0;
+
+ /* Actually uninstall the breakpoints if necessary */
+ if (tsk == current)
+ switch_to_none_hw_breakpoint();
+ mutex_unlock(&hw_breakpoint_mutex);
+}
+
+/*
+ * Copy the hardware breakpoint info from a thread to its cloned child.
+ */
+int copy_thread_hw_breakpoint(struct task_struct *tsk,
+ struct task_struct *child, unsigned long clone_flags)
+{
+ /* We will assume that breakpoint settings are not inherited
+ * and the child starts out with no debug registers set.
+ * But what about CLONE_PTRACE?
+ */
+ clear_tsk_thread_flag(child, TIF_DEBUG);
+ return 0;
+}
+
+/*
+ * Validate the settings in a hw_breakpoint structure.
+ */
+static int validate_settings(struct hw_breakpoint *bp, struct task_struct *tsk)
+{
+ int ret;
+ unsigned int align;
+
+ ret = arch_validate_hwbkpt_settings(bp, &align, tsk);
+ if (ret < 0)
+ goto err;
+
+ /* Check that the low-order bits of the address are appropriate
+ * for the alignment implied by len.
+ */
+ if (bp->info.address & align)
+ return -EINVAL;
+
+ /* Check that the virtual address is in the proper range */
+ if (tsk) {
+ if (!arch_check_va_in_userspace(bp->info.address, tsk))
+ return -EFAULT;
+ } else {
+ if (!arch_check_va_in_kernelspace(bp->info.address))
+ return -EFAULT;
+ }
+ err:
+ return ret;
+}
+
+int __register_user_hw_breakpoint(int pos, struct task_struct *tsk,
+ struct hw_breakpoint *bp)
+{
+ struct thread_struct *thread = &(tsk->thread);
+ int rc;
+
+ /* Do not overcommit. Fail if kernel has used the hbkpt registers */
+ if (pos >= hbkpt_kernel_pos)
+ return -ENOSPC;
+
+ rc = validate_settings(bp, tsk);
+ if (rc)
+ return rc;
+
+ thread->hbkpt[pos] = bp;
+ thread->hbkpt_num_installed++;
+ hbkpt_user_max_refcount[pos]++;
+ /* 'tsk' is the thread that uses max number of hbkpt registers */
+ if (hbkpt_user_max < thread->hbkpt_num_installed)
+ hbkpt_user_max++;
+
+ arch_register_user_hw_breakpoint(pos, bp, tsk);
+
+ /*
+ * Does it need to be installed right now?
+ * Otherwise it will get installed the next time tsk runs
+ */
+ if (tsk == current)
+ switch_to_thread_hw_breakpoint(tsk);
+ return rc;
+}
+
+/*
+ * Modify the address of a hbkpt register already in use by the task
+ * Do not invoke this in-lieu of a __unregister_user_hw_breakpoint()
+ */
+int __modify_user_hw_breakpoint(int pos, struct task_struct *tsk,
+ struct hw_breakpoint *bp)
+{
+ int rc;
+ struct thread_struct *thread = &(tsk->thread);
+
+ if ((pos >= hbkpt_kernel_pos) || (validate_settings(bp, tsk)))
+ return -EINVAL;
+
+ thread->hbkpt[pos] = bp;
+
+ /*
+ * 'pos' must be that of a hbkpt register already used by 'tsk'
+ * Otherwise arch_modify_user_hw_breakpoint() will fail
+ */
+ rc = arch_modify_user_hw_breakpoint(pos, bp, tsk);
+ if (rc)
+ return rc;
+
+ if (tsk == current)
+ switch_to_thread_hw_breakpoint(tsk);
+ return 0;
+}
+
+/*
+ * Actual implementation of unregister_user_hw_breakpoint.
+ */
+void __unregister_user_hw_breakpoint(int pos, struct task_struct *tsk,
+ struct hw_breakpoint *bp)
+{
+ struct thread_struct *thread = &(tsk->thread);
+
+ if (!bp)
+ return;
+
+ hbkpt_user_max_refcount[pos]--;
+ if ((hbkpt_user_max == pos + 1) && (hbkpt_user_max_refcount[pos] == 0))
+ hbkpt_user_max--;
+ thread->hbkpt_num_installed--;
+
+ arch_unregister_user_hw_breakpoint(pos, bp, tsk);
+
+ if (tsk == current)
+ switch_to_thread_hw_breakpoint(tsk);
+ kfree(tsk->thread.hbkpt[pos]);
+ tsk->thread.hbkpt[pos] = NULL;
+}
+
+/**
+ * register_kernel_hw_breakpoint - register a hardware breakpoint for kernel space
+ * @bp: the breakpoint structure to register
+ *
+ * @bp.info->name or @bp.info->address, @bp.info->len, @bp.info->type and
+ * @bp->triggered must be set properly before invocation
+ *
+ */
+int register_kernel_hw_breakpoint(struct hw_breakpoint *bp)
+{
+ int rc;
+
+ rc = validate_settings(bp, NULL);
+ if (rc)
+ return rc;
+
+ mutex_lock(&hw_breakpoint_mutex);
+
+ /* Check if we are over-committing */
+ if (hbkpt_kernel_pos <= hbkpt_user_max) {
+ mutex_unlock(&hw_breakpoint_mutex);
+ return -EINVAL;
+ }
+
+ hbkpt_kernel_pos--;
+ hbkpt_kernel[hbkpt_kernel_pos] = bp;
+ arch_register_kernel_hw_breakpoint(bp);
+
+ mutex_unlock(&hw_breakpoint_mutex);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(register_kernel_hw_breakpoint);
+
+/**
+ * unregister_kernel_hw_breakpoint - unregister a hardware breakpoint for kernel space
+ * @bp: the breakpoint structure to unregister
+ *
+ * Uninstalls and unregisters @bp.
+ */
+void unregister_kernel_hw_breakpoint(struct hw_breakpoint *bp)
+{
+ int i, j;
+
+ mutex_lock(&hw_breakpoint_mutex);
+
+ for (i = hbkpt_kernel_pos; i < HB_NUM; i++)
+ if (bp == hbkpt_kernel[i])
+ break;
+
+ arch_unregister_kernel_hw_breakpoint(i);
+
+ /*
+ * We'll shift the breakpoints one-level above to accomodate new thread
+ * requests
+ */
+ if (i > hbkpt_kernel_pos)
+ for (j = i; j == hbkpt_kernel_pos; j--)
+ hbkpt_kernel[j] = hbkpt_kernel[j-1];
+ hbkpt_kernel_pos++;
+
+ mutex_unlock(&hw_breakpoint_mutex);
+}
+EXPORT_SYMBOL_GPL(unregister_kernel_hw_breakpoint);
+
+/*
+ * Handle debug exception notifications.
+ */
+static int __kprobes hw_breakpoint_exceptions_notify(
+ struct notifier_block *unused, unsigned long val, void *data)
+{
+ if (val != DIE_DEBUG)
+ return NOTIFY_DONE;
+ return hw_breakpoint_handler(data);
+}
+
+static struct notifier_block hw_breakpoint_exceptions_nb = {
+ .notifier_call = hw_breakpoint_exceptions_notify,
+ /* we need to be notified first */
+ .priority = 0x7fffffff
+};
+
+static int __init init_hw_breakpoint(void)
+{
+ int i;
+
+ hbkpt_kernel_pos = HB_NUM;
+ for (i = 0; i < HB_NUM; i++)
+ hbkpt_user_max_refcount[i] = 0;
+ hbkpt_user_max = 0;
+ load_debug_registers();
+
+ return register_die_notifier(&hw_breakpoint_exceptions_nb);
+}
+
+core_initcall(init_hw_breakpoint);
Index: linux-2.6-tip.hbkpt/kernel/Makefile
===================================================================
--- linux-2.6-tip.hbkpt.orig/kernel/Makefile
+++ linux-2.6-tip.hbkpt/kernel/Makefile
@@ -95,6 +95,7 @@ obj-$(CONFIG_FUNCTION_TRACER) += trace/
obj-$(CONFIG_TRACING) += trace/
obj-$(CONFIG_SMP) += sched_cpupri.o
obj-$(CONFIG_PERF_COUNTERS) += perf_counter.o
+obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o

ifneq ($(CONFIG_SCHED_OMIT_FRAME_POINTER),y)
# According to Alan Modra <[email protected]>, the -fno-omit-frame-pointer is
Index: linux-2.6-tip.hbkpt/include/asm-generic/hw_breakpoint.h
===================================================================
--- /dev/null
+++ linux-2.6-tip.hbkpt/include/asm-generic/hw_breakpoint.h
@@ -0,0 +1,140 @@
+#ifndef _ASM_GENERIC_HW_BREAKPOINT_H
+#define _ASM_GENERIC_HW_BREAKPOINT_H
+
+#ifndef __ARCH_HW_BREAKPOINT_H
+#error "Please don't include this file directly"
+#endif
+
+#ifdef __KERNEL__
+#include <linux/list.h>
+#include <linux/types.h>
+#include <linux/kallsyms.h>
+
+/**
+ * struct hw_breakpoint - unified kernel/user-space hardware breakpoint
+ * @triggered: callback invoked after target address access
+ * @info: arch-specific breakpoint info (address, length, and type)
+ *
+ * %hw_breakpoint structures are the kernel's way of representing
+ * hardware breakpoints. These are data breakpoints
+ * (also known as "watchpoints", triggered on data access), and the breakpoint's
+ * target address can be located in either kernel space or user space.
+ *
+ * The breakpoint's address, length, and type are highly
+ * architecture-specific. The values are encoded in the @info field; you
+ * specify them when registering the breakpoint. To examine the encoded
+ * values use hw_breakpoint_get_{kaddress,uaddress,len,type}(), declared
+ * below.
+ *
+ * The address is specified as a regular kernel pointer (for kernel-space
+ * breakponts) or as an %__user pointer (for user-space breakpoints).
+ * With register_user_hw_breakpoint(), the address must refer to a
+ * location in user space. The breakpoint will be active only while the
+ * requested task is running. Conversely with
+ * register_kernel_hw_breakpoint(), the address must refer to a location
+ * in kernel space, and the breakpoint will be active on all CPUs
+ * regardless of the current task.
+ *
+ * The length is the breakpoint's extent in bytes, which is subject to
+ * certain limitations. include/asm/hw_breakpoint.h contains macros
+ * defining the available lengths for a specific architecture. Note that
+ * the address's alignment must match the length. The breakpoint will
+ * catch accesses to any byte in the range from address to address +
+ * (length - 1).
+ *
+ * The breakpoint's type indicates the sort of access that will cause it
+ * to trigger. Possible values may include:
+ *
+ * %HW_BREAKPOINT_RW (triggered on read or write access),
+ * %HW_BREAKPOINT_WRITE (triggered on write access), and
+ * %HW_BREAKPOINT_READ (triggered on read access).
+ *
+ * Appropriate macros are defined in include/asm/hw_breakpoint.h; not all
+ * possibilities are available on all architectures. Execute breakpoints
+ * must have length equal to the special value %HW_BREAKPOINT_LEN_EXECUTE.
+ *
+ * When a breakpoint gets hit, the @triggered callback is
+ * invoked in_interrupt with a pointer to the %hw_breakpoint structure and the
+ * processor registers.
+ * Data breakpoints occur after the memory access has taken place.
+ * Breakpoints are disabled during execution @triggered, to avoid
+ * recursive traps and allow unhindered access to breakpointed memory.
+ *
+ * This sample code sets a breakpoint on pid_max and registers a callback
+ * function for writes to that variable. Note that it is not portable
+ * as written, because not all architectures support HW_BREAKPOINT_LEN_4.
+ *
+ * ----------------------------------------------------------------------
+ *
+ * #include <asm/hw_breakpoint.h>
+ *
+ * struct hw_breakpoint my_bp;
+ *
+ * static void my_triggered(struct hw_breakpoint *bp, struct pt_regs *regs)
+ * {
+ * printk(KERN_DEBUG "Inside triggered routine of breakpoint exception\n");
+ * dump_stack();
+ * .......<more debugging output>........
+ * }
+ *
+ * static struct hw_breakpoint my_bp;
+ *
+ * static int init_module(void)
+ * {
+ * ..........<do anything>............
+ * my_bp.info.type = HW_BREAKPOINT_WRITE;
+ * my_bp.info.len = HW_BREAKPOINT_LEN_4;
+ *
+ * my_bp.installed = (void *)my_bp_installed;
+ *
+ * rc = register_kernel_hw_breakpoint(&my_bp);
+ * ..........<do anything>............
+ * }
+ *
+ * static void cleanup_module(void)
+ * {
+ * ..........<do anything>............
+ * unregister_kernel_hw_breakpoint(&my_bp);
+ * ..........<do anything>............
+ * }
+ *
+ * ----------------------------------------------------------------------
+ */
+struct hw_breakpoint {
+ void (*triggered)(struct hw_breakpoint *, struct pt_regs *);
+ struct arch_hw_breakpoint info;
+};
+
+/*
+ * len and type values are defined in include/asm/hw_breakpoint.h.
+ * Available values vary according to the architecture. On i386 the
+ * possibilities are:
+ *
+ * HW_BREAKPOINT_LEN_1
+ * HW_BREAKPOINT_LEN_2
+ * HW_BREAKPOINT_LEN_4
+ * HW_BREAKPOINT_LEN_EXECUTE
+ * HW_BREAKPOINT_RW
+ * HW_BREAKPOINT_READ
+ * HW_BREAKPOINT_EXECUTE
+ *
+ * On other architectures HW_BREAKPOINT_LEN_8 may be available, and the
+ * 1-, 2-, and 4-byte lengths may be unavailable. There also may be
+ * HW_BREAKPOINT_WRITE. You can use #ifdef to check at compile time.
+ */
+
+static DEFINE_MUTEX(hw_breakpoint_mutex); /* Protects everything */
+
+/*
+ * Kernel breakpoints are not associated with any particular thread.
+ */
+int register_kernel_hw_breakpoint(struct hw_breakpoint *bp);
+void unregister_kernel_hw_breakpoint(struct hw_breakpoint *bp);
+void switch_to_none_hw_breakpoint(void);
+
+extern unsigned int hbkpt_kernel_pos;
+extern unsigned int hbkpt_user_max;
+extern struct task_struct *last_debugged_task;
+
+#endif /* __KERNEL__ */
+#endif /* _ASM_GENERIC_HW_BREAKPOINT_H */
Index: linux-2.6-tip.hbkpt/arch/Kconfig
===================================================================
--- linux-2.6-tip.hbkpt.orig/arch/Kconfig
+++ linux-2.6-tip.hbkpt/arch/Kconfig
@@ -106,3 +106,6 @@ config HAVE_CLK
help
The <linux/clk.h> calls support software clock gating and
thus are a key power management tool on many systems.
+
+config HAVE_HW_BREAKPOINT
+ bool


2009-03-20 14:33:41

by Alan Stern

[permalink] [raw]
Subject: Re: [Patch 01/11] Introducing generic hardware breakpoint handler interfaces

On Fri, 20 Mar 2009, K.Prasad wrote:

> This patch introduces two new files hw_breakpoint.[ch] which defines the
> generic interfaces to use hardware breakpoint infrastructure of the system.

Prasad:

I'm sorry to say this is full of mistakes. So far I have looked only
at patch 01/11, but it's not good.

> + * Kernel breakpoints grow downwards, starting from HB_NUM
> + * 'hbkpt_kernel_pos' denotes lowest numbered breakpoint register occupied for
> + * kernel-space request
> + */
> +unsigned int hbkpt_kernel_pos;

This doesn't make much sense. All you need to know is which registers
are in use; all others are available.

For example, suppose the kernel allocated breakpoints 3, 2, and 1, and
then deallocated 2. Then bp 2 would be available for use, even though
2 > 1.

It's also a poor choice of name. Everywhere else (in my patches,
anyway) the code refers to hardware breakpoints using the abbreviation
"hwbp" or "hw_breakpoint". There's no reason suddenly to start using
"hbkpt".

> +/* An array containing refcount of threads using a given bkpt register */
> +unsigned int hbkpt_user_max_refcount[HB_NUM];

Why did you put "max" in the name? Isn't this just a simple refcount?

> +/* One higher than the highest counted user-space breakpoint register */
> +unsigned int hbkpt_user_max;

Likewise, this variable isn't really needed. It's just one more than
the largest i such that hbkpt_user_max_refcount[i] > 0.

> +/*
> + * Install the debug register values for a new thread.
> + */
> +void switch_to_thread_hw_breakpoint(struct task_struct *tsk)
> +{
> + /* Set the debug register */

Set _which_ debug register?

> + arch_install_thread_hbkpt(tsk);
> + last_debugged_task = current;
> +
> + put_cpu_no_resched();

What's this line doing here? It looks like something you forgot to
erase.

> +}
> +
> +/*
> + * Install the debug register values for just the kernel, no thread.
> + */
> +void switch_to_none_hw_breakpoint(void)
> +{
> + arch_install_none();
> + put_cpu_no_resched();

Same for this line.

> +}
> +
> +/*
> + * Load the debug registers during startup of a CPU.
> + */
> +void load_debug_registers(void)
> +{
> + int i;
> + unsigned long flags;
> +
> + /* Prevent IPIs for new kernel breakpoint updates */
> + local_irq_save(flags);
> +
> + for (i = hbkpt_kernel_pos; i < HB_NUM; i++)
> + if (hbkpt_kernel[i])
> + on_each_cpu(arch_install_kernel_hbkpt,
> + (void *)hbkpt_kernel[i], 0);

This is completely wrong. First of all, it's dumb to send multiple
IPIs (one for each iteration through the loop). Second, this routine
shouldn't send any IPIs at all! It gets invoked when a CPU is
starting up and wants to load its _own_ debug registers -- not tell
another CPU to load anything.

> + if (current->thread.dr7)
> + arch_install_thread_hbkpt(current);
> +
> + local_irq_restore(flags);
> +}
> +
> +/*
> + * Erase all the hardware breakpoint info associated with a thread.
> + *
> + * If tsk != current then tsk must not be usable (for example, a
> + * child being cleaned up from a failed fork).
> + */
> +void flush_thread_hw_breakpoint(struct task_struct *tsk)
> +{
> + int i;
> + struct thread_struct *thread = &(tsk->thread);
> +
> + mutex_lock(&hw_breakpoint_mutex);
> +
> + /* Let the breakpoints know they are being uninstalled */

This comment looks like a leftover which should have been erased.

> +/*
> + * Validate the settings in a hw_breakpoint structure.
> + */
> +static int validate_settings(struct hw_breakpoint *bp, struct task_struct *tsk)
> +{
> + int ret;
> + unsigned int align;
> +
> + ret = arch_validate_hwbkpt_settings(bp, &align, tsk);
> + if (ret < 0)
> + goto err;
> +
> + /* Check that the low-order bits of the address are appropriate
> + * for the alignment implied by len.
> + */
> + if (bp->info.address & align)
> + return -EINVAL;
> +
> + /* Check that the virtual address is in the proper range */
> + if (tsk) {
> + if (!arch_check_va_in_userspace(bp->info.address, tsk))
> + return -EFAULT;
> + } else {
> + if (!arch_check_va_in_kernelspace(bp->info.address))
> + return -EFAULT;
> + }

Roland pointed out that these checks need to take into account the
length of the breakpoint. For example, in
arch_check_va_in_userspace() it isn't sufficient for the start of the
breakpoint region to be a userspace address; the end of the
breakpoint region must also be in userspace.

> + err:
> + return ret;
> +}
> +
> +int __register_user_hw_breakpoint(int pos, struct task_struct *tsk,
> + struct hw_breakpoint *bp)
> +{
> + struct thread_struct *thread = &(tsk->thread);
> + int rc;
> +
> + /* Do not overcommit. Fail if kernel has used the hbkpt registers */
> + if (pos >= hbkpt_kernel_pos)
> + return -ENOSPC;

In fact you should fail if the debug register is already in use,
regardless of whether it is being used by a kernel breakpoint. And you
shouldn't check against hbkpt_kernel_pos; you should check whether
hbkpt_kernel[pos] is NULL and thread->hbkpt[pos] is NULL.

> +
> + rc = validate_settings(bp, tsk);
> + if (rc)
> + return rc;
> +
> + thread->hbkpt[pos] = bp;
> + thread->hbkpt_num_installed++;
> + hbkpt_user_max_refcount[pos]++;
> + /* 'tsk' is the thread that uses max number of hbkpt registers */

This is a bad comment. It sounds like it's saying that "tsk" is
defined as the thread using the maximum number of breakpoints, rather
than being defined as the thread for which the breakpoint is being
registered.

Besides, there's no reason to keep track of which thread uses the max
number of breakpoints anyway. Not to mention the fact that you don't
update hbkpt_user_max when its thread exits.

> + if (hbkpt_user_max < thread->hbkpt_num_installed)
> + hbkpt_user_max++;

At this point I got tired of looking, but it seems obvious that the new
patch series needs a bunch of improvements.

Alan Stern

2009-03-20 18:31:40

by Ingo Molnar

[permalink] [raw]
Subject: Re: [Patch 01/11] Introducing generic hardware breakpoint handler interfaces


* Alan Stern <[email protected]> wrote:

> > + * Kernel breakpoints grow downwards, starting from HB_NUM
> > + * 'hbkpt_kernel_pos' denotes lowest numbered breakpoint register occupied for
> > + * kernel-space request
> > + */
> > +unsigned int hbkpt_kernel_pos;
>
> This doesn't make much sense. All you need to know is which
> registers are in use; all others are available.
>
> For example, suppose the kernel allocated breakpoints 3, 2, and 1,
> and then deallocated 2. Then bp 2 would be available for use,
> even though 2 > 1.

it's a high/low watermark mechanism. Yes, it's not an allocator that
can allocate into a debug registrs 'hole', but it is a simple one
that matches current hardware breakpoint usages and enables the
kernel to utilize them as well - and keeps all the code simple.

Ingo

2009-03-20 18:33:25

by Ingo Molnar

[permalink] [raw]
Subject: Re: [Patch 01/11] Introducing generic hardware breakpoint handler interfaces


* Alan Stern <[email protected]> wrote:

> > + /* Check that the virtual address is in the proper range */
> > + if (tsk) {
> > + if (!arch_check_va_in_userspace(bp->info.address, tsk))
> > + return -EFAULT;
> > + } else {
> > + if (!arch_check_va_in_kernelspace(bp->info.address))
> > + return -EFAULT;
> > + }
>
> Roland pointed out that these checks need to take into account the
> length of the breakpoint. For example, in
> arch_check_va_in_userspace() it isn't sufficient for the start of
> the breakpoint region to be a userspace address; the end of the
> breakpoint region must also be in userspace.

i pointed it out - but yes, this needs to be fixed.

Ingo

2009-03-21 17:26:44

by K.Prasad

[permalink] [raw]
Subject: Re: [Patch 01/11] Introducing generic hardware breakpoint handler interfaces

On Fri, Mar 20, 2009 at 10:33:26AM -0400, Alan Stern wrote:
> On Fri, 20 Mar 2009, K.Prasad wrote:
>
> > This patch introduces two new files hw_breakpoint.[ch] which defines the
> > generic interfaces to use hardware breakpoint infrastructure of the system.
>
> Prasad:
>
> I'm sorry to say this is full of mistakes. So far I have looked only
> at patch 01/11, but it's not good.
>

After you pointed out, I realise that the code in load_debug_registers()
is an overkill and unregister_kernel_hw_breakpoint() has an obvious
error which should have caught my attention. My next revision should
fix them.

> > + * Kernel breakpoints grow downwards, starting from HB_NUM
> > + * 'hbkpt_kernel_pos' denotes lowest numbered breakpoint register occupied for
> > + * kernel-space request
> > + */
> > +unsigned int hbkpt_kernel_pos;
>
> This doesn't make much sense. All you need to know is which registers
> are in use; all others are available.
>

As explained by Maneesh earlier, we compact the kernel-space requests
into registers (HB_NUM - 1) to hbkpt_kernel_pos. The kernel-space
requests aren't specific to any given register number too, and so
compaction would be suitable for this case (unlike when implemented for
user-space which might need virtualisation of registers).

> For example, suppose the kernel allocated breakpoints 3, 2, and 1, and
> then deallocated 2. Then bp 2 would be available for use, even though
> 2 > 1.
>
> It's also a poor choice of name. Everywhere else (in my patches,
> anyway) the code refers to hardware breakpoints using the abbreviation
> "hwbp" or "hw_breakpoint". There's no reason suddenly to start using
> "hbkpt".
>

I began using 'hbkpt' as a shorter naming convention (the longer one
being hw_breakpoint) without being really conscious of the 'hwbkpt'
usage by you (even some of the previous iterations contained them in
samples/hw_breakpoint and ftrace-plugin).

Well, I will rename my shorter name to 'hwbkpt' for uniformity.

> > +/* An array containing refcount of threads using a given bkpt register */
> > +unsigned int hbkpt_user_max_refcount[HB_NUM];
>
> Why did you put "max" in the name? Isn't this just a simple refcount?
>

Ok. It will be hbkpt_user_refcount[].

> > +/* One higher than the highest counted user-space breakpoint register */
> > +unsigned int hbkpt_user_max;
>
> Likewise, this variable isn't really needed. It's just one more than
> the largest i such that hbkpt_user_max_refcount[i] > 0.
>

It acts like a cache for determining the user-space breakpoint boundary.
It is used for sanity checks and in its absence we will have to compute from
hbkpt_user_max_refcount[] everytime.

> > +/*
> > + * Install the debug register values for a new thread.
> > + */
> > +void switch_to_thread_hw_breakpoint(struct task_struct *tsk)
> > +{
> > + /* Set the debug register */
>
> Set _which_ debug register?
>

Will change it to read:
/* Set all debug registers used by 'tsk' */

> > + arch_install_thread_hbkpt(tsk);
> > + last_debugged_task = current;
> > +
> > + put_cpu_no_resched();
>
> What's this line doing here? It looks like something you forgot to
> erase.
>
> > +}
> > +
> > +/*
> > + * Install the debug register values for just the kernel, no thread.
> > + */
> > +void switch_to_none_hw_breakpoint(void)
> > +{
> > + arch_install_none();
> > + put_cpu_no_resched();
>
> Same for this line.
>

These are carriages from the previous code. They are still invoked from
the same places (such as flush_thread_hw_breakpoint(),
hw_breakpoint_handler()) and hence I didn't analyse it enough to see if
they were to be removed.

However, having found that preempt_count() is already zero at places where
these are called I think they can be removed.

> > +}
> > +
> > +/*
> > + * Load the debug registers during startup of a CPU.
> > + */
> > +void load_debug_registers(void)
> > +{
> > + int i;
> > + unsigned long flags;
> > +
> > + /* Prevent IPIs for new kernel breakpoint updates */
> > + local_irq_save(flags);
> > +
> > + for (i = hbkpt_kernel_pos; i < HB_NUM; i++)
> > + if (hbkpt_kernel[i])
> > + on_each_cpu(arch_install_kernel_hbkpt,
> > + (void *)hbkpt_kernel[i], 0);
>
> This is completely wrong. First of all, it's dumb to send multiple
> IPIs (one for each iteration through the loop). Second, this routine
> shouldn't send any IPIs at all! It gets invoked when a CPU is
> starting up and wants to load its _own_ debug registers -- not tell
> another CPU to load anything.
>

As I agreed before, it is an overkill (given the design of
arch_install_kernel_hbkpt()). I will create a new
arch_update_kernel_hbkpt(pos, bp) that will install breakpoints only
on the CPU starting up.

> > + if (current->thread.dr7)
> > + arch_install_thread_hbkpt(current);
> > +
> > + local_irq_restore(flags);
> > +}
> > +
> > +/*
> > + * Erase all the hardware breakpoint info associated with a thread.
> > + *
> > + * If tsk != current then tsk must not be usable (for example, a
> > + * child being cleaned up from a failed fork).
> > + */
> > +void flush_thread_hw_breakpoint(struct task_struct *tsk)
> > +{
> > + int i;
> > + struct thread_struct *thread = &(tsk->thread);
> > +
> > + mutex_lock(&hw_breakpoint_mutex);
> > +
> > + /* Let the breakpoints know they are being uninstalled */
>
> This comment looks like a leftover which should have been erased.
>
> > +/*
> > + * Validate the settings in a hw_breakpoint structure.
> > + */
> > +static int validate_settings(struct hw_breakpoint *bp, struct task_struct *tsk)
> > +{
> > + int ret;
> > + unsigned int align;
> > +
> > + ret = arch_validate_hwbkpt_settings(bp, &align, tsk);
> > + if (ret < 0)
> > + goto err;
> > +
> > + /* Check that the low-order bits of the address are appropriate
> > + * for the alignment implied by len.
> > + */
> > + if (bp->info.address & align)
> > + return -EINVAL;
> > +
> > + /* Check that the virtual address is in the proper range */
> > + if (tsk) {
> > + if (!arch_check_va_in_userspace(bp->info.address, tsk))
> > + return -EFAULT;
> > + } else {
> > + if (!arch_check_va_in_kernelspace(bp->info.address))
> > + return -EFAULT;
> > + }
>
> Roland pointed out that these checks need to take into account the
> length of the breakpoint. For example, in
> arch_check_va_in_userspace() it isn't sufficient for the start of the
> breakpoint region to be a userspace address; the end of the
> breakpoint region must also be in userspace.
>

Ok. Will do something like:
return (va <= (TASK_SIZE - (hw_breakpoint_length * word_size)));

> > + err:
> > + return ret;
> > +}
> > +
> > +int __register_user_hw_breakpoint(int pos, struct task_struct *tsk,
> > + struct hw_breakpoint *bp)
> > +{
> > + struct thread_struct *thread = &(tsk->thread);
> > + int rc;
> > +
> > + /* Do not overcommit. Fail if kernel has used the hbkpt registers */
> > + if (pos >= hbkpt_kernel_pos)
> > + return -ENOSPC;
>
> In fact you should fail if the debug register is already in use,
> regardless of whether it is being used by a kernel breakpoint. And you
> shouldn't check against hbkpt_kernel_pos; you should check whether
> hbkpt_kernel[pos] is NULL and thread->hbkpt[pos] is NULL.
>

As explained before, the intended design was like this:

ample layout:
hbkpt_kernel_pos = 1
hbkpt_user_max = 1

---------------------------------------------------------------------
| | | | |
| DR3 | DR2 | DR1 | DR0 |
| | | | |
---------------------------------------------------------------------
^ ^ ^
| | |
-----------------kernel-space addresses-------------------user-------

After removing breakpoint in say DR2, compaction occurs.
New layout will be:
hbkpt_kernel_pos = 2
hbkpt_user_max = 1

---------------------------------------------------------------------
| | | | |
| DR3 | DR2 | DR1 | DR0 |
| | | | |
---------------------------------------------------------------------
^ ^ ^ ^
| | | |
-----------------kernel------------------empty-----------user--------

The above design, in my opinion is intuitive, allows re-use of
uninstalled registers and is simple to implement.

What was missing in the sent patch was the updation of dr7 and dr[pos]
register after compaction. I will add them in the next iteration of the
patch.

> > +
> > + rc = validate_settings(bp, tsk);
> > + if (rc)
> > + return rc;
> > +
> > + thread->hbkpt[pos] = bp;
> > + thread->hbkpt_num_installed++;
> > + hbkpt_user_max_refcount[pos]++;
> > + /* 'tsk' is the thread that uses max number of hbkpt registers */
>
> This is a bad comment. It sounds like it's saying that "tsk" is
> defined as the thread using the maximum number of breakpoints, rather
> than being defined as the thread for which the breakpoint is being
> registered.
>
> Besides, there's no reason to keep track of which thread uses the max
> number of breakpoints anyway. Not to mention the fact that you don't
> update hbkpt_user_max when its thread exits.
>

We don't keep track of the thread (in the sense the task_struct) but
'hbkpt_user_max' is used for validating requests and book-keeping. As
Maneesh mentioned before flush_thread_hw_breakpoint() updates
'hbkpt_user_max'.

I can change it to read like the one below if it sounds better to you.

/*
* 'tsk' uses more number of registers than 'hbkpt_user_max'. Update
* the same.
*/

> > + if (hbkpt_user_max < thread->hbkpt_num_installed)
> > + hbkpt_user_max++;
>
> At this point I got tired of looking, but it seems obvious that the new
> patch series needs a bunch of improvements.
>
> Alan Stern
>

As mentioned before the next iteration would contain the changes I've
discussed above.

Thanks,
K.Prasad

2009-03-21 17:32:57

by K.Prasad

[permalink] [raw]
Subject: Re: [Patch 01/11] Introducing generic hardware breakpoint handler interfaces

On Fri, Mar 20, 2009 at 07:30:58PM +0100, Ingo Molnar wrote:
>
> * Alan Stern <[email protected]> wrote:
>
> > > + * Kernel breakpoints grow downwards, starting from HB_NUM
> > > + * 'hbkpt_kernel_pos' denotes lowest numbered breakpoint register occupied for
> > > + * kernel-space request
> > > + */
> > > +unsigned int hbkpt_kernel_pos;
> >
> > This doesn't make much sense. All you need to know is which
> > registers are in use; all others are available.
> >
> > For example, suppose the kernel allocated breakpoints 3, 2, and 1,
> > and then deallocated 2. Then bp 2 would be available for use,
> > even though 2 > 1.
>
> it's a high/low watermark mechanism. Yes, it's not an allocator that
> can allocate into a debug registrs 'hole', but it is a simple one
> that matches current hardware breakpoint usages and enables the
> kernel to utilize them as well - and keeps all the code simple.
>
> Ingo

I've explained the design here: http://lkml.org/lkml/2009/3/21/169 in a
and is slightly different from what you've explained above.

It involves shifting of kernel-space registers by one-level if a
kernel-register is uninstalled. We compact the kernel-space registers
since a)not to leave a 'hole' thereby wasting a register forever during
runtime b)kernel-space requests are not specific to a register number
and can be moved at will (unlike user-space requests).

Hope that the design is acceptable and the resultant code - simple.

Thanks,
K.Prasad

2009-03-21 21:40:20

by Alan Stern

[permalink] [raw]
Subject: Re: [Patch 01/11] Introducing generic hardware breakpoint handler interfaces

On Sat, 21 Mar 2009, K.Prasad wrote:

> > > + * Kernel breakpoints grow downwards, starting from HB_NUM
> > > + * 'hbkpt_kernel_pos' denotes lowest numbered breakpoint register occupied for
> > > + * kernel-space request
> > > + */
> > > +unsigned int hbkpt_kernel_pos;
> >
> > This doesn't make much sense. All you need to know is which registers
> > are in use; all others are available.
> >
>
> As explained by Maneesh earlier, we compact the kernel-space requests
> into registers (HB_NUM - 1) to hbkpt_kernel_pos. The kernel-space
> requests aren't specific to any given register number too, and so
> compaction would be suitable for this case (unlike when implemented for
> user-space which might need virtualisation of registers).

Okay, that makes sense. Perhaps you could add a short comment here
explaining that the register allocations get compacted when a kernel
breakpoint is unregistered, so they will always be contiguous.

> > It's also a poor choice of name. Everywhere else (in my patches,
> > anyway) the code refers to hardware breakpoints using the abbreviation
> > "hwbp" or "hw_breakpoint". There's no reason suddenly to start using
> > "hbkpt".
> >
>
> I began using 'hbkpt' as a shorter naming convention (the longer one
> being hw_breakpoint) without being really conscious of the 'hwbkpt'
> usage by you (even some of the previous iterations contained them in
> samples/hw_breakpoint and ftrace-plugin).
>
> Well, I will rename my shorter name to 'hwbkpt' for uniformity.

My patch never used "hwbkpt". Besides "hw_breakpoint", it used only
"bp". On going back and checking, I found that it didn't even use
"hwbp". Some other abbreviations it did use were "kbp" for kernel
breakpoint, "chbi" for per-CPU hardware breakpoint info, and "thbi" for
per-thread hardware breakpoint info.

If you're looking for a good short name, and if you want to keep
hardware breakpoints distinct from software breakpoints, I suggest
"hbp" instead of "hbkpt". But it's up to you, and it's worth noticing
that the code already contains lots of variables named just "bp".

> > > +/* One higher than the highest counted user-space breakpoint register */
> > > +unsigned int hbkpt_user_max;
> >
> > Likewise, this variable isn't really needed. It's just one more than
> > the largest i such that hbkpt_user_max_refcount[i] > 0.
> >
>
> It acts like a cache for determining the user-space breakpoint boundary.
> It is used for sanity checks and in its absence we will have to compute from
> hbkpt_user_max_refcount[] everytime.

That's right. Isn't it simpler to check

kernel_pos > 0 && hbkpt_user_refcount[kernel_pos - 1] == 0

than to check

kernel_pos - 1 >= hbkpt_user_max

_and_ to keep hbkpt_user_max set to the correct value at all times?

> > > +/*
> > > + * Load the debug registers during startup of a CPU.
> > > + */
> > > +void load_debug_registers(void)
> > > +{
> > > + int i;
> > > + unsigned long flags;
> > > +
> > > + /* Prevent IPIs for new kernel breakpoint updates */
> > > + local_irq_save(flags);
> > > +
> > > + for (i = hbkpt_kernel_pos; i < HB_NUM; i++)
> > > + if (hbkpt_kernel[i])
> > > + on_each_cpu(arch_install_kernel_hbkpt,
> > > + (void *)hbkpt_kernel[i], 0);
> >
> > This is completely wrong. First of all, it's dumb to send multiple
> > IPIs (one for each iteration through the loop). Second, this routine
> > shouldn't send any IPIs at all! It gets invoked when a CPU is
> > starting up and wants to load its _own_ debug registers -- not tell
> > another CPU to load anything.
> >
>
> As I agreed before, it is an overkill (given the design of
> arch_install_kernel_hbkpt()). I will create a new
> arch_update_kernel_hbkpt(pos, bp) that will install breakpoints only
> on the CPU starting up.

Doesn't arch_install_kernel_hbkpt() already install breakpoints
on only the current CPU? So why do you need a new function?

> > > + /* Check that the virtual address is in the proper range */
> > > + if (tsk) {
> > > + if (!arch_check_va_in_userspace(bp->info.address, tsk))
> > > + return -EFAULT;
> > > + } else {
> > > + if (!arch_check_va_in_kernelspace(bp->info.address))
> > > + return -EFAULT;
> > > + }
> >
> > Roland pointed out that these checks need to take into account the
> > length of the breakpoint. For example, in
> > arch_check_va_in_userspace() it isn't sufficient for the start of the
> > breakpoint region to be a userspace address; the end of the
> > breakpoint region must also be in userspace.
> >
>
> Ok. Will do something like:
> return (va <= (TASK_SIZE - (hw_breakpoint_length * word_size)));

What is the purpose of word_size here? The breakpoint length should be
specified in bytes, not words.

Don't forget that that in arch_check_va_in_kernelspace() you need to
check both for values that are too low and values that are too high
(they overflow and wrap around back to a user address).

> We don't keep track of the thread (in the sense the task_struct) but
> 'hbkpt_user_max' is used for validating requests and book-keeping. As
> Maneesh mentioned before flush_thread_hw_breakpoint() updates
> 'hbkpt_user_max'.
>
> I can change it to read like the one below if it sounds better to you.
>
> /*
> * 'tsk' uses more number of registers than 'hbkpt_user_max'. Update
> * the same.
> */

My preference is simply to eliminate hbkpt_user_max entirely.

Alan Stern

2009-03-23 19:03:55

by K.Prasad

[permalink] [raw]
Subject: Re: [Patch 01/11] Introducing generic hardware breakpoint handler interfaces

On Sat, Mar 21, 2009 at 05:39:59PM -0400, Alan Stern wrote:
> On Sat, 21 Mar 2009, K.Prasad wrote:
>
> > > > + * Kernel breakpoints grow downwards, starting from HB_NUM
> > > > + * 'hbkpt_kernel_pos' denotes lowest numbered breakpoint register occupied for
> > > > + * kernel-space request
> > > > + */
> > > > +unsigned int hbkpt_kernel_pos;
> > >
> > > This doesn't make much sense. All you need to know is which registers
> > > are in use; all others are available.
> > >
> >
> > As explained by Maneesh earlier, we compact the kernel-space requests
> > into registers (HB_NUM - 1) to hbkpt_kernel_pos. The kernel-space
> > requests aren't specific to any given register number too, and so
> > compaction would be suitable for this case (unlike when implemented for
> > user-space which might need virtualisation of registers).
>
> Okay, that makes sense. Perhaps you could add a short comment here
> explaining that the register allocations get compacted when a kernel
> breakpoint is unregistered, so they will always be contiguous.
>
> > > It's also a poor choice of name. Everywhere else (in my patches,
> > > anyway) the code refers to hardware breakpoints using the abbreviation
> > > "hwbp" or "hw_breakpoint". There's no reason suddenly to start using
> > > "hbkpt".
> > >
> >
> > I began using 'hbkpt' as a shorter naming convention (the longer one
> > being hw_breakpoint) without being really conscious of the 'hwbkpt'
> > usage by you (even some of the previous iterations contained them in
> > samples/hw_breakpoint and ftrace-plugin).
> >
> > Well, I will rename my shorter name to 'hwbkpt' for uniformity.
>
> My patch never used "hwbkpt". Besides "hw_breakpoint", it used only
> "bp". On going back and checking, I found that it didn't even use
> "hwbp". Some other abbreviations it did use were "kbp" for kernel
> breakpoint, "chbi" for per-CPU hardware breakpoint info, and "thbi" for
> per-thread hardware breakpoint info.
>
> If you're looking for a good short name, and if you want to keep
> hardware breakpoints distinct from software breakpoints, I suggest
> "hbp" instead of "hbkpt". But it's up to you, and it's worth noticing
> that the code already contains lots of variables named just "bp".
>

I am renaming all 'hbkpt' strings to 'hbp'.

> > > > +/* One higher than the highest counted user-space breakpoint register */
> > > > +unsigned int hbkpt_user_max;
> > >
> > > Likewise, this variable isn't really needed. It's just one more than
> > > the largest i such that hbkpt_user_max_refcount[i] > 0.
> > >
> >
> > It acts like a cache for determining the user-space breakpoint boundary.
> > It is used for sanity checks and in its absence we will have to compute from
> > hbkpt_user_max_refcount[] everytime.
>
> That's right. Isn't it simpler to check
>
> kernel_pos > 0 && hbkpt_user_refcount[kernel_pos - 1] == 0
>
> than to check
>
> kernel_pos - 1 >= hbkpt_user_max
>
> _and_ to keep hbkpt_user_max set to the correct value at all times?
>

Unfortunately the lines of code required to maintain the variable comes
close to the amount of lines it would potentially save. I will change to
code to compute it from hbkpt_user_refcount[] everytime.

> > > > +/*
> > > > + * Load the debug registers during startup of a CPU.
> > > > + */
> > > > +void load_debug_registers(void)
> > > > +{
> > > > + int i;
> > > > + unsigned long flags;
> > > > +
> > > > + /* Prevent IPIs for new kernel breakpoint updates */
> > > > + local_irq_save(flags);
> > > > +
> > > > + for (i = hbkpt_kernel_pos; i < HB_NUM; i++)
> > > > + if (hbkpt_kernel[i])
> > > > + on_each_cpu(arch_install_kernel_hbkpt,
> > > > + (void *)hbkpt_kernel[i], 0);
> > >
> > > This is completely wrong. First of all, it's dumb to send multiple
> > > IPIs (one for each iteration through the loop). Second, this routine
> > > shouldn't send any IPIs at all! It gets invoked when a CPU is
> > > starting up and wants to load its _own_ debug registers -- not tell
> > > another CPU to load anything.
> > >
> >
> > As I agreed before, it is an overkill (given the design of
> > arch_install_kernel_hbkpt()). I will create a new
> > arch_update_kernel_hbkpt(pos, bp) that will install breakpoints only
> > on the CPU starting up.
>
> Doesn't arch_install_kernel_hbkpt() already install breakpoints
> on only the current CPU? So why do you need a new function?
>

There will be a few more changes to arch_install_kernel_hbkpt() along
with this. Please find the changes in the ensuing patchset.

> > > > + /* Check that the virtual address is in the proper range */
> > > > + if (tsk) {
> > > > + if (!arch_check_va_in_userspace(bp->info.address, tsk))
> > > > + return -EFAULT;
> > > > + } else {
> > > > + if (!arch_check_va_in_kernelspace(bp->info.address))
> > > > + return -EFAULT;
> > > > + }
> > >
> > > Roland pointed out that these checks need to take into account the
> > > length of the breakpoint. For example, in
> > > arch_check_va_in_userspace() it isn't sufficient for the start of the
> > > breakpoint region to be a userspace address; the end of the
> > > breakpoint region must also be in userspace.
> > >
> >
> > Ok. Will do something like:
> > return (va <= (TASK_SIZE - (hw_breakpoint_length * word_size)));
>
> What is the purpose of word_size here? The breakpoint length should be
> specified in bytes, not words.
>
> Don't forget that that in arch_check_va_in_kernelspace() you need to
> check both for values that are too low and values that are too high
> (they overflow and wrap around back to a user address).
>

While I understand the user-space checking using the length of the HW
Breakpoint, I don't really see how I can check for an upper-bound for
kernel-space virtual addresses. Most usage in the kernel only checks for
the address >= TASK_SIZE (while they check for add + len if the length
of the memory is known). I will be glad to have any suggestions in this
regard.

> > We don't keep track of the thread (in the sense the task_struct) but
> > 'hbkpt_user_max' is used for validating requests and book-keeping. As
> > Maneesh mentioned before flush_thread_hw_breakpoint() updates
> > 'hbkpt_user_max'.
> >
> > I can change it to read like the one below if it sounds better to you.
> >
> > /*
> > * 'tsk' uses more number of registers than 'hbkpt_user_max'. Update
> > * the same.
> > */
>
> My preference is simply to eliminate hbkpt_user_max entirely.
>
> Alan Stern
>

Done.

Thanks,
K.Prasad

2009-03-23 19:22:04

by Alan Stern

[permalink] [raw]
Subject: Re: [Patch 01/11] Introducing generic hardware breakpoint handler interfaces

On Tue, 24 Mar 2009, K.Prasad wrote:

> > > Ok. Will do something like:
> > > return (va <= (TASK_SIZE - (hw_breakpoint_length * word_size)));
> >
> > What is the purpose of word_size here? The breakpoint length should be
> > specified in bytes, not words.
> >
> > Don't forget that that in arch_check_va_in_kernelspace() you need to
> > check both for values that are too low and values that are too high
> > (they overflow and wrap around back to a user address).
> >
>
> While I understand the user-space checking using the length of the HW
> Breakpoint, I don't really see how I can check for an upper-bound for
> kernel-space virtual addresses. Most usage in the kernel only checks for
> the address >= TASK_SIZE (while they check for add + len if the length
> of the memory is known). I will be glad to have any suggestions in this
> regard.

Isn't that exactly the check you need to implement?

addr >= TASK_SIZE && (addr + len) >= TASK_SIZE,

or perhaps better,

addr >= TASK_SIZE && (addr + len) >= addr.

In this case you _do_ know the length of the breakpoint.

Alan Stern

2009-03-23 20:42:45

by K.Prasad

[permalink] [raw]
Subject: Re: [Patch 01/11] Introducing generic hardware breakpoint handler interfaces

On Mon, Mar 23, 2009 at 03:21:49PM -0400, Alan Stern wrote:
> On Tue, 24 Mar 2009, K.Prasad wrote:
>
> > > > Ok. Will do something like:
> > > > return (va <= (TASK_SIZE - (hw_breakpoint_length * word_size)));
> > >
> > > What is the purpose of word_size here? The breakpoint length should be
> > > specified in bytes, not words.
> > >
> > > Don't forget that that in arch_check_va_in_kernelspace() you need to
> > > check both for values that are too low and values that are too high
> > > (they overflow and wrap around back to a user address).
> > >
> >
> > While I understand the user-space checking using the length of the HW
> > Breakpoint, I don't really see how I can check for an upper-bound for
> > kernel-space virtual addresses. Most usage in the kernel only checks for
> > the address >= TASK_SIZE (while they check for add + len if the length
> > of the memory is known). I will be glad to have any suggestions in this
> > regard.
>
> Isn't that exactly the check you need to implement?
>
> addr >= TASK_SIZE && (addr + len) >= TASK_SIZE,
>
> or perhaps better,
>
> addr >= TASK_SIZE && (addr + len) >= addr.
>
> In this case you _do_ know the length of the breakpoint.
>
> Alan Stern
>

Aren't we just checking if len is a positive number through the above
checks? The validation checks in the patchset should take care of
negative lengths. Or am I missing something?

I thought you wanted the code to check for an upper sane limit for addr
in kernel-space, say something like this:

TASK_SIZE <= addr <= (Upper limit for Kernel Virtual Address)

When I referred to 'len' in my previous mail, it meant the length
of the kernel virtual memory area (which can be used to find the upper
bound).

Thanks,
K.Prasad

2009-03-23 21:20:41

by Alan Stern

[permalink] [raw]
Subject: Re: [Patch 01/11] Introducing generic hardware breakpoint handler interfaces

On Tue, 24 Mar 2009, K.Prasad wrote:

> > Isn't that exactly the check you need to implement?
> >
> > addr >= TASK_SIZE && (addr + len) >= TASK_SIZE,
> >
> > or perhaps better,
> >
> > addr >= TASK_SIZE && (addr + len) >= addr.
> >
> > In this case you _do_ know the length of the breakpoint.
> >
> > Alan Stern
> >
>
> Aren't we just checking if len is a positive number through the above
> checks? The validation checks in the patchset should take care of
> negative lengths. Or am I missing something?

Well, 0x60000000 is a positive number, and 0xd0000000 is >= TASK_SIZE.
But their sum is 0x30000000, which lies in userspace. In other words,
you are missing the possibility that the addition might overflow and
wrap around.

> I thought you wanted the code to check for an upper sane limit for addr
> in kernel-space, say something like this:
>
> TASK_SIZE <= addr <= (Upper limit for Kernel Virtual Address)

No, the test should be

TASK_SIZE <= addr <= addr + (len-1) <= (Upper limit for Kernel VA)

By the way, is TASK_SIZE the correct lower bound for kernel virtual
addresses on x86-64?

> When I referred to 'len' in my previous mail, it meant the length
> of the kernel virtual memory area (which can be used to find the upper
> bound).

Oh, sorry, I misunderstood. Isn't that limit always 0xffffffff on
x86-32?

Alan Stern