2009-04-25 00:08:44

by Roland McGrath

[permalink] [raw]
Subject: [PATCH 0/17] tracehook & user_regset for ARM

The various things listed under CONFIG_HAVE_ARCH_TRACEHOOK (arch/Kconfig)
are what each arch wants to do nowadays to fit with the generic code for
user debugging, core dumping, etc. The "big machine" arch's have done this
since 2.6.28 or earlier.

Christoph suggested helping ARM with the work would be a good way to
encourage all the "little machine" arch maintainers to catch up soon.
(Of the "little machine" and oddball arch's, so far only sh has it.)

The following patches bring ARM up to speed with HAVE_ARCH_TRACEHOOK (just
about). There are so many patches because I sliced them into many small
changes. Each patch is pretty short (some of them very tiny). The overall
diffstat from the whole series is attached in the "pull request" below.

This series is relative to ~2.6.30-rc3 (0c8454f). I expect it rebases
easily to whatever tree you might want to queue it on.

The immediate user-visible effects of the series are to enable the
/proc/pid/syscall feature, and to add VFP, WMMX, Crunch, and $tp register
data to core dumps.

AFAIK only the asm/syscall.h patch still needs work. The preliminary
version is only buggy in the way that /proc/pid/syscall will give bogus
answers for a task not really in a syscall, or for the non-EABI entry
styles. It's not unsafe or anything. It needs some attention from folks
who really know ARM to fill in the truly proper version of syscall_get_nr().

I only know how to run and test one ABI flavor, and only in qemu. I used
versatile_defconfig and ran it in qemu-system-arm -M versatilepb using NFS
root with the userland binaries from Fedora ARM.

I don't know how to simulate hardware that has iWMMXt or Crunch, nor if my
ARM userland handles those kernel configurations. So I've only
(cross-)compile-tested the iWMMXt and Crunch code. (It is however the
simplest of the user_regset code and pretty easy to eyeball-review.)

My testing is quite minimal. Booted, nothing went wrong, simple strace
uses still look sane, "cat /proc/self/syscall" looks right, core dump
contents look right. (Not knowing the arch at all, I don't actually know
how to put anything in the FPA or VFP registers so as to notice they are
right rather than just the right number of zeros.) The userland I have
does not have gdb (and qemu would take a week to build and run the gdb
testsuite if it did), so I didn't try to test any ptrace use beyond what
strace does.


Thanks,
Roland

---

The following changes since commit 0c8454f56623505a99463405fd7d5664adfbb094:
Rafael J. Wysocki (1):
PM/Hibernate: Fix waiting for image device to appear on resume

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland.git arm/tracehook

Roland McGrath (17):
arm: arch_ptrace clean-up
arm: arch_ptrace indentation
arm: tracehook_report_syscall
arm: tracehook_signal_handler
arm: TIF_NOTIFY_RESUME
arm: user_regset: general regs
arm: user_regset: FPU regs
arm: CORE_DUMP_USE_REGSET
arm: user_regset: VFP regs
arm: user_regset: VFP in core dumps
arm: user_regset: iWMMXt regs
arm: user_regset: iWMMXt in core dumps
arm: user_regset: Crunch regs
arm: user_regset: Crunch in core dumps
arm: user_regset: thread pointer in core dumps
arm: asm/syscall.h (unfinished)
arm: HAVE_ARCH_TRACEHOOK

arch/arm/Kconfig | 1 +
arch/arm/include/asm/elf.h | 1 +
arch/arm/include/asm/ptrace.h | 4 +-
arch/arm/include/asm/syscall.h | 65 +++++
arch/arm/include/asm/thread_info.h | 4 +
arch/arm/kernel/entry-common.S | 2 +-
arch/arm/kernel/ptrace.c | 526 ++++++++++++++++++++++++------------
arch/arm/kernel/signal.c | 7 +
include/linux/elf.h | 3 +
9 files changed, 433 insertions(+), 180 deletions(-)
create mode 100644 arch/arm/include/asm/syscall.h


2009-04-25 00:08:20

by Roland McGrath

[permalink] [raw]
Subject: [PATCH 02/17] arm: arch_ptrace indentation

This changes the arch_ptrace() indentation to match the standard style.
No change but the whitespace.

Signed-off-by: Roland McGrath <[email protected]>
---
arch/arm/kernel/ptrace.c | 116 +++++++++++++++++++++++-----------------------
1 files changed, 58 insertions(+), 58 deletions(-)

diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index 825b7ab..93d7822 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -706,13 +706,13 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data)
int ret;

switch (request) {
- case PTRACE_PEEKUSR:
- ret = ptrace_read_user(child, addr, (unsigned long __user *)data);
- break;
+ case PTRACE_PEEKUSR:
+ ret = ptrace_read_user(child, addr, (unsigned long __user *)data);
+ break;

- case PTRACE_POKEUSR:
- ret = ptrace_write_user(child, addr, data);
- break;
+ case PTRACE_POKEUSR:
+ ret = ptrace_write_user(child, addr, data);
+ break;

/*
* The generic code handles these mostly, but we have to
@@ -723,11 +723,11 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data)
* come back through here or PTRACE_SINGLESTEP again
* before it matters.
*/
- case PTRACE_SYSCALL:
- case PTRACE_CONT:
- case PTRACE_KILL:
- single_step_disable(child);
- goto common;
+ case PTRACE_SYSCALL:
+ case PTRACE_CONT:
+ case PTRACE_KILL:
+ single_step_disable(child);
+ goto common;

/*
* After we set up single-step state, the generic
@@ -735,71 +735,71 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data)
* Note as above we don't bother with checking early
* for the error case.
*/
- case PTRACE_SINGLESTEP:
- single_step_enable(child);
- request = PTRACE_CONT;
- goto common;
+ case PTRACE_SINGLESTEP:
+ single_step_enable(child);
+ request = PTRACE_CONT;
+ goto common;

- case PTRACE_GETREGS:
- ret = ptrace_getregs(child, (void __user *)data);
- break;
+ case PTRACE_GETREGS:
+ ret = ptrace_getregs(child, (void __user *)data);
+ break;

- case PTRACE_SETREGS:
- ret = ptrace_setregs(child, (void __user *)data);
- break;
+ case PTRACE_SETREGS:
+ ret = ptrace_setregs(child, (void __user *)data);
+ break;

- case PTRACE_GETFPREGS:
- ret = ptrace_getfpregs(child, (void __user *)data);
- break;
-
- case PTRACE_SETFPREGS:
- ret = ptrace_setfpregs(child, (void __user *)data);
- break;
+ case PTRACE_GETFPREGS:
+ ret = ptrace_getfpregs(child, (void __user *)data);
+ break;
+
+ case PTRACE_SETFPREGS:
+ ret = ptrace_setfpregs(child, (void __user *)data);
+ break;

#ifdef CONFIG_IWMMXT
- case PTRACE_GETWMMXREGS:
- ret = ptrace_getwmmxregs(child, (void __user *)data);
- break;
+ case PTRACE_GETWMMXREGS:
+ ret = ptrace_getwmmxregs(child, (void __user *)data);
+ break;

- case PTRACE_SETWMMXREGS:
- ret = ptrace_setwmmxregs(child, (void __user *)data);
- break;
+ case PTRACE_SETWMMXREGS:
+ ret = ptrace_setwmmxregs(child, (void __user *)data);
+ break;
#endif

- case PTRACE_GET_THREAD_AREA:
- ret = put_user(task_thread_info(child)->tp_value,
- (unsigned long __user *) data);
- break;
+ case PTRACE_GET_THREAD_AREA:
+ ret = put_user(task_thread_info(child)->tp_value,
+ (unsigned long __user *) data);
+ break;

- case PTRACE_SET_SYSCALL:
- task_thread_info(child)->syscall = data;
- ret = 0;
- break;
+ case PTRACE_SET_SYSCALL:
+ task_thread_info(child)->syscall = data;
+ ret = 0;
+ break;

#ifdef CONFIG_CRUNCH
- case PTRACE_GETCRUNCHREGS:
- ret = ptrace_getcrunchregs(child, (void __user *)data);
- break;
+ case PTRACE_GETCRUNCHREGS:
+ ret = ptrace_getcrunchregs(child, (void __user *)data);
+ break;

- case PTRACE_SETCRUNCHREGS:
- ret = ptrace_setcrunchregs(child, (void __user *)data);
- break;
+ case PTRACE_SETCRUNCHREGS:
+ ret = ptrace_setcrunchregs(child, (void __user *)data);
+ break;
#endif

#ifdef CONFIG_VFP
- case PTRACE_GETVFPREGS:
- ret = ptrace_getvfpregs(child, (void __user *)data);
- break;
+ case PTRACE_GETVFPREGS:
+ ret = ptrace_getvfpregs(child, (void __user *)data);
+ break;

- case PTRACE_SETVFPREGS:
- ret = ptrace_setvfpregs(child, (void __user *)data);
- break;
+ case PTRACE_SETVFPREGS:
+ ret = ptrace_setvfpregs(child, (void __user *)data);
+ break;
#endif

- default:
- common:
- ret = ptrace_request(child, request, addr, data);
- break;
+ default:
+ common:
+ ret = ptrace_request(child, request, addr, data);
+ break;
}

return ret;

2009-04-25 00:09:26

by Roland McGrath

[permalink] [raw]
Subject: [PATCH 04/17] arm: tracehook_signal_handler

This makes arm call the standard tracehook_signal_handler() hook
in <linux/tracehook.h> after setting up a signal handler.

Signed-off-by: Roland McGrath <[email protected]>
---
arch/arm/kernel/signal.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
index 80b8b5c..51e2187 100644
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -12,6 +12,7 @@
#include <linux/personality.h>
#include <linux/freezer.h>
#include <linux/uaccess.h>
+#include <linux/tracehook.h>

#include <asm/elf.h>
#include <asm/cacheflush.h>
@@ -606,6 +607,7 @@ handle_signal(unsigned long sig, struct k_sigaction *ka,
recalc_sigpending();
spin_unlock_irq(&tsk->sighand->siglock);

+ tracehook_signal_handler(sig, info, ka, regs, 0);
}

/*

2009-04-25 00:09:41

by Roland McGrath

[permalink] [raw]
Subject: [PATCH 01/17] arm: arch_ptrace clean-up

This cleans up arch_ptrace() on arm to use the generic ptrace_request()
for everything it can. This gets rid of the non-arch ptrace internals
magic from the arch code.

Signed-off-by: Roland McGrath <[email protected]>
---
arch/arm/kernel/ptrace.c | 66 ++++++++++-----------------------------------
1 files changed, 15 insertions(+), 51 deletions(-)

diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index 89882a1..825b7ab 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -706,76 +706,39 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data)
int ret;

switch (request) {
- /*
- * read word at location "addr" in the child process.
- */
- case PTRACE_PEEKTEXT:
- case PTRACE_PEEKDATA:
- ret = generic_ptrace_peekdata(child, addr, data);
- break;
-
case PTRACE_PEEKUSR:
ret = ptrace_read_user(child, addr, (unsigned long __user *)data);
break;

- /*
- * write the word at location addr.
- */
- case PTRACE_POKETEXT:
- case PTRACE_POKEDATA:
- ret = generic_ptrace_pokedata(child, addr, data);
- break;
-
case PTRACE_POKEUSR:
ret = ptrace_write_user(child, addr, data);
break;

/*
- * continue/restart and stop at next (return from) syscall
+ * The generic code handles these mostly, but we have to
+ * clear any previous single-step state first. Note we
+ * do single_step_disable() before even checking if the
+ * ptrace request will fail for an invalid argument.
+ * It's harmless in the error case, since we will have
+ * come back through here or PTRACE_SINGLESTEP again
+ * before it matters.
*/
case PTRACE_SYSCALL:
case PTRACE_CONT:
- ret = -EIO;
- if (!valid_signal(data))
- break;
- if (request == PTRACE_SYSCALL)
- set_tsk_thread_flag(child, TIF_SYSCALL_TRACE);
- else
- clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE);
- child->exit_code = data;
- single_step_disable(child);
- wake_up_process(child);
- ret = 0;
- break;
-
- /*
- * make the child exit. Best I can do is send it a sigkill.
- * perhaps it should be put in the status that it wants to
- * exit.
- */
case PTRACE_KILL:
single_step_disable(child);
- if (child->exit_state != EXIT_ZOMBIE) {
- child->exit_code = SIGKILL;
- wake_up_process(child);
- }
- ret = 0;
- break;
+ goto common;

/*
- * execute single instruction.
+ * After we set up single-step state, the generic
+ * PTRACE_CONT handling takes over and does the rest.
+ * Note as above we don't bother with checking early
+ * for the error case.
*/
case PTRACE_SINGLESTEP:
- ret = -EIO;
- if (!valid_signal(data))
- break;
single_step_enable(child);
- clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE);
- child->exit_code = data;
- /* give it a chance to run. */
- wake_up_process(child);
- ret = 0;
- break;
+ request = PTRACE_CONT;
+ goto common;

case PTRACE_GETREGS:
ret = ptrace_getregs(child, (void __user *)data);
@@ -834,6 +797,7 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data)
#endif

default:
+ common:
ret = ptrace_request(child, request, addr, data);
break;
}

2009-04-25 00:10:39

by Roland McGrath

[permalink] [raw]
Subject: [PATCH 06/17] arm: user_regset: general regs

This converts PTRACE_GETREGS/PTRACE_SETREGS into user_regset form.
There should be no change in the ptrace behavior.

Signed-off-by: Roland McGrath <[email protected]>
---
arch/arm/kernel/ptrace.c | 72 +++++++++++++++++++++++++++++++++++----------
1 files changed, 56 insertions(+), 16 deletions(-)

diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index effada1..fbb18e4 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -15,6 +15,8 @@
#include <linux/smp.h>
#include <linux/ptrace.h>
#include <linux/tracehook.h>
+#include <linux/regset.h>
+#include <linux/elf.h>
#include <linux/user.h>
#include <linux/security.h>
#include <linux/init.h>
@@ -547,30 +549,38 @@ static int ptrace_write_user(struct task_struct *tsk, unsigned long off,
/*
* Get all user integer registers.
*/
-static int ptrace_getregs(struct task_struct *tsk, void __user *uregs)
+static int gpr_get(struct task_struct *target,
+ const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ void *kbuf, void __user *ubuf)
{
- struct pt_regs *regs = task_pt_regs(tsk);
-
- return copy_to_user(uregs, regs, sizeof(struct pt_regs)) ? -EFAULT : 0;
+ struct pt_regs *regs = task_pt_regs(target);
+ return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
+ regs, 0, sizeof(*regs));
}

/*
* Set all user integer registers.
*/
-static int ptrace_setregs(struct task_struct *tsk, void __user *uregs)
+static int gpr_set(struct task_struct *target,
+ const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ const void *kbuf, const void __user *ubuf)
{
+ struct pt_regs *regs = task_pt_regs(target);
struct pt_regs newregs;
int ret;

- ret = -EFAULT;
- if (copy_from_user(&newregs, uregs, sizeof(struct pt_regs)) == 0) {
- struct pt_regs *regs = task_pt_regs(tsk);
+ if (pos != 0 || count != sizeof(newregs))
+ newregs = *regs;

- ret = -EINVAL;
- if (valid_user_regs(&newregs)) {
+ ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
+ &newregs, 0, sizeof(newregs));
+ if (!ret) {
+ if (valid_user_regs(&newregs))
*regs = newregs;
- ret = 0;
- }
+ else
+ ret = -EINVAL;
}

return ret;
@@ -702,6 +712,34 @@ static int ptrace_setvfpregs(struct task_struct *tsk, void __user *data)
}
#endif

+/*
+ * Indices in arm_regsets[] array. These are used only in arch_ptrace(),
+ * below. The ordering that matters is that the NT_PRSTATUS regset is
+ * first. Other than that, ever use outside this file should just look at
+ * the contents of the table entries.
+ */
+enum {
+ REGSET_GPR,
+};
+
+static const struct user_regset arm_regsets[] = {
+ [REGSET_GPR] = {
+ .core_note_type = NT_PRSTATUS, .n = ELF_NGREG,
+ .size = sizeof(long), .align = sizeof(long),
+ .get = gpr_get, .set = gpr_set
+ },
+};
+
+static const struct user_regset_view user_arm_view = {
+ .name = "arm", .e_machine = ELF_ARCH, .ei_osabi = ELF_OSABI,
+ .regsets = arm_regsets, .n = ARRAY_SIZE(arm_regsets)
+};
+
+const struct user_regset_view *task_user_regset_view(struct task_struct *task)
+{
+ return &user_arm_view;
+}
+
long arch_ptrace(struct task_struct *child, long request, long addr, long data)
{
int ret;
@@ -742,12 +780,14 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data)
goto common;

case PTRACE_GETREGS:
- ret = ptrace_getregs(child, (void __user *)data);
- break;
+ return copy_regset_to_user(child, &user_arm_view, REGSET_GPR,
+ 0, sizeof(struct pt_regs),
+ (void __user *) data);

case PTRACE_SETREGS:
- ret = ptrace_setregs(child, (void __user *)data);
- break;
+ return copy_regset_from_user(child, &user_arm_view, REGSET_GPR,
+ 0, sizeof(struct pt_regs),
+ (const void __user *) data);

case PTRACE_GETFPREGS:
ret = ptrace_getfpregs(child, (void __user *)data);

2009-04-25 00:10:54

by Roland McGrath

[permalink] [raw]
Subject: [PATCH 03/17] arm: tracehook_report_syscall

This makes arm use the standard tracehook_report_syscall_entry()
and tracehook_report_syscall_exit() hooks in <linux/tracehook.h>.

Signed-off-by: Roland McGrath <[email protected]>
---
arch/arm/kernel/ptrace.c | 22 +++++++++-------------
1 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index 93d7822..effada1 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -14,6 +14,7 @@
#include <linux/mm.h>
#include <linux/smp.h>
#include <linux/ptrace.h>
+#include <linux/tracehook.h>
#include <linux/user.h>
#include <linux/security.h>
#include <linux/init.h>
@@ -811,8 +812,6 @@ asmlinkage int syscall_trace(int why, struct pt_regs *regs, int scno)

if (!test_thread_flag(TIF_SYSCALL_TRACE))
return scno;
- if (!(current->ptrace & PT_PTRACED))
- return scno;

/*
* Save IP. IP is used to denote syscall entry/exit:
@@ -823,19 +822,16 @@ asmlinkage int syscall_trace(int why, struct pt_regs *regs, int scno)

current_thread_info()->syscall = scno;

- /* the 0x80 provides a way for the tracing parent to distinguish
- between a syscall stop and SIGTRAP delivery */
- ptrace_notify(SIGTRAP | ((current->ptrace & PT_TRACESYSGOOD)
- ? 0x80 : 0));
/*
- * this isn't the same as continuing with a signal, but it will do
- * for normal use. strace only continues with a signal if the
- * stopping signal is not SIGTRAP. -brl
+ * Report the system call for tracing. Entry tracing can
+ * decide to abort the call. We handle that by setting an
+ * invalid syscall number (-1) to force an ENOSYS error.
*/
- if (current->exit_code) {
- send_sig(current->exit_code, current, 1);
- current->exit_code = 0;
- }
+ if (why)
+ tracehook_report_syscall_exit(regs, 0);
+ else if (tracehook_report_syscall_entry(regs))
+ current_thread_info()->syscall = -1;
+
regs->ARM_ip = ip;

return current_thread_info()->syscall;

2009-04-25 00:11:33

by Roland McGrath

[permalink] [raw]
Subject: [PATCH 05/17] arm: TIF_NOTIFY_RESUME

This makes arm support TIF_NOTIFY_RESUME, so that set_notify_resume()
requests a call to tracehook_notify_resume() on the way back to user mode.

Signed-off-by: Roland McGrath <[email protected]>
---
arch/arm/include/asm/thread_info.h | 4 ++++
arch/arm/kernel/entry-common.S | 2 +-
arch/arm/kernel/signal.c | 5 +++++
3 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index 4f88482..def86a0 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -129,12 +129,14 @@ extern void vfp_sync_state(struct thread_info *thread);
* thread information flags:
* TIF_SYSCALL_TRACE - syscall trace active
* TIF_SIGPENDING - signal pending
+ * TIF_NOTIFY_RESUME - tracing notification pending
* TIF_NEED_RESCHED - rescheduling necessary
* TIF_USEDFPU - FPU was used by this task this quantum (SMP)
* TIF_POLLING_NRFLAG - true if poll_idle() is polling TIF_NEED_RESCHED
*/
#define TIF_SIGPENDING 0
#define TIF_NEED_RESCHED 1
+#define TIF_NOTIFY_RESUME 2
#define TIF_SYSCALL_TRACE 8
#define TIF_POLLING_NRFLAG 16
#define TIF_USING_IWMMXT 17
@@ -142,6 +144,7 @@ extern void vfp_sync_state(struct thread_info *thread);
#define TIF_FREEZE 19

#define _TIF_SIGPENDING (1 << TIF_SIGPENDING)
+#define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
#define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED)
#define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
#define _TIF_POLLING_NRFLAG (1 << TIF_POLLING_NRFLAG)
@@ -152,6 +155,7 @@ extern void vfp_sync_state(struct thread_info *thread);
* Change these and you break ASM code in entry-common.S
*/
#define _TIF_WORK_MASK 0x000000ff
+#define _TIF_DO_NOTIFY_RESUME (_TIF_SIGPENDING | _TIF_NOTIFY_RESUME)

#endif /* __KERNEL__ */
#endif /* __ASM_ARM_THREAD_INFO_H */
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index b55cb03..2523cac 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -51,7 +51,7 @@ fast_work_pending:
work_pending:
tst r1, #_TIF_NEED_RESCHED
bne work_resched
- tst r1, #_TIF_SIGPENDING
+ tst r1, #_TIF_DO_NOTIFY_RESUME
beq no_work_pending
mov r0, sp @ 'regs'
mov r2, why @ 'syscall'
diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
index 51e2187..6663ee0 100644
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -705,4 +705,9 @@ do_notify_resume(struct pt_regs *regs, unsigned int thread_flags, int syscall)
{
if (thread_flags & _TIF_SIGPENDING)
do_signal(&current->blocked, regs, syscall);
+
+ if (thread_flags & _TIF_NOTIFY_RESUME) {
+ clear_thread_flag(TIF_NOTIFY_RESUME);
+ tracehook_notify_resume(regs);
+ }
}

2009-04-25 00:11:48

by Roland McGrath

[permalink] [raw]
Subject: [PATCH 08/17] arm: CORE_DUMP_USE_REGSET

Flip on CORE_DUMP_USE_REGSET now that arm has user_regset support
for the general and FPU register sets that appear in core dumps.

This moves arm in line with the preferred new code to gather
registers for ELF core dumps. There is no material change to
what appears in core dumps.

Signed-off-by: Roland McGrath <[email protected]>
---
arch/arm/include/asm/elf.h | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/arm/include/asm/elf.h b/arch/arm/include/asm/elf.h
index d7da19b..97a1cd2 100644
--- a/arch/arm/include/asm/elf.h
+++ b/arch/arm/include/asm/elf.h
@@ -94,6 +94,7 @@ extern int arm_elf_read_implies_exec(const struct elf32_hdr *, int);
#define elf_read_implies_exec(ex,stk) arm_elf_read_implies_exec(&(ex), stk)

#define USE_ELF_CORE_DUMP
+#define CORE_DUMP_USE_REGSET
#define ELF_EXEC_PAGESIZE 4096

/* This is the location that an ET_DYN program is loaded if exec'ed. Typical

2009-04-25 00:12:19

by Roland McGrath

[permalink] [raw]
Subject: [PATCH 09/17] arm: user_regset: VFP regs

This converts PTRACE_GETVFPREGS/PTRACE_SETVFPREGS into user_regset form.

There are two small changes to the PTRACE_GETVFPREGS behavior.
The trailing word of struct user_vfp (alignment padding) was not
touched at all by PTRACE_GETVFPREGS before, so the user's struct
probably contained uninitialized garbage. Now that word will be
filled with zero. Likewise, on configurations with only 16 VFP
registers (<v3), PTRACE_GETVFPREGS would not touch those words in
the user's struct all, leaving uninitialized garbage. Now those
words will also be filled with zero.

Signed-off-by: Roland McGrath <[email protected]>
---
arch/arm/kernel/ptrace.c | 104 ++++++++++++++++++++++++++++++++++++----------
1 files changed, 82 insertions(+), 22 deletions(-)

diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index 9e7aa04..7f3c121 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -685,47 +685,95 @@ static int ptrace_setcrunchregs(struct task_struct *tsk, void __user *ufp)
/*
* Get the child VFP state.
*/
-static int ptrace_getvfpregs(struct task_struct *tsk, void __user *data)
+static int vfp_get(struct task_struct *target,
+ const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ void *kbuf, void __user *ubuf)
{
- struct thread_info *thread = task_thread_info(tsk);
+ struct thread_info *thread = task_thread_info(target);
union vfp_state *vfp = &thread->vfpstate;
- struct user_vfp __user *ufp = data;
+ u32 *fpscr = &vfp->hard.fpscr;
+ int ret;

vfp_sync_state(thread);

/* copy the floating point registers */
- if (copy_to_user(&ufp->fpregs, &vfp->hard.fpregs,
- sizeof(vfp->hard.fpregs)))
- return -EFAULT;
+ ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf, &vfp->hard.fpregs,
+ 0, sizeof(vfp->hard.fpregs));
+ if (!ret && sizeof(vfp->hard.fpregs) < offsetof(struct user_vfp, fpscr))
+ /*
+ * When we don't support all the VFP registers in the
+ * regset format, fill the rest with zero.
+ */
+ ret = user_regset_copyout_zero(
+ &pos, &count, &kbuf, &ubuf,
+ sizeof(vfp->hard.fpregs),
+ offsetof(struct user_vfp, fpscr));

/* copy the status and control register */
- if (put_user(vfp->hard.fpscr, &ufp->fpscr))
- return -EFAULT;
+ if (!ret)
+ ret = user_regset_copyout(
+ &pos, &count, &kbuf, &ubuf, fpscr,
+ offsetof(struct user_vfp, fpscr),
+ offsetof(struct user_vfp, fpscr) + sizeof(*fpscr));

- return 0;
+ /*
+ * Fill the alignment padding at the end of struct user_vfp.
+ */
+ if (!ret)
+ ret = user_regset_copyout_zero(
+ &pos, &count, &kbuf, &ubuf,
+ offsetof(struct user_vfp, fpscr),
+ offsetof(struct user_vfp, fpscr) + sizeof(*fpscr));
+
+ return ret;
}

/*
* Set the child VFP state.
*/
-static int ptrace_setvfpregs(struct task_struct *tsk, void __user *data)
+static int vfp_set(struct task_struct *target,
+ const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ const void *kbuf, const void __user *ubuf)
{
- struct thread_info *thread = task_thread_info(tsk);
+ struct thread_info *thread = task_thread_info(target);
union vfp_state *vfp = &thread->vfpstate;
- struct user_vfp __user *ufp = data;
+ u32 *fpscr = &vfp->hard.fpscr;
+ int ret;

vfp_sync_state(thread);

/* copy the floating point registers */
- if (copy_from_user(&vfp->hard.fpregs, &ufp->fpregs,
- sizeof(vfp->hard.fpregs)))
- return -EFAULT;
+ ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &vfp->hard.fpregs,
+ 0, sizeof(vfp->hard.fpregs));
+ if (!ret && sizeof(vfp->hard.fpregs) < offsetof(struct user_vfp, fpscr))
+ /*
+ * When we don't support all the VFP registers in the
+ * regset format, ignore the rest.
+ */
+ ret = user_regset_copyin_ignore(
+ &pos, &count, &kbuf, &ubuf,
+ sizeof(vfp->hard.fpregs),
+ offsetof(struct user_vfp, fpscr));

/* copy the status and control register */
- if (get_user(vfp->hard.fpscr, &ufp->fpscr))
- return -EFAULT;
+ if (!ret)
+ ret = user_regset_copyin(
+ &pos, &count, &kbuf, &ubuf, fpscr,
+ offsetof(struct user_vfp, fpscr),
+ offsetof(struct user_vfp, fpscr) + sizeof(*fpscr));

- return 0;
+ /*
+ * Ignore the alignment padding at the end of struct user_vfp.
+ */
+ if (!ret)
+ ret = user_regset_copyin_ignore(
+ &pos, &count, &kbuf, &ubuf,
+ offsetof(struct user_vfp, fpscr) + sizeof(*fpscr),
+ sizeof(struct user_vfp));
+
+ return ret;
}
#endif

@@ -738,6 +786,9 @@ static int ptrace_setvfpregs(struct task_struct *tsk, void __user *data)
enum {
REGSET_GPR,
REGSET_FP,
+#ifdef CONFIG_VFP
+ REGSET_VFP,
+#endif
};

static const struct user_regset arm_regsets[] = {
@@ -752,6 +803,13 @@ static const struct user_regset arm_regsets[] = {
.size = sizeof(long), .align = sizeof(long),
.active = user_fp_active, .get = user_fp_get, .set = user_fp_set
},
+#ifdef CONFIG_VFP
+ [REGSET_VFP] = {
+ .n = sizeof(struct user_vfp) / sizeof(long),
+ .size = sizeof(long), .align = sizeof(long),
+ .get = vfp_get, .set = vfp_set
+ },
+#endif
};

static const struct user_regset_view user_arm_view = {
@@ -855,12 +913,14 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data)

#ifdef CONFIG_VFP
case PTRACE_GETVFPREGS:
- ret = ptrace_getvfpregs(child, (void __user *)data);
- break;
+ return copy_regset_to_user(child, &user_arm_view, REGSET_VFP,
+ 0, sizeof(struct pt_regs),
+ (void __user *) data);

case PTRACE_SETVFPREGS:
- ret = ptrace_setvfpregs(child, (void __user *)data);
- break;
+ return copy_regset_from_user(child, &user_arm_view, REGSET_VFP,
+ 0, sizeof(struct pt_regs),
+ (const void __user *) data);
#endif

default:

2009-04-25 00:12:36

by Roland McGrath

[permalink] [raw]
Subject: [PATCH 10/17] arm: user_regset: VFP in core dumps

This enables dumping the VFP state in ELF core dumps.
It gets its own note.

I reused the NT_PRXFPREG type code already used for extended-FP
state on x86, because it seems to fit logically. It would be just
as easy to define a new code called NT_ARM_VFP in linux/elf.h if
that is preferable.

Signed-off-by: Roland McGrath <[email protected]>
---
arch/arm/kernel/ptrace.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index 7f3c121..5fb5ac1 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -805,6 +805,7 @@ static const struct user_regset arm_regsets[] = {
},
#ifdef CONFIG_VFP
[REGSET_VFP] = {
+ .core_note_type = NT_PRXFPREG,
.n = sizeof(struct user_vfp) / sizeof(long),
.size = sizeof(long), .align = sizeof(long),
.get = vfp_get, .set = vfp_set

2009-04-25 00:12:50

by Roland McGrath

[permalink] [raw]
Subject: [PATCH 07/17] arm: user_regset: FPU regs

This converts PTRACE_GETFPREGS/PTRACE_SETFPREGS into user_regset form.
There should be no change in the ptrace behavior.

Signed-off-by: Roland McGrath <[email protected]>
---
arch/arm/kernel/ptrace.c | 48 +++++++++++++++++++++++++++++++++++----------
1 files changed, 37 insertions(+), 11 deletions(-)

diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index fbb18e4..9e7aa04 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -589,21 +589,38 @@ static int gpr_set(struct task_struct *target,
/*
* Get the child FPU state.
*/
-static int ptrace_getfpregs(struct task_struct *tsk, void __user *ufp)
+static int user_fp_get(struct task_struct *target,
+ const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ void *kbuf, void __user *ubuf)
{
- return copy_to_user(ufp, &task_thread_info(tsk)->fpstate,
- sizeof(struct user_fp)) ? -EFAULT : 0;
+ return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
+ &task_thread_info(target)->fpstate,
+ 0, sizeof(struct user_fp));
+}
+
+/*
+ * Decide if the FPU state is interesting enough to look at or dump.
+ */
+static int user_fp_active(struct task_struct *target,
+ const struct user_regset *regset)
+{
+ struct thread_info *thread = task_thread_info(target);
+ return (thread->used_cp[1] || thread->used_cp[2]) ? regset->n : 0;
}

/*
* Set the child FPU state.
*/
-static int ptrace_setfpregs(struct task_struct *tsk, void __user *ufp)
+static int user_fp_set(struct task_struct *target,
+ const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ const void *kbuf, const void __user *ubuf)
{
- struct thread_info *thread = task_thread_info(tsk);
+ struct thread_info *thread = task_thread_info(target);
thread->used_cp[1] = thread->used_cp[2] = 1;
- return copy_from_user(&thread->fpstate, ufp,
- sizeof(struct user_fp)) ? -EFAULT : 0;
+ return user_regset_copyin(&pos, &count, &kbuf, &ubuf,
+ &thread->fpstate, 0, sizeof(struct user_fp));
}

#ifdef CONFIG_IWMMXT
@@ -720,6 +737,7 @@ static int ptrace_setvfpregs(struct task_struct *tsk, void __user *data)
*/
enum {
REGSET_GPR,
+ REGSET_FP,
};

static const struct user_regset arm_regsets[] = {
@@ -728,6 +746,12 @@ static const struct user_regset arm_regsets[] = {
.size = sizeof(long), .align = sizeof(long),
.get = gpr_get, .set = gpr_set
},
+ [REGSET_FP] = {
+ .core_note_type = NT_PRFPREG,
+ .n = sizeof(struct user_fp) / sizeof(long),
+ .size = sizeof(long), .align = sizeof(long),
+ .active = user_fp_active, .get = user_fp_get, .set = user_fp_set
+ },
};

static const struct user_regset_view user_arm_view = {
@@ -790,12 +814,14 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data)
(const void __user *) data);

case PTRACE_GETFPREGS:
- ret = ptrace_getfpregs(child, (void __user *)data);
- break;
+ return copy_regset_to_user(child, &user_arm_view, REGSET_FP,
+ 0, sizeof(struct user_fp),
+ (void __user *) data);

case PTRACE_SETFPREGS:
- ret = ptrace_setfpregs(child, (void __user *)data);
- break;
+ return copy_regset_from_user(child, &user_arm_view, REGSET_FP,
+ 0, sizeof(struct user_fp),
+ (const void __user *) data);

#ifdef CONFIG_IWMMXT
case PTRACE_GETWMMXREGS:

2009-04-25 00:13:54

by Roland McGrath

[permalink] [raw]
Subject: [PATCH 13/17] arm: user_regset: Crunch regs

This converts the ptrace Crunch support into user_regset form.
There should be no change in the ptrace behavior on CONFIG_CRUNCH.

I am not set up to test any hardware with CONFIG_CRUNCH or to try
any software that uses Crunch so as to verify it. I've tested that
it compiles. This regset's code is as simple as any can be.

Signed-off-by: Roland McGrath <[email protected]>
---
arch/arm/kernel/ptrace.c | 44 ++++++++++++++++++++++++++++++++------------
1 files changed, 32 insertions(+), 12 deletions(-)

diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index d59e074..78b457c 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -671,25 +671,33 @@ static int iwmmxt_set(struct task_struct *target,
/*
* Get the child Crunch state.
*/
-static int ptrace_getcrunchregs(struct task_struct *tsk, void __user *ufp)
+static int crunch_get(struct task_struct *target,
+ const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ void *kbuf, void __user *ubuf)
{
- struct thread_info *thread = task_thread_info(tsk);
+ struct thread_info *thread = task_thread_info(target);

crunch_task_disable(thread); /* force it to ram */
- return copy_to_user(ufp, &thread->crunchstate, CRUNCH_SIZE)
- ? -EFAULT : 0;
+
+ return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
+ &thread->crunchstate, 0, CRUNCH_SIZE);
}

/*
* Set the child Crunch state.
*/
-static int ptrace_setcrunchregs(struct task_struct *tsk, void __user *ufp)
+static int crunch_set(struct task_struct *target,
+ const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ const void *kbuf, const void __user *ubuf)
{
- struct thread_info *thread = task_thread_info(tsk);
+ struct thread_info *thread = task_thread_info(target);

crunch_task_release(thread); /* force a reload */
- return copy_from_user(&thread->crunchstate, ufp, CRUNCH_SIZE)
- ? -EFAULT : 0;
+
+ return user_regset_copyin(&pos, &count, &kbuf, &ubuf,
+ &thread->crunchstate, 0, CRUNCH_SIZE);
}
#endif

@@ -804,6 +812,9 @@ enum {
#ifdef CONFIG_IWMMXT
REGSET_IWMMXT,
#endif
+#ifdef CONFIG_CRUNCH
+ REGSET_CRUNCH,
+#endif
};

static const struct user_regset arm_regsets[] = {
@@ -834,6 +845,13 @@ static const struct user_regset arm_regsets[] = {
.active = iwmmxt_active, .get = iwmmxt_get, .set = iwmmxt_set
},
#endif
+#ifdef CONFIG_CRUNCH
+ [REGSET_CRUNCH] = {
+ .n = sizeof(struct crunch_state) / sizeof(int),
+ .size = sizeof(int), .align = sizeof(int),
+ .get = crunch_get, .set = crunch_set
+ },
+#endif
};

static const struct user_regset_view user_arm_view = {
@@ -931,12 +949,14 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data)

#ifdef CONFIG_CRUNCH
case PTRACE_GETCRUNCHREGS:
- ret = ptrace_getcrunchregs(child, (void __user *)data);
- break;
+ return copy_regset_from_user(child, &user_arm_view,
+ REGSET_CRUNCH, 0, CRUNCH_SIZE,
+ (const void __user *) data);

case PTRACE_SETCRUNCHREGS:
- ret = ptrace_setcrunchregs(child, (void __user *)data);
- break;
+ return copy_regset_from_user(child, &user_arm_view,
+ REGSET_CRUNCH, 0, CRUNCH_SIZE,
+ (const void __user *) data);
#endif

#ifdef CONFIG_VFP

2009-04-25 00:14:21

by Roland McGrath

[permalink] [raw]
Subject: [PATCH 14/17] arm: user_regset: Crunch in core dumps

This enables dumping the Crunch state in ELF core dumps.
It gets its own note of the new type NT_ARM_CRUNCH.

Signed-off-by: Roland McGrath <[email protected]>
---
arch/arm/kernel/ptrace.c | 1 +
include/linux/elf.h | 1 +
2 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index 78b457c..6ce3deb 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -847,6 +847,7 @@ static const struct user_regset arm_regsets[] = {
#endif
#ifdef CONFIG_CRUNCH
[REGSET_CRUNCH] = {
+ .core_note_type = NT_ARM_CRUNCH,
.n = sizeof(struct crunch_state) / sizeof(int),
.size = sizeof(int), .align = sizeof(int),
.get = crunch_get, .set = crunch_set
diff --git a/include/linux/elf.h b/include/linux/elf.h
index aa8bae6..30a7b92 100644
--- a/include/linux/elf.h
+++ b/include/linux/elf.h
@@ -362,6 +362,7 @@ typedef struct elf64_shdr {
#define NT_386_TLS 0x200 /* i386 TLS slots (struct user_desc) */
#define NT_386_IOPERM 0x201 /* x86 io permission bitmap (1=deny) */
#define NT_ARM_WMMX 0x300 /* ARM iWMMXt registers */
+#define NT_ARM_CRUNCH 0x301 /* ARM Crunch registers */


/* Note header in a PT_NOTE section */

2009-04-25 00:14:40

by Roland McGrath

[permalink] [raw]
Subject: [PATCH 11/17] arm: user_regset: iWMMXt regs

This converts the ptrace iWMMXt support into user_regset form.
There should be no change in the ptrace behavior on CONFIG_IWMMXT.

I am not set up to test any hardware with CONFIG_IWMMXT or to try
any software that uses iWMMXt so as to verify it. I've tested that
it compiles. This regset's code is extremely simple.

Signed-off-by: Roland McGrath <[email protected]>
---
arch/arm/kernel/ptrace.c | 50 ++++++++++++++++++++++++++++++++++-----------
1 files changed, 38 insertions(+), 12 deletions(-)

diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index 5fb5ac1..821ce64 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -628,29 +628,41 @@ static int user_fp_set(struct task_struct *target,
/*
* Get the child iWMMXt state.
*/
-static int ptrace_getwmmxregs(struct task_struct *tsk, void __user *ufp)
+static int iwmmxt_get(struct task_struct *target,
+ const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ void *kbuf, void __user *ubuf)
{
- struct thread_info *thread = task_thread_info(tsk);
+ struct thread_info *thread = task_thread_info(target);

if (!test_ti_thread_flag(thread, TIF_USING_IWMMXT))
return -ENODATA;
iwmmxt_task_disable(thread); /* force it to ram */
- return copy_to_user(ufp, &thread->fpstate.iwmmxt, IWMMXT_SIZE)
- ? -EFAULT : 0;
+ return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
+ &thread->fpstate.iwmmxt, 0, IWMMXT_SIZE);
+}
+
+static int iwmmxt_active(struct task_struct *target,
+ const struct user_regset *regset)
+{
+ return test_tsk_thread_flag(target, TIF_USING_IWMMXT) ? regset->n : 0;
}

/*
* Set the child iWMMXt state.
*/
-static int ptrace_setwmmxregs(struct task_struct *tsk, void __user *ufp)
+static int iwmmxt_set(struct task_struct *target,
+ const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ const void *kbuf, const void __user *ubuf)
{
- struct thread_info *thread = task_thread_info(tsk);
+ struct thread_info *thread = task_thread_info(target);

if (!test_ti_thread_flag(thread, TIF_USING_IWMMXT))
return -EACCES;
iwmmxt_task_release(thread); /* force a reload */
- return copy_from_user(&thread->fpstate.iwmmxt, ufp, IWMMXT_SIZE)
- ? -EFAULT : 0;
+ return user_regset_copyin(&pos, &count, &kbuf, &ubuf,
+ &thread->fpstate.iwmmxt, 0, IWMMXT_SIZE);
}

#endif
@@ -789,6 +801,9 @@ enum {
#ifdef CONFIG_VFP
REGSET_VFP,
#endif
+#ifdef CONFIG_IWMMXT
+ REGSET_IWMMXT,
+#endif
};

static const struct user_regset arm_regsets[] = {
@@ -811,6 +826,13 @@ static const struct user_regset arm_regsets[] = {
.get = vfp_get, .set = vfp_set
},
#endif
+#ifdef CONFIG_IWMMXT
+ [REGSET_IWMMXT] = {
+ .n = sizeof(struct iwmmxt_struct) / sizeof(int),
+ .size = sizeof(int), .align = sizeof(int),
+ .active = iwmmxt_active, .get = iwmmxt_get, .set = iwmmxt_set
+ },
+#endif
};

static const struct user_regset_view user_arm_view = {
@@ -884,12 +906,16 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data)

#ifdef CONFIG_IWMMXT
case PTRACE_GETWMMXREGS:
- ret = ptrace_getwmmxregs(child, (void __user *)data);
- break;
+ return copy_regset_to_user(child, &user_arm_view,
+ REGSET_IWMMXT,
+ 0, IWMMXT_SIZE,
+ (void __user *) data);

case PTRACE_SETWMMXREGS:
- ret = ptrace_setwmmxregs(child, (void __user *)data);
- break;
+ return copy_regset_from_user(child, &user_arm_view,
+ REGSET_IWMMXT,
+ 0, IWMMXT_SIZE,
+ (const void __user *) data);
#endif

case PTRACE_GET_THREAD_AREA:

2009-04-25 00:14:59

by Roland McGrath

[permalink] [raw]
Subject: [PATCH 12/17] arm: user_regset: iWMMXt in core dumps

This enables dumping the iWMMXt state in ELF core dumps if it was used.
It gets its own note of the new type NT_ARM_WMMX.

Signed-off-by: Roland McGrath <[email protected]>
---
arch/arm/kernel/ptrace.c | 1 +
include/linux/elf.h | 1 +
2 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index 821ce64..d59e074 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -828,6 +828,7 @@ static const struct user_regset arm_regsets[] = {
#endif
#ifdef CONFIG_IWMMXT
[REGSET_IWMMXT] = {
+ .core_note_type = NT_ARM_WMMX,
.n = sizeof(struct iwmmxt_struct) / sizeof(int),
.size = sizeof(int), .align = sizeof(int),
.active = iwmmxt_active, .get = iwmmxt_get, .set = iwmmxt_set
diff --git a/include/linux/elf.h b/include/linux/elf.h
index 45a937b..aa8bae6 100644
--- a/include/linux/elf.h
+++ b/include/linux/elf.h
@@ -361,6 +361,7 @@ typedef struct elf64_shdr {
#define NT_PPC_VSX 0x102 /* PowerPC VSX registers */
#define NT_386_TLS 0x200 /* i386 TLS slots (struct user_desc) */
#define NT_386_IOPERM 0x201 /* x86 io permission bitmap (1=deny) */
+#define NT_ARM_WMMX 0x300 /* ARM iWMMXt registers */


/* Note header in a PT_NOTE section */

2009-04-25 00:15:46

by Roland McGrath

[permalink] [raw]
Subject: [PATCH 17/17] arm: HAVE_ARCH_TRACEHOOK

All the prerequisites are in place, so wire up HAVE_ARCH_TRACEHOOK for arm.
The immediate effect of this is to make /proc/pid/syscall work.

Signed-off-by: Roland McGrath <[email protected]>
---
arch/arm/Kconfig | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index e02b893..5b322d3 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -14,6 +14,7 @@ config ARM
select SYS_SUPPORTS_APM_EMULATION
select HAVE_OPROFILE
select HAVE_ARCH_KGDB
+ select HAVE_ARCH_TRACEHOOK
select HAVE_KPROBES if (!XIP_KERNEL)
select HAVE_KRETPROBES if (HAVE_KPROBES)
select HAVE_FUNCTION_TRACER if (!XIP_KERNEL)

2009-04-25 00:16:24

by Roland McGrath

[permalink] [raw]
Subject: [PATCH 15/17] arm: user_regset: thread pointer in core dumps

This wires up a proper user_regset for the $tp special register.
It will appear in ELF core dumps in its own tiny note of the new
type NT_ARM_TP. The note will be omitted if $tp was 0.

Signed-off-by: Roland McGrath <[email protected]>
---
arch/arm/kernel/ptrace.c | 37 +++++++++++++++++++++++++++++++++++++
include/linux/elf.h | 1 +
2 files changed, 38 insertions(+), 0 deletions(-)

diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index 6ce3deb..1a07257 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -623,6 +623,37 @@ static int user_fp_set(struct task_struct *target,
&thread->fpstate, 0, sizeof(struct user_fp));
}

+/*
+ * Fetch the thread pointer as a regset of its own.
+ * It's considered "active" (i.e. worth dumping) if it's nonzero.
+ */
+
+static int tp_get(struct task_struct *target,
+ const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ void *kbuf, void __user *ubuf)
+{
+ return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
+ &task_thread_info(target)->tp_value,
+ 0, sizeof(unsigned long));
+}
+
+static int tp_set(struct task_struct *target,
+ const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ const void *kbuf, const void __user *ubuf)
+{
+ return user_regset_copyin(&pos, &count, &kbuf, &ubuf,
+ &task_thread_info(target)->tp_value,
+ 0, sizeof(unsigned long));
+}
+
+static int tp_active(struct task_struct *target,
+ const struct user_regset *regset)
+{
+ return task_thread_info(target)->tp_value == 0 ? 0 : 1;
+}
+
#ifdef CONFIG_IWMMXT

/*
@@ -806,6 +837,7 @@ static int vfp_set(struct task_struct *target,
enum {
REGSET_GPR,
REGSET_FP,
+ REGSET_TP,
#ifdef CONFIG_VFP
REGSET_VFP,
#endif
@@ -829,6 +861,11 @@ static const struct user_regset arm_regsets[] = {
.size = sizeof(long), .align = sizeof(long),
.active = user_fp_active, .get = user_fp_get, .set = user_fp_set
},
+ [REGSET_TP] = {
+ .core_note_type = NT_ARM_TP,
+ .size = sizeof(long), .align = sizeof(long), .n = 1,
+ .active = tp_active, .get = tp_get, .set = tp_set
+ },
#ifdef CONFIG_VFP
[REGSET_VFP] = {
.core_note_type = NT_PRXFPREG,
diff --git a/include/linux/elf.h b/include/linux/elf.h
index 30a7b92..32662e3 100644
--- a/include/linux/elf.h
+++ b/include/linux/elf.h
@@ -363,6 +363,7 @@ typedef struct elf64_shdr {
#define NT_386_IOPERM 0x201 /* x86 io permission bitmap (1=deny) */
#define NT_ARM_WMMX 0x300 /* ARM iWMMXt registers */
#define NT_ARM_CRUNCH 0x301 /* ARM Crunch registers */
+#define NT_ARM_TP 0x302 /* ARM thread pointer */


/* Note header in a PT_NOTE section */

2009-04-25 00:16:59

by Roland McGrath

[permalink] [raw]
Subject: [PATCH 16/17] arm: asm/syscall.h (unfinished)

This provides the user_stack_pointer() macro in asm/ptrace.h,
which some new arch-independent code wants to be able to use.

It also adds the asm/syscall.h header to let arch-independent
code interpret the registers as system call arguments or return.

The syscall_get_nr() function here is not really right. I don't
know enough about ARM to finish it correctly. It needs to figure
out if the blocked user task is really in the kernel for a system
call and return -1 if not. I also did not try to handle all the
different ABI variants, which I don't really understand.

Until this is fixed, /proc/pid/syscall can show bogus results
for a process that is blocked inside a fault or something else
that's not a system call. (It's probably all wrong for !AEABI too.)

Signed-off-by: Roland McGrath <[email protected]>
---
arch/arm/include/asm/ptrace.h | 4 +-
arch/arm/include/asm/syscall.h | 65 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 67 insertions(+), 2 deletions(-)
create mode 100644 arch/arm/include/asm/syscall.h

diff --git a/arch/arm/include/asm/ptrace.h b/arch/arm/include/asm/ptrace.h
index 236a06b..34075dc 100644
--- a/arch/arm/include/asm/ptrace.h
+++ b/arch/arm/include/asm/ptrace.h
@@ -140,7 +140,8 @@ static inline int valid_user_regs(struct pt_regs *regs)
return 0;
}

-#define instruction_pointer(regs) (regs)->ARM_pc
+#define instruction_pointer(regs) ((regs)->ARM_pc)
+#define user_stack_pointer(regs) ((regs)->ARM_sp)

#ifdef CONFIG_SMP
extern unsigned long profile_pc(struct pt_regs *regs);
@@ -156,4 +157,3 @@ extern unsigned long profile_pc(struct pt_regs *regs);
#endif /* __ASSEMBLY__ */

#endif
-
diff --git a/arch/arm/include/asm/syscall.h b/arch/arm/include/asm/syscall.h
new file mode 100644
index 0000000..56aefe6
--- /dev/null
+++ b/arch/arm/include/asm/syscall.h
@@ -0,0 +1,65 @@
+/*
+ * Access to user system call parameters and results
+ *
+ * See asm-generic/syscall.h for descriptions of what we must do here.
+ */
+
+#ifndef _ASM_SYSCALL_H
+#define _ASM_SYSCALL_H 1
+
+#include <linux/sched.h>
+#include <linux/err.h>
+
+static inline long syscall_get_nr(struct task_struct *task,
+ struct pt_regs *regs)
+{
+ /* XXX how to figure out if blocked in a syscall or not?? */
+
+ return regs->ARM_r7; /* XXX apparently */
+ return task_thread_info(task)->syscall; /* XXX if changed via ptrace */
+}
+
+static inline void syscall_rollback(struct task_struct *task,
+ struct pt_regs *regs)
+{
+ regs->ARM_r0 = regs->ARM_ORIG_r0;
+}
+
+static inline long syscall_get_error(struct task_struct *task,
+ struct pt_regs *regs)
+{
+ return IS_ERR_VALUE(regs->ARM_r0) ? regs->ARM_r0 : 0;
+}
+
+static inline long syscall_get_return_value(struct task_struct *task,
+ struct pt_regs *regs)
+{
+ return regs->ARM_r0;
+}
+
+static inline void syscall_set_return_value(struct task_struct *task,
+ struct pt_regs *regs,
+ int error, long val)
+{
+ regs->ARM_r0 = (long) error ?: val;
+}
+
+static inline void syscall_get_arguments(struct task_struct *task,
+ struct pt_regs *regs,
+ unsigned int i, unsigned int n,
+ unsigned long *args)
+{
+ BUG_ON(i + n > 6);
+ memcpy(args, &regs->uregs[i], n * sizeof(args[0]));
+}
+
+static inline void syscall_set_arguments(struct task_struct *task,
+ struct pt_regs *regs,
+ unsigned int i, unsigned int n,
+ const unsigned long *args)
+{
+ BUG_ON(i + n > 6);
+ memcpy(&regs->uregs[i], args, n * sizeof(args[0]));
+}
+
+#endif /* _ASM_SYSCALL_H */

2009-04-27 22:49:19

by Paul Mundt

[permalink] [raw]
Subject: Re: [PATCH 12/17] arm: user_regset: iWMMXt in core dumps

On Fri, Apr 24, 2009 at 05:12:59PM -0700, Roland McGrath wrote:
> This enables dumping the iWMMXt state in ELF core dumps if it was used.
> It gets its own note of the new type NT_ARM_WMMX.
>
> Signed-off-by: Roland McGrath <[email protected]>

Are these note definitions centrally defined anywhere? I looked at the
include/elf/common.h in gdb, but it seemed fairly similar to linux/elf.h.

Both sh and mips would benefit from a note type for DSP registers. Would
something like an NT_PRDSPREG be accetable for this? Also, the note
definitions seem to have hopped from 4 to 6, is there a definition for 5,
or can that be used for a DSP register note type?

2009-04-28 02:55:22

by Roland McGrath

[permalink] [raw]
Subject: Re: [PATCH 12/17] arm: user_regset: iWMMXt in core dumps

> Are these note definitions centrally defined anywhere? I looked at the
> include/elf/common.h in gdb, but it seemed fairly similar to linux/elf.h.

linux/elf.h constitutes the central place. (And basically, de facto
everyone now seems to check with me for new ones.) New user_regset types
all get the "LINUX" note name, so there is no real authority beyond the
kernel maintainers to harmonize with.

> Both sh and mips would benefit from a note type for DSP registers. Would
> something like an NT_PRDSPREG be accetable for this? Also, the note
> definitions seem to have hopped from 4 to 6, is there a definition for 5,
> or can that be used for a DSP register note type?

There is no particular benefit to packing the space. It's just a 32-bit
identifier, with existing values assigned that are both large and small.
Please do not use any more small values. They are used on other systems
that some tools already deal with, and it just keeps things simpler not to
add any new n_type values that are reused with different names on different
systems. (I'll NAK any new use of values <= 20, which already have names
in glibc's elf.h.)

For a new purpose, it is my inclination to use a new NT_<machine>_<type>
name and pick a new value range, e.g. NT_SH_* at 0x300+n. The
interpretation of these type codes is machine-specific, but it just seems
simpler to assign a new block of numbers to each machine and not have any
overlaps.

If you really mean a new particular note format that would actually match
across different machines, then a new generically-named type for that is
appropriate. For that, I'd start with another new value range like 0x10000+n
(leaving the 0x?00+n blocks to be assigned for particular arch's).


Thanks,
Roland

2009-06-06 14:43:14

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 0/17] tracehook & user_regset for ARM

Russell, did you manage to take a look at this?

On Fri, Apr 24, 2009 at 05:06:34PM -0700, Roland McGrath wrote:
> The various things listed under CONFIG_HAVE_ARCH_TRACEHOOK (arch/Kconfig)
> are what each arch wants to do nowadays to fit with the generic code for
> user debugging, core dumping, etc. The "big machine" arch's have done this
> since 2.6.28 or earlier.
>
> Christoph suggested helping ARM with the work would be a good way to
> encourage all the "little machine" arch maintainers to catch up soon.
> (Of the "little machine" and oddball arch's, so far only sh has it.)
>
> The following patches bring ARM up to speed with HAVE_ARCH_TRACEHOOK (just
> about). There are so many patches because I sliced them into many small
> changes. Each patch is pretty short (some of them very tiny). The overall
> diffstat from the whole series is attached in the "pull request" below.
>
> This series is relative to ~2.6.30-rc3 (0c8454f). I expect it rebases
> easily to whatever tree you might want to queue it on.
>
> The immediate user-visible effects of the series are to enable the
> /proc/pid/syscall feature, and to add VFP, WMMX, Crunch, and $tp register
> data to core dumps.
>
> AFAIK only the asm/syscall.h patch still needs work. The preliminary
> version is only buggy in the way that /proc/pid/syscall will give bogus
> answers for a task not really in a syscall, or for the non-EABI entry
> styles. It's not unsafe or anything. It needs some attention from folks
> who really know ARM to fill in the truly proper version of syscall_get_nr().
>
> I only know how to run and test one ABI flavor, and only in qemu. I used
> versatile_defconfig and ran it in qemu-system-arm -M versatilepb using NFS
> root with the userland binaries from Fedora ARM.
>
> I don't know how to simulate hardware that has iWMMXt or Crunch, nor if my
> ARM userland handles those kernel configurations. So I've only
> (cross-)compile-tested the iWMMXt and Crunch code. (It is however the
> simplest of the user_regset code and pretty easy to eyeball-review.)
>
> My testing is quite minimal. Booted, nothing went wrong, simple strace
> uses still look sane, "cat /proc/self/syscall" looks right, core dump
> contents look right. (Not knowing the arch at all, I don't actually know
> how to put anything in the FPA or VFP registers so as to notice they are
> right rather than just the right number of zeros.) The userland I have
> does not have gdb (and qemu would take a week to build and run the gdb
> testsuite if it did), so I didn't try to test any ptrace use beyond what
> strace does.
>
>
> Thanks,
> Roland
>
> ---
>
> The following changes since commit 0c8454f56623505a99463405fd7d5664adfbb094:
> Rafael J. Wysocki (1):
> PM/Hibernate: Fix waiting for image device to appear on resume
>
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland.git arm/tracehook
>
> Roland McGrath (17):
> arm: arch_ptrace clean-up
> arm: arch_ptrace indentation
> arm: tracehook_report_syscall
> arm: tracehook_signal_handler
> arm: TIF_NOTIFY_RESUME
> arm: user_regset: general regs
> arm: user_regset: FPU regs
> arm: CORE_DUMP_USE_REGSET
> arm: user_regset: VFP regs
> arm: user_regset: VFP in core dumps
> arm: user_regset: iWMMXt regs
> arm: user_regset: iWMMXt in core dumps
> arm: user_regset: Crunch regs
> arm: user_regset: Crunch in core dumps
> arm: user_regset: thread pointer in core dumps
> arm: asm/syscall.h (unfinished)
> arm: HAVE_ARCH_TRACEHOOK
>
> arch/arm/Kconfig | 1 +
> arch/arm/include/asm/elf.h | 1 +
> arch/arm/include/asm/ptrace.h | 4 +-
> arch/arm/include/asm/syscall.h | 65 +++++
> arch/arm/include/asm/thread_info.h | 4 +
> arch/arm/kernel/entry-common.S | 2 +-
> arch/arm/kernel/ptrace.c | 526 ++++++++++++++++++++++++------------
> arch/arm/kernel/signal.c | 7 +
> include/linux/elf.h | 3 +
> 9 files changed, 433 insertions(+), 180 deletions(-)
> create mode 100644 arch/arm/include/asm/syscall.h
---end quoted text---

2009-06-19 09:14:42

by Russell King

[permalink] [raw]
Subject: Re: [PATCH 01/17] arm: arch_ptrace clean-up

On Fri, Apr 24, 2009 at 05:07:32PM -0700, Roland McGrath wrote:
> This cleans up arch_ptrace() on arm to use the generic ptrace_request()
> for everything it can. This gets rid of the non-arch ptrace internals
> magic from the arch code.

It also changes the semantics a bit, and I don't buy the "it doesn't
matter" comments, especially in the PT_SINGLESTEP case. Having
PT_SINGLESTEP set the flag but later fail is not nice behaviour at all,
it means that, despite the ptrace call failing, the next time that the
task encounters a signal, it will enter single stepping mode.

Therefore, I believe this patch is wrong.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of:

2009-06-19 09:31:50

by Russell King

[permalink] [raw]
Subject: Re: [PATCH 16/17] arm: asm/syscall.h (unfinished)

On Fri, Apr 24, 2009 at 05:15:03PM -0700, Roland McGrath wrote:
> The syscall_get_nr() function here is not really right. I don't
> know enough about ARM to finish it correctly. It needs to figure
> out if the blocked user task is really in the kernel for a system
> call and return -1 if not.

That bit is rather hard - we maintain no global state as to whether a
task is in a syscall or not. We also do not maintain a global view
of which syscall number is being executed. The kernel just hasn't
required either of these things before.

> I also did not try to handle all the different ABI variants, which I
> don't really understand.

There are two places that the syscall number comes from: EABI and thumb
both use R7. OABI puts the value in the instruction itself.

In short, I don't know what to do about this either. I don't think
there's a quick and simple answer.

Your implementation of syscall_get_arguments() looks wrong - for OABI
it makes sense because there is no padding in the allocation of registers.
However, for EABI, there is padding, so 64-bit values always come in
using an even+odd register number. Short of maintaining some sort of
table describing the argument placements for every kernel syscall (eww)
I'm not sure how this could be fixed.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of:

2009-06-24 06:56:21

by Roland McGrath

[permalink] [raw]
Subject: Re: [PATCH 01/17] arm: arch_ptrace clean-up

> It also changes the semantics a bit, and I don't buy the "it doesn't
> matter" comments, especially in the PT_SINGLESTEP case. Having
> PT_SINGLESTEP set the flag but later fail is not nice behaviour at all,
> it means that, despite the ptrace call failing, the next time that the
> task encounters a signal, it will enter single stepping mode.

The reason my comment says "before it matters" is precisely because I don't
think this scenario is possible. (And hence that it does not change the
user-visible semantics at all.) If I'm wrong about that, then all that's
required is to drop the comments about not error-checking and reintroduce:

ret = -EIO;
if (!valid_signal(data))
break;

at the top of both cases that end in "goto common;". That is the only
error case ptrace_request() can diagnose for those requests.

Here's why I think that my comment:

* It's harmless in the error case, since we will have
* come back through here or PTRACE_SINGLESTEP again
* before it matters.

is correct and thus the check above is superfluous.

If we are in arch_ptrace() at all, it means we have already been through
ptrace_check_attach() successfully, so the child is in TASK_TRACED. (Or
else it's simultaneously being woken up by SIGKILL and dying, which is
already required to be harmless to things arch_ptrace() might do.) Aside
from SIGKILL, nothing else can wake the child from TASK_TRACED but ptrace.
If it's SIGKILL, then no harm done no matter what the state of the
single-step machinery--it never even tries to get back to user mode anyway.

If it's ptrace, then that's what "back through here" means in the comment:
it will be in one of these same cases in arch_ptrace() that calls either
single_step_disable() or single_step_enable(). Again, no harm done,
because the state was reset as desired before resuming in user mode.

I don't know what exactly "task encounters a signal" meant in your
description of a failure mode. If it means "a signal is sent" (generated),
then this does not wake the task from TASK_TRACED, where we know it is
sitting in this scenario, unless it was SIGKILL. If it means "a signal is
dequeued and processed" (delivered), then this only happens after the task
has been woken up from TASK_TRACED. That happens by SIGKILL, in which case
user mode is never seen again (zero steps already < single step), or by
ptrace, in which case it will or won't enter single stepping mode, exactly
as directed by that (second, successful) ptrace call.

Did I misunderstand which scenario you had in mind,
or am I mistaken about how that scenario is ruled out?


Thanks,
Roland

2009-06-24 08:57:14

by Roland McGrath

[permalink] [raw]
Subject: Re: [PATCH 16/17] arm: asm/syscall.h (unfinished)

> That bit is rather hard - we maintain no global state as to whether a
> task is in a syscall or not. We also do not maintain a global view
> of which syscall number is being executed. The kernel just hasn't
> required either of these things before.

I understand. This has also been true for at least one other machine
(though on others this has turned out to be easy to get reliably even
though it was never explicitly planned for there either). But lots of
people seem to think this is a really useful debugging feature worth
putting a bit of effort into implementing. (It enables a debugger to
answer, "What is that process blocked in?"--without prior tracing setup.)

The other machine I have in mind is s390, see commit 59da2139. That arch's
maintainer took on some new assembly hacking to make syscall_get_nr() work
in all cases, because he thought it was worth the work and whatever changes
to the kernel entry paths to support the feature. I think that might have
entailed adding a store to some kernel entry paths, though I gather there
was a serendipitous way to replace an existing store so perhaps the hot
paths were not lengthened.

> > I also did not try to handle all the different ABI variants, which I
> > don't really understand.
>
> There are two places that the syscall number comes from: EABI and thumb
> both use R7. OABI puts the value in the instruction itself.
>
> In short, I don't know what to do about this either. I don't think
> there's a quick and simple answer.

I understand. I would look at it in a few stages.

First, there is asm/syscall.h for use with syscall tracing. These
are for new things coming in like ftrace-based tracers that use
TIF_SYSCALL_TRACE or its equivalent, or for userland debuggers that
use /proc/pid/syscall in concert with PTRACE_SYSCALL (instead of
using PTRACE_GETREGS or whatnot). For these uses, syscall_get_nr()
only has to work right on a task known to be inside syscall_trace().
So current_thread_info()->syscall suffices.

That could be a reasonable first step to commit something simple.
Advise ARM users that the new /proc/pid/syscall et al are reliable
only for explicit syscall tracing features (ptrace/ftrace/utrace).
It gives bad info in other cases, but that's not the end of the
world. Then you can consider further refinements piecemeal.

Next, there is the "What syscall is he blocked in?" case. This is
the most-requested reason to want /proc/pid/syscall. For this
purpose, 99.44% of the time in practice when a person asks this
question, the task will in fact be inside a system call (as opposed
to blocked in a page fault or something else like that). (People
doing ad hoc debugging can probably tell from wchan if a block is in
a page fault, and actual experiences involving tasks blocked for
minutes in a page fault are not the norm.)
So, what about something vaguely like:

diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index 92248eb..0000000 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -267,6 +267,7 @@ ENTRY(vector_swi)
eor scno, scno, #__NR_SYSCALL_BASE @ check OS number
#endif

+ str scno, [tsk, #TI_SYSCALL]! @ store for syscall_get_nr()
stmdb sp!, {r4, r5} @ push fifth and sixth args
tst ip, #_TIF_SYSCALL_TRACE @ are we tracing syscalls?
bne __sys_trace
diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index 1a07257..0000000 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -1032,8 +1032,6 @@ asmlinkage int syscall_trace(int why, st
ip = regs->ARM_ip;
regs->ARM_ip = why;

- current_thread_info()->syscall = scno;
-
/*
* Report the system call for tracing. Entry tracing can
* decide to abort the call. We handle that by setting an

(Forgive me, I don't actually know a lick of ARM assembly.) So that
adds a store to current_thread_info()->syscall in the hot path.
Maybe that is acceptable? Or maybe there is a more clever way to
swizzle things to record the info with net zero overhead. (On some
machines there seems to be some hardware or lowest-level kernel
state already somewhere that indicates "how we got into kernel mode"
in a way that helps.)

Finally, there is the full reliable definition that if the task is
in an explicit block after entering the kernel from user mode for
some reason other than a system call, syscall_get_nr() returns -1.
It's nice to get there, but in practice many users will be quite
happy just to get the first two stages and know the caveat that
/proc/pid/syscall et al are unreliable when you can't be sure that
where the task is blocked is indeed in a system call.

Assuming something akin to the above for noticing on syscall entry,
then there are two obvious ways to keep that state maintained for
the rest of the time (i.e. reset for non-syscall kernel entries).
One is to reset the record when leaving the kernel after a syscall,
so it's not set again until the next syscall entry. The other is to
clear that record on every other kind of kernel entry. The latter
is effectively what other machines I'm aware of all do, but I have
even less clue how to do that on ARM than I've displayed above on
the syscall entry side. For the latter, perhaps it can be this:

diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index 92248eb..4844ba2 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -30,6 +30,9 @@ ret_fast_syscall:
tst r1, #_TIF_WORK_MASK
bne fast_work_pending

+ mov why, #-1
+ str why, [tsk, #TI_SYSCALL]! @ reset for syscall_get_nr()
+
/* perform architecture specific actions before user return */
arch_ret_to_user r1, lr

@@ -65,6 +68,8 @@ work_resched:
*/
ENTRY(ret_to_user)
ret_slow_syscall:
+ mov why, #-1
+ str why, [tsk, #TI_SYSCALL]! @ reset for syscall_get_nr()
disable_irq @ disable interrupts
ldr r1, [tsk, #TI_FLAGS]
tst r1, #_TIF_WORK_MASK

But again I don't really presume to have any clue about the ARM
assembly code. I hope I've illustrated some general directions that
might be possible for actual ARM experts to consider and flesh out
whether they are worthwhile.

Regardless, IMHO it's worthwhile to start with the first stage that
is trivial but with the most caveats about its limited utility, and
have an asm/syscall.h rather than none while mulling over the
eventual perfection of syscall_get_nr().

> Your implementation of syscall_get_arguments() looks wrong - for OABI
> it makes sense because there is no padding in the allocation of registers.
> However, for EABI, there is padding, so 64-bit values always come in
> using an even+odd register number. Short of maintaining some sort of
> table describing the argument placements for every kernel syscall (eww)
> I'm not sure how this could be fixed.

I seem to recall this confusion has come up before, but I can't find
where I explained it. Perhaps the asm-generic/syscall.h kerneldoc
comments should change to clarify the point. In fact, I thought I had
changed those descriptions before it went in, but it's not so.

syscall_get_arguments() and syscall_set_arguments() get at the *argument
registers* in their usual order, not at the arguments in any semantic
sense. Unless I'm mistaken, all ARM syscalls taking 64-bit arguments
receive them in these same registers, and in never more than six
registers total. (This is true of all other machines too.) So that's
all these functions have to do: get those six registers.

The consumer of the data (in userland or whatever tracing thing) indeed
needs to know the arch-specific interpretation of these register values
and that differs by the particular syscall number. That is already true
for knowing what the syscall numbers mean, how many of the registers are
actually used for arguments in a given syscall, which ones are pointers
to what kinds of data or have what other intepretation as bitmasks or
numbers, and that is true on every arch. Which pairs of registers form
a single semantic argument and when is just another of these details.

There is other work being done on in-kernel syscall tracers that
reconstruct the C-level arguments from syscall_get_arguments() info.
Those do use tables and such (jiggered via the SYSCALL_DEFINE macros and
black magic, AIUI). But that is all above the syscall_get_arguments()
layer. You don't need to worry about this to get asm/syscall.h right.


Thanks,
Roland