2011-06-14 09:20:26

by Tejun Heo

Subject: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#5

Hello,

This is the fifth take of PTRACE_SEIZE/INTERRUPT and group stop
notification patchset.

Changes from the last take[1] are,

* Rebased on top of Oleg's ptrace branch[2] which now contains prep
patches from the last take[1], so prep patches are dropped from this
series.

* si_pt_flags and PTRACE_SI_STOPPED are dropped in favor of encoding
the current group stop state in exit_code of STOP traps. If group
stop is in effect for the tracee, signr part of exit_code is the
stopping signal; otherwise, it's SIGTRAP.

Note that SIGCONT isn't used for NOTIFY traps - SIGTRAP is used
instead, so signr is either SIG(stop) or SIGTRAP. NOTIFY traps
aren't different from any other STOP traps and all we need is the
current state of group stop. There's no need to distinguish NOTIFY
traps.

* LISTENING test moved from wait_task_stopped() to
task_stopped_code().
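
The exit_code encoding of STOP traps described in the second item above
could be unpacked roughly as follows. This is only a userspace sketch:
it assumes PTRACE_EVENT_STOP is 7 as defined in this series, and the
helper names are hypothetical, not part of the patchset.

```c
#include <signal.h>
#include <stdbool.h>

/* Value used by this series; an assumption on any other kernel. */
#define SKETCH_PTRACE_EVENT_STOP 7

/* exit_code of a STOP trap is (signr | PTRACE_EVENT_STOP << 8). */
static bool sketch_is_stop_trap(int exit_code)
{
	return (exit_code >> 8) == SKETCH_PTRACE_EVENT_STOP;
}

/* The signr part lives in the lower eight bits. */
static int sketch_stop_trap_signr(int exit_code)
{
	return exit_code & 0xff;
}

/* Group stop is in effect iff signr is a stop signal, i.e. not SIGTRAP. */
static bool sketch_group_stopped(int exit_code)
{
	return sketch_is_stop_trap(exit_code) &&
	       sketch_stop_trap_signr(exit_code) != SIGTRAP;
}
```

A tracer would feed si_status from waitid(2) into these helpers; no
separate flag is needed to tell NOTIFY traps from other STOP traps.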

This patchset contains the following five patches.

0001-job-control-introduce-JOBCTL_TRAP_STOP-and-use-it-fo.patch
0002-ptrace-implement-PTRACE_SEIZE.patch
0003-ptrace-implement-PTRACE_INTERRUPT.patch
0004-ptrace-implement-TRAP_NOTIFY-and-use-it-for-group-st.patch
0005-ptrace-implement-PTRACE_LISTEN.patch

and available in the following git branch.

git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git review-ptrace-seize

The HEAD is 20b0913886 (ptrace: implement PTRACE_LISTEN). If you see
an older branch, please retry after a while (korg is still syncing).

The patchset is on top of Oleg's ptrace branch[2] - dd1d677269
(signal: remove three noop tracehooks).

diffstat follows.

include/linux/ptrace.h | 9 ++
include/linux/sched.h | 10 ++-
kernel/exit.c | 3
kernel/ptrace.c | 116 +++++++++++++++++++++++++++++++----
kernel/signal.c | 162 ++++++++++++++++++++++++++++++++++++++-----------
5 files changed, 249 insertions(+), 51 deletions(-)

Thanks.

--
tejun

[1] http://thread.gmane.org/gmane.linux.kernel/1147384
[2] git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc.git ptrace


2011-06-14 09:20:31

by Tejun Heo

Subject: [PATCH 1/5] job control: introduce JOBCTL_TRAP_STOP and use it for group stop trap

do_signal_stop() implemented both normal group stop and the trap for
group stop while ptraced. This approach has been enough so far, but
the scheduled changes require a trap mechanism which can be used in a
more generic manner. Using the group stop trap as the generic trap
site simplifies both the userland visible interface and the
implementation.

This patch adds a new jobctl flag - JOBCTL_TRAP_STOP. When set, it
triggers a trap site, which behaves like group stop trap, in
get_signal_to_deliver() after checking for pending signals. While
ptraced, do_signal_stop() doesn't stop itself. It initiates group
stop if requested and schedules JOBCTL_TRAP_STOP and returns. The
caller - get_signal_to_deliver() - is responsible for checking whether
TRAP_STOP is pending afterwards and handling it.

ptrace_attach() is updated to use JOBCTL_TRAP_STOP instead of
JOBCTL_STOP_PENDING and __ptrace_unlink() to clear all pending trap
bits and TRAPPING so that TRAP_STOP and future trap bits don't linger
after detach.
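
The effect of the detach-time cleanup on the jobctl bits can be
illustrated with a small userspace model. The bit values are copied
from the sched.h hunk of this patch; model_unlink_cleanup() is a
hypothetical stand-in for the relevant part of __ptrace_unlink() and
ignores locking and wake-ups.

```c
#include <stdbool.h>

/* Bit values as in the sched.h hunk of this patch. */
#define JOBCTL_STOP_PENDING	(1u << 17)
#define JOBCTL_STOP_CONSUME	(1u << 18)
#define JOBCTL_TRAP_STOP	(1u << 19)
#define JOBCTL_TRAPPING		(1u << 21)

#define JOBCTL_TRAP_MASK	JOBCTL_TRAP_STOP
#define JOBCTL_PENDING_MASK	(JOBCTL_STOP_PENDING | JOBCTL_TRAP_MASK)

/*
 * Hypothetical model: clear pending traps and TRAPPING only, leaving
 * the group stop bits (STOP_PENDING, STOP_CONSUME) untouched.
 */
static unsigned int model_unlink_cleanup(unsigned int jobctl)
{
	jobctl &= ~JOBCTL_TRAP_MASK;	/* drop all pending trap bits */
	jobctl &= ~JOBCTL_TRAPPING;	/* drop TRAPPING unconditionally */
	return jobctl;
}
```

Clearing only JOBCTL_TRAP_MASK rather than JOBCTL_PENDING_MASK is what
keeps the group stop bookkeeping intact across detach in this model.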

While at it, add proper function comment to do_signal_stop() and make
it return bool.

-v2: __ptrace_unlink() updated to clear JOBCTL_TRAP_MASK and TRAPPING
instead of JOBCTL_PENDING_MASK. This avoids accidentally
clearing JOBCTL_STOP_CONSUME. Spotted by Oleg.

-v3: do_signal_stop() updated to return %false without dropping
siglock while ptraced and TRAP_STOP check moved inside for(;;)
loop after group stop participation. This avoids unnecessary
relocking and also will help avoiding unnecessary traps by
consuming group stop before handling pending traps.

-v4: Jobctl trap handling moved into a separate function -
do_jobctl_trap().

Signed-off-by: Tejun Heo <[email protected]>
Cc: Oleg Nesterov <[email protected]>
---
include/linux/sched.h | 6 +++-
kernel/ptrace.c | 12 +++++--
kernel/signal.c | 90 +++++++++++++++++++++++++++++++++----------------
3 files changed, 75 insertions(+), 33 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5157bd9..8bd84b8 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1810,17 +1810,21 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
#define JOBCTL_STOP_DEQUEUED_BIT 16 /* stop signal dequeued */
#define JOBCTL_STOP_PENDING_BIT 17 /* task should stop for group stop */
#define JOBCTL_STOP_CONSUME_BIT 18 /* consume group stop count */
+#define JOBCTL_TRAP_STOP_BIT 19 /* trap for STOP */
#define JOBCTL_TRAPPING_BIT 21 /* switching to TRACED */

#define JOBCTL_STOP_DEQUEUED (1 << JOBCTL_STOP_DEQUEUED_BIT)
#define JOBCTL_STOP_PENDING (1 << JOBCTL_STOP_PENDING_BIT)
#define JOBCTL_STOP_CONSUME (1 << JOBCTL_STOP_CONSUME_BIT)
+#define JOBCTL_TRAP_STOP (1 << JOBCTL_TRAP_STOP_BIT)
#define JOBCTL_TRAPPING (1 << JOBCTL_TRAPPING_BIT)

-#define JOBCTL_PENDING_MASK JOBCTL_STOP_PENDING
+#define JOBCTL_TRAP_MASK JOBCTL_TRAP_STOP
+#define JOBCTL_PENDING_MASK (JOBCTL_STOP_PENDING | JOBCTL_TRAP_MASK)

extern bool task_set_jobctl_pending(struct task_struct *task,
unsigned int mask);
+extern void task_clear_jobctl_trapping(struct task_struct *task);
extern void task_clear_jobctl_pending(struct task_struct *task,
unsigned int mask);

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 7f05f3a..45a8a4c 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -83,6 +83,13 @@ void __ptrace_unlink(struct task_struct *child)
spin_lock(&child->sighand->siglock);

/*
+ * Clear all pending traps and TRAPPING. TRAPPING should be
+ * cleared regardless of JOBCTL_STOP_PENDING. Do it explicitly.
+ */
+ task_clear_jobctl_pending(child, JOBCTL_TRAP_MASK);
+ task_clear_jobctl_trapping(child);
+
+ /*
* Reinstate JOBCTL_STOP_PENDING if group stop is in effect and
* @child isn't dead.
*/
@@ -246,7 +253,7 @@ static int ptrace_attach(struct task_struct *task)
spin_lock(&task->sighand->siglock);

/*
- * If the task is already STOPPED, set JOBCTL_STOP_PENDING and
+ * If the task is already STOPPED, set JOBCTL_TRAP_STOP and
* TRAPPING, and kick it so that it transits to TRACED. TRAPPING
* will be cleared if the child completes the transition or any
* event which clears the group stop states happens. We'll wait
@@ -263,8 +270,7 @@ static int ptrace_attach(struct task_struct *task)
* in and out of STOPPED are protected by siglock.
*/
if (task_is_stopped(task) &&
- task_set_jobctl_pending(task,
- JOBCTL_STOP_PENDING | JOBCTL_TRAPPING))
+ task_set_jobctl_pending(task, JOBCTL_TRAP_STOP | JOBCTL_TRAPPING))
signal_wake_up(task, 1);

spin_unlock(&task->sighand->siglock);
diff --git a/kernel/signal.c b/kernel/signal.c
index c99b8b5..b5f55ca 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -266,7 +266,7 @@ bool task_set_jobctl_pending(struct task_struct *task, unsigned int mask)
* CONTEXT:
* Must be called with @task->sighand->siglock held.
*/
-static void task_clear_jobctl_trapping(struct task_struct *task)
+void task_clear_jobctl_trapping(struct task_struct *task)
{
if (unlikely(task->jobctl & JOBCTL_TRAPPING)) {
task->jobctl &= ~JOBCTL_TRAPPING;
@@ -1790,13 +1790,16 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
/*
* If @why is CLD_STOPPED, we're trapping to participate in a group
* stop. Do the bookkeeping. Note that if SIGCONT was delievered
- * while siglock was released for the arch hook, PENDING could be
- * clear now. We act as if SIGCONT is received after TASK_TRACED
- * is entered - ignore it.
+ * across siglock relocks since INTERRUPT was scheduled, PENDING
+ * could be clear now. We act as if SIGCONT is received after
+ * TASK_TRACED is entered - ignore it.
*/
if (why == CLD_STOPPED && (current->jobctl & JOBCTL_STOP_PENDING))
gstop_done = task_participate_group_stop(current);

+ /* any trap clears pending STOP trap */
+ task_clear_jobctl_pending(current, JOBCTL_TRAP_STOP);
+
/* entering a trap, clear TRAPPING */
task_clear_jobctl_trapping(current);

@@ -1888,13 +1891,30 @@ void ptrace_notify(int exit_code)
spin_unlock_irq(&current->sighand->siglock);
}

-/*
- * This performs the stopping for SIGSTOP and other stop signals.
- * We have to stop all threads in the thread group.
- * Returns non-zero if we've actually stopped and released the siglock.
- * Returns zero if we didn't stop and still hold the siglock.
+/**
+ * do_signal_stop - handle group stop for SIGSTOP and other stop signals
+ * @signr: signr causing group stop if initiating
+ *
+ * If %JOBCTL_STOP_PENDING is not set yet, initiate group stop with @signr
+ * and participate in it. If already set, participate in the existing
+ * group stop. If participated in a group stop (and thus slept), %true is
+ * returned with siglock released.
+ *
+ * If ptraced, this function doesn't handle stop itself. Instead,
+ * %JOBCTL_TRAP_STOP is scheduled and %false is returned with siglock
+ * untouched. The caller must ensure that INTERRUPT trap handling takes
+ * places afterwards.
+ *
+ * CONTEXT:
+ * Must be called with @current->sighand->siglock held, which is released
+ * on %true return.
+ *
+ * RETURNS:
+ * %false if group stop is already cancelled or ptrace trap is scheduled.
+ * %true if participated in group stop.
*/
-static int do_signal_stop(int signr)
+static bool do_signal_stop(int signr)
+ __releases(&current->sighand->siglock)
{
struct signal_struct *sig = current->signal;

@@ -1907,7 +1927,7 @@ static int do_signal_stop(int signr)

if (!likely(current->jobctl & JOBCTL_STOP_DEQUEUED) ||
unlikely(signal_group_exit(sig)))
- return 0;
+ return false;
/*
* There is no group stop already in progress. We must
* initiate one now.
@@ -1951,7 +1971,7 @@ static int do_signal_stop(int signr)
}
}
}
-retry:
+
if (likely(!task_ptrace(current))) {
int notify = 0;

@@ -1983,27 +2003,33 @@ retry:

/* Now we don't run again until woken by SIGCONT or SIGKILL */
schedule();
-
- spin_lock_irq(&current->sighand->siglock);
+ return true;
} else {
- ptrace_stop(current->jobctl & JOBCTL_STOP_SIGMASK,
- CLD_STOPPED, 0, NULL);
- current->exit_code = 0;
- }
-
- /*
- * JOBCTL_STOP_PENDING could be set if another group stop has
- * started since being woken up or ptrace wants us to transit
- * between TASK_STOPPED and TRACED. Retry group stop.
- */
- if (current->jobctl & JOBCTL_STOP_PENDING) {
- WARN_ON_ONCE(!(current->jobctl & JOBCTL_STOP_SIGMASK));
- goto retry;
+ /*
+ * While ptraced, group stop is handled by STOP trap.
+ * Schedule it and let the caller deal with it.
+ */
+ task_set_jobctl_pending(current, JOBCTL_TRAP_STOP);
+ return false;
}
+}

- spin_unlock_irq(&current->sighand->siglock);
+/**
+ * do_jobctl_trap - take care of ptrace jobctl traps
+ *
+ * It is currently used only to trap for group stop while ptraced.
+ *
+ * CONTEXT:
+ * Must be called with @current->sighand->siglock held, which may be
+ * released and re-acquired before returning with intervening sleep.
+ */
+static void do_jobctl_trap(void)
+{
+ int signr = current->jobctl & JOBCTL_STOP_SIGMASK;

- return 1;
+ WARN_ON_ONCE(!signr);
+ ptrace_stop(signr, CLD_STOPPED, 0, NULL);
+ current->exit_code = 0;
}

static int ptrace_signal(int signr, siginfo_t *info,
@@ -2110,6 +2136,12 @@ relock:
do_signal_stop(0))
goto relock;

+ if (unlikely(current->jobctl & JOBCTL_TRAP_MASK)) {
+ do_jobctl_trap();
+ spin_unlock_irq(&sighand->siglock);
+ goto relock;
+ }
+
signr = dequeue_signal(current, &current->blocked, info);

if (!signr)
--
1.7.5.2

2011-06-14 09:21:07

by Tejun Heo

Subject: [PATCH 2/5] ptrace: implement PTRACE_SEIZE

PTRACE_ATTACH implicitly issues SIGSTOP on attach which has side
effects on tracee signal and job control states. This patch
implements a new ptrace request PTRACE_SEIZE which attaches a tracee
without trapping it or affecting its signal and job control states.

The usage is the same as PTRACE_ATTACH but it takes PTRACE_SEIZE_*
flags in @data. Currently, the only defined flag is
PTRACE_SEIZE_DEVEL which is a temporary flag to enable PTRACE_SEIZE.
PTRACE_SEIZE will change ptrace behaviors outside of attach itself.
The changes will be implemented gradually and the DEVEL flag is to
prevent programs which expect full SEIZE behavior from using it before
all the behavior modifications are complete while allowing unit
testing. The flag will be removed once SEIZE behaviors are completely
implemented.

* PTRACE_SEIZE, unlike ATTACH, doesn't force tracee to trap. After
attaching tracee continues to run unless a trap condition occurs.

* PTRACE_SEIZE doesn't affect signal or group stop state.

* If PTRACE_SEIZE'd, group stop uses PTRACE_EVENT_STOP trap which uses
exit_code of (signr | PTRACE_EVENT_STOP << 8) where signr is one of
the stopping signals if group stop is in effect or SIGTRAP
otherwise, and returns usual trap siginfo on PTRACE_GETSIGINFO
instead of NULL.

Seizing sets PT_SEIZED in ->ptrace of the tracee. This flag will be
used to determine whether new SEIZE behaviors should be enabled.
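
The attach-time decisions described above can be modelled in userspace
as follows. This is a sketch, not the kernel code: the SKETCH_*
constants mirror values from this patch (and PTRACE_ATTACH, 16), the
function names are hypothetical, and permission checks are omitted.

```c
#include <errno.h>
#include <stdbool.h>

#define SKETCH_PTRACE_ATTACH		16
#define SKETCH_PTRACE_SEIZE		0x4206
#define SKETCH_PTRACE_SEIZE_DEVEL	0x80000000UL
#define SKETCH_PT_PTRACED		0x00000001u
#define SKETCH_PT_SEIZED		0x00010000u

/*
 * Returns 0 and fills *ptrace_flags, or -EIO when SEIZE is requested
 * without the temporary DEVEL flag.
 */
static int sketch_attach_flags(long request, unsigned long data,
			       unsigned int *ptrace_flags)
{
	bool seize = (request == SKETCH_PTRACE_SEIZE);

	if (seize && !(data & SKETCH_PTRACE_SEIZE_DEVEL))
		return -EIO;

	*ptrace_flags = SKETCH_PT_PTRACED;
	if (seize)
		*ptrace_flags |= SKETCH_PT_SEIZED;
	return 0;
}

/* Only a plain ATTACH sends the implicit SIGSTOP; SEIZE does not. */
static bool sketch_attach_sends_sigstop(long request)
{
	return request != SKETCH_PTRACE_SEIZE;
}
```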

Test program follows.

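#include <signal.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

#define PTRACE_SEIZE 0x4206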
#define PTRACE_SEIZE_DEVEL 0x80000000

static const struct timespec ts100ms = { .tv_nsec = 100000000 };
static const struct timespec ts1s = { .tv_sec = 1 };
static const struct timespec ts3s = { .tv_sec = 3 };

int main(int argc, char **argv)
{
pid_t tracee;

tracee = fork();
if (tracee == 0) {
nanosleep(&ts100ms, NULL);
while (1) {
printf("tracee: alive\n");
nanosleep(&ts1s, NULL);
}
}

if (argc > 1)
kill(tracee, SIGSTOP);

nanosleep(&ts100ms, NULL);

ptrace(PTRACE_SEIZE, tracee, NULL,
(void *)(unsigned long)PTRACE_SEIZE_DEVEL);
if (argc > 1) {
waitid(P_PID, tracee, NULL, WSTOPPED);
ptrace(PTRACE_CONT, tracee, NULL, NULL);
}
nanosleep(&ts3s, NULL);
printf("tracer: exiting\n");
return 0;
}

When the above program is called without an argument, tracee is seized
while running and remains running. When tracer exits, tracee continues
to run and print out messages.

# ./test-seize-simple
tracee: alive
tracee: alive
tracee: alive
tracer: exiting
tracee: alive
tracee: alive

When called with an argument, tracee is seized from stopped state and
continued, and returns to stopped state when tracer exits.

# ./test-seize
tracee: alive
tracee: alive
tracee: alive
tracer: exiting
# ps -el|grep test-seize
1 T 0 4720 1 0 80 0 - 941 signal ttyS0 00:00:00 test-seize

-v2: SEIZE doesn't schedule TRAP_STOP and leaves tracee running as Jan
suggested.

-v3: PTRACE_EVENT_STOP traps now report group stop state by signr. If
group stop is in effect the stop signal number is returned as
part of exit_code; otherwise, SIGTRAP. This was suggested by
Denys and Oleg.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Jan Kratochvil <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: Oleg Nesterov <[email protected]>
---
include/linux/ptrace.h | 7 +++++++
kernel/ptrace.c | 35 +++++++++++++++++++++++++++++------
kernel/signal.c | 39 ++++++++++++++++++++++++++++++---------
3 files changed, 66 insertions(+), 15 deletions(-)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index e93ef1a..67ad3f1 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -47,6 +47,11 @@
#define PTRACE_GETREGSET 0x4204
#define PTRACE_SETREGSET 0x4205

+#define PTRACE_SEIZE 0x4206
+
+/* flags in @data for PTRACE_SEIZE */
+#define PTRACE_SEIZE_DEVEL 0x80000000 /* temp flag for development */
+
/* options set using PTRACE_SETOPTIONS */
#define PTRACE_O_TRACESYSGOOD 0x00000001
#define PTRACE_O_TRACEFORK 0x00000002
@@ -65,6 +70,7 @@
#define PTRACE_EVENT_EXEC 4
#define PTRACE_EVENT_VFORK_DONE 5
#define PTRACE_EVENT_EXIT 6
+#define PTRACE_EVENT_STOP 7

#include <asm/ptrace.h>

@@ -77,6 +83,7 @@
* flags. When the a task is stopped the ptracer owns task->ptrace.
*/

+#define PT_SEIZED 0x00010000 /* SEIZE used, enable new behavior */
#define PT_PTRACED 0x00000001
#define PT_DTRACE 0x00000002 /* delayed trace (used on m68k, i386) */
#define PT_TRACESYSGOOD 0x00000004
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 45a8a4c..dcf9f97 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -209,10 +209,28 @@ bool ptrace_may_access(struct task_struct *task, unsigned int mode)
return !err;
}

-static int ptrace_attach(struct task_struct *task)
+static int ptrace_attach(struct task_struct *task, long request,
+ unsigned long flags)
{
+ bool seize = (request == PTRACE_SEIZE);
int retval;

+ /*
+ * SEIZE will enable new ptrace behaviors which will be implemented
+ * gradually. SEIZE_DEVEL is used to prevent applications
+ * expecting full SEIZE behaviors trapping on kernel commits which
+ * are still in the process of implementing them.
+ *
+ * Only test programs for new ptrace behaviors being implemented
+ * should set SEIZE_DEVEL. If unset, SEIZE will fail with -EIO.
+ *
+ * Once SEIZE behaviors are completely implemented, this flag and
+ * the following test will be removed.
+ */
+ retval = -EIO;
+ if (seize && !(flags & PTRACE_SEIZE_DEVEL))
+ goto out;
+
audit_ptrace(task);

retval = -EPERM;
@@ -244,11 +262,16 @@ static int ptrace_attach(struct task_struct *task)
goto unlock_tasklist;

task->ptrace = PT_PTRACED;
+ if (seize)
+ task->ptrace |= PT_SEIZED;
if (task_ns_capable(task, CAP_SYS_PTRACE))
task->ptrace |= PT_PTRACE_CAP;

__ptrace_link(task, current);
- send_sig_info(SIGSTOP, SEND_SIG_FORCED, task);
+
+ /* SEIZE doesn't trap tracee on attach */
+ if (!seize)
+ send_sig_info(SIGSTOP, SEND_SIG_FORCED, task);

spin_lock(&task->sighand->siglock);

@@ -785,8 +808,8 @@ SYSCALL_DEFINE4(ptrace, long, request, long, pid, unsigned long, addr,
goto out;
}

- if (request == PTRACE_ATTACH) {
- ret = ptrace_attach(child);
+ if (request == PTRACE_ATTACH || request == PTRACE_SEIZE) {
+ ret = ptrace_attach(child, request, data);
/*
* Some architectures need to do book-keeping after
* a ptrace attach.
@@ -927,8 +950,8 @@ asmlinkage long compat_sys_ptrace(compat_long_t request, compat_long_t pid,
goto out;
}

- if (request == PTRACE_ATTACH) {
- ret = ptrace_attach(child);
+ if (request == PTRACE_ATTACH || request == PTRACE_SEIZE) {
+ ret = ptrace_attach(child, request, data);
/*
* Some architectures need to do book-keeping after
* a ptrace attach.
diff --git a/kernel/signal.c b/kernel/signal.c
index b5f55ca..589292f 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1873,21 +1873,26 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
recalc_sigpending_tsk(current);
}

-void ptrace_notify(int exit_code)
+static void ptrace_do_notify(int signr, int exit_code, int why)
{
siginfo_t info;

- BUG_ON((exit_code & (0x7f | ~0xffff)) != SIGTRAP);
-
memset(&info, 0, sizeof info);
- info.si_signo = SIGTRAP;
+ info.si_signo = signr;
info.si_code = exit_code;
info.si_pid = task_pid_vnr(current);
info.si_uid = current_uid();

/* Let the debugger run. */
+ ptrace_stop(exit_code, why, 1, &info);
+}
+
+void ptrace_notify(int exit_code)
+{
+ BUG_ON((exit_code & (0x7f | ~0xffff)) != SIGTRAP);
+
spin_lock_irq(&current->sighand->siglock);
- ptrace_stop(exit_code, CLD_TRAPPED, 1, &info);
+ ptrace_do_notify(SIGTRAP, exit_code, CLD_TRAPPED);
spin_unlock_irq(&current->sighand->siglock);
}

@@ -2017,7 +2022,13 @@ static bool do_signal_stop(int signr)
/**
* do_jobctl_trap - take care of ptrace jobctl traps
*
- * It is currently used only to trap for group stop while ptraced.
+ * When PT_SEIZED, it's used for both group stop and explicit
+ * SEIZE/INTERRUPT traps. Both generate PTRACE_EVENT_STOP trap with
+ * accompanying siginfo. If stopped, lower eight bits of exit_code contain
+ * the stop signal; otherwise, %SIGTRAP.
+ *
+ * When !PT_SEIZED, it's used only for group stop trap with stop signal
+ * number as exit_code and no siginfo.
*
* CONTEXT:
* Must be called with @current->sighand->siglock held, which may be
@@ -2025,11 +2036,21 @@ static bool do_signal_stop(int signr)
*/
static void do_jobctl_trap(void)
{
+ struct signal_struct *signal = current->signal;
int signr = current->jobctl & JOBCTL_STOP_SIGMASK;

- WARN_ON_ONCE(!signr);
- ptrace_stop(signr, CLD_STOPPED, 0, NULL);
- current->exit_code = 0;
+ if (current->ptrace & PT_SEIZED) {
+ if (!signal->group_stop_count &&
+ !(signal->flags & SIGNAL_STOP_STOPPED))
+ signr = SIGTRAP;
+ WARN_ON_ONCE(!signr);
+ ptrace_do_notify(signr, signr | (PTRACE_EVENT_STOP << 8),
+ CLD_STOPPED);
+ } else {
+ WARN_ON_ONCE(!signr);
+ ptrace_stop(signr, CLD_STOPPED, 0, NULL);
+ current->exit_code = 0;
+ }
}

static int ptrace_signal(int signr, siginfo_t *info,
--
1.7.5.2

2011-06-14 09:21:05

by Tejun Heo

Subject: [PATCH 3/5] ptrace: implement PTRACE_INTERRUPT

Currently, there's no way to trap a running ptracee short of sending a
signal which has various side effects. This patch implements
PTRACE_INTERRUPT which traps ptracee without any signal or job control
related side effect.

The implementation is almost trivial. It uses the group stop trap -
SIGTRAP | PTRACE_EVENT_STOP << 8. The JOBCTL_TRAP_STOP flag, which is
cleared when any trap happens, is set on PTRACE_INTERRUPT. As
INTERRUPT should be usable regardless of the current state of tracee,
task_is_traced() test in ptrace_check_attach() is skipped for
INTERRUPT.

PTRACE_INTERRUPT is available iff tracee is attached with
PTRACE_SEIZE.

Test program follows.

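#include <signal.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

#define PTRACE_SEIZE 0x4206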
#define PTRACE_INTERRUPT 0x4207

#define PTRACE_SEIZE_DEVEL 0x80000000

static const struct timespec ts100ms = { .tv_nsec = 100000000 };
static const struct timespec ts1s = { .tv_sec = 1 };
static const struct timespec ts3s = { .tv_sec = 3 };

int main(int argc, char **argv)
{
pid_t tracee;

tracee = fork();
if (tracee == 0) {
nanosleep(&ts100ms, NULL);
while (1) {
printf("tracee: alive pid=%d\n", getpid());
nanosleep(&ts1s, NULL);
}
}

if (argc > 1)
kill(tracee, SIGSTOP);

nanosleep(&ts100ms, NULL);

ptrace(PTRACE_SEIZE, tracee, NULL,
(void *)(unsigned long)PTRACE_SEIZE_DEVEL);
if (argc > 1) {
waitid(P_PID, tracee, NULL, WSTOPPED);
ptrace(PTRACE_CONT, tracee, NULL, NULL);
}
nanosleep(&ts3s, NULL);

printf("tracer: INTERRUPT and DETACH\n");
ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL);
waitid(P_PID, tracee, NULL, WSTOPPED);
ptrace(PTRACE_DETACH, tracee, NULL, NULL);
nanosleep(&ts3s, NULL);

printf("tracer: exiting\n");
kill(tracee, SIGKILL);
return 0;
}

When called without argument, tracee is seized from running state,
interrupted and then detached back to running state.

# ./test-interrupt
tracee: alive pid=4546
tracee: alive pid=4546
tracee: alive pid=4546
tracer: INTERRUPT and DETACH
tracee: alive pid=4546
tracee: alive pid=4546
tracee: alive pid=4546
tracer: exiting

When called with argument, tracee is seized from stopped state,
continued, interrupted and then detached back to stopped state.

# ./test-interrupt 1
tracee: alive pid=4548
tracee: alive pid=4548
tracee: alive pid=4548
tracer: INTERRUPT and DETACH
tracer: exiting

Before PTRACE_INTERRUPT, once the tracee was running, there was no way
to trap tracee and do PTRACE_DETACH without causing side effects.

-v2: Updated to use task_set_jobctl_pending() so that it doesn't end
up scheduling TRAP_STOP if child is dying which may make the
child unkillable. Spotted by Oleg.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Oleg Nesterov <[email protected]>
---
include/linux/ptrace.h | 1 +
kernel/ptrace.c | 29 +++++++++++++++++++++++++++--
2 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 67ad3f1..ad754d1 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -48,6 +48,7 @@
#define PTRACE_SETREGSET 0x4205

#define PTRACE_SEIZE 0x4206
+#define PTRACE_INTERRUPT 0x4207

/* flags in @data for PTRACE_SEIZE */
#define PTRACE_SEIZE_DEVEL 0x80000000 /* temp flag for development */
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index dcf9f97..6852c0f 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -658,10 +658,12 @@ static int ptrace_regset(struct task_struct *task, int req, unsigned int type,
int ptrace_request(struct task_struct *child, long request,
unsigned long addr, unsigned long data)
{
+ bool seized = child->ptrace & PT_SEIZED;
int ret = -EIO;
siginfo_t siginfo;
void __user *datavp = (void __user *) data;
unsigned long __user *datalp = datavp;
+ unsigned long flags;

switch (request) {
case PTRACE_PEEKTEXT:
@@ -694,6 +696,27 @@ int ptrace_request(struct task_struct *child, long request,
ret = ptrace_setsiginfo(child, &siginfo);
break;

+ case PTRACE_INTERRUPT:
+ /*
+ * Stop tracee without any side-effect on signal or job
+ * control. At least one trap is guaranteed to happen
+ * after this request. If @child is already trapped, the
+ * current trap is not disturbed and another trap will
+ * happen after the current trap is ended with PTRACE_CONT.
+ *
+ * The actual trap might not be PTRACE_EVENT_STOP trap but
+ * the pending condition is cleared regardless.
+ */
+ if (unlikely(!seized || !lock_task_sighand(child, &flags)))
+ break;
+
+ if (likely(task_set_jobctl_pending(child, JOBCTL_TRAP_STOP)))
+ signal_wake_up(child, 0);
+
+ unlock_task_sighand(child, &flags);
+ ret = 0;
+ break;
+
case PTRACE_DETACH: /* detach a process that was attached. */
ret = ptrace_detach(child, data);
break;
@@ -819,7 +842,8 @@ SYSCALL_DEFINE4(ptrace, long, request, long, pid, unsigned long, addr,
goto out_put_task_struct;
}

- ret = ptrace_check_attach(child, request == PTRACE_KILL);
+ ret = ptrace_check_attach(child, request == PTRACE_KILL ||
+ request == PTRACE_INTERRUPT);
if (ret < 0)
goto out_put_task_struct;

@@ -961,7 +985,8 @@ asmlinkage long compat_sys_ptrace(compat_long_t request, compat_long_t pid,
goto out_put_task_struct;
}

- ret = ptrace_check_attach(child, request == PTRACE_KILL);
+ ret = ptrace_check_attach(child, request == PTRACE_KILL ||
+ request == PTRACE_INTERRUPT);
if (!ret)
ret = compat_arch_ptrace(child, request, addr, data);

--
1.7.5.2

2011-06-14 09:20:36

by Tejun Heo

Subject: [PATCH 4/5] ptrace: implement TRAP_NOTIFY and use it for group stop events

Currently there's no way for ptracer to find out whether group stop
finished other than polling with INTERRUPT - GETSIGINFO - CONT
sequence. This patch implements group stop notification for ptracer
using STOP traps.

When group stop state of a seized tracee changes, JOBCTL_TRAP_NOTIFY
is set, which schedules a STOP trap which is sticky - it isn't cleared
by other traps and at least one STOP trap will happen eventually.
STOP trap is synchronization point for event notification and the
tracer can determine the current group stop state by looking at the
signal number portion of exit code (si_status from waitid(2) or
si_code from PTRACE_GETSIGINFO).
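
The stickiness of NOTIFY relative to the plain STOP trap bit can be
shown with a small userspace model. The bit values are taken from the
sched.h hunk of this patch; model_enter_trap() is a hypothetical
stand-in for the clearing done on trap entry in ptrace_stop().

```c
#include <stdbool.h>

/* Bit values as in the sched.h hunk of this patch. */
#define JOBCTL_TRAP_STOP	(1u << 19)
#define JOBCTL_TRAP_NOTIFY	(1u << 20)

/*
 * Hypothetical model of trap entry: any trap clears a pending STOP
 * trap, but only a PTRACE_EVENT_STOP trap clears NOTIFY.
 */
static unsigned int model_enter_trap(unsigned int jobctl, bool is_stop_trap)
{
	jobctl &= ~JOBCTL_TRAP_STOP;
	if (is_stop_trap)
		jobctl &= ~JOBCTL_TRAP_NOTIFY;
	return jobctl;
}
```

In this model a NOTIFY scheduled before, say, a syscall trap survives
that trap and is consumed only once a STOP trap is actually taken.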

Notifications are generated both on start and end of group stops but,
because group stop participation always happens before STOP trap, this
doesn't cause an extra trap while tracee is participating in group
stop. The symmetry will be useful later.

Note that this notification works iff tracee is not trapped.
Currently there is no way to be notified of group stop state changes
while tracee is trapped. This will be addressed by a later patch.

An example program follows.

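#include <signal.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

#define PTRACE_SEIZE 0x4206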
#define PTRACE_INTERRUPT 0x4207

#define PTRACE_SEIZE_DEVEL 0x80000000

static const struct timespec ts1s = { .tv_sec = 1 };

int main(int argc, char **argv)
{
pid_t tracee, tracer;
int i;

tracee = fork();
if (!tracee)
while (1)
pause();

tracer = fork();
if (!tracer) {
siginfo_t si;

ptrace(PTRACE_SEIZE, tracee, NULL,
(void *)(unsigned long)PTRACE_SEIZE_DEVEL);
ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL);
repeat:
waitid(P_PID, tracee, NULL, WSTOPPED);

ptrace(PTRACE_GETSIGINFO, tracee, NULL, &si);
if (!si.si_code) {
printf("tracer: SIG %d\n", si.si_signo);
ptrace(PTRACE_CONT, tracee, NULL,
(void *)(unsigned long)si.si_signo);
goto repeat;
}
printf("tracer: stopped=%d signo=%d\n",
si.si_signo != SIGTRAP, si.si_signo);
ptrace(PTRACE_CONT, tracee, NULL, NULL);
goto repeat;
}

for (i = 0; i < 3; i++) {
nanosleep(&ts1s, NULL);
printf("mother: SIGSTOP\n");
kill(tracee, SIGSTOP);
nanosleep(&ts1s, NULL);
printf("mother: SIGCONT\n");
kill(tracee, SIGCONT);
}
nanosleep(&ts1s, NULL);

kill(tracer, SIGKILL);
kill(tracee, SIGKILL);
return 0;
}

In the above program, tracer keeps tracee running and is notified of
each group stop state change.

# ./test-notify
tracer: stopped=0 signo=5
mother: SIGSTOP
tracer: SIG 19
tracer: stopped=1 signo=19
mother: SIGCONT
tracer: stopped=0 signo=5
tracer: SIG 18
mother: SIGSTOP
tracer: SIG 19
tracer: stopped=1 signo=19
mother: SIGCONT
tracer: stopped=0 signo=5
tracer: SIG 18
mother: SIGSTOP
tracer: SIG 19
tracer: stopped=1 signo=19
mother: SIGCONT
tracer: stopped=0 signo=5
tracer: SIG 18

Signed-off-by: Tejun Heo <[email protected]>
Cc: Oleg Nesterov <[email protected]>
---
include/linux/sched.h | 4 +++-
kernel/signal.c | 38 +++++++++++++++++++++++++++++++++++---
2 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 8bd84b8..1854def 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1811,15 +1811,17 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
#define JOBCTL_STOP_PENDING_BIT 17 /* task should stop for group stop */
#define JOBCTL_STOP_CONSUME_BIT 18 /* consume group stop count */
#define JOBCTL_TRAP_STOP_BIT 19 /* trap for STOP */
+#define JOBCTL_TRAP_NOTIFY_BIT 20 /* trap for NOTIFY */
#define JOBCTL_TRAPPING_BIT 21 /* switching to TRACED */

#define JOBCTL_STOP_DEQUEUED (1 << JOBCTL_STOP_DEQUEUED_BIT)
#define JOBCTL_STOP_PENDING (1 << JOBCTL_STOP_PENDING_BIT)
#define JOBCTL_STOP_CONSUME (1 << JOBCTL_STOP_CONSUME_BIT)
#define JOBCTL_TRAP_STOP (1 << JOBCTL_TRAP_STOP_BIT)
+#define JOBCTL_TRAP_NOTIFY (1 << JOBCTL_TRAP_NOTIFY_BIT)
#define JOBCTL_TRAPPING (1 << JOBCTL_TRAPPING_BIT)

-#define JOBCTL_TRAP_MASK JOBCTL_TRAP_STOP
+#define JOBCTL_TRAP_MASK (JOBCTL_TRAP_STOP | JOBCTL_TRAP_NOTIFY)
#define JOBCTL_PENDING_MASK (JOBCTL_STOP_PENDING | JOBCTL_TRAP_MASK)

extern bool task_set_jobctl_pending(struct task_struct *task,
diff --git a/kernel/signal.c b/kernel/signal.c
index 589292f..06177e2 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -817,6 +817,30 @@ static int check_kill_permission(int sig, struct siginfo *info,
return security_task_kill(t, info, sig, 0);
}

+/**
+ * ptrace_trap_notify - schedule trap to notify ptracer
+ * @t: tracee wanting to notify tracer
+ *
+ * This function schedules sticky ptrace trap which is cleared on the next
+ * TRAP_STOP to notify ptracer of an event. @t must have been seized by
+ * ptracer.
+ *
+ * If @t is running, STOP trap will be taken. If already trapped, STOP
+ * trap will be eventually taken without returning to userland after the
+ * existing traps are finished by PTRACE_CONT.
+ *
+ * CONTEXT:
+ * Must be called with @task->sighand->siglock held.
+ */
+static void ptrace_trap_notify(struct task_struct *t)
+{
+ WARN_ON_ONCE(!(t->ptrace & PT_SEIZED));
+ assert_spin_locked(&t->sighand->siglock);
+
+ task_set_jobctl_pending(t, JOBCTL_TRAP_NOTIFY);
+ signal_wake_up(t, 0);
+}
+
/*
* Handle magic process-wide effects of stop/continue signals. Unlike
* the signal actions, these happen immediately at signal-generation
@@ -855,7 +879,10 @@ static int prepare_signal(int sig, struct task_struct *p, int from_ancestor_ns)
do {
task_clear_jobctl_pending(t, JOBCTL_STOP_PENDING);
rm_from_queue(SIG_KERNEL_STOP_MASK, &t->pending);
- wake_up_state(t, __TASK_STOPPED);
+ if (likely(!(t->ptrace & PT_SEIZED)))
+ wake_up_state(t, __TASK_STOPPED);
+ else
+ ptrace_trap_notify(t);
} while_each_thread(p, t);

/*
@@ -1797,8 +1824,10 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
if (why == CLD_STOPPED && (current->jobctl & JOBCTL_STOP_PENDING))
gstop_done = task_participate_group_stop(current);

- /* any trap clears pending STOP trap */
+ /* any trap clears pending STOP trap, STOP trap clears NOTIFY */
task_clear_jobctl_pending(current, JOBCTL_TRAP_STOP);
+ if (info && info->si_code >> 8 == PTRACE_EVENT_STOP)
+ task_clear_jobctl_pending(current, JOBCTL_TRAP_NOTIFY);

/* entering a trap, clear TRAPPING */
task_clear_jobctl_trapping(current);
@@ -1972,7 +2001,10 @@ static bool do_signal_stop(int signr)
if (!task_is_stopped(t) &&
task_set_jobctl_pending(t, signr | gstop)) {
sig->group_stop_count++;
- signal_wake_up(t, 0);
+ if (likely(!(t->ptrace & PT_SEIZED)))
+ signal_wake_up(t, 0);
+ else
+ ptrace_trap_notify(t);
}
}
}
--
1.7.5.2

2011-06-14 09:20:47

by Tejun Heo

Subject: [PATCH 5/5] ptrace: implement PTRACE_LISTEN

The previous patch implemented async notification for ptrace but it
only worked while tracee is running. This patch introduces
PTRACE_LISTEN which is suggested by Oleg Nesterov.

It's allowed iff tracee is in STOP trap and puts tracee into
quasi-running state - tracee never really runs but wait(2) and
ptrace(2) consider it to be running. While ptracer is listening,
tracee is allowed to re-enter STOP to notify an async event.
Listening state is cleared on the first notification. Ptracer can
also clear it by issuing INTERRUPT - tracee will re-trap into STOP
with listening state cleared.

This allows ptracer to monitor group stop state without running tracee
- use INTERRUPT to put tracee into STOP trap, issue LISTEN and then
wait(2) to wait for the next group stop event. When it happens,
PTRACE_GETSIGINFO provides information to determine the current state.

Test program follows.

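#include <stdio.h>
#include <signal.h>
#include <time.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/ptrace.h>

#define PTRACE_SEIZE 0x4206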
#define PTRACE_INTERRUPT 0x4207
#define PTRACE_LISTEN 0x4208

#define PTRACE_SEIZE_DEVEL 0x80000000

static const struct timespec ts1s = { .tv_sec = 1 };

int main(int argc, char **argv)
{
pid_t tracee, tracer;
int i;

tracee = fork();
if (!tracee)
while (1)
pause();

tracer = fork();
if (!tracer) {
siginfo_t si;

ptrace(PTRACE_SEIZE, tracee, NULL,
(void *)(unsigned long)PTRACE_SEIZE_DEVEL);
ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL);
repeat:
waitid(P_PID, tracee, NULL, WSTOPPED);

ptrace(PTRACE_GETSIGINFO, tracee, NULL, &si);
if (!si.si_code) {
printf("tracer: SIG %d\n", si.si_signo);
ptrace(PTRACE_CONT, tracee, NULL,
(void *)(unsigned long)si.si_signo);
goto repeat;
}
printf("tracer: stopped=%d signo=%d\n",
si.si_signo != SIGTRAP, si.si_signo);
if (si.si_signo != SIGTRAP)
ptrace(PTRACE_LISTEN, tracee, NULL, NULL);
else
ptrace(PTRACE_CONT, tracee, NULL, NULL);
goto repeat;
}

for (i = 0; i < 3; i++) {
nanosleep(&ts1s, NULL);
printf("mother: SIGSTOP\n");
kill(tracee, SIGSTOP);
nanosleep(&ts1s, NULL);
printf("mother: SIGCONT\n");
kill(tracee, SIGCONT);
}
nanosleep(&ts1s, NULL);

kill(tracer, SIGKILL);
kill(tracee, SIGKILL);
return 0;
}

This is identical to the program to test TRAP_NOTIFY except that
tracee is PTRACE_LISTEN'd instead of PTRACE_CONT'd when group stopped.
This allows ptracer to monitor when group stop ends without running
tracee.

# ./test-listen
tracer: stopped=0 signo=5
mother: SIGSTOP
tracer: SIG 19
tracer: stopped=1 signo=19
mother: SIGCONT
tracer: stopped=0 signo=5
tracer: SIG 18
mother: SIGSTOP
tracer: SIG 19
tracer: stopped=1 signo=19
mother: SIGCONT
tracer: stopped=0 signo=5
tracer: SIG 18
mother: SIGSTOP
tracer: SIG 19
tracer: stopped=1 signo=19
mother: SIGCONT
tracer: stopped=0 signo=5
tracer: SIG 18

-v2: Moved JOBCTL_LISTENING check in wait_task_stopped() into
task_stopped_code() as suggested by Oleg.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Oleg Nesterov <[email protected]>
---
include/linux/ptrace.h | 1 +
include/linux/sched.h | 2 ++
kernel/exit.c | 3 ++-
kernel/ptrace.c | 42 +++++++++++++++++++++++++++++++++++++++---
kernel/signal.c | 13 +++++++++----
5 files changed, 53 insertions(+), 8 deletions(-)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index ad754d1..4f224f1 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -49,6 +49,7 @@

#define PTRACE_SEIZE 0x4206
#define PTRACE_INTERRUPT 0x4207
+#define PTRACE_LISTEN 0x4208

/* flags in @data for PTRACE_SEIZE */
#define PTRACE_SEIZE_DEVEL 0x80000000 /* temp flag for development */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 1854def..87f7ca7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1813,6 +1813,7 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
#define JOBCTL_TRAP_STOP_BIT 19 /* trap for STOP */
#define JOBCTL_TRAP_NOTIFY_BIT 20 /* trap for NOTIFY */
#define JOBCTL_TRAPPING_BIT 21 /* switching to TRACED */
+#define JOBCTL_LISTENING_BIT 22 /* ptracer is listening for events */

#define JOBCTL_STOP_DEQUEUED (1 << JOBCTL_STOP_DEQUEUED_BIT)
#define JOBCTL_STOP_PENDING (1 << JOBCTL_STOP_PENDING_BIT)
@@ -1820,6 +1821,7 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
#define JOBCTL_TRAP_STOP (1 << JOBCTL_TRAP_STOP_BIT)
#define JOBCTL_TRAP_NOTIFY (1 << JOBCTL_TRAP_NOTIFY_BIT)
#define JOBCTL_TRAPPING (1 << JOBCTL_TRAPPING_BIT)
+#define JOBCTL_LISTENING (1 << JOBCTL_LISTENING_BIT)

#define JOBCTL_TRAP_MASK (JOBCTL_TRAP_STOP | JOBCTL_TRAP_NOTIFY)
#define JOBCTL_PENDING_MASK (JOBCTL_STOP_PENDING | JOBCTL_TRAP_MASK)
diff --git a/kernel/exit.c b/kernel/exit.c
index 20a4064..289f59d 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1368,7 +1368,8 @@ static int wait_task_zombie(struct wait_opts *wo, struct task_struct *p)
static int *task_stopped_code(struct task_struct *p, bool ptrace)
{
if (ptrace) {
- if (task_is_stopped_or_traced(p))
+ if (task_is_stopped_or_traced(p) &&
+ !(p->jobctl & JOBCTL_LISTENING))
return &p->exit_code;
} else {
if (p->signal->flags & SIGNAL_STOP_STOPPED)
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 6852c0f..e18966c 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -146,7 +146,8 @@ int ptrace_check_attach(struct task_struct *child, bool ignore_state)
*/
spin_lock_irq(&child->sighand->siglock);
WARN_ON_ONCE(task_is_stopped(child));
- if (task_is_traced(child) || ignore_state)
+ if (ignore_state || (task_is_traced(child) &&
+ !(child->jobctl & JOBCTL_LISTENING)))
ret = 0;
spin_unlock_irq(&child->sighand->siglock);
}
@@ -660,7 +661,7 @@ int ptrace_request(struct task_struct *child, long request,
{
bool seized = child->ptrace & PT_SEIZED;
int ret = -EIO;
- siginfo_t siginfo;
+ siginfo_t siginfo, *si;
void __user *datavp = (void __user *) data;
unsigned long __user *datalp = datavp;
unsigned long flags;
@@ -710,8 +711,43 @@ int ptrace_request(struct task_struct *child, long request,
if (unlikely(!seized || !lock_task_sighand(child, &flags)))
break;

+ /*
+ * INTERRUPT doesn't disturb existing trap sans one
+ * exception. If ptracer issued LISTEN for the current
+ * STOP, this INTERRUPT should clear LISTEN and re-trap
+ * tracee into STOP.
+ */
if (likely(task_set_jobctl_pending(child, JOBCTL_TRAP_STOP)))
- signal_wake_up(child, 0);
+ signal_wake_up(child, child->jobctl & JOBCTL_LISTENING);
+
+ unlock_task_sighand(child, &flags);
+ ret = 0;
+ break;
+
+ case PTRACE_LISTEN:
+ /*
+ * Listen for events. Tracee must be in STOP. It's not
+ * resumed per-se but is not considered to be in TRACED by
+ * wait(2) or ptrace(2). If an async event (e.g. group
+ * stop state change) happens, tracee will enter STOP trap
+ * again. Alternatively, ptracer can issue INTERRUPT to
+ * finish listening and re-trap tracee into STOP.
+ */
+ if (unlikely(!seized || !lock_task_sighand(child, &flags)))
+ break;
+
+ si = child->last_siginfo;
+ if (unlikely(!si || si->si_code >> 8 != PTRACE_EVENT_STOP))
+ break;
+
+ child->jobctl |= JOBCTL_LISTENING;
+
+ /*
+ * If NOTIFY is set, it means event happened between start
+ * of this trap and now. Trigger re-trap immediately.
+ */
+ if (child->jobctl & JOBCTL_TRAP_NOTIFY)
+ signal_wake_up(child, true);

unlock_task_sighand(child, &flags);
ret = 0;
diff --git a/kernel/signal.c b/kernel/signal.c
index 06177e2..97e575a 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -825,9 +825,11 @@ static int check_kill_permission(int sig, struct siginfo *info,
* TRAP_STOP to notify ptracer of an event. @t must have been seized by
* ptracer.
*
- * If @t is running, STOP trap will be taken. If already trapped, STOP
- * trap will be eventually taken without returning to userland after the
- * existing traps are finished by PTRACE_CONT.
+ * If @t is running, STOP trap will be taken. If trapped for STOP and
+ * ptracer is listening for events, tracee is woken up so that it can
+ * re-trap for the new event. If trapped otherwise, STOP trap will be
+ * eventually taken without returning to userland after the existing traps
+ * are finished by PTRACE_CONT.
*
* CONTEXT:
* Must be called with @task->sighand->siglock held.
@@ -838,7 +840,7 @@ static void ptrace_trap_notify(struct task_struct *t)
assert_spin_locked(&t->sighand->siglock);

task_set_jobctl_pending(t, JOBCTL_TRAP_NOTIFY);
- signal_wake_up(t, 0);
+ signal_wake_up(t, t->jobctl & JOBCTL_LISTENING);
}

/*
@@ -1894,6 +1896,9 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
spin_lock_irq(&current->sighand->siglock);
current->last_siginfo = NULL;

+ /* LISTENING can be set only during STOP traps, clear it */
+ current->jobctl &= ~JOBCTL_LISTENING;
+
/*
* Queued signals ignored us while we were stopped for tracing.
* So check for any that we should take before resuming user mode.
--
1.7.5.2

2011-06-16 19:46:55

by Oleg Nesterov

Subject: Re: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#5

On 06/14, Tejun Heo wrote:
>
> This patchset contains the following five patches.
>
> 0001-job-control-introduce-JOBCTL_TRAP_STOP-and-use-it-fo.patch
> 0002-ptrace-implement-PTRACE_SEIZE.patch
> 0003-ptrace-implement-PTRACE_INTERRUPT.patch
> 0004-ptrace-implement-TRAP_NOTIFY-and-use-it-for-group-st.patch
> 0005-ptrace-implement-PTRACE_LISTEN.patch

Applied, thanks.

Oleg.

2011-06-16 19:54:07

by Oleg Nesterov

Subject: Re: [PATCH 4/5] ptrace: implement TRAP_NOTIFY and use it for group stop events

On 06/14, Tejun Heo wrote:
>
> When group stop state of a seized tracee changes, JOBCTL_TRAP_NOTIFY
> is set,

I already applied this series. But somehow I have the fuzzy feeling we
can simplify JOBCTL_TRAP_NOTIFY/JOBCTL_TRAP_STOP logic later. No, I can't
explain what I mean ;)

One question,

> @@ -1797,8 +1824,10 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
> if (why == CLD_STOPPED && (current->jobctl & JOBCTL_STOP_PENDING))
> gstop_done = task_participate_group_stop(current);
>
> - /* any trap clears pending STOP trap */
> + /* any trap clears pending STOP trap, STOP trap clears NOTIFY */
> task_clear_jobctl_pending(current, JOBCTL_TRAP_STOP);
> + if (info && info->si_code >> 8 == PTRACE_EVENT_STOP)
> + task_clear_jobctl_pending(current, JOBCTL_TRAP_NOTIFY);

OK, but can't we check why == CLD_STOPPED instead of PTRACE_EVENT_STOP?

In fact, can't we move all code above under 'if (why == CLD_STOPPED)' ?
JOBCTL_TRAP_STOP can't be set otherwise, no? I am almost sure I missed
something though.

Oleg.

2011-06-17 15:12:43

by Tejun Heo

Subject: Re: [PATCH 4/5] ptrace: implement TRAP_NOTIFY and use it for group stop events

Hello, Oleg.

On Thu, Jun 16, 2011 at 09:51:38PM +0200, Oleg Nesterov wrote:
> I already applied this series. But somehow I have the fuzzy feeling we
> can simplify JOBCTL_TRAP_NOTIFY/JOBCTL_TRAP_STOP logic later. No, I can't
> explain what I mean ;)

Heh, yeah, please go ahead.

> One question,
>
> > @@ -1797,8 +1824,10 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
> > if (why == CLD_STOPPED && (current->jobctl & JOBCTL_STOP_PENDING))
> > gstop_done = task_participate_group_stop(current);
> >
> > - /* any trap clears pending STOP trap */
> > + /* any trap clears pending STOP trap, STOP trap clears NOTIFY */
> > task_clear_jobctl_pending(current, JOBCTL_TRAP_STOP);
> > + if (info && info->si_code >> 8 == PTRACE_EVENT_STOP)
> > + task_clear_jobctl_pending(current, JOBCTL_TRAP_NOTIFY);
>
> OK, but can't we check why == CLD_STOPPED instead of PTRACE_EVENT_STOP?

Yeap, sure. The reason why I used PTRACE_EVENT_STOP was because
PTRACE_LISTEN needs the same test and it doesn't have access to @why.
Maybe it's better to create ptrace_is_trap_stop(si)?
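
A minimal userspace sketch of such a helper, assuming the name
ptrace_is_trap_stop() and, for simplicity, that it takes the raw si_code
instead of a siginfo_t pointer. Neither is defined by the applied patches,
and the check function is purely illustrative:

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

#define PTRACE_EVENT_STOP 7	/* value proposed in patch 2/5 of this series */

/* Hypothetical helper: true if @si_code encodes a PTRACE_EVENT_STOP trap. */
static inline bool ptrace_is_trap_stop(int si_code)
{
	return (si_code >> 8) == PTRACE_EVENT_STOP;
}

/* Illustrative checks; the signr part may be SIGTRAP or a stop signal. */
static inline void check_ptrace_is_trap_stop(void)
{
	assert(ptrace_is_trap_stop(SIGTRAP | (PTRACE_EVENT_STOP << 8)));
	assert(ptrace_is_trap_stop(SIGSTOP | (PTRACE_EVENT_STOP << 8)));
	assert(!ptrace_is_trap_stop(SIGTRAP | (1 << 8)));	/* e.g. FORK event */
	assert(!ptrace_is_trap_stop(0));			/* signal delivery */
}
```

Both ptrace_stop() and PTRACE_LISTEN could then share the one test.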

> In fact, can't we move all code above under 'if (why == CLD_STOPPED)' ?
> JOBCTL_TRAP_STOP can't be set otherwise, no? I am almost sure I missed
> something though.

JOBCTL_TRAP_STOP should also be cleared on CLD_TRAPPED traps. ie. If
the ptracer does PTRACE_INTERRUPT and then wait(2) reports
PTRACE_EVENT_FORK, there won't be another PTRACE_EVENT_STOP.
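
The clearing rule can be modelled in userspace as a rough sketch. The bit
positions mirror the JOBCTL_* definitions in the patch, while enter_trap()
is a hypothetical stand-in for the relevant part of ptrace_stop(), not
kernel code:

```c
#include <assert.h>
#include <signal.h>

#define JOBCTL_TRAP_STOP	(1u << 19)
#define JOBCTL_TRAP_NOTIFY	(1u << 20)

#define PTRACE_EVENT_FORK	1
#define PTRACE_EVENT_STOP	7

/* Hypothetical model: jobctl bits that remain after entering a trap. */
static inline unsigned int enter_trap(unsigned int jobctl, int si_code)
{
	/* any trap clears a pending STOP trap ... */
	jobctl &= ~JOBCTL_TRAP_STOP;
	/* ... but only a STOP trap clears NOTIFY */
	if ((si_code >> 8) == PTRACE_EVENT_STOP)
		jobctl &= ~JOBCTL_TRAP_NOTIFY;
	return jobctl;
}

/* Illustrative check of the INTERRUPT-then-FORK case described above. */
static inline void check_enter_trap(void)
{
	int fork_code = SIGTRAP | (PTRACE_EVENT_FORK << 8);

	/* TRAP_STOP set by INTERRUPT is consumed by the FORK trap */
	assert(enter_trap(JOBCTL_TRAP_STOP, fork_code) == 0);
}
```

In this model a pending NOTIFY survives the FORK trap and is only dropped
once a STOP trap is actually taken.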

Thanks.

--
tejun

2011-06-17 18:33:49

by Oleg Nesterov

Subject: Re: [PATCH 4/5] ptrace: implement TRAP_NOTIFY and use it for group stop events

On 06/17, Tejun Heo wrote:
>
> Hello, Oleg.
>
> On Thu, Jun 16, 2011 at 09:51:38PM +0200, Oleg Nesterov wrote:
> > I already applied this series. But somehow I have the fuzzy feeling we
> > can simplify JOBCTL_TRAP_NOTIFY/JOBCTL_TRAP_STOP logic later. No, I can't
> > explain what I mean ;)
>
> Heh, yeah, please go ahead.
>
> > One question,
> >
> > > @@ -1797,8 +1824,10 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
> > > if (why == CLD_STOPPED && (current->jobctl & JOBCTL_STOP_PENDING))
> > > gstop_done = task_participate_group_stop(current);
> > >
> > > - /* any trap clears pending STOP trap */
> > > + /* any trap clears pending STOP trap, STOP trap clears NOTIFY */
> > > task_clear_jobctl_pending(current, JOBCTL_TRAP_STOP);
> > > + if (info && info->si_code >> 8 == PTRACE_EVENT_STOP)
> > > + task_clear_jobctl_pending(current, JOBCTL_TRAP_NOTIFY);
> >
> > OK, but can't we check why == CLD_STOPPED instead of PTRACE_EVENT_STOP?
>
> Yeap, sure. The reason why I used PTRACE_EVENT_STOP was because
> PTRACE_LISTEN needs the same test and it doesn't have access to @why.
> Maybe it's better to create ptrace_is_trap_stop(si)?

Sure, PTRACE_EVENT_STOP have to look at info (although we could add
another JOBCTL_ bit). But since ptrace_stop() checks CLD_STOPPED anyway
we could do

if (why == CLD_STOPPED) {
if (current->jobctl & JOBCTL_STOP_PENDING)
gstop_done = task_participate_group_stop(current);
task_clear_jobctl_pending(current, JOBCTL_TRAP_NOTIFY);
}

as a microcleanup. OK, please forget, this is minor.

> > In fact, can't we move all code above under 'if (why == CLD_STOPPED)' ?
> > JOBCTL_TRAP_STOP can't be set otherwise, no? I am almost sure I missed
> > something though.
>
> JOBCTL_TRAP_STOP should also be cleared on CLD_TRAPPED traps.

Yes, this is clear

> ie. If
> the ptracer does PTRACE_INTERRUPT

Ah, indeed, I forgot about PTRACE_INTERRUPT. Thanks.

Oleg.

2011-06-18 07:55:46

by Denys Vlasenko

Subject: Re: [PATCH 2/5] ptrace: implement PTRACE_SEIZE

On Tuesday 14 June 2011 11:20, Tejun Heo wrote:
#define PTRACE_EVENT_FORK 1
#define PTRACE_EVENT_VFORK 2
#define PTRACE_EVENT_CLONE 3
> #define PTRACE_EVENT_EXEC 4
> #define PTRACE_EVENT_VFORK_DONE 5
> #define PTRACE_EVENT_EXIT 6
> +#define PTRACE_EVENT_STOP 7

Er... these constants were corresponding exactly to
bit positions in ptrace options which enable them:

#define PTRACE_O_TRACESYSGOOD 0x00000001
#define PTRACE_O_TRACEFORK 0x00000002
#define PTRACE_O_TRACEVFORK 0x00000004
#define PTRACE_O_TRACECLONE 0x00000008
#define PTRACE_O_TRACEEXEC 0x00000010
#define PTRACE_O_TRACEVFORKDONE 0x00000020
#define PTRACE_O_TRACEEXIT 0x00000040

For example, PTRACE_O_TRACEEXEC is 4th bit, PTRACE_EVENT_EXEC is 4.

If we'd define PTRACE_EVENT_STOP as 7, any future added
PTRACE_O_foo bit with value 0x00000080 will be unable
to follow this convention.
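
The convention can be spelled out as a small sketch. The values are the
ones listed in this mail (not pulled from any system header), and
ptrace_event_to_option() is a hypothetical helper used only for
illustration:

```c
#include <assert.h>

#define PTRACE_EVENT_FORK	1
#define PTRACE_EVENT_EXEC	4
#define PTRACE_EVENT_EXIT	6
#define PTRACE_EVENT_STOP	7

#define PTRACE_O_TRACEFORK	0x00000002
#define PTRACE_O_TRACEEXEC	0x00000010
#define PTRACE_O_TRACEEXIT	0x00000040

/* Hypothetical helper: option bit matching an event under the convention. */
static inline unsigned long ptrace_event_to_option(int event)
{
	return 1UL << event;
}

/* Illustrative checks of the event number <-> option bit pairing. */
static inline void check_event_option_convention(void)
{
	assert(ptrace_event_to_option(PTRACE_EVENT_FORK) == PTRACE_O_TRACEFORK);
	assert(ptrace_event_to_option(PTRACE_EVENT_EXEC) == PTRACE_O_TRACEEXEC);
	assert(ptrace_event_to_option(PTRACE_EVENT_EXIT) == PTRACE_O_TRACEEXIT);
	/* event 7 lands on 0x80, the next free option bit */
	assert(ptrace_event_to_option(PTRACE_EVENT_STOP) == 0x00000080);
}
```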

I propose to define PTRACE_EVENT_STOP as 64 instead, leaving 64 low
PTRACE_EVENT_foo constants for possible future PTRACE_O_foo bits.

[32 should be enough too, but I feel paranoid today :)]

--
vda

2011-06-18 07:59:45

by Denys Vlasenko

Subject: Re: [PATCH 2/5] ptrace: implement PTRACE_SEIZE

On Saturday 18 June 2011 09:55, Denys Vlasenko wrote:
> On Tuesday 14 June 2011 11:20, Tejun Heo wrote:
> #define PTRACE_EVENT_FORK 1
> #define PTRACE_EVENT_VFORK 2
> #define PTRACE_EVENT_CLONE 3
> > #define PTRACE_EVENT_EXEC 4
> > #define PTRACE_EVENT_VFORK_DONE 5
> > #define PTRACE_EVENT_EXIT 6
> > +#define PTRACE_EVENT_STOP 7
>
> Er... these constants were corresponding exactly to
> bit positions in ptrace options which enable them:
>
> #define PTRACE_O_TRACESYSGOOD 0x00000001
> #define PTRACE_O_TRACEFORK 0x00000002
> #define PTRACE_O_TRACEVFORK 0x00000004
> #define PTRACE_O_TRACECLONE 0x00000008
> #define PTRACE_O_TRACEEXEC 0x00000010
> #define PTRACE_O_TRACEVFORKDONE 0x00000020
> #define PTRACE_O_TRACEEXIT 0x00000040
>
> For example, PTRACE_O_TRACEEXEC is 4th bit, PTRACE_EVENT_EXEC is 4.
>
> If we'd define PTRACE_EVENT_STOP as 7, any future added
> PTRACE_O_foo bit with value 0x00000080 will be unable
> to follow this convention.
>
> I propose to define PTRACE_EVENT_STOP as 64 instead, leaving 64 low
> PTRACE_EVENT_foo constants for possible future PTRACE_O_foo bits.
>
> [32 should be enough too, but I feel paranoid today :)]

...unless we plan to introduce PTRACE_O_TRACESTOP (with value 0x00000080)
which enables PTRACE_INTERRUPT and stop notifications independently
of PTRACE_SEIZE. Which would be very useful for e.g. strace.

Then, PTRACE_EVENT_STOP indeed should be 7.

--
vda

2011-06-18 08:30:40

by Tejun Heo

Subject: Re: [PATCH 2/5] ptrace: implement PTRACE_SEIZE

Hello,

On Sat, Jun 18, 2011 at 09:55:37AM +0200, Denys Vlasenko wrote:
> On Tuesday 14 June 2011 11:20, Tejun Heo wrote:
> #define PTRACE_EVENT_FORK 1
> #define PTRACE_EVENT_VFORK 2
> #define PTRACE_EVENT_CLONE 3
> > #define PTRACE_EVENT_EXEC 4
> > #define PTRACE_EVENT_VFORK_DONE 5
> > #define PTRACE_EVENT_EXIT 6
> > +#define PTRACE_EVENT_STOP 7
>
> Er... these constants were corresponding exactly to
> bit positions in ptrace options which enable them:
>
> #define PTRACE_O_TRACESYSGOOD 0x00000001
> #define PTRACE_O_TRACEFORK 0x00000002
> #define PTRACE_O_TRACEVFORK 0x00000004
> #define PTRACE_O_TRACECLONE 0x00000008
> #define PTRACE_O_TRACEEXEC 0x00000010
> #define PTRACE_O_TRACEVFORKDONE 0x00000020
> #define PTRACE_O_TRACEEXIT 0x00000040
>
> For example, PTRACE_O_TRACEEXEC is 4th bit, PTRACE_EVENT_EXEC is 4.
>
> If we'd define PTRACE_EVENT_STOP as 7, any future added
> PTRACE_O_foo bit with value 0x00000080 will be unable
> to follow this convention.

I'm not sure how this will actually play out but currently planning on
adding more implicitly enabled events on SEIZE.

> I propose to define PTRACE_EVENT_STOP as 64 instead, leaving 64 low
> PTRACE_EVENT_foo constants for possible future PTRACE_O_foo bits.
>
> [32 should be enough too, but I feel paranoid today :)]

But this might not be a bad idea. Given that we also support 32bit
archs, going over 32 wouldn't help much tho.

Thanks.

--
tejun

2011-06-18 08:36:07

by Tejun Heo

Subject: Re: [PATCH 2/5] ptrace: implement PTRACE_SEIZE

Hello,

On Sat, Jun 18, 2011 at 09:59:38AM +0200, Denys Vlasenko wrote:
> ...unless we plan to introduce PTRACE_O_TRACESTOP (with value 0x00000080)
> which enables PTRACE_INTERRUPT and stop notifications independently
> of PTRACE_SEIZE. Which would be very useful for e.g. strace.

I know you're a big fan of those option flags but I don't really see
the added value in making these behaviors optional rather than keeping
things backward compatible - ie. introducing new event needed to be
gated somehow so the O flags but SEIZE itself serves as a big gate
anyway so I don't see much point in introducing multiple selectable
behaviors. It's not like PTRACE_O_TRACESTOP is gonna make anything
drastically easier or reduce significant amount of overhead.

Thanks.

--
tejun

2011-06-18 08:57:08

by Denys Vlasenko

Subject: Re: [PATCH 2/5] ptrace: implement PTRACE_SEIZE

On Saturday 18 June 2011 10:35, Tejun Heo wrote:
> Hello,
>
> On Sat, Jun 18, 2011 at 09:59:38AM +0200, Denys Vlasenko wrote:
> > ...unless we plan to introduce PTRACE_O_TRACESTOP (with value 0x00000080)
> > which enables PTRACE_INTERRUPT and stop notifications independently
> > of PTRACE_SEIZE. Which would be very useful for e.g. strace.
>
> I know you're a big fan of those option flags but I don't really see
> the added value in making these behaviors optional rather than keeping
> things backward compatible - ie. introducing new event needed to be
> gated somehow so the O flags but SEIZE itself serves as a big gate
> anyway so I don't see much point in introducing multiple selectable
> behaviors. It's not like PTRACE_O_TRACESTOP is gonna make anything
> drastically easier or reduce significant amount of overhead.

I explained this already. strace code is a bit complex, and adding
more complexity so that it uses PTRACE_SEIZE if available, but PTRACE_ATTACH
if it is not, will add some PITA.

Considering that strace does not want PTRACE_SEIZE per se, it only wants
to have a way to properly see and handle group stops, having an option
to enable *only that functionality* without having to use PTRACE_SEIZE
will be useful for strace.

--
vda

2011-06-18 08:58:53

by Denys Vlasenko

Subject: Re: [PATCH 2/5] ptrace: implement PTRACE_SEIZE

On Saturday 18 June 2011 10:30, Tejun Heo wrote:
> > I propose to define PTRACE_EVENT_STOP as 64 instead, leaving 64 low
> > PTRACE_EVENT_foo constants for possible future PTRACE_O_foo bits.
> >
> > [32 should be enough too, but I feel paranoid today :)]
>
> But this might not be a bad idea. Given that we also support 32bit
> archs, going over 32 wouldn't help much tho.

In 2030, 32 bit will be sort of like 16 bit today :)
and we'll suddenly find it not impossible to use bit positions >= 32.

(and I will be so old it's not funny)

--
vda

2011-06-18 09:04:37

by Tejun Heo

Subject: Re: [PATCH 2/5] ptrace: implement PTRACE_SEIZE

Hello,

On Sat, Jun 18, 2011 at 10:57:02AM +0200, Denys Vlasenko wrote:
> I explained this already. strace code is a bit complex, and adding
> more complexity so that it uses PTRACE_SEIZE if available, but PTRACE_ATTACH
> if it is not, will add some PITA.
>
> Considering that strace does not want PTRACE_SEIZE per se, it only wants
> to have a way to properly see and handle group stops, having an option
> to enable *only that functionality* without having to use PTRACE_SEIZE
> will be useful for strace.

I understand that it would make strace's life somewhat easier but
don't agree the difference is significant enough to justify
introducing more options. We're talking about small number of well
defined behaviors. Yes, it wouldn't be as simple as adding several
liners during initialization but that doesn't warrant extra kernel
features and differing behaviors which, I think, in the long run, make
things much more complicated (not complex) than necessary.

Thanks.

--
tejun