2007-12-04 18:03:29

by Markus Metzger

[permalink] [raw]
Subject: [patch 0/2] x86, ptrace: support for branch trace store(BTS)

Support for Intel's last branch recording to ptrace. This gives debuggers
access to this hardware feature and allows them to show an execution trace
of the debugged application.

Last branch recording (see section 18.5 in the Intel 64 and IA-32
Architectures Software Developer's Manual) allows taking an execution
trace of the running application without instrumentation. When a branch
is executed, the hardware logs the source and destination address in a
cyclic buffer given to it by the OS.

This can be a great debugging aid. It shows you how exactly you got
where you currently are without requiring you to do lots of single
stepping and rerunning.

This patch manages the various buffers, configures the trace
hardware, disentangles the trace, and provides a user interface via
ptrace. On the high-level design:
- there is one optional trace buffer per thread_struct
- upon a context switch, the trace hardware is reconfigured to either
disable tracing or to use the appropriate buffer for the new task.
- tracing induces ~20% overhead as branch records are sent out on
the bus.
- the hardware collects trace per processor. To disentangle the
traces for different tasks, we use separate buffers and reconfigure
the trace hardware.
- the low-level data layout is configured at cpu initialization time
- different processors use different branch record formats
- the implementation is done in two layers
- the lower layer implements the DS/BTS access
- the higher layer implements a ptrace interface

Per-CPU tracing can be implemented on top of the lower layer.
A per-cpu array of DS pointers needs to be ds_allocate()'d and the
MSR_IA32_DS_AREA and MSR_IA32_DEBUGCTLMSR MSR's need to be properly
configured. Care needs to be taken to not interfere with the ptrace
use of the above MSR's.


patch 1/2 contains the kernel changes
patch 2/2 contains changes to the ptrace man pages


regards,
markus.