2008-11-26 08:44:26

by Stephane Eranian

Subject: [patch 06/24] perfmon: generic x86 definitions (x86)

This patch adds definitions for the perfmon interrupt vector
and thread info flags. It is common to i386 and x86_64 code.

Signed-off-by: Stephane Eranian <[email protected]>
--

Index: o3/arch/x86/include/asm/irq_vectors.h
===================================================================
--- o3.orig/arch/x86/include/asm/irq_vectors.h 2008-11-03 10:55:26.000000000 +0100
+++ o3/arch/x86/include/asm/irq_vectors.h 2008-11-03 10:56:12.000000000 +0100
@@ -87,6 +87,11 @@
#define LOCAL_TIMER_VECTOR 0xef

/*
+ * Perfmon PMU interrupt vector
+ */
+#define LOCAL_PERFMON_VECTOR 0xee
+
+/*
* First APIC vector available to drivers: (vectors 0x30-0xee) we
* start at 0x31(0x41) to spread out vectors evenly between priority
* levels. (0x80 is the syscall vector)
Index: o3/arch/x86/include/asm/thread_info.h
===================================================================
--- o3.orig/arch/x86/include/asm/thread_info.h 2008-11-03 10:55:14.000000000 +0100
+++ o3/arch/x86/include/asm/thread_info.h 2008-11-03 10:58:10.000000000 +0100
@@ -79,6 +79,7 @@
#define TIF_SYSCALL_EMU 6 /* syscall emulation active */
#define TIF_SYSCALL_AUDIT 7 /* syscall auditing active */
#define TIF_SECCOMP 8 /* secure computing */
+#define TIF_PERFMON_WORK 9 /* work for pfm_handle_work() */
#define TIF_MCE_NOTIFY 10 /* notify userspace of an MCE */
#define TIF_NOTSC 16 /* TSC is not accessible in userland */
#define TIF_IA32 17 /* 32bit process */
@@ -92,6 +93,7 @@
#define TIF_DEBUGCTLMSR 25 /* uses thread_struct.debugctlmsr */
#define TIF_DS_AREA_MSR 26 /* uses thread_struct.ds_area_msr */
#define TIF_BTS_TRACE_TS 27 /* record scheduling event timestamps */
+#define TIF_PERFMON_CTXSW 28 /* perfmon needs ctxsw calls */

#define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
#define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
@@ -114,6 +116,8 @@
#define _TIF_DEBUGCTLMSR (1 << TIF_DEBUGCTLMSR)
#define _TIF_DS_AREA_MSR (1 << TIF_DS_AREA_MSR)
#define _TIF_BTS_TRACE_TS (1 << TIF_BTS_TRACE_TS)
+#define _TIF_PERFMON_WORK (1 << TIF_PERFMON_WORK)
+#define _TIF_PERFMON_CTXSW (1 << TIF_PERFMON_CTXSW)

/* work to do in syscall_trace_enter() */
#define _TIF_WORK_SYSCALL_ENTRY \
@@ -135,12 +139,12 @@

/* Only used for 64 bit */
#define _TIF_DO_NOTIFY_MASK \
- (_TIF_SIGPENDING|_TIF_MCE_NOTIFY|_TIF_NOTIFY_RESUME)
+ (_TIF_SIGPENDING|_TIF_MCE_NOTIFY|_TIF_NOTIFY_RESUME|_TIF_PERFMON_WORK)

/* flags to check in __switch_to() */
#define _TIF_WORK_CTXSW \
(_TIF_IO_BITMAP|_TIF_DEBUGCTLMSR|_TIF_DS_AREA_MSR|_TIF_BTS_TRACE_TS| \
- _TIF_NOTSC)
+ _TIF_NOTSC|_TIF_PERFMON_CTXSW)

#define _TIF_WORK_CTXSW_PREV _TIF_WORK_CTXSW
#define _TIF_WORK_CTXSW_NEXT (_TIF_WORK_CTXSW|_TIF_DEBUG)

--


2008-11-26 11:10:17

by Andi Kleen

Subject: Re: [patch 06/24] perfmon: generic x86 definitions (x86)

> /*
> + * Perfmon PMU interrupt vector
> + */
> +#define LOCAL_PERFMON_VECTOR 0xee

There's a new dynamic vector allocator that can be used instead.

-Andi

2008-11-26 13:43:06

by Thomas Gleixner

Subject: Re: [patch 06/24] perfmon: generic x86 definitions (x86)

On Wed, 26 Nov 2008, [email protected] wrote:

> This patch adds definitions for the perfmon interrupt vector
> and thread info flags. It is common to i386 and x86_64 code.

> +#define TIF_PERFMON_WORK 9 /* work for pfm_handle_work() */

I can see the requirement for an apic vector, but why do you need a
TIF flag ?

Thanks,

tglx

2008-11-26 14:19:22

by Stephane Eranian

Subject: Re: [patch 06/24] perfmon: generic x86 definitions (x86)

Thomas,

On Wed, Nov 26, 2008 at 2:41 PM, Thomas Gleixner <[email protected]> wrote:
> On Wed, 26 Nov 2008, [email protected] wrote:
>
>> This patch adds definitions for the perfmon interrupt vector
>> and thread info flags. It is common to i386 and x86_64 code.
>
>> +#define TIF_PERFMON_WORK 9 /* work for pfm_handle_work() */
>
> I can see the requirement for an apic vector, but why do you need a
> TIF flag ?
>
Ok, this is a good question, so let me explain.

The goal of the TIF flag is to force the thread to do some extra work on
kernel exit. There are two situations where this is necessary: one is in
the current patchset, the other is related to sampling (not yet provided).

With per-thread monitoring, a tool is monitoring another thread, possibly in
another process. The monitored process and the tool need not be parent and
child.

What happens if the tool dies BEFORE it can cleanly close the
monitoring session?

There are 2 scenarios:
1- the monitored process also had the perfmon file descriptor open, e.g.,
   inherited on fork/exec. In that case the monitored thread will keep on
   running to completion with an attached perfmon context.

2- the monitoring tool had the last reference to the file descriptor. In
   that case, we have a perfmon context attached to a thread but no means
   to get to it from userland. This is the case where we declare the
   context as ZOMBIE.

   I think Andi confused it with the meaning of ZOMBIE for the process. In
   this situation, we want to clean up the context and make sure monitoring
   is stopped.

   That has to be done by the monitored thread. The issue is that the
   thread may notice the context is ZOMBIE during context switch in. At
   this level, we run with interrupts disabled, and it is not possible to
   free certain resources. So instead, we set the TIF flag, and let the
   thread clean things up at a much higher level in the kernel execution,
   somewhere where we know we can safely call certain kernel APIs, e.g.,
   kfree.

Another possible solution (which is not implemented):
- just leave the context attached and run the thread to completion. If
  another tool wants to attach to the same thread, it will detect there is
  already a context attached, and that it is marked ZOMBIE, so it will
  clean it up. This is a lazy cleanup approach.
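
The deferred cleanup described above can be sketched in user-space C as
follows. This is only an illustration under stated assumptions: every name
(deferred_ctx, deferred_ctxsw_in, deferred_exit_to_user, ...) is
hypothetical and is not taken from the perfmon patchset. Only the bit
position of the work flag matches the patch. The switch-in path merely
records that work is pending; the kernel-exit path does the actual freeing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Same bit position as TIF_PERFMON_WORK in the patch; the name is made up. */
#define DEFERRED_TIF_PERFMON_WORK (1u << 9)

enum deferred_state { DEFERRED_CTX_LOADED, DEFERRED_CTX_ZOMBIE };

/* Hypothetical, heavily reduced stand-in for a perfmon context. */
struct deferred_ctx {
	enum deferred_state state;
	bool monitoring;
};

/* Hypothetical stand-in for the per-thread state. */
struct deferred_thread {
	unsigned int flags;
	struct deferred_ctx *ctx;
};

/*
 * Context switch-in. In the kernel this would run with interrupts
 * disabled, so nothing is freed here; the flag only requests later work.
 */
static void deferred_ctxsw_in(struct deferred_thread *t)
{
	if (t->ctx && t->ctx->state == DEFERRED_CTX_ZOMBIE)
		t->flags |= DEFERRED_TIF_PERFMON_WORK;
}

/* Kernel-exit path: a point where freeing memory is assumed to be safe. */
static void deferred_exit_to_user(struct deferred_thread *t)
{
	if (!(t->flags & DEFERRED_TIF_PERFMON_WORK))
		return;

	t->flags &= ~DEFERRED_TIF_PERFMON_WORK;
	assert(t->ctx != NULL);
	t->ctx->monitoring = false;	/* stop monitoring first */
	free(t->ctx);			/* then release the context */
	t->ctx = NULL;
}
```

The split keeps the restricted path cheap (one test and one bit set) and
moves everything that may need a less restricted environment to the exit
path.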

2008-11-26 15:45:13

by Thomas Gleixner

Subject: Re: [patch 06/24] perfmon: generic x86 definitions (x86)

Stephane,

On Wed, 26 Nov 2008, stephane eranian wrote:
> The goal of the TIF flag is to force the thread to go do some extra work on
> kernel exit. There are two situations where this is necessary, there is one
> in the current patchset, the other is related to sampling (not yet provided).
>
> With per-thread monitoring, a tool is monitoring another thread, possibly in
> another process. The monitored process and the tool may not be parent
> of each other.
>
> What happens if the tool dies BEFORE it can cleanly close the
> monitoring session?
>
> There are 2 scenarios:
> 1- the monitored process also had the perfmon file descriptor open, e.g.,
>    inherited on fork/exec. In that case the monitored thread will keep on
>    running to completion with an attached perfmon context.

So no TIF work for this case, right ?

> 2- the monitoring had the last reference to the file descriptor. In that
>    case, we have a perfmon context attached to a thread but no mean to
>    get to it from userland. This is the case where we declare the context
>    as ZOMBIE.
>
>    I think Andi confused it with the meaning of ZOMBIE for the process.
>    In this situation, we want to cleanup the context and make sure
>    monitoring is stopped.
>
>    That has to be done by the monitored thread. The issue is that the
>    thread may notice the context is ZOMBIE during context switch in. At
>    this level, we run with interrupts disabled, and it is not possible to
>    free certain resources. So instead, we set the TIF flag, and let the
>    thread clean things up at a much higher level in the kernel execution
>    somewhere where we know we can safely call certain kernel APIs, e.g,
>    kfree.

There is no harm, when the context is kept around, right ?

> Another possible solution (which is not implemented):
> - just let the context attached and run the thread to completion. If
>   another tool wants to attach to the same thread, it will detect there
>   is already a context attached, and that it is marked ZOMBIE, so it will
>   clean it up. This is a lazy cleanup approach.

Looks like ctx is a couple of hundred bytes, so just keep it around
until thread exit time or until the other tool does the cleanup
possibly by recycling the context.

Thanks,

tglx

2008-11-26 15:50:33

by Stephane Eranian

Subject: Re: [patch 06/24] perfmon: generic x86 definitions (x86)

Thomas,

On Wed, Nov 26, 2008 at 4:44 PM, Thomas Gleixner <[email protected]> wrote:
> Stephane,
>
> On Wed, 26 Nov 2008, stephane eranian wrote:
>> The goal of the TIF flag is to force the thread to go do some extra work on
>> kernel exit. There are two situations where this is necessary, there is one
>> in the current patchset, the other is related to sampling (not yet provided).
>>
>> With per-thread monitoring, a tool is monitoring another thread, possibly in
>> another process. The monitored process and the tool may not be parent
>> of each other.
>>
>> What happens if the tool dies BEFORE it can cleanly close the
>> monitoring session?
>>
>> There are 2 scenarios:
>> 1- the monitored process also had the perfmon file descriptor open, e.g.,
>>    inherited on fork/exec. In that case the monitored thread will keep on
>>    running to completion with an attached perfmon context.
>
> So no TIF work for this case, right ?
>
Correct.

>> 2- the monitoring had the last reference to the file descriptor. In that
>>    case, we have a perfmon context attached to a thread but no mean to
>>    get to it from userland. This is the case where we declare the context
>>    as ZOMBIE.
>>
>>    I think Andi confused it with the meaning of ZOMBIE for the process.
>>    In this situation, we want to cleanup the context and make sure
>>    monitoring is stopped.
>>
>>    That has to be done by the monitored thread. The issue is that the
>>    thread may notice the context is ZOMBIE during context switch in. At
>>    this level, we run with interrupts disabled, and it is not possible to
>>    free certain resources. So instead, we set the TIF flag, and let the
>>    thread clean things up at a much higher level in the kernel execution
>>    somewhere where we know we can safely call certain kernel APIs, e.g,
>>    kfree.
>
> There is no harm, when the context is kept around, right ?
>

Well, there are possibly PMU interrupts. If the monitored thread is active
on the CPU by the time the tool dies, then it will keep on running with
monitoring on, until it is context switched out or dies.

With the approach currently implemented, the TIF bit will be set and as
soon as the thread leaves the kernel for any reason, it will execute the
cleanup function which will stop monitoring and free the context.

>> Another possible solution (which is not implemented):
>> - just let the context attached and run the thread to completion. If
>>   another tool wants to attach to the same thread, it will detect there
>>   is already a context attached, and that it is marked ZOMBIE, so it will
>>   clean it up. This is a lazy cleanup approach.
>
> Looks like ctx is a couple of hundred bytes, so just keep it around
> until thread exit time or until the other tool does the cleanup
> possibly by recycling the context.
>
That's true except for the caveat described above.

2008-11-26 16:02:33

by Stephane Eranian

Subject: Re: [patch 06/24] perfmon: generic x86 definitions (x86)

Thomas,

On Wed, Nov 26, 2008 at 4:50 PM, stephane eranian
<[email protected]> wrote:
>>> 2- the monitoring had the last reference to the file descriptor. In that
>>>    case, we have a perfmon context attached to a thread but no mean to
>>>    get to it from userland. This is the case where we declare the
>>>    context as ZOMBIE.
>>>
>>>    I think Andi confused it with the meaning of ZOMBIE for the process.
>>>    In this situation, we want to cleanup the context and make sure
>>>    monitoring is stopped.
>>>
>>>    That has to be done by the monitored thread. The issue is that the
>>>    thread may notice the context is ZOMBIE during context switch in. At
>>>    this level, we run with interrupts disabled, and it is not possible
>>>    to free certain resources. So instead, we set the TIF flag, and let
>>>    the thread clean things up at a much higher level in the kernel
>>>    execution somewhere where we know we can safely call certain kernel
>>>    APIs, e.g, kfree.
>>
>> There is no harm, when the context is kept around, right ?
>>
>
> Well, there are possibly PMU interrupts. If the monitored thread is active
> on the CPU by the time the tool dies, then it will keep on running with
> monitoring on, until it is context switched out or dies.
>
> With the approach currently implemented, the TIF bit will be set and as
> soon as the thread leaves the kernel for any reason, it will execute the
> cleanup function which will stop monitoring and free the context.
>
To follow up on that, the worst-case scenario is that you get one more PMU
interrupt. The interrupt handler will notice the ZOMBIE state and will not
reactivate monitoring. The context will remain, but there will be no further
cost to the context switch because nothing will be saved or restored anymore.
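
As a rough user-space sketch of that behaviour: the overflow handler
freezes the counters and simply declines to re-enable them for a ZOMBIE
context, and switch-in skips the restore. The names (irq_ctx,
irq_pmu_overflow, irq_ctxsw_in) and the fields are hypothetical; the real
handler in the patchset may be structured quite differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum irq_state { IRQ_CTX_LOADED, IRQ_CTX_ZOMBIE };

/* Hypothetical, reduced stand-in for a perfmon context. */
struct irq_ctx {
	enum irq_state state;
	bool counters_enabled;
	unsigned long overflows;
};

/* Counter overflow interrupt for the thread that owns ctx. */
static void irq_pmu_overflow(struct irq_ctx *ctx)
{
	assert(ctx != NULL);
	ctx->counters_enabled = false;	/* freeze while handling */

	if (ctx->state == IRQ_CTX_ZOMBIE)
		return;			/* dead context: leave counters off */

	ctx->overflows++;		/* normal processing */
	ctx->counters_enabled = true;	/* reactivate monitoring */
}

/* Switch-in: returns true only if PMU state was restored. */
static bool irq_ctxsw_in(struct irq_ctx *ctx)
{
	if (ctx == NULL || ctx->state == IRQ_CTX_ZOMBIE)
		return false;		/* nothing saved or restored */

	ctx->counters_enabled = true;
	return true;
}
```

With this shape, a context that went ZOMBIE while its thread was running
costs at most one further interrupt, and later context switches reduce to
a single state check.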

2008-11-26 16:17:16

by Thomas Gleixner

Subject: Re: [patch 06/24] perfmon: generic x86 definitions (x86)

Stephane,

On Wed, 26 Nov 2008, stephane eranian wrote:
> > There is no harm, when the context is kept around, right ?
> >
>
> Well, there are possibly PMU interrupts. If the monitored thread is active
> on the CPU by the time the tool dies, then it will keep on running with
> monitoring on, until it is context switched out or dies.

If the interrupt detects that the context is dead, then it can disable
the counters and be done with it. And when the thread is switched in
again it just does not enable the counters when the context is dead.

> With the approach currently implemented, the TIF bit will be set and as
> soon as the thread leaves the kernel for any reason, it will execute the
> cleanup function which will stop monitoring and free the context.

Well, this does not guarantee that no PMU interrupts happen before it
can process the TIF bit.

> >> Another possible solution (which is not implemented):
> >> - just let the context attached and run the thread to completion. If
> >>   another tool wants to attach to the same thread, it will detect there
> >>   is already a context attached, and that it is marked ZOMBIE, so it
> >>   will clean it up. This is a lazy cleanup approach.
> >
> > Looks like ctx is a couple of hundred bytes, so just keep it around
> > until thread exit time or until the other tool does the cleanup
> > possibly by recycling the context.
> >
> That's true except for the caveat described above.

Which is fine.

Thanks,

tglx
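
The lazy cleanup that the thread converges on (keep the dead context
attached, let the next tool recycle it) might look roughly like the sketch
below. It is user-space C with hypothetical names (lazy_ctx, lazy_thread,
lazy_attach); none of it comes from the patchset, and a real
implementation would also need locking, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>

enum lazy_state { LAZY_CTX_LOADED, LAZY_CTX_ZOMBIE };

/* Hypothetical, reduced stand-in for a perfmon context. */
struct lazy_ctx {
	enum lazy_state state;
	int owner_tool;		/* made-up identifier of the attached tool */
};

/* Hypothetical stand-in for the per-thread state. */
struct lazy_thread {
	struct lazy_ctx *ctx;
};

/*
 * Attach tool_id to thread t. Returns the context now in use, or NULL if
 * a live session already owns the thread. A ZOMBIE context left behind by
 * a dead tool is recycled instead of being freed and replaced; in that
 * case the caller keeps ownership of the unused fresh context.
 */
static struct lazy_ctx *lazy_attach(struct lazy_thread *t,
				    struct lazy_ctx *fresh, int tool_id)
{
	assert(t != NULL && fresh != NULL);

	if (t->ctx == NULL)
		t->ctx = fresh;			/* first attach */
	else if (t->ctx->state != LAZY_CTX_ZOMBIE)
		return NULL;			/* busy: live session */

	t->ctx->state = LAZY_CTX_LOADED;	/* (re)activate */
	t->ctx->owner_tool = tool_id;
	return t->ctx;
}
```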