Now that everything is converted to the new IDTENTRY mechanism, move the
irq tracing and the invocation of enter_from_user_mode() to C code, i.e. into
the idtentry_enter() inline which is invoked through the IDTENTRY magic.
Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/x86/entry/entry_32.S | 1 -
arch/x86/entry/entry_64.S | 9 ---------
arch/x86/include/asm/idtentry.h | 19 ++++++++++++++++++-
3 files changed, 18 insertions(+), 11 deletions(-)
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -1452,7 +1452,6 @@ SYM_CODE_START_LOCAL_NOALIGN(common_exce
movl PT_ORIG_EAX(%esp), %edx # get the error code
movl $-1, PT_ORIG_EAX(%esp) # no syscall to restart
- TRACE_IRQS_OFF
movl %esp, %eax # pt_regs pointer
CALL_NOSPEC %edi
jmp ret_from_exception
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -511,15 +511,6 @@ SYM_CODE_END(spurious_entries_start)
GET_CR2_INTO(%r12);
.endif
- TRACE_IRQS_OFF
-
-#ifdef CONFIG_CONTEXT_TRACKING
- testb $3, CS(%rsp)
- jz .Lfrom_kernel_no_ctxt_tracking_\@
- CALL_enter_from_user_mode
-.Lfrom_kernel_no_ctxt_tracking_\@:
-#endif
-
movq %rsp, %rdi /* pt_regs pointer into 1st argument*/
.if \has_error_code == 1
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -7,14 +7,31 @@
#ifndef __ASSEMBLY__
+#ifdef CONFIG_CONTEXT_TRACKING
+static __always_inline void enter_from_user_context(void)
+{
+ CT_WARN_ON(ct_state() != CONTEXT_USER);
+ user_exit_irqoff();
+}
+#else
+static __always_inline void enter_from_user_context(void) { }
+#endif
+
/**
* idtentry_enter - Handle state tracking on idtentry
* @regs: Pointer to pt_regs of interrupted context
*
- * Place holder for now.
+ * Invokes:
+ * - The hardirq tracer to keep the state consistent as low level ASM
+ * entry disabled interrupts.
+ *
+ * - Context tracking if the exception hit user mode
*/
static __always_inline void idtentry_enter(struct pt_regs *regs)
{
+ trace_hardirqs_off();
+ if (user_mode(regs))
+ enter_from_user_context();
}
/**
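For context, a rough sketch of how a handler defined via DEFINE_IDTENTRY would
end up invoking idtentry_enter() before the actual handler body runs. The macro
details below are illustrative only and may differ from the series:

/*
 * Illustrative sketch only -- the real DEFINE_IDTENTRY in this series
 * may differ in detail.
 */
#define DEFINE_IDTENTRY(func)						\
static __always_inline void __##func(struct pt_regs *regs);		\
									\
__visible void func(struct pt_regs *regs)				\
{									\
	idtentry_enter(regs);	/* irq tracing + context tracking */	\
	__##func(regs);		/* the actual exception handler body */	\
	idtentry_exit(regs);	/* assumed counterpart on the way out */\
}									\
									\
static __always_inline void __##func(struct pt_regs *regs)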
On Tue, Feb 25, 2020 at 11:33:34PM +0100, Thomas Gleixner wrote:
> --- a/arch/x86/include/asm/idtentry.h
> +++ b/arch/x86/include/asm/idtentry.h
> @@ -7,14 +7,31 @@
>
> #ifndef __ASSEMBLY__
>
> +#ifdef CONFIG_CONTEXT_TRACKING
> +static __always_inline void enter_from_user_context(void)
> +{
> + CT_WARN_ON(ct_state() != CONTEXT_USER);
> + user_exit_irqoff();
> +}
> +#else
> +static __always_inline void enter_from_user_context(void) { }
> +#endif
> +
> /**
> * idtentry_enter - Handle state tracking on idtentry
> * @regs: Pointer to pt_regs of interrupted context
> *
> - * Place holder for now.
> + * Invokes:
> + * - The hardirq tracer to keep the state consistent as low level ASM
> + * entry disabled interrupts.
> + *
> + * - Context tracking if the exception hit user mode
> */
> static __always_inline void idtentry_enter(struct pt_regs *regs)
> {
> + trace_hardirqs_off();
> + if (user_mode(regs))
> + enter_from_user_context();
> }
So:
asm_exc_int3
exc_int3
idtentry_enter()
enter_from_user_context
if (context_tracking_enabled())
poke_int3_handler();
Is, AFAICT, completely buggered.
You can't have a static_branch before the poke_int3_handler that deals
with text_poke.
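The problem being that context_tracking_enabled() is itself implemented as a
static branch, so evaluating it on the int3 path means hitting a potentially
text_poke'd site before poke_int3_handler() has filtered out the poke
breakpoint. Roughly (shape only, details may differ):

/* include/linux/context_tracking_state.h, roughly: */
DECLARE_STATIC_KEY_FALSE(context_tracking_key);

static __always_inline bool context_tracking_enabled(void)
{
	return static_branch_unlikely(&context_tracking_key);
}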
On Wed, Feb 26, 2020 at 09:05:38AM +0100, Peter Zijlstra wrote:
> On Tue, Feb 25, 2020 at 11:33:34PM +0100, Thomas Gleixner wrote:
>
> > --- a/arch/x86/include/asm/idtentry.h
> > +++ b/arch/x86/include/asm/idtentry.h
> > @@ -7,14 +7,31 @@
> >
> > #ifndef __ASSEMBLY__
> >
> > +#ifdef CONFIG_CONTEXT_TRACKING
> > +static __always_inline void enter_from_user_context(void)
> > +{
> > + CT_WARN_ON(ct_state() != CONTEXT_USER);
> > + user_exit_irqoff();
> > +}
> > +#else
> > +static __always_inline void enter_from_user_context(void) { }
> > +#endif
> > +
> > /**
> > * idtentry_enter - Handle state tracking on idtentry
> > * @regs: Pointer to pt_regs of interrupted context
> > *
> > - * Place holder for now.
> > + * Invokes:
> > + * - The hardirq tracer to keep the state consistent as low level ASM
> > + * entry disabled interrupts.
> > + *
> > + * - Context tracking if the exception hit user mode
> > */
> > static __always_inline void idtentry_enter(struct pt_regs *regs)
> > {
> > + trace_hardirqs_off();
> > + if (user_mode(regs))
> > + enter_from_user_context();
> > }
>
> So:
>
> asm_exc_int3
> exc_int3
> idtentry_enter()
> enter_from_user_context
> if (context_tracking_enabled())
>
> poke_int3_handler();
>
> Is, AFAICT, completely buggered.
>
> You can't have a static_branch before the poke_int3_handler that deals
> with text_poke.
Forgot to say that this isn't new with this patch, just noticed it when
chasing after tracepoints.
After my patch series we can actually fix this by moving the
poke_int3_handler() before idtentry_enter().
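A hypothetical shape of that fix, with the text_poke breakpoint filtered out
before any instrumentable entry work runs (sketch only, not the actual patch):

__visible void exc_int3(struct pt_regs *regs)
{
	/*
	 * Must run before idtentry_enter(): nothing that may contain a
	 * text_poke'd site (static branches, tracepoints) is allowed to
	 * execute before the poke breakpoint has been filtered out.
	 */
	if (!user_mode(regs) && poke_int3_handler(regs))
		return;

	idtentry_enter(regs);
	/* ... regular #BP handling ... */
}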
On Wed, Feb 26, 2020 at 1:20 AM Peter Zijlstra <[email protected]> wrote:
>
> On Wed, Feb 26, 2020 at 09:05:38AM +0100, Peter Zijlstra wrote:
> > On Tue, Feb 25, 2020 at 11:33:34PM +0100, Thomas Gleixner wrote:
> >
> > > --- a/arch/x86/include/asm/idtentry.h
> > > +++ b/arch/x86/include/asm/idtentry.h
> > > @@ -7,14 +7,31 @@
> > >
> > > #ifndef __ASSEMBLY__
> > >
> > > +#ifdef CONFIG_CONTEXT_TRACKING
> > > +static __always_inline void enter_from_user_context(void)
> > > +{
> > > + CT_WARN_ON(ct_state() != CONTEXT_USER);
> > > + user_exit_irqoff();
> > > +}
> > > +#else
> > > +static __always_inline void enter_from_user_context(void) { }
> > > +#endif
> > > +
> > > /**
> > > * idtentry_enter - Handle state tracking on idtentry
> > > * @regs: Pointer to pt_regs of interrupted context
> > > *
> > > - * Place holder for now.
> > > + * Invokes:
> > > + * - The hardirq tracer to keep the state consistent as low level ASM
> > > + * entry disabled interrupts.
> > > + *
> > > + * - Context tracking if the exception hit user mode
> > > */
> > > static __always_inline void idtentry_enter(struct pt_regs *regs)
> > > {
> > > + trace_hardirqs_off();
> > > + if (user_mode(regs))
> > > + enter_from_user_context();
> > > }
> >
> > So:
> >
> > asm_exc_int3
> > exc_int3
> > idtentry_enter()
> > enter_from_user_context
> > if (context_tracking_enabled())
> >
> > poke_int3_handler();
> >
> > Is, AFAICT, completely buggered.
> >
> > You can't have a static_branch before the poke_int3_handler that deals
> > with text_poke.
>
> Forgot to say that this isn't new with this patch, just noticed it when
> chasing after tracepoints.
>
> After my patch series we can actually fix this by moving the
> poke_int3_handler() before idtentry_enter().
In some sense, this is a weakness of the magic macro approach. Some
of the entries just want to have code that runs before all the entry
fixups. This is an example of it. So are the cr2 reads. It can all
be made to work, but it's a bit gross.
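The CR2 case mentioned above follows the same pattern: the fault address has to
be captured before any instrumentation can itself fault and overwrite it. A
sketch of the shape (not the actual code):

__visible void exc_page_fault(struct pt_regs *regs, unsigned long error_code)
{
	/* Must be first: tracing/instrumentation may fault and clobber CR2 */
	unsigned long address = read_cr2();

	idtentry_enter(regs);
	/* ... handle the fault using 'address' ... */
}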
On Wed, Feb 26, 2020 at 07:11:39AM -0800, Andy Lutomirski wrote:
> In some sense, this is a weakness of the magic macro approach. Some
> of the entries just want to have code that runs before all the entry
> fixups. This is an example of it. So are the cr2 reads. It can all
> be made to work, but it's a bit gross.
Right. In my current pile (new patches since last posting) I also have
one that makes #DB save-clear/restore DR7.
I got it early enough that only a watchpoint on the task stack can still
screw us over, since I also included your patch that excludes
cpu_entry_area.
Pushing it earlier still would require calling into C from the entry
stack, which I know is on your todo list, but we're not quite there yet.
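A rough sketch of the #DB DR7 save-clear/restore described above (helper names
assumed, not taken from the actual series):

static __always_inline unsigned long debug_enter(void)
{
	unsigned long dr7;

	get_debugreg(dr7, 7);
	set_debugreg(0, 7);	/* no breakpoints while handling #DB */
	return dr7;
}

static __always_inline void debug_exit(unsigned long dr7)
{
	set_debugreg(dr7, 7);	/* re-arm the breakpoints on the way out */
}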
On Wed, Feb 26, 2020 at 09:05:38AM +0100, Peter Zijlstra wrote:
> On Tue, Feb 25, 2020 at 11:33:34PM +0100, Thomas Gleixner wrote:
>
> > --- a/arch/x86/include/asm/idtentry.h
> > +++ b/arch/x86/include/asm/idtentry.h
> > @@ -7,14 +7,31 @@
> >
> > #ifndef __ASSEMBLY__
> >
> > +#ifdef CONFIG_CONTEXT_TRACKING
> > +static __always_inline void enter_from_user_context(void)
> > +{
> > + CT_WARN_ON(ct_state() != CONTEXT_USER);
> > + user_exit_irqoff();
> > +}
> > +#else
> > +static __always_inline void enter_from_user_context(void) { }
> > +#endif
> > +
> > /**
> > * idtentry_enter - Handle state tracking on idtentry
> > * @regs: Pointer to pt_regs of interrupted context
> > *
> > - * Place holder for now.
> > + * Invokes:
> > + * - The hardirq tracer to keep the state consistent as low level ASM
> > + * entry disabled interrupts.
> > + *
> > + * - Context tracking if the exception hit user mode
> > */
> > static __always_inline void idtentry_enter(struct pt_regs *regs)
> > {
> > + trace_hardirqs_off();
> > + if (user_mode(regs))
> > + enter_from_user_context();
> > }
>
> So:
>
> asm_exc_int3
> exc_int3
> idtentry_enter()
> enter_from_user_context
> if (context_tracking_enabled())
>
> poke_int3_handler();
>
> Is, AFAICT, completely buggered.
>
> You can't have a static_branch before the poke_int3_handler that deals
> with text_poke.
#BP is treated like an NMI in your patchset IIRC?
In that case and since that implies we can't schedule, we can remove
the call to context tracking there once we have nmi_enter()/nmi_exit()
called unconditionally.
> On Feb 26, 2020, at 9:05 AM, Frederic Weisbecker <[email protected]> wrote:
>
> On Wed, Feb 26, 2020 at 09:05:38AM +0100, Peter Zijlstra wrote:
>>> On Tue, Feb 25, 2020 at 11:33:34PM +0100, Thomas Gleixner wrote:
>>>
>>> --- a/arch/x86/include/asm/idtentry.h
>>> +++ b/arch/x86/include/asm/idtentry.h
>>> @@ -7,14 +7,31 @@
>>>
>>> #ifndef __ASSEMBLY__
>>>
>>> +#ifdef CONFIG_CONTEXT_TRACKING
>>> +static __always_inline void enter_from_user_context(void)
>>> +{
>>> + CT_WARN_ON(ct_state() != CONTEXT_USER);
>>> + user_exit_irqoff();
>>> +}
>>> +#else
>>> +static __always_inline void enter_from_user_context(void) { }
>>> +#endif
>>> +
>>> /**
>>> * idtentry_enter - Handle state tracking on idtentry
>>> * @regs: Pointer to pt_regs of interrupted context
>>> *
>>> - * Place holder for now.
>>> + * Invokes:
>>> + * - The hardirq tracer to keep the state consistent as low level ASM
>>> + * entry disabled interrupts.
>>> + *
>>> + * - Context tracking if the exception hit user mode
>>> */
>>> static __always_inline void idtentry_enter(struct pt_regs *regs)
>>> {
>>> + trace_hardirqs_off();
>>> + if (user_mode(regs))
>>> + enter_from_user_context();
>>> }
>>
>> So:
>>
>> asm_exc_int3
>> exc_int3
>> idtentry_enter()
>> enter_from_user_context
>> if (context_tracking_enabled())
>>
>> poke_int3_handler();
>>
>> Is, AFAICT, completely buggered.
>>
>> You can't have a static_branch before the poke_int3_handler that deals
>> with text_poke.
>
> #BP is treated like an NMI in your patchset IIRC?
> In that case and since that implies we can't schedule, we can remove
> the call to context tracking there once we have nmi_enter()/nmi_exit()
> called unconditionally.
int3 from user mode can send signals. This had better be able to schedule by the time it hits prepare_exit_to_usermode.
On Wed, Feb 26, 2020 at 09:09:34AM -0800, Andy Lutomirski wrote:
>
>
> > On Feb 26, 2020, at 9:05 AM, Frederic Weisbecker <[email protected]> wrote:
> >
> > On Wed, Feb 26, 2020 at 09:05:38AM +0100, Peter Zijlstra wrote:
> >>> On Tue, Feb 25, 2020 at 11:33:34PM +0100, Thomas Gleixner wrote:
> >>>
> >>> --- a/arch/x86/include/asm/idtentry.h
> >>> +++ b/arch/x86/include/asm/idtentry.h
> >>> @@ -7,14 +7,31 @@
> >>>
> >>> #ifndef __ASSEMBLY__
> >>>
> >>> +#ifdef CONFIG_CONTEXT_TRACKING
> >>> +static __always_inline void enter_from_user_context(void)
> >>> +{
> >>> + CT_WARN_ON(ct_state() != CONTEXT_USER);
> >>> + user_exit_irqoff();
> >>> +}
> >>> +#else
> >>> +static __always_inline void enter_from_user_context(void) { }
> >>> +#endif
> >>> +
> >>> /**
> >>> * idtentry_enter - Handle state tracking on idtentry
> >>> * @regs: Pointer to pt_regs of interrupted context
> >>> *
> >>> - * Place holder for now.
> >>> + * Invokes:
> >>> + * - The hardirq tracer to keep the state consistent as low level ASM
> >>> + * entry disabled interrupts.
> >>> + *
> >>> + * - Context tracking if the exception hit user mode
> >>> */
> >>> static __always_inline void idtentry_enter(struct pt_regs *regs)
> >>> {
> >>> + trace_hardirqs_off();
> >>> + if (user_mode(regs))
> >>> + enter_from_user_context();
> >>> }
> >>
> >> So:
> >>
> >> asm_exc_int3
> >> exc_int3
> >> idtentry_enter()
> >> enter_from_user_context
> >> if (context_tracking_enabled())
> >>
> >> poke_int3_handler();
> >>
> >> Is, AFAICT, completely buggered.
> >>
> >> You can't have a static_branch before the poke_int3_handler that deals
> >> with text_poke.
> >
> > #BP is treated like an NMI in your patchset IIRC?
> > In that case and since that implies we can't schedule, we can remove
> > the call to context tracking there once we have nmi_enter()/nmi_exit()
> > called unconditionally.
>
> int3 from user mode can send signals. This had better be able to schedule by the time it hits prepare_exit_to_usermode.
Oh right...
> On Feb 26, 2020, at 8:28 AM, Peter Zijlstra <[email protected]> wrote:
>
> On Wed, Feb 26, 2020 at 07:11:39AM -0800, Andy Lutomirski wrote:
>
>> In some sense, this is a weakness of the magic macro approach. Some
>> of the entries just want to have code that runs before all the entry
>> fixups. This is an example of it. So are the cr2 reads. It can all
>> be made to work, but it's a bit gross.
>
> Right. In my current pile (new patches since last posting) I also have
> one that makes #DB save-clear/restore DR7.
>
> I got it early enough that only a watchpoint on the task stack can still
> screw us over, since I also included your patch that excludes
> cpu_entry_area.
Hmm. It would be nice to prevent watchpoints on the task stack, but that would need some trickery. It could be done.
>
> Pushing it earlier still would require calling into C from the entry
> stack, which I know is on your todo list, but we're not quite there yet.
Indeed.
This is my main objection to the DEFINE_IDTENTRY stuff. It’s *great* for the easy cases, but it’s not so great for the nasty cases. Maybe we should open code PF, MC, DB, etc. (And kill the kvm special case for PF. I have a working patch for that and I can send it.)
Anyway, this isn’t actually what I was concerned about. I meant DR6, not DR7. Specifically, if we get traced too early in do_debug / exc_debug, we can recursively debug and clobber DR6. The result will be incorrect behavior in the outer do_debug. We can fix this the same way we handle CR2. I just haven’t done it on the existing entry code because it’s too messy.
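Handling DR6 the way CR2 is handled would mean capturing it before anything
instrumentable can recurse into #DB and clobber it; roughly (names and details
assumed):

__visible void exc_debug(struct pt_regs *regs)
{
	unsigned long dr6;

	get_debugreg(dr6, 6);	/* read before any tracing can recurse */
	set_debugreg(0, 6);	/* so a nested #DB starts from a clean state */

	idtentry_enter(regs);
	/* ... handle the debug exception based on 'dr6' ... */
}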
Andy Lutomirski <[email protected]> writes:
>> On Feb 26, 2020, at 8:28 AM, Peter Zijlstra <[email protected]> wrote:
>>
>> On Wed, Feb 26, 2020 at 07:11:39AM -0800, Andy Lutomirski wrote:
>>
>>> In some sense, this is a weakness of the magic macro approach. Some
>>> of the entries just want to have code that runs before all the entry
>>> fixups. This is an example of it. So are the cr2 reads. It can all
>>> be made to work, but it's a bit gross.
>>
>> Right. In my current pile (new patches since last posting) I also have
>> one that makes #DB save-clear/restore DR7.
>>
>> I got it early enough that only a watchpoint on the task stack can still
>> screw us over, since I also included your patch that excludes
>> cpu_entry_area.
>
> Hmm. It would be nice to prevent watchpoints on the task stack, but that would need some trickery. It could be done.
>
>>
>> Pushing it earlier still would require calling into C from the entry
>> stack, which I know is on your todo list, but we're not quite there yet.
>
> Indeed.
>
> This is my main objection to the DEFINE_IDTENTRY stuff. It’s *great*
> for the easy cases, but it’s not so great for the nasty cases. Maybe
> we should open code PF, MC, DB, etc. (And kill the kvm special case
> for PF. I have a working patch for that and I can send it.)
I'm fine with that. I like the obvious easy ones as it spares
duplicating all the same crap all over the place.
Making the nasty ones have:
#define DEFINE_IDTENTRY_$NASTY(exc_name) \
__visible void notrace exc_name(args....)
would solve this and you can stick whatever you want into it and have
the NOPROBE added manually. Hmm?
Thanks,
tglx
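A minimal sketch of what such a raw variant could look like (macro and symbol
names illustrative only):

/*
 * Emits only the prototype; enter/exit handling and the NOKPROBE
 * annotation are left to the handler itself.
 */
#define DEFINE_IDTENTRY_RAW(func)					\
__visible notrace void func(struct pt_regs *regs)

DEFINE_IDTENTRY_RAW(exc_nasty)
{
	/* whatever has to run before the common entry work goes here */
	idtentry_enter(regs);
	/* ... */
}
NOKPROBE_SYMBOL(exc_nasty);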