2016-04-14 18:11:31

by Dmitry Safonov

[permalink] [raw]
Subject: [PATCH 0/4] drop test_thread_flag checks

As it was suggested by Andy, I'm working on TIF_IA32 flag
removal. In this first four places, it's quite trivial to
use user_64bit_mode instead of test_thread_flag().
This should fix possible problems for native applications
that change their code selector to __USER32_CS, but do
not have TIF_IA32 flag.

I still quite don't know what to do with uprobes using
ia32_compat check and will make the next patches about
ptrace & signals usage of TIF_IA32.

Dmitry Safonov (4):
x86/events: down with test_thread_flag(TIF_IA32)
x86/intel: down with test_thread_flag(TIF_IA32)
x86/intel lbr: down with test_thread_flag(TIF_IA32)
x86/oprofile: down with test_thread_flag(TIF_IA32)

arch/x86/events/core.c | 2 +-
arch/x86/events/intel/core.c | 2 +-
arch/x86/events/intel/ds.c | 2 +-
arch/x86/events/intel/lbr.c | 17 ++++++++++-------
arch/x86/events/perf_event.h | 2 +-
arch/x86/oprofile/backtrace.c | 2 +-
6 files changed, 15 insertions(+), 12 deletions(-)

--
2.8.0


2016-04-14 18:11:37

by Dmitry Safonov

[permalink] [raw]
Subject: [PATCH 2/4] x86/intel: down with test_thread_flag(TIF_IA32)

For IP + one insturction fixup there is need to know
in which state (compat/native) is application. Now
it's done with TIF_IA32 test, which is buggy, as
the process may change it's CS register to __USER32_CS
descriptor (and vice-versa) so instruction interpreter
will fail to correctly fixup IP.
Changing to user_64bit_mode to check for interrupt
register set is better, however it may race with task,
that changes it's code selector frequiently.

Signed-off-by: Dmitry Safonov <[email protected]>
---
arch/x86/events/intel/ds.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 8584b90d8e0b..e903a8d3b4b0 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -962,7 +962,7 @@ static int intel_pmu_pebs_fixup_ip(struct pt_regs *regs)
old_to = to;

#ifdef CONFIG_X86_64
- is_64bit = kernel_ip(to) || !test_thread_flag(TIF_IA32);
+ is_64bit = kernel_ip(to) || user_64bit_mode(regs);
#endif
insn_init(&insn, kaddr, size, is_64bit);
insn_get_length(&insn);
--
2.8.0

2016-04-14 18:11:43

by Dmitry Safonov

[permalink] [raw]
Subject: [PATCH 4/4] x86/oprofile: down with test_thread_flag(TIF_IA32)

As we have here full register set - just use user_64bit_mode
on it.

Signed-off-by: Dmitry Safonov <[email protected]>
---
arch/x86/oprofile/backtrace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/oprofile/backtrace.c b/arch/x86/oprofile/backtrace.c
index cb31a4440e58..405dadaee74a 100644
--- a/arch/x86/oprofile/backtrace.c
+++ b/arch/x86/oprofile/backtrace.c
@@ -69,7 +69,7 @@ x86_backtrace_32(struct pt_regs * const regs, unsigned int depth)
struct stack_frame_ia32 *head;

/* User process is IA32 */
- if (!current || !test_thread_flag(TIF_IA32))
+ if (!current || user_64bit_mode(regs))
return 0;

head = (struct stack_frame_ia32 *) regs->bp;
--
2.8.0

2016-04-14 18:12:01

by Dmitry Safonov

[permalink] [raw]
Subject: [PATCH 3/4] x86/intel lbr: down with test_thread_flag(TIF_IA32)

Use user_mode64_bit to check process state. For that pass
interrupt register set from irq handler.
This should fix opcode decoder misinterpreting ABI for
tasks that change their code selector.

Signed-off-by: Dmitry Safonov <[email protected]>
---
arch/x86/events/intel/core.c | 2 +-
arch/x86/events/intel/lbr.c | 17 ++++++++++-------
arch/x86/events/perf_event.h | 2 +-
3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 68fa55b4d42e..df13d1d6dbf6 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -1860,7 +1860,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)

loops = 0;
again:
- intel_pmu_lbr_read();
+ intel_pmu_lbr_read(regs);
intel_pmu_ack_status(status);
if (++loops > 100) {
static bool warned = false;
diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 6c3b7c1780c9..f1a1dbc77dea 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -136,7 +136,9 @@ enum {
X86_BR_IRQ |\
X86_BR_INT)

-static void intel_pmu_lbr_filter(struct cpu_hw_events *cpuc);
+
+static void
+intel_pmu_lbr_filter(struct pt_regs *regs, struct cpu_hw_events *cpuc);

/*
* We only support LBR implementations that have FREEZE_LBRS_ON_PMI
@@ -500,7 +502,7 @@ static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
cpuc->lbr_stack.nr = out;
}

-void intel_pmu_lbr_read(void)
+void intel_pmu_lbr_read(struct pt_regs *regs)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);

@@ -512,7 +514,7 @@ void intel_pmu_lbr_read(void)
else
intel_pmu_lbr_read_64(cpuc);

- intel_pmu_lbr_filter(cpuc);
+ intel_pmu_lbr_filter(regs, cpuc);
}

/*
@@ -658,7 +660,8 @@ int intel_pmu_setup_lbr_filter(struct perf_event *event)
* decoded (e.g., text page not present), then X86_BR_NONE is
* returned.
*/
-static int branch_type(unsigned long from, unsigned long to, int abort)
+static int branch_type(unsigned long from, unsigned long to, int abort,
+ struct pt_regs *regs)
{
struct insn insn;
void *addr;
@@ -724,7 +727,7 @@ static int branch_type(unsigned long from, unsigned long to, int abort)
* on 64-bit systems running 32-bit apps
*/
#ifdef CONFIG_X86_64
- is64 = kernel_ip((unsigned long)addr) || !test_thread_flag(TIF_IA32);
+ is64 = kernel_ip((unsigned long)addr) || user_64bit_mode(regs);
#endif
insn_init(&insn, addr, bytes_read, is64);
insn_get_opcode(&insn);
@@ -830,7 +833,7 @@ static int branch_type(unsigned long from, unsigned long to, int abort)
* in PERF_SAMPLE_BRANCH_STACK sample may vary.
*/
static void
-intel_pmu_lbr_filter(struct cpu_hw_events *cpuc)
+intel_pmu_lbr_filter(struct pt_regs *regs, struct cpu_hw_events *cpuc)
{
u64 from, to;
int br_sel = cpuc->br_sel;
@@ -846,7 +849,7 @@ intel_pmu_lbr_filter(struct cpu_hw_events *cpuc)
from = cpuc->lbr_entries[i].from;
to = cpuc->lbr_entries[i].to;

- type = branch_type(from, to, cpuc->lbr_entries[i].abort);
+ type = branch_type(from, to, cpuc->lbr_entries[i].abort, regs);
if (type != X86_BR_NONE && (br_sel & X86_BR_ANYTX)) {
if (cpuc->lbr_entries[i].in_tx)
type |= X86_BR_IN_TX;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index ad4dc7ffffb5..da1ec8240097 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -899,7 +899,7 @@ void intel_pmu_lbr_enable_all(bool pmi);

void intel_pmu_lbr_disable_all(void);

-void intel_pmu_lbr_read(void);
+void intel_pmu_lbr_read(struct pt_regs *regs);

void intel_pmu_lbr_init_core(void);

--
2.8.0

2016-04-14 18:11:34

by Dmitry Safonov

[permalink] [raw]
Subject: [PATCH 1/4] x86/events: down with test_thread_flag(TIF_IA32)

We can use user_64bit_mode(regs) here instead of thread flag
because we have full register frame.

Signed-off-by: Dmitry Safonov <[email protected]>
---
arch/x86/events/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 041e442a3e28..91d101a9a6e9 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2269,7 +2269,7 @@ perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry *entry)
struct stack_frame_ia32 frame;
const void __user *fp;

- if (!test_thread_flag(TIF_IA32))
+ if (user_64bit_mode(regs))
return 0;

cs_base = get_segment_base(regs->cs);
--
2.8.0

2016-04-14 18:32:30

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH 1/4] x86/events: down with test_thread_flag(TIF_IA32)

On Thu, Apr 14, 2016 at 11:10 AM, Dmitry Safonov <[email protected]> wrote:
> We can use user_64bit_mode(regs) here instead of thread flag
> because we have full register frame.
>
> Signed-off-by: Dmitry Safonov <[email protected]>
> ---
> arch/x86/events/core.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index 041e442a3e28..91d101a9a6e9 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -2269,7 +2269,7 @@ perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry *entry)
> struct stack_frame_ia32 frame;
> const void __user *fp;
>
> - if (!test_thread_flag(TIF_IA32))
> + if (user_64bit_mode(regs))
> return 0;

Peter, I got lost in the code that calls this. Are regs coming from
the overflow interrupt's regs, current_pt_regs(), or
perf_get_regs_user?

If it's the perf_get_regs_user, then this should be okay, but passing
in the ABI field directly would be even nicer. If they're coming from
the overflow interrupt's regs or current_pt_regs(), could we change
that?

It might also be nice to make sure that we call perf_get_regs_user
exactly once per overflow interrupt -- i.e. we could push it into the
main code rather than the regs sampling code.

>
> cs_base = get_segment_base(regs->cs);
> --
> 2.8.0
>



--
Andy Lutomirski
AMA Capital Management, LLC

2016-04-14 19:29:35

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH 3/4] x86/intel lbr: down with test_thread_flag(TIF_IA32)

On Thu, Apr 14, 2016 at 11:10 AM, Dmitry Safonov <[email protected]> wrote:
> Use user_mode64_bit to check process state. For that pass
> interrupt register set from irq handler.
> This should fix opcode decoder misinterpreting ABI for
> tasks that change their code selector.
>
> Signed-off-by: Dmitry Safonov <[email protected]>
> ---
> arch/x86/events/intel/core.c | 2 +-
> arch/x86/events/intel/lbr.c | 17 ++++++++++-------
> arch/x86/events/perf_event.h | 2 +-
> 3 files changed, 12 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 68fa55b4d42e..df13d1d6dbf6 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -1860,7 +1860,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
>
> loops = 0;
> again:
> - intel_pmu_lbr_read();
> + intel_pmu_lbr_read(regs);
> intel_pmu_ack_status(status);
> if (++loops > 100) {
> static bool warned = false;
> diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
> index 6c3b7c1780c9..f1a1dbc77dea 100644
> --- a/arch/x86/events/intel/lbr.c
> +++ b/arch/x86/events/intel/lbr.c
> @@ -136,7 +136,9 @@ enum {
> X86_BR_IRQ |\
> X86_BR_INT)
>
> -static void intel_pmu_lbr_filter(struct cpu_hw_events *cpuc);
> +
> +static void
> +intel_pmu_lbr_filter(struct pt_regs *regs, struct cpu_hw_events *cpuc);
>
> /*
> * We only support LBR implementations that have FREEZE_LBRS_ON_PMI
> @@ -500,7 +502,7 @@ static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
> cpuc->lbr_stack.nr = out;
> }
>
> -void intel_pmu_lbr_read(void)
> +void intel_pmu_lbr_read(struct pt_regs *regs)
> {
> struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>
> @@ -512,7 +514,7 @@ void intel_pmu_lbr_read(void)
> else
> intel_pmu_lbr_read_64(cpuc);
>
> - intel_pmu_lbr_filter(cpuc);
> + intel_pmu_lbr_filter(regs, cpuc);
> }
>
> /*
> @@ -658,7 +660,8 @@ int intel_pmu_setup_lbr_filter(struct perf_event *event)
> * decoded (e.g., text page not present), then X86_BR_NONE is
> * returned.
> */
> -static int branch_type(unsigned long from, unsigned long to, int abort)
> +static int branch_type(unsigned long from, unsigned long to, int abort,
> + struct pt_regs *regs)
> {
> struct insn insn;
> void *addr;
> @@ -724,7 +727,7 @@ static int branch_type(unsigned long from, unsigned long to, int abort)
> * on 64-bit systems running 32-bit apps
> */
> #ifdef CONFIG_X86_64
> - is64 = kernel_ip((unsigned long)addr) || !test_thread_flag(TIF_IA32);
> + is64 = kernel_ip((unsigned long)addr) || user_64bit_mode(regs);

Peterz, looking at this some more, would it make sense to pass
user_regs and interrupt_regs (or whatever we'd call it) all the way
through to here?

--Andy

2016-04-14 19:30:16

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH 4/4] x86/oprofile: down with test_thread_flag(TIF_IA32)

On Thu, Apr 14, 2016 at 11:10 AM, Dmitry Safonov <[email protected]> wrote:
> As we have here full register set - just use user_64bit_mode
> on it.
>
> Signed-off-by: Dmitry Safonov <[email protected]>
> ---
> arch/x86/oprofile/backtrace.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/oprofile/backtrace.c b/arch/x86/oprofile/backtrace.c
> index cb31a4440e58..405dadaee74a 100644
> --- a/arch/x86/oprofile/backtrace.c
> +++ b/arch/x86/oprofile/backtrace.c
> @@ -69,7 +69,7 @@ x86_backtrace_32(struct pt_regs * const regs, unsigned int depth)
> struct stack_frame_ia32 *head;
>
> /* User process is IA32 */
> - if (!current || !test_thread_flag(TIF_IA32))
> + if (!current || user_64bit_mode(regs))

This is presumably okay, but I know nothing about oprofile.

> return 0;
>
> head = (struct stack_frame_ia32 *) regs->bp;
> --
> 2.8.0
>



--
Andy Lutomirski
AMA Capital Management, LLC

2016-04-20 11:16:04

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 1/4] x86/events: down with test_thread_flag(TIF_IA32)

On Thu, Apr 14, 2016 at 11:32:07AM -0700, Andy Lutomirski wrote:
> On Thu, Apr 14, 2016 at 11:10 AM, Dmitry Safonov <[email protected]> wrote:
> > We can use user_64bit_mode(regs) here instead of thread flag
> > because we have full register frame.
> >
> > Signed-off-by: Dmitry Safonov <[email protected]>
> > ---
> > arch/x86/events/core.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> > index 041e442a3e28..91d101a9a6e9 100644
> > --- a/arch/x86/events/core.c
> > +++ b/arch/x86/events/core.c
> > @@ -2269,7 +2269,7 @@ perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry *entry)
> > struct stack_frame_ia32 frame;
> > const void __user *fp;
> >
> > - if (!test_thread_flag(TIF_IA32))
> > + if (user_64bit_mode(regs))
> > return 0;

Urgh, so why didn't I get Cc'ed to these patches to begin with?

> Peter, I got lost in the code that calls this. Are regs coming from
> the overflow interrupt's regs, current_pt_regs(), or
> perf_get_regs_user?

So get_perf_callchain() will get regs from:

- interrupt/NMI regs
- perf_arch_fetch_caller_regs()

And when user && !user_mode(), we'll use:

- task_pt_regs() (which arguably should maybe be perf_get_regs_user())

to call perf_callchain_user(), which then, ands up calling
perf_callchain_user32() which is expected to NO-OP for 64bit userspace.

> If it's the perf_get_regs_user, then this should be okay, but passing
> in the ABI field directly would be even nicer. If they're coming from
> the overflow interrupt's regs or current_pt_regs(), could we change
> that?
>
> It might also be nice to make sure that we call perf_get_regs_user
> exactly once per overflow interrupt -- i.e. we could push it into the
> main code rather than the regs sampling code.

The risk there is that we might not need the user regs at all to handle
the overflow thingy, so doing it unconditionally would be unwanted.

2016-04-20 11:21:59

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 3/4] x86/intel lbr: down with test_thread_flag(TIF_IA32)

On Thu, Apr 14, 2016 at 12:29:12PM -0700, Andy Lutomirski wrote:
> On Thu, Apr 14, 2016 at 11:10 AM, Dmitry Safonov <[email protected]> wrote:

> > @@ -724,7 +727,7 @@ static int branch_type(unsigned long from, unsigned long to, int abort)
> > * on 64-bit systems running 32-bit apps
> > */
> > #ifdef CONFIG_X86_64
> > - is64 = kernel_ip((unsigned long)addr) || !test_thread_flag(TIF_IA32);
> > + is64 = kernel_ip((unsigned long)addr) || user_64bit_mode(regs);
>
> Peterz, looking at this some more, would it make sense to pass
> user_regs and interrupt_regs (or whatever we'd call it) all the way
> through to here?

Urgh; again, wtf wasn't I Cc'ed to these patches?

And not sure; if we never need the user regs, calling
perf_get_user_regs() to set all that up seems like a massive waste of
cycles.

2016-04-20 13:58:51

by Dmitry Safonov

[permalink] [raw]
Subject: Re: [PATCH 3/4] x86/intel lbr: down with test_thread_flag(TIF_IA32)

On 04/20/2016 02:21 PM, Peter Zijlstra wrote:
> On Thu, Apr 14, 2016 at 12:29:12PM -0700, Andy Lutomirski wrote:
>> On Thu, Apr 14, 2016 at 11:10 AM, Dmitry Safonov <[email protected]> wrote:
>>> @@ -724,7 +727,7 @@ static int branch_type(unsigned long from, unsigned long to, int abort)
>>> * on 64-bit systems running 32-bit apps
>>> */
>>> #ifdef CONFIG_X86_64
>>> - is64 = kernel_ip((unsigned long)addr) || !test_thread_flag(TIF_IA32);
>>> + is64 = kernel_ip((unsigned long)addr) || user_64bit_mode(regs);
>> Peterz, looking at this some more, would it make sense to pass
>> user_regs and interrupt_regs (or whatever we'd call it) all the way
>> through to here?
> Urgh; again, wtf wasn't I Cc'ed to these patches?

Sorry for that - that was my unintentional miss on git-send-email.

> And not sure; if we never need the user regs, calling
> perf_get_user_regs() to set all that up seems like a massive waste of
> cycles.