2021-02-07 10:42:16

by Borislav Petkov

[permalink] [raw]
Subject: [GIT PULL] x86/urgent for v5.11-rc7

Hi Linus,

I hope this is the last batch of x86/urgent updates for this round.

Pls pull,
thx.

---

The following changes since commit 6ee1d745b7c9fd573fba142a2efdad76a9f1cb04:

Linux 5.11-rc5 (2021-01-24 16:47:14 -0800)

are available in the Git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git tags/x86_urgent_for_v5.11_rc7

for you to fetch changes up to 816ef8d7a2c4182e19bc06ab65751cb9e3951e94:

x86/efi: Remove EFI PGD build time checks (2021-02-06 13:54:14 +0100)

----------------------------------------------------------------
- Remove superfluous EFI PGD range checks which lead to those assertions failing
with certain kernel configs and LLVM.

- Disable setting breakpoints on facilities involved in #DB exception handling
to avoid infinite loops.

- Add extra serialization to non-serializing MSRs (IA32_TSC_DEADLINE and
x2 APIC MSRs) to adhere to SDM's recommendation and avoid any theoretical
issues.

- Re-add the EPB MSR reading on turbostat so that it works on older
kernels which don't have the corresponding EPB sysfs file.

- Add Alder Lake to the list of CPUs which support split lock.

- Fix %dr6 register handling in order to be able to set watchpoints with gdb
again.

- Disable CET instrumentation in the kernel so that gcc doesn't add
ENDBR64 to kernel code and thus confuse tracing.

----------------------------------------------------------------
Borislav Petkov (2):
tools/power/turbostat: Fallback to an MSR read for EPB
x86/efi: Remove EFI PGD build time checks

Dave Hansen (1):
x86/apic: Add extra serialization for non-serializing MSRs

Fenghua Yu (1):
x86/split_lock: Enable the split lock feature on another Alder Lake CPU

Josh Poimboeuf (1):
x86/build: Disable CET instrumentation in the kernel

Lai Jiangshan (2):
x86/debug: Prevent data breakpoints on __per_cpu_offset
x86/debug: Prevent data breakpoints on cpu_dr7

Peter Zijlstra (1):
x86/debug: Fix DR6 handling

Makefile | 6 ----
arch/x86/Makefile | 3 ++
arch/x86/include/asm/apic.h | 10 ------
arch/x86/include/asm/barrier.h | 18 +++++++++++
arch/x86/kernel/apic/apic.c | 4 +++
arch/x86/kernel/apic/x2apic_cluster.c | 6 ++--
arch/x86/kernel/apic/x2apic_phys.c | 9 ++++--
arch/x86/kernel/cpu/intel.c | 1 +
arch/x86/kernel/hw_breakpoint.c | 61 +++++++++++++++++++++++------------
arch/x86/platform/efi/efi_64.c | 19 -----------
tools/power/x86/turbostat/turbostat.c | 10 +++++-
11 files changed, 85 insertions(+), 62 deletions(-)

--
Regards/Gruss,
Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg


2021-02-07 17:52:12

by Linus Torvalds

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

On Sun, Feb 7, 2021 at 2:40 AM Borislav Petkov <[email protected]> wrote:
>
> - Disable CET instrumentation in the kernel so that gcc doesn't add
> ENDBR64 to kernel code and thus confuse tracing.

So this is clearly the right thing to do for now, but I wonder if
people have a plan for actually enabling CET and endbr at cpl0 at some
point?

Linus

2021-02-07 18:01:46

by Borislav Petkov

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

On Sun, Feb 07, 2021 at 09:49:18AM -0800, Linus Torvalds wrote:
> On Sun, Feb 7, 2021 at 2:40 AM Borislav Petkov <[email protected]> wrote:
> >
> > - Disable CET instrumentation in the kernel so that gcc doesn't add
> > ENDBR64 to kernel code and thus confuse tracing.
>
> So this is clearly the right thing to do for now, but I wonder if
> people have a plan for actually enabling CET and endbr at cpl0 at some
> point?

It probably is an item on some Intel manager's to-enable list. So far,
the CET enablement concentrates only on userspace but dhansen might know
more about future plans. CCed.

--
Regards/Gruss,
Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg

2021-02-07 18:19:15

by Linus Torvalds

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

On Sun, Feb 7, 2021 at 9:58 AM Borislav Petkov <[email protected]> wrote:
>
> It probably is an item on some Intel manager's to-enable list. So far,
> the CET enablement concentrates only on userspace but dhansen might know
> more about future plans. CCed.

I think the new Ryzen 5000 series also supports CET, but I don't have
any machines to check.

Hopefully somebody ends up with hardware that supports it and a urge
to try to make it work in kernel land too.

I do suspect involved people should start thinking about how they want
to deal with functions starting with

endbr64
call __fentry__

instead of the call being at the very top of the function.

I _assume_ it's mostly tracing, bpf and objtool that are going to
notice, and it's going to be largely invisible to anybody else.

So hopefully the involved people can at least just try to see how
their code looks when they turn off retpoline and add

-fcf-protection=full

to the compiler command line (assuming they have a gcc that can do
it), even if they can't actually test the end result on hardware.

Linus

2021-02-07 18:21:28

by Dave Hansen

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

On 2/7/21 9:58 AM, Borislav Petkov wrote:
> On Sun, Feb 07, 2021 at 09:49:18AM -0800, Linus Torvalds wrote:
>> On Sun, Feb 7, 2021 at 2:40 AM Borislav Petkov <[email protected]> wrote:
>>> - Disable CET instrumentation in the kernel so that gcc doesn't add
>>> ENDBR64 to kernel code and thus confuse tracing.
>> So this is clearly the right thing to do for now, but I wonder if
>> people have a plan for actually enabling CET and endbr at cpl0 at some
>> point?
> It probably is an item on some Intel manager's to-enable list. So far,
> the CET enablement concentrates only on userspace but dhansen might know
> more about future plans. CCed.

It's definitely on our radar to look at after CET userspace.

The only question for me is whether it will be worth doing with the
exiting kernel entry/exit architecture.

2021-02-07 18:32:11

by pr-tracker-bot

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

The pull request you sent on Sun, 7 Feb 2021 11:40:22 +0100:

> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git tags/x86_urgent_for_v5.11_rc7

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/e24f9c5f6e3127a0679d5ba5575a181b80f219c9

Thank you!

--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

2021-02-07 18:33:25

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7


> On Feb 7, 2021, at 10:19 AM, Dave Hansen <[email protected]> wrote:
>
> On 2/7/21 9:58 AM, Borislav Petkov wrote:
>>> On Sun, Feb 07, 2021 at 09:49:18AM -0800, Linus Torvalds wrote:
>>> On Sun, Feb 7, 2021 at 2:40 AM Borislav Petkov <[email protected]> wrote:
>>>> - Disable CET instrumentation in the kernel so that gcc doesn't add
>>>> ENDBR64 to kernel code and thus confuse tracing.
>>> So this is clearly the right thing to do for now, but I wonder if
>>> people have a plan for actually enabling CET and endbr at cpl0 at some
>>> point?
>> It probably is an item on some Intel manager's to-enable list. So far,
>> the CET enablement concentrates only on userspace but dhansen might know
>> more about future plans. CCed.
>
> It's definitely on our radar to look at after CET userspace.
>
> The only question for me is whether it will be worth doing with the
> exiting kernel entry/exit architecture.

I assume you mean: is anyone sufficiently inspired to try to handle NMI correctly? I have a whole pile of nacks saved up for incorrect implementations, although I will try to wrap them in polite explanations of precisely what is wrong :)

(I’ve contemplated doing this myself, and it doesn’t sound fun at all.)

2021-02-07 18:36:10

by Dave Hansen

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

On 2/7/21 10:15 AM, Linus Torvalds wrote:
> On Sun, Feb 7, 2021 at 9:58 AM Borislav Petkov <[email protected]> wrote:
>> It probably is an item on some Intel manager's to-enable list. So far,
>> the CET enablement concentrates only on userspace but dhansen might know
>> more about future plans. CCed.
> I think the new Ryzen 5000 series also supports CET, but I don't have
> any machines to check.

Intel wraps up Shadow Stacks and Indirect Branch Tracking (IBT) under
the CET umbrella, although they can be implemented totally independently.

I actually forget about the IBT half most of the time because the kernel
code to implement userspace support is a much lighter lift than shadow
stacks.

My understanding is that AMD has documented support for Shadow Stacks:

https://www.amd.com/system/files/TechDocs/24592.pdf

But has not yet released any documentation about IBT. IBT seems to be
Intel-only, at least in the short term. There may be more, but the
"Tiger Lake" CPUs are the only ones I know of off the top of my head
that are in the wild:

> https://ark.intel.com/content/www/us/en/ark/products/208661/intel-core-i7-1160g7-processor-12m-cache-up-to-4-40-ghz-with-ipu.html

2021-02-07 18:44:02

by Linus Torvalds

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

On Sun, Feb 7, 2021 at 10:32 AM Dave Hansen <[email protected]> wrote:
>
> My understanding is that AMD has documented support for Shadow Stacks:
>
> https://www.amd.com/system/files/TechDocs/24592.pdf
>
> But has not yet released any documentation about IBT. IBT seems to be
> Intel-only, at least in the short term. There may be more, but the
> "Tiger Lake" CPUs are the only ones I know of off the top of my head
> that are in the wild:

Ahh, ok. I clearly didn't look at the details, just the overview of
"supports CET".

Linus

2021-02-07 20:47:21

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

On Sun, Feb 7, 2021 at 10:21 AM Dave Hansen <[email protected]> wrote:
>
> On 2/7/21 9:58 AM, Borislav Petkov wrote:
> > On Sun, Feb 07, 2021 at 09:49:18AM -0800, Linus Torvalds wrote:
> >> On Sun, Feb 7, 2021 at 2:40 AM Borislav Petkov <[email protected]> wrote:
> >>> - Disable CET instrumentation in the kernel so that gcc doesn't add
> >>> ENDBR64 to kernel code and thus confuse tracing.
> >> So this is clearly the right thing to do for now, but I wonder if
> >> people have a plan for actually enabling CET and endbr at cpl0 at some
> >> point?
> > It probably is an item on some Intel manager's to-enable list. So far,
> > the CET enablement concentrates only on userspace but dhansen might know
> > more about future plans. CCed.
>
> It's definitely on our radar to look at after CET userspace.

What is the desired timeline to enable CET in the kernel ?
I think for bpf and tracing it will be mostly straightforward to deal
with extra endbr64 insn in front of the fentry nop.
Just trying to figure when this work needs to be done.

2021-02-07 22:40:02

by Dave Hansen

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

On 2/7/21 12:44 PM, Alexei Starovoitov wrote:
>>> It probably is an item on some Intel manager's to-enable list. So far,
>>> the CET enablement concentrates only on userspace but dhansen might know
>>> more about future plans. CCed.
>> It's definitely on our radar to look at after CET userspace.
> What is the desired timeline to enable CET in the kernel ?
> I think for bpf and tracing it will be mostly straightforward to deal
> with extra endbr64 insn in front of the fentry nop.
> Just trying to figure when this work needs to be done.

Yu-cheng? Any idea when you're going to start hacking on the in-kernel
IBT bits?

2021-02-07 22:50:22

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

On Sun, Feb 07, 2021 at 10:15:49AM -0800, Linus Torvalds wrote:
> On Sun, Feb 7, 2021 at 9:58 AM Borislav Petkov <[email protected]> wrote:
> >
> > It probably is an item on some Intel manager's to-enable list. So far,
> > the CET enablement concentrates only on userspace but dhansen might know
> > more about future plans. CCed.
>
> I think the new Ryzen 5000 series also supports CET, but I don't have
> any machines to check.
>
> Hopefully somebody ends up with hardware that supports it and a urge
> to try to make it work in kernel land too.
>
> I do suspect involved people should start thinking about how they want
> to deal with functions starting with
>
> endbr64
> call __fentry__
>
> instead of the call being at the very top of the function.

FWIW, objtool's already fine with it (otherwise we would have discovered
the need to disable fcf-protection much sooner).

--
Josh

2021-02-08 10:47:53

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

On Sun, Feb 07, 2021 at 10:31:32AM -0800, Andy Lutomirski wrote:
>
> > On Feb 7, 2021, at 10:19 AM, Dave Hansen <[email protected]> wrote:
> >
> > On 2/7/21 9:58 AM, Borislav Petkov wrote:
> >>> On Sun, Feb 07, 2021 at 09:49:18AM -0800, Linus Torvalds wrote:
> >>> On Sun, Feb 7, 2021 at 2:40 AM Borislav Petkov <[email protected]> wrote:
> >>>> - Disable CET instrumentation in the kernel so that gcc doesn't add
> >>>> ENDBR64 to kernel code and thus confuse tracing.
> >>> So this is clearly the right thing to do for now, but I wonder if
> >>> people have a plan for actually enabling CET and endbr at cpl0 at some
> >>> point?
> >> It probably is an item on some Intel manager's to-enable list. So far,
> >> the CET enablement concentrates only on userspace but dhansen might know
> >> more about future plans. CCed.
> >
> > It's definitely on our radar to look at after CET userspace.
> >
> > The only question for me is whether it will be worth doing with the
> > exiting kernel entry/exit architecture.
>
> I assume you mean: is anyone sufficiently inspired to try to handle
> NMI correctly? I have a whole pile of nacks saved up for incorrect
> implementations, although I will try to wrap them in polite
> explanations of precisely what is wrong :)

Yeah, the IST stack recursion possibilities are 'fun'. But IIRC CET-SS
has far more problems than just NMI. It will also run into all the ROP
tricks we pull for return tracing, CALL emulation and other lovely
things.


2021-02-08 15:07:09

by Steven Rostedt

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

On Sun, 7 Feb 2021 16:45:40 -0600
Josh Poimboeuf <[email protected]> wrote:

> > I do suspect involved people should start thinking about how they want
> > to deal with functions starting with
> >
> > endbr64
> > call __fentry__
> >
> > instead of the call being at the very top of the function.
>
> FWIW, objtool's already fine with it (otherwise we would have discovered
> the need to disable fcf-protection much sooner).

And this doesn't really affect tracing (note, another user that might be
affected is live kernel patching). The way this change was noticed, was
that there was a report of someone that was be able to connect a bpf
program to a function for one machine but not for another machine. The
other machine had this CET thingy.

The difference is, when you attach a probe to the start of a function,
kprobes will check if the probe address (start of function) is located at a
ftrace location (nop / __fentry__) and if it is, it would use the ftrace
infrastructure instead of attaching an int3 breakpoint. Because of the
enbr64 being at the start of the function, the check returned false (it was
not a ftrace location) and it attached an int3 breakpoint instead.

This uncovered another "bug". Peter Zijlstra made int3 handlers look like
NMIs (in_nmi() would return true in an int3 handler). The BPF programs would
not run in NMI context. But nobody noticed, because people usually attach
BPF programs to the start of a function using kprobes, and since kprobes
would use ftrace handlers (that don't set in_nmi() to true), everything
worked. But when the "endbr64" was added at the start of the program,
kprobes fell back to int3, and suddenly the BPF programs stopped working.

-- Steve

2021-02-08 18:01:47

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

On Mon, Feb 08, 2021 at 10:02:06AM -0500, Steven Rostedt wrote:
> On Sun, 7 Feb 2021 16:45:40 -0600
> Josh Poimboeuf <[email protected]> wrote:
>
> > > I do suspect involved people should start thinking about how they want
> > > to deal with functions starting with
> > >
> > > endbr64
> > > call __fentry__
> > >
> > > instead of the call being at the very top of the function.
> >
> > FWIW, objtool's already fine with it (otherwise we would have discovered
> > the need to disable fcf-protection much sooner).
>
> And this doesn't really affect tracing (note, another user that might be
> affected is live kernel patching).

Good point, livepatch is indeed affected. Is there a better way to get
the "call __fentry__" address for a given function?


/*
* Convert a function address into the appropriate ftrace location.
*
* Usually this is just the address of the function, but on some architectures
* it's more complicated so allow them to provide a custom behaviour.
*/
#ifndef klp_get_ftrace_location
static unsigned long klp_get_ftrace_location(unsigned long faddr)
{
return faddr;
}
#endif

--
Josh

2021-02-08 18:24:41

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

On Mon, Feb 08, 2021 at 09:33:00AM -0600, Josh Poimboeuf wrote:
> On Mon, Feb 08, 2021 at 10:02:06AM -0500, Steven Rostedt wrote:
> > On Sun, 7 Feb 2021 16:45:40 -0600
> > Josh Poimboeuf <[email protected]> wrote:
> >
> > > > I do suspect involved people should start thinking about how they want
> > > > to deal with functions starting with
> > > >
> > > > endbr64
> > > > call __fentry__
> > > >
> > > > instead of the call being at the very top of the function.
> > >
> > > FWIW, objtool's already fine with it (otherwise we would have discovered
> > > the need to disable fcf-protection much sooner).
> >
> > And this doesn't really affect tracing (note, another user that might be
> > affected is live kernel patching).
>
> Good point, livepatch is indeed affected. Is there a better way to get
> the "call __fentry__" address for a given function?
>
>
> /*
> * Convert a function address into the appropriate ftrace location.
> *
> * Usually this is just the address of the function, but on some architectures
> * it's more complicated so allow them to provide a custom behaviour.
> */
> #ifndef klp_get_ftrace_location
> static unsigned long klp_get_ftrace_location(unsigned long faddr)
> {
> return faddr;
> }
> #endif

I suppose the trivial fix is to see if it points to endbr64 and if so,
increment the addr by the length of that.

2021-02-08 18:33:08

by Yu-cheng Yu

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

On 2/7/2021 2:35 PM, Dave Hansen wrote:
> On 2/7/21 12:44 PM, Alexei Starovoitov wrote:
>>>> It probably is an item on some Intel manager's to-enable list. So far,
>>>> the CET enablement concentrates only on userspace but dhansen might know
>>>> more about future plans. CCed.
>>> It's definitely on our radar to look at after CET userspace.
>> What is the desired timeline to enable CET in the kernel ?
>> I think for bpf and tracing it will be mostly straightforward to deal
>> with extra endbr64 insn in front of the fentry nop.
>> Just trying to figure when this work needs to be done.
>
> Yu-cheng? Any idea when you're going to start hacking on the in-kernel
> IBT bits?
>

I have some kernel-mode enabling patches that I will soon send
internally for comments. My estimate is probably before the summer, I
can send those to the mailing list.

--
Yu-cheng

2021-02-08 18:34:06

by Steven Rostedt

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

On Mon, 8 Feb 2021 16:47:05 +0100
Peter Zijlstra <[email protected]> wrote:

> > /*
> > * Convert a function address into the appropriate ftrace location.
> > *
> > * Usually this is just the address of the function, but on some architectures
> > * it's more complicated so allow them to provide a custom behaviour.
> > */
> > #ifndef klp_get_ftrace_location
> > static unsigned long klp_get_ftrace_location(unsigned long faddr)
> > {
> > return faddr;
> > }
> > #endif
>
> I suppose the trivial fix is to see if it points to endbr64 and if so,
> increment the addr by the length of that.

I thought of that too. But one thing that may be possible, is to use
kallsym. I believe you can get the range of a function (start and end of
the function) from kallsyms. Then ask ftrace for the addr in that range
(there should only be one).

-- Steve

2021-02-09 08:35:24

by Miroslav Benes

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

On Mon, 8 Feb 2021, Steven Rostedt wrote:

> On Mon, 8 Feb 2021 16:47:05 +0100
> Peter Zijlstra <[email protected]> wrote:
>
> > > /*
> > > * Convert a function address into the appropriate ftrace location.
> > > *
> > > * Usually this is just the address of the function, but on some architectures
> > > * it's more complicated so allow them to provide a custom behaviour.
> > > */
> > > #ifndef klp_get_ftrace_location
> > > static unsigned long klp_get_ftrace_location(unsigned long faddr)
> > > {
> > > return faddr;
> > > }
> > > #endif

powerpc has this

static inline unsigned long klp_get_ftrace_location(unsigned long faddr)
{
/*
* Live patch works only with -mprofile-kernel on PPC. In this case,
* the ftrace location is always within the first 16 bytes.
*/
return ftrace_location_range(faddr, faddr + 16);
}

> > I suppose the trivial fix is to see if it points to endbr64 and if so,
> > increment the addr by the length of that.
>
> I thought of that too. But one thing that may be possible, is to use
> kallsym. I believe you can get the range of a function (start and end of
> the function) from kallsyms. Then ask ftrace for the addr in that range
> (there should only be one).

And we can do this if a hard-coded value live above is not welcome. If I
remember correctly, we used to have exactly this in the old versions of
kGraft. We walked through all ftrace records, called
kallsyms_lookup_size_offset() on every record's ip and if the offset+ip
matched faddr (in this case), we returned the ip.

Miroslav

2021-02-09 14:54:59

by Steven Rostedt

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

On Tue, 9 Feb 2021 09:32:34 +0100 (CET)
Miroslav Benes <[email protected]> wrote:

> powerpc has this
>
> static inline unsigned long klp_get_ftrace_location(unsigned long faddr)
> {
> /*
> * Live patch works only with -mprofile-kernel on PPC. In this case,
> * the ftrace location is always within the first 16 bytes.
> */
> return ftrace_location_range(faddr, faddr + 16);
> }
>
> > > I suppose the trivial fix is to see if it points to endbr64 and if so,
> > > increment the addr by the length of that.
> >
> > I thought of that too. But one thing that may be possible, is to use
> > kallsym. I believe you can get the range of a function (start and end of
> > the function) from kallsyms. Then ask ftrace for the addr in that range
> > (there should only be one).
>
> And we can do this if a hard-coded value live above is not welcome. If I
> remember correctly, we used to have exactly this in the old versions of
> kGraft. We walked through all ftrace records, called
> kallsyms_lookup_size_offset() on every record's ip and if the offset+ip
> matched faddr (in this case), we returned the ip.

Either way is fine. Question is, should we just wait till CET is
implemented for the kernel before making any of these changes? Just knowing
that we have a solution to handle it may be good enough for now.

-- Steve

2021-02-09 15:18:47

by Miroslav Benes

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

On Tue, 9 Feb 2021, Steven Rostedt wrote:

> On Tue, 9 Feb 2021 09:32:34 +0100 (CET)
> Miroslav Benes <[email protected]> wrote:
>
> > powerpc has this
> >
> > static inline unsigned long klp_get_ftrace_location(unsigned long faddr)
> > {
> > /*
> > * Live patch works only with -mprofile-kernel on PPC. In this case,
> > * the ftrace location is always within the first 16 bytes.
> > */
> > return ftrace_location_range(faddr, faddr + 16);
> > }
> >
> > > > I suppose the trivial fix is to see if it points to endbr64 and if so,
> > > > increment the addr by the length of that.
> > >
> > > I thought of that too. But one thing that may be possible, is to use
> > > kallsym. I believe you can get the range of a function (start and end of
> > > the function) from kallsyms. Then ask ftrace for the addr in that range
> > > (there should only be one).
> >
> > And we can do this if a hard-coded value live above is not welcome. If I
> > remember correctly, we used to have exactly this in the old versions of
> > kGraft. We walked through all ftrace records, called
> > kallsyms_lookup_size_offset() on every record's ip and if the offset+ip
> > matched faddr (in this case), we returned the ip.
>
> Either way is fine. Question is, should we just wait till CET is
> implemented for the kernel before making any of these changes? Just knowing
> that we have a solution to handle it may be good enough for now.

I'd prefer it to be a part of CET enablement patch set.

Miroslav

2021-02-09 16:51:16

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

On Tue, Feb 9, 2021 at 6:49 AM Steven Rostedt <[email protected]> wrote:
>
> On Tue, 9 Feb 2021 09:32:34 +0100 (CET)
> Miroslav Benes <[email protected]> wrote:
>
> > powerpc has this
> >
> > static inline unsigned long klp_get_ftrace_location(unsigned long faddr)
> > {
> > /*
> > * Live patch works only with -mprofile-kernel on PPC. In this case,
> > * the ftrace location is always within the first 16 bytes.
> > */
> > return ftrace_location_range(faddr, faddr + 16);
> > }
> >
> > > > I suppose the trivial fix is to see if it points to endbr64 and if so,
> > > > increment the addr by the length of that.
> > >
> > > I thought of that too. But one thing that may be possible, is to use
> > > kallsym. I believe you can get the range of a function (start and end of
> > > the function) from kallsyms. Then ask ftrace for the addr in that range
> > > (there should only be one).
> >
> > And we can do this if a hard-coded value live above is not welcome. If I
> > remember correctly, we used to have exactly this in the old versions of
> > kGraft. We walked through all ftrace records, called
> > kallsyms_lookup_size_offset() on every record's ip and if the offset+ip
> > matched faddr (in this case), we returned the ip.
>
> Either way is fine. Question is, should we just wait till CET is
> implemented for the kernel before making any of these changes? Just knowing
> that we have a solution to handle it may be good enough for now.

I think the issue is more fundamental than what appears on the surface.
According to endbr64 documentation it's not just any instruction.
The cpu will wait for it and if it's replaced with int3 or not seen at
the branch target the cpu will throw an exception.
If I understood the doc correctly it means that endbr64 can never be
replaced with a breakpoint. If that's the case text_poke_bp and kprobe
need to do extra safety checks.

2021-02-09 20:46:19

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7



> On Feb 9, 2021, at 10:09 AM, Linus Torvalds <[email protected]> wrote:
>
> On Tue, Feb 9, 2021 at 8:55 AM Andy Lutomirski <[email protected]> wrote:
>>
>> Or we hack up #CP to handle this case. I don’t quite know how I feel about this.
>
> I think that's the sane model - if we've replaced the instruction with
> 'int3', and we end up getting #CP due to that, just do the #BP
> handling.
>
> Anything else would just be insanely complicated, I feel.

The other model is “don’t do that then.”

I suppose a nice property of patching ENDBR to INT3 is that, not only is it atomic, but ENDBR is sort of a NOP, so we don’t need to replace the ENDBR with anything.

>
> Linus

2021-02-09 20:58:02

by Linus Torvalds

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

On Tue, Feb 9, 2021 at 10:26 AM Andy Lutomirski <[email protected]> wrote:
> >
> > Anything else would just be insanely complicated, I feel.
>
> The other model is “don’t do that then.”

Hmm. I guess all the code that does int3 patching could just be taught
to always go to the next instruction instead.

I don't think advancing the rewriting is an option for the asm
alternative() logic or the static call infrastructure, but those
should never be about endbr anyway, so presumably that's not an issue.

So if it ends up being _only_ about kprobes, then the "don't do that
then" might work fine.

Linus

2021-02-10 07:39:40

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7



> On Feb 9, 2021, at 8:45 AM, Alexei Starovoitov <[email protected]> wrote:
>
> On Tue, Feb 9, 2021 at 6:49 AM Steven Rostedt <[email protected]> wrote:
>>
>>> On Tue, 9 Feb 2021 09:32:34 +0100 (CET)
>>> Miroslav Benes <[email protected]> wrote:
>>>
>>> powerpc has this
>>>
>>> static inline unsigned long klp_get_ftrace_location(unsigned long faddr)
>>> {
>>> /*
>>> * Live patch works only with -mprofile-kernel on PPC. In this case,
>>> * the ftrace location is always within the first 16 bytes.
>>> */
>>> return ftrace_location_range(faddr, faddr + 16);
>>> }
>>>
>>>>> I suppose the trivial fix is to see if it points to endbr64 and if so,
>>>>> increment the addr by the length of that.
>>>>
>>>> I thought of that too. But one thing that may be possible, is to use
>>>> kallsym. I believe you can get the range of a function (start and end of
>>>> the function) from kallsyms. Then ask ftrace for the addr in that range
>>>> (there should only be one).
>>>
>>> And we can do this if a hard-coded value live above is not welcome. If I
>>> remember correctly, we used to have exactly this in the old versions of
>>> kGraft. We walked through all ftrace records, called
>>> kallsyms_lookup_size_offset() on every record's ip and if the offset+ip
>>> matched faddr (in this case), we returned the ip.
>>
>> Either way is fine. Question is, should we just wait till CET is
>> implemented for the kernel before making any of these changes? Just knowing
>> that we have a solution to handle it may be good enough for now.
>
> I think the issue is more fundamental than what appears on the surface.
> According to endbr64 documentation it's not just any instruction.
> The cpu will wait for it and if it's replaced with int3 or not seen at
> the branch target the cpu will throw an exception.
> If I understood the doc correctly it means that endbr64 can never be
> replaced with a breakpoint. If that's the case text_poke_bp and kprobe
> need to do extra safety checks.

Ugh.

Or we hack up #CP to handle this case. I don’t quite know how I feel about this.

2021-02-10 07:53:43

by Linus Torvalds

[permalink] [raw]
Subject: Re: [GIT PULL] x86/urgent for v5.11-rc7

On Tue, Feb 9, 2021 at 8:55 AM Andy Lutomirski <[email protected]> wrote:
>
> Or we hack up #CP to handle this case. I don’t quite know how I feel about this.

I think that's the sane model - if we've replaced the instruction with
'int3', and we end up getting #CP due to that, just do the #BP
handling.

Anything else would just be insanely complicated, I feel.

Linus