The ZERO_CALL_USED_REGS feature may zero out caller-saved (call-used)
registers before a function returns. This needs to be properly modeled
by things like code alternatives. Otherwise, a register may be expected
to remain unchanged across an ASM statement when in reality it has been
cleared.
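A minimal sketch of the failure mode (hypothetical code, not from the
tree; special_thunk() and call_thunk() are invented for illustration):
```
/*
 * The asm promises the compiler that only %rax changes, the way the
 * PVOP_VCALLEE macros do for pv_ops thunks.
 */
extern void special_thunk(void);	/* preserves everything but %rax */

static inline unsigned long call_thunk(unsigned long arg)
{
	unsigned long ret;

	asm volatile("call special_thunk"
		     : "=a" (ret)	/* only %rax listed as modified */
		     : "D" (arg));	/* %rdi assumed to survive */
	/*
	 * With ZERO_CALL_USED_REGS, special_thunk()'s epilogue zeroes
	 * %rdi (a call-used register) before returning, so a later use
	 * of 'arg' from %rdi reads zero.
	 */
	return arg;
}
```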
Bill Wendling (2):
x86/paravirt: clean up typos and grammaros
x86/paravirt: add extra clobbers with ZERO_CALL_USED_REGS enabled
arch/x86/include/asm/paravirt_types.h | 27 ++++++++++++++++++---------
1 file changed, 18 insertions(+), 9 deletions(-)
--
2.37.2.789.g6183377224-goog
The ZERO_CALL_USED_REGS feature may zero out caller-saved registers
before returning.
In spurious_kernel_fault(), the "pte_offset_kernel()" call results in
this assembly code:
.Ltmp151:
#APP
# ALT: oldnstr
.Ltmp152:
.Ltmp153:
.Ltmp154:
.section .discard.retpoline_safe,"",@progbits
.quad .Ltmp154
.text
callq *pv_ops+536(%rip)
.Ltmp155:
.section .parainstructions,"a",@progbits
.p2align 3, 0x0
.quad .Ltmp153
.byte 67
.byte .Ltmp155-.Ltmp153
.short 1
.text
.Ltmp156:
# ALT: padding
.zero (-(((.Ltmp157-.Ltmp158)-(.Ltmp156-.Ltmp152))>0))*((.Ltmp157-.Ltmp158)-(.Ltmp156-.Ltmp152)),144
.Ltmp159:
.section .altinstructions,"a",@progbits
.Ltmp160:
.long .Ltmp152-.Ltmp160
.Ltmp161:
.long .Ltmp158-.Ltmp161
.short 33040
.byte .Ltmp159-.Ltmp152
.byte .Ltmp157-.Ltmp158
.text
.section .altinstr_replacement,"ax",@progbits
# ALT: replacement 1
.Ltmp158:
movq %rdi, %rax
.Ltmp157:
.text
#NO_APP
.Ltmp162:
testb $-128, %dil
The "testb" here is using %dil, but the %rdi register was cleared before
returning from "callq *pv_ops+536(%rip)". Adding the proper constraints
results in the use of a different register:
movq %r11, %rdi
# Similar to above.
testb $-128, %r11b
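For reference, here is roughly what the feature does to a trivial
callee (a sketch; paraphrased codegen, not verbatim compiler output):
```
/* A trivial identity function, similar in spirit to _paravirt_ident_64: */
unsigned long long ident(unsigned long long x)
{
	return x;
}

/*
 * Built with -fzero-call-used-regs=used-gpr, the epilogue roughly
 * becomes (paraphrased):
 *
 *	movq	%rdi, %rax
 *	xorl	%edi, %edi	# argument register zeroed before return
 *	retq
 *
 * which is why the "testb $-128, %dil" above can read zero.
 */
```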
Link: https://github.com/KSPP/linux/issues/192
Signed-off-by: Bill Wendling <[email protected]>
Reported-and-tested-by: Nathan Chancellor <[email protected]>
---
arch/x86/include/asm/paravirt_types.h | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index f04157456a49..b1ab5d94881b 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -414,8 +414,17 @@ int paravirt_disable_iospace(void);
"=c" (__ecx)
#define PVOP_CALL_CLOBBERS PVOP_VCALL_CLOBBERS, "=a" (__eax)
-/* void functions are still allowed [re]ax for scratch */
+/*
+ * void functions are still allowed [re]ax for scratch.
+ *
+ * The ZERO_CALL_USED_REGS feature may end up zeroing out caller-saved
+ * registers. Make sure we model this with the appropriate clobbers.
+ */
+#ifdef CONFIG_ZERO_CALL_USED_REGS
+#define PVOP_VCALLEE_CLOBBERS "=a" (__eax), PVOP_VCALL_CLOBBERS
+#else
#define PVOP_VCALLEE_CLOBBERS "=a" (__eax)
+#endif
#define PVOP_CALLEE_CLOBBERS PVOP_VCALLEE_CLOBBERS
#define EXTRA_CLOBBERS , "r8", "r9", "r10", "r11"
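For context: PVOP_VCALLEE_CLOBBERS is consumed as the output-constraint
list of the generated asm statement. A heavily reduced sketch of that
plumbing (the real ____PVOP_CALL machinery in paravirt_types.h adds
argument handling, NULL checks, and more):
```
/*
 * Heavily reduced sketch, not the real macro: the point is only that
 * PVOP_VCALLEE_CLOBBERS lands in the output ("clobber") position, so
 * adding registers there tells the compiler they don't survive the call.
 */
#define SKETCH_PVOP_VCALLEE0(op)					\
({									\
	PVOP_CALL_ARGS;		/* declares __eax, __edx, __ecx, ... */	\
	asm volatile(paravirt_alt(PARAVIRT_CALL)			\
		     : PVOP_VCALLEE_CLOBBERS, ASM_CALL_CONSTRAINT	\
		     : paravirt_type(op),				\
		       paravirt_clobber(CLBR_RET_REG)			\
		     : "memory", "cc");					\
})
```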
--
2.37.2.789.g6183377224-goog
Drive-by cleanup of the comment.
[ Impact: cleanup]
Signed-off-by: Bill Wendling <[email protected]>
---
arch/x86/include/asm/paravirt_types.h | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 89df6c6617f5..f04157456a49 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -328,7 +328,7 @@ int paravirt_disable_iospace(void);
* Unfortunately, this is a relatively slow operation for modern CPUs,
* because it cannot necessarily determine what the destination
* address is. In this case, the address is a runtime constant, so at
- * the very least we can patch the call to e a simple direct call, or
+ * the very least we can patch the call to a simple direct call, or,
* ideally, patch an inline implementation into the callsite. (Direct
* calls are essentially free, because the call and return addresses
* are completely predictable.)
@@ -339,10 +339,10 @@ int paravirt_disable_iospace(void);
* on the stack. All caller-save registers (eax,edx,ecx) are expected
* to be modified (either clobbered or used for return values).
* X86_64, on the other hand, already specifies a register-based calling
- * conventions, returning at %rax, with parameters going on %rdi, %rsi,
+ * conventions, returning at %rax, with parameters going in %rdi, %rsi,
* %rdx, and %rcx. Note that for this reason, x86_64 does not need any
* special handling for dealing with 4 arguments, unlike i386.
- * However, x86_64 also have to clobber all caller saved registers, which
+ * However, x86_64 also has to clobber all caller saved registers, which
* unfortunately, are quite a bit (r8 - r11)
*
* The call instruction itself is marked by placing its start address
@@ -360,22 +360,22 @@ int paravirt_disable_iospace(void);
* There are 5 sets of PVOP_* macros for dealing with 0-4 arguments.
* It could be extended to more arguments, but there would be little
* to be gained from that. For each number of arguments, there are
- * the two VCALL and CALL variants for void and non-void functions.
+ * two VCALL and CALL variants for void and non-void functions.
*
* When there is a return value, the invoker of the macro must specify
* the return type. The macro then uses sizeof() on that type to
- * determine whether its a 32 or 64 bit value, and places the return
+ * determine whether it's a 32 or 64 bit value and places the return
* in the right register(s) (just %eax for 32-bit, and %edx:%eax for
- * 64-bit). For x86_64 machines, it just returns at %rax regardless of
+ * 64-bit). For x86_64 machines, it just returns in %rax regardless of
* the return value size.
*
- * 64-bit arguments are passed as a pair of adjacent 32-bit arguments
+ * 64-bit arguments are passed as a pair of adjacent 32-bit arguments;
* i386 also passes 64-bit arguments as a pair of adjacent 32-bit arguments
* in low,high order
*
* Small structures are passed and returned in registers. The macro
* calling convention can't directly deal with this, so the wrapper
- * functions must do this.
+ * functions must do it.
*
* These PVOP_* macros are only defined within this header. This
* means that all uses must be wrapped in inline functions. This also
--
2.37.2.789.g6183377224-goog
On Fri, Sep 02, 2022 at 09:37:49PM +0000, Bill Wendling wrote:
> Drive-by cleanup of the comment.
>
> [ Impact: cleanup]
We used to do that type of "impact" tagging years ago but we stopped.
Where did you dig this out from?
:)
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Fri, Sep 02, 2022 at 09:37:50PM +0000, Bill Wendling wrote:
> [...]
> callq *pv_ops+536(%rip)
Do you know which pv_ops function this is? I can't figure out where
pte_offset_kernel() gets converted into a pv_ops call....
> [...]
> --- a/arch/x86/include/asm/paravirt_types.h
> +++ b/arch/x86/include/asm/paravirt_types.h
> @@ -414,8 +414,17 @@ int paravirt_disable_iospace(void);
> "=c" (__ecx)
> #define PVOP_CALL_CLOBBERS PVOP_VCALL_CLOBBERS, "=a" (__eax)
>
> -/* void functions are still allowed [re]ax for scratch */
> +/*
> + * void functions are still allowed [re]ax for scratch.
> + *
> + * The ZERO_CALL_USED_REGS feature may end up zeroing out caller-saved
> + * registers. Make sure we model this with the appropriate clobbers.
> + */
> +#ifdef CONFIG_ZERO_CALL_USED_REGS
> +#define PVOP_VCALLEE_CLOBBERS "=a" (__eax), PVOP_VCALL_CLOBBERS
> +#else
> #define PVOP_VCALLEE_CLOBBERS "=a" (__eax)
> +#endif
> #define PVOP_CALLEE_CLOBBERS PVOP_VCALLEE_CLOBBERS
I don't think this should depend on CONFIG_ZERO_CALL_USED_REGS; it should
always be present.
I've only been looking at this just now, so maybe I'm missing
something. The callee clobbers are for functions with return values,
yes?
For example, 32-bit has to manually deal with doing a 64-bit value return,
and even got it wrong originally, fixing it in commit 0eb592dbba40
("x86/paravirt: return full 64-bit result"), with:
-#define PVOP_VCALLEE_CLOBBERS "=a" (__eax)
+#define PVOP_VCALLEE_CLOBBERS "=a" (__eax), "=d" (__edx)
But the naming is confusing, since these aren't actually clobbers,
they're output constraints used as clobbers (the "=" modifier).
Regardless, the note in the comments ...
...
* However, x86_64 also have to clobber all caller saved registers, which
* unfortunately, are quite a bit (r8 - r11)
...
... would indicate that ALL the function argument registers need to be
marked as clobbers (i.e. the compiler can't figure this out on its own).
I was going to say it seems like they're missing from EXTRA_CLOBBERS,
but it's not used with any of the macros using PVOP_VCALLEE_CLOBBERS,
and then I saw the weird alternatives patching that encodes the clobbers
a second time (CLBR_ANY vs CLBR_RET_REG) via:
#define _paravirt_alt(insn_string, type, clobber) \
"771:\n\t" insn_string "\n" "772:\n" \
".pushsection .parainstructions,\"a\"\n" \
_ASM_ALIGN "\n" \
_ASM_PTR " 771b\n" \
" .byte " type "\n" \
" .byte 772b-771b\n" \
" .short " clobber "\n" \
".popsection\n"
And after reading the alternatives patching code which parses this via
the following struct:
/* These all sit in the .parainstructions section to tell us what to patch. */
struct paravirt_patch_site {
u8 *instr; /* original instructions */
u8 type; /* type of this instruction */
u8 len; /* length of original instruction */
};
... I see it _doesn't use the clobbers_ at all! *head explode* I found
that removal in commit 27876f3882fd ("x86/paravirt: Remove clobbers from
struct paravirt_patch_site")
So, I guess the CLBR_* can all be entirely removed. But back to my other
train of thought...
It seems like all the input registers need to be explicitly listed in
the PVOP_VCALLEE_CLOBBERS list (as you have), but likely should be done
unconditionally and for 32-bit as well.
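Something like this, perhaps (untested sketch; the __edi/__esi locals
only exist in the 64-bit PVOP_CALL_ARGS):
```
/* Untested sketch: list every call-used register unconditionally. */
#ifdef CONFIG_X86_64
#define PVOP_VCALLEE_CLOBBERS	"=a" (__eax), "=D" (__edi),		\
				"=S" (__esi), "=d" (__edx),		\
				"=c" (__ecx)
#else /* CONFIG_X86_32 */
#define PVOP_VCALLEE_CLOBBERS	"=a" (__eax), "=d" (__edx),		\
				"=c" (__ecx)
#endif
```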
-Kees
(Also, please CC [email protected].)
--
Kees Cook
On Fri, Sep 2, 2022 at 9:28 PM Borislav Petkov <[email protected]> wrote:
>
> On Fri, Sep 02, 2022 at 09:37:49PM +0000, Bill Wendling wrote:
> > Drive-by cleanup of the comment.
> >
> > [ Impact: cleanup]
>
> We used to do that type of "impact" tagging years ago but we stopped.
> Where did you dig this out from?
>
> :)
>
It was in a comment from that file. :-)
-bw
On Sat, Sep 3, 2022 at 12:18 AM Kees Cook <[email protected]> wrote:
>
> On Fri, Sep 02, 2022 at 09:37:50PM +0000, Bill Wendling wrote:
> > [...]
> > callq *pv_ops+536(%rip)
>
> Do you know which pv_ops function this is? I can't figure out where
> pte_offset_kernel() gets converted into a pv_ops call....
>
This one is _paravirt_ident_64, I believe. I think that the original
issue Nathan was seeing was with another seemingly innocuous function.
> > [...]
> > --- a/arch/x86/include/asm/paravirt_types.h
> > +++ b/arch/x86/include/asm/paravirt_types.h
> > @@ -414,8 +414,17 @@ int paravirt_disable_iospace(void);
> > "=c" (__ecx)
> > #define PVOP_CALL_CLOBBERS PVOP_VCALL_CLOBBERS, "=a" (__eax)
> >
> > -/* void functions are still allowed [re]ax for scratch */
> > +/*
> > + * void functions are still allowed [re]ax for scratch.
> > + *
> > + * The ZERO_CALL_USED_REGS feature may end up zeroing out caller-saved
> > + * registers. Make sure we model this with the appropriate clobbers.
> > + */
> > +#ifdef CONFIG_ZERO_CALL_USED_REGS
> > +#define PVOP_VCALLEE_CLOBBERS "=a" (__eax), PVOP_VCALL_CLOBBERS
> > +#else
> > #define PVOP_VCALLEE_CLOBBERS "=a" (__eax)
> > +#endif
> > #define PVOP_CALLEE_CLOBBERS PVOP_VCALLEE_CLOBBERS
>
> I don't think this should depend on CONFIG_ZERO_CALL_USED_REGS; it should
> always be present.
>
> I've only been looking at this just now, so maybe I'm missing
> something. The callee clobbers are for functions with return values,
> yes?
>
Kinda. It seems that the usage here is to let the compiler know that a
register may be modified by the callee, not just that it's an "actual"
return value. So it's suitable for void functions.
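A toy example of that idiom outside of paravirt (some_helper is a
made-up symbol for illustration):
```
/*
 * Dummy-output idiom: the "=a" output tells the compiler that %[er]ax
 * may be changed by the call even though the value is never consumed.
 */
extern void some_helper(void);

static inline void poke_helper(void)
{
	unsigned long scratch;

	asm volatile("call some_helper"
		     : "=a" (scratch)	/* scratched, value unused */
		     :
		     : "memory");
}
```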
> For example, 32-bit has to manually deal with doing a 64-bit value return,
> and even got it wrong originally, fixing it in commit 0eb592dbba40
> ("x86/paravirt: return full 64-bit result"), with:
>
> -#define PVOP_VCALLEE_CLOBBERS "=a" (__eax)
> +#define PVOP_VCALLEE_CLOBBERS "=a" (__eax), "=d" (__edx)
>
> But the naming is confusing, since these aren't actually clobbers,
> they're output constraints used as clobbers (the "=" modifier).
>
Right.
> Regardless, the note in the comments ...
>
> ...
> * However, x86_64 also have to clobber all caller saved registers, which
> * unfortunately, are quite a bit (r8 - r11)
> ...
>
> ... would indicate that ALL the function argument registers need to be
> marked as clobbers (i.e. the compiler can't figure this out on its own).
>
Good point. And there are some forms of these macros that specify
those as clobbers.
> I was going to say it seems like they're missing from EXTRA_CLOBBERS,
> but it's not used with any of the macros using PVOP_VCALLEE_CLOBBERS,
> and then I saw the weird alternatives patching that encodes the clobbers
> a second time (CLBR_ANY vs CLBR_RET_REG) via:
>
> #define _paravirt_alt(insn_string, type, clobber) \
> "771:\n\t" insn_string "\n" "772:\n" \
> ".pushsection .parainstructions,\"a\"\n" \
> _ASM_ALIGN "\n" \
> _ASM_PTR " 771b\n" \
> " .byte " type "\n" \
> " .byte 772b-771b\n" \
> " .short " clobber "\n" \
> ".popsection\n"
>
> And after reading the alternatives patching code which parses this via
> the following struct:
>
> /* These all sit in the .parainstructions section to tell us what to patch. */
> struct paravirt_patch_site {
> u8 *instr; /* original instructions */
> u8 type; /* type of this instruction */
> u8 len; /* length of original instruction */
> };
>
> ... I see it _doesn't use the clobbers_ at all! *head explode* I found
> that removal in commit 27876f3882fd ("x86/paravirt: Remove clobbers from
> struct paravirt_patch_site")
>
> So, I guess the CLBR_* can all be entirely removed. But back to my other
> train of thought...
>
[switches stations]
> It seems like all the input registers need to be explicitly listed in
> the PVOP_VCALLEE_CLOBBERS list (as you have), but likely should be done
> unconditionally and for 32-bit as well.
>
Possibly, though it may cause significant code degradation when the
compiler needs to preserve a value that's live across the ASM statement
but whose register isn't actually modified. I saw that in the example I
gave in the description: where the "movq" is used, there's a useless
move of %rdi into %r11.
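Roughly (paraphrased codegen sketch; register choices are the
compiler's):
```
/*
 * With %rdi listed as an output, a value that's live across the call
 * has to be parked in another register:
 *
 *	movq	%rdi, %r11		# park the live value
 *	callq	*pv_ops+536(%rip)	# argument already in %rdi
 *	testb	$-128, %r11b		# use the parked copy
 *
 * If the callee never actually touches %rdi, the movq (and the extra
 * register pressure) is pure overhead.
 */
```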
> (Also, please CC [email protected].)
>
Doh! Someday I'll learn email.
-bw
On Sun, Sep 4, 2022 at 11:02 PM Bill Wendling <[email protected]> wrote:
>
> On Sat, Sep 3, 2022 at 12:18 AM Kees Cook <[email protected]> wrote:
> >
> > On Fri, Sep 02, 2022 at 09:37:50PM +0000, Bill Wendling wrote:
> > > [...]
> > > callq *pv_ops+536(%rip)
> >
> > Do you know which pv_ops function this is? I can't figure out where
> > pte_offset_kernel() gets converted into a pv_ops call....
> >
> This one is _paravirt_ident_64, I believe. I think that the original
> issue Nathan was seeing was with another seemingly innocuous function.
_paravirt_ident_64 is marked noinstr, which makes me suspect that it
really needs to not be touched at all by the compiler for
these...special features.
Maybe the definition of noinstr in include/linux/compiler_types.h
should add __attribute__((zero_call_used_regs("skip")))?
Untested:
```
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 4f2a819fd60a..a51ab77e2da8 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -226,10 +226,17 @@ struct ftrace_likely_data {
#define __no_sanitize_or_inline __always_inline
#endif
+#ifdef CONFIG_ZERO_CALL_USED_REGS
+#define __no_zero_call_used_regs __attribute__((__zero_call_used_regs__("skip")))
+#else
+#define __no_zero_call_used_regs
+#endif
+
/* Section for code which can't be instrumented at all */
#define noinstr \
noinline notrace __attribute((__section__(".noinstr.text"))) \
- __no_kcsan __no_sanitize_address __no_profile __no_sanitize_coverage
+ __no_kcsan __no_sanitize_address __no_profile \
+ __no_sanitize_coverage __no_zero_call_used_regs
#endif /* __KERNEL__ */
```
Or use __has_attribute in include/linux/compiler_attributes.h.
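That variant could look something like this (untested; mirrors the
existing __has_attribute blocks there):
```
/*
 * Untested sketch, following the __has_attribute pattern used in
 * include/linux/compiler_attributes.h:
 */
#if __has_attribute(__zero_call_used_regs__)
# define __no_zero_call_used_regs					\
	__attribute__((__zero_call_used_regs__("skip")))
#else
# define __no_zero_call_used_regs
#endif
```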
--
Thanks,
~Nick Desaulniers
On Tue, Sep 06, 2022 at 11:00:07PM -0700, Nick Desaulniers wrote:
> On Sun, Sep 4, 2022 at 11:02 PM Bill Wendling <[email protected]> wrote:
> >
> > On Sat, Sep 3, 2022 at 12:18 AM Kees Cook <[email protected]> wrote:
> > >
> > > On Fri, Sep 02, 2022 at 09:37:50PM +0000, Bill Wendling wrote:
> > > > [...]
> > > > callq *pv_ops+536(%rip)
> > >
> > > Do you know which pv_ops function this is? I can't figure out where
> > > pte_offset_kernel() gets converted into a pv_ops call....
> > >
> > This one is _paravirt_ident_64, I believe. I think that the original
> > issue Nathan was seeing was with another seemingly innocuous function.
>
> _paravirt_ident_64 is marked noinstr, which makes me suspect that it
> really needs to not be touched at all by the compiler for
> these...special features.
My source tree sayeth:
u64 notrace _paravirt_ident_64(u64 x)
And that function is only ever called at boot; after alternatives runs,
it's patched with:
mov %_ASM_ARG1, %_ASM_AX
Anyway, if you want to take it away from the compiler, something like
so should do.
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 7ca2d46c08cc..8922e2887779 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -80,11 +80,16 @@ static unsigned paravirt_patch_call(void *insn_buff, const void *target,
}
#ifdef CONFIG_PARAVIRT_XXL
-/* identity function, which can be inlined */
-u64 notrace _paravirt_ident_64(u64 x)
-{
- return x;
-}
+extern u64 _paravirt_ident_64(u64 x);
+asm (".pushsection .entry.text, \"ax\"\n"
+ ".global _paravirt_ident_64\n"
+ "_paravirt_ident_64:\n\t"
+ ASM_ENDBR
+ "mov %" _ASM_ARG1 ", %" _ASM_AX "\n\t"
+ ASM_RET
+ ".size _paravirt_ident_64, . - _paravirt_ident_64\n\t"
+ ".type _paravirt_ident_64, @function\n\t"
+ ".popsection");
#endif
DEFINE_STATIC_KEY_TRUE(virt_spin_lock_key);
On Wed, Sep 07, 2022 at 10:50:03AM +0200, Peter Zijlstra wrote:
> On Tue, Sep 06, 2022 at 11:00:07PM -0700, Nick Desaulniers wrote:
> > On Sun, Sep 4, 2022 at 11:02 PM Bill Wendling <[email protected]> wrote:
> > >
> > > On Sat, Sep 3, 2022 at 12:18 AM Kees Cook <[email protected]> wrote:
> > > >
> > > > On Fri, Sep 02, 2022 at 09:37:50PM +0000, Bill Wendling wrote:
> > > > > [...]
> > > > > callq *pv_ops+536(%rip)
> > > >
> > > > Do you know which pv_ops function this is? I can't figure out where
> > > > pte_offset_kernel() gets converted into a pv_ops call....
> > > >
> > > This one is _paravirt_ident_64, I believe. I think that the original
> > > issue Nathan was seeing was with another seemingly innocuous function.
> >
> > _paravirt_ident_64 is marked noinstr, which makes me suspect that it
> > really needs to not be touched at all by the compiler for
> > these...special features.
>
> My source tree sayeth:
>
> u64 notrace _paravirt_ident_64(u64 x)
I don't see noinstr either. But it seems a reasonable thing to do?
Bill, does fixing up the noinstr macro and adding it here fix the
problem?
--
Kees Cook
On Thu, Sep 8, 2022 at 12:10 AM Kees Cook <[email protected]> wrote:
>
> On Wed, Sep 07, 2022 at 10:50:03AM +0200, Peter Zijlstra wrote:
> > On Tue, Sep 06, 2022 at 11:00:07PM -0700, Nick Desaulniers wrote:
> > > On Sun, Sep 4, 2022 at 11:02 PM Bill Wendling <[email protected]> wrote:
> > > >
> > > > On Sat, Sep 3, 2022 at 12:18 AM Kees Cook <[email protected]> wrote:
> > > > >
> > > > > On Fri, Sep 02, 2022 at 09:37:50PM +0000, Bill Wendling wrote:
> > > > > > [...]
> > > > > > callq *pv_ops+536(%rip)
> > > > >
> > > > > Do you know which pv_ops function this is? I can't figure out where
> > > > > pte_offset_kernel() gets converted into a pv_ops call....
> > > > >
> > > > This one is _paravirt_ident_64, I believe. I think that the original
> > > > issue Nathan was seeing was with another seemingly innocuous function.
> > >
> > > _paravirt_ident_64 is marked noinstr, which makes me suspect that it
> > > really needs to not be touched at all by the compiler for
> > > these...special features.
> >
> > My source tree sayeth:
> >
> > u64 notrace _paravirt_ident_64(u64 x)
>
> I don't see noinstr either. But it seems a reasonable thing to do?
>
> Bill, does fixing up the noinstr macro and adding it here fix the
> problem?
>
[sorry for late response]
Let me give it a shot. I'll also test out Peter's suggestion, which
might be a better option in the long run. I suspect that we'll need to
devise similar patches for other places, but it shouldn't be hard to
find them all.
-bw
On Wed, Sep 07, 2022 at 10:50:03AM +0200, Peter Zijlstra wrote:
> On Tue, Sep 06, 2022 at 11:00:07PM -0700, Nick Desaulniers wrote:
> > On Sun, Sep 4, 2022 at 11:02 PM Bill Wendling <[email protected]> wrote:
> > >
> > > On Sat, Sep 3, 2022 at 12:18 AM Kees Cook <[email protected]> wrote:
> > > >
> > > > On Fri, Sep 02, 2022 at 09:37:50PM +0000, Bill Wendling wrote:
> > > > > [...]
> > > > > callq *pv_ops+536(%rip)
> > > >
> > > > Do you know which pv_ops function this is? I can't figure out where
> > > > pte_offset_kernel() gets converted into a pv_ops call....
> > > >
> > > This one is _paravirt_ident_64, I believe. I think that the original
> > > issue Nathan was seeing was with another seemingly innocuous function.
> >
> > _paravirt_ident_64 is marked noinstr, which makes me suspect that it
> > really needs to not be touched at all by the compiler for
> > these...special features.
>
> My source tree sayeth:
>
> u64 notrace _paravirt_ident_64(u64 x)
>
> And that function is only ever called at boot; after alternatives runs,
> it's patched with:
>
> mov %_ASM_ARG1, %_ASM_AX
>
> Anyway, if you want to take it away from the compiler, something like
> so should do.
This appears to work fine for me in QEMU, as I can still boot with
CONFIG_ZERO_CALL_USED_REGS and spawn a nested guest without any issues.
> diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
> index 7ca2d46c08cc..8922e2887779 100644
> --- a/arch/x86/kernel/paravirt.c
> +++ b/arch/x86/kernel/paravirt.c
> @@ -80,11 +80,16 @@ static unsigned paravirt_patch_call(void *insn_buff, const void *target,
> }
>
> #ifdef CONFIG_PARAVIRT_XXL
> -/* identity function, which can be inlined */
> -u64 notrace _paravirt_ident_64(u64 x)
> -{
> - return x;
> -}
> +extern u64 _paravirt_ident_64(u64 x);
> +asm (".pushsection .entry.text, \"ax\"\n"
> + ".global _paravirt_ident_64\n"
> + "_paravirt_ident_64:\n\t"
> + ASM_ENDBR
> + "mov %" _ASM_ARG1 ", %" _ASM_AX "\n\t"
> + ASM_RET
> + ".size _paravirt_ident_64, . - _paravirt_ident_64\n\t"
> + ".type _paravirt_ident_64, @function\n\t"
> + ".popsection");
> #endif
>
> DEFINE_STATIC_KEY_TRUE(virt_spin_lock_key);
>
On Wed, Sep 14, 2022 at 3:41 PM Nathan Chancellor <[email protected]> wrote:
> On Wed, Sep 07, 2022 at 10:50:03AM +0200, Peter Zijlstra wrote:
> > On Tue, Sep 06, 2022 at 11:00:07PM -0700, Nick Desaulniers wrote:
> > > On Sun, Sep 4, 2022 at 11:02 PM Bill Wendling <[email protected]> wrote:
> > > >
> > > > On Sat, Sep 3, 2022 at 12:18 AM Kees Cook <[email protected]> wrote:
> > > > >
> > > > > On Fri, Sep 02, 2022 at 09:37:50PM +0000, Bill Wendling wrote:
> > > > > > [...]
> > > > > > callq *pv_ops+536(%rip)
> > > > >
> > > > > Do you know which pv_ops function this is? I can't figure out where
> > > > > pte_offset_kernel() gets converted into a pv_ops call....
> > > > >
> > > > This one is _paravirt_ident_64, I believe. I think that the original
> > > > issue Nathan was seeing was with another seemingly innocuous function.
> > >
> > > _paravirt_ident_64 is marked noinstr, which makes me suspect that it
> > > really needs to not be touched at all by the compiler for
> > > these...special features.
> >
> > My source tree sayeth:
> >
> > u64 notrace _paravirt_ident_64(u64 x)
> >
> > And that function is only ever called at boot; after alternatives runs,
> > it's patched with:
> >
> > mov %_ASM_ARG1, %_ASM_AX
> >
> > Anyway, if you want to take it away from the compiler, something like
> > so should do.
>
> This appears to work fine for me in QEMU, as I can still boot with
> CONFIG_ZERO_CALL_USED_REGS and spawn a nested guest without any issues.
>
Thanks, Nathan. In that case I'd much prefer to use this patch and file
a separate issue to investigate the clobbers question later.
-bw
> > diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
> > index 7ca2d46c08cc..8922e2887779 100644
> > --- a/arch/x86/kernel/paravirt.c
> > +++ b/arch/x86/kernel/paravirt.c
> > @@ -80,11 +80,16 @@ static unsigned paravirt_patch_call(void *insn_buff, const void *target,
> > }
> >
> > #ifdef CONFIG_PARAVIRT_XXL
> > -/* identity function, which can be inlined */
> > -u64 notrace _paravirt_ident_64(u64 x)
> > -{
> > - return x;
> > -}
> > +extern u64 _paravirt_ident_64(u64 x);
> > +asm (".pushsection .entry.text, \"ax\"\n"
> > + ".global _paravirt_ident_64\n"
> > + "_paravirt_ident_64:\n\t"
> > + ASM_ENDBR
> > + "mov %" _ASM_ARG1 ", %" _ASM_AX "\n\t"
> > + ASM_RET
> > + ".size _paravirt_ident_64, . - _paravirt_ident_64\n\t"
> > + ".type _paravirt_ident_64, @function\n\t"
> > + ".popsection");
> > #endif
> >
> > DEFINE_STATIC_KEY_TRUE(virt_spin_lock_key);
> >