2023-11-02 11:29:26

by Uros Bizjak

Subject: [PATCH 0/3] Fix and unify call thunks assembly snippets

Currently, INCREMENT_CALL_DEPTH and the thunk debug macros hardcode
the %gs: segment register prefix for their percpu variables.
This is not compatible with !CONFIG_SMP, which requires non-prefixed
percpu variables.
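
As a rough illustration of a conditional percpu operand, here is a standalone C preprocessor demo. It is not kernel code: the DEMO_ names and the DEMO_SMP switch are hypothetical stand-ins for PER_CPU_VAR and CONFIG_SMP, and the real macro in percpu.h may differ in detail.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for CONFIG_SMP; set to 0 to see the other form. */
#define DEMO_SMP 1

#if DEMO_SMP
/* SMP: percpu variables are addressed through the %gs: segment prefix. */
#define DEMO_PER_CPU_VAR(var) %gs:var
#else
/* !SMP: percpu variables are plain symbols, no segment prefix. */
#define DEMO_PER_CPU_VAR(var) var
#endif

/* Two-level stringification so macro arguments are expanded first. */
#define DEMO_STR_(...) #__VA_ARGS__
#define DEMO_STR(...) DEMO_STR_(__VA_ARGS__)

/* Returns the assembly text a debug counter increment would expand to. */
static const char *demo_inc_calls(void)
{
	static const char insn[] =
		DEMO_STR(incq DEMO_PER_CPU_VAR(__x86_call_count));

	/* The mnemonic must survive stringification unchanged. */
	assert(strncmp(insn, "incq", 4) == 0);
	return insn;
}
```

A snippet written with a hardcoded %gs: prefix cannot adapt this way, which is the incompatibility described above.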

Unlike alternatives, call thunk templates currently do not support
relocations. Support for relocations will be needed when the
PER_CPU_VAR macro switches to %rip-relative addressing.

Because relocations are unsupported, two variants of the
INCREMENT_CALL_DEPTH macro are needed: an ASM_ prefixed version that
allows relocations and a non-prefixed version that allows only
absolute addresses.

This patch series fixes the above issues by:

a) Moving the call thunk template to its own callthunks-tmpl.S assembly
file, where the PER_CPU_VAR macro from percpu.h can be used to
conditionally use the %gs: segment register prefix, depending on
CONFIG_SMP.

b) Implementing minimal support for relocations when copying the call
thunk template from its storage location, in order to handle
%rip-relative addresses.

c) Fixing the call thunks debug macros to use the PER_CPU_VAR macro from
percpu.h to conditionally use the %gs: segment register prefix,
depending on CONFIG_SMP.

d) Unifying the ASM_ prefixed assembly macros with their non-prefixed
variants. With support for %rip-relative relocations in place, call
thunk templates allow %rip-relative addressing, so a unified assembly
snippet can be used everywhere.
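
To illustrate why copying a template with a %rip-relative operand needs a fixup at all, here is a minimal user-space sketch. It is not the code from patch 2/3: the demo_ helpers are hypothetical, and the caller passes in the displacement offset that a real implementation would find by decoding the instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy len bytes of machine code from src to dst and fix up one 32-bit
 * %rip-relative displacement stored at byte offset disp_off.
 * Assumption: the distance between src and dst fits in 32 bits.
 */
static void demo_copy_and_relocate(uint8_t *dst, const uint8_t *src,
				   size_t len, size_t disp_off)
{
	int32_t disp;

	assert(disp_off + sizeof(disp) <= len);
	memcpy(dst, src, len);

	memcpy(&disp, src + disp_off, sizeof(disp));
	/*
	 * The operand address is (end of instruction) + disp. Moving the
	 * code by (dst - src) moves the instruction end by the same
	 * amount, so the displacement shrinks by it to keep the target.
	 */
	disp += (int32_t)((intptr_t)src - (intptr_t)dst);
	memcpy(dst + disp_off, &disp, sizeof(disp));
}

/* Absolute address referenced by the instruction ending at insn_end. */
static intptr_t demo_target(const uint8_t *code, size_t disp_off,
			    size_t insn_end)
{
	int32_t disp;

	memcpy(&disp, code + disp_off, sizeof(disp));
	return (intptr_t)(code + insn_end) + disp;
}
```

For example, the bytes 48 c1 3d followed by a 4-byte displacement and the immediate 05 encode `sarq $5, disp(%rip)`; after demo_copy_and_relocate() with disp_off set to 3, both copies should refer to the same absolute address.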

This patch series is independent of the main percpu series in the -tip tree.

Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Peter Zijlstra <[email protected]>

Uros Bizjak (3):
  x86/callthunks: Move call thunk template to .S file
  x86/callthunks: Handle %rip-relative relocations in call thunk
    template
  x86/callthunks: Fix and unify call thunks assembly snippets

arch/x86/include/asm/nospec-branch.h | 23 +++------
arch/x86/kernel/Makefile | 2 +-
arch/x86/kernel/callthunks-tmpl.S | 11 +++++
arch/x86/kernel/callthunks.c | 73 +++++++++++++++++++++-------
4 files changed, 75 insertions(+), 34 deletions(-)
create mode 100644 arch/x86/kernel/callthunks-tmpl.S

--
2.41.0


2023-11-02 11:29:59

by Uros Bizjak

Subject: [PATCH 3/3] x86/callthunks: Fix and unify call thunks assembly snippets

Currently, the thunk debug macros hardcode the %gs: segment register
prefix for their percpu variables. This is not compatible with
!CONFIG_SMP, which requires non-prefixed percpu variables.

Fix the call thunks debug macros to use the PER_CPU_VAR macro from
percpu.h to conditionally use the %gs: segment register prefix,
depending on CONFIG_SMP.

Also, unify the ASM_ prefixed assembly macros with their non-prefixed
variants. With support for %rip-relative relocations in place, call
thunk templates allow %rip-relative addressing, so a unified assembly
snippet can be used everywhere.

Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Uros Bizjak <[email protected]>
---
arch/x86/include/asm/nospec-branch.h | 23 +++++++----------------
1 file changed, 7 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index f93e9b96927a..6f677be6bdb9 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -59,13 +59,13 @@

#ifdef CONFIG_CALL_THUNKS_DEBUG
# define CALL_THUNKS_DEBUG_INC_CALLS \
- incq %gs:__x86_call_count;
+ incq PER_CPU_VAR(__x86_call_count);
# define CALL_THUNKS_DEBUG_INC_RETS \
- incq %gs:__x86_ret_count;
+ incq PER_CPU_VAR(__x86_ret_count);
# define CALL_THUNKS_DEBUG_INC_STUFFS \
- incq %gs:__x86_stuffs_count;
+ incq PER_CPU_VAR(__x86_stuffs_count);
# define CALL_THUNKS_DEBUG_INC_CTXSW \
- incq %gs:__x86_ctxsw_count;
+ incq PER_CPU_VAR(__x86_ctxsw_count);
#else
# define CALL_THUNKS_DEBUG_INC_CALLS
# define CALL_THUNKS_DEBUG_INC_RETS
@@ -80,9 +80,6 @@
#define CREDIT_CALL_DEPTH \
movq $-1, PER_CPU_VAR(pcpu_hot + X86_call_depth);

-#define ASM_CREDIT_CALL_DEPTH \
- movq $-1, PER_CPU_VAR(pcpu_hot + X86_call_depth);
-
#define RESET_CALL_DEPTH \
xor %eax, %eax; \
bts $63, %rax; \
@@ -95,20 +92,14 @@
CALL_THUNKS_DEBUG_INC_CALLS

#define INCREMENT_CALL_DEPTH \
- sarq $5, %gs:pcpu_hot + X86_call_depth; \
- CALL_THUNKS_DEBUG_INC_CALLS
-
-#define ASM_INCREMENT_CALL_DEPTH \
sarq $5, PER_CPU_VAR(pcpu_hot + X86_call_depth); \
CALL_THUNKS_DEBUG_INC_CALLS

#else
#define CREDIT_CALL_DEPTH
-#define ASM_CREDIT_CALL_DEPTH
#define RESET_CALL_DEPTH
-#define INCREMENT_CALL_DEPTH
-#define ASM_INCREMENT_CALL_DEPTH
#define RESET_CALL_DEPTH_FROM_CALL
+#define INCREMENT_CALL_DEPTH
#endif

/*
@@ -158,7 +149,7 @@
jnz 771b; \
/* barrier for jnz misprediction */ \
lfence; \
- ASM_CREDIT_CALL_DEPTH \
+ CREDIT_CALL_DEPTH \
CALL_THUNKS_DEBUG_INC_CTXSW
#else
/*
@@ -311,7 +302,7 @@
.macro CALL_DEPTH_ACCOUNT
#ifdef CONFIG_CALL_DEPTH_TRACKING
ALTERNATIVE "", \
- __stringify(ASM_INCREMENT_CALL_DEPTH), X86_FEATURE_CALL_DEPTH
+ __stringify(INCREMENT_CALL_DEPTH), X86_FEATURE_CALL_DEPTH
#endif
.endm

--
2.41.0

2023-11-05 08:23:51

by kernel test robot

Subject: Re: [PATCH 3/3] x86/callthunks: Fix and unify call thunks assembly snippets

Hi Uros,

kernel test robot noticed the following build errors:

[auto build test ERROR on tip/x86/core]
[also build test ERROR on tip/master tip/auto-latest linus/master v6.6 next-20231103]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Uros-Bizjak/x86-callthunks-Move-call-thunk-template-to-S-file/20231102-193542
base: tip/x86/core
patch link: https://lore.kernel.org/r/20231102112850.3448745-4-ubizjak%40gmail.com
patch subject: [PATCH 3/3] x86/callthunks: Fix and unify call thunks assembly snippets
config: x86_64-allyesconfig (https://download.01.org/0day-ci/archive/20231105/[email protected]/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231105/[email protected]/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All errors (new ones prefixed by >>):

/tmp/ccwZh9MG.s: Assembler messages:
>> /tmp/ccwZh9MG.s:27: Error: junk `(pcpu_hot+16)' after expression
>> /tmp/ccwZh9MG.s:27: Error: junk `(__x86_call_count)' after expression

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki