2023-03-10 20:33:56

by Josh Poimboeuf

Subject: [RFC][PATCH 0/5] Improve static call NULL handling

Static calling a NULL pointer is a NOP, unless you're one of those poor
souls running on an arch (or a backported x86 monstrosity) with
CONFIG_HAVE_STATIC_CALL=n, in which case it's a panic.

The "fix" for this undefined behavior is to tell the user to just use
static_call_cond() instead, if they want consistent NOP behavior. But
forgetting to do that is likely to cause subtle bugs. It actually
already did (during RHEL development).

There are two ways to make it consistent:

a) Make static_call(NULL) a NOP for all configs; or

b) Make static_call(NULL) a panic for all configs.

Do (a), because it's consistent with the existing HAVE_STATIC_CALL
behavior. It also seems simpler to implement and use, and judging from
the existing use cases, "do nothing and return 0" is the default
behavior most callers want.

Then take it a step further and get rid of the distinction between
STATIC_CALL_NULL and STATIC_CALL_RET0.

The end result is less confusing semantics and simpler code all around.


EPILOGUE
--------

If any users wanted panic-on-NULL by default instead of NOP-on-NULL,
that could be added on top of this. They could just initialize the
static call with a __static_call_bug() helper.

void __static_call_bug(void)
{
BUG();
}
...
DEFINE_STATIC_CALL(foo, (func_type)__static_call_bug);

We could take that even further:

DEFINE_STATIC_CALL_NOP(foo, func_type);
DEFINE_STATIC_CALL_BUG(bar, func_type);
...
#define STATIC_CALL_NOP (func_type)__static_call_nop
#define STATIC_CALL_BUG (func_type)__static_call_bug
...
static_call_update(foo, STATIC_CALL_NOP); // do nothing and return 0
static_call_update(foo, STATIC_CALL_BUG); // panic
static_call_update(foo, NULL); // ???

The default behavior for NULL could be a key-specific policy, stored as
a flag in the static_call_key struct.

The key-specific policy would be easier to deal with than the
call-site-specific policy we have today with static_call_cond().
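
A rough sketch of how that could look (the "null_means_bug" flag and the
__static_call_update_null() helper below are hypothetical):

/* hypothetical new flag in struct static_call_key: bool null_means_bug; */

static void __static_call_update_null(struct static_call_key *key, void *tramp)
{
	void *func = key->null_means_bug ? (void *)__static_call_bug
					 : (void *)__static_call_return0;

	__static_call_update(key, tramp, func);
}

static_call_update() would then redirect to it whenever it's passed NULL.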



Josh Poimboeuf (5):
static_call: Make NULL static calls consistent
static_call: Make NULL static calls return 0
static_call: Remove static_call_cond() and its usages
static_call: Remove DEFINE_STATIC_CALL_RET0() and its uses
x86/kvm: Simplify static call handling

arch/powerpc/include/asm/static_call.h | 1 -
arch/powerpc/kernel/irq.c | 2 +-
arch/x86/events/amd/core.c | 2 +-
arch/x86/events/core.c | 26 ++---
arch/x86/include/asm/kvm-x86-ops.h | 86 +++++++-------
arch/x86/include/asm/kvm-x86-pmu-ops.h | 17 +--
arch/x86/include/asm/kvm_host.h | 6 +-
arch/x86/include/asm/static_call.h | 8 --
arch/x86/kvm/irq.c | 2 +-
arch/x86/kvm/lapic.c | 22 ++--
arch/x86/kvm/pmu.c | 11 +-
arch/x86/kvm/x86.c | 36 +++---
include/linux/static_call.h | 131 +++++-----------------
kernel/events/core.c | 8 +-
kernel/sched/core.c | 10 +-
security/keys/trusted-keys/trusted_core.c | 2 +-
16 files changed, 126 insertions(+), 244 deletions(-)

--
2.39.2



2023-03-10 20:34:00

by Josh Poimboeuf

Subject: [RFC][PATCH 2/5] static_call: Make NULL static calls return 0

Now that all NULL static calls are just NOPs, the only existing use of
static_call_cond() is to cast static_call() to void. That prevents
calling code from checking the return value of a NULL static call.

The following results in a compiler error:

static_call_update(bar, NULL);
...
foo = static_call_cond(bar)();

But the compiler error only happens if somebody remembers to use
static_call_cond(). If they use static_call() instead, 'foo' is
undefined. So the "protection" offered by static_call_cond() is
half-hearted and misleading.

If NULL static calls are going to be NOPs, the next logical step is to
have their return values be 0. In other words, combine NULL and RET0
static calls into a single concept.

While it doesn't necessarily make sense to return 0 for void-return
functions, it's still pretty much harmless. The return value register
is already callee-clobbered, and an extra "xor %eax, %eax" shouldn't
move the needle on performance.

This "do nothing return 0" default should work for the vast majority of
cases. Otherwise it can be easily overridden with a user-specified
function which panics or returns 0xdeadbeef or does whatever you want.

This simplifies the static call code and also tends to help simplify
users' code as well.
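
For example, after this patch the following is well-defined for all
configs (the "foo" static call and my_foo() are made up for
illustration):

int my_foo(int arg);
DEFINE_STATIC_CALL_NULL(foo, my_foo);
...
static_call_update(foo, NULL);
ret = static_call(foo)(1); /* does nothing, ret == 0 */

static_call_update(foo, my_foo);
ret = static_call(foo)(1); /* calls my_foo(1) */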

Signed-off-by: Josh Poimboeuf <[email protected]>
---
arch/powerpc/include/asm/static_call.h | 1 -
arch/x86/include/asm/static_call.h | 8 --
include/linux/static_call.h | 100 +++++++++----------------
3 files changed, 34 insertions(+), 75 deletions(-)

diff --git a/arch/powerpc/include/asm/static_call.h b/arch/powerpc/include/asm/static_call.h
index de1018cc522b..0b17fc551157 100644
--- a/arch/powerpc/include/asm/static_call.h
+++ b/arch/powerpc/include/asm/static_call.h
@@ -23,7 +23,6 @@
#define PPC_SCT_DATA 28 /* Offset of label 2 */

#define ARCH_DEFINE_STATIC_CALL_TRAMP(name, func) __PPC_SCT(name, "b " #func)
-#define ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name) __PPC_SCT(name, "blr")
#define ARCH_DEFINE_STATIC_CALL_RET0_TRAMP(name) __PPC_SCT(name, "b .+20")

#endif /* _ASM_POWERPC_STATIC_CALL_H */
diff --git a/arch/x86/include/asm/static_call.h b/arch/x86/include/asm/static_call.h
index 343b722ccaf2..118d3d2b837f 100644
--- a/arch/x86/include/asm/static_call.h
+++ b/arch/x86/include/asm/static_call.h
@@ -46,14 +46,6 @@
#define ARCH_DEFINE_STATIC_CALL_TRAMP(name, func) \
__ARCH_DEFINE_STATIC_CALL_TRAMP(name, ".byte 0xe9; .long " #func " - (. + 4)")

-#ifdef CONFIG_RETHUNK
-#define ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name) \
- __ARCH_DEFINE_STATIC_CALL_TRAMP(name, "jmp __x86_return_thunk")
-#else
-#define ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name) \
- __ARCH_DEFINE_STATIC_CALL_TRAMP(name, "ret; int3; nop; nop; nop")
-#endif
-
#define ARCH_DEFINE_STATIC_CALL_RET0_TRAMP(name) \
ARCH_DEFINE_STATIC_CALL_TRAMP(name, __static_call_return0)

diff --git a/include/linux/static_call.h b/include/linux/static_call.h
index 8b12216da0da..58783d112df5 100644
--- a/include/linux/static_call.h
+++ b/include/linux/static_call.h
@@ -65,47 +65,31 @@
*
* Notes on NULL function pointers:
*
- * A static_call() to a NULL function pointer is a NOP.
+ * A static_call() to a NULL function pointer is equivalent to a call to a
+ * "do nothing return 0" function.
*
* A NULL static call can be the result of:
*
* DECLARE_STATIC_CALL_NULL(my_static_call, void (*)(int));
*
- * or using static_call_update() with a NULL function. In both cases the
- * HAVE_STATIC_CALL implementation will patch the trampoline with a RET
- * instruction, instead of an immediate tail-call JMP. HAVE_STATIC_CALL_INLINE
- * architectures can patch the trampoline call to a NOP.
+ * or using static_call_update() with a NULL function.
*
- * In all cases, any argument evaluation is unconditional. Unlike a regular
- * conditional function pointer call:
+ * The "return 0" feature is strictly UB per the C standard (since it casts a
+ * function pointer to a different signature) and relies on the architecture
+ * ABI to make things work. In particular it relies on the return value
+ * register being callee-clobbered for all function calls.
*
- * if (my_func_ptr)
- * my_func_ptr(arg1)
- *
- * where the argument evaludation also depends on the pointer value.
- *
- * To query which function is currently set to be called, use:
- *
- * func = static_call_query(name);
+ * In particular the x86_64 implementation of HAVE_STATIC_CALL_INLINE
+ * replaces the 5 byte CALL instruction at the callsite with a 5 byte clear
+ * of the RAX register, completely eliding any function call overhead.
*
+ * Any argument evaluation is unconditional. Unlike a regular conditional
+ * function pointer call:
*
- * DEFINE_STATIC_CALL_RET0 / __static_call_return0:
- *
- * Just like how DEFINE_STATIC_CALL_NULL() optimizes the
- * conditional void function call, DEFINE_STATIC_CALL_RET0 /
- * __static_call_return0 optimize the do nothing return 0 function.
- *
- * This feature is strictly UB per the C standard (since it casts a function
- * pointer to a different signature) and relies on the architecture ABI to
- * make things work. In particular it relies on Caller Stack-cleanup and the
- * whole return register being clobbered for short return values. All normal
- * CDECL style ABIs conform.
- *
- * In particular the x86_64 implementation replaces the 5 byte CALL
- * instruction at the callsite with a 5 byte clear of the RAX register,
- * completely eliding any function call overhead.
+ * if (my_func_ptr)
+ * my_func_ptr(arg1)
*
- * Notably argument setup is unconditional.
+ * where the argument evaluation also depends on the pointer value.
*
*
* EXPORT_STATIC_CALL() vs EXPORT_STATIC_CALL_TRAMP():
@@ -134,14 +118,21 @@ extern void arch_static_call_transform(void *site, void *tramp, void *func, bool
#define STATIC_CALL_TRAMP_ADDR(name) NULL
#endif

+extern long __static_call_return0(void);
+
#define static_call_update(name, func) \
({ \
typeof(&STATIC_CALL_TRAMP(name)) __F = (func); \
+ void *__f = (void *)__F ? : (void *)__static_call_return0; \
__static_call_update(&STATIC_CALL_KEY(name), \
- STATIC_CALL_TRAMP_ADDR(name), __F); \
+ STATIC_CALL_TRAMP_ADDR(name), __f); \
})

-#define static_call_query(name) (READ_ONCE(STATIC_CALL_KEY(name).func))
+#define static_call_query(name) \
+({ \
+ void *__func = (READ_ONCE(STATIC_CALL_KEY(name).func)); \
+ __func == __static_call_return0 ? NULL : __func; \
+})

#ifdef CONFIG_HAVE_STATIC_CALL_INLINE

@@ -165,8 +156,6 @@ extern void __static_call_update(struct static_call_key *key, void *tramp, void
extern int static_call_mod_init(struct module *mod);
extern int static_call_text_reserved(void *start, void *end);

-extern long __static_call_return0(void);
-
#define DEFINE_STATIC_CALL(name, _func) \
DECLARE_STATIC_CALL(name, _func); \
struct static_call_key STATIC_CALL_KEY(name) = { \
@@ -176,14 +165,6 @@ extern long __static_call_return0(void);
ARCH_DEFINE_STATIC_CALL_TRAMP(name, _func)

#define DEFINE_STATIC_CALL_NULL(name, _func) \
- DECLARE_STATIC_CALL(name, _func); \
- struct static_call_key STATIC_CALL_KEY(name) = { \
- .func = NULL, \
- .type = 1, \
- }; \
- ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name)
-
-#define DEFINE_STATIC_CALL_RET0(name, _func) \
DECLARE_STATIC_CALL(name, _func); \
struct static_call_key STATIC_CALL_KEY(name) = { \
.func = __static_call_return0, \
@@ -191,7 +172,9 @@ extern long __static_call_return0(void);
}; \
ARCH_DEFINE_STATIC_CALL_RET0_TRAMP(name)

-#define static_call_cond(name) (void)__static_call(name)
+#define DEFINE_STATIC_CALL_RET0 DEFINE_STATIC_CALL_NULL
+
+#define static_call_cond(name) __static_call(name)

#define EXPORT_STATIC_CALL(name) \
EXPORT_SYMBOL(STATIC_CALL_KEY(name)); \
@@ -222,18 +205,13 @@ static inline int static_call_init(void) { return 0; }
#define DEFINE_STATIC_CALL_NULL(name, _func) \
DECLARE_STATIC_CALL(name, _func); \
struct static_call_key STATIC_CALL_KEY(name) = { \
- .func = NULL, \
+ .func = __static_call_return0, \
}; \
ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name)

-#define DEFINE_STATIC_CALL_RET0(name, _func) \
- DECLARE_STATIC_CALL(name, _func); \
- struct static_call_key STATIC_CALL_KEY(name) = { \
- .func = __static_call_return0, \
- }; \
- ARCH_DEFINE_STATIC_CALL_RET0_TRAMP(name)
+#define DEFINE_STATIC_CALL_RET0 DEFINE_STATIC_CALL_NULL

-#define static_call_cond(name) (void)__static_call(name)
+#define static_call_cond(name) __static_call(name)

static inline
void __static_call_update(struct static_call_key *key, void *tramp, void *func)
@@ -249,8 +227,6 @@ static inline int static_call_text_reserved(void *start, void *end)
return 0;
}

-extern long __static_call_return0(void);
-
#define EXPORT_STATIC_CALL(name) \
EXPORT_SYMBOL(STATIC_CALL_KEY(name)); \
EXPORT_SYMBOL(STATIC_CALL_TRAMP(name))
@@ -268,13 +244,6 @@ extern long __static_call_return0(void);

static inline int static_call_init(void) { return 0; }

-static inline void __static_call_nop(void) { }
-
-static inline long __static_call_return0(void)
-{
- return 0;
-}
-
#define __DEFINE_STATIC_CALL(name, _func, _func_init) \
DECLARE_STATIC_CALL(name, _func); \
struct static_call_key STATIC_CALL_KEY(name) = { \
@@ -285,17 +254,16 @@ static inline long __static_call_return0(void)
__DEFINE_STATIC_CALL(name, _func, _func)

#define DEFINE_STATIC_CALL_NULL(name, _func) \
- __DEFINE_STATIC_CALL(name, _func, __static_call_nop)
-
-#define DEFINE_STATIC_CALL_RET0(name, _func) \
__DEFINE_STATIC_CALL(name, _func, __static_call_return0)

-#define static_call_cond(name) (void)static_call(name)
+#define DEFINE_STATIC_CALL_RET0 DEFINE_STATIC_CALL_NULL
+
+#define static_call_cond(name) static_call(name)

static inline
void __static_call_update(struct static_call_key *key, void *tramp, void *func)
{
- WRITE_ONCE(key->func, func ? : (void *)__static_call_nop);
+ WRITE_ONCE(key->func, func);
}

static inline int static_call_text_reserved(void *start, void *end)
--
2.39.2


2023-03-10 20:34:04

by Josh Poimboeuf

Subject: [RFC][PATCH 3/5] static_call: Remove static_call_cond() and its usages

static_call_cond() is now identical to static_call(). All static calls
to NULL function pointers are now NOPs for all configs. Replace all
static_call_cond() usages with static_call() and remove it.

Signed-off-by: Josh Poimboeuf <[email protected]>
---
arch/x86/events/core.c | 24 +++++++++++------------
arch/x86/include/asm/kvm-x86-ops.h | 3 +--
arch/x86/include/asm/kvm-x86-pmu-ops.h | 3 +--
arch/x86/include/asm/kvm_host.h | 4 ++--
arch/x86/kvm/irq.c | 2 +-
arch/x86/kvm/lapic.c | 22 ++++++++++-----------
arch/x86/kvm/pmu.c | 4 ++--
arch/x86/kvm/x86.c | 24 +++++++++++------------
include/linux/static_call.h | 7 -------
security/keys/trusted-keys/trusted_core.c | 2 +-
10 files changed, 43 insertions(+), 52 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index d096b04bf80e..c94537501091 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -995,7 +995,7 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
if (cpuc->txn_flags & PERF_PMU_TXN_ADD)
n0 -= cpuc->n_txn;

- static_call_cond(x86_pmu_start_scheduling)(cpuc);
+ static_call(x86_pmu_start_scheduling)(cpuc);

for (i = 0, wmin = X86_PMC_IDX_MAX, wmax = 0; i < n; i++) {
c = cpuc->event_constraint[i];
@@ -1094,7 +1094,7 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
*/
if (!unsched && assign) {
for (i = 0; i < n; i++)
- static_call_cond(x86_pmu_commit_scheduling)(cpuc, i, assign[i]);
+ static_call(x86_pmu_commit_scheduling)(cpuc, i, assign[i]);
} else {
for (i = n0; i < n; i++) {
e = cpuc->event_list[i];
@@ -1102,13 +1102,13 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
/*
* release events that failed scheduling
*/
- static_call_cond(x86_pmu_put_event_constraints)(cpuc, e);
+ static_call(x86_pmu_put_event_constraints)(cpuc, e);

cpuc->event_constraint[i] = NULL;
}
}

- static_call_cond(x86_pmu_stop_scheduling)(cpuc);
+ static_call(x86_pmu_stop_scheduling)(cpuc);

return unsched ? -EINVAL : 0;
}
@@ -1221,7 +1221,7 @@ static inline void x86_assign_hw_event(struct perf_event *event,
hwc->last_cpu = smp_processor_id();
hwc->last_tag = ++cpuc->tags[i];

- static_call_cond(x86_pmu_assign)(event, idx);
+ static_call(x86_pmu_assign)(event, idx);

switch (hwc->idx) {
case INTEL_PMC_IDX_FIXED_BTS:
@@ -1399,7 +1399,7 @@ int x86_perf_event_set_period(struct perf_event *event)
if (left > x86_pmu.max_period)
left = x86_pmu.max_period;

- static_call_cond(x86_pmu_limit_period)(event, &left);
+ static_call(x86_pmu_limit_period)(event, &left);

this_cpu_write(pmc_prev_left[idx], left);

@@ -1487,7 +1487,7 @@ static int x86_pmu_add(struct perf_event *event, int flags)
* This is before x86_pmu_enable() will call x86_pmu_start(),
* so we enable LBRs before an event needs them etc..
*/
- static_call_cond(x86_pmu_add)(event);
+ static_call(x86_pmu_add)(event);

ret = 0;
out:
@@ -1640,7 +1640,7 @@ static void x86_pmu_del(struct perf_event *event, int flags)
if (i >= cpuc->n_events - cpuc->n_added)
--cpuc->n_added;

- static_call_cond(x86_pmu_put_event_constraints)(cpuc, event);
+ static_call(x86_pmu_put_event_constraints)(cpuc, event);

/* Delete the array entry. */
while (++i < cpuc->n_events) {
@@ -1660,7 +1660,7 @@ static void x86_pmu_del(struct perf_event *event, int flags)
* This is after x86_pmu_stop(); so we disable LBRs after any
* event can need them etc..
*/
- static_call_cond(x86_pmu_del)(event);
+ static_call(x86_pmu_del)(event);
}

int x86_pmu_handle_irq(struct pt_regs *regs)
@@ -2627,13 +2627,13 @@ static const struct attribute_group *x86_pmu_attr_groups[] = {

static void x86_pmu_sched_task(struct perf_event_pmu_context *pmu_ctx, bool sched_in)
{
- static_call_cond(x86_pmu_sched_task)(pmu_ctx, sched_in);
+ static_call(x86_pmu_sched_task)(pmu_ctx, sched_in);
}

static void x86_pmu_swap_task_ctx(struct perf_event_pmu_context *prev_epc,
struct perf_event_pmu_context *next_epc)
{
- static_call_cond(x86_pmu_swap_task_ctx)(prev_epc, next_epc);
+ static_call(x86_pmu_swap_task_ctx)(prev_epc, next_epc);
}

void perf_check_microcode(void)
@@ -2672,7 +2672,7 @@ static bool x86_pmu_filter(struct pmu *pmu, int cpu)
{
bool ret = false;

- static_call_cond(x86_pmu_filter)(pmu, cpu, &ret);
+ static_call(x86_pmu_filter)(pmu, cpu, &ret);

return ret;
}
diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 8dc345cc6318..2f0bfd910637 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -9,8 +9,7 @@ BUILD_BUG_ON(1)
* "static_call_update()" calls.
*
* KVM_X86_OP_OPTIONAL() can be used for those functions that can have
- * a NULL definition, for example if "static_call_cond()" will be used
- * at the call sites. KVM_X86_OP_OPTIONAL_RET0() can be used likewise
+ * a NULL definition. KVM_X86_OP_OPTIONAL_RET0() can be used likewise
* to make a definition optional, but in this case the default will
* be __static_call_return0.
*/
diff --git a/arch/x86/include/asm/kvm-x86-pmu-ops.h b/arch/x86/include/asm/kvm-x86-pmu-ops.h
index c17e3e96fc1d..6815319c4ff3 100644
--- a/arch/x86/include/asm/kvm-x86-pmu-ops.h
+++ b/arch/x86/include/asm/kvm-x86-pmu-ops.h
@@ -9,8 +9,7 @@ BUILD_BUG_ON(1)
* "static_call_update()" calls.
*
* KVM_X86_PMU_OP_OPTIONAL() can be used for those functions that can have
- * a NULL definition, for example if "static_call_cond()" will be used
- * at the call sites.
+ * a NULL definition.
*/
KVM_X86_PMU_OP(hw_event_available)
KVM_X86_PMU_OP(pmc_is_enabled)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 808c292ad3f4..1dfba499d3e5 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -2172,12 +2172,12 @@ static inline bool kvm_irq_is_postable(struct kvm_lapic_irq *irq)

static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
{
- static_call_cond(kvm_x86_vcpu_blocking)(vcpu);
+ static_call(kvm_x86_vcpu_blocking)(vcpu);
}

static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
{
- static_call_cond(kvm_x86_vcpu_unblocking)(vcpu);
+ static_call(kvm_x86_vcpu_unblocking)(vcpu);
}

static inline int kvm_cpu_get_apicid(int mps_cpu)
diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
index b2c397dd2bc6..4f9e090c9d42 100644
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -155,7 +155,7 @@ void __kvm_migrate_timers(struct kvm_vcpu *vcpu)
{
__kvm_migrate_apic_timer(vcpu);
__kvm_migrate_pit_timer(vcpu);
- static_call_cond(kvm_x86_migrate_timers)(vcpu);
+ static_call(kvm_x86_migrate_timers)(vcpu);
}

bool kvm_arch_irqfd_allowed(struct kvm *kvm, struct kvm_irqfd *args)
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index e542cf285b51..d5f7e829d975 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -681,8 +681,8 @@ static inline void apic_clear_irr(int vec, struct kvm_lapic *apic)
if (unlikely(apic->apicv_active)) {
/* need to update RVI */
kvm_lapic_clear_vector(vec, apic->regs + APIC_IRR);
- static_call_cond(kvm_x86_hwapic_irr_update)(apic->vcpu,
- apic_find_highest_irr(apic));
+ static_call(kvm_x86_hwapic_irr_update)(apic->vcpu,
+ apic_find_highest_irr(apic));
} else {
apic->irr_pending = false;
kvm_lapic_clear_vector(vec, apic->regs + APIC_IRR);
@@ -708,7 +708,7 @@ static inline void apic_set_isr(int vec, struct kvm_lapic *apic)
* just set SVI.
*/
if (unlikely(apic->apicv_active))
- static_call_cond(kvm_x86_hwapic_isr_update)(vec);
+ static_call(kvm_x86_hwapic_isr_update)(vec);
else {
++apic->isr_count;
BUG_ON(apic->isr_count > MAX_APIC_VECTOR);
@@ -753,7 +753,7 @@ static inline void apic_clear_isr(int vec, struct kvm_lapic *apic)
* and must be left alone.
*/
if (unlikely(apic->apicv_active))
- static_call_cond(kvm_x86_hwapic_isr_update)(apic_find_highest_isr(apic));
+ static_call(kvm_x86_hwapic_isr_update)(apic_find_highest_isr(apic));
else {
--apic->isr_count;
BUG_ON(apic->isr_count < 0);
@@ -2519,7 +2519,7 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)

if ((old_value ^ value) & (MSR_IA32_APICBASE_ENABLE | X2APIC_ENABLE)) {
kvm_make_request(KVM_REQ_APICV_UPDATE, vcpu);
- static_call_cond(kvm_x86_set_virtual_apic_mode)(vcpu);
+ static_call(kvm_x86_set_virtual_apic_mode)(vcpu);
}

apic->base_address = apic->vcpu->arch.apic_base &
@@ -2682,9 +2682,9 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
vcpu->arch.pv_eoi.msr_val = 0;
apic_update_ppr(apic);
if (apic->apicv_active) {
- static_call_cond(kvm_x86_apicv_post_state_restore)(vcpu);
- static_call_cond(kvm_x86_hwapic_irr_update)(vcpu, -1);
- static_call_cond(kvm_x86_hwapic_isr_update)(-1);
+ static_call(kvm_x86_apicv_post_state_restore)(vcpu);
+ static_call(kvm_x86_hwapic_irr_update)(vcpu, -1);
+ static_call(kvm_x86_hwapic_isr_update)(-1);
}

vcpu->arch.apic_arb_prio = 0;
@@ -2961,9 +2961,9 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
kvm_lapic_set_reg(apic, APIC_TMCCT, 0);
kvm_apic_update_apicv(vcpu);
if (apic->apicv_active) {
- static_call_cond(kvm_x86_apicv_post_state_restore)(vcpu);
- static_call_cond(kvm_x86_hwapic_irr_update)(vcpu, apic_find_highest_irr(apic));
- static_call_cond(kvm_x86_hwapic_isr_update)(apic_find_highest_isr(apic));
+ static_call(kvm_x86_apicv_post_state_restore)(vcpu);
+ static_call(kvm_x86_hwapic_irr_update)(vcpu, apic_find_highest_irr(apic));
+ static_call(kvm_x86_hwapic_isr_update)(apic_find_highest_isr(apic));
}
kvm_make_request(KVM_REQ_EVENT, vcpu);
if (ioapic_in_kernel(vcpu->kvm))
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 612e6c70ce2e..6accb46295a3 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -552,7 +552,7 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
void kvm_pmu_deliver_pmi(struct kvm_vcpu *vcpu)
{
if (lapic_in_kernel(vcpu)) {
- static_call_cond(kvm_x86_pmu_deliver_pmi)(vcpu);
+ static_call(kvm_x86_pmu_deliver_pmi)(vcpu);
kvm_apic_local_deliver(vcpu->arch.apic, APIC_LVTPC);
}
}
@@ -632,7 +632,7 @@ void kvm_pmu_cleanup(struct kvm_vcpu *vcpu)
pmc_stop_counter(pmc);
}

- static_call_cond(kvm_x86_pmu_cleanup)(vcpu);
+ static_call(kvm_x86_pmu_cleanup)(vcpu);

bitmap_zero(pmu->pmc_in_use, X86_PMC_IDX_MAX);
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7713420abab0..fcf845fc5770 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4845,7 +4845,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
struct kvm_lapic_state *s)
{
- static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
+ static_call(kvm_x86_sync_pir_to_irr)(vcpu);

return kvm_apic_get_state(vcpu, s);
}
@@ -8948,7 +8948,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
kvm_rip_write(vcpu, ctxt->eip);
if (r && (ctxt->tf || (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)))
r = kvm_vcpu_do_singlestep(vcpu);
- static_call_cond(kvm_x86_update_emulated_instruction)(vcpu);
+ static_call(kvm_x86_update_emulated_instruction)(vcpu);
__kvm_set_rflags(vcpu, ctxt->eflags);
}

@@ -10307,7 +10307,7 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
if (irqchip_split(vcpu->kvm))
kvm_scan_ioapic_routes(vcpu, vcpu->arch.ioapic_handled_vectors);
else {
- static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
+ static_call(kvm_x86_sync_pir_to_irr)(vcpu);
if (ioapic_in_kernel(vcpu->kvm))
kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
}
@@ -10329,11 +10329,11 @@ static void vcpu_load_eoi_exitmap(struct kvm_vcpu *vcpu)
bitmap_or((ulong *)eoi_exit_bitmap,
vcpu->arch.ioapic_handled_vectors,
to_hv_synic(vcpu)->vec_bitmap, 256);
- static_call_cond(kvm_x86_load_eoi_exitmap)(vcpu, eoi_exit_bitmap);
+ static_call(kvm_x86_load_eoi_exitmap)(vcpu, eoi_exit_bitmap);
return;
}

- static_call_cond(kvm_x86_load_eoi_exitmap)(
+ static_call(kvm_x86_load_eoi_exitmap)(
vcpu, (u64 *)vcpu->arch.ioapic_handled_vectors);
}

@@ -10353,7 +10353,7 @@ void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,

void kvm_arch_guest_memory_reclaimed(struct kvm *kvm)
{
- static_call_cond(kvm_x86_guest_memory_reclaimed)(kvm);
+ static_call(kvm_x86_guest_memory_reclaimed)(kvm);
}

static void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)
@@ -10361,7 +10361,7 @@ static void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)
if (!lapic_in_kernel(vcpu))
return;

- static_call_cond(kvm_x86_set_apic_access_page_addr)(vcpu);
+ static_call(kvm_x86_set_apic_access_page_addr)(vcpu);
}

void __kvm_request_immediate_exit(struct kvm_vcpu *vcpu)
@@ -10603,7 +10603,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
* i.e. they can post interrupts even if APICv is temporarily disabled.
*/
if (kvm_lapic_enabled(vcpu))
- static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
+ static_call(kvm_x86_sync_pir_to_irr)(vcpu);

if (kvm_vcpu_exit_request(vcpu)) {
vcpu->mode = OUTSIDE_GUEST_MODE;
@@ -10654,7 +10654,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
break;

if (kvm_lapic_enabled(vcpu))
- static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
+ static_call(kvm_x86_sync_pir_to_irr)(vcpu);

if (unlikely(kvm_vcpu_exit_request(vcpu))) {
exit_fastpath = EXIT_FASTPATH_EXIT_HANDLED;
@@ -11392,7 +11392,7 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs,
*mmu_reset_needed |= kvm_read_cr3(vcpu) != sregs->cr3;
vcpu->arch.cr3 = sregs->cr3;
kvm_register_mark_dirty(vcpu, VCPU_EXREG_CR3);
- static_call_cond(kvm_x86_post_set_cr3)(vcpu, sregs->cr3);
+ static_call(kvm_x86_post_set_cr3)(vcpu, sregs->cr3);

kvm_set_cr8(vcpu, sregs->cr8);

@@ -12361,7 +12361,7 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
mutex_unlock(&kvm->slots_lock);
}
kvm_unload_vcpu_mmus(kvm);
- static_call_cond(kvm_x86_vm_destroy)(kvm);
+ static_call(kvm_x86_vm_destroy)(kvm);
kvm_free_msr_filter(srcu_dereference_check(kvm->arch.msr_filter, &kvm->srcu, 1));
kvm_pic_destroy(kvm);
kvm_ioapic_destroy(kvm);
@@ -13049,7 +13049,7 @@ bool kvm_arch_can_dequeue_async_page_present(struct kvm_vcpu *vcpu)
void kvm_arch_start_assignment(struct kvm *kvm)
{
if (atomic_inc_return(&kvm->arch.assigned_device_count) == 1)
- static_call_cond(kvm_x86_pi_start_assignment)(kvm);
+ static_call(kvm_x86_pi_start_assignment)(kvm);
}
EXPORT_SYMBOL_GPL(kvm_arch_start_assignment);

diff --git a/include/linux/static_call.h b/include/linux/static_call.h
index 58783d112df5..3b3e9e9a1582 100644
--- a/include/linux/static_call.h
+++ b/include/linux/static_call.h
@@ -22,7 +22,6 @@
* __static_call_return0;
*
* static_call(name)(args...);
- * static_call_cond(name)(args...);
* static_call_update(name, func);
* static_call_query(name);
*
@@ -174,8 +173,6 @@ extern int static_call_text_reserved(void *start, void *end);

#define DEFINE_STATIC_CALL_RET0 DEFINE_STATIC_CALL_NULL

-#define static_call_cond(name) __static_call(name)
-
#define EXPORT_STATIC_CALL(name) \
EXPORT_SYMBOL(STATIC_CALL_KEY(name)); \
EXPORT_SYMBOL(STATIC_CALL_TRAMP(name))
@@ -211,8 +208,6 @@ static inline int static_call_init(void) { return 0; }

#define DEFINE_STATIC_CALL_RET0 DEFINE_STATIC_CALL_NULL

-#define static_call_cond(name) __static_call(name)
-
static inline
void __static_call_update(struct static_call_key *key, void *tramp, void *func)
{
@@ -258,8 +253,6 @@ static inline int static_call_init(void) { return 0; }

#define DEFINE_STATIC_CALL_RET0 DEFINE_STATIC_CALL_NULL

-#define static_call_cond(name) static_call(name)
-
static inline
void __static_call_update(struct static_call_key *key, void *tramp, void *func)
{
diff --git a/security/keys/trusted-keys/trusted_core.c b/security/keys/trusted-keys/trusted_core.c
index c6fc50d67214..b7920482ebcb 100644
--- a/security/keys/trusted-keys/trusted_core.c
+++ b/security/keys/trusted-keys/trusted_core.c
@@ -388,7 +388,7 @@ static int __init init_trusted(void)

static void __exit cleanup_trusted(void)
{
- static_call_cond(trusted_key_exit)();
+ static_call(trusted_key_exit)();
}

late_initcall(init_trusted);
--
2.39.2


2023-03-10 20:34:08

by Josh Poimboeuf

Subject: [RFC][PATCH 1/5] static_call: Make NULL static calls consistent

NULL static calls have inconsistent behavior. With HAVE_STATIC_CALL=y
they're a NOP, but with HAVE_STATIC_CALL=n they're a panic.

That's guaranteed to cause subtle bugs. Make the behavior consistent by
making NULL static calls a NOP with HAVE_STATIC_CALL=n.

This is probably easier than doing the reverse (making NULL static calls
panic with HAVE_STATIC_CALL=y).
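
To illustrate (the "bar" static call and my_bar() are made up):

void my_bar(int arg);
DEFINE_STATIC_CALL_NULL(bar, my_bar);
...
static_call(bar)(1); /* HAVE_STATIC_CALL=y: NOP
		      * HAVE_STATIC_CALL=n, before: call through a NULL pointer
		      * HAVE_STATIC_CALL=n, after: NOP */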

Signed-off-by: Josh Poimboeuf <[email protected]>
---
include/linux/static_call.h | 53 +++++++------------------------------
1 file changed, 9 insertions(+), 44 deletions(-)

diff --git a/include/linux/static_call.h b/include/linux/static_call.h
index 141e6b176a1b..8b12216da0da 100644
--- a/include/linux/static_call.h
+++ b/include/linux/static_call.h
@@ -65,20 +65,12 @@
*
* Notes on NULL function pointers:
*
- * Static_call()s support NULL functions, with many of the caveats that
- * regular function pointers have.
+ * A static_call() to a NULL function pointer is a NOP.
*
- * Clearly calling a NULL function pointer is 'BAD', so too for
- * static_call()s (although when HAVE_STATIC_CALL it might not be immediately
- * fatal). A NULL static_call can be the result of:
+ * A NULL static call can be the result of:
*
* DECLARE_STATIC_CALL_NULL(my_static_call, void (*)(int));
*
- * which is equivalent to declaring a NULL function pointer with just a
- * typename:
- *
- * void (*my_func_ptr)(int arg1) = NULL;
- *
* or using static_call_update() with a NULL function. In both cases the
* HAVE_STATIC_CALL implementation will patch the trampoline with a RET
* instruction, instead of an immediate tail-call JMP. HAVE_STATIC_CALL_INLINE
@@ -92,13 +84,6 @@
*
* where the argument evaludation also depends on the pointer value.
*
- * When calling a static_call that can be NULL, use:
- *
- * static_call_cond(name)(arg1);
- *
- * which will include the required value tests to avoid NULL-pointer
- * dereferences.
- *
* To query which function is currently set to be called, use:
*
* func = static_call_query(name);
@@ -106,7 +91,7 @@
*
* DEFINE_STATIC_CALL_RET0 / __static_call_return0:
*
- * Just like how DEFINE_STATIC_CALL_NULL() / static_call_cond() optimize the
+ * Just like how DEFINE_STATIC_CALL_NULL() optimizes the
* conditional void function call, DEFINE_STATIC_CALL_RET0 /
* __static_call_return0 optimize the do nothing return 0 function.
*
@@ -279,10 +264,12 @@ extern long __static_call_return0(void);
#define EXPORT_STATIC_CALL_TRAMP_GPL(name) \
EXPORT_SYMBOL_GPL(STATIC_CALL_TRAMP(name))

-#else /* Generic implementation */
+#else /* !CONFIG_HAVE_STATIC_CALL */

static inline int static_call_init(void) { return 0; }

+static inline void __static_call_nop(void) { }
+
static inline long __static_call_return0(void)
{
return 0;
@@ -298,39 +285,17 @@ static inline long __static_call_return0(void)
__DEFINE_STATIC_CALL(name, _func, _func)

#define DEFINE_STATIC_CALL_NULL(name, _func) \
- __DEFINE_STATIC_CALL(name, _func, NULL)
+ __DEFINE_STATIC_CALL(name, _func, __static_call_nop)

#define DEFINE_STATIC_CALL_RET0(name, _func) \
__DEFINE_STATIC_CALL(name, _func, __static_call_return0)

-static inline void __static_call_nop(void) { }
-
-/*
- * This horrific hack takes care of two things:
- *
- * - it ensures the compiler will only load the function pointer ONCE,
- * which avoids a reload race.
- *
- * - it ensures the argument evaluation is unconditional, similar
- * to the HAVE_STATIC_CALL variant.
- *
- * Sadly current GCC/Clang (10 for both) do not optimize this properly
- * and will emit an indirect call for the NULL case :-(
- */
-#define __static_call_cond(name) \
-({ \
- void *func = READ_ONCE(STATIC_CALL_KEY(name).func); \
- if (!func) \
- func = &__static_call_nop; \
- (typeof(STATIC_CALL_TRAMP(name))*)func; \
-})
-
-#define static_call_cond(name) (void)__static_call_cond(name)
+#define static_call_cond(name) (void)static_call(name)

static inline
void __static_call_update(struct static_call_key *key, void *tramp, void *func)
{
- WRITE_ONCE(key->func, func);
+ WRITE_ONCE(key->func, func ? : (void *)__static_call_nop);
}

static inline int static_call_text_reserved(void *start, void *end)
--
2.39.2


2023-03-10 20:34:13

by Josh Poimboeuf

Subject: [RFC][PATCH 5/5] x86/kvm: Simplify static call handling

Static calls with a NULL function pointer are now "do nothing return 0"
functions by default. Simplify the KVM static call handling
accordingly.

Signed-off-by: Josh Poimboeuf <[email protected]>
---
arch/x86/include/asm/kvm-x86-ops.h | 85 ++++++++++++--------------
arch/x86/include/asm/kvm-x86-pmu-ops.h | 16 ++---
arch/x86/include/asm/kvm_host.h | 2 -
arch/x86/kvm/pmu.c | 7 +--
arch/x86/kvm/x86.c | 12 +---
5 files changed, 48 insertions(+), 74 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 2f0bfd910637..ea18a8abf4a1 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -1,17 +1,12 @@
/* SPDX-License-Identifier: GPL-2.0 */
-#if !defined(KVM_X86_OP) || !defined(KVM_X86_OP_OPTIONAL)
+#if !defined(KVM_X86_OP)
BUILD_BUG_ON(1)
#endif

/*
- * KVM_X86_OP() and KVM_X86_OP_OPTIONAL() are used to help generate
- * both DECLARE/DEFINE_STATIC_CALL() invocations and
- * "static_call_update()" calls.
- *
- * KVM_X86_OP_OPTIONAL() can be used for those functions that can have
- * a NULL definition. KVM_X86_OP_OPTIONAL_RET0() can be used likewise
- * to make a definition optional, but in this case the default will
- * be __static_call_return0.
+ * KVM_X86_OP() is used to help generate both DECLARE/DEFINE_STATIC_CALL()
+ * invocations and "static_call_update()" calls. Note that NULL static calls
+ * default to "do-nothing return 0" functions.
*/
KVM_X86_OP(check_processor_compatibility)
KVM_X86_OP(hardware_enable)
@@ -20,8 +15,8 @@ KVM_X86_OP(hardware_unsetup)
KVM_X86_OP(has_emulated_msr)
KVM_X86_OP(vcpu_after_set_cpuid)
KVM_X86_OP(vm_init)
-KVM_X86_OP_OPTIONAL(vm_destroy)
-KVM_X86_OP_OPTIONAL_RET0(vcpu_precreate)
+KVM_X86_OP(vm_destroy)
+KVM_X86_OP(vcpu_precreate)
KVM_X86_OP(vcpu_create)
KVM_X86_OP(vcpu_free)
KVM_X86_OP(vcpu_reset)
@@ -37,7 +32,7 @@ KVM_X86_OP(get_cpl)
KVM_X86_OP(set_segment)
KVM_X86_OP(get_cs_db_l_bits)
KVM_X86_OP(set_cr0)
-KVM_X86_OP_OPTIONAL(post_set_cr3)
+KVM_X86_OP(post_set_cr3)
KVM_X86_OP(is_valid_cr4)
KVM_X86_OP(set_cr4)
KVM_X86_OP(set_efer)
@@ -53,15 +48,15 @@ KVM_X86_OP(set_rflags)
KVM_X86_OP(get_if_flag)
KVM_X86_OP(flush_tlb_all)
KVM_X86_OP(flush_tlb_current)
-KVM_X86_OP_OPTIONAL(tlb_remote_flush)
-KVM_X86_OP_OPTIONAL(tlb_remote_flush_with_range)
+KVM_X86_OP(tlb_remote_flush)
+KVM_X86_OP(tlb_remote_flush_with_range)
KVM_X86_OP(flush_tlb_gva)
KVM_X86_OP(flush_tlb_guest)
KVM_X86_OP(vcpu_pre_run)
KVM_X86_OP(vcpu_run)
KVM_X86_OP(handle_exit)
KVM_X86_OP(skip_emulated_instruction)
-KVM_X86_OP_OPTIONAL(update_emulated_instruction)
+KVM_X86_OP(update_emulated_instruction)
KVM_X86_OP(set_interrupt_shadow)
KVM_X86_OP(get_interrupt_shadow)
KVM_X86_OP(patch_hypercall)
@@ -75,19 +70,19 @@ KVM_X86_OP(get_nmi_mask)
KVM_X86_OP(set_nmi_mask)
KVM_X86_OP(enable_nmi_window)
KVM_X86_OP(enable_irq_window)
-KVM_X86_OP_OPTIONAL(update_cr8_intercept)
+KVM_X86_OP(update_cr8_intercept)
KVM_X86_OP(refresh_apicv_exec_ctrl)
-KVM_X86_OP_OPTIONAL(hwapic_irr_update)
-KVM_X86_OP_OPTIONAL(hwapic_isr_update)
-KVM_X86_OP_OPTIONAL_RET0(guest_apic_has_interrupt)
-KVM_X86_OP_OPTIONAL(load_eoi_exitmap)
-KVM_X86_OP_OPTIONAL(set_virtual_apic_mode)
-KVM_X86_OP_OPTIONAL(set_apic_access_page_addr)
+KVM_X86_OP(hwapic_irr_update)
+KVM_X86_OP(hwapic_isr_update)
+KVM_X86_OP(guest_apic_has_interrupt)
+KVM_X86_OP(load_eoi_exitmap)
+KVM_X86_OP(set_virtual_apic_mode)
+KVM_X86_OP(set_apic_access_page_addr)
KVM_X86_OP(deliver_interrupt)
-KVM_X86_OP_OPTIONAL(sync_pir_to_irr)
-KVM_X86_OP_OPTIONAL_RET0(set_tss_addr)
-KVM_X86_OP_OPTIONAL_RET0(set_identity_map_addr)
-KVM_X86_OP_OPTIONAL_RET0(get_mt_mask)
+KVM_X86_OP(sync_pir_to_irr)
+KVM_X86_OP(set_tss_addr)
+KVM_X86_OP(set_identity_map_addr)
+KVM_X86_OP(get_mt_mask)
KVM_X86_OP(load_mmu_pgd)
KVM_X86_OP(has_wbinvd_exit)
KVM_X86_OP(get_l2_tsc_offset)
@@ -99,15 +94,15 @@ KVM_X86_OP(check_intercept)
KVM_X86_OP(handle_exit_irqoff)
KVM_X86_OP(request_immediate_exit)
KVM_X86_OP(sched_in)
-KVM_X86_OP_OPTIONAL(update_cpu_dirty_logging)
-KVM_X86_OP_OPTIONAL(vcpu_blocking)
-KVM_X86_OP_OPTIONAL(vcpu_unblocking)
-KVM_X86_OP_OPTIONAL(pi_update_irte)
-KVM_X86_OP_OPTIONAL(pi_start_assignment)
-KVM_X86_OP_OPTIONAL(apicv_post_state_restore)
-KVM_X86_OP_OPTIONAL_RET0(dy_apicv_has_pending_interrupt)
-KVM_X86_OP_OPTIONAL(set_hv_timer)
-KVM_X86_OP_OPTIONAL(cancel_hv_timer)
+KVM_X86_OP(update_cpu_dirty_logging)
+KVM_X86_OP(vcpu_blocking)
+KVM_X86_OP(vcpu_unblocking)
+KVM_X86_OP(pi_update_irte)
+KVM_X86_OP(pi_start_assignment)
+KVM_X86_OP(apicv_post_state_restore)
+KVM_X86_OP(dy_apicv_has_pending_interrupt)
+KVM_X86_OP(set_hv_timer)
+KVM_X86_OP(cancel_hv_timer)
KVM_X86_OP(setup_mce)
#ifdef CONFIG_KVM_SMM
KVM_X86_OP(smi_allowed)
@@ -115,22 +110,20 @@ KVM_X86_OP(enter_smm)
KVM_X86_OP(leave_smm)
KVM_X86_OP(enable_smi_window)
#endif
-KVM_X86_OP_OPTIONAL(mem_enc_ioctl)
-KVM_X86_OP_OPTIONAL(mem_enc_register_region)
-KVM_X86_OP_OPTIONAL(mem_enc_unregister_region)
-KVM_X86_OP_OPTIONAL(vm_copy_enc_context_from)
-KVM_X86_OP_OPTIONAL(vm_move_enc_context_from)
-KVM_X86_OP_OPTIONAL(guest_memory_reclaimed)
+KVM_X86_OP(mem_enc_ioctl)
+KVM_X86_OP(mem_enc_register_region)
+KVM_X86_OP(mem_enc_unregister_region)
+KVM_X86_OP(vm_copy_enc_context_from)
+KVM_X86_OP(vm_move_enc_context_from)
+KVM_X86_OP(guest_memory_reclaimed)
KVM_X86_OP(get_msr_feature)
KVM_X86_OP(can_emulate_instruction)
KVM_X86_OP(apic_init_signal_blocked)
-KVM_X86_OP_OPTIONAL(enable_l2_tlb_flush)
-KVM_X86_OP_OPTIONAL(migrate_timers)
+KVM_X86_OP(enable_l2_tlb_flush)
+KVM_X86_OP(migrate_timers)
KVM_X86_OP(msr_filter_changed)
KVM_X86_OP(complete_emulated_msr)
KVM_X86_OP(vcpu_deliver_sipi_vector)
-KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
+KVM_X86_OP(vcpu_get_apicv_inhibit_reasons);

#undef KVM_X86_OP
-#undef KVM_X86_OP_OPTIONAL
-#undef KVM_X86_OP_OPTIONAL_RET0
diff --git a/arch/x86/include/asm/kvm-x86-pmu-ops.h b/arch/x86/include/asm/kvm-x86-pmu-ops.h
index 6815319c4ff3..b11885d1bcd4 100644
--- a/arch/x86/include/asm/kvm-x86-pmu-ops.h
+++ b/arch/x86/include/asm/kvm-x86-pmu-ops.h
@@ -1,15 +1,12 @@
/* SPDX-License-Identifier: GPL-2.0 */
-#if !defined(KVM_X86_PMU_OP) || !defined(KVM_X86_PMU_OP_OPTIONAL)
+#if !defined(KVM_X86_PMU_OP)
BUILD_BUG_ON(1)
#endif

/*
- * KVM_X86_PMU_OP() and KVM_X86_PMU_OP_OPTIONAL() are used to help generate
- * both DECLARE/DEFINE_STATIC_CALL() invocations and
- * "static_call_update()" calls.
- *
- * KVM_X86_PMU_OP_OPTIONAL() can be used for those functions that can have
- * a NULL definition.
+ * KVM_X86_PMU_OP() is used to help generate both DECLARE/DEFINE_STATIC_CALL()
+ * invocations and "static_call_update()" calls. Note that NULL static calls
+ * default to "do-nothing return 0" functions.
*/
KVM_X86_PMU_OP(hw_event_available)
KVM_X86_PMU_OP(pmc_is_enabled)
@@ -23,8 +20,7 @@ KVM_X86_PMU_OP(set_msr)
KVM_X86_PMU_OP(refresh)
KVM_X86_PMU_OP(init)
KVM_X86_PMU_OP(reset)
-KVM_X86_PMU_OP_OPTIONAL(deliver_pmi)
-KVM_X86_PMU_OP_OPTIONAL(cleanup)
+KVM_X86_PMU_OP(deliver_pmi)
+KVM_X86_PMU_OP(cleanup)

#undef KVM_X86_PMU_OP
-#undef KVM_X86_PMU_OP_OPTIONAL
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 1dfba499d3e5..612531e1c478 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1789,8 +1789,6 @@ extern struct kvm_x86_ops kvm_x86_ops;

#define KVM_X86_OP(func) \
DECLARE_STATIC_CALL(kvm_x86_##func, *(((struct kvm_x86_ops *)0)->func));
-#define KVM_X86_OP_OPTIONAL KVM_X86_OP
-#define KVM_X86_OP_OPTIONAL_RET0 KVM_X86_OP
#include <asm/kvm-x86-ops.h>

int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops);
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 6accb46295a3..5f7f860c5f17 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -77,20 +77,15 @@ static struct kvm_pmu_ops kvm_pmu_ops __read_mostly;
#define KVM_X86_PMU_OP(func) \
DEFINE_STATIC_CALL_NULL(kvm_x86_pmu_##func, \
*(((struct kvm_pmu_ops *)0)->func));
-#define KVM_X86_PMU_OP_OPTIONAL KVM_X86_PMU_OP
#include <asm/kvm-x86-pmu-ops.h>

void kvm_pmu_ops_update(const struct kvm_pmu_ops *pmu_ops)
{
memcpy(&kvm_pmu_ops, pmu_ops, sizeof(kvm_pmu_ops));

-#define __KVM_X86_PMU_OP(func) \
- static_call_update(kvm_x86_pmu_##func, kvm_pmu_ops.func);
#define KVM_X86_PMU_OP(func) \
- WARN_ON(!kvm_pmu_ops.func); __KVM_X86_PMU_OP(func)
-#define KVM_X86_PMU_OP_OPTIONAL __KVM_X86_PMU_OP
+ static_call_update(kvm_x86_pmu_##func, kvm_pmu_ops.func);
#include <asm/kvm-x86-pmu-ops.h>
-#undef __KVM_X86_PMU_OP
}

static inline bool pmc_is_enabled(struct kvm_pmc *pmc)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fcf845fc5770..a9521e5b2435 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -135,9 +135,8 @@ struct kvm_x86_ops kvm_x86_ops __read_mostly;
#define KVM_X86_OP(func) \
DEFINE_STATIC_CALL_NULL(kvm_x86_##func, \
*(((struct kvm_x86_ops *)0)->func));
-#define KVM_X86_OP_OPTIONAL KVM_X86_OP
-#define KVM_X86_OP_OPTIONAL_RET0 KVM_X86_OP
#include <asm/kvm-x86-ops.h>
+
EXPORT_STATIC_CALL_GPL(kvm_x86_get_cs_db_l_bits);
EXPORT_STATIC_CALL_GPL(kvm_x86_cache_reg);

@@ -9316,16 +9315,9 @@ static inline void kvm_ops_update(struct kvm_x86_init_ops *ops)
{
memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));

-#define __KVM_X86_OP(func) \
- static_call_update(kvm_x86_##func, kvm_x86_ops.func);
#define KVM_X86_OP(func) \
- WARN_ON(!kvm_x86_ops.func); __KVM_X86_OP(func)
-#define KVM_X86_OP_OPTIONAL __KVM_X86_OP
-#define KVM_X86_OP_OPTIONAL_RET0(func) \
- static_call_update(kvm_x86_##func, (void *)kvm_x86_ops.func ? : \
- (void *)__static_call_return0);
+ static_call_update(kvm_x86_##func, kvm_x86_ops.func);
#include <asm/kvm-x86-ops.h>
-#undef __KVM_X86_OP

kvm_pmu_ops_update(ops->pmu_ops);
}
--
2.39.2


2023-03-10 20:34:23

by Josh Poimboeuf

Subject: [RFC][PATCH 4/5] static_call: Remove DEFINE_STATIC_CALL_RET0() and its uses

DEFINE_STATIC_CALL_RET0() is now identical to DEFINE_STATIC_CALL_NULL().
All static calls to NULL function pointers are actually calls to "do
nothing return 0" functions.

Replace all DEFINE_STATIC_CALL_RET0() usages with
DEFINE_STATIC_CALL_NULL() and remove it.

Signed-off-by: Josh Poimboeuf <[email protected]>
---
arch/powerpc/kernel/irq.c | 2 +-
arch/x86/events/amd/core.c | 2 +-
arch/x86/events/core.c | 2 +-
include/linux/static_call.h | 7 -------
kernel/events/core.c | 8 ++++----
kernel/sched/core.c | 10 +++++-----
6 files changed, 12 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index c9535f2760b5..320e1a41abd6 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -220,7 +220,7 @@ static __always_inline void call_do_softirq(const void *sp)
}
#endif

-DEFINE_STATIC_CALL_RET0(ppc_get_irq, *ppc_md.get_irq);
+DEFINE_STATIC_CALL_NULL(ppc_get_irq, *ppc_md.get_irq);

static void __do_irq(struct pt_regs *regs, unsigned long oldsp)
{
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index 8c45b198b62f..3c545595bfeb 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -330,7 +330,7 @@ static inline bool amd_is_pair_event_code(struct hw_perf_event *hwc)
}
}

-DEFINE_STATIC_CALL_RET0(amd_pmu_branch_hw_config, *x86_pmu.hw_config);
+DEFINE_STATIC_CALL_NULL(amd_pmu_branch_hw_config, *x86_pmu.hw_config);

static int amd_core_hw_config(struct perf_event *event)
{
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index c94537501091..3f662c16aa08 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -96,7 +96,7 @@ DEFINE_STATIC_CALL_NULL(x86_pmu_filter, *x86_pmu.filter);
* This one is magic, it will get called even when PMU init fails (because
* there is no PMU), in which case it should simply return NULL.
*/
-DEFINE_STATIC_CALL_RET0(x86_pmu_guest_get_msrs, *x86_pmu.guest_get_msrs);
+DEFINE_STATIC_CALL_NULL(x86_pmu_guest_get_msrs, *x86_pmu.guest_get_msrs);

u64 __read_mostly hw_cache_event_ids
[PERF_COUNT_HW_CACHE_MAX]
diff --git a/include/linux/static_call.h b/include/linux/static_call.h
index 3b3e9e9a1582..6b8d7b687c13 100644
--- a/include/linux/static_call.h
+++ b/include/linux/static_call.h
@@ -17,7 +17,6 @@
* DECLARE_STATIC_CALL(name, func);
* DEFINE_STATIC_CALL(name, func);
* DEFINE_STATIC_CALL_NULL(name, typename);
- * DEFINE_STATIC_CALL_RET0(name, typename);
*
* __static_call_return0;
*
@@ -171,8 +170,6 @@ extern int static_call_text_reserved(void *start, void *end);
}; \
ARCH_DEFINE_STATIC_CALL_RET0_TRAMP(name)

-#define DEFINE_STATIC_CALL_RET0 DEFINE_STATIC_CALL_NULL
-
#define EXPORT_STATIC_CALL(name) \
EXPORT_SYMBOL(STATIC_CALL_KEY(name)); \
EXPORT_SYMBOL(STATIC_CALL_TRAMP(name))
@@ -206,8 +203,6 @@ static inline int static_call_init(void) { return 0; }
}; \
ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name)

-#define DEFINE_STATIC_CALL_RET0 DEFINE_STATIC_CALL_NULL
-
static inline
void __static_call_update(struct static_call_key *key, void *tramp, void *func)
{
@@ -251,8 +246,6 @@ static inline int static_call_init(void) { return 0; }
#define DEFINE_STATIC_CALL_NULL(name, _func) \
__DEFINE_STATIC_CALL(name, _func, __static_call_return0)

-#define DEFINE_STATIC_CALL_RET0 DEFINE_STATIC_CALL_NULL
-
static inline
void __static_call_update(struct static_call_key *key, void *tramp, void *func)
{
diff --git a/kernel/events/core.c b/kernel/events/core.c
index f79fd8b87f75..42017f696e2e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6757,9 +6757,9 @@ static void perf_pending_task(struct callback_head *head)
#ifdef CONFIG_GUEST_PERF_EVENTS
struct perf_guest_info_callbacks __rcu *perf_guest_cbs;

-DEFINE_STATIC_CALL_RET0(__perf_guest_state, *perf_guest_cbs->state);
-DEFINE_STATIC_CALL_RET0(__perf_guest_get_ip, *perf_guest_cbs->get_ip);
-DEFINE_STATIC_CALL_RET0(__perf_guest_handle_intel_pt_intr, *perf_guest_cbs->handle_intel_pt_intr);
+DEFINE_STATIC_CALL_NULL(__perf_guest_state, *perf_guest_cbs->state);
+DEFINE_STATIC_CALL_NULL(__perf_guest_get_ip, *perf_guest_cbs->get_ip);
+DEFINE_STATIC_CALL_NULL(__perf_guest_handle_intel_pt_intr, *perf_guest_cbs->handle_intel_pt_intr);

void perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *cbs)
{
@@ -13766,4 +13766,4 @@ struct cgroup_subsys perf_event_cgrp_subsys = {
};
#endif /* CONFIG_CGROUP_PERF */

-DEFINE_STATIC_CALL_RET0(perf_snapshot_branch_stack, perf_snapshot_branch_stack_t);
+DEFINE_STATIC_CALL_NULL(perf_snapshot_branch_stack, perf_snapshot_branch_stack_t);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index af017e038b48..cf50562277b7 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8492,12 +8492,12 @@ EXPORT_SYMBOL(__cond_resched);
#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
#define cond_resched_dynamic_enabled __cond_resched
#define cond_resched_dynamic_disabled ((void *)&__static_call_return0)
-DEFINE_STATIC_CALL_RET0(cond_resched, __cond_resched);
+DEFINE_STATIC_CALL_NULL(cond_resched, __cond_resched);
EXPORT_STATIC_CALL_TRAMP(cond_resched);

#define might_resched_dynamic_enabled __cond_resched
#define might_resched_dynamic_disabled ((void *)&__static_call_return0)
-DEFINE_STATIC_CALL_RET0(might_resched, __cond_resched);
+DEFINE_STATIC_CALL_NULL(might_resched, __cond_resched);
EXPORT_STATIC_CALL_TRAMP(might_resched);
#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
static DEFINE_STATIC_KEY_FALSE(sk_dynamic_cond_resched);
@@ -8598,7 +8598,7 @@ EXPORT_SYMBOL(__cond_resched_rwlock_write);
*
* NONE:
* cond_resched <- __cond_resched
- * might_resched <- RET0
+ * might_resched <- NULL
* preempt_schedule <- NOP
* preempt_schedule_notrace <- NOP
* irqentry_exit_cond_resched <- NOP
@@ -8611,8 +8611,8 @@ EXPORT_SYMBOL(__cond_resched_rwlock_write);
* irqentry_exit_cond_resched <- NOP
*
* FULL:
- * cond_resched <- RET0
- * might_resched <- RET0
+ * cond_resched <- NULL
+ * might_resched <- NULL
* preempt_schedule <- preempt_schedule
* preempt_schedule_notrace <- preempt_schedule_notrace
* irqentry_exit_cond_resched <- irqentry_exit_cond_resched
--
2.39.2


2023-03-10 21:01:11

by Peter Zijlstra

Subject: Re: [RFC][PATCH 1/5] static_call: Make NULL static calls consistent

On Fri, Mar 10, 2023 at 12:31:13PM -0800, Josh Poimboeuf wrote:

> -/*
> - * This horrific hack takes care of two things:
> - *
> - * - it ensures the compiler will only load the function pointer ONCE,
> - * which avoids a reload race.
> - *
> - * - it ensures the argument evaluation is unconditional, similar
> - * to the HAVE_STATIC_CALL variant.
> - *
> - * Sadly current GCC/Clang (10 for both) do not optimize this properly
> - * and will emit an indirect call for the NULL case :-(
> - */
> -#define __static_call_cond(name) \
> -({ \
> - void *func = READ_ONCE(STATIC_CALL_KEY(name).func); \
> - if (!func) \
> - func = &__static_call_nop; \
> - (typeof(STATIC_CALL_TRAMP(name))*)func; \
> -})

So a sufficiently clever compiler can optimize the above to avoid the
actual indirect call (and resulting CFI violation, see below), because
__static_call_nop() is inline and hence visible as an empty stub
function. Currently none of the compilers are that clever :/

> -#define static_call_cond(name) (void)__static_call_cond(name)
> +#define static_call_cond(name) (void)static_call(name)
>
> static inline
> void __static_call_update(struct static_call_key *key, void *tramp, void *func)
> {
> - WRITE_ONCE(key->func, func);
> + WRITE_ONCE(key->func, func ? : (void *)__static_call_nop);
> }

This will break ARM64 I think: it doesn't have HAVE_STATIC_CALL but
does have CLANG_CFI, which means the above will end up being a runtime
indirect call to a function with a non-matching signature.
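
To illustrate the kind of mismatch kCFI rejects, here's a userspace
sketch (names made up; built with clang's -fsanitize=kcfi):

static void nop(void) { }

int main(void)
{
	int (*fp)(int) = (int (*)(int))nop;	/* void(void) behind an int(int) pointer */

	return fp(1);	/* the type-hash check at the indirect call traps here */
}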

Now, I suppose we don't actually have this happen in current code by the
simple expedient of not actually having any static_call_cond() usage
outside of arch code.

(/me git-grep's some and *arrrggh* trusted-keys)

I really don't think we can do this though; we must not promote CFI
violations.

2023-03-10 21:07:40

by Sean Christopherson

Subject: Re: [RFC][PATCH 5/5] x86/kvm: Simplify static call handling

"KVM: x86:" please, "x86/kvm" is for guest-side changes.

On Fri, Mar 10, 2023, Josh Poimboeuf wrote:
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 1dfba499d3e5..612531e1c478 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1789,8 +1789,6 @@ extern struct kvm_x86_ops kvm_x86_ops;
>
> #define KVM_X86_OP(func) \
> DECLARE_STATIC_CALL(kvm_x86_##func, *(((struct kvm_x86_ops *)0)->func));
> -#define KVM_X86_OP_OPTIONAL KVM_X86_OP
> -#define KVM_X86_OP_OPTIONAL_RET0 KVM_X86_OP
> #include <asm/kvm-x86-ops.h>
>
> int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops);
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index 6accb46295a3..5f7f860c5f17 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -77,20 +77,15 @@ static struct kvm_pmu_ops kvm_pmu_ops __read_mostly;
> #define KVM_X86_PMU_OP(func) \
> DEFINE_STATIC_CALL_NULL(kvm_x86_pmu_##func, \
> *(((struct kvm_pmu_ops *)0)->func));
> -#define KVM_X86_PMU_OP_OPTIONAL KVM_X86_PMU_OP
> #include <asm/kvm-x86-pmu-ops.h>
>
> void kvm_pmu_ops_update(const struct kvm_pmu_ops *pmu_ops)
> {
> memcpy(&kvm_pmu_ops, pmu_ops, sizeof(kvm_pmu_ops));
>
> -#define __KVM_X86_PMU_OP(func) \
> - static_call_update(kvm_x86_pmu_##func, kvm_pmu_ops.func);
> #define KVM_X86_PMU_OP(func) \
> - WARN_ON(!kvm_pmu_ops.func); __KVM_X86_PMU_OP(func)

I would much prefer to keep KVM mostly as-is, specifically so that we don't lose
this WARN_ON() that guards against a vendor module neglecting to implement a
mandatory callback. This effectively gives KVM "full" protection against consuming
an unexpectedly-NULL function pointer.

It's not strictly necessary, but I'm inclined to keep KVM_X86_OP_OPTIONAL_RET0
around for documentation purposes. And if I could figure out the compiler magic
to make it work, I'd add a WARN/BUILD_BUG when sizeof() the return type of a
RET0 function is larger than sizeof(int). KVM encountered the fun that occurs
on a u64 return on 32-bit kernels due to "xor eax, eax" leaving garbage in edx.

2023-03-10 21:10:02

by Steven Rostedt

Subject: Re: [RFC][PATCH 0/5] Improve static call NULL handling

On Fri, 10 Mar 2023 12:31:12 -0800
Josh Poimboeuf <[email protected]> wrote:

> static_call_update(foo, STATIC_CALL_NOP); // do nothing and return 0
> static_call_update(foo, STATIC_CALL_BUG); // panic
> static_call_update(foo, NULL); // ???
>
> The default behavior for NULL could be a key-specific policy, stored as
> a flag in the static_call_key struct.

Could we just get rid of the ambiguity and make

static_call_update(foo, NULL);

trigger a WARN_ON() instead, and always do a NOP?

The issue I have with allowing NULL is that it's not easy to know from
the call site what it does.

-- Steve

2023-03-10 21:14:05

by Steven Rostedt

Subject: Re: [RFC][PATCH 5/5] x86/kvm: Simplify static call handling

On Fri, 10 Mar 2023 13:07:27 -0800
Sean Christopherson <[email protected]> wrote:

> "KVM: x86:" please, "x86/kvm" is for guest-side changes.
>
> On Fri, Mar 10, 2023, Josh Poimboeuf wrote:
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 1dfba499d3e5..612531e1c478 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1789,8 +1789,6 @@ extern struct kvm_x86_ops kvm_x86_ops;
> >
> > #define KVM_X86_OP(func) \
> > DECLARE_STATIC_CALL(kvm_x86_##func, *(((struct kvm_x86_ops *)0)->func));
> > -#define KVM_X86_OP_OPTIONAL KVM_X86_OP
> > -#define KVM_X86_OP_OPTIONAL_RET0 KVM_X86_OP
> > #include <asm/kvm-x86-ops.h>
> >
> > int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops);
> > diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> > index 6accb46295a3..5f7f860c5f17 100644
> > --- a/arch/x86/kvm/pmu.c
> > +++ b/arch/x86/kvm/pmu.c
> > @@ -77,20 +77,15 @@ static struct kvm_pmu_ops kvm_pmu_ops __read_mostly;
> > #define KVM_X86_PMU_OP(func) \
> > DEFINE_STATIC_CALL_NULL(kvm_x86_pmu_##func, \
> > *(((struct kvm_pmu_ops *)0)->func));
> > -#define KVM_X86_PMU_OP_OPTIONAL KVM_X86_PMU_OP
> > #include <asm/kvm-x86-pmu-ops.h>
> >
> > void kvm_pmu_ops_update(const struct kvm_pmu_ops *pmu_ops)
> > {
> > memcpy(&kvm_pmu_ops, pmu_ops, sizeof(kvm_pmu_ops));
> >
> > -#define __KVM_X86_PMU_OP(func) \
> > - static_call_update(kvm_x86_pmu_##func, kvm_pmu_ops.func);
> > #define KVM_X86_PMU_OP(func) \
> > - WARN_ON(!kvm_pmu_ops.func); __KVM_X86_PMU_OP(func)
>
> I would much prefer to keep KVM mostly as-is, specifically so that we don't lose
> this WARN_ON() that guards against a vendor module neglecting to implement a
> mandatory callback. This effectively gives KVM "full" protection against consuming
> an unexpectedly-NULL function pointer.

As in my reply to patch 0/5, I suggested that static_call_update(NULL)
would trigger a WARN_ON() always. Then this could be cleaned up and still
get that warning.

-- Steve

2023-03-10 21:30:41

by Sean Christopherson

Subject: Re: [RFC][PATCH 5/5] x86/kvm: Simplify static call handling

On Fri, Mar 10, 2023, Steven Rostedt wrote:
> On Fri, 10 Mar 2023 13:07:27 -0800
> Sean Christopherson <[email protected]> wrote:
>
> > "KVM: x86:" please, "x86/kvm" is for guest-side changes.
> >
> > On Fri, Mar 10, 2023, Josh Poimboeuf wrote:
> > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > > index 1dfba499d3e5..612531e1c478 100644
> > > --- a/arch/x86/include/asm/kvm_host.h
> > > +++ b/arch/x86/include/asm/kvm_host.h
> > > @@ -1789,8 +1789,6 @@ extern struct kvm_x86_ops kvm_x86_ops;
> > >
> > > #define KVM_X86_OP(func) \
> > > DECLARE_STATIC_CALL(kvm_x86_##func, *(((struct kvm_x86_ops *)0)->func));
> > > -#define KVM_X86_OP_OPTIONAL KVM_X86_OP
> > > -#define KVM_X86_OP_OPTIONAL_RET0 KVM_X86_OP
> > > #include <asm/kvm-x86-ops.h>
> > >
> > > int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops);
> > > diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> > > index 6accb46295a3..5f7f860c5f17 100644
> > > --- a/arch/x86/kvm/pmu.c
> > > +++ b/arch/x86/kvm/pmu.c
> > > @@ -77,20 +77,15 @@ static struct kvm_pmu_ops kvm_pmu_ops __read_mostly;
> > > #define KVM_X86_PMU_OP(func) \
> > > DEFINE_STATIC_CALL_NULL(kvm_x86_pmu_##func, \
> > > *(((struct kvm_pmu_ops *)0)->func));
> > > -#define KVM_X86_PMU_OP_OPTIONAL KVM_X86_PMU_OP
> > > #include <asm/kvm-x86-pmu-ops.h>
> > >
> > > void kvm_pmu_ops_update(const struct kvm_pmu_ops *pmu_ops)
> > > {
> > > memcpy(&kvm_pmu_ops, pmu_ops, sizeof(kvm_pmu_ops));
> > >
> > > -#define __KVM_X86_PMU_OP(func) \
> > > - static_call_update(kvm_x86_pmu_##func, kvm_pmu_ops.func);
> > > #define KVM_X86_PMU_OP(func) \
> > > - WARN_ON(!kvm_pmu_ops.func); __KVM_X86_PMU_OP(func)
> >
> > I would much prefer to keep KVM mostly as-is, specifically so that we don't lose
> > this WARN_ON() that guards against a vendor module neglecting to implement a
> > mandatory callback. This effectively gives KVM "full" protection against consuming
> > an unexpectedly-NULL function pointer.
>
> As in my reply to patch 0/5, I suggested that static_call_update(NULL)
> would trigger a WARN_ON() always. Then this could be cleaned up and still
> get that warning.

I don't think that provides the functionality KVM wants/needs. KVM only disallows
NULL updates for select mandatory hooks. For optional hooks, KVM needs to support
NULL updates in some capacity to handle the scenario where a vendor module is
reloaded with different settings, e.g. loading kvm_intel with enable_apicv=0 after
running with enable_apicv=1.

WARN_ON() a static_call_update(..., NULL) should be ok, but I believe KVM would
still need/want macro shenanigans, e.g.

#define __KVM_X86_OP(func) \
	static_call_update(kvm_x86_##func, \
			   kvm_x86_ops.func ? kvm_x86_ops.func : STATIC_CALL_NOP);
#define KVM_X86_OP(func) \
	WARN_ON(!kvm_x86_ops.func); __KVM_X86_OP(func)
#define KVM_X86_OP_OPTIONAL __KVM_X86_OP
#define KVM_X86_OP_OPTIONAL_RET0(func) __KVM_X86_OP(func)
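
For context, the mandatory/optional split comes from how
asm/kvm-x86-ops.h is written, roughly like below (vcpu_create is a real
mandatory hook; the two optional hook names are made up for
illustration):

KVM_X86_OP(vcpu_create)			/* mandatory: WARN_ON() if the vendor module left it NULL */
KVM_X86_OP_OPTIONAL(hook_foo)		/* optional: fall back to STATIC_CALL_NOP */
KVM_X86_OP_OPTIONAL_RET0(hook_bar)	/* optional: nop that returns 0 */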

2023-03-10 22:24:16

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC][PATCH 5/5] x86/kvm: Simplify static call handling

On Fri, Mar 10, 2023 at 09:29:36PM +0000, Sean Christopherson wrote:
> > > I would much prefer to keep KVM mostly as-is, specifically so that we don't lose
> > > this WARN_ON() that guards against a vendor module neglecting to implement a
> > > mandatory callback. This effectively gives KVM "full" protection against consuming
> > > an unexpectedly-NULL function pointer.

Ok, sure.

> > As in my reply to patch 0/5, I suggested that static_call_update(NULL)
> > would trigger a WARN_ON() always. Then this could be cleaned up and still
> > get that warning.
>
> I don't think that provides the functionality KVM wants/needs. KVM only disallows
> NULL updates for select mandatory hooks. For optional hooks, KVM needs to support
> NULL updates in some capacity to handle the scenario where a vendor module is
> reloaded with different settings, e.g. loading kvm_intel with enable_apicv=0 after
> running with enable_apicv=1.
>
> WARN_ON() a static_call_update(..., NULL) should be ok, but I believe KVM would
> still need/want macro shenanigans, e.g.
>
> #define __KVM_X86_OP(func) \
> 	static_call_update(kvm_x86_##func, \
> 			   kvm_x86_ops.func ? kvm_x86_ops.func : STATIC_CALL_NOP);
> #define KVM_X86_OP(func) \
> 	WARN_ON(!kvm_x86_ops.func); __KVM_X86_OP(func)
> #define KVM_X86_OP_OPTIONAL __KVM_X86_OP
> #define KVM_X86_OP_OPTIONAL_RET0(func) __KVM_X86_OP(func)

Yeah, something like that might be ok, if we just refuse NULL as an
option.

If only Peter hadn't ruined my Friday with the CFI talk.

--
Josh

2023-03-11 01:20:14

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/5] static_call: Make NULL static calls consistent

On Fri, Mar 10, 2023 at 09:59:26PM +0100, Peter Zijlstra wrote:
> > -#define __static_call_cond(name) \
> > -({ \
> > - void *func = READ_ONCE(STATIC_CALL_KEY(name).func); \
> > - if (!func) \
> > - func = &__static_call_nop; \
> > - (typeof(STATIC_CALL_TRAMP(name))*)func; \
> > -})
>
> So a sufficiently clever compiler can optimize the above to avoid the
> actual indirect call (and resulting CFI violation, see below), because
> __static_call_nop() is inline and hence visible as an empty stub
> function. Currently none of the compilers are that clever :/

I won't hold my breath waiting for theoretical optimizations.

> This will break ARM64 I think, they don't HAVE_STATIC_CALL but do have
> CLANG_CFI, which means the above will end up being a runtime indirect
> call to a non-matching signature function.
>
> Now, I suppose we don't actually have this happen in current code by the
> simple expedient of not actually having any static_call_cond() usage
> outside of arch code.
>
> (/me git-grep's some and *arrrggh* trusted-keys)
>
> I really don't think we can do this though, must not promote CFI
> violations.

Ouch, so static_call_cond() and __static_call_return0() are broken today
on CFI_CLANG + arm64.

Some ideas:

1) Implement HAVE_STATIC_CALL for arm64. IIRC, this wasn't worth the
effort due to restricted branch ranges and CFI fun.

2) Create yet another "tier" of static call implementations, for
arches which can have the unfortunate combo of CFI_CLANG +
!HAVE_STATIC_CALL. CONFIG_ALMOST_DONT_HAVE_STATIC_CALL?

The arch can define ARCH_DEFINE_STATIC_CALL_NOP() which uses inline
asm to create a CFI-compliant NOP/BUG/whatever version of the
function (insert lots of hand-waving). Is the kcfi hash available
to inline asm at build time?

3) Use a jump label to bypass the static call instead of calling
__static_call_nop(). NOTE: I couldn't figure out how to do this
without angering the compiler, unless we want to change
static_call() back to the old-school interface:

static_call(foo, args...)

Is it Friday yet?

--
Josh

2023-03-12 15:17:55

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/5] static_call: Make NULL static calls consistent

On Fri, Mar 10, 2023 at 05:20:04PM -0800, Josh Poimboeuf wrote:
> On Fri, Mar 10, 2023 at 09:59:26PM +0100, Peter Zijlstra wrote:
> > > -#define __static_call_cond(name) \
> > > -({ \
> > > - void *func = READ_ONCE(STATIC_CALL_KEY(name).func); \
> > > - if (!func) \
> > > - func = &__static_call_nop; \
> > > - (typeof(STATIC_CALL_TRAMP(name))*)func; \
> > > -})
> >
> > So a sufficiently clever compiler can optimize the above to avoid the
> > actual indirect call (and resulting CFI violation, see below), because
> > __static_call_nop() is inline and hence visible as an empty stub
> > function. Currently none of the compilers are that clever :/
>
> I won't hold my breath waiting for theoretical optimizations.

Well, I'm thinking the clang folks might like this option to unbreak the
arm64 build. At least here they have a fighting chance of actually doing
the right thing.

Let me Cc some actual compiler folks.

> > This will break ARM64 I think, they don't HAVE_STATIC_CALL but do have
> > CLANG_CFI, which means the above will end up being a runtime indirect
> > call to a non-matching signature function.
> >
> > Now, I suppose we don't actually have this happen in current code by the
> > simple expedient of not actually having any static_call_cond() usage
> > outside of arch code.
> >
> > (/me git-grep's some and *arrrggh* trusted-keys)
> >
> > I really don't think we can do this though, must not promote CFI
> > violations.
>
> Ouch, so static_call_cond() and __static_call_return0() are broken today
> on CFI_CLANG + arm64.

Yes. Now, __static_call_return0() should really only happen with
HAVE_STATIC_CALL, since its definition is only available in that case.

And static_call_cond() as implemented today *might* just be fixable by
the compiler.

> Some ideas:
>
> 1) Implement HAVE_STATIC_CALL for arm64. IIRC, this wasn't worth the
> effort due to restricted branch ranges and CFI fun.

The powerpc32 thing did it, iirc a similar approach could work for arm.
But this would basically mandate HAVE_STATIC_CALL for CFI_CLANG.

>
> 2) Create yet another "tier" of static call implementations, for
> arches which can have the unfortunate combo of CFI_CLANG +
> !HAVE_STATIC_CALL. CONFIG_ALMOST_DONT_HAVE_STATIC_CALL?
>
> The arch can define ARCH_DEFINE_STATIC_CALL_NOP() which uses inline
> asm to create a CFI-compliant NOP/BUG/whatever version of the
> function (insert lots of hand-waving). Is the kcfi hash available
> to inline asm at build time?

Yes, clang creates a magic symbol for everything it sees a declaration
for. These symbols can be referenced from asm; linking will make it all
work.

And yes, C sucks, you can't actually create a function definition from a
type :/ Otherwise this could be trivially fixable.
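
To spell the limitation out: a function typedef (or typeof()) lets you
*declare* a function of a given type, but there is no syntax to attach
a body to such a declaration:

typedef int func_t(unsigned long arg);

extern func_t my_nop;		/* declaring from a type: fine */

func_t my_nop { return 0; }	/* defining from a type: not valid C */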

> 3) Use a jump label to bypass the static call instead of calling
> __static_call_nop(). NOTE: I couldn't figure out how to do this
> without angering the compiler, unless we want to change
> static_call() back to the old-school interface:
>
> static_call(foo, args...)
>
> Is it Friday yet?

Always right :-)

And yes, the whole premise of all this is that we let the compiler
generate the actual CALL and then have objtool scan the output and
report their locations. There is no way to intercept this at the
compiler level.
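
For reference, the per-call-site record objtool emits into the
.static_call_sites section, as already defined in
include/linux/static_call_types.h:

struct static_call_site {
	s32 addr;	/* relative offset of the call instruction */
	s32 key;	/* relative offset of the static_call_key */
};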

2023-03-13 15:09:03

by Sean Christopherson

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/5] static_call: Make NULL static calls consistent

On Sun, Mar 12, 2023, Peter Zijlstra wrote:
> On Fri, Mar 10, 2023 at 05:20:04PM -0800, Josh Poimboeuf wrote:
> > On Fri, Mar 10, 2023 at 09:59:26PM +0100, Peter Zijlstra wrote:
> > > > -#define __static_call_cond(name) \
> > > > -({ \
> > > > - void *func = READ_ONCE(STATIC_CALL_KEY(name).func); \
> > > > - if (!func) \
> > > > - func = &__static_call_nop; \
> > > > - (typeof(STATIC_CALL_TRAMP(name))*)func; \
> > > > -})
> > >
> > > So a sufficiently clever compiler can optimize the above to avoid the
> > > actual indirect call (and resulting CFI violation, see below), because
> > > __static_call_nop() is inline and hence visible as an empty stub
> > > function. Currently none of the compilers are that clever :/
> >
> > I won't hold my breath waiting for theoretical optimizations.
>
> Well, I'm thinking the clang folks might like this option to unbreak the
> arm64 build. At least here they have a fighting chance of actually doing
> the right thing.
>
> Let me Cc some actual compiler folks.

+Will and Kees too for the arm64+CFI mess.

https://lore.kernel.org/all/[email protected]

> > > This will break ARM64 I think, they don't HAVE_STATIC_CALL but do have
> > > CLANG_CFI, which means the above will end up being a runtime indirect
> > > call to a non-matching signature function.
> > >
> > > Now, I suppose we don't actually have this happen in current code by the
> > > simple expedient of not actually having any static_call_cond() usage
> > > outside of arch code.
> > >
> > > (/me git-grep's some and *arrrggh* trusted-keys)
> > >
> > > I really don't think we can do this though, must not promote CFI
> > > violations.
> >
> > Ouch, so static_call_cond() and __static_call_return0() are broken today
> > on CFI_CLANG + arm64.
>
> Yes. Now, __static_call_return0() should really only happen with
> HAVE_STATIC_CALL, since its definition is only available in that case.
>
> And static_call_cond() as implemented today *might* just be fixable by
> the compiler.
>
> > Some ideas:
> >
> > 1) Implement HAVE_STATIC_CALL for arm64. IIRC, this wasn't worth the
> > effort due to restricted branch ranges and CFI fun.
>
> The powerpc32 thing did it, iirc a similar approach could work for arm.
> But this would basically mandate HAVE_STATIC_CALL for CFI_CLANG.
>
> >
> > 2) Create yet another "tier" of static call implementations, for
> > arches which can have the unfortunate combo of CFI_CLANG +
> > !HAVE_STATIC_CALL. CONFIG_ALMOST_DONT_HAVE_STATIC_CALL?
> >
> > The arch can define ARCH_DEFINE_STATIC_CALL_NOP() which uses inline
> > asm to create a CFI-compliant NOP/BUG/whatever version of the
> > function (insert lots of hand-waving). Is the kcfi hash available
> > to inline asm at build time?
>
> Yes, clang creates a magic symbol for everything it sees a declaration
> for. These symbols can be referenced from asm; linking will make it all
> work.
>
> And yes, C sucks, you can't actually create a function definition from a
> type :/ Otherwise this could be trivially fixable.
>
> > 3) Use a jump label to bypass the static call instead of calling
> > __static_call_nop(). NOTE: I couldn't figure out how to do this
> > without angering the compiler, unless we want to change
> > static_call() back to the old-school interface:
> >
> > static_call(foo, args...)
> >
> > Is it Friday yet?
>
> Always right :-)
>
> And yes, the whole premise of all this is that we let the compiler
> generate the actual CALL and then have objtool scan the output and
> report their locations. There is no way to intercept this at the
> compiler level.

2023-03-13 17:49:57

by Sami Tolvanen

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/5] static_call: Make NULL static calls consistent

On Sun, Mar 12, 2023 at 8:17 AM Peter Zijlstra <[email protected]> wrote:
>
> On Fri, Mar 10, 2023 at 05:20:04PM -0800, Josh Poimboeuf wrote:
> > 2) Create yet another "tier" of static call implementations, for
> > arches which can have the unfortunate combo of CFI_CLANG +
> > !HAVE_STATIC_CALL. CONFIG_ALMOST_DONT_HAVE_STATIC_CALL?
> >
> > The arch can define ARCH_DEFINE_STATIC_CALL_NOP() which uses inline
> > asm to create a CFI-compliant NOP/BUG/whatever version of the
> > function (insert lots of hand-waving). Is the kcfi hash available
> > to inline asm at build time?
>
> Yes, clang creates magic symbol for everything it sees a declaration
> for. This symbols can be referenced from asm, linking will make it all
> work.
>
> And yes, C sucks, you can't actually create a function definition from a
> type :/ Otherwise this could be trivially fixable.

Wouldn't creating a separate inline assembly nop function that
references the CFI hash of another function with the correct type
potentially solve this issue like Josh suggested?

Sami

2023-03-14 01:58:57

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/5] static_call: Make NULL static calls consistent

On Mon, Mar 13, 2023 at 10:48:58AM -0700, Sami Tolvanen wrote:
> On Sun, Mar 12, 2023 at 8:17 AM Peter Zijlstra <[email protected]> wrote:
> >
> > On Fri, Mar 10, 2023 at 05:20:04PM -0800, Josh Poimboeuf wrote:
> > > 2) Create yet another "tier" of static call implementations, for
> > > arches which can have the unfortunate combo of CFI_CLANG +
> > > !HAVE_STATIC_CALL. CONFIG_ALMOST_DONT_HAVE_STATIC_CALL?
> > >
> > > The arch can define ARCH_DEFINE_STATIC_CALL_NOP() which uses inline
> > > asm to create a CFI-compliant NOP/BUG/whatever version of the
> > > function (insert lots of hand-waving). Is the kcfi hash available
> > > to inline asm at build time?
> >
> > Yes, clang creates a magic symbol for everything it sees a declaration
> > for. These symbols can be referenced from asm; linking will make it all
> > work.
> >
> > And yes, C sucks, you can't actually create a function definition from a
> > type :/ Otherwise this could be trivially fixable.
>
> Wouldn't creating a separate inline assembly nop function that
> references the CFI hash of another function with the correct type
> potentially solve this issue like Josh suggested?

Right, I was thinking something like this, where the nop function gets
generated by DEFINE_STATIC_CALL().

Completely untested of course...

#define STATIC_CALL_NOP_PREFIX __SCN__
#define STATIC_CALL_NOP(name) __PASTE(STATIC_CALL_NOP_PREFIX, name)
#define STATIC_CALL_NOP_STR(name) __stringify(STATIC_CALL_NOP(name))

#define ARCH_DEFINE_STATIC_CALL_NOP(name, func) \
	asm(".align 4 \n" \
	    ".word __kcfi_typeid_" STATIC_CALL_NOP_STR(name) " \n" \
	    ".globl " STATIC_CALL_NOP_STR(name) " \n" \
	    STATIC_CALL_NOP_STR(name) ": \n" \
	    "bti c \n" \
	    "mov x0, xzr \n" \
	    "ret \n" \
	    ".type " STATIC_CALL_NOP_STR(name) ", @function \n" \
	    ".size " STATIC_CALL_NOP_STR(name) ", . - " STATIC_CALL_NOP_STR(name) " \n")

#define DECLARE_STATIC_CALL(name, func) \
	extern struct static_call_key STATIC_CALL_KEY(name); \
	extern typeof(func) STATIC_CALL_TRAMP(name); \
	extern typeof(func) STATIC_CALL_NOP(name)

#define DEFINE_STATIC_CALL(name, _func, _func_init) \
	DECLARE_STATIC_CALL(name, _func); \
	ARCH_DEFINE_STATIC_CALL_NOP(name, _func); \
	struct static_call_key STATIC_CALL_KEY(name) = { \
		.func = _func_init, \
	}
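
Updating a site to "do nothing" would then presumably be spelled out
explicitly at the call site, something like:

	static_call_update(foo, STATIC_CALL_NOP(foo));
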
--
Josh

2023-03-14 10:07:08

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/5] static_call: Make NULL static calls consistent

On Mon, Mar 13, 2023 at 06:58:36PM -0700, Josh Poimboeuf wrote:
> On Mon, Mar 13, 2023 at 10:48:58AM -0700, Sami Tolvanen wrote:
> > On Sun, Mar 12, 2023 at 8:17 AM Peter Zijlstra <[email protected]> wrote:
> > >
> > > On Fri, Mar 10, 2023 at 05:20:04PM -0800, Josh Poimboeuf wrote:
> > > > 2) Create yet another "tier" of static call implementations, for
> > > > arches which can have the unfortunate combo of CFI_CLANG +
> > > > !HAVE_STATIC_CALL. CONFIG_ALMOST_DONT_HAVE_STATIC_CALL?
> > > >
> > > > The arch can define ARCH_DEFINE_STATIC_CALL_NOP() which uses inline
> > > > asm to create a CFI-compliant NOP/BUG/whatever version of the
> > > > function (insert lots of hand-waving). Is the kcfi hash available
> > > > to inline asm at build time?
> > >
> > > Yes, clang creates a magic symbol for everything it sees a declaration
> > > for. These symbols can be referenced from asm; linking will make it all
> > > work.
> > >
> > > And yes, C sucks, you can't actually create a function definition from a
> > > type :/ Otherwise this could be trivially fixable.
> >
> > Wouldn't creating a separate inline assembly nop function that
> > references the CFI hash of another function with the correct type
> > potentially solve this issue like Josh suggested?

Yes it would, and the below looks about right. It's just a shame the C
language itself cannot sanely express that. Also, having a ton of silly
little nop functions is daft, but alas.

> Right, I was thinking something like this, where the nop function gets
> generated by DEFINE_STATIC_CALL().
>
> Completely untested of course...
>
> #define STATIC_CALL_NOP_PREFIX __SCN__
> #define STATIC_CALL_NOP(name) __PASTE(STATIC_CALL_NOP_PREFIX, name)
> #define STATIC_CALL_NOP_STR(name) __stringify(STATIC_CALL_NOP(name))
>
> #define ARCH_DEFINE_STATIC_CALL_NOP(name, func) \
> asm(".align 4 \n" \

IIRC arm64 just changed (or is about to) their alignment muck. I think
you can write this like:

".balign " __stringify(CONFIG_FUNCTION_ALIGNMENT) " \n" \

or somesuch...

> ".word __kcfi_typeid_" STATIC_CALL_NOP_STR(name) " \n" \
> ".globl " STATIC_CALL_NOP_STR(name) " \n" \
> STATIC_CALL_NOP_STR(name) ": \n" \
> "bti c \n" \
> "mov x0, xzr \n" \
> "ret \n" \
> ".type " STATIC_CALL_NOP_STR(name) ", @function \n" \
> ".size " STATIC_CALL_NOP_STR(name) ", . - " STATIC_CALL_NOP_STR(name) " \n")
>
> #define DECLARE_STATIC_CALL(name, func) \
> extern struct static_call_key STATIC_CALL_KEY(name); \
> extern typeof(func) STATIC_CALL_TRAMP(name); \
> extern typeof(func) STATIC_CALL_NOP(name)
>
> #define DEFINE_STATIC_CALL(name, _func, _func_init) \
> DECLARE_STATIC_CALL(name, _func); \
> ARCH_DEFINE_STATIC_CALL_NOP(name, _func); \
> struct static_call_key STATIC_CALL_KEY(name) = { \
> .func = _func_init, \
> }
> --
> Josh