2021-11-10 00:07:31

by Ard Biesheuvel

[permalink] [raw]
Subject: [RFC PATCH 0/7] static call updates

This series addresses a number of asymmetries and inconsistencies in
both the API and the implementation of static calls. The aim is to
ensure that the documented API is fully implemented and works correctly,
regardless of which variety of static calls the arch decides to
implement.

Another goal of this series is to ensure that the caller of the API does
not need to choose a particular kind of macro depending on whether the
static call exports its key and/or whether the static call in question
may implement the NULL flavor.

Cc: Peter Zijlstra <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Jason Baron <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Kees Cook <[email protected]>

Ard Biesheuvel (7):
static_call: get rid of static_call_cond()
static_call: deal with unexported keys without cluttering up the API
static_call: use helper to access non-exported key
static_call: fix broken static_call_query() for non-exported keys
static_call: use non-function types to refer to the trampolines
static_call: rename EXPORT_ macros to be more self-explanatory
static_call: add generic support for non-exported keys

arch/x86/events/core.c | 20 ++--
arch/x86/include/asm/kvm_host.h | 4 +-
arch/x86/include/asm/preempt.h | 4 +-
arch/x86/include/asm/static_call.h | 6 -
arch/x86/kvm/irq.c | 2 +-
arch/x86/kvm/x86.c | 10 +-
include/asm-generic/vmlinux.lds.h | 5 +-
include/linux/kernel.h | 2 +-
include/linux/sched.h | 2 +-
include/linux/static_call.h | 117 +++++++++-----------
include/linux/static_call_types.h | 54 +++++----
kernel/extable.c | 1 +
kernel/sched/core.c | 8 +-
kernel/static_call.c | 29 +----
tools/include/linux/static_call_types.h | 54 +++++----
tools/objtool/check.c | 43 ++++---
16 files changed, 174 insertions(+), 187 deletions(-)

--
2.30.2


2021-11-10 00:07:35

by Ard Biesheuvel

[permalink] [raw]
Subject: [RFC PATCH 1/7] static_call: get rid of static_call_cond()

The main reason for the existence of static_call_cond() seems to be that
in theory, when using the generic implementation of static calls, it may
be possible for a compiler to elide the indirect call entirely if the
target is NULL, while still guaranteeing that all side effects of
argument evaluation occur as expected.

This is rather optimistic: as documented by an existing code comment,
both GCC and Clang (version 10) get this wrong, and even if they ever
get it right, this is far too subtle to rely on for a code path that is
expected to be used only by the 'remaining' architectures once all the
best supported ones implement either the out-of-line or inline optimized
variety of static calls.

Given that having static_call_cond() clutters up the API, and puts the
burden on the caller to go and check what kind of static call they are
dealing with, let's just get rid of the distinction.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/x86/events/core.c | 20 ++++----
arch/x86/include/asm/kvm_host.h | 4 +-
arch/x86/kvm/irq.c | 2 +-
arch/x86/kvm/x86.c | 4 +-
include/linux/static_call.h | 49 ++++----------------
5 files changed, 23 insertions(+), 56 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 38b2c779146f..ed1e5fcdb49d 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -991,7 +991,7 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
if (cpuc->txn_flags & PERF_PMU_TXN_ADD)
n0 -= cpuc->n_txn;

- static_call_cond(x86_pmu_start_scheduling)(cpuc);
+ static_call(x86_pmu_start_scheduling)(cpuc);

for (i = 0, wmin = X86_PMC_IDX_MAX, wmax = 0; i < n; i++) {
c = cpuc->event_constraint[i];
@@ -1090,7 +1090,7 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
*/
if (!unsched && assign) {
for (i = 0; i < n; i++)
- static_call_cond(x86_pmu_commit_scheduling)(cpuc, i, assign[i]);
+ static_call(x86_pmu_commit_scheduling)(cpuc, i, assign[i]);
} else {
for (i = n0; i < n; i++) {
e = cpuc->event_list[i];
@@ -1098,13 +1098,13 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
/*
* release events that failed scheduling
*/
- static_call_cond(x86_pmu_put_event_constraints)(cpuc, e);
+ static_call(x86_pmu_put_event_constraints)(cpuc, e);

cpuc->event_constraint[i] = NULL;
}
}

- static_call_cond(x86_pmu_stop_scheduling)(cpuc);
+ static_call(x86_pmu_stop_scheduling)(cpuc);

return unsched ? -EINVAL : 0;
}
@@ -1217,7 +1217,7 @@ static inline void x86_assign_hw_event(struct perf_event *event,
hwc->last_cpu = smp_processor_id();
hwc->last_tag = ++cpuc->tags[i];

- static_call_cond(x86_pmu_assign)(event, idx);
+ static_call(x86_pmu_assign)(event, idx);

switch (hwc->idx) {
case INTEL_PMC_IDX_FIXED_BTS:
@@ -1494,7 +1494,7 @@ static int x86_pmu_add(struct perf_event *event, int flags)
* This is before x86_pmu_enable() will call x86_pmu_start(),
* so we enable LBRs before an event needs them etc..
*/
- static_call_cond(x86_pmu_add)(event);
+ static_call(x86_pmu_add)(event);

ret = 0;
out:
@@ -1647,7 +1647,7 @@ static void x86_pmu_del(struct perf_event *event, int flags)
if (i >= cpuc->n_events - cpuc->n_added)
--cpuc->n_added;

- static_call_cond(x86_pmu_put_event_constraints)(cpuc, event);
+ static_call(x86_pmu_put_event_constraints)(cpuc, event);

/* Delete the array entry. */
while (++i < cpuc->n_events) {
@@ -1667,7 +1667,7 @@ static void x86_pmu_del(struct perf_event *event, int flags)
* This is after x86_pmu_stop(); so we disable LBRs after any
* event can need them etc..
*/
- static_call_cond(x86_pmu_del)(event);
+ static_call(x86_pmu_del)(event);
}

int x86_pmu_handle_irq(struct pt_regs *regs)
@@ -2638,13 +2638,13 @@ static const struct attribute_group *x86_pmu_attr_groups[] = {

static void x86_pmu_sched_task(struct perf_event_context *ctx, bool sched_in)
{
- static_call_cond(x86_pmu_sched_task)(ctx, sched_in);
+ static_call(x86_pmu_sched_task)(ctx, sched_in);
}

static void x86_pmu_swap_task_ctx(struct perf_event_context *prev,
struct perf_event_context *next)
{
- static_call_cond(x86_pmu_swap_task_ctx)(prev, next);
+ static_call(x86_pmu_swap_task_ctx)(prev, next);
}

void perf_check_microcode(void)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 2acf37cc1991..295f6285b895 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1914,12 +1914,12 @@ static inline bool kvm_irq_is_postable(struct kvm_lapic_irq *irq)

static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
{
- static_call_cond(kvm_x86_vcpu_blocking)(vcpu);
+ static_call(kvm_x86_vcpu_blocking)(vcpu);
}

static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
{
- static_call_cond(kvm_x86_vcpu_unblocking)(vcpu);
+ static_call(kvm_x86_vcpu_unblocking)(vcpu);
}

static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
index 172b05343cfd..79e9354f076a 100644
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -150,7 +150,7 @@ void __kvm_migrate_timers(struct kvm_vcpu *vcpu)
{
__kvm_migrate_apic_timer(vcpu);
__kvm_migrate_pit_timer(vcpu);
- static_call_cond(kvm_x86_migrate_timers)(vcpu);
+ static_call(kvm_x86_migrate_timers)(vcpu);
}

bool kvm_arch_irqfd_allowed(struct kvm *kvm, struct kvm_irqfd *args)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c1c4e2b05a63..2202db26c4d7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11418,7 +11418,7 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
__x86_set_memory_region(kvm, TSS_PRIVATE_MEMSLOT, 0, 0);
mutex_unlock(&kvm->slots_lock);
}
- static_call_cond(kvm_x86_vm_destroy)(kvm);
+ static_call(kvm_x86_vm_destroy)(kvm);
kvm_free_msr_filter(srcu_dereference_check(kvm->arch.msr_filter, &kvm->srcu, 1));
kvm_pic_destroy(kvm);
kvm_ioapic_destroy(kvm);
@@ -12038,7 +12038,7 @@ bool kvm_arch_can_dequeue_async_page_present(struct kvm_vcpu *vcpu)
void kvm_arch_start_assignment(struct kvm *kvm)
{
if (atomic_inc_return(&kvm->arch.assigned_device_count) == 1)
- static_call_cond(kvm_x86_start_assignment)(kvm);
+ static_call(kvm_x86_start_assignment)(kvm);
}
EXPORT_SYMBOL_GPL(kvm_arch_start_assignment);

diff --git a/include/linux/static_call.h b/include/linux/static_call.h
index 3e56a9751c06..19c98cab8643 100644
--- a/include/linux/static_call.h
+++ b/include/linux/static_call.h
@@ -92,23 +92,16 @@
*
* where the argument evaludation also depends on the pointer value.
*
- * When calling a static_call that can be NULL, use:
- *
- * static_call_cond(name)(arg1);
- *
- * which will include the required value tests to avoid NULL-pointer
- * dereferences.
- *
* To query which function is currently set to be called, use:
*
* func = static_call_query(name);
*
*
- * DEFINE_STATIC_CALL_RET0 / __static_call_return0:
+ * DEFINE_STATIC_CALL_RET0:
*
- * Just like how DEFINE_STATIC_CALL_NULL() / static_call_cond() optimize the
- * conditional void function call, DEFINE_STATIC_CALL_RET0 /
- * __static_call_return0 optimize the do nothing return 0 function.
+ * Just like how DEFINE_STATIC_CALL_NULL() optimizes the conditional void
+ * function call, DEFINE_STATIC_CALL_RET0 optimizes the do nothing return 0
+ * function.
*
* This feature is strictly UB per the C standard (since it casts a function
* pointer to a different signature) and relies on the architecture ABI to
@@ -196,8 +189,6 @@ extern long __static_call_return0(void);
}; \
ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name)

-#define static_call_cond(name) (void)__static_call(name)
-
#define EXPORT_STATIC_CALL(name) \
EXPORT_SYMBOL(STATIC_CALL_KEY(name)); \
EXPORT_SYMBOL(STATIC_CALL_TRAMP(name))
@@ -232,8 +223,6 @@ static inline int static_call_init(void) { return 0; }
ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name)


-#define static_call_cond(name) (void)__static_call(name)
-
static inline
void __static_call_update(struct static_call_key *key, void *tramp, void *func)
{
@@ -275,6 +264,8 @@ static inline long __static_call_return0(void)
return 0;
}

+static inline void __static_call_nop(void) { }
+
#define __DEFINE_STATIC_CALL(name, _func, _func_init) \
DECLARE_STATIC_CALL(name, _func); \
struct static_call_key STATIC_CALL_KEY(name) = { \
@@ -284,37 +275,13 @@ static inline long __static_call_return0(void)
#define DEFINE_STATIC_CALL_NULL(name, _func) \
DECLARE_STATIC_CALL(name, _func); \
struct static_call_key STATIC_CALL_KEY(name) = { \
- .func = NULL, \
+ .func = (void *)__static_call_nop, \
}

-static inline void __static_call_nop(void) { }
-
-/*
- * This horrific hack takes care of two things:
- *
- * - it ensures the compiler will only load the function pointer ONCE,
- * which avoids a reload race.
- *
- * - it ensures the argument evaluation is unconditional, similar
- * to the HAVE_STATIC_CALL variant.
- *
- * Sadly current GCC/Clang (10 for both) do not optimize this properly
- * and will emit an indirect call for the NULL case :-(
- */
-#define __static_call_cond(name) \
-({ \
- void *func = READ_ONCE(STATIC_CALL_KEY(name).func); \
- if (!func) \
- func = &__static_call_nop; \
- (typeof(STATIC_CALL_TRAMP(name))*)func; \
-})
-
-#define static_call_cond(name) (void)__static_call_cond(name)
-
static inline
void __static_call_update(struct static_call_key *key, void *tramp, void *func)
{
- WRITE_ONCE(key->func, func);
+ WRITE_ONCE(key->func, func ?: (void *)&__static_call_nop);
}

static inline int static_call_text_reserved(void *start, void *end)
--
2.30.2

2021-11-10 00:07:37

by Ard Biesheuvel

[permalink] [raw]
Subject: [RFC PATCH 2/7] static_call: deal with unexported keys without cluttering up the API

In principle, invoking a static call does not require access to the
static call key that is used to manipulate the function that the static
call targets. This means that exporting only the static call but not its
key to modules should prevent such modules from manipulating the target
inadvertently.

However, for inline static calls, we do need access to the key at module
load time, because the code patching routines rely on it. For this
reason, there is special handling in objtool that recovers the key
pointer from the address of the trampoline (which does get exported
unconditionally), using a lookup table in the core kernel.

Unfortunately, this clutters up the user-visible API: a static call is
normally invoked using the static_call() macro, but in cases where a)
the call originates from a module and b) the associated key was not
exported, it is left up to the user to pick the static_call_mod() macro
instead. Since that macro relies on the lookup table in the core kernel,
it can only be used on static calls that are exported from the core
kernel.

It would be much better if the call sites can simply use static_call()
and not have to figure out whether or not the static call exports its
key and/or whether it as exported from the core kernel.

So let's address this, by using a weak reference for the key. This means
we can force the symbol reference in the module to be emitted
unconditionally, allowing the caller to use static_call() without
knowing the difference. This does imply that the logic in objtool that
tests whether a symbol exists in the object no longer works, as the
symbol is guaranteed to exist. So instead, add another field to struct
static_call_site to cover the trampoline reference, and fix up the
module notifier that consumes this data accordingly.

This gets rid of the following macros:

__STATIC_CALL_MOD_ADDRESSABLE()
static_call_mod()
__static_call()
__raw_static_call()

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/x86/include/asm/preempt.h | 4 +--
include/linux/kernel.h | 2 +-
include/linux/sched.h | 2 +-
include/linux/static_call_types.h | 26 +++++-----------
kernel/static_call.c | 8 ++---
tools/include/linux/static_call_types.h | 26 +++++-----------
tools/objtool/check.c | 31 ++++++++++----------
7 files changed, 38 insertions(+), 61 deletions(-)

diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h
index fe5efbcba824..341745a03726 100644
--- a/arch/x86/include/asm/preempt.h
+++ b/arch/x86/include/asm/preempt.h
@@ -121,7 +121,7 @@ DECLARE_STATIC_CALL(preempt_schedule, __preempt_schedule_func);

#define __preempt_schedule() \
do { \
- __STATIC_CALL_MOD_ADDRESSABLE(preempt_schedule); \
+ __STATIC_CALL_ADDRESSABLE(preempt_schedule); \
asm volatile ("call " STATIC_CALL_TRAMP_STR(preempt_schedule) : ASM_CALL_CONSTRAINT); \
} while (0)

@@ -129,7 +129,7 @@ DECLARE_STATIC_CALL(preempt_schedule_notrace, __preempt_schedule_notrace_func);

#define __preempt_schedule_notrace() \
do { \
- __STATIC_CALL_MOD_ADDRESSABLE(preempt_schedule_notrace); \
+ __STATIC_CALL_ADDRESSABLE(preempt_schedule_notrace); \
asm volatile ("call " STATIC_CALL_TRAMP_STR(preempt_schedule_notrace) : ASM_CALL_CONSTRAINT); \
} while (0)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index e5359b09de1d..1e1159878e40 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -101,7 +101,7 @@ DECLARE_STATIC_CALL(might_resched, __cond_resched);

static __always_inline void might_resched(void)
{
- static_call_mod(might_resched)();
+ static_call(might_resched)();
}

#else
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 78c351e35fec..3cdf55c1074c 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2014,7 +2014,7 @@ DECLARE_STATIC_CALL(cond_resched, __cond_resched);

static __always_inline int _cond_resched(void)
{
- return static_call_mod(cond_resched)();
+ return static_call(cond_resched)();
}

#else
diff --git a/include/linux/static_call_types.h b/include/linux/static_call_types.h
index 5a00b8b2cf9f..0bb36294cce7 100644
--- a/include/linux/static_call_types.h
+++ b/include/linux/static_call_types.h
@@ -32,15 +32,20 @@
struct static_call_site {
s32 addr;
s32 key;
+ s32 tramp;
};

#define DECLARE_STATIC_CALL(name, func) \
- extern struct static_call_key STATIC_CALL_KEY(name); \
+ extern __weak struct static_call_key STATIC_CALL_KEY(name); \
extern typeof(func) STATIC_CALL_TRAMP(name);

#ifdef CONFIG_HAVE_STATIC_CALL

-#define __raw_static_call(name) (&STATIC_CALL_TRAMP(name))
+#define static_call(name) \
+({ \
+ __STATIC_CALL_ADDRESSABLE(name); \
+ (&STATIC_CALL_TRAMP(name)); \
+})

#ifdef CONFIG_HAVE_STATIC_CALL_INLINE

@@ -52,12 +57,6 @@ struct static_call_site {
#define __STATIC_CALL_ADDRESSABLE(name) \
__ADDRESSABLE(STATIC_CALL_KEY(name))

-#define __static_call(name) \
-({ \
- __STATIC_CALL_ADDRESSABLE(name); \
- __raw_static_call(name); \
-})
-
struct static_call_key {
void *func;
union {
@@ -71,7 +70,6 @@ struct static_call_key {
#else /* !CONFIG_HAVE_STATIC_CALL_INLINE */

#define __STATIC_CALL_ADDRESSABLE(name)
-#define __static_call(name) __raw_static_call(name)

struct static_call_key {
void *func;
@@ -79,16 +77,6 @@ struct static_call_key {

#endif /* CONFIG_HAVE_STATIC_CALL_INLINE */

-#ifdef MODULE
-#define __STATIC_CALL_MOD_ADDRESSABLE(name)
-#define static_call_mod(name) __raw_static_call(name)
-#else
-#define __STATIC_CALL_MOD_ADDRESSABLE(name) __STATIC_CALL_ADDRESSABLE(name)
-#define static_call_mod(name) __static_call(name)
-#endif
-
-#define static_call(name) __static_call(name)
-
#else

struct static_call_key {
diff --git a/kernel/static_call.c b/kernel/static_call.c
index 43ba0b1e0edb..360cc3cd0fbf 100644
--- a/kernel/static_call.c
+++ b/kernel/static_call.c
@@ -366,18 +366,18 @@ static int static_call_add_module(struct module *mod)
* means modules are allowed to call static_call_update() on
* it.
*
- * Otherwise, the key isn't exported, and 'addr' points to the
+ * Otherwise, the key isn't exported, and 'tramp' points to the
* trampoline so we need to lookup the key.
*
* We go through this dance to prevent crazy modules from
* abusing sensitive static calls.
*/
- if (!kernel_text_address(addr))
+ if (addr)
continue;

- key = tramp_key_lookup(addr);
+ key = tramp_key_lookup((unsigned long)offset_to_ptr(&site->tramp));
if (!key) {
- pr_warn("Failed to fixup __raw_static_call() usage at: %ps\n",
+ pr_warn("Failed to fixup static_call() usage at: %ps\n",
static_call_addr(site));
return -EINVAL;
}
diff --git a/tools/include/linux/static_call_types.h b/tools/include/linux/static_call_types.h
index 5a00b8b2cf9f..0bb36294cce7 100644
--- a/tools/include/linux/static_call_types.h
+++ b/tools/include/linux/static_call_types.h
@@ -32,15 +32,20 @@
struct static_call_site {
s32 addr;
s32 key;
+ s32 tramp;
};

#define DECLARE_STATIC_CALL(name, func) \
- extern struct static_call_key STATIC_CALL_KEY(name); \
+ extern __weak struct static_call_key STATIC_CALL_KEY(name); \
extern typeof(func) STATIC_CALL_TRAMP(name);

#ifdef CONFIG_HAVE_STATIC_CALL

-#define __raw_static_call(name) (&STATIC_CALL_TRAMP(name))
+#define static_call(name) \
+({ \
+ __STATIC_CALL_ADDRESSABLE(name); \
+ (&STATIC_CALL_TRAMP(name)); \
+})

#ifdef CONFIG_HAVE_STATIC_CALL_INLINE

@@ -52,12 +57,6 @@ struct static_call_site {
#define __STATIC_CALL_ADDRESSABLE(name) \
__ADDRESSABLE(STATIC_CALL_KEY(name))

-#define __static_call(name) \
-({ \
- __STATIC_CALL_ADDRESSABLE(name); \
- __raw_static_call(name); \
-})
-
struct static_call_key {
void *func;
union {
@@ -71,7 +70,6 @@ struct static_call_key {
#else /* !CONFIG_HAVE_STATIC_CALL_INLINE */

#define __STATIC_CALL_ADDRESSABLE(name)
-#define __static_call(name) __raw_static_call(name)

struct static_call_key {
void *func;
@@ -79,16 +77,6 @@ struct static_call_key {

#endif /* CONFIG_HAVE_STATIC_CALL_INLINE */

-#ifdef MODULE
-#define __STATIC_CALL_MOD_ADDRESSABLE(name)
-#define static_call_mod(name) __raw_static_call(name)
-#else
-#define __STATIC_CALL_MOD_ADDRESSABLE(name) __STATIC_CALL_ADDRESSABLE(name)
-#define static_call_mod(name) __static_call(name)
-#endif
-
-#define static_call(name) __static_call(name)
-
#else

struct static_call_key {
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index add39902166d..ec6ddaaf6cf6 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -652,21 +652,8 @@ static int create_static_call_sections(struct objtool_file *file)

key_sym = find_symbol_by_name(file->elf, tmp);
if (!key_sym) {
- if (!module) {
- WARN("static_call: can't find static_call_key symbol: %s", tmp);
- return -1;
- }
-
- /*
- * For modules(), the key might not be exported, which
- * means the module can make static calls but isn't
- * allowed to change them.
- *
- * In that case we temporarily set the key to be the
- * trampoline address. This is fixed up in
- * static_call_add_module().
- */
- key_sym = insn->call_dest;
+ WARN("static_call: can't find static_call_key symbol: %s", tmp);
+ return -1;
}
free(key_name);

@@ -677,6 +664,20 @@ static int create_static_call_sections(struct objtool_file *file)
is_sibling_call(insn) * STATIC_CALL_SITE_TAIL))
return -1;

+ /*
+ * For modules(), the key might not be exported, which means
+ * the module can make static calls but isn't allowed to change
+ * them.
+ *
+ * For this case, we pass the trampoline in .static_call_sites
+ * as well. This is used to lookup the key in
+ * static_call_add_module().
+ */
+ if (elf_add_reloc(file->elf, sec,
+ idx * sizeof(struct static_call_site) + 8,
+ R_X86_64_PC32, insn->call_dest, 0))
+ return -1;
+
idx++;
}

--
2.30.2

2021-11-10 00:07:39

by Ard Biesheuvel

[permalink] [raw]
Subject: [RFC PATCH 3/7] static_call: use helper to access non-exported key

We support static calls from modules without the keys being exported,
but since the inline patching code needs access to the key regardless,
we have special handling in the module notifier that looks up the key
based on the trampoline address in a list that is kept especially for
this purpose.

This list is part of the core kernel, and only contains static calls
exported from the core kernel. This means that exporting static calls
without the associated key is only possible from the core kernel, and
any attempts to do so from another module currently result in module
load failures.

So let's replace this list with a special per-static call helper that
returns the static call key address, provided that the call originates
from the core kernel. This prevents access to the key for modular
callers, while removing the need to keep a list of key/trampoline pairs.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
Alternatively, we might fix the module loader to look for
.static_call_tramp_key sections in all modules loaded prior, but I'm not
sure whether that would be cleaner.

arch/x86/include/asm/static_call.h | 6 -----
include/asm-generic/vmlinux.lds.h | 5 +----
include/linux/static_call.h | 18 ++++++++-------
include/linux/static_call_types.h | 9 +++++++-
kernel/extable.c | 1 +
kernel/static_call.c | 23 +++-----------------
tools/include/linux/static_call_types.h | 9 +++++++-
tools/objtool/check.c | 20 ++++++++++++-----
8 files changed, 45 insertions(+), 46 deletions(-)

diff --git a/arch/x86/include/asm/static_call.h b/arch/x86/include/asm/static_call.h
index cbb67b6030f9..beb2bbaae8b1 100644
--- a/arch/x86/include/asm/static_call.h
+++ b/arch/x86/include/asm/static_call.h
@@ -38,10 +38,4 @@
__ARCH_DEFINE_STATIC_CALL_TRAMP(name, "ret; nop; nop; nop; nop")


-#define ARCH_ADD_TRAMP_KEY(name) \
- asm(".pushsection .static_call_tramp_key, \"a\" \n" \
- ".long " STATIC_CALL_TRAMP_STR(name) " - . \n" \
- ".long " STATIC_CALL_KEY_STR(name) " - . \n" \
- ".popsection \n")
-
#endif /* _ASM_STATIC_CALL_H */
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 42f3866bca69..5fd802ccb62f 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -408,10 +408,7 @@
. = ALIGN(8); \
__start_static_call_sites = .; \
KEEP(*(.static_call_sites)) \
- __stop_static_call_sites = .; \
- __start_static_call_tramp_key = .; \
- KEEP(*(.static_call_tramp_key)) \
- __stop_static_call_tramp_key = .;
+ __stop_static_call_sites = .;

/*
* Allow architectures to handle ro_after_init data on their
diff --git a/include/linux/static_call.h b/include/linux/static_call.h
index 19c98cab8643..3bba0bcba844 100644
--- a/include/linux/static_call.h
+++ b/include/linux/static_call.h
@@ -161,12 +161,6 @@ struct static_call_mod {
struct static_call_site *sites;
};

-/* For finding the key associated with a trampoline */
-struct static_call_tramp_key {
- s32 tramp;
- s32 key;
-};
-
extern void __static_call_update(struct static_call_key *key, void *tramp, void *func);
extern int static_call_mod_init(struct module *mod);
extern int static_call_text_reserved(void *start, void *end);
@@ -196,13 +190,21 @@ extern long __static_call_return0(void);
EXPORT_SYMBOL_GPL(STATIC_CALL_KEY(name)); \
EXPORT_SYMBOL_GPL(STATIC_CALL_TRAMP(name))

+#define EXPORT_STATIC_CALL_GETKEY_HELPER(name) \
+ struct static_call_key *STATIC_CALL_GETKEY(name)(void) { \
+ BUG_ON(!core_kernel_text( \
+ (unsigned long)__builtin_return_address(0))); \
+ return &STATIC_CALL_KEY(name); \
+ } \
+ EXPORT_SYMBOL_GPL(STATIC_CALL_GETKEY(name))
+
/* Leave the key unexported, so modules can't change static call targets: */
#define EXPORT_STATIC_CALL_TRAMP(name) \
EXPORT_SYMBOL(STATIC_CALL_TRAMP(name)); \
- ARCH_ADD_TRAMP_KEY(name)
+ EXPORT_STATIC_CALL_GETKEY_HELPER(name)
#define EXPORT_STATIC_CALL_TRAMP_GPL(name) \
EXPORT_SYMBOL_GPL(STATIC_CALL_TRAMP(name)); \
- ARCH_ADD_TRAMP_KEY(name)
+ EXPORT_STATIC_CALL_GETKEY_HELPER(name)

#elif defined(CONFIG_HAVE_STATIC_CALL)

diff --git a/include/linux/static_call_types.h b/include/linux/static_call_types.h
index 0bb36294cce7..a31782909e43 100644
--- a/include/linux/static_call_types.h
+++ b/include/linux/static_call_types.h
@@ -18,6 +18,11 @@
#define STATIC_CALL_TRAMP(name) __PASTE(STATIC_CALL_TRAMP_PREFIX, name)
#define STATIC_CALL_TRAMP_STR(name) __stringify(STATIC_CALL_TRAMP(name))

+#define STATIC_CALL_GETKEY_PREFIX __SCG__
+#define STATIC_CALL_GETKEY_PREFIX_STR __stringify(STATIC_CALL_GETKEY_PREFIX)
+#define STATIC_CALL_GETKEY_PREFIX_LEN (sizeof(STATIC_CALL_GETKEY_PREFIX_STR) - 1)
+#define STATIC_CALL_GETKEY(name) __PASTE(STATIC_CALL_GETKEY_PREFIX, name)
+
/*
* Flags in the low bits of static_call_site::key.
*/
@@ -32,11 +37,12 @@
struct static_call_site {
s32 addr;
s32 key;
- s32 tramp;
+ s32 helper;
};

#define DECLARE_STATIC_CALL(name, func) \
extern __weak struct static_call_key STATIC_CALL_KEY(name); \
+ extern __weak struct static_call_key *STATIC_CALL_GETKEY(name)(void);\
extern typeof(func) STATIC_CALL_TRAMP(name);

#ifdef CONFIG_HAVE_STATIC_CALL
@@ -55,6 +61,7 @@ struct static_call_site {
* .static_call_sites section.
*/
#define __STATIC_CALL_ADDRESSABLE(name) \
+ __ADDRESSABLE(STATIC_CALL_GETKEY(name)) \
__ADDRESSABLE(STATIC_CALL_KEY(name))

struct static_call_key {
diff --git a/kernel/extable.c b/kernel/extable.c
index 290661f68e6b..e52579e46180 100644
--- a/kernel/extable.c
+++ b/kernel/extable.c
@@ -81,6 +81,7 @@ int notrace core_kernel_text(unsigned long addr)
return 1;
return 0;
}
+EXPORT_SYMBOL_GPL(core_kernel_text);

/**
* core_kernel_data - tell if addr points to kernel data
diff --git a/kernel/static_call.c b/kernel/static_call.c
index 360cc3cd0fbf..29ae772dc86e 100644
--- a/kernel/static_call.c
+++ b/kernel/static_call.c
@@ -12,8 +12,6 @@

extern struct static_call_site __start_static_call_sites[],
__stop_static_call_sites[];
-extern struct static_call_tramp_key __start_static_call_tramp_key[],
- __stop_static_call_tramp_key[];

static bool static_call_initialized;

@@ -333,23 +331,6 @@ static int __static_call_mod_text_reserved(void *start, void *end)
return ret;
}

-static unsigned long tramp_key_lookup(unsigned long addr)
-{
- struct static_call_tramp_key *start = __start_static_call_tramp_key;
- struct static_call_tramp_key *stop = __stop_static_call_tramp_key;
- struct static_call_tramp_key *tramp_key;
-
- for (tramp_key = start; tramp_key != stop; tramp_key++) {
- unsigned long tramp;
-
- tramp = (long)tramp_key->tramp + (long)&tramp_key->tramp;
- if (tramp == addr)
- return (long)tramp_key->key + (long)&tramp_key->key;
- }
-
- return 0;
-}
-
static int static_call_add_module(struct module *mod)
{
struct static_call_site *start = mod->static_call_sites;
@@ -359,6 +340,7 @@ static int static_call_add_module(struct module *mod)
for (site = start; site != stop; site++) {
unsigned long s_key = __static_call_key(site);
unsigned long addr = s_key & ~STATIC_CALL_SITE_FLAGS;
+ unsigned long (*key_helper)(void);
unsigned long key;

/*
@@ -375,7 +357,8 @@ static int static_call_add_module(struct module *mod)
if (addr)
continue;

- key = tramp_key_lookup((unsigned long)offset_to_ptr(&site->tramp));
+ key_helper = offset_to_ptr(&site->helper);
+ key = key_helper();
if (!key) {
pr_warn("Failed to fixup static_call() usage at: %ps\n",
static_call_addr(site));
diff --git a/tools/include/linux/static_call_types.h b/tools/include/linux/static_call_types.h
index 0bb36294cce7..a31782909e43 100644
--- a/tools/include/linux/static_call_types.h
+++ b/tools/include/linux/static_call_types.h
@@ -18,6 +18,11 @@
#define STATIC_CALL_TRAMP(name) __PASTE(STATIC_CALL_TRAMP_PREFIX, name)
#define STATIC_CALL_TRAMP_STR(name) __stringify(STATIC_CALL_TRAMP(name))

+#define STATIC_CALL_GETKEY_PREFIX __SCG__
+#define STATIC_CALL_GETKEY_PREFIX_STR __stringify(STATIC_CALL_GETKEY_PREFIX)
+#define STATIC_CALL_GETKEY_PREFIX_LEN (sizeof(STATIC_CALL_GETKEY_PREFIX_STR) - 1)
+#define STATIC_CALL_GETKEY(name) __PASTE(STATIC_CALL_GETKEY_PREFIX, name)
+
/*
* Flags in the low bits of static_call_site::key.
*/
@@ -32,11 +37,12 @@
struct static_call_site {
s32 addr;
s32 key;
- s32 tramp;
+ s32 helper;
};

#define DECLARE_STATIC_CALL(name, func) \
extern __weak struct static_call_key STATIC_CALL_KEY(name); \
+ extern __weak struct static_call_key *STATIC_CALL_GETKEY(name)(void);\
extern typeof(func) STATIC_CALL_TRAMP(name);

#ifdef CONFIG_HAVE_STATIC_CALL
@@ -55,6 +61,7 @@ struct static_call_site {
* .static_call_sites section.
*/
#define __STATIC_CALL_ADDRESSABLE(name) \
+ __ADDRESSABLE(STATIC_CALL_GETKEY(name)) \
__ADDRESSABLE(STATIC_CALL_KEY(name))

struct static_call_key {
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index ec6ddaaf6cf6..3698ace884a5 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -600,7 +600,7 @@ static int create_static_call_sections(struct objtool_file *file)
struct section *sec;
struct static_call_site *site;
struct instruction *insn;
- struct symbol *key_sym;
+ struct symbol *key_sym, *helper_sym;
char *key_name, *tmp;
int idx;

@@ -655,7 +655,6 @@ static int create_static_call_sections(struct objtool_file *file)
WARN("static_call: can't find static_call_key symbol: %s", tmp);
return -1;
}
- free(key_name);

/* populate reloc for 'key' */
if (elf_add_reloc(file->elf, sec,
@@ -664,18 +663,27 @@ static int create_static_call_sections(struct objtool_file *file)
is_sibling_call(insn) * STATIC_CALL_SITE_TAIL))
return -1;

+ memcpy(tmp, STATIC_CALL_GETKEY_PREFIX_STR, STATIC_CALL_GETKEY_PREFIX_LEN);
+
+ helper_sym = find_symbol_by_name(file->elf, tmp);
+ if (!helper_sym) {
+ WARN("static_call: can't find static_call_key symbol: %s", tmp);
+ return -1;
+ }
+ free(key_name);
+
/*
* For modules(), the key might not be exported, which means
* the module can make static calls but isn't allowed to change
* them.
*
- * For this case, we pass the trampoline in .static_call_sites
- * as well. This is used to lookup the key in
- * static_call_add_module().
+ * For this case, we pass the special key helper routine in
+ * .static_call_sites as well. This is used to lookup the key
+ * in static_call_add_module().
*/
if (elf_add_reloc(file->elf, sec,
idx * sizeof(struct static_call_site) + 8,
- R_X86_64_PC32, insn->call_dest, 0))
+ R_X86_64_PC32, helper_sym, 0))
return -1;

idx++;
--
2.30.2

2021-11-10 00:07:41

by Ard Biesheuvel

[permalink] [raw]
Subject: [RFC PATCH 5/7] static_call: use non-function types to refer to the trampolines

In order to prevent CFI enabled code from grabbing a jump table entry
that jumps to the trampoline, rather than the trampoline itself, use an
incomplete non-function type for the trampoline, and cast it to the
right type only when invoking it.

Acked-by: Peter Zijlstra <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
---
include/linux/static_call.h | 2 +-
include/linux/static_call_types.h | 6 +++---
tools/include/linux/static_call_types.h | 6 +++---
3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/static_call.h b/include/linux/static_call.h
index 391f737496eb..6b261fe96ba3 100644
--- a/include/linux/static_call.h
+++ b/include/linux/static_call.h
@@ -144,7 +144,7 @@ extern void arch_static_call_transform(void *site, void *tramp, void *func, bool

#define static_call_update(name, func) \
({ \
- typeof(&STATIC_CALL_TRAMP(name)) __F = (func); \
+ typeof(STATIC_CALL_QUERY(name)()) __F = (func); \
__static_call_update(&STATIC_CALL_KEY(name), \
STATIC_CALL_TRAMP_ADDR(name), __F); \
})
diff --git a/include/linux/static_call_types.h b/include/linux/static_call_types.h
index 2fce9aa8a995..eae11c5b015d 100644
--- a/include/linux/static_call_types.h
+++ b/include/linux/static_call_types.h
@@ -47,7 +47,7 @@ struct static_call_site {
extern __weak struct static_call_key STATIC_CALL_KEY(name); \
extern __weak struct static_call_key *STATIC_CALL_GETKEY(name)(void);\
extern __weak typeof(func) *STATIC_CALL_QUERY(name)(void); \
- extern typeof(func) STATIC_CALL_TRAMP(name)
+ extern struct static_call_tramp STATIC_CALL_TRAMP(name)

#define __static_call_query(name) \
((typeof(STATIC_CALL_QUERY(name)()))READ_ONCE(STATIC_CALL_KEY(name).func))
@@ -66,7 +66,7 @@ struct static_call_site {
#define static_call(name) \
({ \
__STATIC_CALL_ADDRESSABLE(name); \
- (&STATIC_CALL_TRAMP(name)); \
+ ((typeof(STATIC_CALL_QUERY(name)()))&STATIC_CALL_TRAMP(name)); \
})

#ifdef CONFIG_HAVE_STATIC_CALL_INLINE
@@ -107,7 +107,7 @@ struct static_call_key {
};

#define static_call(name) \
- ((typeof(STATIC_CALL_TRAMP(name))*)(STATIC_CALL_KEY(name).func))
+ ((typeof(STATIC_CALL_QUERY(name)()))(STATIC_CALL_KEY(name).func))

#endif /* CONFIG_HAVE_STATIC_CALL */

diff --git a/tools/include/linux/static_call_types.h b/tools/include/linux/static_call_types.h
index 2fce9aa8a995..eae11c5b015d 100644
--- a/tools/include/linux/static_call_types.h
+++ b/tools/include/linux/static_call_types.h
@@ -47,7 +47,7 @@ struct static_call_site {
extern __weak struct static_call_key STATIC_CALL_KEY(name); \
extern __weak struct static_call_key *STATIC_CALL_GETKEY(name)(void);\
extern __weak typeof(func) *STATIC_CALL_QUERY(name)(void); \
- extern typeof(func) STATIC_CALL_TRAMP(name)
+ extern struct static_call_tramp STATIC_CALL_TRAMP(name)

#define __static_call_query(name) \
((typeof(STATIC_CALL_QUERY(name)()))READ_ONCE(STATIC_CALL_KEY(name).func))
@@ -66,7 +66,7 @@ struct static_call_site {
#define static_call(name) \
({ \
__STATIC_CALL_ADDRESSABLE(name); \
- (&STATIC_CALL_TRAMP(name)); \
+ ((typeof(STATIC_CALL_QUERY(name)()))&STATIC_CALL_TRAMP(name)); \
})

#ifdef CONFIG_HAVE_STATIC_CALL_INLINE
@@ -107,7 +107,7 @@ struct static_call_key {
};

#define static_call(name) \
- ((typeof(STATIC_CALL_TRAMP(name))*)(STATIC_CALL_KEY(name).func))
+ ((typeof(STATIC_CALL_QUERY(name)()))(STATIC_CALL_KEY(name).func))

#endif /* CONFIG_HAVE_STATIC_CALL */

--
2.30.2

2021-11-10 00:07:49

by Ard Biesheuvel

[permalink] [raw]
Subject: [RFC PATCH 7/7] static_call: add generic support for non-exported keys

The static_call feature where only the static call target is exposed to
modules while access to the key is withheld is only implemented for the
inline and out-of-line varieties, but not for the generic indirect call
based variety.

Let's use static_call_query() to obtain the call target address if the
key is not exported. Note that this means we can do the converse as
well, i.e., implement static_call_query() in terms of the key struct if
the key is exported, so we only need to export one of the two.

Note that this adds an additional load and potentially a direct call to
the instantiations of static_call residing in module code.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
include/linux/static_call.h | 4 ++--
include/linux/static_call_types.h | 3 +--
tools/include/linux/static_call_types.h | 3 +--
3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/include/linux/static_call.h b/include/linux/static_call.h
index 21929147879a..d88fd613f06d 100644
--- a/include/linux/static_call.h
+++ b/include/linux/static_call.h
@@ -302,8 +302,8 @@ static inline int static_call_text_reserved(void *start, void *end)
return 0;
}

-#define EXPORT_STATIC_CALL(name) EXPORT_SYMBOL(STATIC_CALL_KEY(name))
-#define EXPORT_STATIC_CALL_GPL(name) EXPORT_SYMBOL_GPL(STATIC_CALL_KEY(name))
+#define EXPORT_STATIC_CALL(name) EXPORT_STATIC_CALL_QUERY(name,)
+#define EXPORT_STATIC_CALL_GPL(name) EXPORT_STATIC_CALL_QUERY(name, _GPL)

#define EXPORT_STATIC_CALL_FOR_UPDATE(name) EXPORT_SYMBOL(STATIC_CALL_KEY(name))
#define EXPORT_STATIC_CALL_FOR_UPDATE_GPL(name) EXPORT_SYMBOL_GPL(STATIC_CALL_KEY(name))
diff --git a/include/linux/static_call_types.h b/include/linux/static_call_types.h
index eae11c5b015d..920828ad1d69 100644
--- a/include/linux/static_call_types.h
+++ b/include/linux/static_call_types.h
@@ -106,8 +106,7 @@ struct static_call_key {
void *func;
};

-#define static_call(name) \
- ((typeof(STATIC_CALL_QUERY(name)()))(STATIC_CALL_KEY(name).func))
+#define static_call(name) static_call_query(name)

#endif /* CONFIG_HAVE_STATIC_CALL */

diff --git a/tools/include/linux/static_call_types.h b/tools/include/linux/static_call_types.h
index eae11c5b015d..920828ad1d69 100644
--- a/tools/include/linux/static_call_types.h
+++ b/tools/include/linux/static_call_types.h
@@ -106,8 +106,7 @@ struct static_call_key {
void *func;
};

-#define static_call(name) \
- ((typeof(STATIC_CALL_QUERY(name)()))(STATIC_CALL_KEY(name).func))
+#define static_call(name) static_call_query(name)

#endif /* CONFIG_HAVE_STATIC_CALL */

--
2.30.2

2021-11-10 00:07:50

by Ard Biesheuvel

[permalink] [raw]
Subject: [RFC PATCH 6/7] static_call: rename EXPORT_ macros to be more self-explanatory

There are currently two ways to export static calls to modules:
- export only the current static call target, so that modules can invoke
it but not make it point to another target;
- export the static call key so that modules may call
static_call_update() and patch all invocations tied to that key.

Depending on the flavor of static call implementation, different things
need to be exported: for the generic case, we need to export the key
regardless of which export was chosen, but for other cases, we only
export the trampoline but not the key.

As the latter is an implementation detail, EXPORT_STATIC_CALL_TRAMP() is
not the best choice for the name of the export macro that only exports
the target but not the key.

So let's rename these macros to

EXPORT_STATIC_CALL -> EXPORT_STATIC_CALL_FOR_UPDATE
EXPORT_STATIC_CALL_TRAMP -> EXPORT_STATIC_CALL

(and the _GPL suffixed ones in a similar fashion). That way, the default
does not expose the key, and users that need to expose control over the
static call key to modules can do so explicitly by choosing the
_FOR_UPDATE version instead.

While at it, add the missing ones for the generic static call
implementation.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/x86/kvm/x86.c | 6 ++--
include/linux/static_call.h | 35 ++++++++++++--------
kernel/sched/core.c | 8 ++---
3 files changed, 28 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2202db26c4d7..c3749911e966 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -130,9 +130,9 @@ EXPORT_SYMBOL_GPL(kvm_x86_ops);
*(((struct kvm_x86_ops *)0)->func));
#define KVM_X86_OP_NULL KVM_X86_OP
#include <asm/kvm-x86-ops.h>
-EXPORT_STATIC_CALL_GPL(kvm_x86_get_cs_db_l_bits);
-EXPORT_STATIC_CALL_GPL(kvm_x86_cache_reg);
-EXPORT_STATIC_CALL_GPL(kvm_x86_tlb_flush_current);
+EXPORT_STATIC_CALL_FOR_UPDATE_GPL(kvm_x86_get_cs_db_l_bits);
+EXPORT_STATIC_CALL_FOR_UPDATE_GPL(kvm_x86_cache_reg);
+EXPORT_STATIC_CALL_FOR_UPDATE_GPL(kvm_x86_tlb_flush_current);

static bool __read_mostly ignore_msrs = 0;
module_param(ignore_msrs, bool, S_IRUGO | S_IWUSR);
diff --git a/include/linux/static_call.h b/include/linux/static_call.h
index 6b261fe96ba3..21929147879a 100644
--- a/include/linux/static_call.h
+++ b/include/linux/static_call.h
@@ -26,7 +26,7 @@
* static_call_update(name, func);
* static_call_query(name);
*
- * EXPORT_STATIC_CALL{,_TRAMP}{,_GPL}()
+ * EXPORT_STATIC_CALL{,_FOR_UPDATE}{,_GPL}()
*
* Usage example:
*
@@ -90,7 +90,7 @@
* if (my_func_ptr)
* my_func_ptr(arg1)
*
- * where the argument evaludation also depends on the pointer value.
+ * where the argument evaluation also depends on the pointer value.
*
* To query which function is currently set to be called, use:
*
@@ -116,11 +116,15 @@
* Notably argument setup is unconditional.
*
*
- * EXPORT_STATIC_CALL() vs EXPORT_STATIC_CALL_TRAMP():
+ * EXPORT_STATIC_CALL() vs EXPORT_STATIC_CALL_FOR_UPDATE():
*
- * The difference is that the _TRAMP variant tries to only export the
- * trampoline with the result that a module can use static_call{,_cond}() but
- * not static_call_update().
+ * EXPORT_STATIC_CALL() exports the minimal set of symbols that are needed
+ * for a module to be able invoke the static call, with the result that it
+ * can use static_call{,_cond}() but not static_call_update(). If the intent
+ * is to permit modules to manipulate the targets of static calls associated
+ * with a certain static call key occurring anywhere in the core kernel or
+ * other modules than the one importing the static call, use
+ * EXPORT_STATIC_CALL_FOR_UPDATE() instead.
*
*/

@@ -186,10 +190,10 @@ extern long __static_call_return0(void);
}; \
ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name)

-#define EXPORT_STATIC_CALL(name) \
+#define EXPORT_STATIC_CALL_FOR_UPDATE(name) \
EXPORT_SYMBOL(STATIC_CALL_KEY(name)); \
EXPORT_SYMBOL(STATIC_CALL_TRAMP(name))
-#define EXPORT_STATIC_CALL_GPL(name) \
+#define EXPORT_STATIC_CALL_FOR_UPDATE_GPL(name) \
EXPORT_SYMBOL_GPL(STATIC_CALL_KEY(name)); \
EXPORT_SYMBOL_GPL(STATIC_CALL_TRAMP(name))

@@ -202,11 +206,11 @@ extern long __static_call_return0(void);
EXPORT_SYMBOL_GPL(STATIC_CALL_GETKEY(name))

/* Leave the key unexported, so modules can't change static call targets: */
-#define EXPORT_STATIC_CALL_TRAMP(name) \
+#define EXPORT_STATIC_CALL(name) \
EXPORT_STATIC_CALL_QUERY(name,); \
EXPORT_SYMBOL(STATIC_CALL_TRAMP(name)); \
EXPORT_STATIC_CALL_GETKEY_HELPER(name)
-#define EXPORT_STATIC_CALL_TRAMP_GPL(name) \
+#define EXPORT_STATIC_CALL_GPL(name) \
EXPORT_STATIC_CALL_QUERY(name, _GPL); \
EXPORT_SYMBOL_GPL(STATIC_CALL_TRAMP(name)); \
EXPORT_STATIC_CALL_GETKEY_HELPER(name)
@@ -249,18 +253,18 @@ static inline long __static_call_return0(void)
return 0;
}

-#define EXPORT_STATIC_CALL(name) \
+#define EXPORT_STATIC_CALL_FOR_UPDATE(name) \
EXPORT_SYMBOL(STATIC_CALL_KEY(name)); \
EXPORT_SYMBOL(STATIC_CALL_TRAMP(name))
-#define EXPORT_STATIC_CALL_GPL(name) \
+#define EXPORT_STATIC_CALL_FOR_UPDATE_GPL(name) \
EXPORT_SYMBOL_GPL(STATIC_CALL_KEY(name)); \
EXPORT_SYMBOL_GPL(STATIC_CALL_TRAMP(name))

/* Leave the key unexported, so modules can't change static call targets: */
-#define EXPORT_STATIC_CALL_TRAMP(name) \
+#define EXPORT_STATIC_CALL(name) \
EXPORT_STATIC_CALL_QUERY(name,); \
EXPORT_SYMBOL(STATIC_CALL_TRAMP(name))
-#define EXPORT_STATIC_CALL_TRAMP_GPL(name) \
+#define EXPORT_STATIC_CALL_GPL(name) \
EXPORT_STATIC_CALL_QUERY(name, _GPL); \
EXPORT_SYMBOL_GPL(STATIC_CALL_TRAMP(name))

@@ -301,6 +305,9 @@ static inline int static_call_text_reserved(void *start, void *end)
#define EXPORT_STATIC_CALL(name) EXPORT_SYMBOL(STATIC_CALL_KEY(name))
#define EXPORT_STATIC_CALL_GPL(name) EXPORT_SYMBOL_GPL(STATIC_CALL_KEY(name))

+#define EXPORT_STATIC_CALL_FOR_UPDATE(name) EXPORT_SYMBOL(STATIC_CALL_KEY(name))
+#define EXPORT_STATIC_CALL_FOR_UPDATE_GPL(name) EXPORT_SYMBOL_GPL(STATIC_CALL_KEY(name))
+
#endif /* CONFIG_HAVE_STATIC_CALL */

#define DEFINE_STATIC_CALL(name, _func) \
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 523fd602ea90..0e963d77176f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6445,7 +6445,7 @@ EXPORT_SYMBOL(preempt_schedule);

#ifdef CONFIG_PREEMPT_DYNAMIC
DEFINE_STATIC_CALL(preempt_schedule, __preempt_schedule_func);
-EXPORT_STATIC_CALL_TRAMP(preempt_schedule);
+EXPORT_STATIC_CALL(preempt_schedule);
#endif


@@ -6503,7 +6503,7 @@ EXPORT_SYMBOL_GPL(preempt_schedule_notrace);

#ifdef CONFIG_PREEMPT_DYNAMIC
DEFINE_STATIC_CALL(preempt_schedule_notrace, __preempt_schedule_notrace_func);
-EXPORT_STATIC_CALL_TRAMP(preempt_schedule_notrace);
+EXPORT_STATIC_CALL(preempt_schedule_notrace);
#endif

#endif /* CONFIG_PREEMPTION */
@@ -8150,10 +8150,10 @@ EXPORT_SYMBOL(__cond_resched);

#ifdef CONFIG_PREEMPT_DYNAMIC
DEFINE_STATIC_CALL_RET0(cond_resched, __cond_resched);
-EXPORT_STATIC_CALL_TRAMP(cond_resched);
+EXPORT_STATIC_CALL(cond_resched);

DEFINE_STATIC_CALL_RET0(might_resched, __cond_resched);
-EXPORT_STATIC_CALL_TRAMP(might_resched);
+EXPORT_STATIC_CALL(might_resched);
#endif

/*
--
2.30.2

2021-11-10 00:08:31

by Ard Biesheuvel

[permalink] [raw]
Subject: [RFC PATCH 4/7] static_call: fix broken static_call_query() for non-exported keys

static_call_query() accesses the func member of the static call key
directly, which means that it is broken for cases where it is used from
a module and the key resides elsewhere and is not exported.

Let's add a helper that returns this value, and export it from the same
module that the key resides in if the key itself is not exported. This
way, we can always get the value regardless of whether the key is
exported or not.

Note that the non-NULL check of &STATIC_CALL_KEY(...) does not typically
result in a load: for ISAs that support relocatable immediates, the
address is patched into the instruction stream.

For example, on ARM/Thumb2, we get

14a: f240 0300 movw r3, #0
14a: R_ARM_THM_MOVW_ABS_NC __SCK__pv_steal_clock
14e: f2c0 0300 movt r3, #0
14e: R_ARM_THM_MOVT_ABS __SCK__pv_steal_clock
152: b10b cbz r3, 158 <foo+0x14>
154: 6818 ldr r0, [r3, #0]
156: 4770 bx lr
158: f7ff bffe b.w 0 <__SCQ__pv_steal_clock>
158: R_ARM_THM_JUMP24 __SCQ__pv_steal_clock

I.e., it performs a conditional branch if zero (CBZ) on a quantity that
was loaded using an MOVW/MOVT move-immediate pair, and either loads the
func member directly, or tail calls __SCQ__pv_steal_clock if they key is
not exported.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
include/linux/static_call.h | 9 ++++++++-
include/linux/static_call_types.h | 18 +++++++++++++++++-
tools/include/linux/static_call_types.h | 18 +++++++++++++++++-
3 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/include/linux/static_call.h b/include/linux/static_call.h
index 3bba0bcba844..391f737496eb 100644
--- a/include/linux/static_call.h
+++ b/include/linux/static_call.h
@@ -149,7 +149,10 @@ extern void arch_static_call_transform(void *site, void *tramp, void *func, bool
STATIC_CALL_TRAMP_ADDR(name), __F); \
})

-#define static_call_query(name) (READ_ONCE(STATIC_CALL_KEY(name).func))
+#define EXPORT_STATIC_CALL_QUERY(name, sfx) \
+ typeof(STATIC_CALL_QUERY(name)()) STATIC_CALL_QUERY(name)(void) \
+ { return STATIC_CALL_KEY(name).func; } \
+ EXPORT_SYMBOL ## sfx (STATIC_CALL_QUERY(name))

#ifdef CONFIG_HAVE_STATIC_CALL_INLINE

@@ -200,9 +203,11 @@ extern long __static_call_return0(void);

/* Leave the key unexported, so modules can't change static call targets: */
#define EXPORT_STATIC_CALL_TRAMP(name) \
+ EXPORT_STATIC_CALL_QUERY(name,); \
EXPORT_SYMBOL(STATIC_CALL_TRAMP(name)); \
EXPORT_STATIC_CALL_GETKEY_HELPER(name)
#define EXPORT_STATIC_CALL_TRAMP_GPL(name) \
+ EXPORT_STATIC_CALL_QUERY(name, _GPL); \
EXPORT_SYMBOL_GPL(STATIC_CALL_TRAMP(name)); \
EXPORT_STATIC_CALL_GETKEY_HELPER(name)

@@ -253,8 +258,10 @@ static inline long __static_call_return0(void)

/* Leave the key unexported, so modules can't change static call targets: */
#define EXPORT_STATIC_CALL_TRAMP(name) \
+ EXPORT_STATIC_CALL_QUERY(name,); \
EXPORT_SYMBOL(STATIC_CALL_TRAMP(name))
#define EXPORT_STATIC_CALL_TRAMP_GPL(name) \
+ EXPORT_STATIC_CALL_QUERY(name, _GPL); \
EXPORT_SYMBOL_GPL(STATIC_CALL_TRAMP(name))

#else /* Generic implementation */
diff --git a/include/linux/static_call_types.h b/include/linux/static_call_types.h
index a31782909e43..2fce9aa8a995 100644
--- a/include/linux/static_call_types.h
+++ b/include/linux/static_call_types.h
@@ -23,6 +23,9 @@
#define STATIC_CALL_GETKEY_PREFIX_LEN (sizeof(STATIC_CALL_GETKEY_PREFIX_STR) - 1)
#define STATIC_CALL_GETKEY(name) __PASTE(STATIC_CALL_GETKEY_PREFIX, name)

+#define STATIC_CALL_QUERY_PREFIX __SCQ__
+#define STATIC_CALL_QUERY(name) __PASTE(STATIC_CALL_QUERY_PREFIX, name)
+
/*
* Flags in the low bits of static_call_site::key.
*/
@@ -43,7 +46,20 @@ struct static_call_site {
#define DECLARE_STATIC_CALL(name, func) \
extern __weak struct static_call_key STATIC_CALL_KEY(name); \
extern __weak struct static_call_key *STATIC_CALL_GETKEY(name)(void);\
- extern typeof(func) STATIC_CALL_TRAMP(name);
+ extern __weak typeof(func) *STATIC_CALL_QUERY(name)(void); \
+ extern typeof(func) STATIC_CALL_TRAMP(name)
+
+#define __static_call_query(name) \
+ ((typeof(STATIC_CALL_QUERY(name)()))READ_ONCE(STATIC_CALL_KEY(name).func))
+
+#ifdef MODULE
+/* the key might not be exported */
+#define static_call_query(name) \
+ (&STATIC_CALL_KEY(name) ? __static_call_query(name) \
+ : STATIC_CALL_QUERY(name)())
+#else
+#define static_call_query(name) __static_call_query(name)
+#endif

#ifdef CONFIG_HAVE_STATIC_CALL

diff --git a/tools/include/linux/static_call_types.h b/tools/include/linux/static_call_types.h
index a31782909e43..2fce9aa8a995 100644
--- a/tools/include/linux/static_call_types.h
+++ b/tools/include/linux/static_call_types.h
@@ -23,6 +23,9 @@
#define STATIC_CALL_GETKEY_PREFIX_LEN (sizeof(STATIC_CALL_GETKEY_PREFIX_STR) - 1)
#define STATIC_CALL_GETKEY(name) __PASTE(STATIC_CALL_GETKEY_PREFIX, name)

+#define STATIC_CALL_QUERY_PREFIX __SCQ__
+#define STATIC_CALL_QUERY(name) __PASTE(STATIC_CALL_QUERY_PREFIX, name)
+
/*
* Flags in the low bits of static_call_site::key.
*/
@@ -43,7 +46,20 @@ struct static_call_site {
#define DECLARE_STATIC_CALL(name, func) \
extern __weak struct static_call_key STATIC_CALL_KEY(name); \
extern __weak struct static_call_key *STATIC_CALL_GETKEY(name)(void);\
- extern typeof(func) STATIC_CALL_TRAMP(name);
+ extern __weak typeof(func) *STATIC_CALL_QUERY(name)(void); \
+ extern typeof(func) STATIC_CALL_TRAMP(name)
+
+#define __static_call_query(name) \
+ ((typeof(STATIC_CALL_QUERY(name)()))READ_ONCE(STATIC_CALL_KEY(name).func))
+
+#ifdef MODULE
+/* the key might not be exported */
+#define static_call_query(name) \
+ (&STATIC_CALL_KEY(name) ? __static_call_query(name) \
+ : STATIC_CALL_QUERY(name)())
+#else
+#define static_call_query(name) __static_call_query(name)
+#endif

#ifdef CONFIG_HAVE_STATIC_CALL

--
2.30.2

2021-11-10 00:12:34

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [RFC PATCH 1/7] static_call: get rid of static_call_cond()

On Tue, 9 Nov 2021 at 19:38, Peter Zijlstra <[email protected]> wrote:
>
> On Tue, Nov 09, 2021 at 05:45:43PM +0100, Ard Biesheuvel wrote:
> > The main reason for the existence of static_call_cond() seems to be that
> > in theory, when using the generic implementation of static calls, it may
> > be possible for a compiler to elide the indirect call entirely if the
> > target is NULL, while still guaranteeing that all side effects of
> > argument evaluation occur as expected.
> >
> > This is rather optimistic: as documented by an existing code comment,
> > both GCC and Clang (version 10) get this wrong, and even if they ever
> > get it right, this is far too subtle to rely on for a code path that is
> > expected to be used only by the 'remaining' architectures once all the
> > best supported ones implement either the out-of-line or inline optimized
> > variety of static calls.
> >
> > Given that having static_call_cond() clutters up the API, and puts the
> > burden on the caller to go and check what kind of static call they are
> > dealing with, let's just get rid of the distinction.
>
> No, static_call_cond() signifies the function can be NULL. Both gcc and
> clang generate correct (but wildly ineffecient) code for this. Without
> static_call_cond() the generic implementation will do a NULL deref.
>
> That is, static_call_cond() does properly encapuslate:
>
> func = READ_ONCE(key.func);
> if (func)
> func(ARGS);
>
> You can't take that out.

I actually address that in the patch.

AIUI, the compiler generates an indirect call to __static_call_nop(),
right? So why not simply set .func to the address of
__static_call_nop() when NULL is passed to update / the initializer?

2021-11-10 00:12:49

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH 2/7] static_call: deal with unexported keys without cluttering up the API

On Tue, Nov 09, 2021 at 05:45:44PM +0100, Ard Biesheuvel wrote:

> diff --git a/include/linux/static_call_types.h b/include/linux/static_call_types.h
> index 5a00b8b2cf9f..0bb36294cce7 100644
> --- a/include/linux/static_call_types.h
> +++ b/include/linux/static_call_types.h
> @@ -32,15 +32,20 @@
> struct static_call_site {
> s32 addr;
> s32 key;
> + s32 tramp;
> };

I can't say I'm thrilled at growing this thing, but the cleanup is nice.
Perhaps we can increase alignment on struct static_call_key and instead
frob it in .key still?

>
> #define DECLARE_STATIC_CALL(name, func) \
> - extern struct static_call_key STATIC_CALL_KEY(name); \
> + extern __weak struct static_call_key STATIC_CALL_KEY(name); \
> extern typeof(func) STATIC_CALL_TRAMP(name);

I'm a little bit confused on how this actually works. What does a __weak
extern data symbol do?

A __weak function definition would create a module local instance of the
function barring a strong override.

But what does a __weak extern do?

> diff --git a/kernel/static_call.c b/kernel/static_call.c
> index 43ba0b1e0edb..360cc3cd0fbf 100644
> --- a/kernel/static_call.c
> +++ b/kernel/static_call.c
> @@ -366,18 +366,18 @@ static int static_call_add_module(struct module *mod)
> * means modules are allowed to call static_call_update() on
> * it.
> *
> - * Otherwise, the key isn't exported, and 'addr' points to the
> + * Otherwise, the key isn't exported, and 'tramp' points to the
> * trampoline so we need to lookup the key.
> *
> * We go through this dance to prevent crazy modules from
> * abusing sensitive static calls.
> */
> - if (!kernel_text_address(addr))
> + if (addr)
> continue;

This seems to imply that the __weak extern symbol gets resolved to 0 at
module link time.

>
> - key = tramp_key_lookup(addr);
> + key = tramp_key_lookup((unsigned long)offset_to_ptr(&site->tramp));
> if (!key) {
> - pr_warn("Failed to fixup __raw_static_call() usage at: %ps\n",
> + pr_warn("Failed to fixup static_call() usage at: %ps\n",
> static_call_addr(site));
> return -EINVAL;

> }

2021-11-10 00:13:11

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [RFC PATCH 2/7] static_call: deal with unexported keys without cluttering up the API

On Tue, 9 Nov 2021 at 19:49, Peter Zijlstra <[email protected]> wrote:
>
> On Tue, Nov 09, 2021 at 05:45:44PM +0100, Ard Biesheuvel wrote:
>
> > diff --git a/include/linux/static_call_types.h b/include/linux/static_call_types.h
> > index 5a00b8b2cf9f..0bb36294cce7 100644
> > --- a/include/linux/static_call_types.h
> > +++ b/include/linux/static_call_types.h
> > @@ -32,15 +32,20 @@
> > struct static_call_site {
> > s32 addr;
> > s32 key;
> > + s32 tramp;
> > };
>
> I can't say I'm thrilled at growing this thing, but the cleanup is nice.
> Perhaps we can increase alignment on struct static_call_key and instead
> frob it in .key still?
>

This is already a place-relative field, and one points into the data
section and the other into text. So I don't see how we can squeeze
enough bits out of it to make this fit.

> >
> > #define DECLARE_STATIC_CALL(name, func) \
> > - extern struct static_call_key STATIC_CALL_KEY(name); \
> > + extern __weak struct static_call_key STATIC_CALL_KEY(name); \
> > extern typeof(func) STATIC_CALL_TRAMP(name);
>
> I'm a little bit confused on how this actually works. What does a __weak
> extern data symbol do?
>
> A __weak function definition would create a module local instance of the
> function barring a strong override.
>
> But what does a __weak extern do?
>

It is simply a reference that is permitted to remain undefined.

That is why (in another patch) I do sth like

extern __weak foo;

if (&foo)
... use foo
else
... use sth else


> > diff --git a/kernel/static_call.c b/kernel/static_call.c
> > index 43ba0b1e0edb..360cc3cd0fbf 100644
> > --- a/kernel/static_call.c
> > +++ b/kernel/static_call.c
> > @@ -366,18 +366,18 @@ static int static_call_add_module(struct module *mod)
> > * means modules are allowed to call static_call_update() on
> > * it.
> > *
> > - * Otherwise, the key isn't exported, and 'addr' points to the
> > + * Otherwise, the key isn't exported, and 'tramp' points to the
> > * trampoline so we need to lookup the key.
> > *
> > * We go through this dance to prevent crazy modules from
> > * abusing sensitive static calls.
> > */
> > - if (!kernel_text_address(addr))
> > + if (addr)
> > continue;
>
> This seems to imply that the __weak extern symbol gets resolved to 0 at
> module link time.
>

Yes, if the referenced symbol is not exported, it just uses 0x0 for its value.

> >
> > - key = tramp_key_lookup(addr);
> > + key = tramp_key_lookup((unsigned long)offset_to_ptr(&site->tramp));
> > if (!key) {
> > - pr_warn("Failed to fixup __raw_static_call() usage at: %ps\n",
> > + pr_warn("Failed to fixup static_call() usage at: %ps\n",
> > static_call_addr(site));
> > return -EINVAL;
>
> > }

2021-11-10 00:13:17

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH 4/7] static_call: fix broken static_call_query() for non-exported keys

On Tue, Nov 09, 2021 at 05:45:46PM +0100, Ard Biesheuvel wrote:
> static_call_query() accesses the func member of the static call key
> directly, which means that it is broken for cases where it is used from
> a module and the key resides elsewhere and is not exported.
>

Arguably when the module can't change the value, it shouldn't be having
any interest in obtaining said value either.

I really think you're exporting too much. Modules aren't good, they
deserve to suffer.

2021-11-10 00:13:23

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH 3/7] static_call: use helper to access non-exported key

On Tue, Nov 09, 2021 at 05:45:45PM +0100, Ard Biesheuvel wrote:
> @@ -196,13 +190,21 @@ extern long __static_call_return0(void);
> EXPORT_SYMBOL_GPL(STATIC_CALL_KEY(name)); \
> EXPORT_SYMBOL_GPL(STATIC_CALL_TRAMP(name))
>
> +#define EXPORT_STATIC_CALL_GETKEY_HELPER(name) \
> + struct static_call_key *STATIC_CALL_GETKEY(name)(void) { \
> + BUG_ON(!core_kernel_text( \
> + (unsigned long)__builtin_return_address(0))); \
> + return &STATIC_CALL_KEY(name); \
> + } \
> + EXPORT_SYMBOL_GPL(STATIC_CALL_GETKEY(name))

So if I were a nevarious module, I would look up the above symbol from
kallsyms (it is exported and easily obtainable) and then simply
read the text to discover the key address and we're in business.

2021-11-10 00:13:38

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [RFC PATCH 3/7] static_call: use helper to access non-exported key

On Tue, 9 Nov 2021 at 19:53, Peter Zijlstra <[email protected]> wrote:
>
> On Tue, Nov 09, 2021 at 05:45:45PM +0100, Ard Biesheuvel wrote:
> > @@ -196,13 +190,21 @@ extern long __static_call_return0(void);
> > EXPORT_SYMBOL_GPL(STATIC_CALL_KEY(name)); \
> > EXPORT_SYMBOL_GPL(STATIC_CALL_TRAMP(name))
> >
> > +#define EXPORT_STATIC_CALL_GETKEY_HELPER(name) \
> > + struct static_call_key *STATIC_CALL_GETKEY(name)(void) { \
> > + BUG_ON(!core_kernel_text( \
> > + (unsigned long)__builtin_return_address(0))); \
> > + return &STATIC_CALL_KEY(name); \
> > + } \
> > + EXPORT_SYMBOL_GPL(STATIC_CALL_GETKEY(name))
>
> So if I were a nevarious module, I would look up the above symbol from
> kallsyms (it is exported and easily obtainable) and then simply
> read the text to discover the key address and we're in business.
>

Yeah I realised that. So would you prefer to have a
.static_call_tramp_key section in each module and look up the keys in
the module loader?

2021-11-10 00:13:39

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [RFC PATCH 6/7] static_call: rename EXPORT_ macros to be more self-explanatory

On Tue, 9 Nov 2021 at 20:00, Peter Zijlstra <[email protected]> wrote:
>
> On Tue, Nov 09, 2021 at 05:45:48PM +0100, Ard Biesheuvel wrote:
> > So let's rename these macros to
> >
> > EXPORT_STATIC_CALL -> EXPORT_STATIC_CALL_FOR_UPDATE
> > EXPORT_STATIC_CALL_TRAMP -> EXPORT_STATIC_CALL
> >
>
> Ok, let's pain this shed a bit.
>
> How about:
>
> EXPORT_STATIC_CALL_RW
> EXPORT_STATIC_CALL_RO
>

Works for me

> respectively. OR.. alternatively, have both:
>
> EXPORT_STATIC_CALL_KEY
> EXPORT_STATIC_CALL_TRAMP
>
> and those that want to export both get to use both.

Yeah but whether a trampoline even exists is an implementation detail,
which I would like to omit from the API.

2021-11-10 00:13:48

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH 6/7] static_call: rename EXPORT_ macros to be more self-explanatory

On Tue, Nov 09, 2021 at 05:45:48PM +0100, Ard Biesheuvel wrote:
> So let's rename these macros to
>
> EXPORT_STATIC_CALL -> EXPORT_STATIC_CALL_FOR_UPDATE
> EXPORT_STATIC_CALL_TRAMP -> EXPORT_STATIC_CALL
>

Ok, let's pain this shed a bit.

How about:

EXPORT_STATIC_CALL_RW
EXPORT_STATIC_CALL_RO

respectively. OR.. alternatively, have both:

EXPORT_STATIC_CALL_KEY
EXPORT_STATIC_CALL_TRAMP

and those that want to export both get to use both.

2021-11-10 00:14:55

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH 1/7] static_call: get rid of static_call_cond()

On Tue, Nov 09, 2021 at 05:45:43PM +0100, Ard Biesheuvel wrote:
> The main reason for the existence of static_call_cond() seems to be that
> in theory, when using the generic implementation of static calls, it may
> be possible for a compiler to elide the indirect call entirely if the
> target is NULL, while still guaranteeing that all side effects of
> argument evaluation occur as expected.
>
> This is rather optimistic: as documented by an existing code comment,
> both GCC and Clang (version 10) get this wrong, and even if they ever
> get it right, this is far too subtle to rely on for a code path that is
> expected to be used only by the 'remaining' architectures once all the
> best supported ones implement either the out-of-line or inline optimized
> variety of static calls.
>
> Given that having static_call_cond() clutters up the API, and puts the
> burden on the caller to go and check what kind of static call they are
> dealing with, let's just get rid of the distinction.

No, static_call_cond() signifies the function can be NULL. Both gcc and
clang generate correct (but wildly ineffecient) code for this. Without
static_call_cond() the generic implementation will do a NULL deref.

That is, static_call_cond() does properly encapuslate:

func = READ_ONCE(key.func);
if (func)
func(ARGS);

You can't take that out.

2021-11-10 00:15:04

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH 1/7] static_call: get rid of static_call_cond()

On Tue, Nov 09, 2021 at 05:45:43PM +0100, Ard Biesheuvel wrote:
> static inline
> void __static_call_update(struct static_call_key *key, void *tramp, void *func)
> {
> + WRITE_ONCE(key->func, func ?: (void *)&__static_call_nop);
> }

Ha, yes, I suppose that ought to work.

2021-11-10 00:15:10

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH 1/7] static_call: get rid of static_call_cond()

On Tue, Nov 09, 2021 at 07:41:03PM +0100, Ard Biesheuvel wrote:
> On Tue, 9 Nov 2021 at 19:38, Peter Zijlstra <[email protected]> wrote:
> >
> > On Tue, Nov 09, 2021 at 05:45:43PM +0100, Ard Biesheuvel wrote:
> > > The main reason for the existence of static_call_cond() seems to be that
> > > in theory, when using the generic implementation of static calls, it may
> > > be possible for a compiler to elide the indirect call entirely if the
> > > target is NULL, while still guaranteeing that all side effects of
> > > argument evaluation occur as expected.
> > >
> > > This is rather optimistic: as documented by an existing code comment,
> > > both GCC and Clang (version 10) get this wrong, and even if they ever
> > > get it right, this is far too subtle to rely on for a code path that is
> > > expected to be used only by the 'remaining' architectures once all the
> > > best supported ones implement either the out-of-line or inline optimized
> > > variety of static calls.
> > >
> > > Given that having static_call_cond() clutters up the API, and puts the
> > > burden on the caller to go and check what kind of static call they are
> > > dealing with, let's just get rid of the distinction.
> >
> > No, static_call_cond() signifies the function can be NULL. Both gcc and
> > clang generate correct (but wildly ineffecient) code for this. Without
> > static_call_cond() the generic implementation will do a NULL deref.
> >
> > That is, static_call_cond() does properly encapuslate:
> >
> > func = READ_ONCE(key.func);
> > if (func)
> > func(ARGS);
> >
> > You can't take that out.
>
> I actually address that in the patch.
>
> AIUI, the compiler generates an indirect call to __static_call_nop(),
> right? So why not simply set .func to the address of
> __static_call_nop() when NULL is passed to update / the initializer?

Ooh, lemme go have a proper look then.

2021-11-10 00:17:49

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH 3/7] static_call: use helper to access non-exported key

On Tue, Nov 09, 2021 at 07:54:23PM +0100, Ard Biesheuvel wrote:
> On Tue, 9 Nov 2021 at 19:53, Peter Zijlstra <[email protected]> wrote:
> >
> > On Tue, Nov 09, 2021 at 05:45:45PM +0100, Ard Biesheuvel wrote:
> > > @@ -196,13 +190,21 @@ extern long __static_call_return0(void);
> > > EXPORT_SYMBOL_GPL(STATIC_CALL_KEY(name)); \
> > > EXPORT_SYMBOL_GPL(STATIC_CALL_TRAMP(name))
> > >
> > > +#define EXPORT_STATIC_CALL_GETKEY_HELPER(name) \
> > > + struct static_call_key *STATIC_CALL_GETKEY(name)(void) { \
> > > + BUG_ON(!core_kernel_text( \
> > > + (unsigned long)__builtin_return_address(0))); \
> > > + return &STATIC_CALL_KEY(name); \
> > > + } \
> > > + EXPORT_SYMBOL_GPL(STATIC_CALL_GETKEY(name))
> >
> > So if I were a nevarious module, I would look up the above symbol from
> > kallsyms (it is exported and easily obtainable) and then simply
> > read the text to discover the key address and we're in business.
> >
>
> Yeah I realised that. So would you prefer to have a
> .static_call_tramp_key section in each module and look up the keys in
> the module loader?

Think so... do you actually have a need for this?