2021-04-01 23:33:16

by Sami Tolvanen

Subject: [PATCH v5 00/18] Add support for Clang CFI

This series adds support for Clang's Control-Flow Integrity (CFI)
checking. With CFI, the compiler injects a runtime check before each
indirect function call to ensure the target is a valid function with
the correct static type. This restricts possible call targets and
makes it more difficult for an attacker to exploit bugs that allow the
modification of stored function pointers. For more details, see:

https://clang.llvm.org/docs/ControlFlowIntegrity.html
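
As a rough illustration (not part of this series; all names below are
hypothetical), the inserted check guards indirect calls such as the one
in the following sketch, and only lets the call proceed if the target
was defined with the same static function type:

/* Hypothetical sketch of the kind of call site CFI instruments. */
typedef int (*handler_t)(int arg);

static int good_handler(int arg)        /* matching type: call allowed */
{
        return arg + 1;
}

static int dispatch(handler_t h, int arg)
{
        /*
         * With CONFIG_CFI_CLANG, the compiler emits a check here that
         * verifies h points to a function of type int (*)(int) before
         * branching; a corrupted or type-mismatched pointer traps.
         */
        return h(arg);
}

static int example(void)
{
        return dispatch(good_handler, 1);       /* passes the check */
}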

The first patch contains build system changes and error handling,
and implements support for cross-module indirect call checking. The
remaining patches address issues caused by the compiler
instrumentation. These include fixing known type mismatches, as well
as issues with address space confusion and cross-module function
address equality.

These patches add support only for arm64; I'll also post patches for
x86_64 after we address the remaining issues there, including objtool
support.

You can also pull this series from

https://github.com/samitolvanen/linux.git cfi-v5

---
Changes in v5:
- Changed module.lds.S to only include <asm/page.h> when CFI is
enabled to fix the MIPS build.
- Added a patch that fixes dynamic ftrace with CFI on arm64.

Changes in v4:
- Per Mark's suggestion, dropped __pa_function() and renamed
__va_function() to function_nocfi().
- Added a comment to function_nocfi() to explain what it does.
- Updated the psci patch to use an intermediate variable for
the physical address for clarity.

Changes in v3:
- Added a patch to change list_sort() callers treewide to use
const pointers instead of simply removing the internal casts.
- Changed cleanup_symbol_name() to return bool.
- Changed module.lds.S to drop the .eh_frame section only with
CONFIG_CFI_CLANG.
- Switched to synchronize_rcu() in update_shadow().

Changes in v2:
- Fixed .text merging in module.lds.S.
- Added WARN_ON_FUNCTION_MISMATCH() and changed kernel/kthread.c
and kernel/workqueue.c to use the macro instead.


Sami Tolvanen (18):
add support for Clang CFI
cfi: add __cficanonical
mm: add generic function_nocfi macro
module: ensure __cfi_check alignment
workqueue: use WARN_ON_FUNCTION_MISMATCH
kthread: use WARN_ON_FUNCTION_MISMATCH
kallsyms: strip ThinLTO hashes from static functions
bpf: disable CFI in dispatcher functions
treewide: Change list_sort to use const pointers
lkdtm: use function_nocfi
psci: use function_nocfi for cpu_resume
arm64: implement function_nocfi
arm64: use function_nocfi with __pa_symbol
arm64: add __nocfi to functions that jump to a physical address
arm64: add __nocfi to __apply_alternatives
arm64: ftrace: use function_nocfi for ftrace_call
KVM: arm64: Disable CFI for nVHE
arm64: allow CONFIG_CFI_CLANG to be selected

Makefile | 17 +
arch/Kconfig | 45 +++
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/memory.h | 15 +
arch/arm64/include/asm/mmu_context.h | 4 +-
arch/arm64/kernel/acpi_parking_protocol.c | 3 +-
arch/arm64/kernel/alternative.c | 4 +-
arch/arm64/kernel/cpu-reset.h | 10 +-
arch/arm64/kernel/cpufeature.c | 4 +-
arch/arm64/kernel/ftrace.c | 2 +-
arch/arm64/kernel/psci.c | 3 +-
arch/arm64/kernel/smp_spin_table.c | 3 +-
arch/arm64/kvm/hyp/nvhe/Makefile | 6 +-
arch/arm64/kvm/vgic/vgic-its.c | 8 +-
arch/arm64/kvm/vgic/vgic.c | 3 +-
block/blk-mq-sched.c | 3 +-
block/blk-mq.c | 3 +-
drivers/acpi/nfit/core.c | 3 +-
drivers/acpi/numa/hmat.c | 3 +-
drivers/clk/keystone/sci-clk.c | 4 +-
drivers/firmware/psci/psci.c | 7 +-
drivers/gpu/drm/drm_modes.c | 3 +-
drivers/gpu/drm/i915/gt/intel_engine_user.c | 3 +-
drivers/gpu/drm/i915/gvt/debugfs.c | 2 +-
drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 3 +-
drivers/gpu/drm/radeon/radeon_cs.c | 4 +-
.../hw/usnic/usnic_uiom_interval_tree.c | 3 +-
drivers/interconnect/qcom/bcm-voter.c | 2 +-
drivers/md/raid5.c | 3 +-
drivers/misc/lkdtm/usercopy.c | 2 +-
drivers/misc/sram.c | 4 +-
drivers/nvme/host/core.c | 3 +-
.../controller/cadence/pcie-cadence-host.c | 3 +-
drivers/spi/spi-loopback-test.c | 3 +-
fs/btrfs/raid56.c | 3 +-
fs/btrfs/tree-log.c | 3 +-
fs/btrfs/volumes.c | 3 +-
fs/ext4/fsmap.c | 4 +-
fs/gfs2/glock.c | 3 +-
fs/gfs2/log.c | 2 +-
fs/gfs2/lops.c | 3 +-
fs/iomap/buffered-io.c | 3 +-
fs/ubifs/gc.c | 7 +-
fs/ubifs/replay.c | 4 +-
fs/xfs/scrub/bitmap.c | 4 +-
fs/xfs/xfs_bmap_item.c | 4 +-
fs/xfs/xfs_buf.c | 6 +-
fs/xfs/xfs_extent_busy.c | 4 +-
fs/xfs/xfs_extent_busy.h | 3 +-
fs/xfs/xfs_extfree_item.c | 4 +-
fs/xfs/xfs_refcount_item.c | 4 +-
fs/xfs/xfs_rmap_item.c | 4 +-
include/asm-generic/bug.h | 16 +
include/asm-generic/vmlinux.lds.h | 20 +-
include/linux/bpf.h | 4 +-
include/linux/cfi.h | 41 +++
include/linux/compiler-clang.h | 3 +
include/linux/compiler_types.h | 8 +
include/linux/init.h | 6 +-
include/linux/list_sort.h | 7 +-
include/linux/mm.h | 10 +
include/linux/module.h | 13 +-
include/linux/pci.h | 4 +-
init/Kconfig | 2 +-
kernel/Makefile | 4 +
kernel/cfi.c | 329 ++++++++++++++++++
kernel/kallsyms.c | 55 ++-
kernel/kthread.c | 3 +-
kernel/module.c | 43 +++
kernel/workqueue.c | 2 +-
lib/list_sort.c | 17 +-
lib/test_list_sort.c | 3 +-
net/tipc/name_table.c | 4 +-
scripts/Makefile.modfinal | 2 +-
scripts/module.lds.S | 19 +-
75 files changed, 759 insertions(+), 113 deletions(-)
create mode 100644 include/linux/cfi.h
create mode 100644 kernel/cfi.c


base-commit: 6905b1dc3c32a094f0da61bd656a740f0a97d592
--
2.31.0.208.g409f899ff0-goog


2021-04-01 23:33:21

by Sami Tolvanen

Subject: [PATCH v5 02/18] cfi: add __cficanonical

With CONFIG_CFI_CLANG, the compiler replaces a function address taken
in C code with the address of a local jump table entry, which passes
runtime indirect call checks. However, the compiler won't replace
addresses taken in assembly code, which will result in a CFI failure
if we later jump to such an address in instrumented C code. The code
generated for the non-canonical jump table looks like this:

<noncanonical.cfi_jt>: /* In C, &noncanonical points here */
jmp noncanonical
...
<noncanonical>: /* function body */
...

This change adds the __cficanonical attribute, which tells the
compiler to use a canonical jump table for the function instead. This
means the compiler renames the actual function to <function>.cfi and
points the original symbol to the jump table entry:

<canonical>: /* jump table entry */
jmp canonical.cfi
...
<canonical.cfi>: /* function body */
...

As a result, the address taken in assembly or other non-instrumented
code always points to the jump table and can therefore be used for
indirect calls in instrumented code without tripping CFI checks.
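
As a usage sketch (the function name below is hypothetical, not from
this patch), a C function whose address is also taken in assembly or
other non-instrumented code would be annotated like this:

/*
 * Hypothetical sketch: the symbol my_entry is referenced by name from
 * assembly. With __cficanonical, the symbol itself points at the jump
 * table entry, so the address the assembly takes is the same address
 * instrumented C code expects for indirect calls.
 */
void __cficanonical my_entry(void)
{
        /* the body is emitted as my_entry.cfi */
}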

Signed-off-by: Sami Tolvanen <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]> # pci.h
---
include/linux/compiler-clang.h | 1 +
include/linux/compiler_types.h | 4 ++++
include/linux/init.h | 4 ++--
include/linux/pci.h | 4 ++--
4 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index 6de9d0c9377e..adbe76b203e2 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -63,3 +63,4 @@
#endif

#define __nocfi __attribute__((__no_sanitize__("cfi")))
+#define __cficanonical __attribute__((__cfi_canonical_jump_table__))
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 796935a37e37..d29bda7f6ebd 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -246,6 +246,10 @@ struct ftrace_likely_data {
# define __nocfi
#endif

+#ifndef __cficanonical
+# define __cficanonical
+#endif
+
#ifndef asm_volatile_goto
#define asm_volatile_goto(x...) asm goto(x)
#endif
diff --git a/include/linux/init.h b/include/linux/init.h
index b3ea15348fbd..045ad1650ed1 100644
--- a/include/linux/init.h
+++ b/include/linux/init.h
@@ -220,8 +220,8 @@ extern bool initcall_debug;
__initcall_name(initstub, __iid, id)

#define __define_initcall_stub(__stub, fn) \
- int __init __stub(void); \
- int __init __stub(void) \
+ int __init __cficanonical __stub(void); \
+ int __init __cficanonical __stub(void) \
{ \
return fn(); \
} \
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 86c799c97b77..39684b72db91 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1944,8 +1944,8 @@ enum pci_fixup_pass {
#ifdef CONFIG_LTO_CLANG
#define __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class, \
class_shift, hook, stub) \
- void stub(struct pci_dev *dev); \
- void stub(struct pci_dev *dev) \
+ void __cficanonical stub(struct pci_dev *dev); \
+ void __cficanonical stub(struct pci_dev *dev) \
{ \
hook(dev); \
} \
--
2.31.0.208.g409f899ff0-goog

2021-04-01 23:33:29

by Sami Tolvanen

Subject: [PATCH v5 03/18] mm: add generic function_nocfi macro

With CONFIG_CFI_CLANG, the compiler replaces function addresses
in instrumented C code with jump table addresses. This means that
__pa_symbol(function) returns the physical address of the jump table
entry instead of the actual function, which may not work, as the jump
table code immediately jumps to a virtual address that may not be
mapped.

To avoid this address space confusion, this change adds a generic
definition for function_nocfi(), which architectures that support CFI
can override. A typical implementation would use inline assembly
to take the function address, which avoids compiler instrumentation.
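
A minimal usage sketch, mirroring how later patches in this series use
the macro (see patches 11 and 13): pass the address of the actual
function body, not the jump table entry, to code that runs outside the
kernel's virtual mappings.

	/* Take the real text address, then convert it to a physical address. */
	phys_addr_t pa_cpu_resume = __pa_symbol(function_nocfi(cpu_resume));

	return psci_ops.cpu_suspend(power_state, pa_cpu_resume);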

Signed-off-by: Sami Tolvanen <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
---
include/linux/mm.h | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8ba434287387..22cce9c7dd05 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -124,6 +124,16 @@ extern int mmap_rnd_compat_bits __read_mostly;
#define lm_alias(x) __va(__pa_symbol(x))
#endif

+/*
+ * With CONFIG_CFI_CLANG, the compiler replaces function addresses in
+ * instrumented C code with jump table addresses. Architectures that
+ * support CFI can define this macro to return the actual function address
+ * when needed.
+ */
+#ifndef function_nocfi
+#define function_nocfi(x) (x)
+#endif
+
/*
* To prevent common memory management code establishing
* a zero page mapping on a read fault.
--
2.31.0.208.g409f899ff0-goog

2021-04-01 23:33:37

by Sami Tolvanen

Subject: [PATCH v5 04/18] module: ensure __cfi_check alignment

CONFIG_CFI_CLANG_SHADOW assumes the __cfi_check() function is page
aligned and at the beginning of the .text section. While Clang would
normally align the function correctly, it fails to do so for modules
with no executable code.

This change ensures the correct __cfi_check() location and
alignment. It also discards the .eh_frame section, which Clang can
generate with certain sanitizers, such as CFI.

Link: https://bugs.llvm.org/show_bug.cgi?id=46293
Signed-off-by: Sami Tolvanen <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Acked-by: Jessica Yu <[email protected]>
---
scripts/module.lds.S | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/scripts/module.lds.S b/scripts/module.lds.S
index 168cd27e6122..f8022b34e388 100644
--- a/scripts/module.lds.S
+++ b/scripts/module.lds.S
@@ -3,10 +3,20 @@
* Archs are free to supply their own linker scripts. ld will
* combine them automatically.
*/
+#ifdef CONFIG_CFI_CLANG
+# include <asm/page.h>
+# define ALIGN_CFI ALIGN(PAGE_SIZE)
+# define SANITIZER_DISCARDS *(.eh_frame)
+#else
+# define ALIGN_CFI
+# define SANITIZER_DISCARDS
+#endif
+
SECTIONS {
/DISCARD/ : {
*(.discard)
*(.discard.*)
+ SANITIZER_DISCARDS
}

__ksymtab 0 : { *(SORT(___ksymtab+*)) }
@@ -40,7 +50,14 @@ SECTIONS {
*(.rodata..L*)
}

- .text : { *(.text .text.[0-9a-zA-Z_]*) }
+ /*
+ * With CONFIG_CFI_CLANG, we assume __cfi_check is at the beginning
+ * of the .text section, and is aligned to PAGE_SIZE.
+ */
+ .text : ALIGN_CFI {
+ *(.text.__cfi_check)
+ *(.text .text.[0-9a-zA-Z_]* .text..L.cfi*)
+ }
}

/* bring in arch-specific sections */
--
2.31.0.208.g409f899ff0-goog

2021-04-01 23:33:39

by Sami Tolvanen

Subject: [PATCH v5 05/18] workqueue: use WARN_ON_FUNCTION_MISMATCH

With CONFIG_CFI_CLANG, a callback function passed to
__queue_delayed_work from a module points to a jump table entry
defined in the module instead of the one used in the core kernel,
which breaks function address equality in this check:

WARN_ON_ONCE(timer->function != delayed_work_timer_fn);

Use WARN_ON_FUNCTION_MISMATCH() instead to disable the warning
when CFI and modules are both enabled.
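
For reference, a sketch of how such a macro could be defined (the
series adds it to include/asm-generic/bug.h; this is the shape implied
by the description above rather than a verbatim copy of that patch):

#if defined(CONFIG_CFI_CLANG) && defined(CONFIG_MODULES)
/*
 * With both CFI and modules enabled, cross-module function addresses
 * may legitimately differ, so skip the comparison entirely.
 */
# define WARN_ON_FUNCTION_MISMATCH(fn, fn2) ({ 0; })
#else
# define WARN_ON_FUNCTION_MISMATCH(fn, fn2) WARN_ON_ONCE((fn) != (fn2))
#endif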

Signed-off-by: Sami Tolvanen <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
---
kernel/workqueue.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 0d150da252e8..03fe07d2f39f 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1630,7 +1630,7 @@ static void __queue_delayed_work(int cpu, struct workqueue_struct *wq,
struct work_struct *work = &dwork->work;

WARN_ON_ONCE(!wq);
- WARN_ON_ONCE(timer->function != delayed_work_timer_fn);
+ WARN_ON_FUNCTION_MISMATCH(timer->function, delayed_work_timer_fn);
WARN_ON_ONCE(timer_pending(timer));
WARN_ON_ONCE(!list_empty(&work->entry));

--
2.31.0.208.g409f899ff0-goog

2021-04-01 23:33:49

by Sami Tolvanen

Subject: [PATCH v5 06/18] kthread: use WARN_ON_FUNCTION_MISMATCH

With CONFIG_CFI_CLANG, a callback function passed to
__kthread_queue_delayed_work from a module points to a jump table
entry defined in the module instead of the one used in the core
kernel, which breaks function address equality in this check:

WARN_ON_ONCE(timer->function != kthread_delayed_work_timer_fn);

Use WARN_ON_FUNCTION_MISMATCH() instead to disable the warning
when CFI and modules are both enabled.

Signed-off-by: Sami Tolvanen <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
---
kernel/kthread.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 1578973c5740..a1972eba2917 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -963,7 +963,8 @@ static void __kthread_queue_delayed_work(struct kthread_worker *worker,
struct timer_list *timer = &dwork->timer;
struct kthread_work *work = &dwork->work;

- WARN_ON_ONCE(timer->function != kthread_delayed_work_timer_fn);
+ WARN_ON_FUNCTION_MISMATCH(timer->function,
+ kthread_delayed_work_timer_fn);

/*
* If @delay is 0, queue @dwork->work immediately. This is for
--
2.31.0.208.g409f899ff0-goog

2021-04-01 23:33:59

by Sami Tolvanen

Subject: [PATCH v5 07/18] kallsyms: strip ThinLTO hashes from static functions

With CONFIG_CFI_CLANG and ThinLTO, Clang appends a hash to the names
of all static functions not marked __used. This can break userspace
tools that don't expect the function name to change, so strip out the
hash from the output.

Suggested-by: Jack Pham <[email protected]>
Signed-off-by: Sami Tolvanen <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
---
kernel/kallsyms.c | 55 ++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 50 insertions(+), 5 deletions(-)

diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 8043a90aa50e..c851ca0ed357 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -161,6 +161,27 @@ static unsigned long kallsyms_sym_address(int idx)
return kallsyms_relative_base - 1 - kallsyms_offsets[idx];
}

+#if defined(CONFIG_CFI_CLANG) && defined(CONFIG_LTO_CLANG_THIN)
+/*
+ * LLVM appends a hash to static function names when ThinLTO and CFI are
+ * both enabled, i.e. foo() becomes foo$707af9a22804d33c81801f27dcfe489b.
+ * This causes confusion and potentially breaks user space tools, so we
+ * strip the suffix from expanded symbol names.
+ */
+static inline bool cleanup_symbol_name(char *s)
+{
+ char *res;
+
+ res = strrchr(s, '$');
+ if (res)
+ *res = '\0';
+
+ return res != NULL;
+}
+#else
+static inline bool cleanup_symbol_name(char *s) { return false; }
+#endif
+
/* Lookup the address for this symbol. Returns 0 if not found. */
unsigned long kallsyms_lookup_name(const char *name)
{
@@ -173,6 +194,9 @@ unsigned long kallsyms_lookup_name(const char *name)

if (strcmp(namebuf, name) == 0)
return kallsyms_sym_address(i);
+
+ if (cleanup_symbol_name(namebuf) && strcmp(namebuf, name) == 0)
+ return kallsyms_sym_address(i);
}
return module_kallsyms_lookup_name(name);
}
@@ -303,7 +327,9 @@ const char *kallsyms_lookup(unsigned long addr,
namebuf, KSYM_NAME_LEN);
if (modname)
*modname = NULL;
- return namebuf;
+
+ ret = namebuf;
+ goto found;
}

/* See if it's in a module or a BPF JITed image. */
@@ -316,11 +342,16 @@ const char *kallsyms_lookup(unsigned long addr,
if (!ret)
ret = ftrace_mod_address_lookup(addr, symbolsize,
offset, modname, namebuf);
+
+found:
+ cleanup_symbol_name(namebuf);
return ret;
}

int lookup_symbol_name(unsigned long addr, char *symname)
{
+ int res;
+
symname[0] = '\0';
symname[KSYM_NAME_LEN - 1] = '\0';

@@ -331,15 +362,23 @@ int lookup_symbol_name(unsigned long addr, char *symname)
/* Grab name */
kallsyms_expand_symbol(get_symbol_offset(pos),
symname, KSYM_NAME_LEN);
- return 0;
+ goto found;
}
/* See if it's in a module. */
- return lookup_module_symbol_name(addr, symname);
+ res = lookup_module_symbol_name(addr, symname);
+ if (res)
+ return res;
+
+found:
+ cleanup_symbol_name(symname);
+ return 0;
}

int lookup_symbol_attrs(unsigned long addr, unsigned long *size,
unsigned long *offset, char *modname, char *name)
{
+ int res;
+
name[0] = '\0';
name[KSYM_NAME_LEN - 1] = '\0';

@@ -351,10 +390,16 @@ int lookup_symbol_attrs(unsigned long addr, unsigned long *size,
kallsyms_expand_symbol(get_symbol_offset(pos),
name, KSYM_NAME_LEN);
modname[0] = '\0';
- return 0;
+ goto found;
}
/* See if it's in a module. */
- return lookup_module_symbol_attrs(addr, size, offset, modname, name);
+ res = lookup_module_symbol_attrs(addr, size, offset, modname, name);
+ if (res)
+ return res;
+
+found:
+ cleanup_symbol_name(name);
+ return 0;
}

/* Look up a kernel symbol and return it in a text buffer. */
--
2.31.0.208.g409f899ff0-goog

2021-04-01 23:34:17

by Sami Tolvanen

Subject: [PATCH v5 09/18] treewide: Change list_sort to use const pointers

list_sort() internally casts the comparison function passed to it
to a different type with const struct list_head pointers, and uses
this cast pointer to call the function, which trips indirect call
Control-Flow Integrity (CFI) checking.

Instead of removing the consts, this change defines the
list_cmp_func_t type and changes the comparison function types of
all list_sort() callers to use const pointers, thus avoiding type
mismatches.
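
A self-contained caller after this change looks roughly like the
sketch below (the struct and field names are hypothetical); the
comparison function now matches list_cmp_func_t exactly, so
list_sort() no longer needs an internal cast:

#include <linux/list.h>
#include <linux/list_sort.h>

struct item {
        int key;
        struct list_head node;
};

static LIST_HEAD(item_list);

static int item_cmp(void *priv, const struct list_head *a,
                    const struct list_head *b)
{
        struct item *ia = list_entry(a, struct item, node);
        struct item *ib = list_entry(b, struct item, node);

        return ia->key - ib->key;
}

static void sort_items(void)
{
        list_sort(NULL, &item_list, item_cmp);
}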

Suggested-by: Nick Desaulniers <[email protected]>
Signed-off-by: Sami Tolvanen <[email protected]>
Reviewed-by: Nick Desaulniers <[email protected]>
Tested-by: Nick Desaulniers <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
---
arch/arm64/kvm/vgic/vgic-its.c | 8 ++++----
arch/arm64/kvm/vgic/vgic.c | 3 ++-
block/blk-mq-sched.c | 3 ++-
block/blk-mq.c | 3 ++-
drivers/acpi/nfit/core.c | 3 ++-
drivers/acpi/numa/hmat.c | 3 ++-
drivers/clk/keystone/sci-clk.c | 4 ++--
drivers/gpu/drm/drm_modes.c | 3 ++-
drivers/gpu/drm/i915/gt/intel_engine_user.c | 3 ++-
drivers/gpu/drm/i915/gvt/debugfs.c | 2 +-
drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 3 ++-
drivers/gpu/drm/radeon/radeon_cs.c | 4 ++--
.../hw/usnic/usnic_uiom_interval_tree.c | 3 ++-
drivers/interconnect/qcom/bcm-voter.c | 2 +-
drivers/md/raid5.c | 3 ++-
drivers/misc/sram.c | 4 ++--
drivers/nvme/host/core.c | 3 ++-
.../pci/controller/cadence/pcie-cadence-host.c | 3 ++-
drivers/spi/spi-loopback-test.c | 3 ++-
fs/btrfs/raid56.c | 3 ++-
fs/btrfs/tree-log.c | 3 ++-
fs/btrfs/volumes.c | 3 ++-
fs/ext4/fsmap.c | 4 ++--
fs/gfs2/glock.c | 3 ++-
fs/gfs2/log.c | 2 +-
fs/gfs2/lops.c | 3 ++-
fs/iomap/buffered-io.c | 3 ++-
fs/ubifs/gc.c | 7 ++++---
fs/ubifs/replay.c | 4 ++--
fs/xfs/scrub/bitmap.c | 4 ++--
fs/xfs/xfs_bmap_item.c | 4 ++--
fs/xfs/xfs_buf.c | 6 +++---
fs/xfs/xfs_extent_busy.c | 4 ++--
fs/xfs/xfs_extent_busy.h | 3 ++-
fs/xfs/xfs_extfree_item.c | 4 ++--
fs/xfs/xfs_refcount_item.c | 4 ++--
fs/xfs/xfs_rmap_item.c | 4 ++--
include/linux/list_sort.h | 7 ++++---
lib/list_sort.c | 17 ++++++-----------
lib/test_list_sort.c | 3 ++-
net/tipc/name_table.c | 4 ++--
41 files changed, 90 insertions(+), 72 deletions(-)

diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index 40cbaca81333..b9518f94bd43 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -2190,8 +2190,8 @@ static int vgic_its_restore_ite(struct vgic_its *its, u32 event_id,
return offset;
}

-static int vgic_its_ite_cmp(void *priv, struct list_head *a,
- struct list_head *b)
+static int vgic_its_ite_cmp(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct its_ite *itea = container_of(a, struct its_ite, ite_list);
struct its_ite *iteb = container_of(b, struct its_ite, ite_list);
@@ -2329,8 +2329,8 @@ static int vgic_its_restore_dte(struct vgic_its *its, u32 id,
return offset;
}

-static int vgic_its_device_cmp(void *priv, struct list_head *a,
- struct list_head *b)
+static int vgic_its_device_cmp(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct its_device *deva = container_of(a, struct its_device, dev_list);
struct its_device *devb = container_of(b, struct its_device, dev_list);
diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
index 1c597c9885fa..15b666200f0b 100644
--- a/arch/arm64/kvm/vgic/vgic.c
+++ b/arch/arm64/kvm/vgic/vgic.c
@@ -255,7 +255,8 @@ static struct kvm_vcpu *vgic_target_oracle(struct vgic_irq *irq)
* Return negative if "a" sorts before "b", 0 to preserve order, and positive
* to sort "b" before "a".
*/
-static int vgic_irq_cmp(void *priv, struct list_head *a, struct list_head *b)
+static int vgic_irq_cmp(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct vgic_irq *irqa = container_of(a, struct vgic_irq, ap_list);
struct vgic_irq *irqb = container_of(b, struct vgic_irq, ap_list);
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index e1e997af89a0..3ebd6f10f728 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -75,7 +75,8 @@ void blk_mq_sched_restart(struct blk_mq_hw_ctx *hctx)
blk_mq_run_hw_queue(hctx, true);
}

-static int sched_rq_cmp(void *priv, struct list_head *a, struct list_head *b)
+static int sched_rq_cmp(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct request *rqa = container_of(a, struct request, queuelist);
struct request *rqb = container_of(b, struct request, queuelist);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index d4d7c1caa439..4e3a70ab5be1 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1895,7 +1895,8 @@ void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
spin_unlock(&ctx->lock);
}

-static int plug_rq_cmp(void *priv, struct list_head *a, struct list_head *b)
+static int plug_rq_cmp(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct request *rqa = container_of(a, struct request, queuelist);
struct request *rqb = container_of(b, struct request, queuelist);
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 8c5dde628405..d15e3ee93b5b 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -1195,7 +1195,8 @@ static int __nfit_mem_init(struct acpi_nfit_desc *acpi_desc,
return 0;
}

-static int nfit_mem_cmp(void *priv, struct list_head *_a, struct list_head *_b)
+static int nfit_mem_cmp(void *priv, const struct list_head *_a,
+ const struct list_head *_b)
{
struct nfit_mem *a = container_of(_a, typeof(*a), list);
struct nfit_mem *b = container_of(_b, typeof(*b), list);
diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index cb73a5d6ea76..137a5dd880c2 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -558,7 +558,8 @@ static bool hmat_update_best(u8 type, u32 value, u32 *best)
return updated;
}

-static int initiator_cmp(void *priv, struct list_head *a, struct list_head *b)
+static int initiator_cmp(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct memory_initiator *ia;
struct memory_initiator *ib;
diff --git a/drivers/clk/keystone/sci-clk.c b/drivers/clk/keystone/sci-clk.c
index aaf31abe1c8f..7e1b136e71ae 100644
--- a/drivers/clk/keystone/sci-clk.c
+++ b/drivers/clk/keystone/sci-clk.c
@@ -503,8 +503,8 @@ static int ti_sci_scan_clocks_from_fw(struct sci_clk_provider *provider)

#else

-static int _cmp_sci_clk_list(void *priv, struct list_head *a,
- struct list_head *b)
+static int _cmp_sci_clk_list(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct sci_clk *ca = container_of(a, struct sci_clk, node);
struct sci_clk *cb = container_of(b, struct sci_clk, node);
diff --git a/drivers/gpu/drm/drm_modes.c b/drivers/gpu/drm/drm_modes.c
index 1ac67d4505e0..6662d0457ad6 100644
--- a/drivers/gpu/drm/drm_modes.c
+++ b/drivers/gpu/drm/drm_modes.c
@@ -1290,7 +1290,8 @@ EXPORT_SYMBOL(drm_mode_prune_invalid);
* Negative if @lh_a is better than @lh_b, zero if they're equivalent, or
* positive if @lh_b is better than @lh_a.
*/
-static int drm_mode_compare(void *priv, struct list_head *lh_a, struct list_head *lh_b)
+static int drm_mode_compare(void *priv, const struct list_head *lh_a,
+ const struct list_head *lh_b)
{
struct drm_display_mode *a = list_entry(lh_a, struct drm_display_mode, head);
struct drm_display_mode *b = list_entry(lh_b, struct drm_display_mode, head);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_user.c b/drivers/gpu/drm/i915/gt/intel_engine_user.c
index 34e6096f196e..da21d2a10cc9 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_user.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_user.c
@@ -49,7 +49,8 @@ static const u8 uabi_classes[] = {
[VIDEO_ENHANCEMENT_CLASS] = I915_ENGINE_CLASS_VIDEO_ENHANCE,
};

-static int engine_cmp(void *priv, struct list_head *A, struct list_head *B)
+static int engine_cmp(void *priv, const struct list_head *A,
+ const struct list_head *B)
{
const struct intel_engine_cs *a =
container_of((struct rb_node *)A, typeof(*a), uabi_node);
diff --git a/drivers/gpu/drm/i915/gvt/debugfs.c b/drivers/gpu/drm/i915/gvt/debugfs.c
index 62e6a14ad58e..9f1c209d9251 100644
--- a/drivers/gpu/drm/i915/gvt/debugfs.c
+++ b/drivers/gpu/drm/i915/gvt/debugfs.c
@@ -41,7 +41,7 @@ struct diff_mmio {

/* Compare two diff_mmio items. */
static int mmio_offset_compare(void *priv,
- struct list_head *a, struct list_head *b)
+ const struct list_head *a, const struct list_head *b)
{
struct diff_mmio *ma;
struct diff_mmio *mb;
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index c1adea8765a9..52b9c39e0155 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -1076,7 +1076,8 @@ static int igt_ppgtt_shrink_boom(void *arg)
return exercise_ppgtt(arg, shrink_boom);
}

-static int sort_holes(void *priv, struct list_head *A, struct list_head *B)
+static int sort_holes(void *priv, const struct list_head *A,
+ const struct list_head *B)
{
struct drm_mm_node *a = list_entry(A, typeof(*a), hole_stack);
struct drm_mm_node *b = list_entry(B, typeof(*b), hole_stack);
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
index 35e937d39b51..1a5c3db1d53b 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -393,8 +393,8 @@ int radeon_cs_parser_init(struct radeon_cs_parser *p, void *data)
return 0;
}

-static int cmp_size_smaller_first(void *priv, struct list_head *a,
- struct list_head *b)
+static int cmp_size_smaller_first(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct radeon_bo_list *la = list_entry(a, struct radeon_bo_list, tv.head);
struct radeon_bo_list *lb = list_entry(b, struct radeon_bo_list, tv.head);
diff --git a/drivers/infiniband/hw/usnic/usnic_uiom_interval_tree.c b/drivers/infiniband/hw/usnic/usnic_uiom_interval_tree.c
index d399523206c7..29d71267af78 100644
--- a/drivers/infiniband/hw/usnic/usnic_uiom_interval_tree.c
+++ b/drivers/infiniband/hw/usnic/usnic_uiom_interval_tree.c
@@ -83,7 +83,8 @@ usnic_uiom_interval_node_alloc(long int start, long int last, int ref_cnt,
return interval;
}

-static int interval_cmp(void *priv, struct list_head *a, struct list_head *b)
+static int interval_cmp(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct usnic_uiom_interval_node *node_a, *node_b;

diff --git a/drivers/interconnect/qcom/bcm-voter.c b/drivers/interconnect/qcom/bcm-voter.c
index 1cc565bce2f4..d1591a28b743 100644
--- a/drivers/interconnect/qcom/bcm-voter.c
+++ b/drivers/interconnect/qcom/bcm-voter.c
@@ -39,7 +39,7 @@ struct bcm_voter {
u32 tcs_wait;
};

-static int cmp_vcd(void *priv, struct list_head *a, struct list_head *b)
+static int cmp_vcd(void *priv, const struct list_head *a, const struct list_head *b)
{
const struct qcom_icc_bcm *bcm_a = list_entry(a, struct qcom_icc_bcm, list);
const struct qcom_icc_bcm *bcm_b = list_entry(b, struct qcom_icc_bcm, list);
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 5d57a5bd171f..841e1c1aa5e6 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -953,7 +953,8 @@ static void dispatch_bio_list(struct bio_list *tmp)
submit_bio_noacct(bio);
}

-static int cmp_stripe(void *priv, struct list_head *a, struct list_head *b)
+static int cmp_stripe(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
const struct r5pending_data *da = list_entry(a,
struct r5pending_data, sibling);
diff --git a/drivers/misc/sram.c b/drivers/misc/sram.c
index 6c1a23cb3e8c..202bf951e909 100644
--- a/drivers/misc/sram.c
+++ b/drivers/misc/sram.c
@@ -144,8 +144,8 @@ static void sram_free_partitions(struct sram_dev *sram)
}
}

-static int sram_reserve_cmp(void *priv, struct list_head *a,
- struct list_head *b)
+static int sram_reserve_cmp(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct sram_reserve *ra = list_entry(a, struct sram_reserve, list);
struct sram_reserve *rb = list_entry(b, struct sram_reserve, list);
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 0896e21642be..5eaaa51a5e30 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3855,7 +3855,8 @@ static int nvme_init_ns_head(struct nvme_ns *ns, unsigned nsid,
return ret;
}

-static int ns_cmp(void *priv, struct list_head *a, struct list_head *b)
+static int ns_cmp(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct nvme_ns *nsa = container_of(a, struct nvme_ns, list);
struct nvme_ns *nsb = container_of(b, struct nvme_ns, list);
diff --git a/drivers/pci/controller/cadence/pcie-cadence-host.c b/drivers/pci/controller/cadence/pcie-cadence-host.c
index 73dcf8cf98fb..ae1c55503513 100644
--- a/drivers/pci/controller/cadence/pcie-cadence-host.c
+++ b/drivers/pci/controller/cadence/pcie-cadence-host.c
@@ -345,7 +345,8 @@ static int cdns_pcie_host_bar_config(struct cdns_pcie_rc *rc,
return 0;
}

-static int cdns_pcie_host_dma_ranges_cmp(void *priv, struct list_head *a, struct list_head *b)
+static int cdns_pcie_host_dma_ranges_cmp(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct resource_entry *entry1, *entry2;

diff --git a/drivers/spi/spi-loopback-test.c b/drivers/spi/spi-loopback-test.c
index df981e55c24c..f1cf2232f0b5 100644
--- a/drivers/spi/spi-loopback-test.c
+++ b/drivers/spi/spi-loopback-test.c
@@ -454,7 +454,8 @@ struct rx_ranges {
u8 *end;
};

-static int rx_ranges_cmp(void *priv, struct list_head *a, struct list_head *b)
+static int rx_ranges_cmp(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct rx_ranges *rx_a = list_entry(a, struct rx_ranges, list);
struct rx_ranges *rx_b = list_entry(b, struct rx_ranges, list);
diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index 8c31357f08ed..f4139de63b2e 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -1634,7 +1634,8 @@ struct btrfs_plug_cb {
/*
* rbios on the plug list are sorted for easier merging.
*/
-static int plug_cmp(void *priv, struct list_head *a, struct list_head *b)
+static int plug_cmp(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct btrfs_raid_bio *ra = container_of(a, struct btrfs_raid_bio,
plug_list);
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 92a368627791..00a88bd8105e 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -4136,7 +4136,8 @@ static noinline int copy_items(struct btrfs_trans_handle *trans,
return ret;
}

-static int extent_cmp(void *priv, struct list_head *a, struct list_head *b)
+static int extent_cmp(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct extent_map *em1, *em2;

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 1c6810bbaf8b..912dd8b9f156 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1224,7 +1224,8 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
return 0;
}

-static int devid_cmp(void *priv, struct list_head *a, struct list_head *b)
+static int devid_cmp(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct btrfs_device *dev1, *dev2;

diff --git a/fs/ext4/fsmap.c b/fs/ext4/fsmap.c
index 4c2a9fe30067..4493ef0c715e 100644
--- a/fs/ext4/fsmap.c
+++ b/fs/ext4/fsmap.c
@@ -354,8 +354,8 @@ static unsigned int ext4_getfsmap_find_sb(struct super_block *sb,

/* Compare two fsmap items. */
static int ext4_getfsmap_compare(void *priv,
- struct list_head *a,
- struct list_head *b)
+ const struct list_head *a,
+ const struct list_head *b)
{
struct ext4_fsmap *fa;
struct ext4_fsmap *fb;
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 9567520d79f7..c06a6cdf05de 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -1732,7 +1732,8 @@ void gfs2_glock_complete(struct gfs2_glock *gl, int ret)
spin_unlock(&gl->gl_lockref.lock);
}

-static int glock_cmp(void *priv, struct list_head *a, struct list_head *b)
+static int glock_cmp(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct gfs2_glock *gla, *glb;

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 6410281546f9..88649b43fcff 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -695,7 +695,7 @@ void log_flush_wait(struct gfs2_sbd *sdp)
}
}

-static int ip_cmp(void *priv, struct list_head *a, struct list_head *b)
+static int ip_cmp(void *priv, const struct list_head *a, const struct list_head *b)
{
struct gfs2_inode *ipa, *ipb;

diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index a82f4747aa8d..b4809967efc6 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -634,7 +634,8 @@ static void gfs2_check_magic(struct buffer_head *bh)
kunmap_atomic(kaddr);
}

-static int blocknr_cmp(void *priv, struct list_head *a, struct list_head *b)
+static int blocknr_cmp(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct gfs2_bufdata *bda, *bdb;

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 414769a6ad11..0129e6bab985 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1155,7 +1155,8 @@ iomap_ioend_try_merge(struct iomap_ioend *ioend, struct list_head *more_ioends,
EXPORT_SYMBOL_GPL(iomap_ioend_try_merge);

static int
-iomap_ioend_compare(void *priv, struct list_head *a, struct list_head *b)
+iomap_ioend_compare(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct iomap_ioend *ia = container_of(a, struct iomap_ioend, io_list);
struct iomap_ioend *ib = container_of(b, struct iomap_ioend, io_list);
diff --git a/fs/ubifs/gc.c b/fs/ubifs/gc.c
index a4aaeea63893..dc3e26e9ed7b 100644
--- a/fs/ubifs/gc.c
+++ b/fs/ubifs/gc.c
@@ -102,7 +102,8 @@ static int switch_gc_head(struct ubifs_info *c)
* This function compares data nodes @a and @b. Returns %1 if @a has greater
* inode or block number, and %-1 otherwise.
*/
-static int data_nodes_cmp(void *priv, struct list_head *a, struct list_head *b)
+static int data_nodes_cmp(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
ino_t inuma, inumb;
struct ubifs_info *c = priv;
@@ -145,8 +146,8 @@ static int data_nodes_cmp(void *priv, struct list_head *a, struct list_head *b)
* first and sorted by length in descending order. Directory entry nodes go
* after inode nodes and are sorted in ascending hash valuer order.
*/
-static int nondata_nodes_cmp(void *priv, struct list_head *a,
- struct list_head *b)
+static int nondata_nodes_cmp(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
ino_t inuma, inumb;
struct ubifs_info *c = priv;
diff --git a/fs/ubifs/replay.c b/fs/ubifs/replay.c
index 0f8a6a16421b..4d17e5382b74 100644
--- a/fs/ubifs/replay.c
+++ b/fs/ubifs/replay.c
@@ -298,8 +298,8 @@ static int apply_replay_entry(struct ubifs_info *c, struct replay_entry *r)
* entries @a and @b by comparing their sequence numer. Returns %1 if @a has
* greater sequence number and %-1 otherwise.
*/
-static int replay_entries_cmp(void *priv, struct list_head *a,
- struct list_head *b)
+static int replay_entries_cmp(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct ubifs_info *c = priv;
struct replay_entry *ra, *rb;
diff --git a/fs/xfs/scrub/bitmap.c b/fs/xfs/scrub/bitmap.c
index f88694f22d05..813b5f219113 100644
--- a/fs/xfs/scrub/bitmap.c
+++ b/fs/xfs/scrub/bitmap.c
@@ -63,8 +63,8 @@ xbitmap_init(
static int
xbitmap_range_cmp(
void *priv,
- struct list_head *a,
- struct list_head *b)
+ const struct list_head *a,
+ const struct list_head *b)
{
struct xbitmap_range *ap;
struct xbitmap_range *bp;
diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
index 2344757ede63..e3a691937e92 100644
--- a/fs/xfs/xfs_bmap_item.c
+++ b/fs/xfs/xfs_bmap_item.c
@@ -265,8 +265,8 @@ xfs_trans_log_finish_bmap_update(
static int
xfs_bmap_update_diff_items(
void *priv,
- struct list_head *a,
- struct list_head *b)
+ const struct list_head *a,
+ const struct list_head *b)
{
struct xfs_bmap_intent *ba;
struct xfs_bmap_intent *bb;
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 37a1d12762d8..592800c8852f 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -2124,9 +2124,9 @@ xfs_buf_delwri_queue(
*/
static int
xfs_buf_cmp(
- void *priv,
- struct list_head *a,
- struct list_head *b)
+ void *priv,
+ const struct list_head *a,
+ const struct list_head *b)
{
struct xfs_buf *ap = container_of(a, struct xfs_buf, b_list);
struct xfs_buf *bp = container_of(b, struct xfs_buf, b_list);
diff --git a/fs/xfs/xfs_extent_busy.c b/fs/xfs/xfs_extent_busy.c
index ef17c1f6db32..a4075685d9eb 100644
--- a/fs/xfs/xfs_extent_busy.c
+++ b/fs/xfs/xfs_extent_busy.c
@@ -629,8 +629,8 @@ xfs_extent_busy_wait_all(
int
xfs_extent_busy_ag_cmp(
void *priv,
- struct list_head *l1,
- struct list_head *l2)
+ const struct list_head *l1,
+ const struct list_head *l2)
{
struct xfs_extent_busy *b1 =
container_of(l1, struct xfs_extent_busy, list);
diff --git a/fs/xfs/xfs_extent_busy.h b/fs/xfs/xfs_extent_busy.h
index 990ab3891971..8aea07100092 100644
--- a/fs/xfs/xfs_extent_busy.h
+++ b/fs/xfs/xfs_extent_busy.h
@@ -58,7 +58,8 @@ void
xfs_extent_busy_wait_all(struct xfs_mount *mp);

int
-xfs_extent_busy_ag_cmp(void *priv, struct list_head *a, struct list_head *b);
+xfs_extent_busy_ag_cmp(void *priv, const struct list_head *a,
+ const struct list_head *b);

static inline void xfs_extent_busy_sort(struct list_head *list)
{
diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
index 93223ebb3372..2424230ca2c3 100644
--- a/fs/xfs/xfs_extfree_item.c
+++ b/fs/xfs/xfs_extfree_item.c
@@ -397,8 +397,8 @@ xfs_trans_free_extent(
static int
xfs_extent_free_diff_items(
void *priv,
- struct list_head *a,
- struct list_head *b)
+ const struct list_head *a,
+ const struct list_head *b)
{
struct xfs_mount *mp = priv;
struct xfs_extent_free_item *ra;
diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c
index 07ebccbbf4df..746f4eda724c 100644
--- a/fs/xfs/xfs_refcount_item.c
+++ b/fs/xfs/xfs_refcount_item.c
@@ -269,8 +269,8 @@ xfs_trans_log_finish_refcount_update(
static int
xfs_refcount_update_diff_items(
void *priv,
- struct list_head *a,
- struct list_head *b)
+ const struct list_head *a,
+ const struct list_head *b)
{
struct xfs_mount *mp = priv;
struct xfs_refcount_intent *ra;
diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c
index 49cebd68b672..dc4f0c9f0897 100644
--- a/fs/xfs/xfs_rmap_item.c
+++ b/fs/xfs/xfs_rmap_item.c
@@ -337,8 +337,8 @@ xfs_trans_log_finish_rmap_update(
static int
xfs_rmap_update_diff_items(
void *priv,
- struct list_head *a,
- struct list_head *b)
+ const struct list_head *a,
+ const struct list_head *b)
{
struct xfs_mount *mp = priv;
struct xfs_rmap_intent *ra;
diff --git a/include/linux/list_sort.h b/include/linux/list_sort.h
index 20f178c24e9d..453105f74e05 100644
--- a/include/linux/list_sort.h
+++ b/include/linux/list_sort.h
@@ -6,8 +6,9 @@

struct list_head;

+typedef int __attribute__((nonnull(2,3))) (*list_cmp_func_t)(void *,
+ const struct list_head *, const struct list_head *);
+
__attribute__((nonnull(2,3)))
-void list_sort(void *priv, struct list_head *head,
- int (*cmp)(void *priv, struct list_head *a,
- struct list_head *b));
+void list_sort(void *priv, struct list_head *head, list_cmp_func_t cmp);
#endif
diff --git a/lib/list_sort.c b/lib/list_sort.c
index 52f0c258c895..a926d96ffd44 100644
--- a/lib/list_sort.c
+++ b/lib/list_sort.c
@@ -7,16 +7,13 @@
#include <linux/list_sort.h>
#include <linux/list.h>

-typedef int __attribute__((nonnull(2,3))) (*cmp_func)(void *,
- struct list_head const *, struct list_head const *);
-
/*
* Returns a list organized in an intermediate format suited
* to chaining of merge() calls: null-terminated, no reserved or
* sentinel head node, "prev" links not maintained.
*/
__attribute__((nonnull(2,3,4)))
-static struct list_head *merge(void *priv, cmp_func cmp,
+static struct list_head *merge(void *priv, list_cmp_func_t cmp,
struct list_head *a, struct list_head *b)
{
struct list_head *head, **tail = &head;
@@ -52,7 +49,7 @@ static struct list_head *merge(void *priv, cmp_func cmp,
* throughout.
*/
__attribute__((nonnull(2,3,4,5)))
-static void merge_final(void *priv, cmp_func cmp, struct list_head *head,
+static void merge_final(void *priv, list_cmp_func_t cmp, struct list_head *head,
struct list_head *a, struct list_head *b)
{
struct list_head *tail = head;
@@ -185,9 +182,7 @@ static void merge_final(void *priv, cmp_func cmp, struct list_head *head,
* 2^(k+1) - 1 (second merge of case 5 when x == 2^(k-1) - 1).
*/
__attribute__((nonnull(2,3)))
-void list_sort(void *priv, struct list_head *head,
- int (*cmp)(void *priv, struct list_head *a,
- struct list_head *b))
+void list_sort(void *priv, struct list_head *head, list_cmp_func_t cmp)
{
struct list_head *list = head->next, *pending = NULL;
size_t count = 0; /* Count of pending */
@@ -227,7 +222,7 @@ void list_sort(void *priv, struct list_head *head,
if (likely(bits)) {
struct list_head *a = *tail, *b = a->prev;

- a = merge(priv, (cmp_func)cmp, b, a);
+ a = merge(priv, cmp, b, a);
/* Install the merged result in place of the inputs */
a->prev = b->prev;
*tail = a;
@@ -249,10 +244,10 @@ void list_sort(void *priv, struct list_head *head,

if (!next)
break;
- list = merge(priv, (cmp_func)cmp, pending, list);
+ list = merge(priv, cmp, pending, list);
pending = next;
}
/* The final merge, rebuilding prev links */
- merge_final(priv, (cmp_func)cmp, head, pending, list);
+ merge_final(priv, cmp, head, pending, list);
}
EXPORT_SYMBOL(list_sort);
diff --git a/lib/test_list_sort.c b/lib/test_list_sort.c
index 1f017d3b610e..00daaf23316f 100644
--- a/lib/test_list_sort.c
+++ b/lib/test_list_sort.c
@@ -56,7 +56,8 @@ static int __init check(struct debug_el *ela, struct debug_el *elb)
return 0;
}

-static int __init cmp(void *priv, struct list_head *a, struct list_head *b)
+static int __init cmp(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct debug_el *ela, *elb;

diff --git a/net/tipc/name_table.c b/net/tipc/name_table.c
index ee5ac40ea2b6..f8141443f2e2 100644
--- a/net/tipc/name_table.c
+++ b/net/tipc/name_table.c
@@ -397,8 +397,8 @@ static struct publication *tipc_service_remove_publ(struct service_range *sr,
* Code reused: time_after32() for the same purpose
*/
#define publication_after(pa, pb) time_after32((pa)->id, (pb)->id)
-static int tipc_publ_sort(void *priv, struct list_head *a,
- struct list_head *b)
+static int tipc_publ_sort(void *priv, const struct list_head *a,
+ const struct list_head *b)
{
struct publication *pa, *pb;

--
2.31.0.208.g409f899ff0-goog

2021-04-01 23:34:24

by Sami Tolvanen

Subject: [PATCH v5 10/18] lkdtm: use function_nocfi

To ensure we take the actual address of a function in kernel text,
use function_nocfi. Otherwise, with CONFIG_CFI_CLANG, the compiler
replaces the address with a pointer to the CFI jump table, which is
actually in the module when compiled with CONFIG_LKDTM=m.

Signed-off-by: Sami Tolvanen <[email protected]>
Acked-by: Kees Cook <[email protected]>
---
drivers/misc/lkdtm/usercopy.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/lkdtm/usercopy.c b/drivers/misc/lkdtm/usercopy.c
index 109e8d4302c1..15d220ef35a5 100644
--- a/drivers/misc/lkdtm/usercopy.c
+++ b/drivers/misc/lkdtm/usercopy.c
@@ -314,7 +314,7 @@ void lkdtm_USERCOPY_KERNEL(void)

pr_info("attempting bad copy_to_user from kernel text: %px\n",
vm_mmap);
- if (copy_to_user((void __user *)user_addr, vm_mmap,
+ if (copy_to_user((void __user *)user_addr, function_nocfi(vm_mmap),
unconst + PAGE_SIZE)) {
pr_warn("copy_to_user failed, but lacked Oops\n");
goto free_user;
--
2.31.0.208.g409f899ff0-goog

2021-04-01 23:34:27

by Sami Tolvanen

Subject: [PATCH v5 08/18] bpf: disable CFI in dispatcher functions

BPF dispatcher functions are patched at runtime to perform direct
instead of indirect calls. Disable CFI for the dispatcher functions to
avoid conflicts.

Signed-off-by: Sami Tolvanen <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
---
include/linux/bpf.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 3625f019767d..2f46f98479e1 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -650,7 +650,7 @@ struct bpf_dispatcher {
struct bpf_ksym ksym;
};

-static __always_inline unsigned int bpf_dispatcher_nop_func(
+static __always_inline __nocfi unsigned int bpf_dispatcher_nop_func(
const void *ctx,
const struct bpf_insn *insnsi,
unsigned int (*bpf_func)(const void *,
@@ -678,7 +678,7 @@ void bpf_trampoline_put(struct bpf_trampoline *tr);
}

#define DEFINE_BPF_DISPATCHER(name) \
- noinline unsigned int bpf_dispatcher_##name##_func( \
+ noinline __nocfi unsigned int bpf_dispatcher_##name##_func( \
const void *ctx, \
const struct bpf_insn *insnsi, \
unsigned int (*bpf_func)(const void *, \
--
2.31.0.208.g409f899ff0-goog

2021-04-01 23:34:32

by Sami Tolvanen

Subject: [PATCH v5 12/18] arm64: implement function_nocfi

With CONFIG_CFI_CLANG, the compiler replaces function addresses in
instrumented C code with jump table addresses. This change implements
the function_nocfi() macro, which returns the actual function address
instead.

Signed-off-by: Sami Tolvanen <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
---
arch/arm64/include/asm/memory.h | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 0aabc3be9a75..b55410afd3d1 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -321,6 +321,21 @@ static inline void *phys_to_virt(phys_addr_t x)
#define virt_to_pfn(x) __phys_to_pfn(__virt_to_phys((unsigned long)(x)))
#define sym_to_pfn(x) __phys_to_pfn(__pa_symbol(x))

+#ifdef CONFIG_CFI_CLANG
+/*
+ * With CONFIG_CFI_CLANG, the compiler replaces function address
+ * references with the address of the function's CFI jump table
+ * entry. The function_nocfi macro always returns the address of the
+ * actual function instead.
+ */
+#define function_nocfi(x) ({ \
+ void *addr; \
+ asm("adrp %0, " __stringify(x) "\n\t" \
+ "add %0, %0, :lo12:" __stringify(x) : "=r" (addr)); \
+ addr; \
+})
+#endif
+
/*
* virt_to_page(x) convert a _valid_ virtual address to struct page *
* virt_addr_valid(x) indicates whether a virtual address is valid
--
2.31.0.208.g409f899ff0-goog

2021-04-01 23:34:45

by Sami Tolvanen

Subject: [PATCH v5 14/18] arm64: add __nocfi to functions that jump to a physical address

Disable CFI checking for functions that switch to linear mapping and
make an indirect call to a physical address, since the compiler only
understands virtual addresses and the CFI check for such indirect calls
would always fail.
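
As a sketch of the pattern being annotated (the function name is
hypothetical, but the shape matches cpu_soft_restart and
kpti_install_ng_mappings below): the function pointer holds a physical
address, so a CFI check on the indirect call could never match a known
jump table entry.

static void __nocfi jump_to_phys(phys_addr_t entry)
{
        void (*fn)(void) = (void *)entry;

        cpu_install_idmap();    /* install the identity mapping */
        fn();                   /* indirect call to a physical address */
}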

Signed-off-by: Sami Tolvanen <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
---
arch/arm64/include/asm/mmu_context.h | 2 +-
arch/arm64/kernel/cpu-reset.h | 8 ++++----
arch/arm64/kernel/cpufeature.c | 2 +-
3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 386b96400a57..d3cef9133539 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -119,7 +119,7 @@ static inline void cpu_install_idmap(void)
* Atomically replaces the active TTBR1_EL1 PGD with a new VA-compatible PGD,
* avoiding the possibility of conflicting TLB entries being allocated.
*/
-static inline void cpu_replace_ttbr1(pgd_t *pgdp)
+static inline void __nocfi cpu_replace_ttbr1(pgd_t *pgdp)
{
typedef void (ttbr_replace_func)(phys_addr_t);
extern ttbr_replace_func idmap_cpu_replace_ttbr1;
diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
index f3adc574f969..9a7b1262ef17 100644
--- a/arch/arm64/kernel/cpu-reset.h
+++ b/arch/arm64/kernel/cpu-reset.h
@@ -13,10 +13,10 @@
void __cpu_soft_restart(unsigned long el2_switch, unsigned long entry,
unsigned long arg0, unsigned long arg1, unsigned long arg2);

-static inline void __noreturn cpu_soft_restart(unsigned long entry,
- unsigned long arg0,
- unsigned long arg1,
- unsigned long arg2)
+static inline void __noreturn __nocfi cpu_soft_restart(unsigned long entry,
+ unsigned long arg0,
+ unsigned long arg1,
+ unsigned long arg2)
{
typeof(__cpu_soft_restart) *restart;

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 0b2e0d7b13ec..c2f94a5206e0 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1445,7 +1445,7 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
}

#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
-static void
+static void __nocfi
kpti_install_ng_mappings(const struct arm64_cpu_capabilities *__unused)
{
typedef void (kpti_remap_fn)(int, int, phys_addr_t);
--
2.31.0.208.g409f899ff0-goog

2021-04-01 23:34:47

by Sami Tolvanen

Subject: [PATCH v5 11/18] psci: use function_nocfi for cpu_resume

With CONFIG_CFI_CLANG, the compiler replaces function pointers with
jump table addresses, which results in __pa_symbol returning the
physical address of the jump table entry. As the jump table contains
an immediate jump to an EL1 virtual address, this typically won't
work as intended. Use function_nocfi to get the actual address of
cpu_resume.

Signed-off-by: Sami Tolvanen <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
---
drivers/firmware/psci/psci.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/psci/psci.c b/drivers/firmware/psci/psci.c
index f5fc429cae3f..64344e84bd63 100644
--- a/drivers/firmware/psci/psci.c
+++ b/drivers/firmware/psci/psci.c
@@ -325,8 +325,9 @@ static int __init psci_features(u32 psci_func_id)
static int psci_suspend_finisher(unsigned long state)
{
u32 power_state = state;
+ phys_addr_t pa_cpu_resume = __pa_symbol(function_nocfi(cpu_resume));

- return psci_ops.cpu_suspend(power_state, __pa_symbol(cpu_resume));
+ return psci_ops.cpu_suspend(power_state, pa_cpu_resume);
}

int psci_cpu_suspend_enter(u32 state)
@@ -344,8 +345,10 @@ int psci_cpu_suspend_enter(u32 state)

static int psci_system_suspend(unsigned long unused)
{
+ phys_addr_t pa_cpu_resume = __pa_symbol(function_nocfi(cpu_resume));
+
return invoke_psci_fn(PSCI_FN_NATIVE(1_0, SYSTEM_SUSPEND),
- __pa_symbol(cpu_resume), 0, 0);
+ pa_cpu_resume, 0, 0);
}

static int psci_system_suspend_enter(suspend_state_t state)
--
2.31.0.208.g409f899ff0-goog

2021-04-01 23:34:59

by Sami Tolvanen

Subject: [PATCH v5 16/18] arm64: ftrace: use function_nocfi for ftrace_call

With CONFIG_CFI_CLANG, the compiler replaces function pointers with
jump table addresses, which breaks dynamic ftrace as the address of
ftrace_call is replaced with the address of ftrace_call.cfi_jt. Use
function_nocfi() to get the address of the actual function instead.

Suggested-by: Ben Dai <[email protected]>
Signed-off-by: Sami Tolvanen <[email protected]>
---
arch/arm64/kernel/ftrace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index 86a5cf9bc19a..b5d3ddaf69d9 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -55,7 +55,7 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
unsigned long pc;
u32 new;

- pc = (unsigned long)&ftrace_call;
+ pc = (unsigned long)function_nocfi(ftrace_call);
new = aarch64_insn_gen_branch_imm(pc, (unsigned long)func,
AARCH64_INSN_BRANCH_LINK);

--
2.31.0.208.g409f899ff0-goog

2021-04-01 23:35:19

by Sami Tolvanen

Subject: [PATCH v5 18/18] arm64: allow CONFIG_CFI_CLANG to be selected

Select ARCH_SUPPORTS_CFI_CLANG to allow CFI to be enabled.

Signed-off-by: Sami Tolvanen <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
---
arch/arm64/Kconfig | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index e4e1b6550115..d7395772b6b8 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -75,6 +75,7 @@ config ARM64
select ARCH_SUPPORTS_SHADOW_CALL_STACK if CC_HAVE_SHADOW_CALL_STACK
select ARCH_SUPPORTS_LTO_CLANG if CPU_LITTLE_ENDIAN
select ARCH_SUPPORTS_LTO_CLANG_THIN
+ select ARCH_SUPPORTS_CFI_CLANG
select ARCH_SUPPORTS_ATOMIC_RMW
select ARCH_SUPPORTS_INT128 if CC_HAS_INT128 && (GCC_VERSION >= 50000 || CC_IS_CLANG)
select ARCH_SUPPORTS_NUMA_BALANCING
--
2.31.0.208.g409f899ff0-goog

2021-04-01 23:35:24

by Sami Tolvanen

Subject: [PATCH v5 17/18] KVM: arm64: Disable CFI for nVHE

Disable CFI for the nVHE code to avoid address space confusion.

Signed-off-by: Sami Tolvanen <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
---
arch/arm64/kvm/hyp/nvhe/Makefile | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index a6707df4f6c0..fb24a0f022ad 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -75,9 +75,9 @@ quiet_cmd_hyprel = HYPREL $@
quiet_cmd_hypcopy = HYPCOPY $@
cmd_hypcopy = $(OBJCOPY) --prefix-symbols=__kvm_nvhe_ $< $@

-# Remove ftrace and Shadow Call Stack CFLAGS.
-# This is equivalent to the 'notrace' and '__noscs' annotations.
-KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS), $(KBUILD_CFLAGS))
+# Remove ftrace, Shadow Call Stack, and CFI CFLAGS.
+# This is equivalent to the 'notrace', '__noscs', and '__nocfi' annotations.
+KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS) $(CC_FLAGS_CFI), $(KBUILD_CFLAGS))

# KVM nVHE code is run at a different exception code with a different map, so
# compiler instrumentation that inserts callbacks or checks into the code may
--
2.31.0.208.g409f899ff0-goog

2021-04-01 23:35:38

by Sami Tolvanen

Subject: [PATCH v5 13/18] arm64: use function_nocfi with __pa_symbol

With CONFIG_CFI_CLANG, the compiler replaces function address
references with the address of the function's CFI jump table
entry. This means that __pa_symbol(function) returns the physical
address of the jump table entry, which can lead to address space
confusion as the jump table points to the function's virtual
address. Therefore, use the function_nocfi() macro to ensure we are
always taking the address of the actual function instead.

Signed-off-by: Sami Tolvanen <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
---
arch/arm64/include/asm/mmu_context.h | 2 +-
arch/arm64/kernel/acpi_parking_protocol.c | 3 ++-
arch/arm64/kernel/cpu-reset.h | 2 +-
arch/arm64/kernel/cpufeature.c | 2 +-
arch/arm64/kernel/psci.c | 3 ++-
arch/arm64/kernel/smp_spin_table.c | 3 ++-
6 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index bd02e99b1a4c..386b96400a57 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -140,7 +140,7 @@ static inline void cpu_replace_ttbr1(pgd_t *pgdp)
ttbr1 |= TTBR_CNP_BIT;
}

- replace_phys = (void *)__pa_symbol(idmap_cpu_replace_ttbr1);
+ replace_phys = (void *)__pa_symbol(function_nocfi(idmap_cpu_replace_ttbr1));

cpu_install_idmap();
replace_phys(ttbr1);
diff --git a/arch/arm64/kernel/acpi_parking_protocol.c b/arch/arm64/kernel/acpi_parking_protocol.c
index e7c941d8340d..bfeeb5319abf 100644
--- a/arch/arm64/kernel/acpi_parking_protocol.c
+++ b/arch/arm64/kernel/acpi_parking_protocol.c
@@ -99,7 +99,8 @@ static int acpi_parking_protocol_cpu_boot(unsigned int cpu)
* that read this address need to convert this address to the
* Boot-Loader's endianness before jumping.
*/
- writeq_relaxed(__pa_symbol(secondary_entry), &mailbox->entry_point);
+ writeq_relaxed(__pa_symbol(function_nocfi(secondary_entry)),
+ &mailbox->entry_point);
writel_relaxed(cpu_entry->gic_cpu_id, &mailbox->cpu_id);

arch_send_wakeup_ipi_mask(cpumask_of(cpu));
diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
index ed50e9587ad8..f3adc574f969 100644
--- a/arch/arm64/kernel/cpu-reset.h
+++ b/arch/arm64/kernel/cpu-reset.h
@@ -22,7 +22,7 @@ static inline void __noreturn cpu_soft_restart(unsigned long entry,

unsigned long el2_switch = !is_kernel_in_hyp_mode() &&
is_hyp_mode_available();
- restart = (void *)__pa_symbol(__cpu_soft_restart);
+ restart = (void *)__pa_symbol(function_nocfi(__cpu_soft_restart));

cpu_install_idmap();
restart(el2_switch, entry, arg0, arg1, arg2);
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index e5281e1c8f1d..0b2e0d7b13ec 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1462,7 +1462,7 @@ kpti_install_ng_mappings(const struct arm64_cpu_capabilities *__unused)
if (arm64_use_ng_mappings)
return;

- remap_fn = (void *)__pa_symbol(idmap_kpti_install_ng_mappings);
+ remap_fn = (void *)__pa_symbol(function_nocfi(idmap_kpti_install_ng_mappings));

cpu_install_idmap();
remap_fn(cpu, num_online_cpus(), __pa_symbol(swapper_pg_dir));
diff --git a/arch/arm64/kernel/psci.c b/arch/arm64/kernel/psci.c
index 62d2bda7adb8..e74bcb57559b 100644
--- a/arch/arm64/kernel/psci.c
+++ b/arch/arm64/kernel/psci.c
@@ -38,7 +38,8 @@ static int __init cpu_psci_cpu_prepare(unsigned int cpu)

static int cpu_psci_cpu_boot(unsigned int cpu)
{
- int err = psci_ops.cpu_on(cpu_logical_map(cpu), __pa_symbol(secondary_entry));
+ int err = psci_ops.cpu_on(cpu_logical_map(cpu),
+ __pa_symbol(function_nocfi(secondary_entry)));
if (err)
pr_err("failed to boot CPU%d (%d)\n", cpu, err);

diff --git a/arch/arm64/kernel/smp_spin_table.c b/arch/arm64/kernel/smp_spin_table.c
index 056772c26098..4c4e36ded4aa 100644
--- a/arch/arm64/kernel/smp_spin_table.c
+++ b/arch/arm64/kernel/smp_spin_table.c
@@ -88,7 +88,8 @@ static int smp_spin_table_cpu_prepare(unsigned int cpu)
* boot-loader's endianness before jumping. This is mandated by
* the boot protocol.
*/
- writeq_relaxed(__pa_symbol(secondary_holding_pen), release_addr);
+ writeq_relaxed(__pa_symbol(function_nocfi(secondary_holding_pen)),
+ release_addr);
__flush_dcache_area((__force void *)release_addr,
sizeof(*release_addr));

--
2.31.0.208.g409f899ff0-goog

2021-04-01 23:35:58

by Sami Tolvanen

[permalink] [raw]
Subject: [PATCH v5 15/18] arm64: add __nocfi to __apply_alternatives

__apply_alternatives makes indirect calls to functions whose address
is taken in assembly code using the alternative_cb macro. With
non-canonical CFI, the compiler won't replace these function
references with the jump table addresses, which trips CFI. Disable CFI
checking in the function to work around the issue.

Signed-off-by: Sami Tolvanen <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
---
arch/arm64/kernel/alternative.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
index 1184c44ea2c7..abc84636af07 100644
--- a/arch/arm64/kernel/alternative.c
+++ b/arch/arm64/kernel/alternative.c
@@ -133,8 +133,8 @@ static void clean_dcache_range_nopatch(u64 start, u64 end)
} while (cur += d_size, cur < end);
}

-static void __apply_alternatives(void *alt_region, bool is_module,
- unsigned long *feature_mask)
+static void __nocfi __apply_alternatives(void *alt_region, bool is_module,
+ unsigned long *feature_mask)
{
struct alt_instr *alt;
struct alt_region *region = alt_region;
--
2.31.0.208.g409f899ff0-goog

2021-04-02 06:39:23

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v5 03/18] mm: add generic function_nocfi macro

Thanks, this looks much better than the earlier naming:

Acked-by: Christoph Hellwig <[email protected]>

2021-04-02 20:01:26

by Nathan Chancellor

[permalink] [raw]
Subject: Re: [PATCH v5 00/18] Add support for Clang CFI

On Thu, Apr 01, 2021 at 04:31:58PM -0700, Sami Tolvanen wrote:
> This series adds support for Clang's Control-Flow Integrity (CFI)
> checking. With CFI, the compiler injects a runtime check before each
> indirect function call to ensure the target is a valid function with
> the correct static type. This restricts possible call targets and
> makes it more difficult for an attacker to exploit bugs that allow the
> modification of stored function pointers. For more details, see:
>
> https://clang.llvm.org/docs/ControlFlowIntegrity.html
>
> The first patch contains build system changes and error handling,
> and implements support for cross-module indirect call checking. The
> remaining patches address issues caused by the compiler
> instrumentation. These include fixing known type mismatches, as well
> as issues with address space confusion and cross-module function
> address equality.
>
> These patches add support only for arm64, but I'll post patches also
> for x86_64 after we address the remaining issues there, including
> objtool support.
>
> You can also pull this series from
>
> https://github.com/samitolvanen/linux.git cfi-v5
>
> ---
> Changes in v5:
> - Changed module.lds.S to only include <asm/page.h> when CFI is
> enabled to fix the MIPS build.
> - Added a patch that fixes dynamic ftrace with CFI on arm64.
>
> Changes in v4:
> - Per Mark's suggestion, dropped __pa_function() and renamed
> __va_function() to function_nocfi().
> - Added a comment to function_nocfi() to explain what it does.
> - Updated the psci patch to use an intermediate variable for
> the physical address for clarity.
>
> Changes in v3:
> - Added a patch to change list_sort() callers treewide to use
> const pointers instead of simply removing the internal casts.
> - Changed cleanup_symbol_name() to return bool.
> - Changed module.lds.S to drop the .eh_frame section only with
> CONFIG_CFI_CLANG.
> - Switched to synchronize_rcu() in update_shadow().
>
> Changes in v2:
> - Fixed .text merging in module.lds.S.
> - Added WARN_ON_FUNCTION_MISMATCH() and changed kernel/thread.c
> and kernel/workqueue.c to use the macro instead.
>
>
> Sami Tolvanen (18):
> add support for Clang CFI
> cfi: add __cficanonical
> mm: add generic function_nocfi macro
> module: ensure __cfi_check alignment
> workqueue: use WARN_ON_FUNCTION_MISMATCH
> kthread: use WARN_ON_FUNCTION_MISMATCH
> kallsyms: strip ThinLTO hashes from static functions
> bpf: disable CFI in dispatcher functions
> treewide: Change list_sort to use const pointers
> lkdtm: use function_nocfi
> psci: use function_nocfi for cpu_resume
> arm64: implement function_nocfi
> arm64: use function_nocfi with __pa_symbol
> arm64: add __nocfi to functions that jump to a physical address
> arm64: add __nocfi to __apply_alternatives
> arm64: ftrace: use function_nocfi for ftrace_call
> KVM: arm64: Disable CFI for nVHE
> arm64: allow CONFIG_CFI_CLANG to be selected
>
> Makefile | 17 +
> arch/Kconfig | 45 +++
> arch/arm64/Kconfig | 1 +
> arch/arm64/include/asm/memory.h | 15 +
> arch/arm64/include/asm/mmu_context.h | 4 +-
> arch/arm64/kernel/acpi_parking_protocol.c | 3 +-
> arch/arm64/kernel/alternative.c | 4 +-
> arch/arm64/kernel/cpu-reset.h | 10 +-
> arch/arm64/kernel/cpufeature.c | 4 +-
> arch/arm64/kernel/ftrace.c | 2 +-
> arch/arm64/kernel/psci.c | 3 +-
> arch/arm64/kernel/smp_spin_table.c | 3 +-
> arch/arm64/kvm/hyp/nvhe/Makefile | 6 +-
> arch/arm64/kvm/vgic/vgic-its.c | 8 +-
> arch/arm64/kvm/vgic/vgic.c | 3 +-
> block/blk-mq-sched.c | 3 +-
> block/blk-mq.c | 3 +-
> drivers/acpi/nfit/core.c | 3 +-
> drivers/acpi/numa/hmat.c | 3 +-
> drivers/clk/keystone/sci-clk.c | 4 +-
> drivers/firmware/psci/psci.c | 7 +-
> drivers/gpu/drm/drm_modes.c | 3 +-
> drivers/gpu/drm/i915/gt/intel_engine_user.c | 3 +-
> drivers/gpu/drm/i915/gvt/debugfs.c | 2 +-
> drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 3 +-
> drivers/gpu/drm/radeon/radeon_cs.c | 4 +-
> .../hw/usnic/usnic_uiom_interval_tree.c | 3 +-
> drivers/interconnect/qcom/bcm-voter.c | 2 +-
> drivers/md/raid5.c | 3 +-
> drivers/misc/lkdtm/usercopy.c | 2 +-
> drivers/misc/sram.c | 4 +-
> drivers/nvme/host/core.c | 3 +-
> .../controller/cadence/pcie-cadence-host.c | 3 +-
> drivers/spi/spi-loopback-test.c | 3 +-
> fs/btrfs/raid56.c | 3 +-
> fs/btrfs/tree-log.c | 3 +-
> fs/btrfs/volumes.c | 3 +-
> fs/ext4/fsmap.c | 4 +-
> fs/gfs2/glock.c | 3 +-
> fs/gfs2/log.c | 2 +-
> fs/gfs2/lops.c | 3 +-
> fs/iomap/buffered-io.c | 3 +-
> fs/ubifs/gc.c | 7 +-
> fs/ubifs/replay.c | 4 +-
> fs/xfs/scrub/bitmap.c | 4 +-
> fs/xfs/xfs_bmap_item.c | 4 +-
> fs/xfs/xfs_buf.c | 6 +-
> fs/xfs/xfs_extent_busy.c | 4 +-
> fs/xfs/xfs_extent_busy.h | 3 +-
> fs/xfs/xfs_extfree_item.c | 4 +-
> fs/xfs/xfs_refcount_item.c | 4 +-
> fs/xfs/xfs_rmap_item.c | 4 +-
> include/asm-generic/bug.h | 16 +
> include/asm-generic/vmlinux.lds.h | 20 +-
> include/linux/bpf.h | 4 +-
> include/linux/cfi.h | 41 +++
> include/linux/compiler-clang.h | 3 +
> include/linux/compiler_types.h | 8 +
> include/linux/init.h | 6 +-
> include/linux/list_sort.h | 7 +-
> include/linux/mm.h | 10 +
> include/linux/module.h | 13 +-
> include/linux/pci.h | 4 +-
> init/Kconfig | 2 +-
> kernel/Makefile | 4 +
> kernel/cfi.c | 329 ++++++++++++++++++
> kernel/kallsyms.c | 55 ++-
> kernel/kthread.c | 3 +-
> kernel/module.c | 43 +++
> kernel/workqueue.c | 2 +-
> lib/list_sort.c | 17 +-
> lib/test_list_sort.c | 3 +-
> net/tipc/name_table.c | 4 +-
> scripts/Makefile.modfinal | 2 +-
> scripts/module.lds.S | 19 +-
> 75 files changed, 759 insertions(+), 113 deletions(-)
> create mode 100644 include/linux/cfi.h
> create mode 100644 kernel/cfi.c
>
>
> base-commit: 6905b1dc3c32a094f0da61bd656a740f0a97d592
> --
> 2.31.0.208.g409f899ff0-goog
>


Hi Sami,

I booted this series on Equinix's c1.large.arm (2x Cavium ThunderX
CN8890) and c2.large.arm (1x Ampere eMAG 8180) servers [1] and my
Raspberry Pi 4B. I ran them through LTP's read_all test case on both
/proc and /sys and a few compile workloads, only uncovering one issue [2].

Consider this series:

Tested-by: Nathan Chancellor <[email protected]>

[1]: https://metal.equinix.com/developers/docs/servers/
[2]: https://lore.kernel.org/r/20210402195241.gahc5w25gezluw7p@archlinux-ax161/

Cheers,
Nathan

2021-04-06 20:52:49

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH v5 13/18] arm64: use function_nocfi with __pa_symbol

On Thu, Apr 01, 2021 at 04:32:11PM -0700, Sami Tolvanen wrote:
> With CONFIG_CFI_CLANG, the compiler replaces function address
> references with the address of the function's CFI jump table
> entry. This means that __pa_symbol(function) returns the physical
> address of the jump table entry, which can lead to address space
> confusion as the jump table points to the function's virtual
> address. Therefore, use the function_nocfi() macro to ensure we are
> always taking the address of the actual function instead.
>
> Signed-off-by: Sami Tolvanen <[email protected]>
> Reviewed-by: Kees Cook <[email protected]>
> ---
> arch/arm64/include/asm/mmu_context.h | 2 +-
> arch/arm64/kernel/acpi_parking_protocol.c | 3 ++-
> arch/arm64/kernel/cpu-reset.h | 2 +-
> arch/arm64/kernel/cpufeature.c | 2 +-
> arch/arm64/kernel/psci.c | 3 ++-
> arch/arm64/kernel/smp_spin_table.c | 3 ++-
> 6 files changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
> index bd02e99b1a4c..386b96400a57 100644
> --- a/arch/arm64/include/asm/mmu_context.h
> +++ b/arch/arm64/include/asm/mmu_context.h
> @@ -140,7 +140,7 @@ static inline void cpu_replace_ttbr1(pgd_t *pgdp)
> ttbr1 |= TTBR_CNP_BIT;
> }
>
> - replace_phys = (void *)__pa_symbol(idmap_cpu_replace_ttbr1);
> + replace_phys = (void *)__pa_symbol(function_nocfi(idmap_cpu_replace_ttbr1));
>
> cpu_install_idmap();
> replace_phys(ttbr1);
> diff --git a/arch/arm64/kernel/acpi_parking_protocol.c b/arch/arm64/kernel/acpi_parking_protocol.c
> index e7c941d8340d..bfeeb5319abf 100644
> --- a/arch/arm64/kernel/acpi_parking_protocol.c
> +++ b/arch/arm64/kernel/acpi_parking_protocol.c
> @@ -99,7 +99,8 @@ static int acpi_parking_protocol_cpu_boot(unsigned int cpu)
> * that read this address need to convert this address to the
> * Boot-Loader's endianness before jumping.
> */
> - writeq_relaxed(__pa_symbol(secondary_entry), &mailbox->entry_point);
> + writeq_relaxed(__pa_symbol(function_nocfi(secondary_entry)),
> + &mailbox->entry_point);
> writel_relaxed(cpu_entry->gic_cpu_id, &mailbox->cpu_id);
>
> arch_send_wakeup_ipi_mask(cpumask_of(cpu));
> diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
> index ed50e9587ad8..f3adc574f969 100644
> --- a/arch/arm64/kernel/cpu-reset.h
> +++ b/arch/arm64/kernel/cpu-reset.h
> @@ -22,7 +22,7 @@ static inline void __noreturn cpu_soft_restart(unsigned long entry,
>
> unsigned long el2_switch = !is_kernel_in_hyp_mode() &&
> is_hyp_mode_available();
> - restart = (void *)__pa_symbol(__cpu_soft_restart);
> + restart = (void *)__pa_symbol(function_nocfi(__cpu_soft_restart));
>
> cpu_install_idmap();
> restart(el2_switch, entry, arg0, arg1, arg2);
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index e5281e1c8f1d..0b2e0d7b13ec 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1462,7 +1462,7 @@ kpti_install_ng_mappings(const struct arm64_cpu_capabilities *__unused)
> if (arm64_use_ng_mappings)
> return;
>
> - remap_fn = (void *)__pa_symbol(idmap_kpti_install_ng_mappings);
> + remap_fn = (void *)__pa_symbol(function_nocfi(idmap_kpti_install_ng_mappings));
>
> cpu_install_idmap();
> remap_fn(cpu, num_online_cpus(), __pa_symbol(swapper_pg_dir));
> diff --git a/arch/arm64/kernel/psci.c b/arch/arm64/kernel/psci.c
> index 62d2bda7adb8..e74bcb57559b 100644
> --- a/arch/arm64/kernel/psci.c
> +++ b/arch/arm64/kernel/psci.c
> @@ -38,7 +38,8 @@ static int __init cpu_psci_cpu_prepare(unsigned int cpu)
>
> static int cpu_psci_cpu_boot(unsigned int cpu)
> {
> - int err = psci_ops.cpu_on(cpu_logical_map(cpu), __pa_symbol(secondary_entry));
> + int err = psci_ops.cpu_on(cpu_logical_map(cpu),
> + __pa_symbol(function_nocfi(secondary_entry)));

Could we use a temporary here, e.g.

phys_addr_t pa_secondary_entry = __pa_symbol(function_nocfi(secondary_entry));
int err = psci_ops.cpu_on(cpu_logical_map(cpu), pa_secondary_entry);

> if (err)
> pr_err("failed to boot CPU%d (%d)\n", cpu, err);
>
> diff --git a/arch/arm64/kernel/smp_spin_table.c b/arch/arm64/kernel/smp_spin_table.c
> index 056772c26098..4c4e36ded4aa 100644
> --- a/arch/arm64/kernel/smp_spin_table.c
> +++ b/arch/arm64/kernel/smp_spin_table.c
> @@ -88,7 +88,8 @@ static int smp_spin_table_cpu_prepare(unsigned int cpu)
> * boot-loader's endianness before jumping. This is mandated by
> * the boot protocol.
> */
> - writeq_relaxed(__pa_symbol(secondary_holding_pen), release_addr);
> + writeq_relaxed(__pa_symbol(function_nocfi(secondary_holding_pen)),
> + release_addr);

Likewise here? e.g. at the start of the function have:

phys_addr_t pa_holding_pen = __pa_symbol(function_nocfi(secondary_holding_pen));

... then here have:

writeq_relaxed(pa_holding_pen, release_addr);

With those:

Acked-by: Mark Rutland <[email protected]>

Thanks,
Mark.

> __flush_dcache_area((__force void *)release_addr,
> sizeof(*release_addr));
>
> --
> 2.31.0.208.g409f899ff0-goog
>

2021-04-06 21:03:48

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH v5 14/18] arm64: add __nocfi to functions that jump to a physical address

[adding Ard for EFI runtime services bits]

On Thu, Apr 01, 2021 at 04:32:12PM -0700, Sami Tolvanen wrote:
> Disable CFI checking for functions that switch to linear mapping and
> make an indirect call to a physical address, since the compiler only
> understands virtual addresses and the CFI check for such indirect calls
> would always fail.

What does physical vs virtual have to do with this? Does the address
actually matter, or is this just a general thing that when calling an
assembly function we won't have a trampoline that the caller expects?

I wonder if we need to do something with asmlinkage here, perhaps?

I didn't spot anything in the series handling EFI runtime services
calls, and I strongly suspect we need to do something for those, unless
they're handled implicitly by something else.

> Signed-off-by: Sami Tolvanen <[email protected]>
> Reviewed-by: Kees Cook <[email protected]>
> ---
> arch/arm64/include/asm/mmu_context.h | 2 +-
> arch/arm64/kernel/cpu-reset.h | 8 ++++----
> arch/arm64/kernel/cpufeature.c | 2 +-
> 3 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
> index 386b96400a57..d3cef9133539 100644
> --- a/arch/arm64/include/asm/mmu_context.h
> +++ b/arch/arm64/include/asm/mmu_context.h
> @@ -119,7 +119,7 @@ static inline void cpu_install_idmap(void)
> * Atomically replaces the active TTBR1_EL1 PGD with a new VA-compatible PGD,
> * avoiding the possibility of conflicting TLB entries being allocated.
> */
> -static inline void cpu_replace_ttbr1(pgd_t *pgdp)
> +static inline void __nocfi cpu_replace_ttbr1(pgd_t *pgdp)

Given these are inlines, what's the effect when these are inlined into a
function that would normally use CFI? Does CFI get suppressed for the
whole function, or just the bit that got inlined?

Is there an attribute that we could place on a function pointer to tell
the compiler to not check calls via that pointer? If that existed we'd
be able to scope this much more tightly.

Thanks,
Mark.

> {
> typedef void (ttbr_replace_func)(phys_addr_t);
> extern ttbr_replace_func idmap_cpu_replace_ttbr1;
> diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
> index f3adc574f969..9a7b1262ef17 100644
> --- a/arch/arm64/kernel/cpu-reset.h
> +++ b/arch/arm64/kernel/cpu-reset.h
> @@ -13,10 +13,10 @@
> void __cpu_soft_restart(unsigned long el2_switch, unsigned long entry,
> unsigned long arg0, unsigned long arg1, unsigned long arg2);
>
> -static inline void __noreturn cpu_soft_restart(unsigned long entry,
> - unsigned long arg0,
> - unsigned long arg1,
> - unsigned long arg2)
> +static inline void __noreturn __nocfi cpu_soft_restart(unsigned long entry,
> + unsigned long arg0,
> + unsigned long arg1,
> + unsigned long arg2)
> {
> typeof(__cpu_soft_restart) *restart;
>
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 0b2e0d7b13ec..c2f94a5206e0 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1445,7 +1445,7 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
> }
>
> #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> -static void
> +static void __nocfi
> kpti_install_ng_mappings(const struct arm64_cpu_capabilities *__unused)
> {
> typedef void (kpti_remap_fn)(int, int, phys_addr_t);
> --
> 2.31.0.208.g409f899ff0-goog
>

2021-04-06 21:04:18

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH v5 16/18] arm64: ftrace: use function_nocfi for ftrace_call

On Thu, Apr 01, 2021 at 04:32:14PM -0700, Sami Tolvanen wrote:
> With CONFIG_CFI_CLANG, the compiler replaces function pointers with
> jump table addresses, which breaks dynamic ftrace as the address of
> ftrace_call is replaced with the address of ftrace_call.cfi_jt. Use
> function_nocfi() to get the address of the actual function instead.
>
> Suggested-by: Ben Dai <[email protected]>
> Signed-off-by: Sami Tolvanen <[email protected]>
> ---
> arch/arm64/kernel/ftrace.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
> index 86a5cf9bc19a..b5d3ddaf69d9 100644
> --- a/arch/arm64/kernel/ftrace.c
> +++ b/arch/arm64/kernel/ftrace.c
> @@ -55,7 +55,7 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
> unsigned long pc;
> u32 new;
>
> - pc = (unsigned long)&ftrace_call;
> + pc = (unsigned long)function_nocfi(ftrace_call);

Acked-by: Mark Rutland <[email protected]>

Thanks,
Mark.

> new = aarch64_insn_gen_branch_imm(pc, (unsigned long)func,
> AARCH64_INSN_BRANCH_LINK);
>
> --
> 2.31.0.208.g409f899ff0-goog
>

2021-04-06 22:01:06

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v5 14/18] arm64: add __nocfi to functions that jump to a physical address

On Tue, 6 Apr 2021 at 13:54, Mark Rutland <[email protected]> wrote:
>
> [adding Ard for EFI runtime services bits]
>
> On Thu, Apr 01, 2021 at 04:32:12PM -0700, Sami Tolvanen wrote:
> > Disable CFI checking for functions that switch to linear mapping and
> > make an indirect call to a physical address, since the compiler only
> > understands virtual addresses and the CFI check for such indirect calls
> > would always fail.
>
> What does physical vs virtual have to do with this? Does the address
> actually matter, or is this just a general thing that when calling an
> assembly function we won't have a trampoline that the caller expects?
>
> I wonder if we need to do something with asmlinkage here, perhaps?
>
> I didn't spot anything in the series handling EFI runtime services
> calls, and I strongly suspect we need to do something for those, unless
> they're handled implicitly by something else.
>

All indirect EFI calls are routed via an asm helper that I originally
added to check whether x18 was corrupted by the firmware. So from the
caller side, we should be fine.

All callee addresses are provided by the firmware via tables in
memory, so I would assume that covers the callee side as well. AFAICT,
the compiler is never left to emit these indirect calls itself, or to
take the address of a firmware routine.

But a test would be nice :-)

> > Signed-off-by: Sami Tolvanen <[email protected]>
> > Reviewed-by: Kees Cook <[email protected]>
> > ---
> > arch/arm64/include/asm/mmu_context.h | 2 +-
> > arch/arm64/kernel/cpu-reset.h | 8 ++++----
> > arch/arm64/kernel/cpufeature.c | 2 +-
> > 3 files changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
> > index 386b96400a57..d3cef9133539 100644
> > --- a/arch/arm64/include/asm/mmu_context.h
> > +++ b/arch/arm64/include/asm/mmu_context.h
> > @@ -119,7 +119,7 @@ static inline void cpu_install_idmap(void)
> > * Atomically replaces the active TTBR1_EL1 PGD with a new VA-compatible PGD,
> > * avoiding the possibility of conflicting TLB entries being allocated.
> > */
> > -static inline void cpu_replace_ttbr1(pgd_t *pgdp)
> > +static inline void __nocfi cpu_replace_ttbr1(pgd_t *pgdp)
>
> Given these are inlines, what's the effect when these are inlined into a
> function that would normally use CFI? Does CFI get suppressed for the
> whole function, or just the bit that got inlined?
>
> Is there an attribute that we could place on a function pointer to tell
> the compiler to not check calls via that pointer? If that existed we'd
> be able to scope this much more tightly.
>

I agree that it would be very helpful to be able to define a function
pointer type that is exempt from CFI checks.
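
Something along these lines, perhaps (purely hypothetical syntax, no
such attribute exists today):

	/* hypothetical: calls through this pointer type skip the CFI check */
	typedef void (*phys_fn_t)(unsigned long) __attribute__((no_cfi_check));

which would let the rest of the caller stay under CFI instead of
annotating the whole function with __nocfi.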

2021-04-07 00:21:27

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH v5 03/18] mm: add generic function_nocfi macro

On Thu, Apr 01, 2021 at 04:32:01PM -0700, Sami Tolvanen wrote:
> With CONFIG_CFI_CLANG, the compiler replaces function addresses
> in instrumented C code with jump table addresses. This means that
> __pa_symbol(function) returns the physical address of the jump table
> entry instead of the actual function, which may not work as the jump
> table code will immediately jump to a virtual address that may not be
> mapped.
>
> To avoid this address space confusion, this change adds a generic
> definition for function_nocfi(), which architectures that support CFI
> can override. The typical implementation would use inline assembly
> to take the function address, which avoids compiler instrumentation.
>
> Signed-off-by: Sami Tolvanen <[email protected]>
> Reviewed-by: Kees Cook <[email protected]>

FWIW:

Acked-by: Mark Rutland <[email protected]>

Mark.

> ---
> include/linux/mm.h | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 8ba434287387..22cce9c7dd05 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -124,6 +124,16 @@ extern int mmap_rnd_compat_bits __read_mostly;
> #define lm_alias(x) __va(__pa_symbol(x))
> #endif
>
> +/*
> + * With CONFIG_CFI_CLANG, the compiler replaces function addresses in
> + * instrumented C code with jump table addresses. Architectures that
> + * support CFI can define this macro to return the actual function address
> + * when needed.
> + */
> +#ifndef function_nocfi
> +#define function_nocfi(x) (x)
> +#endif
> +
> /*
> * To prevent common memory management code establishing
> * a zero page mapping on a read fault.
> --
> 2.31.0.208.g409f899ff0-goog
>

2021-04-07 00:22:46

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH v5 12/18] arm64: implement function_nocfi

On Thu, Apr 01, 2021 at 04:32:10PM -0700, Sami Tolvanen wrote:
> With CONFIG_CFI_CLANG, the compiler replaces function addresses in
> instrumented C code with jump table addresses. This change implements
> the function_nocfi() macro, which returns the actual function address
> instead.
>
> Signed-off-by: Sami Tolvanen <[email protected]>
> Reviewed-by: Kees Cook <[email protected]>

I think that it's unfortunate that we have to drop to assembly here, but
given this is infrequent I agree it's not the end of the world, so:

Acked-by: Mark Rutland <[email protected]>

> ---
> arch/arm64/include/asm/memory.h | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index 0aabc3be9a75..b55410afd3d1 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -321,6 +321,21 @@ static inline void *phys_to_virt(phys_addr_t x)
> #define virt_to_pfn(x) __phys_to_pfn(__virt_to_phys((unsigned long)(x)))
> #define sym_to_pfn(x) __phys_to_pfn(__pa_symbol(x))
>
> +#ifdef CONFIG_CFI_CLANG
> +/*
> + * With CONFIG_CFI_CLANG, the compiler replaces function address
> + * references with the address of the function's CFI jump table
> + * entry. The function_nocfi macro always returns the address of the
> + * actual function instead.
> + */
> +#define function_nocfi(x) ({ \
> + void *addr; \
> + asm("adrp %0, " __stringify(x) "\n\t" \
> + "add %0, %0, :lo12:" __stringify(x) : "=r" (addr)); \

If it's not too painful, could we please move the asm constraint onto its
own line? That makes it slightly easier to read, and aligns with what
we've (mostly) done elsewhere in arm64.

Not a big deal either way, and the ack stands regardless.

Thanks,
Mark.

> + addr; \
> +})
> +#endif
> +
> /*
> * virt_to_page(x) convert a _valid_ virtual address to struct page *
> * virt_addr_valid(x) indicates whether a virtual address is valid
> --
> 2.31.0.208.g409f899ff0-goog
>

2021-04-07 00:55:42

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH v5 11/18] psci: use function_nocfi for cpu_resume

On Thu, Apr 01, 2021 at 04:32:09PM -0700, Sami Tolvanen wrote:
> With CONFIG_CFI_CLANG, the compiler replaces function pointers with
> jump table addresses, which results in __pa_symbol returning the
> physical address of the jump table entry. As the jump table contains
> an immediate jump to an EL1 virtual address, this typically won't
> work as intended. Use function_nocfi to get the actual address of
> cpu_resume.
>
> Signed-off-by: Sami Tolvanen <[email protected]>
> Reviewed-by: Kees Cook <[email protected]>

Acked-by: Mark Rutland <[email protected]>

Mark.

> ---
> drivers/firmware/psci/psci.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/firmware/psci/psci.c b/drivers/firmware/psci/psci.c
> index f5fc429cae3f..64344e84bd63 100644
> --- a/drivers/firmware/psci/psci.c
> +++ b/drivers/firmware/psci/psci.c
> @@ -325,8 +325,9 @@ static int __init psci_features(u32 psci_func_id)
> static int psci_suspend_finisher(unsigned long state)
> {
> u32 power_state = state;
> + phys_addr_t pa_cpu_resume = __pa_symbol(function_nocfi(cpu_resume));
>
> - return psci_ops.cpu_suspend(power_state, __pa_symbol(cpu_resume));
> + return psci_ops.cpu_suspend(power_state, pa_cpu_resume);
> }
>
> int psci_cpu_suspend_enter(u32 state)
> @@ -344,8 +345,10 @@ int psci_cpu_suspend_enter(u32 state)
>
> static int psci_system_suspend(unsigned long unused)
> {
> + phys_addr_t pa_cpu_resume = __pa_symbol(function_nocfi(cpu_resume));
> +
> return invoke_psci_fn(PSCI_FN_NATIVE(1_0, SYSTEM_SUSPEND),
> - __pa_symbol(cpu_resume), 0, 0);
> + pa_cpu_resume, 0, 0);
> }
>
> static int psci_system_suspend_enter(suspend_state_t state)
> --
> 2.31.0.208.g409f899ff0-goog
>

2021-04-07 07:44:02

by Sami Tolvanen

[permalink] [raw]
Subject: Re: [PATCH v5 14/18] arm64: add __nocfi to functions that jump to a physical address

On Tue, Apr 6, 2021 at 4:54 AM Mark Rutland <[email protected]> wrote:
>
> [adding Ard for EFI runtime services bits]
>
> On Thu, Apr 01, 2021 at 04:32:12PM -0700, Sami Tolvanen wrote:
> > Disable CFI checking for functions that switch to linear mapping and
> > make an indirect call to a physical address, since the compiler only
> > understands virtual addresses and the CFI check for such indirect calls
> > would always fail.
>
> What does physical vs virtual have to do with this? Does the address
> actually matter, or is this just a general thing that when calling an
> assembly function we won't have a trampoline that the caller expects?

No, this is about the actual address. The compiler-generated runtime
checks only know about EL1 virtual addresses, so if we switch to a
different address space, all indirect calls will trip CFI.
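
To illustrate, with non-canonical CFI the compiler conceptually turns an
indirect call site into something like this (a simplified pseudo-C
sketch; cfi_target_ok() and cfi_fail() are illustrative names, not the
actual compiler-generated symbols):

	/* roughly what the instrumented call site does */
	static void do_indirect_call(void (*fn)(unsigned long), unsigned long arg)
	{
		if (!cfi_target_ok(fn))	/* validates fn against jump table VAs */
			cfi_fail(fn);

		fn(arg);
	}

The addresses the check validates against are EL1 virtual addresses, so
an indirect call whose target is a physical address can never pass it.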

> I wonder if we need to do something with asmlinkage here, perhaps?
>
> I didn't spot anything in the series handling EFI runtime services
> calls, and I strongly suspect we need to do something for those, unless
> they're handled implicitly by something else.
>
> > Signed-off-by: Sami Tolvanen <[email protected]>
> > Reviewed-by: Kees Cook <[email protected]>
> > ---
> > arch/arm64/include/asm/mmu_context.h | 2 +-
> > arch/arm64/kernel/cpu-reset.h | 8 ++++----
> > arch/arm64/kernel/cpufeature.c | 2 +-
> > 3 files changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
> > index 386b96400a57..d3cef9133539 100644
> > --- a/arch/arm64/include/asm/mmu_context.h
> > +++ b/arch/arm64/include/asm/mmu_context.h
> > @@ -119,7 +119,7 @@ static inline void cpu_install_idmap(void)
> > * Atomically replaces the active TTBR1_EL1 PGD with a new VA-compatible PGD,
> > * avoiding the possibility of conflicting TLB entries being allocated.
> > */
> > -static inline void cpu_replace_ttbr1(pgd_t *pgdp)
> > +static inline void __nocfi cpu_replace_ttbr1(pgd_t *pgdp)
>
> Given these are inlines, what's the effect when these are inlined into a
> function that would normally use CFI? Does CFI get suppressed for the
> whole function, or just the bit that got inlined?

Just for the bit that gets inlined.
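
E.g. (a rough sketch, assuming the usual CFI codegen; jump_to_phys is
just an illustrative name, not something in this series):

	static inline void __nocfi jump_to_phys(phys_addr_t addr)
	{
		/* illustrative cast; fine on arm64 where phys_addr_t is 64-bit */
		void (*fn)(void) = (void *)addr;

		fn();			/* no CFI check emitted here */
	}

	void caller(void (*cb)(void), phys_addr_t addr)
	{
		cb();			/* still gets a CFI check */
		jump_to_phys(addr);	/* the inlined body stays unchecked */
	}

The rest of caller() keeps its checks; only the code that came from the
__nocfi inline is left unchecked.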

> Is there an attribute that we could place on a function pointer to tell
> the compiler to not check calls via that pointer? If that existed we'd
> be able to scope this much more tightly.

There isn't, but I do agree that this would be a useful feature.

Sami

2021-04-07 10:25:11

by Sami Tolvanen

[permalink] [raw]
Subject: Re: [PATCH v5 12/18] arm64: implement function_nocfi

On Tue, Apr 6, 2021 at 4:37 AM Mark Rutland <[email protected]> wrote:
>
> On Thu, Apr 01, 2021 at 04:32:10PM -0700, Sami Tolvanen wrote:
> > With CONFIG_CFI_CLANG, the compiler replaces function addresses in
> > instrumented C code with jump table addresses. This change implements
> > the function_nocfi() macro, which returns the actual function address
> > instead.
> >
> > Signed-off-by: Sami Tolvanen <[email protected]>
> > Reviewed-by: Kees Cook <[email protected]>
>
> I think that it's unfortunate that we have to drop to assembly here, but
> given this is infrequent I agree it's not the end of the world, so:
>
> Acked-by: Mark Rutland <[email protected]>
>
> > ---
> > arch/arm64/include/asm/memory.h | 15 +++++++++++++++
> > 1 file changed, 15 insertions(+)
> >
> > diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> > index 0aabc3be9a75..b55410afd3d1 100644
> > --- a/arch/arm64/include/asm/memory.h
> > +++ b/arch/arm64/include/asm/memory.h
> > @@ -321,6 +321,21 @@ static inline void *phys_to_virt(phys_addr_t x)
> > #define virt_to_pfn(x) __phys_to_pfn(__virt_to_phys((unsigned long)(x)))
> > #define sym_to_pfn(x) __phys_to_pfn(__pa_symbol(x))
> >
> > +#ifdef CONFIG_CFI_CLANG
> > +/*
> > + * With CONFIG_CFI_CLANG, the compiler replaces function address
> > + * references with the address of the function's CFI jump table
> > + * entry. The function_nocfi macro always returns the address of the
> > + * actual function instead.
> > + */
> > +#define function_nocfi(x) ({ \
> > + void *addr; \
> > + asm("adrp %0, " __stringify(x) "\n\t" \
> > + "add %0, %0, :lo12:" __stringify(x) : "=r" (addr)); \
>
> If it's not too painful, could we please move the asm constraint onto its
> own line? That makes it slightly easier to read, and aligns with what
> we've (mostly) done elsewhere in arm64.

Sure, I'll change this in the next version.

> Not a big deal either way, and the ack stands regardless.
>
> Thanks,
> Mark.

Sami