2023-06-13 00:46:04

by Edgecombe, Rick P

[permalink] [raw]
Subject: [PATCH v9 00/42] Shadow stacks for userspace

Hi,

This series implements Shadow Stacks for userspace using x86's Control-flow
Enforcement Technology (CET). CET consists of two related security
features: shadow stacks and indirect branch tracking. This series
implements just the shadow stack part of this feature, and just for
userspace.

The main use case for shadow stack is providing protection against return
oriented programming attacks. It works by maintaining a secondary (shadow)
stack using a special memory type that has protections against
modification. When executing a CALL instruction, the processor pushes the
return address to both the normal stack and to the special permission
shadow stack. Upon RET, the processor pops the shadow stack copy and
compares it to the normal stack copy. For more details, see the
coverletter from v1 [0].

Shadow Stack was rejected by Linus for 6.4 [1][2]. This is a new version
that addresses his concerns. In the months since the series was queued in
tip, there were also some non-critical things that turned up, that I was
planning do as fast follow ups. Since we are doing a re-spin, I thought to
just include them in the initial series. (see 3 and 4) Also, the whole
series has been re-ordered, after some comments from Linus prompted some
reflection.

Most of the patches are the same, so I’ll specifically list the patches
with changes to help focus review.

1. Redo of the pte_mkwrite() refactoring patches
------------------------------------------------
The point of these patches was to make pte_mkwrite() take a VMA.
Unfortunately the original version of this refactor had a bug. Linus
suggested an alternate way of doing the refactor that would be less error
prone. It sounds like this will be used by the riscv and possibly arm
shadow stack features. It would be great to collect some Reviewed-by tags
on these from anyone else that would depend on them.

Changed (and renamed) patches:
mm: Rename arch pte_mkwrite()'s to pte_mkwrite_novma()
mm: Move pte/pmd_mkwrite() callers with no VMA to _novma()
mm: Make pte_mkwrite() take a VMA

2. SavedDirty overhaul
----------------------
The Shadow Stack PTEs are defined by the HW as Write=0,Dirty=1. Since
Linux usually creates PTEs like that when it write-protects dirty memory,
the series introduced a SavedDirty software bit. When a PTE gets
write-protected the Dirty bit shifts to the SavedDirty bit so it won't
become shadow stack. When it is made writable, the opposite happens.

In the previously queued version, the SavedDirty bit was only used when
shadow stack was configured and available on the CPU. But this created two
versions of the Dirty bit behavior on x86, adding complexity. Linus
objected to this, and also the use of conditional control flow logic
instead of bit math. After some trial and error, I ended up with something
that tries to incorporate the feedback, but with some adjustments on the
specifics. I would like some feedback on the maintainability tradeoffs
taken and some scrutiny on the correctness as well.


First of all, switching to bit math for the SavedDirty dance really seems
to be a big improvement. The conditional part is now branchless and
isolated to two functions. Descriptions of the other changes follows, and
a little less of a clear win to me.


One thing that came up in this discussion was the performance impact of
the CMPXCHG loop added to ptep_set_wrprotect() in order to atomically do
the shift of Dirty to SavedDirty when a live PTE is being write-protected.
The concern was that this could impact the performance of fork() where it
is used to write-protect memory in the parent MM.

Linus had suggested to optimize the single threaded fork case to offset
the concerns of a performance impact. However, trying to stress this case,
I was unable to concoct a microbenchmark to show any slowdown of the LOCK
CMPXCHG loop vs the original LOCK AND. Hypothetically the CMPXCHG could
scale worse, but since I couldn’t actually entice it to show any slowdown,
I was thinking that the original worries might have been misplaced and we
could get away with the unconditional CMPXCHG loop.

As Dave previously mentioned, the other wrinkle in all of this is that on
CPUs that don’t support shadow stack, a CPU could rarely set Dirty=1 on a
PTE with Write=0. So the kernel logic needs to be robust to PTEs that have
Write=0,Dirty=1 PTEs in non-shadow stack cases on these CPUs. And also
should be able to handle Write=0,Dirty=1,SavedDirty=1 PTEs. Similarly, a
KNL (Xeon Phi platform) erratum can result in Dirty=1 bits getting set on
Present=0 PTEs.

In order to make the core-MM logic work correctly with shadow stack,
pte_write() needs to also return true for Write=0,Dirty=1 memory. Since
those older platforms can cause Dirty=1,Write=0 PTEs despite the kernels
efforts to not create any itself, the kernel can’t have this shadow stack
pte_write() logic when running on them. So the kernel can only check for
shadow stack memory in pte_write() when shadow stack is supported by the
CPU. Also, the kernel needs to make sure to not trigger any warnings when
it sees a Dirty=0,Write=1 PTE in an unexpected place. So these are also
only done when shadow stack is supported on the CPU. These checks are
isolated to single place in pte_shstk().

So in the end, we can shift the Dirty bit around unconditionally, but the
kernels behavior around the Dirty bit still needs to adjust depending on
the CPU actually supporting shadow stack.

Refactoring the SavedDirty<->Dirty setting logic into bitmath made it so
it would be a lot easier to compile the SavedDirty bit out if needed.
Basically it can be removed with only two checks in
mksaveddirty_shift()/clear_saveddirty_shift() (see “x86/mm: Introduce
_PAGE_SAVED_DIRTY”).

That all makes me wonder if it would still be better to disable SavedDirty
when shadow stack is not supported. One aspect of the idea to make
SavedDirty unconditional, was to unify the sets of rules to have to
reason about. But due to the behavior of the pre-shadow stack CPUs, this
also comes at the increased runtime behaviors we need to worry about. The
other point brought up was the increased testing of having SavedDirty used
more widely. But since we have to turn off the warnings and other logic,
this testing isn’t fully happening on these older platforms either. So I’m
not sure if the unconditional SavedDirty is really a win or not. I thought
it was slightly.

So in this version SavedDirty is turned on universally for x86. Even for
32 bit, which, while seeming a bit silly, allows there to be only one
version of the ptep_set_wrprotect() logic.

Changed patches:
x86/mm: Start actually marking _PAGE_SAVED_DIRTY
x86/mm: Update ptep/pmdp_set_wrprotect() for _PAGE_SAVED_DIRTY
x86/mm: Introduce _PAGE_SAVED_DIRTY
mm: Warn on shadow stack memory in wrong vma
x86/mm: Warn if create Write=0,Dirty=1 with raw prot

3. Shadow stack protections enhancements
----------------------------------------
The shadow stack signal frame format uses a special shadow stack frame
pattern that should not occur naturally in order to avoid forgery on
sigreturn. Two patches are added to strengthen the forgery checks. These
could have been squashed into the signal patch, but I thought leaving them
separate might make review easier for those familiar with the last series.
I was waffling on whether to postpone them to minimize changes from the
previously queued changed to v9. In the end, as long as the series was
already getting re-spun, I thought the extra protections were worth
starting with.

The new mmap maple tree code needs to be taught specifically about
VM_SHADOW_STACK, instead of just relying on vm_start_gap() like the old RB
stuff, so that is added as well to retain the start guard gap with maple
tree.

Added/changes patches:
x86/shstk: Check that SSP is aligned on sigreturn
x86/shstk: Check that signal frame is shadow stack mem
mm: Add guard pages around a shadow stack

4. Selftest enhancements
------------------------
A few miscellaneous selftest enhancements that accumulated since the old
series landed in tip. Added a test for the shadow stack guard gap and the
shadow stack ptrace interface. Also fixed a race that caused the uffd test
to sometimes hang.

Changed patches:
selftests/x86: Add shadow stack test

Since some of the changes were extensive in the modified patches, I
dropped some review tags. But I left testing tags, testers please retest.

Thanks,

Rick

[0] https://lore.kernel.org/lkml/[email protected]/
[1] https://lore.kernel.org/lkml/CAHk-=wiuVXTfgapmjYQvrEDzn3naF2oYnHuky+feEJSj_G_yFQ@mail.gmail.com/
[2] https://lore.kernel.org/lkml/CAHk-=wiB0wy6oXOsPtYU4DSbqJAY8z5iNBKdjdOp2LP23khUoA@mail.gmail.com/

Mike Rapoport (1):
x86/shstk: Add ARCH_SHSTK_UNLOCK

Rick Edgecombe (38):
mm: Rename arch pte_mkwrite()'s to pte_mkwrite_novma()
mm: Move pte/pmd_mkwrite() callers with no VMA to _novma()
mm: Make pte_mkwrite() take a VMA
x86/shstk: Add Kconfig option for shadow stack
x86/traps: Move control protection handler to separate file
x86/cpufeatures: Add CPU feature flags for shadow stacks
x86/mm: Move pmd_write(), pud_write() up in the file
x86/mm: Introduce _PAGE_SAVED_DIRTY
x86/mm: Update ptep/pmdp_set_wrprotect() for _PAGE_SAVED_DIRTY
x86/mm: Start actually marking _PAGE_SAVED_DIRTY
x86/mm: Remove _PAGE_DIRTY from kernel RO pages
x86/mm: Check shadow stack page fault errors
mm: Add guard pages around a shadow stack.
mm: Warn on shadow stack memory in wrong vma
x86/mm: Warn if create Write=0,Dirty=1 with raw prot
mm/mmap: Add shadow stack pages to memory accounting
x86/mm: Introduce MAP_ABOVE4G
x86/mm: Teach pte_mkwrite() about stack memory
mm: Don't allow write GUPs to shadow stack memory
Documentation/x86: Add CET shadow stack description
x86/fpu/xstate: Introduce CET MSR and XSAVES supervisor states
x86/fpu: Add helper for modifying xstate
x86: Introduce userspace API for shadow stack
x86/shstk: Add user control-protection fault handler
x86/shstk: Add user-mode shadow stack support
x86/shstk: Handle thread shadow stack
x86/shstk: Introduce routines modifying shstk
x86/shstk: Handle signals for shadow stack
x86/shstk: Check that SSP is aligned on sigreturn
x86/shstk: Check that signal frame is shadow stack mem
x86/shstk: Introduce map_shadow_stack syscall
x86/shstk: Support WRSS for userspace
x86: Expose thread features in /proc/$PID/status
x86/shstk: Wire in shadow stack interface
x86/cpufeatures: Enable CET CR4 bit for shadow stack
selftests/x86: Add shadow stack test
x86: Add PTRACE interface for shadow stack
x86/shstk: Add ARCH_SHSTK_STATUS

Yu-cheng Yu (3):
mm: Re-introduce vm_flags to do_mmap()
mm: Move VM_UFFD_MINOR_BIT from 37 to 38
mm: Introduce VM_SHADOW_STACK for shadow stack memory

Documentation/arch/x86/index.rst | 1 +
Documentation/arch/x86/shstk.rst | 179 ++++
Documentation/filesystems/proc.rst | 1 +
Documentation/mm/arch_pgtable_helpers.rst | 12 +-
arch/Kconfig | 3 +
arch/alpha/include/asm/pgtable.h | 2 +-
arch/arc/include/asm/hugepage.h | 2 +-
arch/arc/include/asm/pgtable-bits-arcv2.h | 2 +-
arch/arm/include/asm/pgtable-3level.h | 2 +-
arch/arm/include/asm/pgtable.h | 2 +-
arch/arm/kernel/signal.c | 2 +-
arch/arm64/include/asm/pgtable.h | 4 +-
arch/arm64/kernel/signal.c | 2 +-
arch/arm64/kernel/signal32.c | 2 +-
arch/arm64/mm/trans_pgd.c | 4 +-
arch/csky/include/asm/pgtable.h | 2 +-
arch/hexagon/include/asm/pgtable.h | 2 +-
arch/ia64/include/asm/pgtable.h | 2 +-
arch/loongarch/include/asm/pgtable.h | 4 +-
arch/m68k/include/asm/mcf_pgtable.h | 2 +-
arch/m68k/include/asm/motorola_pgtable.h | 2 +-
arch/m68k/include/asm/sun3_pgtable.h | 2 +-
arch/microblaze/include/asm/pgtable.h | 2 +-
arch/mips/include/asm/pgtable.h | 6 +-
arch/nios2/include/asm/pgtable.h | 2 +-
arch/openrisc/include/asm/pgtable.h | 2 +-
arch/parisc/include/asm/pgtable.h | 2 +-
arch/powerpc/include/asm/book3s/32/pgtable.h | 2 +-
arch/powerpc/include/asm/book3s/64/pgtable.h | 4 +-
arch/powerpc/include/asm/nohash/32/pgtable.h | 4 +-
arch/powerpc/include/asm/nohash/32/pte-8xx.h | 4 +-
arch/powerpc/include/asm/nohash/64/pgtable.h | 2 +-
arch/riscv/include/asm/pgtable.h | 6 +-
arch/s390/include/asm/hugetlb.h | 2 +-
arch/s390/include/asm/pgtable.h | 4 +-
arch/s390/mm/pageattr.c | 4 +-
arch/sh/include/asm/pgtable_32.h | 4 +-
arch/sparc/include/asm/pgtable_32.h | 2 +-
arch/sparc/include/asm/pgtable_64.h | 6 +-
arch/sparc/kernel/signal32.c | 2 +-
arch/sparc/kernel/signal_64.c | 2 +-
arch/um/include/asm/pgtable.h | 2 +-
arch/x86/Kconfig | 24 +
arch/x86/Kconfig.assembler | 5 +
arch/x86/entry/syscalls/syscall_64.tbl | 1 +
arch/x86/include/asm/cpufeatures.h | 2 +
arch/x86/include/asm/disabled-features.h | 16 +-
arch/x86/include/asm/fpu/api.h | 9 +
arch/x86/include/asm/fpu/regset.h | 7 +-
arch/x86/include/asm/fpu/sched.h | 3 +-
arch/x86/include/asm/fpu/types.h | 16 +-
arch/x86/include/asm/fpu/xstate.h | 6 +-
arch/x86/include/asm/idtentry.h | 2 +-
arch/x86/include/asm/mmu_context.h | 2 +
arch/x86/include/asm/pgtable.h | 302 +++++-
arch/x86/include/asm/pgtable_types.h | 46 +-
arch/x86/include/asm/processor.h | 8 +
arch/x86/include/asm/shstk.h | 38 +
arch/x86/include/asm/special_insns.h | 13 +
arch/x86/include/asm/tlbflush.h | 3 +-
arch/x86/include/asm/trap_pf.h | 2 +
arch/x86/include/asm/traps.h | 12 +
arch/x86/include/uapi/asm/mman.h | 4 +
arch/x86/include/uapi/asm/prctl.h | 12 +
arch/x86/kernel/Makefile | 4 +
arch/x86/kernel/cet.c | 152 +++
arch/x86/kernel/cpu/common.c | 35 +-
arch/x86/kernel/cpu/cpuid-deps.c | 1 +
arch/x86/kernel/cpu/proc.c | 23 +
arch/x86/kernel/fpu/core.c | 54 +-
arch/x86/kernel/fpu/regset.c | 81 ++
arch/x86/kernel/fpu/xstate.c | 90 +-
arch/x86/kernel/idt.c | 2 +-
arch/x86/kernel/process.c | 21 +-
arch/x86/kernel/process_64.c | 8 +
arch/x86/kernel/ptrace.c | 12 +
arch/x86/kernel/shstk.c | 529 +++++++++++
arch/x86/kernel/signal.c | 1 +
arch/x86/kernel/signal_32.c | 2 +-
arch/x86/kernel/signal_64.c | 8 +-
arch/x86/kernel/sys_x86_64.c | 6 +-
arch/x86/kernel/traps.c | 87 --
arch/x86/mm/fault.c | 22 +
arch/x86/mm/pat/set_memory.c | 4 +-
arch/x86/mm/pgtable.c | 40 +
arch/x86/xen/enlighten_pv.c | 2 +-
arch/x86/xen/mmu_pv.c | 2 +-
arch/x86/xen/xen-asm.S | 2 +-
arch/xtensa/include/asm/pgtable.h | 2 +-
fs/aio.c | 2 +-
fs/proc/array.c | 6 +
fs/proc/task_mmu.c | 3 +
include/asm-generic/hugetlb.h | 2 +-
include/linux/mm.h | 67 +-
include/linux/mman.h | 4 +
include/linux/pgtable.h | 28 +
include/linux/proc_fs.h | 2 +
include/linux/syscalls.h | 1 +
include/uapi/asm-generic/siginfo.h | 3 +-
include/uapi/asm-generic/unistd.h | 2 +-
include/uapi/linux/elf.h | 2 +
ipc/shm.c | 2 +-
kernel/sys_ni.c | 1 +
mm/debug_vm_pgtable.c | 12 +-
mm/gup.c | 2 +-
mm/huge_memory.c | 11 +-
mm/internal.h | 4 +-
mm/memory.c | 5 +-
mm/migrate.c | 2 +-
mm/migrate_device.c | 2 +-
mm/mmap.c | 14 +-
mm/mprotect.c | 2 +-
mm/nommu.c | 4 +-
mm/userfaultfd.c | 2 +-
mm/util.c | 2 +-
tools/testing/selftests/x86/Makefile | 2 +-
.../testing/selftests/x86/test_shadow_stack.c | 884 ++++++++++++++++++
117 files changed, 2789 insertions(+), 307 deletions(-)
create mode 100644 Documentation/arch/x86/shstk.rst
create mode 100644 arch/x86/include/asm/shstk.h
create mode 100644 arch/x86/kernel/cet.c
create mode 100644 arch/x86/kernel/shstk.c
create mode 100644 tools/testing/selftests/x86/test_shadow_stack.c

--
2.34.1



2023-06-13 00:46:32

by Edgecombe, Rick P

[permalink] [raw]
Subject: [PATCH v9 22/42] mm: Don't allow write GUPs to shadow stack memory

The x86 Control-flow Enforcement Technology (CET) feature includes a
new type of memory called shadow stack. This shadow stack memory has
some unusual properties, which requires some core mm changes to
function properly.

In userspace, shadow stack memory is writable only in very specific,
controlled ways. However, since userspace can, even in the limited
ways, modify shadow stack contents, the kernel treats it as writable
memory. As a result, without additional work there would remain many
ways for userspace to trigger the kernel to write arbitrary data to
shadow stacks via get_user_pages(, FOLL_WRITE) based operations. To
help userspace protect their shadow stacks, make this a little less
exposed by blocking writable get_user_pages() operations for shadow
stack VMAs.

Still allow FOLL_FORCE to write through shadow stack protections, as it
does for read-only protections. This is required for debugging use
cases.

Signed-off-by: Rick Edgecombe <[email protected]>
Reviewed-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Tested-by: Pengfei Xu <[email protected]>
Tested-by: John Allen <[email protected]>
Tested-by: Kees Cook <[email protected]>
---
arch/x86/include/asm/pgtable.h | 5 +++++
mm/gup.c | 2 +-
2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 5383f7282f89..fce35f5d4a4e 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1630,6 +1630,11 @@ static inline bool __pte_access_permitted(unsigned long pteval, bool write)
{
unsigned long need_pte_bits = _PAGE_PRESENT|_PAGE_USER;

+ /*
+ * Write=0,Dirty=1 PTEs are shadow stack, which the kernel
+ * shouldn't generally allow access to, but since they
+ * are already Write=0, the below logic covers both cases.
+ */
if (write)
need_pte_bits |= _PAGE_RW;

diff --git a/mm/gup.c b/mm/gup.c
index bbe416236593..cc0dd5267509 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -978,7 +978,7 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
return -EFAULT;

if (write) {
- if (!(vm_flags & VM_WRITE)) {
+ if (!(vm_flags & VM_WRITE) || (vm_flags & VM_SHADOW_STACK)) {
if (!(gup_flags & FOLL_FORCE))
return -EFAULT;
/* hugetlb does not support FOLL_FORCE|FOLL_WRITE. */
--
2.34.1


2023-06-13 00:46:32

by Edgecombe, Rick P

[permalink] [raw]
Subject: [PATCH v9 19/42] mm/mmap: Add shadow stack pages to memory accounting

The x86 Control-flow Enforcement Technology (CET) feature includes a new
type of memory called shadow stack. This shadow stack memory has some
unusual properties, which requires some core mm changes to function
properly.

Co-developed-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Rick Edgecombe <[email protected]>
Reviewed-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Tested-by: Pengfei Xu <[email protected]>
Tested-by: John Allen <[email protected]>
Tested-by: Kees Cook <[email protected]>
---
mm/internal.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 68410c6d97ac..dd2ded32d3d5 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -535,14 +535,14 @@ static inline bool is_exec_mapping(vm_flags_t flags)
}

/*
- * Stack area - automatically grows in one direction
+ * Stack area (including shadow stacks)
*
* VM_GROWSUP / VM_GROWSDOWN VMAs are always private anonymous:
* do_mmap() forbids all other combinations.
*/
static inline bool is_stack_mapping(vm_flags_t flags)
{
- return (flags & VM_STACK) == VM_STACK;
+ return ((flags & VM_STACK) == VM_STACK) || (flags & VM_SHADOW_STACK);
}

/*
--
2.34.1


2023-06-13 00:47:22

by Edgecombe, Rick P

[permalink] [raw]
Subject: [PATCH v9 01/42] mm: Rename arch pte_mkwrite()'s to pte_mkwrite_novma()

The x86 Shadow stack feature includes a new type of memory called shadow
stack. This shadow stack memory has some unusual properties, which requires
some core mm changes to function properly.

One of these unusual properties is that shadow stack memory is writable,
but only in limited ways. These limits are applied via a specific PTE
bit combination. Nevertheless, the memory is writable, and core mm code
will need to apply the writable permissions in the typical paths that
call pte_mkwrite(). Future patches will make pte_mkwrite() take a VMA, so
that the x86 implementation of it can know whether to create regular
writable memory or shadow stack memory.

But there are a couple of challenges to this. Modifying the signatures of
each arch pte_mkwrite() implementation would be error prone because some
are generated with macros and would need to be re-implemented. Also, some
pte_mkwrite() callers operate on kernel memory without a VMA.

So this can be done in a three step process. First pte_mkwrite() can be
renamed to pte_mkwrite_novma() in each arch, with a generic pte_mkwrite()
added that just calls pte_mkwrite_novma(). Next callers without a VMA can
be moved to pte_mkwrite_novma(). And lastly, pte_mkwrite() and all callers
can be changed to take/pass a VMA.

Start the process by renaming pte_mkwrite() to pte_mkwrite_novma() and
adding the pte_mkwrite() wrapper in linux/pgtable.h. Apply the same
pattern for pmd_mkwrite(). Since not all archs have a pmd_mkwrite_novma(),
create a new arch config HAS_HUGE_PAGE that can be used to tell if
pmd_mkwrite() should be defined. Otherwise in the !HAS_HUGE_PAGE cases the
compiler would not be able to find pmd_mkwrite_novma().

No functional change.

Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Michal Simek <[email protected]>
Cc: Dinh Nguyen <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Rick Edgecombe <[email protected]>
Link: https://lore.kernel.org/lkml/CAHk-=wiZjSu7c9sFYZb3q04108stgHff2wfbokGCCgW7riz+8Q@mail.gmail.com/
---
Hi Non-x86 Arch’s,

x86 has a feature that allows for the creation of a special type of
writable memory (shadow stack) that is only writable in limited specific
ways. Previously, changes were proposed to core MM code to teach it to
decide when to create normally writable memory or the special shadow stack
writable memory, but David Hildenbrand suggested[0] to change
pXX_mkwrite() to take a VMA, so awareness of shadow stack memory can be
moved into x86 code. Later Linus suggested a less error-prone way[1] to go
about this after the first attempt had a bug.

Since pXX_mkwrite() is defined in every arch, it requires some tree-wide
changes. So that is why you are seeing some patches out of a big x86
series pop up in your arch mailing list. There is no functional change.
After this refactor, the shadow stack series goes on to use the arch
helpers to push arch memory details inside arch/x86 and other arch's
with upcoming shadow stack features.

Testing was just 0-day build testing.

Hopefully that is enough context. Thanks!

[0] https://lore.kernel.org/lkml/[email protected]/
[1] https://lore.kernel.org/lkml/CAHk-=wiZjSu7c9sFYZb3q04108stgHff2wfbokGCCgW7riz+8Q@mail.gmail.com/
---
Documentation/mm/arch_pgtable_helpers.rst | 6 ++++++
arch/Kconfig | 3 +++
arch/alpha/include/asm/pgtable.h | 2 +-
arch/arc/include/asm/hugepage.h | 2 +-
arch/arc/include/asm/pgtable-bits-arcv2.h | 2 +-
arch/arm/include/asm/pgtable-3level.h | 2 +-
arch/arm/include/asm/pgtable.h | 2 +-
arch/arm64/include/asm/pgtable.h | 4 ++--
arch/csky/include/asm/pgtable.h | 2 +-
arch/hexagon/include/asm/pgtable.h | 2 +-
arch/ia64/include/asm/pgtable.h | 2 +-
arch/loongarch/include/asm/pgtable.h | 4 ++--
arch/m68k/include/asm/mcf_pgtable.h | 2 +-
arch/m68k/include/asm/motorola_pgtable.h | 2 +-
arch/m68k/include/asm/sun3_pgtable.h | 2 +-
arch/microblaze/include/asm/pgtable.h | 2 +-
arch/mips/include/asm/pgtable.h | 6 +++---
arch/nios2/include/asm/pgtable.h | 2 +-
arch/openrisc/include/asm/pgtable.h | 2 +-
arch/parisc/include/asm/pgtable.h | 2 +-
arch/powerpc/include/asm/book3s/32/pgtable.h | 2 +-
arch/powerpc/include/asm/book3s/64/pgtable.h | 4 ++--
arch/powerpc/include/asm/nohash/32/pgtable.h | 4 ++--
arch/powerpc/include/asm/nohash/32/pte-8xx.h | 4 ++--
arch/powerpc/include/asm/nohash/64/pgtable.h | 2 +-
arch/riscv/include/asm/pgtable.h | 6 +++---
arch/s390/include/asm/hugetlb.h | 2 +-
arch/s390/include/asm/pgtable.h | 4 ++--
arch/sh/include/asm/pgtable_32.h | 4 ++--
arch/sparc/include/asm/pgtable_32.h | 2 +-
arch/sparc/include/asm/pgtable_64.h | 6 +++---
arch/um/include/asm/pgtable.h | 2 +-
arch/x86/include/asm/pgtable.h | 4 ++--
arch/xtensa/include/asm/pgtable.h | 2 +-
include/asm-generic/hugetlb.h | 2 +-
include/linux/pgtable.h | 14 ++++++++++++++
36 files changed, 70 insertions(+), 47 deletions(-)

diff --git a/Documentation/mm/arch_pgtable_helpers.rst b/Documentation/mm/arch_pgtable_helpers.rst
index af3891f895b0..69ce1f2aa4d1 100644
--- a/Documentation/mm/arch_pgtable_helpers.rst
+++ b/Documentation/mm/arch_pgtable_helpers.rst
@@ -48,6 +48,9 @@ PTE Page Table Helpers
+---------------------------+--------------------------------------------------+
| pte_mkwrite | Creates a writable PTE |
+---------------------------+--------------------------------------------------+
+| pte_mkwrite_novma | Creates a writable PTE, of the conventional type |
+| | of writable. |
++---------------------------+--------------------------------------------------+
| pte_wrprotect | Creates a write protected PTE |
+---------------------------+--------------------------------------------------+
| pte_mkspecial | Creates a special PTE |
@@ -120,6 +123,9 @@ PMD Page Table Helpers
+---------------------------+--------------------------------------------------+
| pmd_mkwrite | Creates a writable PMD |
+---------------------------+--------------------------------------------------+
+| pmd_mkwrite_novma | Creates a writable PMD, of the conventional type |
+| | of writable. |
++---------------------------+--------------------------------------------------+
| pmd_wrprotect | Creates a write protected PMD |
+---------------------------+--------------------------------------------------+
| pmd_mkspecial | Creates a special PMD |
diff --git a/arch/Kconfig b/arch/Kconfig
index 205fd23e0cad..3bc11c9a2ac1 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -919,6 +919,9 @@ config HAVE_ARCH_HUGE_VMALLOC
config ARCH_WANT_HUGE_PMD_SHARE
bool

+config HAS_HUGE_PAGE
+ def_bool HAVE_ARCH_HUGE_VMAP || TRANSPARENT_HUGEPAGE || HUGETLBFS
+
config HAVE_ARCH_SOFT_DIRTY
bool

diff --git a/arch/alpha/include/asm/pgtable.h b/arch/alpha/include/asm/pgtable.h
index ba43cb841d19..af1a13ab3320 100644
--- a/arch/alpha/include/asm/pgtable.h
+++ b/arch/alpha/include/asm/pgtable.h
@@ -256,7 +256,7 @@ extern inline int pte_young(pte_t pte) { return pte_val(pte) & _PAGE_ACCESSED;
extern inline pte_t pte_wrprotect(pte_t pte) { pte_val(pte) |= _PAGE_FOW; return pte; }
extern inline pte_t pte_mkclean(pte_t pte) { pte_val(pte) &= ~(__DIRTY_BITS); return pte; }
extern inline pte_t pte_mkold(pte_t pte) { pte_val(pte) &= ~(__ACCESS_BITS); return pte; }
-extern inline pte_t pte_mkwrite(pte_t pte) { pte_val(pte) &= ~_PAGE_FOW; return pte; }
+extern inline pte_t pte_mkwrite_novma(pte_t pte){ pte_val(pte) &= ~_PAGE_FOW; return pte; }
extern inline pte_t pte_mkdirty(pte_t pte) { pte_val(pte) |= __DIRTY_BITS; return pte; }
extern inline pte_t pte_mkyoung(pte_t pte) { pte_val(pte) |= __ACCESS_BITS; return pte; }

diff --git a/arch/arc/include/asm/hugepage.h b/arch/arc/include/asm/hugepage.h
index 5001b796fb8d..ef8d4166370c 100644
--- a/arch/arc/include/asm/hugepage.h
+++ b/arch/arc/include/asm/hugepage.h
@@ -21,7 +21,7 @@ static inline pmd_t pte_pmd(pte_t pte)
}

#define pmd_wrprotect(pmd) pte_pmd(pte_wrprotect(pmd_pte(pmd)))
-#define pmd_mkwrite(pmd) pte_pmd(pte_mkwrite(pmd_pte(pmd)))
+#define pmd_mkwrite_novma(pmd) pte_pmd(pte_mkwrite_novma(pmd_pte(pmd)))
#define pmd_mkdirty(pmd) pte_pmd(pte_mkdirty(pmd_pte(pmd)))
#define pmd_mkold(pmd) pte_pmd(pte_mkold(pmd_pte(pmd)))
#define pmd_mkyoung(pmd) pte_pmd(pte_mkyoung(pmd_pte(pmd)))
diff --git a/arch/arc/include/asm/pgtable-bits-arcv2.h b/arch/arc/include/asm/pgtable-bits-arcv2.h
index 6e9f8ca6d6a1..5c073d9f41c2 100644
--- a/arch/arc/include/asm/pgtable-bits-arcv2.h
+++ b/arch/arc/include/asm/pgtable-bits-arcv2.h
@@ -87,7 +87,7 @@

PTE_BIT_FUNC(mknotpresent, &= ~(_PAGE_PRESENT));
PTE_BIT_FUNC(wrprotect, &= ~(_PAGE_WRITE));
-PTE_BIT_FUNC(mkwrite, |= (_PAGE_WRITE));
+PTE_BIT_FUNC(mkwrite_novma, |= (_PAGE_WRITE));
PTE_BIT_FUNC(mkclean, &= ~(_PAGE_DIRTY));
PTE_BIT_FUNC(mkdirty, |= (_PAGE_DIRTY));
PTE_BIT_FUNC(mkold, &= ~(_PAGE_ACCESSED));
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index 106049791500..71c3add6417f 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -202,7 +202,7 @@ static inline pmd_t pmd_##fn(pmd_t pmd) { pmd_val(pmd) op; return pmd; }

PMD_BIT_FUNC(wrprotect, |= L_PMD_SECT_RDONLY);
PMD_BIT_FUNC(mkold, &= ~PMD_SECT_AF);
-PMD_BIT_FUNC(mkwrite, &= ~L_PMD_SECT_RDONLY);
+PMD_BIT_FUNC(mkwrite_novma, &= ~L_PMD_SECT_RDONLY);
PMD_BIT_FUNC(mkdirty, |= L_PMD_SECT_DIRTY);
PMD_BIT_FUNC(mkclean, &= ~L_PMD_SECT_DIRTY);
PMD_BIT_FUNC(mkyoung, |= PMD_SECT_AF);
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index a58ccbb406ad..f37ba2472eae 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -227,7 +227,7 @@ static inline pte_t pte_wrprotect(pte_t pte)
return set_pte_bit(pte, __pgprot(L_PTE_RDONLY));
}

-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{
return clear_pte_bit(pte, __pgprot(L_PTE_RDONLY));
}
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 0bd18de9fd97..7a3d62cb9bee 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -180,7 +180,7 @@ static inline pmd_t set_pmd_bit(pmd_t pmd, pgprot_t prot)
return pmd;
}

-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{
pte = set_pte_bit(pte, __pgprot(PTE_WRITE));
pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY));
@@ -487,7 +487,7 @@ static inline int pmd_trans_huge(pmd_t pmd)
#define pmd_cont(pmd) pte_cont(pmd_pte(pmd))
#define pmd_wrprotect(pmd) pte_pmd(pte_wrprotect(pmd_pte(pmd)))
#define pmd_mkold(pmd) pte_pmd(pte_mkold(pmd_pte(pmd)))
-#define pmd_mkwrite(pmd) pte_pmd(pte_mkwrite(pmd_pte(pmd)))
+#define pmd_mkwrite_novma(pmd) pte_pmd(pte_mkwrite_novma(pmd_pte(pmd)))
#define pmd_mkclean(pmd) pte_pmd(pte_mkclean(pmd_pte(pmd)))
#define pmd_mkdirty(pmd) pte_pmd(pte_mkdirty(pmd_pte(pmd)))
#define pmd_mkyoung(pmd) pte_pmd(pte_mkyoung(pmd_pte(pmd)))
diff --git a/arch/csky/include/asm/pgtable.h b/arch/csky/include/asm/pgtable.h
index d4042495febc..aa0cce4fc02f 100644
--- a/arch/csky/include/asm/pgtable.h
+++ b/arch/csky/include/asm/pgtable.h
@@ -176,7 +176,7 @@ static inline pte_t pte_mkold(pte_t pte)
return pte;
}

-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{
pte_val(pte) |= _PAGE_WRITE;
if (pte_val(pte) & _PAGE_MODIFIED)
diff --git a/arch/hexagon/include/asm/pgtable.h b/arch/hexagon/include/asm/pgtable.h
index 59393613d086..fc2d2d83368d 100644
--- a/arch/hexagon/include/asm/pgtable.h
+++ b/arch/hexagon/include/asm/pgtable.h
@@ -300,7 +300,7 @@ static inline pte_t pte_wrprotect(pte_t pte)
}

/* pte_mkwrite - mark page as writable */
-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{
pte_val(pte) |= _PAGE_WRITE;
return pte;
diff --git a/arch/ia64/include/asm/pgtable.h b/arch/ia64/include/asm/pgtable.h
index 21c97e31a28a..f80aba7cad99 100644
--- a/arch/ia64/include/asm/pgtable.h
+++ b/arch/ia64/include/asm/pgtable.h
@@ -268,7 +268,7 @@ ia64_phys_addr_valid (unsigned long addr)
* access rights:
*/
#define pte_wrprotect(pte) (__pte(pte_val(pte) & ~_PAGE_AR_RW))
-#define pte_mkwrite(pte) (__pte(pte_val(pte) | _PAGE_AR_RW))
+#define pte_mkwrite_novma(pte) (__pte(pte_val(pte) | _PAGE_AR_RW))
#define pte_mkold(pte) (__pte(pte_val(pte) & ~_PAGE_A))
#define pte_mkyoung(pte) (__pte(pte_val(pte) | _PAGE_A))
#define pte_mkclean(pte) (__pte(pte_val(pte) & ~_PAGE_D))
diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
index d28fb9dbec59..8245cf367b31 100644
--- a/arch/loongarch/include/asm/pgtable.h
+++ b/arch/loongarch/include/asm/pgtable.h
@@ -390,7 +390,7 @@ static inline pte_t pte_mkdirty(pte_t pte)
return pte;
}

-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{
pte_val(pte) |= _PAGE_WRITE;
if (pte_val(pte) & _PAGE_MODIFIED)
@@ -490,7 +490,7 @@ static inline int pmd_write(pmd_t pmd)
return !!(pmd_val(pmd) & _PAGE_WRITE);
}

-static inline pmd_t pmd_mkwrite(pmd_t pmd)
+static inline pmd_t pmd_mkwrite_novma(pmd_t pmd)
{
pmd_val(pmd) |= _PAGE_WRITE;
if (pmd_val(pmd) & _PAGE_MODIFIED)
diff --git a/arch/m68k/include/asm/mcf_pgtable.h b/arch/m68k/include/asm/mcf_pgtable.h
index d97fbb812f63..42ebea0488e3 100644
--- a/arch/m68k/include/asm/mcf_pgtable.h
+++ b/arch/m68k/include/asm/mcf_pgtable.h
@@ -211,7 +211,7 @@ static inline pte_t pte_mkold(pte_t pte)
return pte;
}

-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{
pte_val(pte) |= CF_PAGE_WRITABLE;
return pte;
diff --git a/arch/m68k/include/asm/motorola_pgtable.h b/arch/m68k/include/asm/motorola_pgtable.h
index ec0dc19ab834..ba28ca4d219a 100644
--- a/arch/m68k/include/asm/motorola_pgtable.h
+++ b/arch/m68k/include/asm/motorola_pgtable.h
@@ -155,7 +155,7 @@ static inline int pte_young(pte_t pte) { return pte_val(pte) & _PAGE_ACCESSED;
static inline pte_t pte_wrprotect(pte_t pte) { pte_val(pte) |= _PAGE_RONLY; return pte; }
static inline pte_t pte_mkclean(pte_t pte) { pte_val(pte) &= ~_PAGE_DIRTY; return pte; }
static inline pte_t pte_mkold(pte_t pte) { pte_val(pte) &= ~_PAGE_ACCESSED; return pte; }
-static inline pte_t pte_mkwrite(pte_t pte) { pte_val(pte) &= ~_PAGE_RONLY; return pte; }
+static inline pte_t pte_mkwrite_novma(pte_t pte){ pte_val(pte) &= ~_PAGE_RONLY; return pte; }
static inline pte_t pte_mkdirty(pte_t pte) { pte_val(pte) |= _PAGE_DIRTY; return pte; }
static inline pte_t pte_mkyoung(pte_t pte) { pte_val(pte) |= _PAGE_ACCESSED; return pte; }
static inline pte_t pte_mknocache(pte_t pte)
diff --git a/arch/m68k/include/asm/sun3_pgtable.h b/arch/m68k/include/asm/sun3_pgtable.h
index e582b0484a55..4114eaff7404 100644
--- a/arch/m68k/include/asm/sun3_pgtable.h
+++ b/arch/m68k/include/asm/sun3_pgtable.h
@@ -143,7 +143,7 @@ static inline int pte_young(pte_t pte) { return pte_val(pte) & SUN3_PAGE_ACCESS
static inline pte_t pte_wrprotect(pte_t pte) { pte_val(pte) &= ~SUN3_PAGE_WRITEABLE; return pte; }
static inline pte_t pte_mkclean(pte_t pte) { pte_val(pte) &= ~SUN3_PAGE_MODIFIED; return pte; }
static inline pte_t pte_mkold(pte_t pte) { pte_val(pte) &= ~SUN3_PAGE_ACCESSED; return pte; }
-static inline pte_t pte_mkwrite(pte_t pte) { pte_val(pte) |= SUN3_PAGE_WRITEABLE; return pte; }
+static inline pte_t pte_mkwrite_novma(pte_t pte){ pte_val(pte) |= SUN3_PAGE_WRITEABLE; return pte; }
static inline pte_t pte_mkdirty(pte_t pte) { pte_val(pte) |= SUN3_PAGE_MODIFIED; return pte; }
static inline pte_t pte_mkyoung(pte_t pte) { pte_val(pte) |= SUN3_PAGE_ACCESSED; return pte; }
static inline pte_t pte_mknocache(pte_t pte) { pte_val(pte) |= SUN3_PAGE_NOCACHE; return pte; }
diff --git a/arch/microblaze/include/asm/pgtable.h b/arch/microblaze/include/asm/pgtable.h
index d1b8272abcd9..9108b33a7886 100644
--- a/arch/microblaze/include/asm/pgtable.h
+++ b/arch/microblaze/include/asm/pgtable.h
@@ -266,7 +266,7 @@ static inline pte_t pte_mkread(pte_t pte) \
{ pte_val(pte) |= _PAGE_USER; return pte; }
static inline pte_t pte_mkexec(pte_t pte) \
{ pte_val(pte) |= _PAGE_USER | _PAGE_EXEC; return pte; }
-static inline pte_t pte_mkwrite(pte_t pte) \
+static inline pte_t pte_mkwrite_novma(pte_t pte) \
{ pte_val(pte) |= _PAGE_RW; return pte; }
static inline pte_t pte_mkdirty(pte_t pte) \
{ pte_val(pte) |= _PAGE_DIRTY; return pte; }
diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
index 574fa14ac8b2..40a54fd6e48d 100644
--- a/arch/mips/include/asm/pgtable.h
+++ b/arch/mips/include/asm/pgtable.h
@@ -309,7 +309,7 @@ static inline pte_t pte_mkold(pte_t pte)
return pte;
}

-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{
pte.pte_low |= _PAGE_WRITE;
if (pte.pte_low & _PAGE_MODIFIED) {
@@ -364,7 +364,7 @@ static inline pte_t pte_mkold(pte_t pte)
return pte;
}

-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{
pte_val(pte) |= _PAGE_WRITE;
if (pte_val(pte) & _PAGE_MODIFIED)
@@ -627,7 +627,7 @@ static inline pmd_t pmd_wrprotect(pmd_t pmd)
return pmd;
}

-static inline pmd_t pmd_mkwrite(pmd_t pmd)
+static inline pmd_t pmd_mkwrite_novma(pmd_t pmd)
{
pmd_val(pmd) |= _PAGE_WRITE;
if (pmd_val(pmd) & _PAGE_MODIFIED)
diff --git a/arch/nios2/include/asm/pgtable.h b/arch/nios2/include/asm/pgtable.h
index 0f5c2564e9f5..cf1ffbc1a121 100644
--- a/arch/nios2/include/asm/pgtable.h
+++ b/arch/nios2/include/asm/pgtable.h
@@ -129,7 +129,7 @@ static inline pte_t pte_mkold(pte_t pte)
return pte;
}

-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{
pte_val(pte) |= _PAGE_WRITE;
return pte;
diff --git a/arch/openrisc/include/asm/pgtable.h b/arch/openrisc/include/asm/pgtable.h
index 3eb9b9555d0d..828820c74fc5 100644
--- a/arch/openrisc/include/asm/pgtable.h
+++ b/arch/openrisc/include/asm/pgtable.h
@@ -250,7 +250,7 @@ static inline pte_t pte_mkold(pte_t pte)
return pte;
}

-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{
pte_val(pte) |= _PAGE_WRITE;
return pte;
diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h
index e715df5385d6..79d1cef2fd7c 100644
--- a/arch/parisc/include/asm/pgtable.h
+++ b/arch/parisc/include/asm/pgtable.h
@@ -331,7 +331,7 @@ static inline pte_t pte_mkold(pte_t pte) { pte_val(pte) &= ~_PAGE_ACCESSED; retu
static inline pte_t pte_wrprotect(pte_t pte) { pte_val(pte) &= ~_PAGE_WRITE; return pte; }
static inline pte_t pte_mkdirty(pte_t pte) { pte_val(pte) |= _PAGE_DIRTY; return pte; }
static inline pte_t pte_mkyoung(pte_t pte) { pte_val(pte) |= _PAGE_ACCESSED; return pte; }
-static inline pte_t pte_mkwrite(pte_t pte) { pte_val(pte) |= _PAGE_WRITE; return pte; }
+static inline pte_t pte_mkwrite_novma(pte_t pte) { pte_val(pte) |= _PAGE_WRITE; return pte; }
static inline pte_t pte_mkspecial(pte_t pte) { pte_val(pte) |= _PAGE_SPECIAL; return pte; }

/*
diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 7bf1fe7297c6..67dfb674a4c1 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -498,7 +498,7 @@ static inline pte_t pte_mkpte(pte_t pte)
return pte;
}

-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{
return __pte(pte_val(pte) | _PAGE_RW);
}
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 4acc9690f599..0328d917494a 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -600,7 +600,7 @@ static inline pte_t pte_mkexec(pte_t pte)
return __pte_raw(pte_raw(pte) | cpu_to_be64(_PAGE_EXEC));
}

-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{
/*
* write implies read, hence set both
@@ -1071,7 +1071,7 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
#define pmd_mkdirty(pmd) pte_pmd(pte_mkdirty(pmd_pte(pmd)))
#define pmd_mkclean(pmd) pte_pmd(pte_mkclean(pmd_pte(pmd)))
#define pmd_mkyoung(pmd) pte_pmd(pte_mkyoung(pmd_pte(pmd)))
-#define pmd_mkwrite(pmd) pte_pmd(pte_mkwrite(pmd_pte(pmd)))
+#define pmd_mkwrite_novma(pmd) pte_pmd(pte_mkwrite_novma(pmd_pte(pmd)))

#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
#define pmd_soft_dirty(pmd) pte_soft_dirty(pmd_pte(pmd))
diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index fec56d965f00..33213b31fcbb 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -170,8 +170,8 @@ void unmap_kernel_page(unsigned long va);
#define pte_clear(mm, addr, ptep) \
do { pte_update(mm, addr, ptep, ~0, 0, 0); } while (0)

-#ifndef pte_mkwrite
-static inline pte_t pte_mkwrite(pte_t pte)
+#ifndef pte_mkwrite_novma
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{
return __pte(pte_val(pte) | _PAGE_RW);
}
diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
index 1a89ebdc3acc..21f681ee535a 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
@@ -101,12 +101,12 @@ static inline int pte_write(pte_t pte)

#define pte_write pte_write

-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{
return __pte(pte_val(pte) & ~_PAGE_RO);
}

-#define pte_mkwrite pte_mkwrite
+#define pte_mkwrite_novma pte_mkwrite_novma

static inline bool pte_user(pte_t pte)
{
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 287e25864ffa..abe4fd82721e 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -85,7 +85,7 @@
#ifndef __ASSEMBLY__
/* pte_clear moved to later in this file */

-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{
return __pte(pte_val(pte) | _PAGE_RW);
}
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 2258b27173b0..b38faec98154 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -379,7 +379,7 @@ static inline pte_t pte_wrprotect(pte_t pte)

/* static inline pte_t pte_mkread(pte_t pte) */

-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{
return __pte(pte_val(pte) | _PAGE_WRITE);
}
@@ -665,9 +665,9 @@ static inline pmd_t pmd_mkyoung(pmd_t pmd)
return pte_pmd(pte_mkyoung(pmd_pte(pmd)));
}

-static inline pmd_t pmd_mkwrite(pmd_t pmd)
+static inline pmd_t pmd_mkwrite_novma(pmd_t pmd)
{
- return pte_pmd(pte_mkwrite(pmd_pte(pmd)));
+ return pte_pmd(pte_mkwrite_novma(pmd_pte(pmd)));
}

static inline pmd_t pmd_wrprotect(pmd_t pmd)
diff --git a/arch/s390/include/asm/hugetlb.h b/arch/s390/include/asm/hugetlb.h
index ccdbccfde148..f07267875a19 100644
--- a/arch/s390/include/asm/hugetlb.h
+++ b/arch/s390/include/asm/hugetlb.h
@@ -104,7 +104,7 @@ static inline int huge_pte_dirty(pte_t pte)

static inline pte_t huge_pte_mkwrite(pte_t pte)
{
- return pte_mkwrite(pte);
+ return pte_mkwrite_novma(pte);
}

static inline pte_t huge_pte_mkdirty(pte_t pte)
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 6822a11c2c8a..699406036f30 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1005,7 +1005,7 @@ static inline pte_t pte_wrprotect(pte_t pte)
return set_pte_bit(pte, __pgprot(_PAGE_PROTECT));
}

-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{
pte = set_pte_bit(pte, __pgprot(_PAGE_WRITE));
if (pte_val(pte) & _PAGE_DIRTY)
@@ -1488,7 +1488,7 @@ static inline pmd_t pmd_wrprotect(pmd_t pmd)
return set_pmd_bit(pmd, __pgprot(_SEGMENT_ENTRY_PROTECT));
}

-static inline pmd_t pmd_mkwrite(pmd_t pmd)
+static inline pmd_t pmd_mkwrite_novma(pmd_t pmd)
{
pmd = set_pmd_bit(pmd, __pgprot(_SEGMENT_ENTRY_WRITE));
if (pmd_val(pmd) & _SEGMENT_ENTRY_DIRTY)
diff --git a/arch/sh/include/asm/pgtable_32.h b/arch/sh/include/asm/pgtable_32.h
index 21952b094650..165b4fd08152 100644
--- a/arch/sh/include/asm/pgtable_32.h
+++ b/arch/sh/include/asm/pgtable_32.h
@@ -359,11 +359,11 @@ static inline pte_t pte_##fn(pte_t pte) { pte.pte_##h op; return pte; }
* kernel permissions), we attempt to couple them a bit more sanely here.
*/
PTE_BIT_FUNC(high, wrprotect, &= ~(_PAGE_EXT_USER_WRITE | _PAGE_EXT_KERN_WRITE));
-PTE_BIT_FUNC(high, mkwrite, |= _PAGE_EXT_USER_WRITE | _PAGE_EXT_KERN_WRITE);
+PTE_BIT_FUNC(high, mkwrite_novma, |= _PAGE_EXT_USER_WRITE | _PAGE_EXT_KERN_WRITE);
PTE_BIT_FUNC(high, mkhuge, |= _PAGE_SZHUGE);
#else
PTE_BIT_FUNC(low, wrprotect, &= ~_PAGE_RW);
-PTE_BIT_FUNC(low, mkwrite, |= _PAGE_RW);
+PTE_BIT_FUNC(low, mkwrite_novma, |= _PAGE_RW);
PTE_BIT_FUNC(low, mkhuge, |= _PAGE_SZHUGE);
#endif

diff --git a/arch/sparc/include/asm/pgtable_32.h b/arch/sparc/include/asm/pgtable_32.h
index d4330e3c57a6..a2d909446539 100644
--- a/arch/sparc/include/asm/pgtable_32.h
+++ b/arch/sparc/include/asm/pgtable_32.h
@@ -241,7 +241,7 @@ static inline pte_t pte_mkold(pte_t pte)
return __pte(pte_val(pte) & ~SRMMU_REF);
}

-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{
return __pte(pte_val(pte) | SRMMU_WRITE);
}
diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h
index 5563efa1a19f..4dd4f6cdc670 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -517,7 +517,7 @@ static inline pte_t pte_mkclean(pte_t pte)
return __pte(val);
}

-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{
unsigned long val = pte_val(pte), mask;

@@ -772,11 +772,11 @@ static inline pmd_t pmd_mkyoung(pmd_t pmd)
return __pmd(pte_val(pte));
}

-static inline pmd_t pmd_mkwrite(pmd_t pmd)
+static inline pmd_t pmd_mkwrite_novma(pmd_t pmd)
{
pte_t pte = __pte(pmd_val(pmd));

- pte = pte_mkwrite(pte);
+ pte = pte_mkwrite_novma(pte);

return __pmd(pte_val(pte));
}
diff --git a/arch/um/include/asm/pgtable.h b/arch/um/include/asm/pgtable.h
index a70d1618eb35..46f59a8bc812 100644
--- a/arch/um/include/asm/pgtable.h
+++ b/arch/um/include/asm/pgtable.h
@@ -207,7 +207,7 @@ static inline pte_t pte_mkyoung(pte_t pte)
return(pte);
}

-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{
if (unlikely(pte_get_bits(pte, _PAGE_RW)))
return pte;
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 15ae4d6ba476..112e6060eafa 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -352,7 +352,7 @@ static inline pte_t pte_mkyoung(pte_t pte)
return pte_set_flags(pte, _PAGE_ACCESSED);
}

-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{
return pte_set_flags(pte, _PAGE_RW);
}
@@ -453,7 +453,7 @@ static inline pmd_t pmd_mkyoung(pmd_t pmd)
return pmd_set_flags(pmd, _PAGE_ACCESSED);
}

-static inline pmd_t pmd_mkwrite(pmd_t pmd)
+static inline pmd_t pmd_mkwrite_novma(pmd_t pmd)
{
return pmd_set_flags(pmd, _PAGE_RW);
}
diff --git a/arch/xtensa/include/asm/pgtable.h b/arch/xtensa/include/asm/pgtable.h
index fc7a14884c6c..27e3ae38a5de 100644
--- a/arch/xtensa/include/asm/pgtable.h
+++ b/arch/xtensa/include/asm/pgtable.h
@@ -262,7 +262,7 @@ static inline pte_t pte_mkdirty(pte_t pte)
{ pte_val(pte) |= _PAGE_DIRTY; return pte; }
static inline pte_t pte_mkyoung(pte_t pte)
{ pte_val(pte) |= _PAGE_ACCESSED; return pte; }
-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite_novma(pte_t pte)
{ pte_val(pte) |= _PAGE_WRITABLE; return pte; }

#define pgprot_noncached(prot) \
diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index d7f6335d3999..4da02798a00b 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -22,7 +22,7 @@ static inline unsigned long huge_pte_dirty(pte_t pte)

static inline pte_t huge_pte_mkwrite(pte_t pte)
{
- return pte_mkwrite(pte);
+ return pte_mkwrite_novma(pte);
}

#ifndef __HAVE_ARCH_HUGE_PTE_WRPROTECT
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index c5a51481bbb9..ae271a307584 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -507,6 +507,20 @@ extern pud_t pudp_huge_clear_flush(struct vm_area_struct *vma,
pud_t *pudp);
#endif

+#ifndef pte_mkwrite
+static inline pte_t pte_mkwrite(pte_t pte)
+{
+ return pte_mkwrite_novma(pte);
+}
+#endif
+
+#if defined(CONFIG_HAS_HUGE_PAGE) && !defined(pmd_mkwrite)
+static inline pmd_t pmd_mkwrite(pmd_t pmd)
+{
+ return pmd_mkwrite_novma(pmd);
+}
+#endif
+
#ifndef __HAVE_ARCH_PTEP_SET_WRPROTECT
struct mm_struct;
static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long address, pte_t *ptep)
--
2.34.1


2023-06-13 00:47:36

by Edgecombe, Rick P

[permalink] [raw]
Subject: [PATCH v9 13/42] x86/mm: Remove _PAGE_DIRTY from kernel RO pages

New processors that support Shadow Stack regard Write=0,Dirty=1 PTEs as
shadow stack pages.

In normal cases, it can be helpful to create Write=1 PTEs as also Dirty=1
if HW dirty tracking is not needed, because if the Dirty bit is not already
set the CPU has to set Dirty=1 when the memory gets written to. This
creates additional work for the CPU. So traditional wisdom was to simply
set the Dirty bit whenever you didn't care about it. However, it was never
really very helpful for read-only kernel memory.

When CR4.CET=1 and IA32_S_CET.SH_STK_EN=1, some instructions can write to
such supervisor memory. The kernel does not set IA32_S_CET.SH_STK_EN, so
avoiding kernel Write=0,Dirty=1 memory is not strictly needed for any
functional reason. But having Write=0,Dirty=1 kernel memory doesn't have
any functional benefit either, so to reduce ambiguity between shadow stack
and regular Write=0 pages, remove Dirty=1 from any kernel Write=0 PTEs.

Co-developed-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Rick Edgecombe <[email protected]>
Reviewed-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Tested-by: Pengfei Xu <[email protected]>
Tested-by: John Allen <[email protected]>
Tested-by: Kees Cook <[email protected]>
---
arch/x86/include/asm/pgtable_types.h | 8 +++++---
arch/x86/mm/pat/set_memory.c | 4 ++--
2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index ee6f8e57e115..26f07d6d5758 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -222,10 +222,12 @@ enum page_cache_mode {
#define _PAGE_TABLE_NOENC (__PP|__RW|_USR|___A| 0|___D| 0| 0)
#define _PAGE_TABLE (__PP|__RW|_USR|___A| 0|___D| 0| 0| _ENC)

-#define __PAGE_KERNEL_RO (__PP| 0| 0|___A|__NX|___D| 0|___G)
-#define __PAGE_KERNEL_ROX (__PP| 0| 0|___A| 0|___D| 0|___G)
+#define __PAGE_KERNEL_RO (__PP| 0| 0|___A|__NX| 0| 0|___G)
+#define __PAGE_KERNEL_ROX (__PP| 0| 0|___A| 0| 0| 0|___G)
+#define __PAGE_KERNEL (__PP|__RW| 0|___A|__NX|___D| 0|___G)
+#define __PAGE_KERNEL_EXEC (__PP|__RW| 0|___A| 0|___D| 0|___G)
#define __PAGE_KERNEL_NOCACHE (__PP|__RW| 0|___A|__NX|___D| 0|___G| __NC)
-#define __PAGE_KERNEL_VVAR (__PP| 0|_USR|___A|__NX|___D| 0|___G)
+#define __PAGE_KERNEL_VVAR (__PP| 0|_USR|___A|__NX| 0| 0|___G)
#define __PAGE_KERNEL_LARGE (__PP|__RW| 0|___A|__NX|___D|_PSE|___G)
#define __PAGE_KERNEL_LARGE_EXEC (__PP|__RW| 0|___A| 0|___D|_PSE|___G)
#define __PAGE_KERNEL_WP (__PP|__RW| 0|___A|__NX|___D| 0|___G| __WP)
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 7159cf787613..fc627acfe40e 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2073,12 +2073,12 @@ int set_memory_nx(unsigned long addr, int numpages)

int set_memory_ro(unsigned long addr, int numpages)
{
- return change_page_attr_clear(&addr, numpages, __pgprot(_PAGE_RW), 0);
+ return change_page_attr_clear(&addr, numpages, __pgprot(_PAGE_RW | _PAGE_DIRTY), 0);
}

int set_memory_rox(unsigned long addr, int numpages)
{
- pgprot_t clr = __pgprot(_PAGE_RW);
+ pgprot_t clr = __pgprot(_PAGE_RW | _PAGE_DIRTY);

if (__supported_pte_mask & _PAGE_NX)
clr.pgprot |= _PAGE_NX;
--
2.34.1


2023-06-13 00:48:00

by Edgecombe, Rick P

[permalink] [raw]
Subject: [PATCH v9 26/42] x86: Introduce userspace API for shadow stack

Add three new arch_prctl() handles:

- ARCH_SHSTK_ENABLE/DISABLE enables or disables the specified
feature. Returns 0 on success or a negative value on error.

- ARCH_SHSTK_LOCK prevents future disabling or enabling of the
specified feature. Returns 0 on success or a negative value
on error.

The features are handled per-thread and inherited over fork(2)/clone(2),
but reset on exec().

Co-developed-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Rick Edgecombe <[email protected]>
Reviewed-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Tested-by: Pengfei Xu <[email protected]>
Tested-by: John Allen <[email protected]>
Tested-by: Kees Cook <[email protected]>
---
arch/x86/include/asm/processor.h | 6 +++++
arch/x86/include/asm/shstk.h | 21 +++++++++++++++
arch/x86/include/uapi/asm/prctl.h | 6 +++++
arch/x86/kernel/Makefile | 2 ++
arch/x86/kernel/process_64.c | 6 +++++
arch/x86/kernel/shstk.c | 44 +++++++++++++++++++++++++++++++
6 files changed, 85 insertions(+)
create mode 100644 arch/x86/include/asm/shstk.h
create mode 100644 arch/x86/kernel/shstk.c

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a1e4fa58b357..407d5551b6a7 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -28,6 +28,7 @@ struct vm86;
#include <asm/unwind_hints.h>
#include <asm/vmxfeatures.h>
#include <asm/vdso/processor.h>
+#include <asm/shstk.h>

#include <linux/personality.h>
#include <linux/cache.h>
@@ -475,6 +476,11 @@ struct thread_struct {
*/
u32 pkru;

+#ifdef CONFIG_X86_USER_SHADOW_STACK
+ unsigned long features;
+ unsigned long features_locked;
+#endif
+
/* Floating point and extended processor state */
struct fpu fpu;
/*
diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h
new file mode 100644
index 000000000000..ec753809f074
--- /dev/null
+++ b/arch/x86/include/asm/shstk.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_SHSTK_H
+#define _ASM_X86_SHSTK_H
+
+#ifndef __ASSEMBLY__
+#include <linux/types.h>
+
+struct task_struct;
+
+#ifdef CONFIG_X86_USER_SHADOW_STACK
+long shstk_prctl(struct task_struct *task, int option, unsigned long features);
+void reset_thread_features(void);
+#else
+static inline long shstk_prctl(struct task_struct *task, int option,
+ unsigned long arg2) { return -EINVAL; }
+static inline void reset_thread_features(void) {}
+#endif /* CONFIG_X86_USER_SHADOW_STACK */
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_X86_SHSTK_H */
diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h
index e8d7ebbca1a4..1cd44ecc9ce0 100644
--- a/arch/x86/include/uapi/asm/prctl.h
+++ b/arch/x86/include/uapi/asm/prctl.h
@@ -23,9 +23,15 @@
#define ARCH_MAP_VDSO_32 0x2002
#define ARCH_MAP_VDSO_64 0x2003

+/* Don't use 0x3001-0x3004 because of old glibcs */
+
#define ARCH_GET_UNTAG_MASK 0x4001
#define ARCH_ENABLE_TAGGED_ADDR 0x4002
#define ARCH_GET_MAX_TAG_BITS 0x4003
#define ARCH_FORCE_TAGGED_SVA 0x4004

+#define ARCH_SHSTK_ENABLE 0x5001
+#define ARCH_SHSTK_DISABLE 0x5002
+#define ARCH_SHSTK_LOCK 0x5003
+
#endif /* _ASM_X86_PRCTL_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index abee0564b750..6b6bf47652ee 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -147,6 +147,8 @@ obj-$(CONFIG_CALL_THUNKS) += callthunks.o

obj-$(CONFIG_X86_CET) += cet.o

+obj-$(CONFIG_X86_USER_SHADOW_STACK) += shstk.o
+
###
# 64 bit specific files
ifeq ($(CONFIG_X86_64),y)
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 3d181c16a2f6..0f89aa0186d1 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -515,6 +515,8 @@ start_thread_common(struct pt_regs *regs, unsigned long new_ip,
load_gs_index(__USER_DS);
}

+ reset_thread_features();
+
loadsegment(fs, 0);
loadsegment(es, _ds);
loadsegment(ds, _ds);
@@ -894,6 +896,10 @@ long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2)
else
return put_user(LAM_U57_BITS, (unsigned long __user *)arg2);
#endif
+ case ARCH_SHSTK_ENABLE:
+ case ARCH_SHSTK_DISABLE:
+ case ARCH_SHSTK_LOCK:
+ return shstk_prctl(task, option, arg2);
default:
ret = -EINVAL;
break;
diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
new file mode 100644
index 000000000000..41ed6552e0a5
--- /dev/null
+++ b/arch/x86/kernel/shstk.c
@@ -0,0 +1,44 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * shstk.c - Intel shadow stack support
+ *
+ * Copyright (c) 2021, Intel Corporation.
+ * Yu-cheng Yu <[email protected]>
+ */
+
+#include <linux/sched.h>
+#include <linux/bitops.h>
+#include <asm/prctl.h>
+
+void reset_thread_features(void)
+{
+ current->thread.features = 0;
+ current->thread.features_locked = 0;
+}
+
+long shstk_prctl(struct task_struct *task, int option, unsigned long features)
+{
+ if (option == ARCH_SHSTK_LOCK) {
+ task->thread.features_locked |= features;
+ return 0;
+ }
+
+ /* Don't allow via ptrace */
+ if (task != current)
+ return -EINVAL;
+
+ /* Do not allow to change locked features */
+ if (features & task->thread.features_locked)
+ return -EPERM;
+
+ /* Only support enabling/disabling one feature at a time. */
+ if (hweight_long(features) > 1)
+ return -EINVAL;
+
+ if (option == ARCH_SHSTK_DISABLE) {
+ return -EINVAL;
+ }
+
+ /* Handle ARCH_SHSTK_ENABLE */
+ return -EINVAL;
+}
--
2.34.1


2023-06-13 00:48:07

by Edgecombe, Rick P

[permalink] [raw]
Subject: [PATCH v9 18/42] x86/mm: Warn if create Write=0,Dirty=1 with raw prot

When user shadow stack is in use, Write=0,Dirty=1 is treated by the CPU as
shadow stack memory. So for shadow stack memory this bit combination is
valid, but when Dirty=1,Write=1 (conventionally writable) memory is being
write protected, the kernel has been taught to transition the Dirty=1
bit to SavedDirty=1, to avoid inadvertently creating shadow stack
memory. It does this inside pte_wrprotect() because it knows the PTE is
not intended to be a writable shadow stack entry, it is supposed to be
write protected.

However, when a PTE is created by a raw prot using mk_pte(), mk_pte()
can't know whether to adjust Dirty=1 to SavedDirty=1. It can't
distinguish between the caller intending to create a shadow stack PTE or
needing the SavedDirty shift.

The kernel has been updated to not do this, and so Write=0,Dirty=1
memory should only be created by the pte_mkfoo() helpers. Add a warning
to make sure no new mk_pte() start doing this, like, for example,
set_memory_rox() did.

Signed-off-by: Rick Edgecombe <[email protected]>
Tested-by: Pengfei Xu <[email protected]>
Tested-by: John Allen <[email protected]>
Tested-by: Kees Cook <[email protected]>
---
v9:
- Always do the check since 32 bit now supports SavedDirty
---
arch/x86/include/asm/pgtable.h | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 89cfa93d0ad6..5383f7282f89 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1032,7 +1032,14 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
* (Currently stuck as a macro because of indirect forward reference
* to linux/mm.h:page_to_nid())
*/
-#define mk_pte(page, pgprot) pfn_pte(page_to_pfn(page), (pgprot))
+#define mk_pte(page, pgprot) \
+({ \
+ pgprot_t __pgprot = pgprot; \
+ \
+ WARN_ON_ONCE((pgprot_val(__pgprot) & (_PAGE_DIRTY | _PAGE_RW)) == \
+ _PAGE_DIRTY); \
+ pfn_pte(page_to_pfn(page), __pgprot); \
+})

static inline int pmd_bad(pmd_t pmd)
{
--
2.34.1


2023-06-13 00:48:08

by Edgecombe, Rick P

[permalink] [raw]
Subject: [PATCH v9 27/42] x86/shstk: Add user control-protection fault handler

A control-protection fault is triggered when a control-flow transfer
attempt violates Shadow Stack or Indirect Branch Tracking constraints.
For example, the return address for a RET instruction differs from the copy
on the shadow stack.

There already exists a control-protection fault handler for handling kernel
IBT faults. Refactor this fault handler into separate user and kernel
handlers, like the page fault handler. Add a control-protection handler
for usermode. To avoid ifdeffery, put them both in a new file cet.c, which
is compiled in the case of either of the two CET features supported in the
kernel: kernel IBT or user mode shadow stack. Move some static inline
functions from traps.c into a header so they can be used in cet.c.

Opportunistically fix a comment in the kernel IBT part of the fault
handler that is on the end of the line instead of preceding it.

Keep the same behavior for the kernel side of the fault handler, except for
converting a BUG to a WARN in the case of a #CP happening when the feature
is missing. This unifies the behavior with the new shadow stack code, and
also prevents the kernel from crashing under this situation which is
potentially recoverable.

The control-protection fault handler works in a similar way as the general
protection fault handler. It provides the si_code SEGV_CPERR to the signal
handler.

Co-developed-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Rick Edgecombe <[email protected]>
Reviewed-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Tested-by: Pengfei Xu <[email protected]>
Tested-by: John Allen <[email protected]>
Tested-by: Kees Cook <[email protected]>
---
arch/arm/kernel/signal.c | 2 +-
arch/arm64/kernel/signal.c | 2 +-
arch/arm64/kernel/signal32.c | 2 +-
arch/sparc/kernel/signal32.c | 2 +-
arch/sparc/kernel/signal_64.c | 2 +-
arch/x86/include/asm/disabled-features.h | 8 +-
arch/x86/include/asm/idtentry.h | 2 +-
arch/x86/include/asm/traps.h | 12 +++
arch/x86/kernel/cet.c | 94 +++++++++++++++++++++---
arch/x86/kernel/idt.c | 2 +-
arch/x86/kernel/signal_32.c | 2 +-
arch/x86/kernel/signal_64.c | 2 +-
arch/x86/kernel/traps.c | 12 ---
arch/x86/xen/enlighten_pv.c | 2 +-
arch/x86/xen/xen-asm.S | 2 +-
include/uapi/asm-generic/siginfo.h | 3 +-
16 files changed, 117 insertions(+), 34 deletions(-)

diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
index e07f359254c3..9a3c9de5ac5e 100644
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -681,7 +681,7 @@ asmlinkage void do_rseq_syscall(struct pt_regs *regs)
*/
static_assert(NSIGILL == 11);
static_assert(NSIGFPE == 15);
-static_assert(NSIGSEGV == 9);
+static_assert(NSIGSEGV == 10);
static_assert(NSIGBUS == 5);
static_assert(NSIGTRAP == 6);
static_assert(NSIGCHLD == 6);
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 2cfc810d0a5b..06d31731c8ed 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -1343,7 +1343,7 @@ void __init minsigstksz_setup(void)
*/
static_assert(NSIGILL == 11);
static_assert(NSIGFPE == 15);
-static_assert(NSIGSEGV == 9);
+static_assert(NSIGSEGV == 10);
static_assert(NSIGBUS == 5);
static_assert(NSIGTRAP == 6);
static_assert(NSIGCHLD == 6);
diff --git a/arch/arm64/kernel/signal32.c b/arch/arm64/kernel/signal32.c
index 4700f8522d27..bbd542704730 100644
--- a/arch/arm64/kernel/signal32.c
+++ b/arch/arm64/kernel/signal32.c
@@ -460,7 +460,7 @@ void compat_setup_restart_syscall(struct pt_regs *regs)
*/
static_assert(NSIGILL == 11);
static_assert(NSIGFPE == 15);
-static_assert(NSIGSEGV == 9);
+static_assert(NSIGSEGV == 10);
static_assert(NSIGBUS == 5);
static_assert(NSIGTRAP == 6);
static_assert(NSIGCHLD == 6);
diff --git a/arch/sparc/kernel/signal32.c b/arch/sparc/kernel/signal32.c
index dad38960d1a8..82da8a2d769d 100644
--- a/arch/sparc/kernel/signal32.c
+++ b/arch/sparc/kernel/signal32.c
@@ -751,7 +751,7 @@ asmlinkage int do_sys32_sigstack(u32 u_ssptr, u32 u_ossptr, unsigned long sp)
*/
static_assert(NSIGILL == 11);
static_assert(NSIGFPE == 15);
-static_assert(NSIGSEGV == 9);
+static_assert(NSIGSEGV == 10);
static_assert(NSIGBUS == 5);
static_assert(NSIGTRAP == 6);
static_assert(NSIGCHLD == 6);
diff --git a/arch/sparc/kernel/signal_64.c b/arch/sparc/kernel/signal_64.c
index 570e43e6fda5..b4e410976e0d 100644
--- a/arch/sparc/kernel/signal_64.c
+++ b/arch/sparc/kernel/signal_64.c
@@ -562,7 +562,7 @@ void do_notify_resume(struct pt_regs *regs, unsigned long orig_i0, unsigned long
*/
static_assert(NSIGILL == 11);
static_assert(NSIGFPE == 15);
-static_assert(NSIGSEGV == 9);
+static_assert(NSIGSEGV == 10);
static_assert(NSIGBUS == 5);
static_assert(NSIGTRAP == 6);
static_assert(NSIGCHLD == 6);
diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index b9c7eae2e70f..702d93fdd10e 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -111,6 +111,12 @@
#define DISABLE_USER_SHSTK (1 << (X86_FEATURE_USER_SHSTK & 31))
#endif

+#ifdef CONFIG_X86_KERNEL_IBT
+#define DISABLE_IBT 0
+#else
+#define DISABLE_IBT (1 << (X86_FEATURE_IBT & 31))
+#endif
+
/*
* Make sure to add features to the correct mask
*/
@@ -134,7 +140,7 @@
#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP| \
DISABLE_ENQCMD)
#define DISABLED_MASK17 0
-#define DISABLED_MASK18 0
+#define DISABLED_MASK18 (DISABLE_IBT)
#define DISABLED_MASK19 0
#define DISABLED_MASK20 0
#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 21)
diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index b241af4ce9b4..61e0e6301f09 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -614,7 +614,7 @@ DECLARE_IDTENTRY_RAW_ERRORCODE(X86_TRAP_DF, xenpv_exc_double_fault);
#endif

/* #CP */
-#ifdef CONFIG_X86_KERNEL_IBT
+#ifdef CONFIG_X86_CET
DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_CP, exc_control_protection);
#endif

diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 47ecfff2c83d..75e0dabf0c45 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -47,4 +47,16 @@ void __noreturn handle_stack_overflow(struct pt_regs *regs,
struct stack_info *info);
#endif

+static inline void cond_local_irq_enable(struct pt_regs *regs)
+{
+ if (regs->flags & X86_EFLAGS_IF)
+ local_irq_enable();
+}
+
+static inline void cond_local_irq_disable(struct pt_regs *regs)
+{
+ if (regs->flags & X86_EFLAGS_IF)
+ local_irq_disable();
+}
+
#endif /* _ASM_X86_TRAPS_H */
diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c
index 7ad22b705b64..cc10d8be9d74 100644
--- a/arch/x86/kernel/cet.c
+++ b/arch/x86/kernel/cet.c
@@ -4,10 +4,6 @@
#include <asm/bugs.h>
#include <asm/traps.h>

-static __ro_after_init bool ibt_fatal = true;
-
-extern void ibt_selftest_ip(void); /* code label defined in asm below */
-
enum cp_error_code {
CP_EC = (1 << 15) - 1,

@@ -20,15 +16,80 @@ enum cp_error_code {
CP_ENCL = 1 << 15,
};

-DEFINE_IDTENTRY_ERRORCODE(exc_control_protection)
+static const char cp_err[][10] = {
+ [0] = "unknown",
+ [1] = "near ret",
+ [2] = "far/iret",
+ [3] = "endbranch",
+ [4] = "rstorssp",
+ [5] = "setssbsy",
+};
+
+static const char *cp_err_string(unsigned long error_code)
{
- if (!cpu_feature_enabled(X86_FEATURE_IBT)) {
- pr_err("Unexpected #CP\n");
- BUG();
+ unsigned int cpec = error_code & CP_EC;
+
+ if (cpec >= ARRAY_SIZE(cp_err))
+ cpec = 0;
+ return cp_err[cpec];
+}
+
+static void do_unexpected_cp(struct pt_regs *regs, unsigned long error_code)
+{
+ WARN_ONCE(1, "Unexpected %s #CP, error_code: %s\n",
+ user_mode(regs) ? "user mode" : "kernel mode",
+ cp_err_string(error_code));
+}
+
+static DEFINE_RATELIMIT_STATE(cpf_rate, DEFAULT_RATELIMIT_INTERVAL,
+ DEFAULT_RATELIMIT_BURST);
+
+static void do_user_cp_fault(struct pt_regs *regs, unsigned long error_code)
+{
+ struct task_struct *tsk;
+ unsigned long ssp;
+
+ /*
+ * An exception was just taken from userspace. Since interrupts are disabled
+ * here, no scheduling should have messed with the registers yet and they
+ * will be whatever is live in userspace. So read the SSP before enabling
+ * interrupts so locking the fpregs to do it later is not required.
+ */
+ rdmsrl(MSR_IA32_PL3_SSP, ssp);
+
+ cond_local_irq_enable(regs);
+
+ tsk = current;
+ tsk->thread.error_code = error_code;
+ tsk->thread.trap_nr = X86_TRAP_CP;
+
+ /* Ratelimit to prevent log spamming. */
+ if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV) &&
+ __ratelimit(&cpf_rate)) {
+ pr_emerg("%s[%d] control protection ip:%lx sp:%lx ssp:%lx error:%lx(%s)%s",
+ tsk->comm, task_pid_nr(tsk),
+ regs->ip, regs->sp, ssp, error_code,
+ cp_err_string(error_code),
+ error_code & CP_ENCL ? " in enclave" : "");
+ print_vma_addr(KERN_CONT " in ", regs->ip);
+ pr_cont("\n");
}

- if (WARN_ON_ONCE(user_mode(regs) || (error_code & CP_EC) != CP_ENDBR))
+ force_sig_fault(SIGSEGV, SEGV_CPERR, (void __user *)0);
+ cond_local_irq_disable(regs);
+}
+
+static __ro_after_init bool ibt_fatal = true;
+
+/* code label defined in asm below */
+extern void ibt_selftest_ip(void);
+
+static void do_kernel_cp_fault(struct pt_regs *regs, unsigned long error_code)
+{
+ if ((error_code & CP_EC) != CP_ENDBR) {
+ do_unexpected_cp(regs, error_code);
return;
+ }

if (unlikely(regs->ip == (unsigned long)&ibt_selftest_ip)) {
regs->ax = 0;
@@ -74,3 +135,18 @@ static int __init ibt_setup(char *str)
}

__setup("ibt=", ibt_setup);
+
+DEFINE_IDTENTRY_ERRORCODE(exc_control_protection)
+{
+ if (user_mode(regs)) {
+ if (cpu_feature_enabled(X86_FEATURE_USER_SHSTK))
+ do_user_cp_fault(regs, error_code);
+ else
+ do_unexpected_cp(regs, error_code);
+ } else {
+ if (cpu_feature_enabled(X86_FEATURE_IBT))
+ do_kernel_cp_fault(regs, error_code);
+ else
+ do_unexpected_cp(regs, error_code);
+ }
+}
diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
index a58c6bc1cd68..5074b8420359 100644
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -107,7 +107,7 @@ static const __initconst struct idt_data def_idts[] = {
ISTG(X86_TRAP_MC, asm_exc_machine_check, IST_INDEX_MCE),
#endif

-#ifdef CONFIG_X86_KERNEL_IBT
+#ifdef CONFIG_X86_CET
INTG(X86_TRAP_CP, asm_exc_control_protection),
#endif

diff --git a/arch/x86/kernel/signal_32.c b/arch/x86/kernel/signal_32.c
index 9027fc088f97..c12624bc82a3 100644
--- a/arch/x86/kernel/signal_32.c
+++ b/arch/x86/kernel/signal_32.c
@@ -402,7 +402,7 @@ int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
*/
static_assert(NSIGILL == 11);
static_assert(NSIGFPE == 15);
-static_assert(NSIGSEGV == 9);
+static_assert(NSIGSEGV == 10);
static_assert(NSIGBUS == 5);
static_assert(NSIGTRAP == 6);
static_assert(NSIGCHLD == 6);
diff --git a/arch/x86/kernel/signal_64.c b/arch/x86/kernel/signal_64.c
index 13a1e6083837..0e808c72bf7e 100644
--- a/arch/x86/kernel/signal_64.c
+++ b/arch/x86/kernel/signal_64.c
@@ -403,7 +403,7 @@ void sigaction_compat_abi(struct k_sigaction *act, struct k_sigaction *oact)
*/
static_assert(NSIGILL == 11);
static_assert(NSIGFPE == 15);
-static_assert(NSIGSEGV == 9);
+static_assert(NSIGSEGV == 10);
static_assert(NSIGBUS == 5);
static_assert(NSIGTRAP == 6);
static_assert(NSIGCHLD == 6);
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 6f666dfa97de..f358350624b2 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -77,18 +77,6 @@

DECLARE_BITMAP(system_vectors, NR_VECTORS);

-static inline void cond_local_irq_enable(struct pt_regs *regs)
-{
- if (regs->flags & X86_EFLAGS_IF)
- local_irq_enable();
-}
-
-static inline void cond_local_irq_disable(struct pt_regs *regs)
-{
- if (regs->flags & X86_EFLAGS_IF)
- local_irq_disable();
-}
-
__always_inline int is_valid_bugaddr(unsigned long addr)
{
if (addr < TASK_SIZE_MAX)
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 093b78c8bbec..5a034a994682 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -640,7 +640,7 @@ static struct trap_array_entry trap_array[] = {
TRAP_ENTRY(exc_coprocessor_error, false ),
TRAP_ENTRY(exc_alignment_check, false ),
TRAP_ENTRY(exc_simd_coprocessor_error, false ),
-#ifdef CONFIG_X86_KERNEL_IBT
+#ifdef CONFIG_X86_CET
TRAP_ENTRY(exc_control_protection, false ),
#endif
};
diff --git a/arch/x86/xen/xen-asm.S b/arch/x86/xen/xen-asm.S
index 08f1ceb9eb81..9e5e68008785 100644
--- a/arch/x86/xen/xen-asm.S
+++ b/arch/x86/xen/xen-asm.S
@@ -148,7 +148,7 @@ xen_pv_trap asm_exc_page_fault
xen_pv_trap asm_exc_spurious_interrupt_bug
xen_pv_trap asm_exc_coprocessor_error
xen_pv_trap asm_exc_alignment_check
-#ifdef CONFIG_X86_KERNEL_IBT
+#ifdef CONFIG_X86_CET
xen_pv_trap asm_exc_control_protection
#endif
#ifdef CONFIG_X86_MCE
diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
index ffbe4cec9f32..0f52d0ac47c5 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -242,7 +242,8 @@ typedef struct siginfo {
#define SEGV_ADIPERR 7 /* Precise MCD exception */
#define SEGV_MTEAERR 8 /* Asynchronous ARM MTE error */
#define SEGV_MTESERR 9 /* Synchronous ARM MTE exception */
-#define NSIGSEGV 9
+#define SEGV_CPERR 10 /* Control protection fault */
+#define NSIGSEGV 10

/*
* SIGBUS si_codes
--
2.34.1


2023-06-13 00:48:22

by Edgecombe, Rick P

[permalink] [raw]
Subject: [PATCH v9 06/42] x86/shstk: Add Kconfig option for shadow stack

Shadow stack provides protection for applications against function return
address corruption. It is active when the processor supports it, the
kernel has CONFIG_X86_SHADOW_STACK enabled, and the application is built
for the feature. This is only implemented for the 64-bit kernel. When it
is enabled, legacy non-shadow stack applications continue to work, but
without protection.

Since there is another feature that utilizes CET (Kernel IBT) that will
share implementation with shadow stacks, create CONFIG_CET to signify
that at least one CET feature is configured.

Co-developed-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Rick Edgecombe <[email protected]>
Reviewed-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Tested-by: Pengfei Xu <[email protected]>
Tested-by: John Allen <[email protected]>
Tested-by: Kees Cook <[email protected]>
---
arch/x86/Kconfig | 24 ++++++++++++++++++++++++
arch/x86/Kconfig.assembler | 5 +++++
2 files changed, 29 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 53bab123a8ee..ce460d6b4e25 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1852,6 +1852,11 @@ config CC_HAS_IBT
(CC_IS_CLANG && CLANG_VERSION >= 140000)) && \
$(as-instr,endbr64)

+config X86_CET
+ def_bool n
+ help
+ CET features configured (Shadow stack or IBT)
+
config X86_KERNEL_IBT
prompt "Indirect Branch Tracking"
def_bool y
@@ -1859,6 +1864,7 @@ config X86_KERNEL_IBT
# https://github.com/llvm/llvm-project/commit/9d7001eba9c4cb311e03cd8cdc231f9e579f2d0f
depends on !LD_IS_LLD || LLD_VERSION >= 140000
select OBJTOOL
+ select X86_CET
help
Build the kernel with support for Indirect Branch Tracking, a
hardware support course-grain forward-edge Control Flow Integrity
@@ -1952,6 +1958,24 @@ config X86_SGX

If unsure, say N.

+config X86_USER_SHADOW_STACK
+ bool "X86 userspace shadow stack"
+ depends on AS_WRUSS
+ depends on X86_64
+ select ARCH_USES_HIGH_VMA_FLAGS
+ select X86_CET
+ help
+ Shadow stack protection is a hardware feature that detects function
+ return address corruption. This helps mitigate ROP attacks.
+ Applications must be enabled to use it, and old userspace does not
+ get protection "for free".
+
+ CPUs supporting shadow stacks were first released in 2020.
+
+ See Documentation/x86/shstk.rst for more information.
+
+ If unsure, say N.
+
config EFI
bool "EFI runtime service support"
depends on ACPI
diff --git a/arch/x86/Kconfig.assembler b/arch/x86/Kconfig.assembler
index b88f784cb02e..8ad41da301e5 100644
--- a/arch/x86/Kconfig.assembler
+++ b/arch/x86/Kconfig.assembler
@@ -24,3 +24,8 @@ config AS_GFNI
def_bool $(as-instr,vgf2p8mulb %xmm0$(comma)%xmm1$(comma)%xmm2)
help
Supported by binutils >= 2.30 and LLVM integrated assembler
+
+config AS_WRUSS
+ def_bool $(as-instr,wrussq %rax$(comma)(%rbx))
+ help
+ Supported by binutils >= 2.31 and LLVM integrated assembler
--
2.34.1


2023-06-13 00:48:22

by Edgecombe, Rick P

[permalink] [raw]
Subject: [PATCH v9 41/42] x86/shstk: Add ARCH_SHSTK_UNLOCK

From: Mike Rapoport <[email protected]>

Userspace loaders may lock features before a CRIU restore operation has
the chance to set them to whatever state is required by the process
being restored. Allow a way for CRIU to unlock features. Add it as an
arch_prctl() like the other shadow stack operations, but restrict it being
called by the ptrace arch_pctl() interface.

[Merged into recent API changes, added commit log and docs]

Signed-off-by: Mike Rapoport <[email protected]>
Signed-off-by: Rick Edgecombe <[email protected]>
Reviewed-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Tested-by: Pengfei Xu <[email protected]>
Tested-by: John Allen <[email protected]>
Tested-by: Kees Cook <[email protected]>
---
Documentation/arch/x86/shstk.rst | 4 ++++
arch/x86/include/uapi/asm/prctl.h | 1 +
arch/x86/kernel/process_64.c | 1 +
arch/x86/kernel/shstk.c | 9 +++++++--
4 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/Documentation/arch/x86/shstk.rst b/Documentation/arch/x86/shstk.rst
index f09afa504ec0..f3553cc8c758 100644
--- a/Documentation/arch/x86/shstk.rst
+++ b/Documentation/arch/x86/shstk.rst
@@ -75,6 +75,10 @@ arch_prctl(ARCH_SHSTK_LOCK, unsigned long features)
are ignored. The mask is ORed with the existing value. So any feature bits
set here cannot be enabled or disabled afterwards.

+arch_prctl(ARCH_SHSTK_UNLOCK, unsigned long features)
+ Unlock features. 'features' is a mask of all features to unlock. All
+ bits set are processed, unset bits are ignored. Only works via ptrace.
+
The return values are as follows. On success, return 0. On error, errno can
be::

diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h
index eedfde3b63be..3189c4a96468 100644
--- a/arch/x86/include/uapi/asm/prctl.h
+++ b/arch/x86/include/uapi/asm/prctl.h
@@ -33,6 +33,7 @@
#define ARCH_SHSTK_ENABLE 0x5001
#define ARCH_SHSTK_DISABLE 0x5002
#define ARCH_SHSTK_LOCK 0x5003
+#define ARCH_SHSTK_UNLOCK 0x5004

/* ARCH_SHSTK_ features bits */
#define ARCH_SHSTK_SHSTK (1ULL << 0)
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 0f89aa0186d1..e6db21c470aa 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -899,6 +899,7 @@ long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2)
case ARCH_SHSTK_ENABLE:
case ARCH_SHSTK_DISABLE:
case ARCH_SHSTK_LOCK:
+ case ARCH_SHSTK_UNLOCK:
return shstk_prctl(task, option, arg2);
default:
ret = -EINVAL;
diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
index d723cdc93474..d43b7a9c57ce 100644
--- a/arch/x86/kernel/shstk.c
+++ b/arch/x86/kernel/shstk.c
@@ -489,9 +489,14 @@ long shstk_prctl(struct task_struct *task, int option, unsigned long features)
return 0;
}

- /* Don't allow via ptrace */
- if (task != current)
+ /* Only allow via ptrace */
+ if (task != current) {
+ if (option == ARCH_SHSTK_UNLOCK && IS_ENABLED(CONFIG_CHECKPOINT_RESTORE)) {
+ task->thread.features_locked &= ~features;
+ return 0;
+ }
return -EINVAL;
+ }

/* Do not allow to change locked features */
if (features & task->thread.features_locked)
--
2.34.1


2023-06-13 00:48:40

by Edgecombe, Rick P

[permalink] [raw]
Subject: [PATCH v9 33/42] x86/shstk: Check that signal frame is shadow stack mem

The shadow stack signal frame is read by the kernel on sigreturn. It
relies on shadow stack memory protections to prevent forgeries of this
signal frame (which included the pre-signal SSP). This behavior helps
userspace protect itself. However, using the INCSSP instruction userspace
can adjust the SSP to 8 bytes beyond the end of a shadow stack. INCSSP
performs shadow stack reads to make sure it doesn’t increment off of the
shadow stack, but on the end position it actually reads 8 bytes below the
new SSP.

For the shadow stack HW operations, this situation (INCSSP off the end
of a shadow stack by 8 bytes) would be fine. If the a RET is executed, the
push to the shadow stack would fail to write to the shadow stack. If a
CALL is executed, the SSP will be incremented back onto the stack and the
return address will be written successfully to the very end. That is
expected behavior around shadow stack underflow.

However, the kernel doesn’t have a way to read shadow stack memory using
shadow stack accesses. WRUSS can write to shadow stack memory with a
shadow stack access which ensures the access is to shadow stack memory.
But unfortunately for this case, there is no equivalent instruction for
shadow stack reads. So when reading the shadow stack signal frames, the
kernel currently assumes the SSP is pointing to the shadow stack and uses
a normal read.

The SSP pointing to shadow stack memory will be true in most cases, but as
described above, in can be untrue by 8 bytes. So lookup the VMA of the
shadow stack sigframe being read to verify it is shadow stack.

Since the SSP can only be beyond the shadow stack by 8 bytes, and
shadow stack memory is page aligned, this check only needs to be done
when this type of relative position to a page boundary is encountered.
So skip the extra work otherwise.

Signed-off-by: Rick Edgecombe <[email protected]>
---
v9:
- New patch
---
arch/x86/kernel/shstk.c | 31 +++++++++++++++++++++++++++++--
1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
index a8705f7d966c..50733a510446 100644
--- a/arch/x86/kernel/shstk.c
+++ b/arch/x86/kernel/shstk.c
@@ -249,15 +249,38 @@ static int shstk_push_sigframe(unsigned long *ssp)

static int shstk_pop_sigframe(unsigned long *ssp)
{
+ struct vm_area_struct *vma;
unsigned long token_addr;
- int err;
+ bool need_to_check_vma;
+ int err = 1;

+ /*
+ * It is possible for the SSP to be off the end of a shadow stack by 4
+ * or 8 bytes. If the shadow stack is at the start of a page or 4 bytes
+ * before it, it might be this case, so check that the address being
+ * read is actually shadow stack.
+ */
if (!IS_ALIGNED(*ssp, 8))
return -EINVAL;

+ need_to_check_vma = PAGE_ALIGN(*ssp) == *ssp;
+
+ if (need_to_check_vma)
+ mmap_read_lock_killable(current->mm);
+
err = get_shstk_data(&token_addr, (unsigned long __user *)*ssp);
if (unlikely(err))
- return err;
+ goto out_err;
+
+ if (need_to_check_vma) {
+ vma = find_vma(current->mm, *ssp);
+ if (!vma || !(vma->vm_flags & VM_SHADOW_STACK)) {
+ err = -EFAULT;
+ goto out_err;
+ }
+
+ mmap_read_unlock(current->mm);
+ }

/* Restore SSP aligned? */
if (unlikely(!IS_ALIGNED(token_addr, 8)))
@@ -270,6 +293,10 @@ static int shstk_pop_sigframe(unsigned long *ssp)
*ssp = token_addr;

return 0;
+out_err:
+ if (need_to_check_vma)
+ mmap_read_unlock(current->mm);
+ return err;
}

int setup_signal_shadow_stack(struct ksignal *ksig)
--
2.34.1


2023-06-13 00:48:48

by Edgecombe, Rick P

[permalink] [raw]
Subject: [PATCH v9 17/42] mm: Warn on shadow stack memory in wrong vma

The x86 Control-flow Enforcement Technology (CET) feature includes a new
type of memory called shadow stack. This shadow stack memory has some
unusual properties, which requires some core mm changes to function
properly.

One sharp edge is that PTEs that are both Write=0 and Dirty=1 are
treated as shadow by the CPU, but this combination used to be created by
the kernel on x86. Previous patches have changed the kernel to now avoid
creating these PTEs unless they are for shadow stack memory. In case any
missed corners of the kernel are still creating PTEs like this for
non-shadow stack memory, and to catch any re-introductions of the logic,
warn if any shadow stack PTEs (Write=0, Dirty=1) are found in non-shadow
stack VMAs when they are being zapped. This won't catch transient cases
but should have decent coverage.

In order to check if a PTE is shadow stack in core mm code, add two arch
breakouts arch_check_zapped_pte/pmd(). This will allow shadow stack
specific code to be kept in arch/x86.

Only do the check if shadow stack is supported by the CPU and configured
because in rare cases older CPUs may write Dirty=1 to a Write=0 CPU on
older CPUs. This check is handled in pte_shstk()/pmd_shstk().

Signed-off-by: Rick Edgecombe <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Tested-by: Pengfei Xu <[email protected]>
Tested-by: John Allen <[email protected]>
Tested-by: Kees Cook <[email protected]>
---
v9:
- Add comments about not doing the check on non-shadow stack CPUs
---
arch/x86/include/asm/pgtable.h | 6 ++++++
arch/x86/mm/pgtable.c | 20 ++++++++++++++++++++
include/linux/pgtable.h | 14 ++++++++++++++
mm/huge_memory.c | 1 +
mm/memory.c | 1 +
5 files changed, 42 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index d8724f5b1202..89cfa93d0ad6 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1664,6 +1664,12 @@ static inline bool arch_has_hw_pte_young(void)
return true;
}

+#define arch_check_zapped_pte arch_check_zapped_pte
+void arch_check_zapped_pte(struct vm_area_struct *vma, pte_t pte);
+
+#define arch_check_zapped_pmd arch_check_zapped_pmd
+void arch_check_zapped_pmd(struct vm_area_struct *vma, pmd_t pmd);
+
#ifdef CONFIG_XEN_PV
#define arch_has_hw_nonleaf_pmd_young arch_has_hw_nonleaf_pmd_young
static inline bool arch_has_hw_nonleaf_pmd_young(void)
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 0ad2c62ac0a8..101e721d74aa 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -894,3 +894,23 @@ pmd_t pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma)

return pmd_clear_saveddirty(pmd);
}
+
+void arch_check_zapped_pte(struct vm_area_struct *vma, pte_t pte)
+{
+ /*
+ * Hardware before shadow stack can (rarely) set Dirty=1
+ * on a Write=0 PTE. So the below condition
+ * only indicates a software bug when shadow stack is
+ * supported by the HW. This checking is covered in
+ * pte_shstk().
+ */
+ VM_WARN_ON_ONCE(!(vma->vm_flags & VM_SHADOW_STACK) &&
+ pte_shstk(pte));
+}
+
+void arch_check_zapped_pmd(struct vm_area_struct *vma, pmd_t pmd)
+{
+ /* See note in arch_check_zapped_pte() */
+ VM_WARN_ON_ONCE(!(vma->vm_flags & VM_SHADOW_STACK) &&
+ pmd_shstk(pmd));
+}
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 0f3cf726812a..feb1fd2c814f 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -291,6 +291,20 @@ static inline bool arch_has_hw_pte_young(void)
}
#endif

+#ifndef arch_check_zapped_pte
+static inline void arch_check_zapped_pte(struct vm_area_struct *vma,
+ pte_t pte)
+{
+}
+#endif
+
+#ifndef arch_check_zapped_pmd
+static inline void arch_check_zapped_pmd(struct vm_area_struct *vma,
+ pmd_t pmd)
+{
+}
+#endif
+
#ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR
static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
unsigned long address,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 37dd56b7b3d1..c3cc20c1b26c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1681,6 +1681,7 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
*/
orig_pmd = pmdp_huge_get_and_clear_full(vma, addr, pmd,
tlb->fullmm);
+ arch_check_zapped_pmd(vma, orig_pmd);
tlb_remove_pmd_tlb_entry(tlb, pmd, addr);
if (vma_is_special_huge(vma)) {
if (arch_needs_pgtable_deposit())
diff --git a/mm/memory.c b/mm/memory.c
index c1b6fe944c20..40c0b233b61d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1412,6 +1412,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
continue;
ptent = ptep_get_and_clear_full(mm, addr, pte,
tlb->fullmm);
+ arch_check_zapped_pte(vma, ptent);
tlb_remove_tlb_entry(tlb, pte, addr);
zap_install_uffd_wp_if_needed(vma, addr, pte, details,
ptent);
--
2.34.1


2023-06-13 00:49:04

by Edgecombe, Rick P

[permalink] [raw]
Subject: [PATCH v9 42/42] x86/shstk: Add ARCH_SHSTK_STATUS

CRIU and GDB need to get the current shadow stack and WRSS enablement
status. This information is already available via /proc/pid/status, but
this is inconvenient for CRIU because it involves parsing the text output
in an area of the code where this is difficult. Provide a status
arch_prctl(), ARCH_SHSTK_STATUS for retrieving the status. Have arg2 be a
userspace address, and make the new arch_prctl simply copy the features
out to userspace.

Suggested-by: Mike Rapoport <[email protected]>
Signed-off-by: Rick Edgecombe <[email protected]>
Reviewed-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Tested-by: Pengfei Xu <[email protected]>
Tested-by: John Allen <[email protected]>
Tested-by: Kees Cook <[email protected]>
---
Documentation/arch/x86/shstk.rst | 6 ++++++
arch/x86/include/asm/shstk.h | 2 +-
arch/x86/include/uapi/asm/prctl.h | 1 +
arch/x86/kernel/process_64.c | 1 +
arch/x86/kernel/shstk.c | 8 +++++++-
5 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/Documentation/arch/x86/shstk.rst b/Documentation/arch/x86/shstk.rst
index f3553cc8c758..60260e809baf 100644
--- a/Documentation/arch/x86/shstk.rst
+++ b/Documentation/arch/x86/shstk.rst
@@ -79,6 +79,11 @@ arch_prctl(ARCH_SHSTK_UNLOCK, unsigned long features)
Unlock features. 'features' is a mask of all features to unlock. All
bits set are processed, unset bits are ignored. Only works via ptrace.

+arch_prctl(ARCH_SHSTK_STATUS, unsigned long addr)
+ Copy the currently enabled features to the address passed in addr. The
+ features are described using the bits passed into the others in
+ 'features'.
+
The return values are as follows. On success, return 0. On error, errno can
be::

@@ -86,6 +91,7 @@ be::
-ENOTSUPP if the feature is not supported by the hardware or
kernel.
-EINVAL arguments (non existing feature, etc)
+ -EFAULT if could not copy information back to userspace

The feature's bits supported are::

diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h
index ecb23a8ca47d..42fee8959df7 100644
--- a/arch/x86/include/asm/shstk.h
+++ b/arch/x86/include/asm/shstk.h
@@ -14,7 +14,7 @@ struct thread_shstk {
u64 size;
};

-long shstk_prctl(struct task_struct *task, int option, unsigned long features);
+long shstk_prctl(struct task_struct *task, int option, unsigned long arg2);
void reset_thread_features(void);
unsigned long shstk_alloc_thread_stack(struct task_struct *p, unsigned long clone_flags,
unsigned long stack_size);
diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h
index 3189c4a96468..384e2cc6ac19 100644
--- a/arch/x86/include/uapi/asm/prctl.h
+++ b/arch/x86/include/uapi/asm/prctl.h
@@ -34,6 +34,7 @@
#define ARCH_SHSTK_DISABLE 0x5002
#define ARCH_SHSTK_LOCK 0x5003
#define ARCH_SHSTK_UNLOCK 0x5004
+#define ARCH_SHSTK_STATUS 0x5005

/* ARCH_SHSTK_ features bits */
#define ARCH_SHSTK_SHSTK (1ULL << 0)
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index e6db21c470aa..33b268747bb7 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -900,6 +900,7 @@ long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2)
case ARCH_SHSTK_DISABLE:
case ARCH_SHSTK_LOCK:
case ARCH_SHSTK_UNLOCK:
+ case ARCH_SHSTK_STATUS:
return shstk_prctl(task, option, arg2);
default:
ret = -EINVAL;
diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
index d43b7a9c57ce..b26810c7cd1c 100644
--- a/arch/x86/kernel/shstk.c
+++ b/arch/x86/kernel/shstk.c
@@ -482,8 +482,14 @@ SYSCALL_DEFINE3(map_shadow_stack, unsigned long, addr, unsigned long, size, unsi
return alloc_shstk(addr, aligned_size, size, set_tok);
}

-long shstk_prctl(struct task_struct *task, int option, unsigned long features)
+long shstk_prctl(struct task_struct *task, int option, unsigned long arg2)
{
+ unsigned long features = arg2;
+
+ if (option == ARCH_SHSTK_STATUS) {
+ return put_user(task->thread.features, (unsigned long __user *)arg2);
+ }
+
if (option == ARCH_SHSTK_LOCK) {
task->thread.features_locked |= features;
return 0;
--
2.34.1


2023-06-13 02:19:48

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH v9 00/42] Shadow stacks for userspace

On Mon, Jun 12, 2023 at 5:14 PM Rick Edgecombe
<[email protected]> wrote:
>
> This series implements Shadow Stacks for userspace using x86's Control-flow
> Enforcement Technology (CET).

Do you have this in a git tree somewhere? For series with this many
patches, I find it easier to just do a "git fetch" and "gitk
..FETCH_HEAD" these days, and then reply by email on anything I find.

That's partly because it makes it really easy to zoom in on some
particular area (eg "let's look at just mm/ and the generic include
files")

Linus

2023-06-13 03:42:10

by Edgecombe, Rick P

[permalink] [raw]
Subject: Re: [PATCH v9 00/42] Shadow stacks for userspace

On Mon, 2023-06-12 at 18:34 -0700, Linus Torvalds wrote:
> On Mon, Jun 12, 2023 at 5:14 PM Rick Edgecombe
> <[email protected]> wrote:
> >
> > This series implements Shadow Stacks for userspace using x86's
> > Control-flow
> > Enforcement Technology (CET).
>
> Do you have this in a git tree somewhere? For series with this many
> patches, I find it easier to just do a "git fetch" and "gitk
> ..FETCH_HEAD" these days, and then reply by email on anything I find.
>
> That's partly because it makes it really easy to zoom in on some
> particular area (eg "let's look at just mm/ and the generic include
> files")

Sure. I probably should have included that upfront. Here is a github
repo:
https://github.com/rpedgeco/linux/tree/user_shstk_v9

I went ahead and included the tags[0] from last time in case that's
useful, but unfortunately the github web interface is not very
conducive to viewing the tag-based segmentation of the series. If
having it in a korg repo would be useful, please let me know.

[0]
https://lore.kernel.org/lkml/[email protected]/

2023-06-13 07:39:49

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH v9 01/42] mm: Rename arch pte_mkwrite()'s to pte_mkwrite_novma()

On Tue, Jun 13, 2023 at 2:13 AM Rick Edgecombe
<[email protected]> wrote:
> The x86 Shadow stack feature includes a new type of memory called shadow
> stack. This shadow stack memory has some unusual properties, which requires
> some core mm changes to function properly.
>
> One of these unusual properties is that shadow stack memory is writable,
> but only in limited ways. These limits are applied via a specific PTE
> bit combination. Nevertheless, the memory is writable, and core mm code
> will need to apply the writable permissions in the typical paths that
> call pte_mkwrite(). Future patches will make pte_mkwrite() take a VMA, so
> that the x86 implementation of it can know whether to create regular
> writable memory or shadow stack memory.
>
> But there are a couple of challenges to this. Modifying the signatures of
> each arch pte_mkwrite() implementation would be error prone because some
> are generated with macros and would need to be re-implemented. Also, some
> pte_mkwrite() callers operate on kernel memory without a VMA.
>
> So this can be done in a three step process. First pte_mkwrite() can be
> renamed to pte_mkwrite_novma() in each arch, with a generic pte_mkwrite()
> added that just calls pte_mkwrite_novma(). Next callers without a VMA can
> be moved to pte_mkwrite_novma(). And lastly, pte_mkwrite() and all callers
> can be changed to take/pass a VMA.
>
> Start the process by renaming pte_mkwrite() to pte_mkwrite_novma() and
> adding the pte_mkwrite() wrapper in linux/pgtable.h. Apply the same
> pattern for pmd_mkwrite(). Since not all archs have a pmd_mkwrite_novma(),
> create a new arch config HAS_HUGE_PAGE that can be used to tell if
> pmd_mkwrite() should be defined. Otherwise in the !HAS_HUGE_PAGE cases the
> compiler would not be able to find pmd_mkwrite_novma().
>
> No functional change.
>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: Michal Simek <[email protected]>
> Cc: Dinh Nguyen <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Suggested-by: Linus Torvalds <[email protected]>
> Signed-off-by: Rick Edgecombe <[email protected]>
> Link: https://lore.kernel.org/lkml/CAHk-=wiZjSu7c9sFYZb3q04108stgHff2wfbokGCCgW7riz+8Q@mail.gmail.com/

> arch/m68k/include/asm/mcf_pgtable.h | 2 +-
> arch/m68k/include/asm/motorola_pgtable.h | 2 +-
> arch/m68k/include/asm/sun3_pgtable.h | 2 +-

Acked-by: Geert Uytterhoeven <[email protected]>

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2023-06-13 07:53:24

by Mike Rapoport

[permalink] [raw]
Subject: Re: [PATCH v9 01/42] mm: Rename arch pte_mkwrite()'s to pte_mkwrite_novma()

On Mon, Jun 12, 2023 at 05:10:27PM -0700, Rick Edgecombe wrote:
> The x86 Shadow stack feature includes a new type of memory called shadow
> stack. This shadow stack memory has some unusual properties, which requires
> some core mm changes to function properly.
>
> One of these unusual properties is that shadow stack memory is writable,
> but only in limited ways. These limits are applied via a specific PTE
> bit combination. Nevertheless, the memory is writable, and core mm code
> will need to apply the writable permissions in the typical paths that
> call pte_mkwrite(). Future patches will make pte_mkwrite() take a VMA, so
> that the x86 implementation of it can know whether to create regular
> writable memory or shadow stack memory.

Nit: ^ mapping?

> But there are a couple of challenges to this. Modifying the signatures of
> each arch pte_mkwrite() implementation would be error prone because some
> are generated with macros and would need to be re-implemented. Also, some
> pte_mkwrite() callers operate on kernel memory without a VMA.
>
> So this can be done in a three step process. First pte_mkwrite() can be
> renamed to pte_mkwrite_novma() in each arch, with a generic pte_mkwrite()
> added that just calls pte_mkwrite_novma(). Next callers without a VMA can
> be moved to pte_mkwrite_novma(). And lastly, pte_mkwrite() and all callers
> can be changed to take/pass a VMA.
>
> Start the process by renaming pte_mkwrite() to pte_mkwrite_novma() and
> adding the pte_mkwrite() wrapper in linux/pgtable.h. Apply the same
> pattern for pmd_mkwrite(). Since not all archs have a pmd_mkwrite_novma(),
> create a new arch config HAS_HUGE_PAGE that can be used to tell if
> pmd_mkwrite() should be defined. Otherwise in the !HAS_HUGE_PAGE cases the
> compiler would not be able to find pmd_mkwrite_novma().
>
> No functional change.
>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: Michal Simek <[email protected]>
> Cc: Dinh Nguyen <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Suggested-by: Linus Torvalds <[email protected]>
> Signed-off-by: Rick Edgecombe <[email protected]>
> Link: https://lore.kernel.org/lkml/CAHk-=wiZjSu7c9sFYZb3q04108stgHff2wfbokGCCgW7riz+8Q@mail.gmail.com/

Reviewed-by: Mike Rapoport (IBM) <[email protected]>

--
Sincerely yours,
Mike.

2023-06-13 12:33:56

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v9 01/42] mm: Rename arch pte_mkwrite()'s to pte_mkwrite_novma()

On 13.06.23 02:10, Rick Edgecombe wrote:
> The x86 Shadow stack feature includes a new type of memory called shadow
> stack. This shadow stack memory has some unusual properties, which requires
> some core mm changes to function properly.
>
> One of these unusual properties is that shadow stack memory is writable,
> but only in limited ways. These limits are applied via a specific PTE
> bit combination. Nevertheless, the memory is writable, and core mm code
> will need to apply the writable permissions in the typical paths that
> call pte_mkwrite(). Future patches will make pte_mkwrite() take a VMA, so
> that the x86 implementation of it can know whether to create regular
> writable memory or shadow stack memory.
>
> But there are a couple of challenges to this. Modifying the signatures of
> each arch pte_mkwrite() implementation would be error prone because some
> are generated with macros and would need to be re-implemented. Also, some
> pte_mkwrite() callers operate on kernel memory without a VMA.
>
> So this can be done in a three step process. First pte_mkwrite() can be
> renamed to pte_mkwrite_novma() in each arch, with a generic pte_mkwrite()
> added that just calls pte_mkwrite_novma(). Next callers without a VMA can
> be moved to pte_mkwrite_novma(). And lastly, pte_mkwrite() and all callers
> can be changed to take/pass a VMA.
>
> Start the process by renaming pte_mkwrite() to pte_mkwrite_novma() and
> adding the pte_mkwrite() wrapper in linux/pgtable.h. Apply the same
> pattern for pmd_mkwrite(). Since not all archs have a pmd_mkwrite_novma(),
> create a new arch config HAS_HUGE_PAGE that can be used to tell if
> pmd_mkwrite() should be defined. Otherwise in the !HAS_HUGE_PAGE cases the
> compiler would not be able to find pmd_mkwrite_novma().
>
> No functional change.
>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: Michal Simek <[email protected]>
> Cc: Dinh Nguyen <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Suggested-by: Linus Torvalds <[email protected]>
> Signed-off-by: Rick Edgecombe <[email protected]>
> Link: https://lore.kernel.org/lkml/CAHk-=wiZjSu7c9sFYZb3q04108stgHff2wfbokGCCgW7riz+8Q@mail.gmail.com/
> ---
> Hi Non-x86 Arch’s,
>
> x86 has a feature that allows for the creation of a special type of
> writable memory (shadow stack) that is only writable in limited specific
> ways. Previously, changes were proposed to core MM code to teach it to
> decide when to create normally writable memory or the special shadow stack
> writable memory, but David Hildenbrand suggested[0] to change
> pXX_mkwrite() to take a VMA, so awareness of shadow stack memory can be
> moved into x86 code. Later Linus suggested a less error-prone way[1] to go
> about this after the first attempt had a bug.
>
> Since pXX_mkwrite() is defined in every arch, it requires some tree-wide
> changes. So that is why you are seeing some patches out of a big x86
> series pop up in your arch mailing list. There is no functional change.
> After this refactor, the shadow stack series goes on to use the arch
> helpers to push arch memory details inside arch/x86 and other arch's
> with upcoming shadow stack features.
>
> Testing was just 0-day build testing.
>
> Hopefully that is enough context. Thanks!
>
> [0] https://lore.kernel.org/lkml/[email protected]/
> [1] https://lore.kernel.org/lkml/CAHk-=wiZjSu7c9sFYZb3q04108stgHff2wfbokGCCgW7riz+8Q@mail.gmail.com/
> ---

Acked-by: David Hildenbrand <[email protected]>

--
Cheers,

David / dhildenb


2023-06-13 16:39:23

by Edgecombe, Rick P

[permalink] [raw]
Subject: Re: [PATCH v9 01/42] mm: Rename arch pte_mkwrite()'s to pte_mkwrite_novma()

On Tue, 2023-06-13 at 10:43 +0300, Mike Rapoport wrote:
> On Mon, Jun 12, 2023 at 05:10:27PM -0700, Rick Edgecombe wrote:
> > The x86 Shadow stack feature includes a new type of memory called
> > shadow
> > stack. This shadow stack memory has some unusual properties, which
> > requires
> > some core mm changes to function properly.
> >
> > One of these unusual properties is that shadow stack memory is
> > writable,
> > but only in limited ways. These limits are applied via a specific
> > PTE
> > bit combination. Nevertheless, the memory is writable, and core mm
> > code
> > will need to apply the writable permissions in the typical paths
> > that
> > call pte_mkwrite(). Future patches will make pte_mkwrite() take a
> > VMA, so
> > that the x86 implementation of it can know whether to create
> > regular
> > writable memory or shadow stack memory.
>
> Nit:                            ^ mapping?

Hmm, sure.

>
> > But there are a couple of challenges to this. Modifying the
> > signatures of
> > each arch pte_mkwrite() implementation would be error prone because
> > some
> > are generated with macros and would need to be re-implemented.
> > Also, some
> > pte_mkwrite() callers operate on kernel memory without a VMA.
> >
> > So this can be done in a three step process. First pte_mkwrite()
> > can be
> > renamed to pte_mkwrite_novma() in each arch, with a generic
> > pte_mkwrite()
> > added that just calls pte_mkwrite_novma(). Next callers without a
> > VMA can
> > be moved to pte_mkwrite_novma(). And lastly, pte_mkwrite() and all
> > callers
> > can be changed to take/pass a VMA.
> >
> > Start the process by renaming pte_mkwrite() to pte_mkwrite_novma()
> > and
> > adding the pte_mkwrite() wrapper in linux/pgtable.h. Apply the same
> > pattern for pmd_mkwrite(). Since not all archs have a
> > pmd_mkwrite_novma(),
> > create a new arch config HAS_HUGE_PAGE that can be used to tell if
> > pmd_mkwrite() should be defined. Otherwise in the !HAS_HUGE_PAGE
> > cases the
> > compiler would not be able to find pmd_mkwrite_novma().
> >
> > No functional change.
> >
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: Michal Simek <[email protected]>
> > Cc: Dinh Nguyen <[email protected]>
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Suggested-by: Linus Torvalds <[email protected]>
> > Signed-off-by: Rick Edgecombe <[email protected]>
> > Link:
> > https://lore.kernel.org/lkml/CAHk-=wiZjSu7c9sFYZb3q04108stgHff2wfbokGCCgW7riz+8Q@mail.gmail.com/
>
> Reviewed-by: Mike Rapoport (IBM) <[email protected]>

Thanks!

2023-06-13 16:49:24

by Edgecombe, Rick P

[permalink] [raw]
Subject: Re: [PATCH v9 01/42] mm: Rename arch pte_mkwrite()'s to pte_mkwrite_novma()

On Tue, 2023-06-13 at 14:26 +0200, David Hildenbrand wrote:
>
> Acked-by: David Hildenbrand <[email protected]>

Thanks!

2023-06-13 16:55:30

by Edgecombe, Rick P

[permalink] [raw]
Subject: Re: [PATCH v9 01/42] mm: Rename arch pte_mkwrite()'s to pte_mkwrite_novma()

On Tue, 2023-06-13 at 09:19 +0200, Geert Uytterhoeven wrote:
> Acked-by: Geert Uytterhoeven <[email protected]>

Thanks!

2023-06-13 18:04:55

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH v9 00/42] Shadow stacks for userspace

On Mon, Jun 12, 2023 at 8:12 PM Edgecombe, Rick P
<[email protected]> wrote:
>
> Sure. I probably should have included that upfront. Here is a github
> repo:
> https://github.com/rpedgeco/linux/tree/user_shstk_v9
>
> I went ahead and included the tags[0] from last time in case that's
> useful, but unfortunately the github web interface is not very
> conducive to viewing the tag-based segmentation of the series. If
> having it in a korg repo would be useful, please let me know.

Oh, kernel.org vs github doesn't matter. I'm not actually merging this
yet, I'm just doing a fetch to then easily be able to look at it
locally in different formats.

I tend to like seeing small things in my MUA just because then I don't
switch back-and-forth between reading email and some gitk workflow,
and it is easy to just scan through the series and reply all inthe
MUA.

But when it's some bigger piece, just doing a "git fetch" and then
being able to dissect it locally is really convenient.

Having worked with patches for three decades, I can read diffs in my
sleep - but it's still quite useful to say "give me the patches just
for *this* file" to just see how some specific area changed without
having to look at the other parts.

Or for example, that whole pte_mkwrite -> pte_mkwrite_novma patch is
much denser and more legible with color-coding and the --word-diff.

Anyway, I'm scanning through it right now. No comments yet, I only
just got started.

Linus

2023-06-13 18:40:19

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH v9 00/42] Shadow stacks for userspace

On Tue, Jun 13, 2023 at 10:44 AM Linus Torvalds
<[email protected]> wrote:
>
> Anyway, I'm scanning through it right now. No comments yet, I only
> just got started.

Well, it all looked fine from a quick scan. One small comment, and
even that was just a minor stylistic nit.

I didn't actually look through the x86 state infrastructure side -
I'll just trust that is fine, and it doesn't interact with anything
else, so I don't really worry about it. I mainly care about the VM
side not causing problems, and the changes on that side all looked
fine.

Linus

2023-06-13 19:49:29

by Edgecombe, Rick P

[permalink] [raw]
Subject: Re: [PATCH v9 00/42] Shadow stacks for userspace

On Tue, 2023-06-13 at 11:27 -0700, Linus Torvalds wrote:
> On Tue, Jun 13, 2023 at 10:44 AM Linus Torvalds
> <[email protected]> wrote:
> >
> > Anyway, I'm scanning through it right now. No comments yet, I only
> > just got started.
>
> Well, it all looked fine from a quick scan. One small comment, and
> even that was just a minor stylistic nit.
>
> I didn't actually look through the x86 state infrastructure side -
> I'll just trust that is fine, and it doesn't interact with anything
> else, so I don't really worry about it. I mainly care about the VM
> side not causing problems, and the changes on that side all looked
> fine.

Thanks!

2023-06-14 23:58:41

by Mark Brown

[permalink] [raw]
Subject: Re: [PATCH v9 17/42] mm: Warn on shadow stack memory in wrong vma

On Mon, Jun 12, 2023 at 05:10:43PM -0700, Rick Edgecombe wrote:
> The x86 Control-flow Enforcement Technology (CET) feature includes a new
> type of memory called shadow stack. This shadow stack memory has some
> unusual properties, which requires some core mm changes to function
> properly.

Reviewed-by: Mark Brown <[email protected]>


Attachments:
(No filename) (346.00 B)
signature.asc (499.00 B)
Download all attachments

2023-06-15 00:08:40

by Mark Brown

[permalink] [raw]
Subject: Re: [PATCH v9 00/42] Shadow stacks for userspace

On Mon, Jun 12, 2023 at 05:10:26PM -0700, Rick Edgecombe wrote:

> This series implements Shadow Stacks for userspace using x86's Control-flow
> Enforcement Technology (CET). CET consists of two related security
> features: shadow stacks and indirect branch tracking. This series
> implements just the shadow stack part of this feature, and just for
> userspace.

I've been using the generic changes here for the work I've been doing on
arm64's similar GCS feature, while that is still very much WIP and
hasn't been posted anywhere most of the common code here has been
exercised. I've been through the patches that I've specifically checked
or used. Thanks for all the work here.


Attachments:
(No filename) (698.00 B)
signature.asc (499.00 B)
Download all attachments

2023-06-19 05:40:23

by Helge Deller

[permalink] [raw]
Subject: Re: [PATCH v9 01/42] mm: Rename arch pte_mkwrite()'s to pte_mkwrite_novma()

On 6/13/23 02:10, Rick Edgecombe wrote:
> The x86 Shadow stack feature includes a new type of memory called shadow
> stack. This shadow stack memory has some unusual properties, which requires
> some core mm changes to function properly.
>
> One of these unusual properties is that shadow stack memory is writable,
> but only in limited ways. These limits are applied via a specific PTE
> bit combination. Nevertheless, the memory is writable, and core mm code
> will need to apply the writable permissions in the typical paths that
> call pte_mkwrite(). Future patches will make pte_mkwrite() take a VMA, so
> that the x86 implementation of it can know whether to create regular
> writable memory or shadow stack memory.
>
> But there are a couple of challenges to this. Modifying the signatures of
> each arch pte_mkwrite() implementation would be error prone because some
> are generated with macros and would need to be re-implemented. Also, some
> pte_mkwrite() callers operate on kernel memory without a VMA.
>
> So this can be done in a three step process. First pte_mkwrite() can be
> renamed to pte_mkwrite_novma() in each arch, with a generic pte_mkwrite()
> added that just calls pte_mkwrite_novma(). Next callers without a VMA can
> be moved to pte_mkwrite_novma(). And lastly, pte_mkwrite() and all callers
> can be changed to take/pass a VMA.
>
> Start the process by renaming pte_mkwrite() to pte_mkwrite_novma() and
> adding the pte_mkwrite() wrapper in linux/pgtable.h. Apply the same
> pattern for pmd_mkwrite(). Since not all archs have a pmd_mkwrite_novma(),
> create a new arch config HAS_HUGE_PAGE that can be used to tell if
> pmd_mkwrite() should be defined. Otherwise in the !HAS_HUGE_PAGE cases the
> compiler would not be able to find pmd_mkwrite_novma().
>
> No functional change.
>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: Michal Simek <[email protected]>
> Cc: Dinh Nguyen <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Suggested-by: Linus Torvalds <[email protected]>
> Signed-off-by: Rick Edgecombe <[email protected]>
> Link: https://lore.kernel.org/lkml/CAHk-=wiZjSu7c9sFYZb3q04108stgHff2wfbokGCCgW7riz+8Q@mail.gmail.com/
> ---
> Hi Non-x86 Arch’s,
>
> x86 has a feature that allows for the creation of a special type of
> writable memory (shadow stack) that is only writable in limited specific
> ways. Previously, changes were proposed to core MM code to teach it to
> decide when to create normally writable memory or the special shadow stack
> writable memory, but David Hildenbrand suggested[0] to change
> pXX_mkwrite() to take a VMA, so awareness of shadow stack memory can be
> moved into x86 code. Later Linus suggested a less error-prone way[1] to go
> about this after the first attempt had a bug.
>
> Since pXX_mkwrite() is defined in every arch, it requires some tree-wide
> changes. So that is why you are seeing some patches out of a big x86
> series pop up in your arch mailing list. There is no functional change.
> After this refactor, the shadow stack series goes on to use the arch
> helpers to push arch memory details inside arch/x86 and other arch's
> with upcoming shadow stack features.
>
> Testing was just 0-day build testing.
>
> Hopefully that is enough context. Thanks!
>
> [0] https://lore.kernel.org/lkml/[email protected]/
> [1] https://lore.kernel.org/lkml/CAHk-=wiZjSu7c9sFYZb3q04108stgHff2wfbokGCCgW7riz+8Q@mail.gmail.com/
> ---
> arch/parisc/include/asm/pgtable.h | 2 +-

Acked-by: Helge Deller <[email protected]> # parisc

Helge

2023-07-14 23:38:55

by Mark Brown

[permalink] [raw]
Subject: Re: [PATCH v9 01/42] mm: Rename arch pte_mkwrite()'s to pte_mkwrite_novma()

On Mon, Jun 12, 2023 at 05:10:27PM -0700, Rick Edgecombe wrote:
> The x86 Shadow stack feature includes a new type of memory called shadow
> stack. This shadow stack memory has some unusual properties, which requires
> some core mm changes to function properly.

This seems to break sparc64_defconfig when applied on top of v6.5-rc1:

In file included from /home/broonie/git/bisect/include/linux/mm.h:29,
from /home/broonie/git/bisect/net/core/skbuff.c:40:
/home/broonie/git/bisect/include/linux/pgtable.h: In function 'pmd_mkwrite':
/home/broonie/git/bisect/include/linux/pgtable.h:528:9: error: implicit declaration of function 'pmd_mkwrite_novma'; did you mean 'pte_mkwrite_novma'? [-Werror=implicit-function-declaration]
return pmd_mkwrite_novma(pmd);
^~~~~~~~~~~~~~~~~
pte_mkwrite_novma
/home/broonie/git/bisect/include/linux/pgtable.h:528:9: error: incompatible types when returning type 'int' but 'pmd_t' {aka 'struct <anonymous>'} was expected
return pmd_mkwrite_novma(pmd);
^~~~~~~~~~~~~~~~~~~~~~

The same issue seems to apply with the version that was in -next based
on v6.4-rc4 too.


Attachments:
(No filename) (1.14 kB)
signature.asc (499.00 B)
Download all attachments

2023-07-17 16:19:15

by Edgecombe, Rick P

[permalink] [raw]
Subject: Re: [PATCH v9 01/42] mm: Rename arch pte_mkwrite()'s to pte_mkwrite_novma()

On Fri, 2023-07-14 at 23:57 +0100, Mark Brown wrote:
> On Mon, Jun 12, 2023 at 05:10:27PM -0700, Rick Edgecombe wrote:
> > The x86 Shadow stack feature includes a new type of memory called
> > shadow
> > stack. This shadow stack memory has some unusual properties, which
> > requires
> > some core mm changes to function properly.
>
> This seems to break sparc64_defconfig when applied on top of v6.5-
> rc1:
>
> In file included from /home/broonie/git/bisect/include/linux/mm.h:29,
>                  from /home/broonie/git/bisect/net/core/skbuff.c:40:
> /home/broonie/git/bisect/include/linux/pgtable.h: In function
> 'pmd_mkwrite':
> /home/broonie/git/bisect/include/linux/pgtable.h:528:9: error:
> implicit declaration of function 'pmd_mkwrite_novma'; did you mean
> 'pte_mkwrite_novma'? [-Werror=implicit-function-declaration]
>   return pmd_mkwrite_novma(pmd);
>          ^~~~~~~~~~~~~~~~~
>          pte_mkwrite_novma
> /home/broonie/git/bisect/include/linux/pgtable.h:528:9: error:
> incompatible types when returning type 'int' but 'pmd_t' {aka 'struct
> <anonymous>'} was expected
>   return pmd_mkwrite_novma(pmd);
>          ^~~~~~~~~~~~~~~~~~~~~~
>
> The same issue seems to apply with the version that was in -next
> based
> on v6.4-rc4 too.

The version in your branch is not the same as the version in tip (which
had a squashed build fix). I was able to reproduce the build error with
your branch. But not with the one in tip rebased on v6.5-rc1. So can
you try this version:
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/shstk&id=899223d69ce9f338056f4c41ef870d70040fc860


2023-07-17 17:07:41

by Mark Brown

[permalink] [raw]
Subject: Re: [PATCH v9 01/42] mm: Rename arch pte_mkwrite()'s to pte_mkwrite_novma()

On Mon, Jul 17, 2023 at 03:55:50PM +0000, Edgecombe, Rick P wrote:
> On Fri, 2023-07-14 at 23:57 +0100, Mark Brown wrote:

> > The same issue seems to apply with the version that was in -next
> > based
> > on v6.4-rc4 too.

> The version in your branch is not the same as the version in tip (which
> had a squashed build fix). I was able to reproduce the build error with
> your branch. But not with the one in tip rebased on v6.5-rc1. So can
> you try this version:
> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/shstk&id=899223d69ce9f338056f4c41ef870d70040fc860

Ah, I'd not seen that patch or that tip had been rebased - I'd actually
been using literally the branch from tip as my base at whatever point I
last noticed it changing up until I rebased onto -rc1.


Attachments:
(No filename) (809.00 B)
signature.asc (499.00 B)
Download all attachments