2021-08-04 05:04:44

by Ira Weiny

Subject: [PATCH V7 00/18] PKS/PMEM: Add Stray Write Protection

From: Ira Weiny <[email protected]>

NOTE: x86 maintainers, I'm submitting this for ack/review by Dave Hansen and
Dan Williams. Feel free to ignore it, but we have had a lot of internal debate
on a number of design decisions, so we would like the remaining reviews to be
public so that everyone can see the remaining debate and decisions.

Furthermore, this gives a public reference for Rick to build other PKS use
cases on.


PKS/PMEM Stray write protection
===============================

This series is broken into 2 parts.

1) Introduce Protection Key Supervisor (PKS)
2) Use PKS to protect PMEM from stray writes

Introduce Protection Key Supervisor (PKS)
-----------------------------------------

PKS enables protections on 'domains' of supervisor pages to limit supervisor
mode access to pages beyond the normal paging protections. PKS works in a
similar fashion to user space pkeys, PKU. As with PKU, supervisor pkeys are
checked in addition to normal paging protections, and access or writes can be
disabled via an MSR update without TLB flushes when permissions change.

Also like PKU, a page mapping is assigned to a domain by setting pkey bits in
the page table entry for that mapping.

Access is controlled through a PKRS register which is updated via WRMSR/RDMSR.

XSAVE is not supported for the PKRS MSR. Therefore the implementation
saves/restores the MSR across context switches and during exceptions. Nested
exceptions are supported by each exception getting a new PKS state.

For consistent behavior with current paging protections, pkey 0 is reserved and
configured to allow full access via the pkey mechanism, thus preserving the
default paging protections.

Other keys (1-15) are statically allocated by kernel users adding an entry to
'enum pks_pkey_consumers' and adding a corresponding default value in
consumer_defaults in create_initial_pkrs_value(). This patch series allocates
a single key for use by persistent memory stray write protection. When the
number of users grows larger the sharing of keys will need to be resolved
depending on the needs of the users at that time.
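
The static key allocation described above can be illustrated with a small
userspace model. This is only a sketch, not the code in this series: the
MODEL_* names and both helper functions are hypothetical, and defaulting
unallocated keys to access-disabled is an assumption made for illustration.
Only the register layout (two bits per key, access-disable in the low bit and
write-disable in the high bit of each pair) follows the x86 protection key
register format.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_PKEYS       16   /* keys 0-15 */
#define MODEL_BITS_PER_PKEY  2
#define MODEL_AD_BIT         0x1u /* access disable */
#define MODEL_WD_BIT         0x2u /* write disable */

/* Hypothetical consumer list; key 0 is reserved for default access. */
enum model_pkey_consumers {
	MODEL_KEY_DEFAULT = 0,
	MODEL_KEY_PGMAP_PROTECTION = 1,
	MODEL_KEY_NR_CONSUMERS
};

/* Place a key's two permission bits at that key's slot in the register. */
static uint32_t model_pkey_field(unsigned int pkey, uint32_t bits)
{
	assert(pkey < MODEL_NR_PKEYS);
	return (bits & 0x3u) << (pkey * MODEL_BITS_PER_PKEY);
}

/*
 * Build an initial register value from per-consumer defaults.
 * Assumption: keys without a consumer default to access-disabled.
 */
static uint32_t model_initial_pkrs(const uint32_t *defaults,
				   unsigned int nr_consumers)
{
	uint32_t pkrs = 0;
	unsigned int key;

	for (key = 0; key < MODEL_NR_PKEYS; key++) {
		uint32_t bits = key < nr_consumers ? defaults[key]
						   : MODEL_AD_BIT;

		pkrs |= model_pkey_field(key, bits);
	}
	return pkrs;
}
```

With key 0 left fully accessible and every other key access-disabled, the
model yields a value whose low two bits are clear, which matches the intent
of keeping the default paging protections for pkey 0.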

More usage details can be found in the documentation.

The following are key attributes of PKS.

1) Fast switching of permissions
1a) Prevents access without page table manipulations
1b) No TLB flushes required
2) Works on a per thread basis

PKS is available with 4 and 5 level paging. Like PKRU it consumes 4 bits from
the PTE to store the pkey within the entry.
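
As an illustration of how 4 bits of an entry can carry the pkey, the following
userspace sketch tags and reads back a key in a 64-bit entry value. The
MODEL_* names and helpers are hypothetical; the bit position (bits 59-62) is
the x86-64 protection key field and is stated here as an assumption of the
model rather than taken from this series.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 layout: the protection key lives in bits 59-62. */
#define MODEL_PTE_PKEY_SHIFT 59
#define MODEL_PTE_PKEY_MASK  (UINT64_C(0xf) << MODEL_PTE_PKEY_SHIFT)

/* Return a copy of the entry with its pkey field replaced. */
static uint64_t model_pte_set_pkey(uint64_t pte, unsigned int pkey)
{
	assert(pkey < 16);
	return (pte & ~MODEL_PTE_PKEY_MASK) |
	       ((uint64_t)pkey << MODEL_PTE_PKEY_SHIFT);
}

/* Extract the pkey field from an entry value. */
static unsigned int model_pte_pkey(uint64_t pte)
{
	return (unsigned int)((pte & MODEL_PTE_PKEY_MASK) >>
			      MODEL_PTE_PKEY_SHIFT);
}
```

An entry that never had a key assigned reads back as pkey 0, which is why
reserving pkey 0 for full access preserves existing behavior.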


Use PKS to protect PMEM from stray writes
-----------------------------------------

DAX leverages the direct-map to enable 'struct page' services for PMEM. Given
that PMEM capacity may be an order of magnitude higher than that of System RAM,
it presents a large vulnerability surface to stray writes. Such a stray write
becomes a silent data corruption bug.

Given that PMEM access from the kernel is limited to a constrained set of
locations (PMEM driver, Filesystem-DAX, and direct-I/O), it is amenable to PKS
protection. Set up an infrastructure for extra device access protection. Then
implement the protection using the new Protection Keys Supervisor (PKS) on
architectures which support it.

Because PMEM pages are all associated with a struct dev_pagemap the flag of
protecting memory can be stored there. All PMEM is protected by the same pkey.
So a single flag is all that is needed to indicate protection.

General access in the kernel is supported by modifying the kmap infrastructure,
which can detect if a page is PMEM and PKS protected. If so, kmap_local_page()
and kmap_atomic() enable access until their corresponding unmaps are called.
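
Because maps can nest within a thread, access has to stay enabled until the
outermost unmap. A minimal userspace model of that nesting is sketched below.
The struct and function names are hypothetical; the increment test mirrors the
'current->pgmap_prot_count' check visible in __pgmap_mk_readwrite(), while the
decrement side is an assumption about how the matching disable would behave.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for per-thread state; field names are illustrative only. */
struct model_thread {
	unsigned int prot_count; /* depth of nested maps */
	bool pmem_accessible;    /* models this thread's PKS permission */
};

/* Called on map: only the outermost map enables access. */
static void model_mk_readwrite(struct model_thread *t)
{
	if (!t->prot_count++)
		t->pmem_accessible = true;
}

/* Called on unmap: only the outermost unmap disables access again. */
static void model_mk_noaccess(struct model_thread *t)
{
	assert(t->prot_count > 0);
	if (!--t->prot_count)
		t->pmem_accessible = false;
}
```

Keeping the count per thread fits the thread-local nature of PKS: one thread
enabling access does not change what any other thread can touch.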

This implementation does not support kmap()/kunmap() for a number of
reasons. First, kmap() was never really intended to create long term mappings.
Second, no known kernel users of pmem use kmap(). Third, PKS is a thread local
mechanism.

Originally this series modified many of the kmap call sites to indicate they
were thread local.[1] An attempt to support kmap()[2] was also made. But now
that kmap_local_page() has been developed[3] and is in more widespread use,
kmap() should be safe to leave unsupported and is considered an invalid access.

Handling invalid access to these pages is configurable via a new module
parameter, memremap.pks_fault_mode. Two modes are supported.

'relaxed' (default) -- WARN_ONCE, disable the protection and allow
access

'strict' -- prevent any unguarded access to a protected dev_pagemap
range

The fault handler detects the PMEM fault and applies the above configuration to
the faulting thread. The kmap call is a special case. It is considered an
invalid access but applies the configuration early, before any access occurs,
so that the kmap code path can be more easily evaluated and fixed.
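
The two fault modes described above can be sketched as a small decision
function. This is a userspace model under stated assumptions: the MODEL_*
names and the state struct are hypothetical, and the warn-once and
abandon-once behavior is folded into a single flag for brevity.

```c
#include <assert.h>
#include <stdbool.h>

enum model_fault_mode {
	MODEL_MODE_STRICT = 0,
	MODEL_MODE_RELAXED = 1,
};

struct model_state {
	enum model_fault_mode mode;
	bool protection_abandoned; /* set once protection is given up */
	unsigned int warnings;     /* counts emitted warnings */
};

/* Returns true if the fault is handled and execution may continue. */
static bool model_pgmap_fault(struct model_state *s)
{
	assert(s);

	/* Strict: report the fault as unhandled so the caller can oops. */
	if (s->mode == MODEL_MODE_STRICT)
		return false;

	/* Relaxed: warn once, abandon the protection, and carry on. */
	if (!s->protection_abandoned) {
		s->warnings++;
		s->protection_abandoned = true;
	}
	return true;
}
```

In the relaxed case a second invalid access is still allowed to continue but
produces no further warning, which keeps log noise bounded.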


[1] https://lore.kernel.org/lkml/[email protected]/

[2] https://lore.kernel.org/lkml/[email protected]/

[3] https://lore.kernel.org/lkml/[email protected]/
https://lore.kernel.org/lkml/[email protected]/
https://lore.kernel.org/lkml/[email protected]/
https://lore.kernel.org/lkml/[email protected]/

[4] https://lore.kernel.org/lkml/[email protected]/

[5] https://lore.kernel.org/lkml/[email protected]/

[6] https://lore.kernel.org/lkml/[email protected]/


Fenghua Yu (1):
x86/pks: Add PKS kernel API

Ira Weiny (16):
x86/pkeys: Create pkeys_common.h
x86/fpu: Refactor arch_set_user_pkey_access()
x86/pks: Add additional PKEY helper macros
x86/pks: Add PKS defines and Kconfig options
x86/pks: Add PKS setup code
x86/fault: Adjust WARN_ON for PKey fault
x86/pks: Preserve the PKRS MSR on context switch
x86/entry: Preserve PKRS MSR across exceptions
x86/pks: Introduce pks_abandon_protections()
x86/pks: Add PKS Test code
memremap_pages: Add access protection via supervisor Protection Keys
(PKS)
memremap_pages: Add memremap.pks_fault_mode
kmap: Add stray access protection for devmap pages
dax: Stray access protection for dax_direct_access()
nvdimm/pmem: Enable stray access protection
devdax: Enable stray access protection

Rick Edgecombe (1):
x86/pks: Add PKS fault callbacks

.../admin-guide/kernel-parameters.txt | 14 +
Documentation/core-api/protection-keys.rst | 153 +++-
arch/x86/Kconfig | 1 +
arch/x86/entry/calling.h | 26 +
arch/x86/entry/common.c | 56 ++
arch/x86/entry/entry_64.S | 22 +-
arch/x86/entry/entry_64_compat.S | 6 +-
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/disabled-features.h | 8 +-
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/include/asm/pgtable_types.h | 12 +
arch/x86/include/asm/pkeys.h | 2 +
arch/x86/include/asm/pkeys_common.h | 19 +
arch/x86/include/asm/pkru.h | 16 +-
arch/x86/include/asm/pks.h | 67 ++
arch/x86/include/asm/processor-flags.h | 2 +
arch/x86/include/asm/processor.h | 19 +-
arch/x86/include/uapi/asm/processor-flags.h | 2 +
arch/x86/kernel/cpu/common.c | 2 +
arch/x86/kernel/fpu/xstate.c | 22 +-
arch/x86/kernel/head_64.S | 7 +-
arch/x86/kernel/process.c | 3 +
arch/x86/kernel/process_64.c | 3 +
arch/x86/mm/fault.c | 82 +-
arch/x86/mm/pkeys.c | 277 +++++-
drivers/dax/device.c | 2 +
drivers/dax/super.c | 54 ++
drivers/md/dm-writecache.c | 8 +-
drivers/nvdimm/pmem.c | 55 +-
fs/dax.c | 8 +
fs/fuse/virtio_fs.c | 2 +
include/linux/dax.h | 8 +
include/linux/highmem-internal.h | 5 +
include/linux/memremap.h | 1 +
include/linux/mm.h | 88 ++
include/linux/pgtable.h | 4 +
include/linux/pkeys.h | 36 +
include/linux/sched.h | 7 +
init/init_task.c | 3 +
kernel/entry/common.c | 14 +-
kernel/fork.c | 3 +
lib/Kconfig.debug | 13 +
lib/Makefile | 3 +
lib/pks/Makefile | 3 +
lib/pks/pks_test.c | 864 ++++++++++++++++++
mm/Kconfig | 26 +
mm/memremap.c | 158 ++++
tools/testing/selftests/x86/Makefile | 2 +-
tools/testing/selftests/x86/test_pks.c | 157 ++++
49 files changed, 2261 insertions(+), 86 deletions(-)
create mode 100644 arch/x86/include/asm/pkeys_common.h
create mode 100644 arch/x86/include/asm/pks.h
create mode 100644 lib/pks/Makefile
create mode 100644 lib/pks/pks_test.c
create mode 100644 tools/testing/selftests/x86/test_pks.c

--
2.28.0.rc0.12.gb6a658bd00c9



2021-08-04 05:07:39

by Ira Weiny

Subject: [PATCH V7 14/18] memremap_pages: Add memremap.pks_fault_mode

From: Ira Weiny <[email protected]>

Some systems may be using pmem in unanticipated ways. As such, it is
possible that a code path may violate the restrictions of the PMEM PKS
protections.

To integrate the PMEM PKS feature more seamlessly, provide a
pks_fault_mode that allows for a relaxed mode should a previously
working feature start to fault on PKS protected PMEM.

2 modes are available:

'relaxed' (default) -- WARN_ONCE, abandon the protections, and
continue to operate.

'strict' -- BUG_ON or fault indicating the error. This is the
most protective of the PMEM memory but may be undesirable in
some configurations.

NOTE: There was some debate about whether a third mode called 'silent'
should be available. 'silent' would be the same as 'relaxed' but would
not print any output. While 'silent' is nice for admins who want to
reduce console/log output, it would result in less motivation to fix
invalid access to the protected pmem pages. Therefore, 'silent' is left out.

In addition, kmap() is known to not work with this protection. Provide
a new call, pgmap_protection_flag_invalid(), which gives better
debugging for missed kmap() users. This call also respects the
pks_fault_mode setting.

Signed-off-by: Ira Weiny <[email protected]>

---
Changes for V7
Leverage Rick Edgecombe's fault callback infrastructure to relax invalid
uses and prevent crashes
From Dan Williams
Use sysfs_* calls for parameter
Make pgmap_disable_protection inline
Remove pfn from warn output
Remove silent parameter option
---
.../admin-guide/kernel-parameters.txt | 14 +++
arch/x86/mm/pkeys.c | 8 +-
include/linux/mm.h | 26 ++++++
mm/memremap.c | 85 +++++++++++++++++++
4 files changed, 132 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index bdb22006f713..7902fce7f1da 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4081,6 +4081,20 @@
pirq= [SMP,APIC] Manual mp-table setup
See Documentation/x86/i386/IO-APIC.rst.

+ memremap.pks_fault_mode= [X86] Control the behavior of page map
+ protection violations. Violations may not be an actual
+ use of the memory but simply an attempt to map it in an
+ incompatible way.
+ (depends on CONFIG_DEVMAP_ACCESS_PROTECTION
+
+ Format: { relaxed | strict }
+
+ relaxed - Print a warning, disable the protection and
+ continue execution.
+ strict - Stop kernel execution via BUG_ON or fault
+
+ default: relaxed
+
plip= [PPT,NET] Parallel port network link
Format: { parport<nr> | timid | 0 }
See also Documentation/admin-guide/parport.rst.
diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index cdebc2018888..201004586c2b 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -9,6 +9,7 @@
#include <linux/debugfs.h> /* debugfs_create_u32() */
#include <linux/mm_types.h> /* mm_struct, vma, etc... */
#include <linux/pkeys.h> /* PKEY_* */
+#include <linux/mm.h> /* fault callback */
#include <uapi/asm-generic/mman-common.h>

#include <asm/cpufeature.h> /* boot_cpu_has, ... */
@@ -241,7 +242,12 @@ int handle_abandoned_pks_value(struct pt_regs *regs)
return (ept_regs->thread_pkrs != old);
}

-static const pks_key_callback pks_key_callbacks[PKS_KEY_NR_CONSUMERS] = { 0 };
+static const pks_key_callback pks_key_callbacks[PKS_KEY_NR_CONSUMERS] = {
+ [PKS_KEY_DEFAULT] = NULL,
+#ifdef CONFIG_DEVMAP_ACCESS_PROTECTION
+ [PKS_KEY_PGMAP_PROTECTION] = pgmap_pks_fault_callback,
+#endif
+};

bool handle_pks_key_callback(unsigned long address, bool write, u16 key)
{
diff --git a/include/linux/mm.h b/include/linux/mm.h
index d3c1a3ecca87..c13c7af7cad3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1216,6 +1216,7 @@ static inline bool devmap_protected(struct page *page)
return false;
}

+void __pgmap_protection_flag_invalid(struct dev_pagemap *pgmap);
void __pgmap_mk_readwrite(struct dev_pagemap *pgmap);
void __pgmap_mk_noaccess(struct dev_pagemap *pgmap);

@@ -1232,6 +1233,27 @@ static inline bool pgmap_check_pgmap_prot(struct page *page)
return true;
}

+/*
+ * pgmap_protection_flag_invalid - Check and flag an invalid use of a pgmap
+ * protected page
+ *
+ * There are code paths which are known to not be compatible with pgmap
+ * protections. pgmap_protection_flag_invalid() is provided as a 'relief
+ * valve' to be used in those functions which are known to be incompatible.
+ *
+ * Thus an invalid code path can be flag more precisely what code contains the
+ * bug vs just flagging a fault. Like the fault handler code this abandons the
+ * use of the PKS key and optionally allows the calling code path to continue
+ * based on the configuration of the memremap.pks_fault_mode command line
+ * (and/or sysfs) option.
+ */
+static inline void pgmap_protection_flag_invalid(struct page *page)
+{
+ if (!pgmap_check_pgmap_prot(page))
+ return;
+ __pgmap_protection_flag_invalid(page->pgmap);
+}
+
static inline void pgmap_mk_readwrite(struct page *page)
{
if (!pgmap_check_pgmap_prot(page))
@@ -1247,10 +1269,14 @@ static inline void pgmap_mk_noaccess(struct page *page)

bool pgmap_protection_enabled(void);

+bool pgmap_pks_fault_callback(unsigned long address, bool write);
+
#else

static inline void __pgmap_mk_readwrite(struct dev_pagemap *pgmap) { }
static inline void __pgmap_mk_noaccess(struct dev_pagemap *pgmap) { }
+
+static inline void pgmap_protection_flag_invalid(struct page *page) { }
static inline void pgmap_mk_readwrite(struct page *page) { }
static inline void pgmap_mk_noaccess(struct page *page) { }
static inline bool pgmap_protection_enabled(void)
diff --git a/mm/memremap.c b/mm/memremap.c
index a05de8714916..930b360bad86 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -95,6 +95,91 @@ static void devmap_protection_disable(void)
static_branch_dec(&dev_pgmap_protection_static_key);
}

+/*
+ * Ignore the checkpatch warning because the typedef allows
+ * param_check_pks_fault_modes to automatically check the passed value.
+ */
+typedef enum {
+ PKS_MODE_STRICT = 0,
+ PKS_MODE_RELAXED = 1,
+} pks_fault_modes;
+
+pks_fault_modes pks_fault_mode = PKS_MODE_RELAXED;
+
+static int param_set_pks_fault_mode(const char *val, const struct kernel_param *kp)
+{
+ int ret = -EINVAL;
+
+ if (!sysfs_streq(val, "relaxed")) {
+ pks_fault_mode = PKS_MODE_RELAXED;
+ ret = 0;
+ } else if (!sysfs_streq(val, "strict")) {
+ pks_fault_mode = PKS_MODE_STRICT;
+ ret = 0;
+ }
+
+ return ret;
+}
+
+static int param_get_pks_fault_mode(char *buffer, const struct kernel_param *kp)
+{
+ int ret = 0;
+
+ switch (pks_fault_mode) {
+ case PKS_MODE_STRICT:
+ ret = sysfs_emit(buffer, "strict\n");
+ break;
+ case PKS_MODE_RELAXED:
+ ret = sysfs_emit(buffer, "relaxed\n");
+ break;
+ default:
+ ret = sysfs_emit(buffer, "<unknown>\n");
+ break;
+ }
+
+ return ret;
+}
+
+static const struct kernel_param_ops param_ops_pks_fault_modes = {
+ .set = param_set_pks_fault_mode,
+ .get = param_get_pks_fault_mode,
+};
+
+#define param_check_pks_fault_modes(name, p) \
+ __param_check(name, p, pks_fault_modes)
+module_param(pks_fault_mode, pks_fault_modes, 0644);
+
+static void pgmap_abandon_protection(void)
+{
+ static bool protections_abandoned = false;
+
+ if (!protections_abandoned) {
+ protections_abandoned = true;
+ pks_abandon_protections(PKS_KEY_PGMAP_PROTECTION);
+ }
+}
+
+void __pgmap_protection_flag_invalid(struct dev_pagemap *pgmap)
+{
+ BUG_ON(pks_fault_mode == PKS_MODE_STRICT);
+
+ WARN_ONCE(1, "Page map protection disabled");
+ pgmap_abandon_protection();
+}
+EXPORT_SYMBOL_GPL(__pgmap_protection_flag_invalid);
+
+bool pgmap_pks_fault_callback(unsigned long address, bool write)
+{
+ /* In strict mode just let the fault handler oops */
+ if (pks_fault_mode == PKS_MODE_STRICT)
+ return false;
+
+ WARN_ONCE(1, "Page map protection disabled");
+ pgmap_abandon_protection();
+ return true;
+}
+EXPORT_SYMBOL_GPL(pgmap_pks_fault_callback);
+
void __pgmap_mk_readwrite(struct dev_pagemap *pgmap)
{
if (!current->pgmap_prot_count++)
--
2.28.0.rc0.12.gb6a658bd00c9


2021-08-04 05:15:44

by Ira Weiny

Subject: [PATCH V7 12/18] x86/pks: Add PKS fault callbacks

From: Rick Edgecombe <[email protected]>

Some PKS keys will want special handling on accesses that violate their
permissions. One of these is PMEM which will want to have a mode that
logs the access violation, disables protection, and continues rather
than oops the machine.

Since PKS faults do not provide the actual key that faulted, this
information needs to be recovered by walking the page tables and
extracting it from the leaf entry.

This infrastructure could also be used to implement abandoned pkeys, but
those are handled in a separate call so that they are processed more
quickly by skipping the page table walk.

In pkeys.c, define a new API for setting callbacks for individual pkeys.

Co-developed-by: Ira Weiny <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>
Signed-off-by: Rick Edgecombe <[email protected]>

---
Changes for V7:
New patch
---
Documentation/core-api/protection-keys.rst | 27 +++++++++++-
arch/x86/include/asm/pks.h | 7 +++
arch/x86/mm/fault.c | 51 ++++++++++++++++++++++
arch/x86/mm/pkeys.c | 13 ++++++
include/linux/pkeys.h | 2 +
5 files changed, 99 insertions(+), 1 deletion(-)

diff --git a/Documentation/core-api/protection-keys.rst b/Documentation/core-api/protection-keys.rst
index 8cf7eaaed3e5..bbf81b12e67d 100644
--- a/Documentation/core-api/protection-keys.rst
+++ b/Documentation/core-api/protection-keys.rst
@@ -113,7 +113,8 @@ Kernel API for PKS support

Similar to user space pkeys, supervisor pkeys allow additional protections to
be defined for a supervisor mappings. Unlike user space pkeys, violations of
-these protections result in a a kernel oops.
+these protections result in a a kernel oops unless a PKS fault handler is
+provided which handles the fault.

Supervisor Memory Protection Keys (PKS) is a feature which is found on Intel's
Sapphire Rapids (and later) "Scalable Processor" Server CPUs. It will also be
@@ -145,6 +146,30 @@ Disabled.
consumer_defaults[PKS_KEY_MY_FEATURE] = PKR_DISABLE_WRITE;
...

+
+Users may also provide a fault handler which can handle a fault differently
+than an oops. Continuing our example from above if 'MY_FEATURE' wanted to
+define a handler they can do so by adding the coresponding entry to the
+pks_key_callbacks array.
+
+::
+
+ #ifdef CONFIG_MY_FEATURE
+ bool my_feature_pks_fault_callback(unsigned long address, bool write)
+ {
+ if (my_feature_fault_is_ok)
+ return true;
+ return false;
+ }
+ #endif
+
+ static const pks_key_callback pks_key_callbacks[PKS_KEY_NR_CONSUMERS] = {
+ [PKS_KEY_DEFAULT] = NULL,
+ #ifdef CONFIG_MY_FEATURE
+ [PKS_KEY_PGMAP_PROTECTION] = my_feature_pks_fault_callback,
+ #endif
+ };
+
The following interface is used to manipulate the 'protection domain' defined
by a pkey within the kernel. Setting a pkey value in a supervisor PTE adds
this additional protection to the page.
diff --git a/arch/x86/include/asm/pks.h b/arch/x86/include/asm/pks.h
index e28413cc410d..3de5089d379d 100644
--- a/arch/x86/include/asm/pks.h
+++ b/arch/x86/include/asm/pks.h
@@ -23,6 +23,7 @@ static inline struct extended_pt_regs *extended_pt_regs(struct pt_regs *regs)

void show_extended_regs_oops(struct pt_regs *regs, unsigned long error_code);
int handle_abandoned_pks_value(struct pt_regs *regs);
+bool handle_pks_key_callback(unsigned long address, bool write, u16 key);

#else /* !CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS */

@@ -36,6 +37,12 @@ static inline int handle_abandoned_pks_value(struct pt_regs *regs)
{
return 0;
}
+static inline bool handle_pks_key_fault(struct pt_regs *regs,
+ unsigned long hw_error_code,
+ unsigned long address)
+{
+ return false;
+}

#endif /* CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS */

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 3780ed0f9597..7a8c807006c7 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1134,6 +1134,54 @@ bool fault_in_kernel_space(unsigned long address)
return address >= TASK_SIZE_MAX;
}

+#ifdef CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS
+bool handle_pks_key_fault(struct pt_regs *regs, unsigned long hw_error_code,
+ unsigned long address)
+{
+ bool write = (hw_error_code & X86_PF_WRITE);
+ pgd_t pgd;
+ p4d_t p4d;
+ pud_t pud;
+ pmd_t pmd;
+ pte_t pte;
+
+ pgd = READ_ONCE(*(init_mm.pgd + pgd_index(address)));
+ if (!pgd_present(pgd))
+ return false;
+
+ p4d = READ_ONCE(*p4d_offset(&pgd, address));
+ if (!p4d_present(p4d))
+ return false;
+
+ if (p4d_large(p4d))
+ return handle_pks_key_callback(address, write,
+ pte_flags_pkey(p4d_val(p4d)));
+
+ pud = READ_ONCE(*pud_offset(&p4d, address));
+ if (!pud_present(pud))
+ return false;
+
+ if (pud_large(pud))
+ return handle_pks_key_callback(address, write,
+ pte_flags_pkey(pud_val(pud)));
+
+ pmd = READ_ONCE(*pmd_offset(&pud, address));
+ if (!pmd_present(pmd))
+ return false;
+
+ if (pmd_large(pmd))
+ return handle_pks_key_callback(address, write,
+ pte_flags_pkey(pmd_val(pmd)));
+
+ pte = READ_ONCE(*pte_offset_kernel(&pmd, address));
+ if (!pte_present(pte))
+ return false;
+
+ return handle_pks_key_callback(address, write,
+ pte_flags_pkey(pte_val(pte)));
+}
+#endif
+
/*
* Called for all faults where 'address' is part of the kernel address
* space. Might get called for faults that originate from *code* that
@@ -1164,6 +1212,9 @@ do_kern_addr_fault(struct pt_regs *regs, unsigned long hw_error_code,

if (handle_abandoned_pks_value(regs))
return;
+
+ if (handle_pks_key_fault(regs, hw_error_code, address))
+ return;
}

#ifdef CONFIG_X86_32
diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index c7358662ec07..f0166725a128 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -241,6 +241,19 @@ int handle_abandoned_pks_value(struct pt_regs *regs)
return (ept_regs->thread_pkrs != old);
}

+static const pks_key_callback pks_key_callbacks[PKS_KEY_NR_CONSUMERS] = { 0 };
+
+bool handle_pks_key_callback(unsigned long address, bool write, u16 key)
+{
+ if (key > PKS_KEY_NR_CONSUMERS)
+ return false;
+
+ if (pks_key_callbacks[key])
+ return pks_key_callbacks[key](address, write);
+
+ return false;
+}
+
/*
* write_pkrs() optimizes MSR writes by maintaining a per cpu cache which can
* be checked quickly.
diff --git a/include/linux/pkeys.h b/include/linux/pkeys.h
index 4d22ccd971fc..549fa01d7da3 100644
--- a/include/linux/pkeys.h
+++ b/include/linux/pkeys.h
@@ -62,6 +62,8 @@ void pks_mk_readonly(int pkey);
void pks_mk_readwrite(int pkey);
void pks_abandon_protections(int pkey);

+typedef bool (*pks_key_callback)(unsigned long address, bool write);
+
#else /* !CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS */

static inline void pkrs_save_irq(struct pt_regs *regs) { }
--
2.28.0.rc0.12.gb6a658bd00c9


2021-08-04 05:27:14

by Randy Dunlap

Subject: Re: [PATCH V7 14/18] memremap_pages: Add memremap.pks_fault_mode

On 8/3/21 9:32 PM, [email protected] wrote:
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index bdb22006f713..7902fce7f1da 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4081,6 +4081,20 @@
> pirq= [SMP,APIC] Manual mp-table setup
> See Documentation/x86/i386/IO-APIC.rst.
>
> + memremap.pks_fault_mode= [X86] Control the behavior of page map
> + protection violations. Violations may not be an actual
> + use of the memory but simply an attempt to map it in an
> + incompatible way.
> + (depends on CONFIG_DEVMAP_ACCESS_PROTECTION

Missing closing ')' above.

> +
> + Format: { relaxed | strict }
> +
> + relaxed - Print a warning, disable the protection and
> + continue execution.
> + strict - Stop kernel execution via BUG_ON or fault
> +
> + default: relaxed
> +


--
~Randy


2021-08-07 19:37:31

by Ira Weiny

Subject: Re: [PATCH V7 14/18] memremap_pages: Add memremap.pks_fault_mode

On Tue, Aug 03, 2021 at 09:57:31PM -0700, Randy Dunlap wrote:
> On 8/3/21 9:32 PM, [email protected] wrote:
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > index bdb22006f713..7902fce7f1da 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -4081,6 +4081,20 @@
> > pirq= [SMP,APIC] Manual mp-table setup
> > See Documentation/x86/i386/IO-APIC.rst.
> > + memremap.pks_fault_mode= [X86] Control the behavior of page map
> > + protection violations. Violations may not be an actual
> > + use of the memory but simply an attempt to map it in an
> > + incompatible way.
> > + (depends on CONFIG_DEVMAP_ACCESS_PROTECTION
>
> Missing closing ')' above.

Fixed. Thank you!
Ira

>
> > +
> > + Format: { relaxed | strict }
> > +
> > + relaxed - Print a warning, disable the protection and
> > + continue execution.
> > + strict - Stop kernel execution via BUG_ON or fault
> > +
> > + default: relaxed
> > +
>
>
> --
> ~Randy
>
>

2021-08-11 19:02:43

by Edgecombe, Rick P

Subject: Re: [PATCH V7 14/18] memremap_pages: Add memremap.pks_fault_mode

On Tue, 2021-08-03 at 21:32 -0700, [email protected] wrote:
> +static int param_set_pks_fault_mode(const char *val, const struct
> kernel_param *kp)
> +{
> + int ret = -EINVAL;
> +
> + if (!sysfs_streq(val, "relaxed")) {
> + pks_fault_mode = PKS_MODE_RELAXED;
> + ret = 0;
> + } else if (!sysfs_streq(val, "strict")) {
> + pks_fault_mode = PKS_MODE_STRICT;
> + ret = 0;
> + }
> +
> + return ret;
> +}
> +

Looks like !sysfs_streq() should be just sysfs_streq().
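
The reason is that sysfs_streq() returns true on a match, unlike strcmp(), so
the negated form selects a mode when the string does not match. A userspace
sketch of the corrected parser is shown below. model_streq() is a stand-in
written to imitate sysfs_streq() (equal strings, ignoring one trailing
newline); it and the other MODEL_*/model_* names are hypothetical and not part
of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum model_fault_mode {
	MODEL_MODE_STRICT = 0,
	MODEL_MODE_RELAXED = 1,
};

static enum model_fault_mode model_mode = MODEL_MODE_RELAXED;

/* Stand-in for sysfs_streq(): true when equal, ignoring a trailing '\n'. */
static bool model_streq(const char *s1, const char *s2)
{
	while (*s1 && *s1 == *s2) {
		s1++;
		s2++;
	}
	if (*s1 == *s2)
		return true;
	if (!*s1 && *s2 == '\n' && !s2[1])
		return true;
	if (*s1 == '\n' && !s1[1] && !*s2)
		return true;
	return false;
}

/* Select a mode on a match; reject anything else with -EINVAL. */
static int model_set_mode(const char *val)
{
	assert(val);
	if (model_streq(val, "relaxed")) {
		model_mode = MODEL_MODE_RELAXED;
		return 0;
	}
	if (model_streq(val, "strict")) {
		model_mode = MODEL_MODE_STRICT;
		return 0;
	}
	return -EINVAL;
}
```

With the negation in place, writing "strict" would have matched the first
branch (since "strict" is not "relaxed") and selected relaxed mode instead.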

2021-08-11 21:21:58

by Edgecombe, Rick P

Subject: Re: [PATCH V7 12/18] x86/pks: Add PKS fault callbacks

On Tue, 2021-08-03 at 21:32 -0700, [email protected] wrote:
> +static const pks_key_callback
> pks_key_callbacks[PKS_KEY_NR_CONSUMERS] = { 0 };
> +
> +bool handle_pks_key_callback(unsigned long address, bool write, u16
> key)
> +{
> + if (key > PKS_KEY_NR_CONSUMERS)
> + return false;
Good idea, should be >= though?

> +
> + if (pks_key_callbacks[key])
> + return pks_key_callbacks[key](address, write);
> +
> + return false;
> +}
> +

Otherwise, I've rebased on this series and didn't hit any problems.
Thanks.
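
For reference, the bounds check question can be shown with a small userspace
model. An array of N entries has valid indices 0 to N-1, so a key equal to N
must be rejected as well. All MODEL_*/model_* names here are hypothetical
stand-ins for the callback table in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CONSUMERS 2

typedef bool (*model_key_callback)(unsigned long address, bool write);

/* Example callback that always reports the fault as handled. */
static bool model_allow(unsigned long address, bool write)
{
	(void)address;
	(void)write;
	return true;
}

/* Key 0 has no callback; key 1 has one. */
static const model_key_callback model_callbacks[MODEL_NR_CONSUMERS] = {
	NULL,
	model_allow,
};

static bool model_handle_key(unsigned long address, bool write,
			     unsigned int key)
{
	/* '>=' keeps key == MODEL_NR_CONSUMERS from indexing past the end. */
	if (key >= MODEL_NR_CONSUMERS)
		return false;

	if (model_callbacks[key])
		return model_callbacks[key](address, write);

	return false;
}
```

Since the pkey is read from a page table entry and can be any value from 0 to
15, keys that no consumer has claimed simply fall through as unhandled.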

2021-08-17 03:15:50

by Ira Weiny

Subject: Re: [PATCH V7 14/18] memremap_pages: Add memremap.pks_fault_mode

On Wed, Aug 11, 2021 at 12:01:28PM -0700, Edgecombe, Rick P wrote:
> On Tue, 2021-08-03 at 21:32 -0700, [email protected] wrote:
> > +static int param_set_pks_fault_mode(const char *val, const struct
> > kernel_param *kp)
> > +{
> > + int ret = -EINVAL;
> > +
> > + if (!sysfs_streq(val, "relaxed")) {
> > + pks_fault_mode = PKS_MODE_RELAXED;
> > + ret = 0;
> > + } else if (!sysfs_streq(val, "strict")) {
> > + pks_fault_mode = PKS_MODE_STRICT;
> > + ret = 0;
> > + }
> > +
> > + return ret;
> > +}
> > +
>
> Looks like !sysfs_streq() should be just sysfs_streq().

Indeed. Fixed.

Thanks!
Ira

2021-08-17 03:22:46

by Ira Weiny

Subject: Re: [PATCH V7 12/18] x86/pks: Add PKS fault callbacks

On Wed, Aug 11, 2021 at 02:18:26PM -0700, Edgecombe, Rick P wrote:
> On Tue, 2021-08-03 at 21:32 -0700, [email protected] wrote:
> > +static const pks_key_callback
> > pks_key_callbacks[PKS_KEY_NR_CONSUMERS] = { 0 };
> > +
> > +bool handle_pks_key_callback(unsigned long address, bool write, u16
> > key)
> > +{
> > + if (key > PKS_KEY_NR_CONSUMERS)
> > + return false;
> Good idea, should be >= though?

Yep. Fixed thanks.

>
> > +
> > + if (pks_key_callbacks[key])
> > + return pks_key_callbacks[key](address, write);
> > +
> > + return false;
> > +}
> > +
>
> Otherwise, I've rebased on this series and didn't hit any problems.
> Thanks.

Awesome! I still want Dave and Dan to weigh in before I respin with the
changes so far.

Thanks,
Ira