When user code is executed in privileged mode, the page fault handler
enters an infinite loop if ARM_LPAE is enabled.

The issue can be reproduced with
"echo EXEC_USERSPACE > /sys/kernel/debug/provoke-crash/DIRECT"

Let's fix it by adding a check in do_page_fault() that panics
when ARM_LPAE is enabled.

Fixes: 1d4d37159d01 ("ARM: 8235/1: Support for the PXN CPU feature on ARMv7")
Signed-off-by: Kefeng Wang <[email protected]>
---
arch/arm/mm/fault.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 7cfa9a59d3ec..279bbeb33b48 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -257,8 +257,14 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 		vm_flags = VM_WRITE;
 	}

-	if (fsr & FSR_LNX_PF)
+	if (fsr & FSR_LNX_PF) {
 		vm_flags = VM_EXEC;
+#ifdef CONFIG_ARM_LPAE
+		if (addr && addr < TASK_SIZE && !user_mode(regs))
+			die_kernel_fault("execution of user memory",
+					 addr, fsr, regs);
+#endif
+	}

 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
--
2.26.2
Hi,
On Wed, Jun 02, 2021 at 03:02:46PM +0800, Kefeng Wang wrote:
> When user code execution with privilege mode, it will lead to
> infinite loop in the page fault handler if ARM_LPAE enabled,
>
> The issue could be reproduced with
> "echo EXEC_USERSPACE > /sys/kernel/debug/provoke-crash/DIRECT"
>
> Lets' fix it by adding the check in do_page_fault() and panic
> when ARM_LPAE enabled.
>
> Fixes: 1d4d37159d01 ("ARM: 8235/1: Support for the PXN CPU feature on ARMv7")
> Signed-off-by: Kefeng Wang <[email protected]>
> ---
> arch/arm/mm/fault.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
> index 7cfa9a59d3ec..279bbeb33b48 100644
> --- a/arch/arm/mm/fault.c
> +++ b/arch/arm/mm/fault.c
> @@ -257,8 +257,14 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
> vm_flags = VM_WRITE;
> }
>
> - if (fsr & FSR_LNX_PF)
> + if (fsr & FSR_LNX_PF) {
> vm_flags = VM_EXEC;
> +#ifdef CONFIG_ARM_LPAE
> + if (addr && addr < TASK_SIZE && !user_mode(regs))
> + die_kernel_fault("execution of user memory",
> + addr, fsr, regs);
> +#endif
> + }
Do we need to do this test here?
Also, is this really LPAE specific? We have similar protection on 32-bit
ARM using domains to disable access to userspace except when the user
accessors are being used, so I would expect kernel-mode execution to
also cause a fault there.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
On 2021/6/2 18:52, Russell King (Oracle) wrote:
> Hi,
>
> On Wed, Jun 02, 2021 at 03:02:46PM +0800, Kefeng Wang wrote:
>> When user code execution with privilege mode, it will lead to
>> infinite loop in the page fault handler if ARM_LPAE enabled,
>>
>> The issue could be reproduced with
>> "echo EXEC_USERSPACE > /sys/kernel/debug/provoke-crash/DIRECT"
>>
>> Lets' fix it by adding the check in do_page_fault() and panic
>> when ARM_LPAE enabled.
>>
>> Fixes: 1d4d37159d01 ("ARM: 8235/1: Support for the PXN CPU feature on ARMv7")
>> Signed-off-by: Kefeng Wang <[email protected]>
>> ---
>> arch/arm/mm/fault.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
>> index 7cfa9a59d3ec..279bbeb33b48 100644
>> --- a/arch/arm/mm/fault.c
>> +++ b/arch/arm/mm/fault.c
>> @@ -257,8 +257,14 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
>> vm_flags = VM_WRITE;
>> }
>>
>> - if (fsr & FSR_LNX_PF)
>> + if (fsr & FSR_LNX_PF) {
>> vm_flags = VM_EXEC;
>> +#ifdef CONFIG_ARM_LPAE
>> + if (addr && addr < TASK_SIZE && !user_mode(regs))
>> + die_kernel_fault("execution of user memory",
>> + addr, fsr, regs);
>> +#endif
>> + }
> Do we need to do this test here?
>
> Also, is this really LPAE specific? We have similar protection on 32-bit
> ARM using domains to disable access to userspace except when the user
> accessors are being used, so I would expect kernel-mode execution to
> also cause a fault there.
IFSR format when using the Short-descriptor translation table format:

  Domain fault       01001  First level    01011  Second level
  Permission fault   01101  First level    01111  Second level

IFSR format when using the Long-descriptor translation table format:

  Permission fault   0011LL  (the LL bits indicate the level)

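The encodings above can be checked with a small user-space sketch. The bit masks mirror the FSR_FS4, FSR_FS3_0 and FSR_FS5_0 definitions in arch/arm/mm/fault.h; the function names fs_short(), fs_long(), short_fault_name() and long_fault_name() are hypothetical helpers for illustration only and do not exist in the kernel.

```c
#include <assert.h>
#include <string.h>

/* Bit layout as defined in arch/arm/mm/fault.h */
#define FSR_FS4		(1u << 10)	/* short-descriptor FS[4] */
#define FSR_FS3_0	0xfu		/* short-descriptor FS[3:0] */
#define FSR_FS5_0	0x3fu		/* long-descriptor STATUS[5:0] */

/* Short-descriptor: FS[4] lives in bit 10, FS[3:0] in bits 3..0. */
unsigned int fs_short(unsigned int fsr)
{
	return (fsr & FSR_FS3_0) | ((fsr & FSR_FS4) >> 6);
}

/* Long-descriptor (LPAE): the status is simply bits 5..0. */
unsigned int fs_long(unsigned int fsr)
{
	return fsr & FSR_FS5_0;
}

/* Hypothetical helper: name the short-descriptor codes listed above. */
const char *short_fault_name(unsigned int fs)
{
	switch (fs) {
	case 0x09: return "domain fault, first level";
	case 0x0b: return "domain fault, second level";
	case 0x0d: return "permission fault, first level";
	case 0x0f: return "permission fault, second level";
	default:   return "other";
	}
}

/* Hypothetical helper: 0011LL is a permission fault at any level. */
const char *long_fault_name(unsigned int fs)
{
	if ((fs & 0x3c) == 0x0c)
		return "permission fault";
	return "other";
}
```

For example, short_fault_name(fs_short(fsr)) on a short-descriptor status whose low bits are 1111 names a second-level permission fault, regardless of the Linux-specific FSR_LNX_PF bit in bit 31.
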
After checking the ARM spec, I think that for a permission fault we should
panic with or without LPAE. I will change the patch to:
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 7cfa9a59d3ec..dd97d9b19dec 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -257,8 +257,11 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 		vm_flags = VM_WRITE;
 	}

-	if (fsr & FSR_LNX_PF)
+	if (fsr & FSR_LNX_PF) {
 		vm_flags = VM_EXEC;
+		if (!user_mode(regs))
+			die_kernel_fault("execution of memory", addr, fsr, regs);
+	}

 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
If there are no objections, I will send all the updated patches to the
patch system. Thanks.
On Wed, Jun 02, 2021 at 11:13:14PM +0800, Kefeng Wang wrote:
> IFSR format when using the Short-descriptor translation table format
>
> Domain fault 01001 First level 01011 Second level
>
> Permission fault 01101 First level 01111 Second level
>
> IFSR format when using the Long-descriptor translation table format
>
> 0011LL Permission fault. LL bits indicate levelb.
>
> After check the ARM spec, I think for the permission fault, we should panic
> with or without LPAE, will change to
As I explained in one of the previous patches, the page tables that get
used for mapping kernel space are the _task's_ own page tables. Any new
kernel mappings are lazily copied to the task page tables - such as
when a module is loaded.
The first time we touch a page, we could end up with a page translation
fault. This will call do_page_fault(), and so with your proposal,
loading a module will potentially cause a kernel panic in this case,
probably leading to systems that panic early during userspace boot.
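
The concern can be illustrated with a small user-space model. This is only a sketch: the fault_kind enum, the fault struct and both predicates are hypothetical simplifications, not kernel types, and EXEC_USERSPACE is modelled as a permission fault on the assumption that PXN is what rejects the fetch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified fault kinds; not the kernel's own types. */
enum fault_kind { FAULT_TRANSLATION, FAULT_PERMISSION };

struct fault {
	enum fault_kind kind;
	bool exec;		/* instruction fetch, i.e. FSR_LNX_PF set */
	bool user_mode;		/* fault taken from user mode */
};

/* The check as proposed: any kernel-mode instruction fault is fatal. */
bool dies_unconditional(const struct fault *f)
{
	return f->exec && !f->user_mode;
}

/* A narrower variant: only kernel-mode permission faults are fatal. */
bool dies_permission_only(const struct fault *f)
{
	return f->exec && !f->user_mode && f->kind == FAULT_PERMISSION;
}

/* First execution of module text not yet copied into this task's tables. */
const struct fault module_first_touch = { FAULT_TRANSLATION, true, false };

/* Kernel jumping to a user page, as provoked by EXEC_USERSPACE. */
const struct fault exec_userspace = { FAULT_PERMISSION, true, false };
```

In this model dies_unconditional() is true for module_first_touch, which is the harmless lazy-copy case, while a check that also looks at the fault kind lets it through and still catches exec_userspace.
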
On 2021/6/2 23:58, Russell King (Oracle) wrote:
> On Wed, Jun 02, 2021 at 11:13:14PM +0800, Kefeng Wang wrote:
>> IFSR format when using the Short-descriptor translation table format
>>
>> Domain fault 01001 First level 01011 Second level
>>
>> Permission fault 01101 First level 01111 Second level
>>
>> IFSR format when using the Long-descriptor translation table format
>>
>> 0011LL Permission fault. LL bits indicate levelb.
>>
>> After check the ARM spec, I think for the permission fault, we should panic
>> with or without LPAE, will change to
> As I explained in one of the previous patches, the page tables that get
> used for mapping kernel space are the _tasks_ own page tables. Any new
> kernel mappings are lazily copied to the task page tables - such as
> when a module is loaded.
>
> The first time we touch a page, we could end up with a page translation
> fault. This will call do_page_fault(), and so with your proposal,
> loading a module will potentially cause a kernel panic in this case,
> probably leading to systems that panic early during userspace boot.
Could we add an FSR_FS check and only panic on a permission fault? E.g.:
+static inline bool is_permission_fault(unsigned int fsr)
+{
+	int fs = fsr_fs(fsr);
+#ifdef CONFIG_ARM_LPAE
+	if ((fs & FS_PERM_NOLL_MASK) == FS_PERM_NOLL)
+		return true;
+#else
+	if (fs == FS_L1_PERM || fs == FS_L2_PERM)
+		return true;
+#endif
+	return false;
+}
+
 static int __kprobes
 do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 {
@@ -255,8 +268,7 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 	if (fsr & FSR_LNX_PF) {
 		vm_flags = VM_EXEC;
-
-		if (!user_mode(regs))
+		if (is_permission_fault(fsr) && !user_mode(regs))
 			die_kernel_fault("execution of memory",
 					 mm, addr, fsr, regs);
 	}
diff --git a/arch/arm/mm/fault.h b/arch/arm/mm/fault.h
index 9ecc2097a87a..187954b4acca 100644
--- a/arch/arm/mm/fault.h
+++ b/arch/arm/mm/fault.h
@@ -14,6 +14,8 @@
#ifdef CONFIG_ARM_LPAE
#define FSR_FS_AEA 17
+#define FS_PERM_NOLL 0xC
+#define FS_PERM_NOLL_MASK 0x3C
static inline int fsr_fs(unsigned int fsr)
{
@@ -21,6 +23,8 @@ static inline int fsr_fs(unsigned int fsr)
}
#else
#define FSR_FS_AEA 22
+#define FS_L1_PERM 0xD
+#define FS_L2_PERM 0xF
Any suggestions, or is there a more proper solution to this issue?
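
The mask logic of the proposal can be exercised outside the kernel with a sketch like the following. The FS_* values are the ones proposed in the patch; fs_is_permission_fault() is a hypothetical user-space variant that takes the table format as a runtime argument (the patch selects it with CONFIG_ARM_LPAE) and expects the already extracted fault status, i.e. the result of fsr_fs().

```c
#include <assert.h>
#include <stdbool.h>

/* Values proposed in the patch. */
#define FS_PERM_NOLL		0x0C	/* long-descriptor 0011LL, LL cleared */
#define FS_PERM_NOLL_MASK	0x3C	/* ignore the two level bits */
#define FS_L1_PERM		0x0D	/* short-descriptor, first level */
#define FS_L2_PERM		0x0F	/* short-descriptor, second level */

/*
 * Hypothetical user-space variant: 'lpae' stands in for CONFIG_ARM_LPAE
 * so that both branches can be tested in one build.
 */
bool fs_is_permission_fault(unsigned int fs, bool lpae)
{
	if (lpae)
		return (fs & FS_PERM_NOLL_MASK) == FS_PERM_NOLL;
	return fs == FS_L1_PERM || fs == FS_L2_PERM;
}
```

With the long-descriptor format, the four values 0x0C to 0x0F all match, while translation faults (0001LL) and access flag faults (0010LL) do not. Note that such a predicate has to be called with its argument at the call site: in C, a bare function name used in a condition decays to a non-null pointer and is always true.
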
Hi Russell, any comments? Thanks.
On 2021/6/3 17:38, Kefeng Wang wrote:
>
> On 2021/6/2 23:58, Russell King (Oracle) wrote:
>> On Wed, Jun 02, 2021 at 11:13:14PM +0800, Kefeng Wang wrote:
>>> IFSR format when using the Short-descriptor translation table format
>>>
>>> Domain fault 01001 First level 01011
>>> Second level
>>>
>>> Permission fault 01101 First level 01111 Second level
>>>
>>> IFSR format when using the Long-descriptor translation table format
>>>
>>> 0011LL Permission fault. LL bits indicate levelb.
>>>
>>> After check the ARM spec, I think for the permission fault, we
>>> should panic
>>> with or without LPAE, will change to
>> As I explained in one of the previous patches, the page tables that get
>> used for mapping kernel space are the _tasks_ own page tables. Any new
>> kernel mappings are lazily copied to the task page tables - such as
>> when a module is loaded.
>>
>> The first time we touch a page, we could end up with a page translation
>> fault. This will call do_page_fault(), and so with your proposal,
>> loading a module will potentially cause a kernel panic in this case,
>> probably leading to systems that panic early during userspace boot.
>
> Could we add some FSR_FS check, only panic when the permission fault,
> eg,
>
> +static inline bool is_permission_fault(unsigned int fsr)
> +{
> + int fs = fsr_fs(fsr);
> +#ifdef CONFIG_ARM_LPAE
> + if ((fs & FS_PERM_NOLL_MASK) == FS_PERM_NOLL)
> + return true;
> +#else
> + if (fs == FS_L1_PERM || fs == FS_L2_PERM )
> + return true;
> +#endif
> + return false;
> +}
> +
> static int __kprobes
> do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs
> *regs)
> {
> @@ -255,8 +268,7 @@ do_page_fault(unsigned long addr, unsigned int
> fsr, struct pt_regs *regs)
>
> if (fsr & FSR_LNX_PF) {
> vm_flags = VM_EXEC;
> -
> - if (!user_mode(regs))
> + if (is_permission_fault && !user_mode(regs))
> die_kernel_fault("execution of memory",
> mm, addr, fsr, regs);
> }
>
> diff --git a/arch/arm/mm/fault.h b/arch/arm/mm/fault.h
> index 9ecc2097a87a..187954b4acca 100644
> --- a/arch/arm/mm/fault.h
> +++ b/arch/arm/mm/fault.h
> @@ -14,6 +14,8 @@
>
> #ifdef CONFIG_ARM_LPAE
> #define FSR_FS_AEA 17
> +#define FS_PERM_NOLL 0xC
> +#define FS_PERM_NOLL_MASK 0x3C
>
> static inline int fsr_fs(unsigned int fsr)
> {
> @@ -21,6 +23,8 @@ static inline int fsr_fs(unsigned int fsr)
> }
> #else
> #define FSR_FS_AEA 22
> +#define FS_L1_PERM 0xD
> +#define FS_L2_PERM 0xF
>
> and suggestion or proper solution to solve the issue?
>
>>