2024-01-08 13:00:57

by Brandt, Oliver - Lenze

Subject: [PATCH] arm64: mm: disable PAN during caches_clean_inval_user_pou

Using the cacheflush() syscall from 32-bit user space fails when
ARM64_PAN is enabled. We get an endless loop:

1. Executing "dc cvau, x2" raises an abort.
2. The abort handler does not fix the cause of the abort and
   returns to step 1.

Disabling PAN for the duration of the cache maintenance fixes this.

Fixes: 338d4f49d6f7 ("arm64: kernel: Add support for Privileged Access Never")
Cc: [email protected]
Signed-off-by: Oliver Brandt <[email protected]>
---
arch/arm64/mm/cache.S | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 503567c864fde..333c4c2baa568 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -70,10 +70,12 @@ SYM_FUNC_ALIAS(__pi_caches_clean_inval_pou, caches_clean_inval_pou)
  */
 SYM_FUNC_START(caches_clean_inval_user_pou)
 	uaccess_ttbr0_enable x2, x3, x4
+	ALTERNATIVE("nop", SET_PSTATE_PAN(0), ARM64_HAS_PAN, CONFIG_ARM64_PAN)
 
 	caches_clean_inval_pou_macro 2f
 	mov	x0, xzr
 1:
+	ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN, CONFIG_ARM64_PAN)
 	uaccess_ttbr0_disable x1, x2
 	ret
 2:
--
2.43.0


2024-01-08 15:56:17

by Mark Rutland

Subject: Re: [PATCH] arm64: mm: disable PAN during caches_clean_inval_user_pou

Hi Oliver,

On Mon, Jan 08, 2024 at 01:00:39PM +0000, Brandt, Oliver - Lenze wrote:
> Using the cacheflush() syscall from 32-bit user space fails when
> ARM64_PAN is enabled. We get an endless loop:
>
> 1. Executing "dc cvau, x2" raises an abort.
> 2. The abort handler does not fix the cause of the abort and
>    returns to step 1.
>
> Disabling PAN for the duration of the cache maintenance fixes this.

Hmm... the ARM ARM says PSTATE.PAN is not supposed to affect DC CVAU.

Looking at the latest ARM ARM (ARM DDI 0487J.a), R_PMTWB states:

| The PSTATE.PAN bit has no effect on all of the following:
|
| o Instruction fetches.
| o Data cache instructions, except DC ZVA.
| o If FEAT_PAN2 is not implemented, then address translation instructions.
| o If FEAT_PAN2 is implemented, then the address translation instructions
| other than AT S1E1RP and AT S1E1WP.

So IIUC, DC CVAU shouldn't be affected by PAN.
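
As a rough illustration of the rule quoted above, the following C sketch
encodes R_PMTWB as a predicate. The enum and function names are hypothetical
and exist only for this example; it models the quoted wording, not any kernel
or hardware interface. It also assumes that operations the rule does not
exempt (DC ZVA and ordinary privileged data accesses) remain subject to PAN.

```c
#include <stdbool.h>

/* Hypothetical operation classes, named only for this illustration. */
enum op_class {
	OP_INSN_FETCH,		/* instruction fetch */
	OP_DC_ZVA,		/* DC ZVA */
	OP_DC_OTHER,		/* any other data cache instruction, e.g. DC CVAU */
	OP_AT_S1E1RP_WP,	/* AT S1E1RP or AT S1E1WP */
	OP_AT_OTHER,		/* any other address translation instruction */
	OP_LOAD_STORE,		/* ordinary privileged data access */
};

/*
 * Return true if PSTATE.PAN can affect the operation, following the
 * wording of R_PMTWB as quoted in this thread.
 * Assumption: anything the rule does not exempt is subject to PAN.
 */
bool pan_can_affect(enum op_class op, bool feat_pan2)
{
	switch (op) {
	case OP_INSN_FETCH:
	case OP_DC_OTHER:
	case OP_AT_OTHER:
		/* Explicitly exempt, with or without FEAT_PAN2. */
		return false;
	case OP_AT_S1E1RP_WP:
		/* Exempt only when FEAT_PAN2 is not implemented. */
		return feat_pan2;
	case OP_DC_ZVA:
	case OP_LOAD_STORE:
		return true;
	}
	return true;
}
```

For instance, pan_can_affect(OP_DC_OTHER, true) evaluates to false, which
matches the conclusion that DC CVAU should not fault because of PAN.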

This could be a CPU bug; which CPU are you seeing this with?

Mark.


2024-01-08 16:38:00

by Brandt, Oliver - Lenze

Subject: Re: [PATCH] arm64: mm: disable PAN during caches_clean_inval_user_pou

Hi Mark,

>
> Hi Oliver,
>
> On Mon, Jan 08, 2024 at 01:00:39PM +0000, Brandt, Oliver - Lenze wrote:
> > Using the cacheflush() syscall from 32-bit user space fails when
> > ARM64_PAN is enabled. We get an endless loop:
> >
> > 1. Executing "dc cvau, x2" raises an abort.
> > 2. The abort handler does not fix the cause of the abort and
> >    returns to step 1.
> >
> > Disabling PAN for the duration of the cache maintenance fixes this.
>
> Hmm... the ARM ARM says PSTATE.PAN is not supposed to affect DC CVAU.
>
> Looking at the latest ARM ARM (ARM DDI 0487J.a), R_PMTWB states:
>
> > The PSTATE.PAN bit has no effect on all of the following:
> >
> > o Instruction fetches.
> > o Data cache instructions, except DC ZVA.
> > o If FEAT_PAN2 is not implemented, then address translation instructions.
> > o If FEAT_PAN2 is implemented, then the address translation instructions
> > other than AT S1E1RP and AT S1E1WP.
>
> So IIUC, DC CVAU shouldn't be affected by PAN.

Oops... Sorry, I didn't notice this.

> This could be a CPU bug; which CPU are you seeing this with?

I stumbled across this while using Intel's "Simics" simulator with a
model of the upcoming "Agilex5 socfpga". The "Agilex5" is an SoC
containing two Cortex-A76 and two Cortex-A55 cores.

We are expecting the real silicon in a couple of weeks. It seems like a
good idea to check the silicon first. Sorry to bother you with this.

Many thanks for the quick reply!

>
> Mark.

Oliver


2024-01-08 18:04:32

by Mark Rutland

Subject: Re: [PATCH] arm64: mm: disable PAN during caches_clean_inval_user_pou

On Mon, Jan 08, 2024 at 04:37:37PM +0000, Brandt, Oliver - Lenze wrote:
> > On Mon, Jan 08, 2024 at 01:00:39PM +0000, Brandt, Oliver - Lenze wrote:
> > > Using the cacheflush() syscall from 32-bit user space fails when
> > > ARM64_PAN is enabled. We get an endless loop:
> > >
> > > 1. Executing "dc cvau, x2" raises an abort.
> > > 2. The abort handler does not fix the cause of the abort and
> > >    returns to step 1.
> > >
> > > Disabling PAN for the duration of the cache maintenance fixes this.
> >
> > Hmm... the ARM ARM says PSTATE.PAN is not supposed to affect DC CVAU.
> >
> > Looking at the latest ARM ARM (ARM DDI 0487J.a), R_PMTWB states:
> >
> > > The PSTATE.PAN bit has no effect on all of the following:
> > >
> > > o Instruction fetches.
> > > o Data cache instructions, except DC ZVA.
> > > o If FEAT_PAN2 is not implemented, then address translation instructions.
> > > o If FEAT_PAN2 is implemented, then the address translation instructions
> > > other than AT S1E1RP and AT S1E1WP.
> >
> > So IIUC, DC CVAU shouldn't be affected by PAN.
>
> Oops... Sorry, I didn't notice this.

No worries; this is not at all obvious!

> > This could be a CPU bug; which CPU are you seeing this with?
>
> I stumbled across this while using Intel's "Simics" simulator with a
> model of the upcoming "Agilex5 socfpga". The "Agilex5" is an SoC
> containing two Cortex-A76 and two Cortex-A55 cores.

Ah, so it could be a bug in Simics, then.

> We are expecting the real silicon in a couple of weeks. It seems like a
> good idea to check the silicon first. Sorry to bother you with this.

Just to make sure, I ran a quick test on an AML-905D3-CC board (quad-core
Cortex-A55), and AFAICT we're not taking unexpected faults. The log is
below, including the test case.

If you do see problems on silicon, please let us know!

Mark.

---->8----
mark@flodeboller:~/test/aarch32-cacheflush$ sudo dmesg | grep -i access
[ 0.010476] CPU features: detected: Privileged Access Never
mark@flodeboller:~/test/aarch32-cacheflush$ cat test.c
#include <stdio.h>

void cacheflush(void *start, size_t size)
{
	printf("Attempting flush of [%p..%p]\n", start, start + size);
	__builtin___clear_cache(start, start + size);
}

int main(int argc, char *argv[])
{
	static char buf[4096];

	cacheflush(buf, sizeof(buf));

	cacheflush(NULL, sizeof(buf));

	return 0;
}
mark@flodeboller:~/test/aarch32-cacheflush$ arm-linux-gnueabihf-gcc test.c -o test -O3
mark@flodeboller:~/test/aarch32-cacheflush$ file test
test: ELF 32-bit LSB pie executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-armhf.so.3, BuildID[sha1]=a53713f6b623b9b7c29cee4dc615fb7d43a0dcb6, for GNU/Linux 3.2.0, not stripped
mark@flodeboller:~/test/aarch32-cacheflush$ strace ./test
execve("./test", ["./test"], 0xffffd7e09890 /* 25 vars */ <unfinished ...>
[ Process PID=7682 runs in 32 bit mode. ]
strace: WARNING: Proper structure decoding for this personality is not supported, please consider building strace with mpers support enabled.
<... execve resumed>) = 0
brk(NULL) = 0x222b000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
statx(3, "", AT_STATX_SYNC_AS_STAT|AT_NO_AUTOMOUNT|AT_EMPTY_PATH, STATX_BASIC_STATS, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=31402, ...}) = 0
mmap2(NULL, 31402, PROT_READ, MAP_PRIVATE, 3, 0) = 0xf7b0b000
close(3) = 0
openat(AT_FDCWD, "/lib/arm-linux-gnueabihf/libc.so.6", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\3\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0i\344\1\0004\0\0\0"..., 512) = 512
statx(3, "", AT_STATX_SYNC_AS_STAT|AT_NO_AUTOMOUNT|AT_EMPTY_PATH, STATX_BASIC_STATS, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0755, stx_size=1102644, ...}) = 0
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xf7b09000
mmap2(NULL, 1139660, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xf79f2000
mmap2(0xf7afc000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x109000) = 0xf7afc000
mmap2(0xf7aff000, 37836, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xf7aff000
close(3) = 0
set_tls(0xf7b09ce0) = 0
set_tid_address(0xf7b09848) = 7682
set_robust_list(0xf7b0984c, 12) = 0
rseq(0xf7b09cc0, 0x20, 0, 0xe7f5def3) = 0
mprotect(0xf7afc000, 8192, PROT_READ) = 0
mprotect(0x572000, 4096, PROT_READ) = 0
mprotect(0xf7b31000, 4096, PROT_READ) = 0
ugetrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
munmap(0xf7b0b000, 31402) = 0
statx(1, "", AT_STATX_SYNC_AS_STAT|AT_NO_AUTOMOUNT|AT_EMPTY_PATH, STATX_BASIC_STATS, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFCHR|0620, stx_size=0, ...}) = 0
getrandom("\x79\xe4\xe7\x57", 4, GRND_NONBLOCK) = 4
brk(NULL) = 0x222b000
brk(0x224c000) = 0x224c000
write(1, "Attempting flush of [0x573040..0"..., 41Attempting flush of [0x573040..0x574040]
) = 41
cacheflush(0x573040, 0x574040, 0) = 0
write(1, "Attempting flush of [(nil)..0x10"..., 36Attempting flush of [(nil)..0x1000]
) = 36
cacheflush(0, 0x1000, 0) = -1 EFAULT (Bad address)
exit_group(0) = ?
+++ exited with 0 +++
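
The test case above relies on pointer arithmetic on void *, which GCC accepts
as an extension. A minimal sketch of the same helper in standard C follows,
assuming a compiler that provides __builtin___clear_cache (GCC and Clang do).
The name flush_range is hypothetical. Returning the end address is only a
convenience for checking the range that gets passed on, which the strace log
shows arriving at the syscall as a start/end pair
(cacheflush(0x573040, 0x574040, 0)).

```c
#include <stddef.h>

/*
 * Flush [start, start + size) and return the computed end address.
 * The conversion to char * avoids arithmetic on void *, which is a
 * GNU C extension rather than standard C.
 */
char *flush_range(void *start, size_t size)
{
	char *begin = start;
	char *end = begin + size;

	/* Compiler builtin; what it expands to depends on the target. */
	__builtin___clear_cache(begin, end);
	return end;
}
```

Calling flush_range(buf, sizeof(buf)) on the 4096-byte static buffer from the
test case would request the same range as the first cacheflush() call in the
log.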

2024-01-09 08:42:33

by Brandt, Oliver - Lenze

Subject: Re: [PATCH] arm64: mm: disable PAN during caches_clean_inval_user_pou

On Mon, 2024-01-08 at 17:58 +0000, Mark Rutland wrote:
> On Mon, Jan 08, 2024 at 04:37:37PM +0000, Brandt, Oliver - Lenze wrote:
> > > On Mon, Jan 08, 2024 at 01:00:39PM +0000, Brandt, Oliver - Lenze wrote:
> > > > Using the cacheflush() syscall from 32-bit user space fails when
> > > > ARM64_PAN is enabled. We get an endless loop:
> > > >
> > > > 1. Executing "dc cvau, x2" raises an abort.
> > > > 2. The abort handler does not fix the cause of the abort and
> > > >    returns to step 1.
> > > >
> > > > Disabling PAN for the duration of the cache maintenance fixes this.
> > >
> > > Hmm... the ARM ARM says PSTATE.PAN is not supposed to affect DC CVAU.
> > >
> > > Looking at the latest ARM ARM (ARM DDI 0487J.a), R_PMTWB states:
> > >
> > > > The PSTATE.PAN bit has no effect on all of the following:
> > > >
> > > > o Instruction fetches.
> > > > o Data cache instructions, except DC ZVA.
> > > > o If FEAT_PAN2 is not implemented, then address translation instructions.
> > > > o If FEAT_PAN2 is implemented, then the address translation instructions
> > > > other than AT S1E1RP and AT S1E1WP.
> > >
> > > So IIUC, DC CVAU shouldn't be affected by PAN.
> >
> > Oops... Sorry, I didn't notice this.
>
> No worries; this is not at all obvious!
>
> > > This could be a CPU bug; which CPU are you seeing this with?
> >
> > I stumbled across this while using Intel's "Simics" simulator with a
> > model of the upcoming "Agilex5 socfpga". The "Agilex5" is an SoC
> > containing two Cortex-A76 and two Cortex-A55 cores.
>
> Ah, so it could be a bug in Simics, then.
>

I now think so, too. It is not the first bug we've found, but it is the
first one in the CPU models.

> > We are expecting the real silicon in a couple of weeks. It seems like a
> > good idea to check the silicon first. Sorry to bother you with this.
>
> Just to make sure, I ran a quick test on an AML-905D3-CC board (quad-core
> Cortex-A55), and AFAICT we're not taking unexpected faults. The log is
> below, including the test case.
>
> If you do see problems on silicon, please let us know!
>

I will. Thanks a lot for spending your time on this!

> Mark.

Oliver
