2020-03-19 09:14:39

by Rémi Denis-Courmont

Subject: [PATCHv3 0/3] clean up KPTI / SDEI trampoline data alignment

Hi,

The KPTI and SDEI trampolines each load a pointer from the same fixmap data
page. This series reduces the data alignment to what is actually needed, and
tries to clarify the assembler code.

Changes since v2:
- Retain alignment even when SDEI is disabled to keep ld happy.

----------------------------------------------------------------
Rémi Denis-Courmont (3):
arm64: clean up trampoline vector loads
arm64/sdei: gather trampolines' .rodata
arm64: reduce trampoline data alignment

arch/arm64/kernel/entry.S | 23 ++++++++++-------------
arch/arm64/mm/mmu.c | 5 ++---
2 files changed, 12 insertions(+), 16 deletions(-)

--
Реми Дёни-Курмон
http://www.remlab.net/




2020-03-19 09:14:46

by Rémi Denis-Courmont

Subject: [PATCH 2/3] arm64/sdei: gather trampolines' .rodata

From: Rémi Denis-Courmont <[email protected]>

This gathers the two trampoline data words (the vectors pointer and the
SDEI next-handler pointer) into a single .rodata block for clarity.

Signed-off-by: Rémi Denis-Courmont <[email protected]>
---
arch/arm64/kernel/entry.S | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 24f828739696..c36733d8cd75 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -862,6 +862,11 @@ SYM_CODE_END(tramp_exit_compat)
SYM_DATA_START(__entry_tramp_data_start)
.quad vectors
SYM_DATA_END(__entry_tramp_data_start)
+#ifdef CONFIG_ARM_SDE_INTERFACE
+SYM_DATA_START(__sdei_asm_trampoline_next_handler)
+ .quad __sdei_asm_handler
+SYM_DATA_END(__sdei_asm_trampoline_next_handler)
+#endif /* CONFIG_ARM_SDE_INTERFACE */
.popsection // .rodata
#endif /* CONFIG_RANDOMIZE_BASE */
#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
@@ -980,13 +985,6 @@ SYM_CODE_END(__sdei_asm_exit_trampoline)
NOKPROBE(__sdei_asm_exit_trampoline)
.ltorg
.popsection // .entry.tramp.text
-#ifdef CONFIG_RANDOMIZE_BASE
-.pushsection ".rodata", "a"
-SYM_DATA_START(__sdei_asm_trampoline_next_handler)
- .quad __sdei_asm_handler
-SYM_DATA_END(__sdei_asm_trampoline_next_handler)
-.popsection // .rodata
-#endif /* CONFIG_RANDOMIZE_BASE */
#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */

/*
--
2.26.0.rc2

2020-03-19 09:14:53

by Rémi Denis-Courmont

Subject: [PATCH 3/3] arm64: reduce trampoline data alignment

From: Rémi Denis-Courmont <[email protected]>

The trampoline data, currently consisting of two relocated pointers,
must be within a single page. However, there is no need for it to
start on a page boundary.

This reduces the alignment to 16 bytes, which is sufficient to ensure
that the data is entirely within a single page of the fixmap.

Signed-off-by: Rémi Denis-Courmont <[email protected]>
---
arch/arm64/kernel/entry.S | 2 +-
arch/arm64/mm/mmu.c | 5 ++---
2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index c36733d8cd75..ecad15443655 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -858,7 +858,7 @@ SYM_CODE_END(tramp_exit_compat)
.popsection // .entry.tramp.text
#ifdef CONFIG_RANDOMIZE_BASE
.pushsection ".rodata", "a"
- .align PAGE_SHIFT
+ .align 4 // all .rodata must be in a single fixmap page
SYM_DATA_START(__entry_tramp_data_start)
.quad vectors
SYM_DATA_END(__entry_tramp_data_start)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 9b08f7c7e6f0..6a0e75f48e7b 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -599,9 +599,8 @@ static int __init map_entry_trampoline(void)
if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
extern char __entry_tramp_data_start[];

- __set_fixmap(FIX_ENTRY_TRAMP_DATA,
- __pa_symbol(__entry_tramp_data_start),
- PAGE_KERNEL_RO);
+ pa_start = __pa_symbol(__entry_tramp_data_start) & PAGE_MASK;
+ __set_fixmap(FIX_ENTRY_TRAMP_DATA, pa_start, PAGE_KERNEL_RO);
}

return 0;
--
2.26.0.rc2
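
The claim that 16-byte alignment keeps the two pointers inside one page can be checked with a small C sketch. This is only an illustration of the arithmetic: fits_in_one_page() is a hypothetical helper, not a kernel function, and page_shift stands in for PAGE_SHIFT.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the byte range [start, start + size)
 * lies entirely within one page of size (1 << page_shift).
 */
static inline bool fits_in_one_page(uint64_t start, uint64_t size,
				    unsigned page_shift)
{
	uint64_t mask = ~((UINT64_C(1) << page_shift) - 1);

	/* Compare the page of the first byte with the page of the last byte. */
	return size != 0 && (start & mask) == ((start + size - 1) & mask);
}
```

With two 8-byte pointers (16 bytes in total), any 16-byte aligned start passes this check for 4K, 16K and 64K pages, while an 8-byte aligned start such as 0x1ff8 can straddle a 4K boundary. The check only holds as long as the two pointers stay contiguous.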

2020-03-19 09:21:56

by Rémi Denis-Courmont

Subject: [PATCH 1/3] arm64: clean up trampoline vector loads

From: Rémi Denis-Courmont <[email protected]>

This switches from custom instruction patterns to the regular large
memory model sequence with ADRP and LDR. In doing so, the ADD
instruction can be eliminated in the SDEI handler, and the code no
longer assumes that the trampoline vectors and the vectors address both
start on a page boundary.

Signed-off-by: Rémi Denis-Courmont <[email protected]>
---
arch/arm64/kernel/entry.S | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index e5d4e30ee242..24f828739696 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -805,9 +805,9 @@ alternative_else_nop_endif
2:
tramp_map_kernel x30
#ifdef CONFIG_RANDOMIZE_BASE
- adr x30, tramp_vectors + PAGE_SIZE
+ adrp x30, tramp_vectors + PAGE_SIZE
alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
- ldr x30, [x30]
+ ldr x30, [x30, #:lo12:__entry_tramp_data_start]
#else
ldr x30, =vectors
#endif
@@ -953,9 +953,8 @@ SYM_CODE_START(__sdei_asm_entry_trampoline)
1: str x4, [x1, #(SDEI_EVENT_INTREGS + S_ORIG_ADDR_LIMIT)]

#ifdef CONFIG_RANDOMIZE_BASE
- adr x4, tramp_vectors + PAGE_SIZE
- add x4, x4, #:lo12:__sdei_asm_trampoline_next_handler
- ldr x4, [x4]
+ adrp x4, tramp_vectors + PAGE_SIZE
+ ldr x4, [x4, #:lo12:__sdei_asm_trampoline_next_handler]
#else
ldr x4, =__sdei_asm_handler
#endif
--
2.26.0.rc2
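
To make the difference between the two addressing forms concrete, here is a rough C model of what ADR, ADRP and the :lo12: relocation compute. The function names are made up for illustration and are not kernel code; the sketch models only the address arithmetic, not the instruction encodings or range limits.

```c
#include <stdint.h>

/* ADR: PC plus a signed byte offset (byte-precise, about +/-1 MiB range). */
static inline uint64_t model_adr(uint64_t pc, int64_t byte_off)
{
	return pc + (uint64_t)byte_off;
}

/*
 * ADRP: the 4 KiB-aligned page of PC plus a signed count of 4 KiB pages.
 * The 4 KiB granule is fixed by the instruction set; it does not follow
 * the kernel's configured page size.
 */
static inline uint64_t model_adrp(uint64_t pc, int64_t page_off)
{
	return (pc & ~UINT64_C(0xfff)) + ((uint64_t)page_off << 12);
}

/* :lo12: relocation: bits 11:0 of the symbol's address. */
static inline uint64_t model_lo12(uint64_t sym)
{
	return sym & UINT64_C(0xfff);
}
```

When ADRP and :lo12: refer to the same symbol, the sum of the two reproduces the symbol's address exactly. In this patch they refer to different symbols, so the result depends on how the two are laid out relative to each other.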

2020-03-19 18:39:57

by Will Deacon

Subject: Re: [PATCHv3 0/3] clean up KPTI / SDEI trampoline data alignment

On Thu, Mar 19, 2020 at 11:12:56AM +0200, Rémi Denis-Courmont wrote:
> Hi,
>
> The KPTI and SDE trampolines each load a pointer from the same fixmap data
> page. This reduces the data alignment to the useful value, and tries to
> clarify the assembler code.
>
> Changes since v2:
> - Retain alignment even when SDEI is disabled to keep ld happy.
>
> ----------------------------------------------------------------
> Rémi Denis-Courmont (3):
> arm64: clean up trampoline vector loads
> arm64/sdei: gather trampolines' .rodata
> arm64: reduce trampoline data alignment

For the series:

Acked-by: Will Deacon <[email protected]>

[in future please don't drop acks from patches that haven't changed, cheers]

Will

2020-03-20 16:57:01

by Catalin Marinas

Subject: Re: [PATCHv3 0/3] clean up KPTI / SDEI trampoline data alignment

On Thu, Mar 19, 2020 at 11:12:56AM +0200, Rémi Denis-Courmont wrote:
> Hi,
>
> The KPTI and SDE trampolines each load a pointer from the same fixmap data
> page. This reduces the data alignment to the useful value, and tries to
> clarify the assembler code.
>
> Changes since v2:
> - Retain alignment even when SDEI is disabled to keep ld happy.

I queued v2. Thanks.

--
Catalin

2020-03-21 13:43:46

by Catalin Marinas

Subject: Re: [PATCH 3/3] arm64: reduce trampoline data alignment

On Thu, Mar 19, 2020 at 11:14:07AM +0200, Rémi Denis-Courmont wrote:
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index c36733d8cd75..ecad15443655 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -858,7 +858,7 @@ SYM_CODE_END(tramp_exit_compat)
> .popsection // .entry.tramp.text
> #ifdef CONFIG_RANDOMIZE_BASE
> .pushsection ".rodata", "a"
> - .align PAGE_SHIFT
> + .align 4 // all .rodata must be in a single fixmap page
> SYM_DATA_START(__entry_tramp_data_start)
> .quad vectors
> SYM_DATA_END(__entry_tramp_data_start)
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 9b08f7c7e6f0..6a0e75f48e7b 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -599,9 +599,8 @@ static int __init map_entry_trampoline(void)
> if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
> extern char __entry_tramp_data_start[];
>
> - __set_fixmap(FIX_ENTRY_TRAMP_DATA,
> - __pa_symbol(__entry_tramp_data_start),
> - PAGE_KERNEL_RO);
> + pa_start = __pa_symbol(__entry_tramp_data_start) & PAGE_MASK;
> + __set_fixmap(FIX_ENTRY_TRAMP_DATA, pa_start, PAGE_KERNEL_RO);
> }
>
> return 0;

For some reason I haven't investigated yet, a kernel with KASAN and 64K
pages enabled does not boot (see the attached config). It seems to lock
up when starting user space. I bisected the failure to this commit;
reverting it fixes the issue.

--
Catalin


Attachments:
(No filename) (1.43 kB)
.config (212.40 kB)

2020-03-23 12:00:49

by Mark Rutland

Subject: Re: [PATCH 3/3] arm64: reduce trampoline data alignment

On Sat, Mar 21, 2020 at 01:41:01PM +0000, Catalin Marinas wrote:
> On Thu, Mar 19, 2020 at 11:14:07AM +0200, Rémi Denis-Courmont wrote:
> > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > index c36733d8cd75..ecad15443655 100644
> > --- a/arch/arm64/kernel/entry.S
> > +++ b/arch/arm64/kernel/entry.S
> > @@ -858,7 +858,7 @@ SYM_CODE_END(tramp_exit_compat)
> > .popsection // .entry.tramp.text
> > #ifdef CONFIG_RANDOMIZE_BASE
> > .pushsection ".rodata", "a"
> > - .align PAGE_SHIFT
> > + .align 4 // all .rodata must be in a single fixmap page
> > SYM_DATA_START(__entry_tramp_data_start)
> > .quad vectors
> > SYM_DATA_END(__entry_tramp_data_start)
> > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> > index 9b08f7c7e6f0..6a0e75f48e7b 100644
> > --- a/arch/arm64/mm/mmu.c
> > +++ b/arch/arm64/mm/mmu.c
> > @@ -599,9 +599,8 @@ static int __init map_entry_trampoline(void)
> > if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
> > extern char __entry_tramp_data_start[];
> >
> > - __set_fixmap(FIX_ENTRY_TRAMP_DATA,
> > - __pa_symbol(__entry_tramp_data_start),
> > - PAGE_KERNEL_RO);
> > + pa_start = __pa_symbol(__entry_tramp_data_start) & PAGE_MASK;
> > + __set_fixmap(FIX_ENTRY_TRAMP_DATA, pa_start, PAGE_KERNEL_RO);
> > }
> >
> > return 0;
>
> For some reason, I haven't investigated yet, a kernel with KASAN and 64K
> pages enabled does not boot (see the attached config). It seems to lock
> up when starting user space. Bisected to this commit, reverting it fixes
> the issue.

I think the issue might be due to ADRP + ADD :lo12: using 4K offsets,
and so patch 1 isn't quite right for !4K kernels, as we're not
accounting for 4 bits of the address when we try to generate it.

I'll check that now.

Thanks,
Mark.

2020-03-23 12:08:16

by Mark Rutland

Subject: Re: [PATCH 1/3] arm64: clean up trampoline vector loads

On Thu, Mar 19, 2020 at 11:14:05AM +0200, Rémi Denis-Courmont wrote:
> From: Rémi Denis-Courmont <[email protected]>
>
> This switches from custom instruction patterns to the regular large
> memory model sequence with ADRP and LDR. In doing so, the ADD
> instruction can be eliminated in the SDEI handler, and the code no
> longer assumes that the trampoline vectors and the vectors address both
> start on a page boundary.
>
> Signed-off-by: Rémi Denis-Courmont <[email protected]>
> ---
> arch/arm64/kernel/entry.S | 9 ++++-----
> 1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index e5d4e30ee242..24f828739696 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -805,9 +805,9 @@ alternative_else_nop_endif
> 2:
> tramp_map_kernel x30
> #ifdef CONFIG_RANDOMIZE_BASE
> - adr x30, tramp_vectors + PAGE_SIZE
> + adrp x30, tramp_vectors + PAGE_SIZE
> alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> - ldr x30, [x30]
> + ldr x30, [x30, #:lo12:__entry_tramp_data_start]

I think this is busted for !4K kernels once we reduce the alignment of
__entry_tramp_data_start.

The ADRP gives us a 64K aligned address (with bits 15:0 clear). The lo12
relocation gives us bits 11:0, so we haven't accounted for bits 15:12.
I think that's what's causing the hang Catalin sees with 64K pages (and
would also be a problem for 16K pages).

Ideally, we'd account for those bits with the ADRP, but I'm not sure
that an ELF relocation can encode symbol + addr + symbol:15-12, so we
likely need more instructions to explicitly mask that in.

... either that, or leave this page aligned.

> #else
> ldr x30, =vectors
> #endif
> @@ -953,9 +953,8 @@ SYM_CODE_START(__sdei_asm_entry_trampoline)
> 1: str x4, [x1, #(SDEI_EVENT_INTREGS + S_ORIG_ADDR_LIMIT)]
>
> #ifdef CONFIG_RANDOMIZE_BASE
> - adr x4, tramp_vectors + PAGE_SIZE
> - add x4, x4, #:lo12:__sdei_asm_trampoline_next_handler
> - ldr x4, [x4]
> + adrp x4, tramp_vectors + PAGE_SIZE
> + ldr x4, [x4, #:lo12:__sdei_asm_trampoline_next_handler]

Likewise here.

Thanks,
Mark.

> #else
> ldr x4, =__sdei_asm_handler
> #endif
> --
> 2.26.0.rc2
>

2020-03-23 12:10:08

by Rémi Denis-Courmont

Subject: Re: [PATCH 1/3] arm64: clean up trampoline vector loads

On Monday, 23 March 2020, 14.07.00 EET Mark Rutland wrote:
> On Thu, Mar 19, 2020 at 11:14:05AM +0200, Rémi Denis-Courmont wrote:
> > From: Rémi Denis-Courmont <[email protected]>
> >
> > This switches from custom instruction patterns to the regular large
> > memory model sequence with ADRP and LDR. In doing so, the ADD
> > instruction can be eliminated in the SDEI handler, and the code no
> > longer assumes that the trampoline vectors and the vectors address both
> > start on a page boundary.
> >
> > Signed-off-by: Rémi Denis-Courmont <[email protected]>
> > ---
> >
> > arch/arm64/kernel/entry.S | 9 ++++-----
> > 1 file changed, 4 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > index e5d4e30ee242..24f828739696 100644
> > --- a/arch/arm64/kernel/entry.S
> > +++ b/arch/arm64/kernel/entry.S
> > @@ -805,9 +805,9 @@ alternative_else_nop_endif
> >
> > 2:
> > tramp_map_kernel x30
> >
> > #ifdef CONFIG_RANDOMIZE_BASE
> >
> > - adr x30, tramp_vectors + PAGE_SIZE
> > + adrp x30, tramp_vectors + PAGE_SIZE
> >
> > alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> >
> > - ldr x30, [x30]
> > + ldr x30, [x30, #:lo12:__entry_tramp_data_start]
>
> I think this is busted for !4K kernels once we reduce the alignment of
> __entry_tramp_data_start.
>
> The ADRP gives us a 64K aligned address (with bits 15:0 clear). The lo12
> relocation gives us bits 11:0, so we haven't accounted for bits 15:12.

In my understanding, ADRP gives a 4K-aligned value, regardless of MMU (TCR)
settings.

I rather suspect that the problem is with my C code diff assuming that
PAGE_MASK is 4095.

--
Rémi Denis-Courmont
http://www.remlab.net/
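
For reference, a small C sketch of how a PAGE_MASK-style mask behaves, assuming the usual kernel convention that PAGE_MASK is ~(PAGE_SIZE - 1), so that it clears the in-page bits rather than selecting them. The helper names below are made up for illustration, and page_shift stands in for PAGE_SHIFT.

```c
#include <stdint.h>

/* Page size for a given shift: 12 -> 4K, 14 -> 16K, 16 -> 64K. */
static inline uint64_t page_size_of(unsigned page_shift)
{
	return UINT64_C(1) << page_shift;
}

/* Assumed convention: the mask keeps the page number, clears the offset. */
static inline uint64_t page_mask_of(unsigned page_shift)
{
	return ~(page_size_of(page_shift) - 1);
}

/* Base of the page containing pa, i.e. "pa & PAGE_MASK". */
static inline uint64_t page_base(uint64_t pa, unsigned page_shift)
{
	return pa & page_mask_of(page_shift);
}

/* Offset of pa within its page, i.e. "pa & ~PAGE_MASK". */
static inline uint64_t page_offset(uint64_t pa, unsigned page_shift)
{
	return pa & ~page_mask_of(page_shift);
}
```

Under that assumption, masking a physical address with PAGE_MASK yields the base of the containing page for any page size. What changes with 64K pages is the in-page offset, which can then occupy bits 15:0 instead of bits 11:0.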



2020-03-23 12:16:51

by Mark Rutland

Subject: Re: [PATCH 1/3] arm64: clean up trampoline vector loads

On Mon, Mar 23, 2020 at 02:08:53PM +0200, Rémi Denis-Courmont wrote:
> On Monday, 23 March 2020, 14.07.00 EET Mark Rutland wrote:
> > On Thu, Mar 19, 2020 at 11:14:05AM +0200, Rémi Denis-Courmont wrote:
> > > From: Rémi Denis-Courmont <[email protected]>
> > >
> > > This switches from custom instruction patterns to the regular large
> > > memory model sequence with ADRP and LDR. In doing so, the ADD
> > > instruction can be eliminated in the SDEI handler, and the code no
> > > longer assumes that the trampoline vectors and the vectors address both
> > > start on a page boundary.
> > >
> > > Signed-off-by: Rémi Denis-Courmont <[email protected]>
> > > ---
> > >
> > > arch/arm64/kernel/entry.S | 9 ++++-----
> > > 1 file changed, 4 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > > index e5d4e30ee242..24f828739696 100644
> > > --- a/arch/arm64/kernel/entry.S
> > > +++ b/arch/arm64/kernel/entry.S
> > > @@ -805,9 +805,9 @@ alternative_else_nop_endif
> > >
> > > 2:
> > > tramp_map_kernel x30
> > >
> > > #ifdef CONFIG_RANDOMIZE_BASE
> > >
> > > - adr x30, tramp_vectors + PAGE_SIZE
> > > + adrp x30, tramp_vectors + PAGE_SIZE
> > >
> > > alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> > >
> > > - ldr x30, [x30]
> > > + ldr x30, [x30, #:lo12:__entry_tramp_data_start]
> >
> > I think this is busted for !4K kernels once we reduce the alignment of
> > __entry_tramp_data_start.
> >
> > The ADRP gives us a 64K aligned address (with bits 15:0 clear). The lo12
> > relocation gives us bits 11:0, so we haven't accounted for bits 15:12.
>
> IMU, ADRP gives a 4K aligned value, regardless of MMU (TCR) settings.

Sorry, I had erroneously assumed tramp_vectors was page aligned. The
issue still stands -- we haven't accounted for bits 15:12, as those can
differ between tramp_vectors and __entry_tramp_data_start.

Thanks,
Mark.
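
The missing bits 15:12 can be illustrated with a small C model. It is a sketch under stated assumptions: the fixmap data page is aligned to the kernel page size, the ADRP operand resolves to the base of that page, and the page containing the symbol is mapped there. The function names and the addresses in the usage note are invented for the example.

```c
#include <stdint.h>

/*
 * Address formed by "adrp xN, tramp_vectors + PAGE_SIZE" followed by
 * "ldr xN, [xN, #:lo12:sym]": a 4 KiB-aligned base from the fixmap alias
 * plus bits 11:0 of the symbol's link-time address.
 */
static inline uint64_t tramp_load_addr(uint64_t data_page_va,
				       uint64_t sym_link_va)
{
	uint64_t base = data_page_va & ~UINT64_C(0xfff); /* ADRP result */
	uint64_t low = sym_link_va & UINT64_C(0xfff);    /* :lo12: bits */

	return base + low;
}

/*
 * Where the data really sits once the page containing the symbol is
 * mapped at data_page_va: the full in-page offset must be preserved.
 */
static inline uint64_t tramp_actual_addr(uint64_t data_page_va,
					 uint64_t sym_link_va,
					 unsigned page_shift)
{
	uint64_t in_page = sym_link_va & ((UINT64_C(1) << page_shift) - 1);

	return data_page_va + in_page;
}
```

For a symbol at 0xffff800010a53010 and a data page at 0x20000, both functions give 0x20010 with 4K pages. With 64K pages the data is actually at 0x23010 while the load still targets 0x20010, so the two differ by exactly the dropped bits 15:12. A page-aligned symbol has those bits clear, which is why the original .align PAGE_SHIFT hides the problem.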

2020-03-23 19:05:22

by Catalin Marinas

Subject: Re: [PATCH 1/3] arm64: clean up trampoline vector loads

On Mon, Mar 23, 2020 at 12:14:37PM +0000, Mark Rutland wrote:
> On Mon, Mar 23, 2020 at 02:08:53PM +0200, Rémi Denis-Courmont wrote:
> > On Monday, 23 March 2020, 14.07.00 EET Mark Rutland wrote:
> > > On Thu, Mar 19, 2020 at 11:14:05AM +0200, Rémi Denis-Courmont wrote:
> > > > From: Rémi Denis-Courmont <[email protected]>
> > > >
> > > > This switches from custom instruction patterns to the regular large
> > > > memory model sequence with ADRP and LDR. In doing so, the ADD
> > > > instruction can be eliminated in the SDEI handler, and the code no
> > > > longer assumes that the trampoline vectors and the vectors address both
> > > > start on a page boundary.
> > > >
> > > > Signed-off-by: Rémi Denis-Courmont <[email protected]>
> > > > ---
> > > >
> > > > arch/arm64/kernel/entry.S | 9 ++++-----
> > > > 1 file changed, 4 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > > > index e5d4e30ee242..24f828739696 100644
> > > > --- a/arch/arm64/kernel/entry.S
> > > > +++ b/arch/arm64/kernel/entry.S
> > > > @@ -805,9 +805,9 @@ alternative_else_nop_endif
> > > >
> > > > 2:
> > > > tramp_map_kernel x30
> > > >
> > > > #ifdef CONFIG_RANDOMIZE_BASE
> > > >
> > > > - adr x30, tramp_vectors + PAGE_SIZE
> > > > + adrp x30, tramp_vectors + PAGE_SIZE
> > > >
> > > > alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> > > >
> > > > - ldr x30, [x30]
> > > > + ldr x30, [x30, #:lo12:__entry_tramp_data_start]
> > >
> > > I think this is busted for !4K kernels once we reduce the alignment of
> > > __entry_tramp_data_start.
> > >
> > > The ADRP gives us a 64K aligned address (with bits 15:0 clear). The lo12
> > > relocation gives us bits 11:0, so we haven't accounted for bits 15:12.
> >
> > IMU, ADRP gives a 4K aligned value, regardless of MMU (TCR) settings.
>
> Sorry, I had erroneously assumed tramp_vectors was page aligned. The
> issue still stands -- we haven't accounted for bits 15:12, as those can
> differ between tramp_vectors and __entry_tramp_data_start.

Should we just use adrp on __entry_tramp_data_start? Anyway, the diff
below doesn't solve the issue I'm seeing (only reverting patch 3).

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index ca1340eb46d8..4cc9d1df3985 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -810,7 +810,7 @@ alternative_else_nop_endif
2:
tramp_map_kernel x30
#ifdef CONFIG_RANDOMIZE_BASE
- adrp x30, tramp_vectors + PAGE_SIZE
+ adrp x30, __entry_tramp_data_start
alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
ldr x30, [x30, #:lo12:__entry_tramp_data_start]
#else
@@ -964,7 +964,7 @@ SYM_CODE_START(__sdei_asm_entry_trampoline)
1: str x4, [x1, #(SDEI_EVENT_INTREGS + S_ORIG_ADDR_LIMIT)]

#ifdef CONFIG_RANDOMIZE_BASE
- adrp x4, tramp_vectors + PAGE_SIZE
+ adrp x4, __sdei_asm_trampoline_next_handler
ldr x4, [x4, #:lo12:__sdei_asm_trampoline_next_handler]
#else
ldr x4, =__sdei_asm_handler

--
Catalin

2020-03-23 20:43:19

by Rémi Denis-Courmont

Subject: Re: [PATCH 1/3] arm64: clean up trampoline vector loads

On Monday, 23 March 2020, 21.04.09 EET Catalin Marinas wrote:
> On Mon, Mar 23, 2020 at 12:14:37PM +0000, Mark Rutland wrote:
> > On Mon, Mar 23, 2020 at 02:08:53PM +0200, Rémi Denis-Courmont wrote:
> > > On Monday, 23 March 2020, 14.07.00 EET Mark Rutland wrote:
> > > > On Thu, Mar 19, 2020 at 11:14:05AM +0200, Rémi Denis-Courmont wrote:
> > > > > From: Rémi Denis-Courmont <[email protected]>
> > > > >
> > > > > This switches from custom instruction patterns to the regular large
> > > > > memory model sequence with ADRP and LDR. In doing so, the ADD
> > > > > instruction can be eliminated in the SDEI handler, and the code no
> > > > > longer assumes that the trampoline vectors and the vectors address
> > > > > both
> > > > > start on a page boundary.
> > > > >
> > > > > Signed-off-by: Rémi Denis-Courmont <[email protected]>
> > > > > ---
> > > > >
> > > > > arch/arm64/kernel/entry.S | 9 ++++-----
> > > > > 1 file changed, 4 insertions(+), 5 deletions(-)
> > > > >
> > > > > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > > > > index e5d4e30ee242..24f828739696 100644
> > > > > --- a/arch/arm64/kernel/entry.S
> > > > > +++ b/arch/arm64/kernel/entry.S
> > > > > @@ -805,9 +805,9 @@ alternative_else_nop_endif
> > > > >
> > > > > 2:
> > > > > tramp_map_kernel x30
> > > > >
> > > > > #ifdef CONFIG_RANDOMIZE_BASE
> > > > >
> > > > > - adr x30, tramp_vectors + PAGE_SIZE
> > > > > + adrp x30, tramp_vectors + PAGE_SIZE
> > > > >
> > > > > alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> > > > >
> > > > > - ldr x30, [x30]
> > > > > + ldr x30, [x30, #:lo12:__entry_tramp_data_start]
> > > >
> > > > I think this is busted for !4K kernels once we reduce the alignment of
> > > > __entry_tramp_data_start.
> > > >
> > > > The ADRP gives us a 64K aligned address (with bits 15:0 clear). The
> > > > lo12
> > > > relocation gives us bits 11:0, so we haven't accounted for bits 15:12.
> > >
> > > IMU, ADRP gives a 4K aligned value, regardless of MMU (TCR) settings.
> >
> > Sorry, I had erroneously assumed tramp_vectors was page aligned. The
> > issue still stands -- we haven't accounted for bits 15:12, as those can
> > differ between tramp_vectors and __entry_tramp_data_start.

Does that mean that the SDEI code never worked with page size > 4 KiB?

> Should we just use adrp on __entry_tramp_data_start? Anyway, the diff
> below doesn't solve the issue I'm seeing (only reverting patch 3).

AFAIU, the preexisting code uses the manual PAGE_SIZE offset because the offset
in the main vmlinux does not match the architected offset inside the fixmap. If
so, then using the symbol directly will not work at all.




> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index ca1340eb46d8..4cc9d1df3985 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -810,7 +810,7 @@ alternative_else_nop_endif
> 2:
> tramp_map_kernel x30
> #ifdef CONFIG_RANDOMIZE_BASE
> - adrp x30, tramp_vectors + PAGE_SIZE
> + adrp x30, __entry_tramp_data_start
> alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> ldr x30, [x30, #:lo12:__entry_tramp_data_start]
> #else
> @@ -964,7 +964,7 @@ SYM_CODE_START(__sdei_asm_entry_trampoline)
> 1: str x4, [x1, #(SDEI_EVENT_INTREGS + S_ORIG_ADDR_LIMIT)]
>
> #ifdef CONFIG_RANDOMIZE_BASE
> - adrp x4, tramp_vectors + PAGE_SIZE
> + adrp x4, __sdei_asm_trampoline_next_handler
> ldr x4, [x4, #:lo12:__sdei_asm_trampoline_next_handler]
> #else
> ldr x4, =__sdei_asm_handler


--
雷米‧德尼-库尔蒙
http://www.remlab.net/



2020-03-24 10:39:10

by Catalin Marinas

Subject: Re: [PATCH 1/3] arm64: clean up trampoline vector loads

On Mon, Mar 23, 2020 at 10:42:30PM +0200, Rémi Denis-Courmont wrote:
> On Monday, 23 March 2020, 21.04.09 EET Catalin Marinas wrote:
> > Should we just use adrp on __entry_tramp_data_start? Anyway, the diff
> > below doesn't solve the issue I'm seeing (only reverting patch 3).
>
> AFAIU, the preexisting code uses the manual PAGE_SIZE offset because the offset
> in the main vmlinux does not match the architected offset inside the fixmap. If
> so, then using the symbol directly will not work at all.

You are right, it broke the defconfig as well.

--
Catalin

2020-03-24 10:52:59

by Mark Rutland

Subject: Re: [PATCH 1/3] arm64: clean up trampoline vector loads

On Mon, Mar 23, 2020 at 10:42:30PM +0200, Rémi Denis-Courmont wrote:
> On Monday, 23 March 2020, 21.04.09 EET Catalin Marinas wrote:
> > On Mon, Mar 23, 2020 at 12:14:37PM +0000, Mark Rutland wrote:
> > > On Mon, Mar 23, 2020 at 02:08:53PM +0200, Rémi Denis-Courmont wrote:
> > > > On Monday, 23 March 2020, 14.07.00 EET Mark Rutland wrote:
> > > > > On Thu, Mar 19, 2020 at 11:14:05AM +0200, Rémi Denis-Courmont wrote:
> > > > > > From: Rémi Denis-Courmont <[email protected]>
> > > > > >
> > > > > > This switches from custom instruction patterns to the regular large
> > > > > > memory model sequence with ADRP and LDR. In doing so, the ADD
> > > > > > instruction can be eliminated in the SDEI handler, and the code no
> > > > > > longer assumes that the trampoline vectors and the vectors address
> > > > > > both
> > > > > > start on a page boundary.
> > > > > >
> > > > > > Signed-off-by: Rémi Denis-Courmont <[email protected]>
> > > > > > ---
> > > > > >
> > > > > > arch/arm64/kernel/entry.S | 9 ++++-----
> > > > > > 1 file changed, 4 insertions(+), 5 deletions(-)
> > > > > >
> > > > > > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > > > > > index e5d4e30ee242..24f828739696 100644
> > > > > > --- a/arch/arm64/kernel/entry.S
> > > > > > +++ b/arch/arm64/kernel/entry.S
> > > > > > @@ -805,9 +805,9 @@ alternative_else_nop_endif
> > > > > >
> > > > > > 2:
> > > > > > tramp_map_kernel x30
> > > > > >
> > > > > > #ifdef CONFIG_RANDOMIZE_BASE
> > > > > >
> > > > > > - adr x30, tramp_vectors + PAGE_SIZE
> > > > > > + adrp x30, tramp_vectors + PAGE_SIZE
> > > > > >
> > > > > > alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> > > > > >
> > > > > > - ldr x30, [x30]
> > > > > > + ldr x30, [x30, #:lo12:__entry_tramp_data_start]
> > > > >
> > > > > I think this is busted for !4K kernels once we reduce the alignment of
> > > > > __entry_tramp_data_start.
> > > > >
> > > > > The ADRP gives us a 64K aligned address (with bits 15:0 clear). The
> > > > > lo12
> > > > > relocation gives us bits 11:0, so we haven't accounted for bits 15:12.
> > > >
> > > > IMU, ADRP gives a 4K aligned value, regardless of MMU (TCR) settings.
> > >
> > > Sorry, I had erroneously assumed tramp_vectors was page aligned. The
> > > issue still stands -- we haven't accounted for bits 15:12, as those can
> > > differ between tramp_vectors and __entry_tramp_data_start.
>
> Does that mean that the SDEI code never worked with page size > 4 KiB?

I think this happens to work, but is fragile. Because nothing happens to
get placed in .rodata between the __entry_tramp_data_start data and the
__sdei_asm_trampoline_next_handler data, the
__sdei_asm_trampoline_next_handler data doesn't spill into a separate
page from the __entry_tramp_data_start data.

If we did start adding stuff into .rodata between those two, there'd be
a bigger risk of things going wrong. That was why I suggested a
.entry.tramp.data section previously.

> > Should we just use adrp on __entry_tramp_data_start? Anyway, the diff
> > below doesn't solve the issue I'm seeing (only reverting patch 3).
>
> AFAIU, the preexisting code uses the manual PAGE_SIZE offset because the offset
> in the main vmlinux does not match the architected offset inside the fixmap. If
> so, then using the symbol directly will not work at all.

Indeed. I can't see a neat way of avoiding this right now, so should we
drop these patches and leave the code as-is (but with comments as to the
special requirements that it has)?

Thanks,
Mark.

2020-03-24 11:25:06

by Catalin Marinas

Subject: Re: [PATCH 1/3] arm64: clean up trampoline vector loads

On Tue, Mar 24, 2020 at 10:52:17AM +0000, Mark Rutland wrote:
> On Mon, Mar 23, 2020 at 10:42:30PM +0200, Rémi Denis-Courmont wrote:
> > On Monday, 23 March 2020, 21.04.09 EET Catalin Marinas wrote:
> > > Should we just use adrp on __entry_tramp_data_start? Anyway, the diff
> > > below doesn't solve the issue I'm seeing (only reverting patch 3).
> >
> > AFAIU, the preexisting code uses the manual PAGE_SIZE offset because the offset
> > in the main vmlinux does not match the architected offset inside the fixmap. If
> > so, then using the symbol directly will not work at all.
>
> Indeed. I can't see a neat way of avoiding this right now, so should we
> drop these patches and leave the code as-is (but with comments as to the
> special requirements that it has)?

I'm going to drop these three patches from -next for now but I can take
any updated comments (they are pretty much missing from this code).

Thanks.

--
Catalin