ARM64 JIT used FP (x29) as eBPF fp register, but FP is subjected to
change during function call so it may cause the BPF prog stack base address
change too. Whenever, it pointed to the bottom of BPF prog stack instead of
the top.
So, when copying data via bpf_probe_read, it will be copied to (SP - offset),
then it may overwrite the saved FP/LR.
Use x25 to replace FP as BPF stack base register (fp). Since x25 is callee
saved register, so it will keep intact during function call.
It is initialized in BPF prog prologue when BPF prog is started to run
everytime. When BPF prog exits, it could be just tossed.
Other than this the BPf prog stack base need to be setup before function
call stack.
So, the BPF stack layout looks like:
high
original A64_SP => 0:+-----+ BPF prologue
| | FP/LR and callee saved registers
BPF fp register => +64:+-----+
| |
| ... | BPF prog stack
| |
| |
current A64_SP => +-----+
| |
| ... | Function call stack
| |
+-----+
low
Signed-off-by: Yang Shi <[email protected]>
CC: Zi Shen Lim <[email protected]>
CC: Xi Wang <[email protected]>
---
arch/arm64/net/bpf_jit_comp.c | 38 +++++++++++++++++++++++++++++++-------
1 file changed, 31 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index a44e529..6809647 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -50,7 +50,7 @@ static const int bpf2a64[] = {
[BPF_REG_8] = A64_R(21),
[BPF_REG_9] = A64_R(22),
/* read-only frame pointer to access stack */
- [BPF_REG_FP] = A64_FP,
+ [BPF_REG_FP] = A64_R(25),
/* temporary register for internal BPF JIT */
[TMP_REG_1] = A64_R(23),
[TMP_REG_2] = A64_R(24),
@@ -155,18 +155,42 @@ static void build_prologue(struct jit_ctx *ctx)
stack_size += 4; /* extra for skb_copy_bits buffer */
stack_size = STACK_ALIGN(stack_size);
+ /*
+ * BPF prog stack layout
+ *
+ * high
+ * original A64_SP => 0:+-----+ BPF prologue
+ * | | FP/LR and callee saved registers
+ * BPF fp register => +64:+-----+
+ * | |
+ * | ... | BPF prog stack
+ * | |
+ * | |
+ * current A64_SP => +-----+
+ * | |
+ * | ... | Function call stack
+ * | |
+ * +-----+
+ * low
+ *
+ */
+
+ /* Save FP and LR registers to stay align with ARM64 AAPCS */
+ emit(A64_PUSH(A64_FP, A64_LR, A64_SP), ctx);
+
/* Save callee-saved register */
emit(A64_PUSH(r6, r7, A64_SP), ctx);
emit(A64_PUSH(r8, r9, A64_SP), ctx);
if (ctx->tmp_used)
emit(A64_PUSH(tmp1, tmp2, A64_SP), ctx);
- /* Set up BPF stack */
- emit(A64_SUB_I(1, A64_SP, A64_SP, stack_size), ctx);
-
- /* Set up frame pointer */
+ /* Set up BPF prog stack base register (x25) */
emit(A64_MOV(1, fp, A64_SP), ctx);
+ /* Set up function call stack */
+ emit(A64_SUB_I(1, A64_SP, A64_SP, stack_size), ctx);
+ emit(A64_MOV(1, A64_FP, A64_SP), ctx);
+
/* Clear registers A and X */
emit_a64_mov_i64(ra, 0, ctx);
emit_a64_mov_i64(rx, 0, ctx);
@@ -196,8 +220,8 @@ static void build_epilogue(struct jit_ctx *ctx)
emit(A64_POP(r8, r9, A64_SP), ctx);
emit(A64_POP(r6, r7, A64_SP), ctx);
- /* Restore frame pointer */
- emit(A64_MOV(1, fp, A64_SP), ctx);
+ /* Restore FP/LR registers */
+ emit(A64_POP(A64_FP, A64_LR, A64_SP), ctx);
/* Set return value */
emit(A64_MOV(1, A64_R(0), r0), ctx);
--
2.0.2
On Fri, Nov 06, 2015 at 09:36:17PM -0800, Yang Shi wrote:
> ARM64 JIT used FP (x29) as eBPF fp register, but FP is subjected to
> change during function call so it may cause the BPF prog stack base address
> change too. Whenever, it pointed to the bottom of BPF prog stack instead of
> the top.
>
> So, when copying data via bpf_probe_read, it will be copied to (SP - offset),
> then it may overwrite the saved FP/LR.
>
> Use x25 to replace FP as BPF stack base register (fp). Since x25 is callee
> saved register, so it will keep intact during function call.
> It is initialized in BPF prog prologue when BPF prog is started to run
> everytime. When BPF prog exits, it could be just tossed.
>
> Other than this the BPf prog stack base need to be setup before function
> call stack.
>
> So, the BPF stack layout looks like:
>
> high
> original A64_SP => 0:+-----+ BPF prologue
> | | FP/LR and callee saved registers
> BPF fp register => +64:+-----+
> | |
> | ... | BPF prog stack
> | |
> | |
> current A64_SP => +-----+
> | |
> | ... | Function call stack
> | |
> +-----+
> low
>
> Signed-off-by: Yang Shi <[email protected]>
> CC: Zi Shen Lim <[email protected]>
> CC: Xi Wang <[email protected]>
Thanks for tracking it down.
That looks like fundamental bug in arm64 jit. I'm surprised function calls worked at all.
Zi please review.
On Sat, Nov 7, 2015 at 6:27 PM, Alexei Starovoitov
<[email protected]> wrote:
> On Fri, Nov 06, 2015 at 09:36:17PM -0800, Yang Shi wrote:
>> ARM64 JIT used FP (x29) as eBPF fp register, but FP is subjected to
>> change during function call so it may cause the BPF prog stack base address
>> change too. Whenever, it pointed to the bottom of BPF prog stack instead of
>> the top.
>>
>> So, when copying data via bpf_probe_read, it will be copied to (SP - offset),
>> then it may overwrite the saved FP/LR.
>>
>> Use x25 to replace FP as BPF stack base register (fp). Since x25 is callee
>> saved register, so it will keep intact during function call.
>> It is initialized in BPF prog prologue when BPF prog is started to run
>> everytime. When BPF prog exits, it could be just tossed.
>>
>> Other than this the BPf prog stack base need to be setup before function
>> call stack.
>>
>> So, the BPF stack layout looks like:
>>
>> high
>> original A64_SP => 0:+-----+ BPF prologue
>> | | FP/LR and callee saved registers
>> BPF fp register => +64:+-----+
>> | |
>> | ... | BPF prog stack
>> | |
>> | |
>> current A64_SP => +-----+
>> | |
>> | ... | Function call stack
>> | |
>> +-----+
>> low
>>
>> Signed-off-by: Yang Shi <[email protected]>
>> CC: Zi Shen Lim <[email protected]>
>> CC: Xi Wang <[email protected]>
>
> Thanks for tracking it down.
> That looks like fundamental bug in arm64 jit. I'm surprised function calls worked at all.
> Zi please review.
>
For function calls (BPF_JMP | BPF_CALL), we are compliant with AAPCS64
[1]. That part is okay.
bpf_probe_read accesses the BPF program stack, which is based on BPF_REG_FP.
This exposes an issue with how BPF_REG_FP was setup, as Yang pointed out.
Instead of having BPF_REG_FP point to top of stack, we erroneously
point it to the bottom of stack. When there are function calls, we run
the risk of clobbering of BPF stack. Bad idea.
Otherwise, since BPF_REG_FP is read-only, and is setup exactly once in
prologue, it remains consistent throughout lifetime of the BPF
program.
Yang, can you please try the following?
8<-----
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -161,12 +161,12 @@ static void build_prologue(struct jit_ctx *ctx)
if (ctx->tmp_used)
emit(A64_PUSH(tmp1, tmp2, A64_SP), ctx);
- /* Set up BPF stack */
- emit(A64_SUB_I(1, A64_SP, A64_SP, stack_size), ctx);
-
/* Set up frame pointer */
emit(A64_MOV(1, fp, A64_SP), ctx);
+ /* Set up BPF stack */
+ emit(A64_SUB_I(1, A64_SP, A64_SP, stack_size), ctx);
+
/* Clear registers A and X */
emit_a64_mov_i64(ra, 0, ctx);
emit_a64_mov_i64(rx, 0, ctx);
----->8
[1] http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf
On 11/8/2015 2:29 PM, Z Lim wrote:
> On Sat, Nov 7, 2015 at 6:27 PM, Alexei Starovoitov
> <[email protected]> wrote:
>> On Fri, Nov 06, 2015 at 09:36:17PM -0800, Yang Shi wrote:
>>> ARM64 JIT used FP (x29) as eBPF fp register, but FP is subjected to
>>> change during function call so it may cause the BPF prog stack base address
>>> change too. Whenever, it pointed to the bottom of BPF prog stack instead of
>>> the top.
>>>
>>> So, when copying data via bpf_probe_read, it will be copied to (SP - offset),
>>> then it may overwrite the saved FP/LR.
>>>
>>> Use x25 to replace FP as BPF stack base register (fp). Since x25 is callee
>>> saved register, so it will keep intact during function call.
>>> It is initialized in BPF prog prologue when BPF prog is started to run
>>> everytime. When BPF prog exits, it could be just tossed.
>>>
>>> Other than this the BPf prog stack base need to be setup before function
>>> call stack.
>>>
>>> So, the BPF stack layout looks like:
>>>
>>> high
>>> original A64_SP => 0:+-----+ BPF prologue
>>> | | FP/LR and callee saved registers
>>> BPF fp register => +64:+-----+
>>> | |
>>> | ... | BPF prog stack
>>> | |
>>> | |
>>> current A64_SP => +-----+
>>> | |
>>> | ... | Function call stack
>>> | |
>>> +-----+
>>> low
>>>
>>> Signed-off-by: Yang Shi <[email protected]>
>>> CC: Zi Shen Lim <[email protected]>
>>> CC: Xi Wang <[email protected]>
>>
>> Thanks for tracking it down.
>> That looks like fundamental bug in arm64 jit. I'm surprised function calls worked at all.
>> Zi please review.
>>
>
> For function calls (BPF_JMP | BPF_CALL), we are compliant with AAPCS64
> [1]. That part is okay.
>
>
> bpf_probe_read accesses the BPF program stack, which is based on BPF_REG_FP.
>
> This exposes an issue with how BPF_REG_FP was setup, as Yang pointed out.
> Instead of having BPF_REG_FP point to top of stack, we erroneously
> point it to the bottom of stack. When there are function calls, we run
> the risk of clobbering of BPF stack. Bad idea.
Yes, exactly.
>
> Otherwise, since BPF_REG_FP is read-only, and is setup exactly once in
> prologue, it remains consistent throughout lifetime of the BPF
> program.
>
>
> Yang, can you please try the following?
It should work without the below change:
+ emit(A64_MOV(1, A64_FP, A64_SP), ctx);
I added it to stay align with ARMv8 AAPCS to maintain the correct FP
during function call. It makes us get correct stack backtrace.
I think we'd better to keep compliant with ARMv8 AAPCS in BPF JIT
prologue too.
If nobody thinks it is necessary, we definitely could remove that change.
Thanks,
Yang
>
> 8<-----
> --- a/arch/arm64/net/bpf_jit_comp.c
> +++ b/arch/arm64/net/bpf_jit_comp.c
> @@ -161,12 +161,12 @@ static void build_prologue(struct jit_ctx *ctx)
> if (ctx->tmp_used)
> emit(A64_PUSH(tmp1, tmp2, A64_SP), ctx);
>
> - /* Set up BPF stack */
> - emit(A64_SUB_I(1, A64_SP, A64_SP, stack_size), ctx);
> -
> /* Set up frame pointer */
> emit(A64_MOV(1, fp, A64_SP), ctx);
>
> + /* Set up BPF stack */
> + emit(A64_SUB_I(1, A64_SP, A64_SP, stack_size), ctx);
> +
> /* Clear registers A and X */
> emit_a64_mov_i64(ra, 0, ctx);
> emit_a64_mov_i64(rx, 0, ctx);
> ----->8
>
> [1] http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf
>
On Mon, Nov 9, 2015 at 10:08 AM, Shi, Yang <[email protected]> wrote:
> I added it to stay align with ARMv8 AAPCS to maintain the correct FP during
> function call. It makes us get correct stack backtrace.
>
> I think we'd better to keep compliant with ARMv8 AAPCS in BPF JIT prologue
> too.
>
> If nobody thinks it is necessary, we definitely could remove that change.
Oh no, I don't think anyone will say it's unnecessary!
I agree the A64_FP-related change is a good idea, so stack unwinding works.
How about splitting this into two patches? One for the BPF-related
bug, and another for A64 FP-handling.
Thanks again for tracking this down and improving things overall for arm64 :)
>
> Thanks,
> Yang
>
>
On 11/9/2015 12:00 PM, Z Lim wrote:
> On Mon, Nov 9, 2015 at 10:08 AM, Shi, Yang <[email protected]> wrote:
>> I added it to stay align with ARMv8 AAPCS to maintain the correct FP during
>> function call. It makes us get correct stack backtrace.
>>
>> I think we'd better to keep compliant with ARMv8 AAPCS in BPF JIT prologue
>> too.
>>
>> If nobody thinks it is necessary, we definitely could remove that change.
>
> Oh no, I don't think anyone will say it's unnecessary!
> I agree the A64_FP-related change is a good idea, so stack unwinding works.
>
> How about splitting this into two patches? One for the BPF-related
> bug, and another for A64 FP-handling.
I'm not sure if this is a good approach or not. IMHO, they are kind of
atomic. Without A64 FP-handling, that fix looks incomplete and
introduces another problem (stack backtrace).
Thanks,
Yang
>
> Thanks again for tracking this down and improving things overall for arm64 :)
>
>>
>> Thanks,
>> Yang
>>
>>
On Tue, Nov 10, 2015 at 11:46 AM, Shi, Yang <[email protected]> wrote:
> On 11/9/2015 12:00 PM, Z Lim wrote:
>>
>> How about splitting this into two patches? One for the BPF-related
>> bug, and another for A64 FP-handling.
>
> I'm not sure if this is a good approach or not. IMHO, they are kind of
> atomic. Without A64 FP-handling, that fix looks incomplete and introduces
> another problem (stack backtrace).
>
The first, even on its own, doesn't make things worse, only better.
The second, which we agree needs to be fixed also, addresses a different issue.
Either way, please also note that these patches fix the original
implementation. We do want -stable to pick these up.
Suggestions for the diagram:
- As an enhancement, would you mind showing the A64_FP also?
- Please revisit "+64:"