From: Pu Lehui <[email protected]>
The bpf trampoline currently attaches to kernel functions via the
ftrace direct call API, while text_poke is used for bpf2bpf attachment
and tail call optimization. For architectures that do not support the
ftrace direct call API, text_poke is still able to attach a bpf
trampoline to kernel functions. Let's relax the restriction by falling
back to text_poke when the architecture does not support direct calls.
Signed-off-by: Pu Lehui <[email protected]>
---
kernel/bpf/trampoline.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index d6395215b849..386197a7952c 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -228,15 +228,11 @@ static int modify_fentry(struct bpf_trampoline *tr, void *old_addr, void *new_ad
static int register_fentry(struct bpf_trampoline *tr, void *new_addr)
{
void *ip = tr->func.addr;
- unsigned long faddr;
int ret;
- faddr = ftrace_location((unsigned long)ip);
- if (faddr) {
- if (!tr->fops)
- return -ENOTSUPP;
+ if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS) &&
+ !!ftrace_location((unsigned long)ip))
tr->func.ftrace_managed = true;
- }
if (bpf_trampoline_module_get(tr))
return -ENOENT;
--
2.25.1
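For context, register_fentry() as it would read after this patch -- a
sketch reconstructed from the diff above; the unchanged tail is quoted
from the kernel around index d6395215b849 and may differ in detail:

static int register_fentry(struct bpf_trampoline *tr, void *new_addr)
{
	void *ip = tr->func.addr;
	int ret;

	/* With the direct-call API available, let ftrace manage the
	 * patch site; otherwise fall back to bpf_arch_text_poke().
	 */
	if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS) &&
	    !!ftrace_location((unsigned long)ip))
		tr->func.ftrace_managed = true;

	if (bpf_trampoline_module_get(tr))
		return -ENOENT;

	if (tr->func.ftrace_managed) {
		ftrace_set_filter_ip(tr->fops, (unsigned long)ip, 0, 1);
		ret = register_ftrace_direct_multi(tr->fops, (long)new_addr);
	} else {
		ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, NULL, new_addr);
	}

	if (ret)
		bpf_trampoline_module_put(tr);

	return ret;
}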
On 2022/12/20 10:32, Xu Kuohai wrote:
> On 12/20/2022 10:13 AM, Pu Lehui wrote:
>> From: Pu Lehui <[email protected]>
>>
>> The bpf trampoline currently attaches to kernel functions via the
>> ftrace direct call API, while text_poke is used for bpf2bpf attachment
>> and tail call optimization. For architectures that do not support the
>> ftrace direct call API, text_poke is still able to attach a bpf
>> trampoline to kernel functions. Let's relax the restriction by falling
>> back to text_poke when the architecture does not support direct calls.
>>
>> Signed-off-by: Pu Lehui <[email protected]>
>> ---
>> kernel/bpf/trampoline.c | 8 ++------
>> 1 file changed, 2 insertions(+), 6 deletions(-)
>>
>> diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
>> index d6395215b849..386197a7952c 100644
>> --- a/kernel/bpf/trampoline.c
>> +++ b/kernel/bpf/trampoline.c
>> @@ -228,15 +228,11 @@ static int modify_fentry(struct bpf_trampoline *tr, void *old_addr, void *new_ad
>> static int register_fentry(struct bpf_trampoline *tr, void *new_addr)
>> {
>> void *ip = tr->func.addr;
>> - unsigned long faddr;
>> int ret;
>> - faddr = ftrace_location((unsigned long)ip);
>> - if (faddr) {
>> - if (!tr->fops)
>> - return -ENOTSUPP;
>> + if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS) &&
>> + !!ftrace_location((unsigned long)ip))
>> tr->func.ftrace_managed = true;
>> - }
>>
>
> After this patch, a kernel function with a valid ftrace_location will
> be patched by bpf when CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS is
> disabled, which means a kernel function may be patched by both bpf and
> ftrace in a mutually unaware way. This will cause ftrace and
> bpf_arch_text_poke to fail in a somewhat random way if the function to
> be patched was already patched by the other.
Thanks for your review. And yes, this is a backward-compatible solution
for architectures that do not support DYNAMIC_FTRACE_WITH_DIRECT_CALLS.
>
>> if (bpf_trampoline_module_get(tr))
>> return -ENOENT;
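The failure mode described above can be illustrated with a minimal,
purely editorial sketch (not kernel code): both ftrace and
bpf_arch_text_poke() verify that a patch site still holds the
instruction bytes they expect before rewriting it, so whichever
mechanism patches second sees unexpected bytes and bails out:

#include <errno.h>
#include <string.h>

static int poke_site(unsigned char *ip, const unsigned char *expect,
		     const unsigned char *new_insn, size_t len)
{
	/* Bail out if the other mechanism patched the site first. */
	if (memcmp(ip, expect, len))
		return -EBUSY;
	/* The real code rewrites the text safely (text_poke and
	 * friends); plain memcpy here is only for illustration.
	 */
	memcpy(ip, new_insn, len);
	return 0;
}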
Pu Lehui <[email protected]> writes:
> On 2022/12/20 10:32, Xu Kuohai wrote:
>> On 12/20/2022 10:13 AM, Pu Lehui wrote:
>>> From: Pu Lehui <[email protected]>
>>>
>>> The bpf trampoline currently attaches to kernel functions via the
>>> ftrace direct call API, while text_poke is used for bpf2bpf attachment
>>> and tail call optimization. For architectures that do not support the
>>> ftrace direct call API, text_poke is still able to attach a bpf
>>> trampoline to kernel functions. Let's relax the restriction by falling
>>> back to text_poke when the architecture does not support direct calls.
>>>
>>> Signed-off-by: Pu Lehui <[email protected]>
>>> ---
>>> kernel/bpf/trampoline.c | 8 ++------
>>> 1 file changed, 2 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
>>> index d6395215b849..386197a7952c 100644
>>> --- a/kernel/bpf/trampoline.c
>>> +++ b/kernel/bpf/trampoline.c
>>> @@ -228,15 +228,11 @@ static int modify_fentry(struct bpf_trampoline *tr, void *old_addr, void *new_ad
>>> static int register_fentry(struct bpf_trampoline *tr, void *new_addr)
>>> {
>>> void *ip = tr->func.addr;
>>> - unsigned long faddr;
>>> int ret;
>>> - faddr = ftrace_location((unsigned long)ip);
>>> - if (faddr) {
>>> - if (!tr->fops)
>>> - return -ENOTSUPP;
>>> + if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS) &&
>>> + !!ftrace_location((unsigned long)ip))
>>> tr->func.ftrace_managed = true;
>>> - }
>>>
>>
>> After this patch, a kernel function with a valid ftrace_location will
>> be patched by bpf when CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS is
>> disabled, which means a kernel function may be patched by both bpf and
>> ftrace in a mutually unaware way. This will cause ftrace and
>> bpf_arch_text_poke to fail in a somewhat random way if the function to
>> be patched was already patched by the other.
>
> Thanks for your review. And yes, this is a backward-compatible solution
> for architectures that do not support DYNAMIC_FTRACE_WITH_DIRECT_CALLS.
It's not "backward compatible". Reiterating what Kuohai said; The BPF
trampoline must be aware of ftrace's state -- with this patch, the
trampoline can't blindly poke the text managed my ftrace.
I'd recommend to remove this patch from the series.
Björn
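One way to read this recommendation, as a purely editorial sketch
(register_fentry_checked() is a hypothetical name, not code from the
thread): keep the original -ENOTSUPP guard, so that the text_poke
fallback never touches a function that is an ftrace patch site:

static int register_fentry_checked(struct bpf_trampoline *tr, void *new_addr)
{
	void *ip = tr->func.addr;
	unsigned long faddr;

	faddr = ftrace_location((unsigned long)ip);
	if (faddr) {
		/* The site belongs to ftrace: attach only through the
		 * direct-call API, never behind ftrace's back.
		 */
		if (!IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS) ||
		    !tr->fops)
			return -ENOTSUPP;
		tr->func.ftrace_managed = true;
	}
	/* ... continue as in register_fentry() above ... */
	return 0;
}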
On 2023/1/3 20:05, Björn Töpel wrote:
> Pu Lehui <[email protected]> writes:
>
>> On 2022/12/20 10:32, Xu Kuohai wrote:
>>> On 12/20/2022 10:13 AM, Pu Lehui wrote:
>>>> From: Pu Lehui <[email protected]>
>>>>
>>>> The bpf trampoline currently attaches to kernel functions via the
>>>> ftrace direct call API, while text_poke is used for bpf2bpf attachment
>>>> and tail call optimization. For architectures that do not support the
>>>> ftrace direct call API, text_poke is still able to attach a bpf
>>>> trampoline to kernel functions. Let's relax the restriction by falling
>>>> back to text_poke when the architecture does not support direct calls.
>>>>
>>>> Signed-off-by: Pu Lehui <[email protected]>
>>>> ---
>>>> kernel/bpf/trampoline.c | 8 ++------
>>>> 1 file changed, 2 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
>>>> index d6395215b849..386197a7952c 100644
>>>> --- a/kernel/bpf/trampoline.c
>>>> +++ b/kernel/bpf/trampoline.c
>>>> @@ -228,15 +228,11 @@ static int modify_fentry(struct bpf_trampoline *tr, void *old_addr, void *new_ad
>>>> static int register_fentry(struct bpf_trampoline *tr, void *new_addr)
>>>> {
>>>> void *ip = tr->func.addr;
>>>> - unsigned long faddr;
>>>> int ret;
>>>> - faddr = ftrace_location((unsigned long)ip);
>>>> - if (faddr) {
>>>> - if (!tr->fops)
>>>> - return -ENOTSUPP;
>>>> + if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS) &&
>>>> + !!ftrace_location((unsigned long)ip))
>>>> tr->func.ftrace_managed = true;
>>>> - }
>>>>
>>>
>>> After this patch, a kernel function with a valid ftrace_location will
>>> be patched by bpf when CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS is
>>> disabled, which means a kernel function may be patched by both bpf and
>>> ftrace in a mutually unaware way. This will cause ftrace and
>>> bpf_arch_text_poke to fail in a somewhat random way if the function to
>>> be patched was already patched by the other.
>>
>> Thanks for your review. And yes, this is a backward-compatible solution
>> for architectures that do not support DYNAMIC_FTRACE_WITH_DIRECT_CALLS.
>
> It's not "backward compatible". Reiterating what Kuohai said; The BPF
> trampoline must be aware of ftrace's state -- with this patch, the
> trampoline can't blindly poke the text managed my ftrace.
>
> I'd recommend to remove this patch from the series.
>
After further consideration, Kuohai's catch is right. I will remove this
patch in the next version.

In the meantime, I found that Song and Guoren have been working on
supporting riscv ftrace with direct calls [0], so we can concentrate on
making bpf_arch_text_poke work specifically for the bpf context.

However, the riscv ftrace base framework will change, because [0] uses
t0 as the link register of the traced function. We should consider the
generality of the riscv bpf trampoline for both kernel functions and the
bpf context. It's not clear if [0] will be upstreamed, so maybe we
should wait for it?

[0]
https://lore.kernel.org/linux-riscv/[email protected]

Anyway, thanks to both of you for the review.
>
> Björn