From: Masami Hiramatsu (Google) <[email protected]>
There is a small chance that get_kretprobe(ri) returns NULL in
kretprobe_dispatcher() when another CPU unregisters the kretprobe
right after __kretprobe_trampoline_handler().
To avoid this issue, kretprobe_dispatcher() checks the get_kretprobe()
return value again. And if it is NULL, it returns soon because that
kretprobe is under unregistering process.
This issue has been introduced when the kretprobe is decoupled
from the struct kretprobe_instance by commit d741bf41d7c7
("kprobes: Remove kretprobe hash"). Before that commit, the
struct kretprob_instance::rp directly points the kretprobe
and it is never be NULL.
Reported-by: Yonghong Song <[email protected]>
Fixes: d741bf41d7c7 ("kprobes: Remove kretprobe hash")
Cc: [email protected]
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
---
kernel/trace/trace_kprobe.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 93507330462c..a245ea673715 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1718,8 +1718,17 @@ static int
kretprobe_dispatcher(struct kretprobe_instance *ri, struct pt_regs *regs)
{
struct kretprobe *rp = get_kretprobe(ri);
- struct trace_kprobe *tk = container_of(rp, struct trace_kprobe, rp);
+ struct trace_kprobe *tk;
+
+ /*
+ * There is a small chance that get_kretprobe(ri) returns NULL when
+ * the kretprobe is unregister on another CPU between kretprobe's
+ * trampoline_handler and this function.
+ */
+ if (unlikely(!rp))
+ return 0;
+ tk = container_of(rp, struct trace_kprobe, rp);
raw_cpu_inc(*tk->nhit);
if (trace_probe_test_flag(&tk->tp, TP_FLAG_TRACE))
> On May 27, 2022, at 8:55 AM, Masami Hiramatsu (Google) <[email protected]> wrote:
>
> From: Masami Hiramatsu (Google) <[email protected]>
>
> There is a small chance that get_kretprobe(ri) returns NULL in
> kretprobe_dispatcher() when another CPU unregisters the kretprobe
> right after __kretprobe_trampoline_handler().
>
> To avoid this issue, kretprobe_dispatcher() checks the get_kretprobe()
> return value again. And if it is NULL, it returns soon because that
> kretprobe is under unregistering process.
>
> This issue has been introduced when the kretprobe is decoupled
> from the struct kretprobe_instance by commit d741bf41d7c7
> ("kprobes: Remove kretprobe hash"). Before that commit, the
> struct kretprob_instance::rp directly points the kretprobe
> and it is never be NULL.
>
> Reported-by: Yonghong Song <[email protected]>
> Fixes: d741bf41d7c7 ("kprobes: Remove kretprobe hash")
> Cc: [email protected]
> Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
Acked-by: Song Liu <[email protected]>
[...]
On 5/27/22 5:55 PM, Masami Hiramatsu (Google) wrote:
> From: Masami Hiramatsu (Google) <[email protected]>
>
> There is a small chance that get_kretprobe(ri) returns NULL in
> kretprobe_dispatcher() when another CPU unregisters the kretprobe
> right after __kretprobe_trampoline_handler().
>
> To avoid this issue, kretprobe_dispatcher() checks the get_kretprobe()
> return value again. And if it is NULL, it returns soon because that
> kretprobe is under unregistering process.
>
> This issue has been introduced when the kretprobe is decoupled
> from the struct kretprobe_instance by commit d741bf41d7c7
> ("kprobes: Remove kretprobe hash"). Before that commit, the
> struct kretprob_instance::rp directly points the kretprobe
> and it is never be NULL.
>
> Reported-by: Yonghong Song <[email protected]>
> Fixes: d741bf41d7c7 ("kprobes: Remove kretprobe hash")
> Cc: [email protected]
> Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
Steven, I presume you'll pick this fix up?
Thanks,
Daniel
On May 30, 2022 9:33:23 PM GMT+02:00, Daniel Borkmann <[email protected]> wrote:
>On 5/27/22 5:55 PM, Masami Hiramatsu (Google) wrote:
>> From: Masami Hiramatsu (Google) <[email protected]>
>>
>> There is a small chance that get_kretprobe(ri) returns NULL in
>> kretprobe_dispatcher() when another CPU unregisters the kretprobe
>> right after __kretprobe_trampoline_handler().
>>
>> To avoid this issue, kretprobe_dispatcher() checks the get_kretprobe()
>> return value again. And if it is NULL, it returns soon because that
>> kretprobe is under unregistering process.
>>
>> This issue has been introduced when the kretprobe is decoupled
>> from the struct kretprobe_instance by commit d741bf41d7c7
>> ("kprobes: Remove kretprobe hash"). Before that commit, the
>> struct kretprob_instance::rp directly points the kretprobe
>> and it is never be NULL.
>>
>> Reported-by: Yonghong Song <[email protected]>
>> Fixes: d741bf41d7c7 ("kprobes: Remove kretprobe hash")
>> Cc: [email protected]
>> Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
>
>Steven, I presume you'll pick this fix up?
I'm currently at Embedded/Kernel Recipes, but yeah, I'll take a look at it. (Just need to finish my slides first ;-)
-- Steve
>
>Thanks,
>Daniel
--
Sent from my Android device with K-9 Mail. Please excuse my brevity and top posting.
On 5/31/22 12:00 AM, Steven Rostedt wrote:
> On May 30, 2022 9:33:23 PM GMT+02:00, Daniel Borkmann <[email protected]> wrote:
>> On 5/27/22 5:55 PM, Masami Hiramatsu (Google) wrote:
>>> From: Masami Hiramatsu (Google) <[email protected]>
>>>
>>> There is a small chance that get_kretprobe(ri) returns NULL in
>>> kretprobe_dispatcher() when another CPU unregisters the kretprobe
>>> right after __kretprobe_trampoline_handler().
>>>
>>> To avoid this issue, kretprobe_dispatcher() checks the get_kretprobe()
>>> return value again. And if it is NULL, it returns soon because that
>>> kretprobe is under unregistering process.
>>>
>>> This issue has been introduced when the kretprobe is decoupled
>>> from the struct kretprobe_instance by commit d741bf41d7c7
>>> ("kprobes: Remove kretprobe hash"). Before that commit, the
>>> struct kretprob_instance::rp directly points the kretprobe
>>> and it is never be NULL.
>>>
>>> Reported-by: Yonghong Song <[email protected]>
>>> Fixes: d741bf41d7c7 ("kprobes: Remove kretprobe hash")
>>> Cc: [email protected]
>>> Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
>>
>> Steven, I presume you'll pick this fix up?
>
> I'm currently at Embedded/Kernel Recipes, but yeah, I'll take a look at it. (Just need to finish my slides first ;-)
Ok, thanks. If I don't hear back I presume you'll pick it up then.
On Wed, 8 Jun 2022 14:38:39 +0200
Daniel Borkmann <[email protected]> wrote:
> >>> Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
> >>
> >> Steven, I presume you'll pick this fix up?
> >
> > I'm currently at Embedded/Kernel Recipes, but yeah, I'll take a look at it. (Just need to finish my slides first ;-)
>
> Ok, thanks. If I don't hear back I presume you'll pick it up then.
Yeah, I'm way behind due to the conference. And I'll be on PTO from
tomorrow and back on Tuesday. And registration for Linux Plumbers is
supposed to open today (but of course there's issues with that!), thus, I'm
really have too much on my plate today :-p
-- Steve
On Wed, Jun 8, 2022 at 6:36 AM Steven Rostedt <[email protected]> wrote:
>
> On Wed, 8 Jun 2022 14:38:39 +0200
> Daniel Borkmann <[email protected]> wrote:
>
> > >>> Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
> > >>
> > >> Steven, I presume you'll pick this fix up?
> > >
> > > I'm currently at Embedded/Kernel Recipes, but yeah, I'll take a look at it. (Just need to finish my slides first ;-)
> >
> > Ok, thanks. If I don't hear back I presume you'll pick it up then.
>
> Yeah, I'm way behind due to the conference. And I'll be on PTO from
> tomorrow and back on Tuesday. And registration for Linux Plumbers is
> supposed to open today (but of course there's issues with that!), thus, I'm
> really have too much on my plate today :-p
Steven,
you missed Davem's presentation at KR about importance of delegation.
You need watch it:
https://youtu.be/ELPENQrtUas?t=9085
On Wed, 8 Jun 2022 06:57:00 -0700
Alexei Starovoitov <[email protected]> wrote:
> you missed Davem's presentation at KR about importance of delegation.
> You need watch it:
> https://youtu.be/ELPENQrtUas?t=9085
It's on my TODO list.
-- Steve
On 6/8/22 3:10 PM, Steven Rostedt wrote:
> On Wed, 8 Jun 2022 14:38:39 +0200
> Daniel Borkmann <[email protected]> wrote:
>
>>>>> Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
>>>>
>>>> Steven, I presume you'll pick this fix up?
>>>
>>> I'm currently at Embedded/Kernel Recipes, but yeah, I'll take a look at it. (Just need to finish my slides first ;-)
>>
>> Ok, thanks. If I don't hear back I presume you'll pick it up then.
>
> Yeah, I'm way behind due to the conference. And I'll be on PTO from
> tomorrow and back on Tuesday. And registration for Linux Plumbers is
> supposed to open today (but of course there's issues with that!), thus, I'm
> really have too much on my plate today :-p
Steven, we still have this in our patchwork for tracking so it doesn't fall
off the radar. The patch is 3 weeks old by now. Has this been picked up yet,
or do you want to Ack and we ship the fix via bpf tree? Just asking as I
didn't see any further updates ever since.
Thanks,
Daniel
On 6/17/22 10:26 AM, Daniel Borkmann wrote:
> On 6/8/22 3:10 PM, Steven Rostedt wrote:
>> On Wed, 8 Jun 2022 14:38:39 +0200
>> Daniel Borkmann <[email protected]> wrote:
>>
>>>>>> Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
>>>>>
>>>>> Steven, I presume you'll pick this fix up?
>>>>
>>>> I'm currently at Embedded/Kernel Recipes, but yeah, I'll take a look at it. (Just need to finish my slides first ;-)
>>>
>>> Ok, thanks. If I don't hear back I presume you'll pick it up then.
>>
>> Yeah, I'm way behind due to the conference. And I'll be on PTO from
>> tomorrow and back on Tuesday. And registration for Linux Plumbers is
>> supposed to open today (but of course there's issues with that!), thus, I'm
>> really have too much on my plate today :-p
>
> Steven, we still have this in our patchwork for tracking so it doesn't fall
> off the radar. The patch is 3 weeks old by now. Has this been picked up yet,
> or do you want to Ack and we ship the fix via bpf tree? Just asking as I
> didn't see any further updates ever since.
Same goes for these two fixes waiting for Ack:
https://lore.kernel.org/bpf/165461825202.280167.12903689442217921817.stgit@devnote2/
Thanks,
Daniel
On Sat, May 28, 2022 at 12:55:39AM +0900, Masami Hiramatsu (Google) wrote:
> From: Masami Hiramatsu (Google) <[email protected]>
>
> There is a small chance that get_kretprobe(ri) returns NULL in
> kretprobe_dispatcher() when another CPU unregisters the kretprobe
> right after __kretprobe_trampoline_handler().
>
> To avoid this issue, kretprobe_dispatcher() checks the get_kretprobe()
> return value again. And if it is NULL, it returns soon because that
> kretprobe is under unregistering process.
>
> This issue has been introduced when the kretprobe is decoupled
> from the struct kretprobe_instance by commit d741bf41d7c7
> ("kprobes: Remove kretprobe hash"). Before that commit, the
> struct kretprob_instance::rp directly points the kretprobe
> and it is never be NULL.
>
> Reported-by: Yonghong Song <[email protected]>
> Fixes: d741bf41d7c7 ("kprobes: Remove kretprobe hash")
> Cc: [email protected]
> Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
jirka
> ---
> kernel/trace/trace_kprobe.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index 93507330462c..a245ea673715 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -1718,8 +1718,17 @@ static int
> kretprobe_dispatcher(struct kretprobe_instance *ri, struct pt_regs *regs)
> {
> struct kretprobe *rp = get_kretprobe(ri);
> - struct trace_kprobe *tk = container_of(rp, struct trace_kprobe, rp);
> + struct trace_kprobe *tk;
> +
> + /*
> + * There is a small chance that get_kretprobe(ri) returns NULL when
> + * the kretprobe is unregister on another CPU between kretprobe's
> + * trampoline_handler and this function.
> + */
> + if (unlikely(!rp))
> + return 0;
>
> + tk = container_of(rp, struct trace_kprobe, rp);
> raw_cpu_inc(*tk->nhit);
>
> if (trace_probe_test_flag(&tk->tp, TP_FLAG_TRACE))
>
On Fri, 17 Jun 2022 10:26:40 +0200
Daniel Borkmann <[email protected]> wrote:
> Steven, we still have this in our patchwork for tracking so it doesn't fall
> off the radar. The patch is 3 weeks old by now. Has this been picked up yet,
> or do you want to Ack and we ship the fix via bpf tree? Just asking as I
> didn't see any further updates ever since.
Sorry, between traveling for conferences and PTO I fell behind. I'll pull
this into my urgent queue and start running my tests on it.
-- Steve
On 6/17/22 6:02 PM, Steven Rostedt wrote:
> On Fri, 17 Jun 2022 10:26:40 +0200
> Daniel Borkmann <[email protected]> wrote:
>
>> Steven, we still have this in our patchwork for tracking so it doesn't fall
>> off the radar. The patch is 3 weeks old by now. Has this been picked up yet,
>> or do you want to Ack and we ship the fix via bpf tree? Just asking as I
>> didn't see any further updates ever since.
>
> Sorry, between traveling for conferences and PTO I fell behind. I'll pull
> this into my urgent queue and start running my tests on it.
Okay, if you pick these fixes up today then I'll toss them from our bpf patchwork.
On Fri, 17 Jun 2022 18:09:44 +0200
Daniel Borkmann <[email protected]> wrote:
> > Sorry, between traveling for conferences and PTO I fell behind. I'll pull
> > this into my urgent queue and start running my tests on it.
>
> Okay, if you pick these fixes up today then I'll toss them from our bpf patchwork.
This patch I'll take, but please keep the fprobe/rethook one, as that's
specific to your tree.
-- Steve
On 6/17/22 7:48 PM, Steven Rostedt wrote:
> On Fri, 17 Jun 2022 18:09:44 +0200
> Daniel Borkmann <[email protected]> wrote:
>
>>> Sorry, between traveling for conferences and PTO I fell behind. I'll pull
>>> this into my urgent queue and start running my tests on it.
>>
>> Okay, if you pick these fixes up today then I'll toss them from our bpf patchwork.
>
> This patch I'll take, but please keep the fprobe/rethook one, as that's
> specific to your tree.
Ok, sounds good to me. I just applied them with the comment suggestion added.