Received: by 2002:a05:6a10:f347:0:0:0:0 with SMTP id d7csp1403958pxu; Mon, 23 Nov 2020 22:05:46 -0800 (PST) X-Google-Smtp-Source: ABdhPJwyP/bX+RMTCvJ1SfnrHqU8KbBg6/oiQ11vtX/NY0XcPdIfNyh5tPL2tF93P9zkJNFIMkrC X-Received: by 2002:a17:906:8058:: with SMTP id x24mr2917794ejw.272.1606197945825; Mon, 23 Nov 2020 22:05:45 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1606197945; cv=none; d=google.com; s=arc-20160816; b=CDzkp7DasyOhXaYDPW5zWQf9zOcmfUG9+uSGUdQlTIMwCrIVgorTDQYYvVMiqAKVs0 KiSTzu281J4nv+154EXTNDiPfWIxOg9R49YCrwSuen5zXdHF6eblFb4kY2mlozwF9ASn UB8CiGawKZ4TzAa7tVK252X/1NdbocRO98Wv8YTqbymtxN8iyct0kwRLF623Sj1Iz1In t67DaHktdTcBTJ7jy32YC5l59H5DI0tllqTmK72jlz9hdaCyTlXDrzchm8+FDUe90DAW HjOCH6QqcGCcVZJvYqOVMcwkyLDRl1YpRbgp9iplF22qCOXT2KBq1ro2bYJEOHSBSkfk URkQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:user-agent:in-reply-to:content-disposition :mime-version:references:mail-followup-to:message-id:subject:cc:to :from:date; bh=XoW1y5r1Joo8ysOA1txmsvh0XORXycTdvxHH4rFWIHY=; b=C7ChvIUgT5PSiOs9Xo89IP0feTZNq1dK2JErcDuh2ujkItpcWFOiJ1+X7wWkBYqAUq 5WBhVv08aS+43sFiV47mmQ3a2GvJmxj/VJY3HkueRHLGOmywTu3eV/0oNkzovi9F3rBo JtUVJfFcqm3C+BfXFuJL3T+kLHHxE37dbQxDsI0zjGuA9jbfQw6oC28cxwBHa3yWWXQ8 X1zz8JFzzCbiZohOYd9rFWRjQYoqwi14bJeBuWSHYKiXOkBuJFxXyTKKrRx2cZQUfK1t 3nD7BHxK85vDl+dBBLVzFszqqvd7x9iC68pjjQZLP+mfeNkHwJgznsY+Z9VJ8d7vJJT2 nMPQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id rn17si9109965ejb.282.2020.11.23.22.05.08; Mon, 23 Nov 2020 22:05:45 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728846AbgKXF7T (ORCPT + 99 others); Tue, 24 Nov 2020 00:59:19 -0500 Received: from hydra.tuxags.com ([64.13.172.54]:34234 "EHLO mail.tuxags.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725786AbgKXF7T (ORCPT ); Tue, 24 Nov 2020 00:59:19 -0500 Received: by mail.tuxags.com (Postfix, from userid 1000) id C14948971543; Mon, 23 Nov 2020 21:59:18 -0800 (PST) Date: Mon, 23 Nov 2020 21:59:18 -0800 From: Matt Mullins To: Steven Rostedt Cc: Florian Weimer , Peter Zijlstra , Mathieu Desnoyers , linux-kernel , Matt Mullins , Ingo Molnar , Alexei Starovoitov , Daniel Borkmann , Dmitry Vyukov , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , netdev , bpf , Kees Cook , Josh Poimboeuf , linux-toolchains@vger.kernel.org Subject: Re: [PATCH v3] tracepoint: Do not fail unregistering a probe due to memory allocation Message-ID: <20201124055918.k5m6htif7ukhch6v@hydra.tuxags.com> Mail-Followup-To: Steven Rostedt , Florian Weimer , Peter Zijlstra , Mathieu Desnoyers , linux-kernel , Matt Mullins , Ingo Molnar , Alexei Starovoitov , Daniel Borkmann , Dmitry Vyukov , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , netdev , bpf , Kees Cook , Josh Poimboeuf , linux-toolchains@vger.kernel.org References: <20201116175107.02db396d@gandalf.local.home> <47463878.48157.1605640510560.JavaMail.zimbra@efficios.com> <20201117142145.43194f1a@gandalf.local.home> <375636043.48251.1605642440621.JavaMail.zimbra@efficios.com> <20201117153451.3015c5c9@gandalf.local.home> <20201118132136.GJ3121378@hirez.programming.kicks-ass.net> <87h7pmwyta.fsf@mid.deneb.enyo.de> <20201118141226.GV3121392@hirez.programming.kicks-ass.net> <874klmwxxm.fsf@mid.deneb.enyo.de> <20201118093405.7a6d2290@gandalf.local.home> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20201118093405.7a6d2290@gandalf.local.home> User-Agent: NeoMutt/20170113 (1.7.2) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Nov 18, 2020 at 09:34:05AM -0500, Steven Rostedt wrote: > From: "Steven Rostedt (VMware)" > > The list of tracepoint callbacks is managed by an array that is protected > by RCU. To update this array, a new array is allocated, the updates are > copied over to the new array, and then the list of functions for the > tracepoint is switched over to the new array. After a completion of an RCU > grace period, the old array is freed. > > This process happens for both adding a callback as well as removing one. > But on removing a callback, if the new array fails to be allocated, the > callback is not removed, and may be used after it is freed by the clients > of the tracepoint. > > There's really no reason to fail if the allocation for a new array fails > when removing a function. Instead, the function can simply be replaced by a > stub function that could be cleaned up on the next modification of the > array. That is, instead of calling the function registered to the > tracepoint, it would call a stub function in its place. > > Link: https://lore.kernel.org/r/20201115055256.65625-1-mmullins@mmlx.us > Link: https://lore.kernel.org/r/20201116175107.02db396d@gandalf.local.home > Link: https://lore.kernel.org/r/20201117211836.54acaef2@oasis.local.home > > Cc: Mathieu Desnoyers > Cc: Ingo Molnar > Cc: Alexei Starovoitov > Cc: Daniel Borkmann > Cc: Dmitry Vyukov > Cc: Martin KaFai Lau > Cc: Song Liu > Cc: Yonghong Song > Cc: Andrii Nakryiko > Cc: John Fastabend > Cc: KP Singh > Cc: netdev > Cc: bpf > Cc: Kees Cook > Cc: Florian Weimer > Fixes: 97e1c18e8d17b ("tracing: Kernel Tracepoints") > Reported-by: syzbot+83aa762ef23b6f0d1991@syzkaller.appspotmail.com > Reported-by: syzbot+d29e58bb557324e55e5e@syzkaller.appspotmail.com > Reported-by: Matt Mullins > Signed-off-by: Steven Rostedt (VMware) I'm a bit late answering your initial query, but yes indeed this fixes the bug I was hunting. I just watched it live through the reproducer for about a half-hour, while unpatched I get an instant "BUG: unable to handle page fault". Tested-by: Matt Mullins > --- > Changes since v2: > - Went back to using a stub function and not touching > the fast path. > - Removed adding __GFP_NOFAIL from the allocation of the removal. > > kernel/tracepoint.c | 80 ++++++++++++++++++++++++++++++++++++--------- > 1 file changed, 64 insertions(+), 16 deletions(-) > > diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c > index 3f659f855074..3e261482296c 100644 > --- a/kernel/tracepoint.c > +++ b/kernel/tracepoint.c > @@ -53,6 +53,12 @@ struct tp_probes { > struct tracepoint_func probes[]; > }; > > +/* Called in removal of a func but failed to allocate a new tp_funcs */ > +static void tp_stub_func(void) > +{ > + return; > +} > + > static inline void *allocate_probes(int count) > { > struct tp_probes *p = kmalloc(struct_size(p, probes, count), > @@ -131,6 +137,7 @@ func_add(struct tracepoint_func **funcs, struct tracepoint_func *tp_func, > { > struct tracepoint_func *old, *new; > int nr_probes = 0; > + int stub_funcs = 0; > int pos = -1; > > if (WARN_ON(!tp_func->func)) > @@ -147,14 +154,34 @@ func_add(struct tracepoint_func **funcs, struct tracepoint_func *tp_func, > if (old[nr_probes].func == tp_func->func && > old[nr_probes].data == tp_func->data) > return ERR_PTR(-EEXIST); > + if (old[nr_probes].func == tp_stub_func) > + stub_funcs++; > } > } > - /* + 2 : one for new probe, one for NULL func */ > - new = allocate_probes(nr_probes + 2); > + /* + 2 : one for new probe, one for NULL func - stub functions */ > + new = allocate_probes(nr_probes + 2 - stub_funcs); > if (new == NULL) > return ERR_PTR(-ENOMEM); > if (old) { > - if (pos < 0) { > + if (stub_funcs) { > + /* Need to copy one at a time to remove stubs */ > + int probes = 0; > + > + pos = -1; > + for (nr_probes = 0; old[nr_probes].func; nr_probes++) { > + if (old[nr_probes].func == tp_stub_func) > + continue; > + if (pos < 0 && old[nr_probes].prio < prio) > + pos = probes++; > + new[probes++] = old[nr_probes]; > + } > + nr_probes = probes; > + if (pos < 0) > + pos = probes; > + else > + nr_probes--; /* Account for insertion */ > + > + } else if (pos < 0) { > pos = nr_probes; > memcpy(new, old, nr_probes * sizeof(struct tracepoint_func)); > } else { > @@ -188,8 +215,9 @@ static void *func_remove(struct tracepoint_func **funcs, > /* (N -> M), (N > 1, M >= 0) probes */ > if (tp_func->func) { > for (nr_probes = 0; old[nr_probes].func; nr_probes++) { > - if (old[nr_probes].func == tp_func->func && > - old[nr_probes].data == tp_func->data) > + if ((old[nr_probes].func == tp_func->func && > + old[nr_probes].data == tp_func->data) || > + old[nr_probes].func == tp_stub_func) > nr_del++; > } > } > @@ -208,14 +236,32 @@ static void *func_remove(struct tracepoint_func **funcs, > /* N -> M, (N > 1, M > 0) */ > /* + 1 for NULL */ > new = allocate_probes(nr_probes - nr_del + 1); > - if (new == NULL) > - return ERR_PTR(-ENOMEM); > - for (i = 0; old[i].func; i++) > - if (old[i].func != tp_func->func > - || old[i].data != tp_func->data) > - new[j++] = old[i]; > - new[nr_probes - nr_del].func = NULL; > - *funcs = new; > + if (new) { > + for (i = 0; old[i].func; i++) > + if ((old[i].func != tp_func->func > + || old[i].data != tp_func->data) > + && old[i].func != tp_stub_func) > + new[j++] = old[i]; > + new[nr_probes - nr_del].func = NULL; > + *funcs = new; > + } else { > + /* > + * Failed to allocate, replace the old function > + * with calls to tp_stub_func. > + */ > + for (i = 0; old[i].func; i++) > + if (old[i].func == tp_func->func && > + old[i].data == tp_func->data) { > + old[i].func = tp_stub_func; > + /* Set the prio to the next event. */ > + if (old[i + 1].func) > + old[i].prio = > + old[i + 1].prio; > + else > + old[i].prio = -1; > + } > + *funcs = old; > + } > } > debug_print_probes(*funcs); > return old; > @@ -295,10 +341,12 @@ static int tracepoint_remove_func(struct tracepoint *tp, > tp_funcs = rcu_dereference_protected(tp->funcs, > lockdep_is_held(&tracepoints_mutex)); > old = func_remove(&tp_funcs, func); > - if (IS_ERR(old)) { > - WARN_ON_ONCE(PTR_ERR(old) != -ENOMEM); > + if (WARN_ON_ONCE(IS_ERR(old))) > return PTR_ERR(old); > - } > + > + if (tp_funcs == old) > + /* Failed allocating new tp_funcs, replaced func with stub */ > + return 0; > > if (!tp_funcs) { > /* Removed last function */ > -- > 2.25.4 >