Received: by 2002:a05:6a10:16a7:0:0:0:0 with SMTP id gp39csp366005pxb; Wed, 18 Nov 2020 06:36:27 -0800 (PST) X-Google-Smtp-Source: ABdhPJxfe+LHi58rMXMc+wIhzMofDNEKOng4RM/1QxwEHr4W4cAVZUsQ1CCSwEdGqFYIM35GJgoD X-Received: by 2002:a17:906:314f:: with SMTP id e15mr24149478eje.496.1605710187313; Wed, 18 Nov 2020 06:36:27 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1605710187; cv=none; d=google.com; s=arc-20160816; b=KSw2Lj72pOPrL/CoJcZzZc6dketX6iXx/pW0eojJCWTnsuH0b4OGfQkE9fZyvHsj3F gEUvg29JDkFXlMoSpi2H0yAGXKofQJz/w6vBtADhKlw9UwVpKVhbFrj4+6p52czDJMbS 1N/9AUeuhnUOSEQJ8hCvgQxXFnAQOkojWQfi/OA3kC0gwO7myCHwPUD2VKkKHJkSb5ab SmhVUdkVtDWK9boumBC4I4FqtsFA1pHBCoUSd2fSvL5gSg+DnBWTlFEPd+7UDfv4ztv1 C3B9J9sMUIW9+s2lP1YVeZzF9NR3npYzoIIvmO6PQjhyoiCFJtPYF1xqEo2hrMFjtTvp CBrg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date; bh=gZib92gfU7LR0DHTjPkUJvSaWYvOW+rGpQHVAXELMJI=; b=VONIRuHriZlLqEk3LD9m8reF+4MThma5jfRP+L8nIBmJ9UoBIhNTAFXeWuDFoyg6bR sjPApjkU9mMekalDtuE8o1IObbypG03M45ZZI2j+hw0hzZEsprzZJTUIw146hRSbHKay WXWbYZTcViVNGqqSR4gII7qimBlkV4r3Rs2QL9GV2vg8v5ZUY0f75B1xyKTukCvCMpO2 6Ffa23xdinl4X0mrYIxi2zZsZ6amuEgmHFaGv0irldxiITK695x37VWGTc5XTj6QzM76 RC9ARnm+PW/q/DheLTKFpxuu+bb7mS+lN2de668Tm84w41FF/ErFhRmBNX7Nh6Duz7/t TTHQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id i7si2615164ejc.568.2020.11.18.06.36.03; Wed, 18 Nov 2020 06:36:27 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726580AbgKROeL (ORCPT + 99 others); Wed, 18 Nov 2020 09:34:11 -0500 Received: from mail.kernel.org ([198.145.29.99]:55756 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725834AbgKROeK (ORCPT ); Wed, 18 Nov 2020 09:34:10 -0500 Received: from gandalf.local.home (cpe-66-24-58-225.stny.res.rr.com [66.24.58.225]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 13B3220A8B; Wed, 18 Nov 2020 14:34:06 +0000 (UTC) Date: Wed, 18 Nov 2020 09:34:05 -0500 From: Steven Rostedt To: Florian Weimer Cc: Peter Zijlstra , Mathieu Desnoyers , linux-kernel , Matt Mullins , Ingo Molnar , Alexei Starovoitov , Daniel Borkmann , Dmitry Vyukov , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , netdev , bpf , Kees Cook , Josh Poimboeuf , linux-toolchains@vger.kernel.org Subject: [PATCH v3] tracepoint: Do not fail unregistering a probe due to memory allocation Message-ID: <20201118093405.7a6d2290@gandalf.local.home> In-Reply-To: <874klmwxxm.fsf@mid.deneb.enyo.de> References: <20201116175107.02db396d@gandalf.local.home> <47463878.48157.1605640510560.JavaMail.zimbra@efficios.com> <20201117142145.43194f1a@gandalf.local.home> <375636043.48251.1605642440621.JavaMail.zimbra@efficios.com> <20201117153451.3015c5c9@gandalf.local.home> <20201118132136.GJ3121378@hirez.programming.kicks-ass.net> <87h7pmwyta.fsf@mid.deneb.enyo.de> <20201118141226.GV3121392@hirez.programming.kicks-ass.net> <874klmwxxm.fsf@mid.deneb.enyo.de> X-Mailer: Claws Mail 3.17.3 (GTK+ 2.24.32; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: "Steven Rostedt (VMware)" The list of tracepoint callbacks is managed by an array that is protected by RCU. To update this array, a new array is allocated, the updates are copied over to the new array, and then the list of functions for the tracepoint is switched over to the new array. After a completion of an RCU grace period, the old array is freed. This process happens for both adding a callback as well as removing one. But on removing a callback, if the new array fails to be allocated, the callback is not removed, and may be used after it is freed by the clients of the tracepoint. There's really no reason to fail if the allocation for a new array fails when removing a function. Instead, the function can simply be replaced by a stub function that could be cleaned up on the next modification of the array. That is, instead of calling the function registered to the tracepoint, it would call a stub function in its place. Link: https://lore.kernel.org/r/20201115055256.65625-1-mmullins@mmlx.us Link: https://lore.kernel.org/r/20201116175107.02db396d@gandalf.local.home Link: https://lore.kernel.org/r/20201117211836.54acaef2@oasis.local.home Cc: Mathieu Desnoyers Cc: Ingo Molnar Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: Dmitry Vyukov Cc: Martin KaFai Lau Cc: Song Liu Cc: Yonghong Song Cc: Andrii Nakryiko Cc: John Fastabend Cc: KP Singh Cc: netdev Cc: bpf Cc: Kees Cook Cc: Florian Weimer Fixes: 97e1c18e8d17b ("tracing: Kernel Tracepoints") Reported-by: syzbot+83aa762ef23b6f0d1991@syzkaller.appspotmail.com Reported-by: syzbot+d29e58bb557324e55e5e@syzkaller.appspotmail.com Reported-by: Matt Mullins Signed-off-by: Steven Rostedt (VMware) --- Changes since v2: - Went back to using a stub function and not touching the fast path. - Removed adding __GFP_NOFAIL from the allocation of the removal. kernel/tracepoint.c | 80 ++++++++++++++++++++++++++++++++++++--------- 1 file changed, 64 insertions(+), 16 deletions(-) diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c index 3f659f855074..3e261482296c 100644 --- a/kernel/tracepoint.c +++ b/kernel/tracepoint.c @@ -53,6 +53,12 @@ struct tp_probes { struct tracepoint_func probes[]; }; +/* Called in removal of a func but failed to allocate a new tp_funcs */ +static void tp_stub_func(void) +{ + return; +} + static inline void *allocate_probes(int count) { struct tp_probes *p = kmalloc(struct_size(p, probes, count), @@ -131,6 +137,7 @@ func_add(struct tracepoint_func **funcs, struct tracepoint_func *tp_func, { struct tracepoint_func *old, *new; int nr_probes = 0; + int stub_funcs = 0; int pos = -1; if (WARN_ON(!tp_func->func)) @@ -147,14 +154,34 @@ func_add(struct tracepoint_func **funcs, struct tracepoint_func *tp_func, if (old[nr_probes].func == tp_func->func && old[nr_probes].data == tp_func->data) return ERR_PTR(-EEXIST); + if (old[nr_probes].func == tp_stub_func) + stub_funcs++; } } - /* + 2 : one for new probe, one for NULL func */ - new = allocate_probes(nr_probes + 2); + /* + 2 : one for new probe, one for NULL func - stub functions */ + new = allocate_probes(nr_probes + 2 - stub_funcs); if (new == NULL) return ERR_PTR(-ENOMEM); if (old) { - if (pos < 0) { + if (stub_funcs) { + /* Need to copy one at a time to remove stubs */ + int probes = 0; + + pos = -1; + for (nr_probes = 0; old[nr_probes].func; nr_probes++) { + if (old[nr_probes].func == tp_stub_func) + continue; + if (pos < 0 && old[nr_probes].prio < prio) + pos = probes++; + new[probes++] = old[nr_probes]; + } + nr_probes = probes; + if (pos < 0) + pos = probes; + else + nr_probes--; /* Account for insertion */ + + } else if (pos < 0) { pos = nr_probes; memcpy(new, old, nr_probes * sizeof(struct tracepoint_func)); } else { @@ -188,8 +215,9 @@ static void *func_remove(struct tracepoint_func **funcs, /* (N -> M), (N > 1, M >= 0) probes */ if (tp_func->func) { for (nr_probes = 0; old[nr_probes].func; nr_probes++) { - if (old[nr_probes].func == tp_func->func && - old[nr_probes].data == tp_func->data) + if ((old[nr_probes].func == tp_func->func && + old[nr_probes].data == tp_func->data) || + old[nr_probes].func == tp_stub_func) nr_del++; } } @@ -208,14 +236,32 @@ static void *func_remove(struct tracepoint_func **funcs, /* N -> M, (N > 1, M > 0) */ /* + 1 for NULL */ new = allocate_probes(nr_probes - nr_del + 1); - if (new == NULL) - return ERR_PTR(-ENOMEM); - for (i = 0; old[i].func; i++) - if (old[i].func != tp_func->func - || old[i].data != tp_func->data) - new[j++] = old[i]; - new[nr_probes - nr_del].func = NULL; - *funcs = new; + if (new) { + for (i = 0; old[i].func; i++) + if ((old[i].func != tp_func->func + || old[i].data != tp_func->data) + && old[i].func != tp_stub_func) + new[j++] = old[i]; + new[nr_probes - nr_del].func = NULL; + *funcs = new; + } else { + /* + * Failed to allocate, replace the old function + * with calls to tp_stub_func. + */ + for (i = 0; old[i].func; i++) + if (old[i].func == tp_func->func && + old[i].data == tp_func->data) { + old[i].func = tp_stub_func; + /* Set the prio to the next event. */ + if (old[i + 1].func) + old[i].prio = + old[i + 1].prio; + else + old[i].prio = -1; + } + *funcs = old; + } } debug_print_probes(*funcs); return old; @@ -295,10 +341,12 @@ static int tracepoint_remove_func(struct tracepoint *tp, tp_funcs = rcu_dereference_protected(tp->funcs, lockdep_is_held(&tracepoints_mutex)); old = func_remove(&tp_funcs, func); - if (IS_ERR(old)) { - WARN_ON_ONCE(PTR_ERR(old) != -ENOMEM); + if (WARN_ON_ONCE(IS_ERR(old))) return PTR_ERR(old); - } + + if (tp_funcs == old) + /* Failed allocating new tp_funcs, replaced func with stub */ + return 0; if (!tp_funcs) { /* Removed last function */ -- 2.25.4