Received: by 2002:a05:6a10:f3d0:0:0:0:0 with SMTP id a16csp2086374pxv; Sat, 26 Jun 2021 08:45:09 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxSRskJi0tOlZLyac44pWaYtylrxfK8HGTK4lHbVt0e+zYfZK18UbCd8Srf66e1TZXQ38jU X-Received: by 2002:a92:9502:: with SMTP id y2mr11313036ilh.1.1624722309705; Sat, 26 Jun 2021 08:45:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1624722309; cv=none; d=google.com; s=arc-20160816; b=T9oYGhqt61UBJ7GTYS2ATUPcB0jMeOU29CQ1regGkvCrqIxx3wOx4BEeaztFgouxAl /9c9TvCOLtbykwK9CJF0yiM61la1T6yb6dXyItisWOO5idARTCpjSTKMcgWLD4A9dkm1 Sa1mrTgAYzvZgQVyrDbNQvkPutZ6HSlwzRqgJ5MvkNtZB3TUog09eyl0OoXPFzEfbeBC gNa4enxIuv9Bj5j1O1ATh5ZQjUjc0zobOW1XRFxafsWYP5b5vRF0MtTaTwWltK1Ju4z3 9z+6cNHANThAVeruMyVJYB9Lej1bIeyXX9JfTti9yB2jCegMlKm/CZNzU2poWfJ3VfUj ntQA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date; bh=aoVF6iid+tomRXK/DFpu/ww+88d4t6oOnom/CvwopXU=; b=qL3YX5TFK0MXNdaLGfH8j+fms4IUnCC1bUz+w1HiLTNETgz58PEMbS77WYS37YKZI2 jiy0aZSBWboe1rV6Z9cwI77P/0GH1whbH14c7nImyval4e0HyC9oi/24JWjl247BtaMa OgBSYgFKLZNxdM63xpkLWcTqoWFDzNHg86OyOvBBtAL0mk7iTdUQOxpVQO4y89m5vbLG qPk2r1meMbg469nViyPz0+uXN9yHO1PbPHkyXFTRqoi7At660r6ogWg+0xP6LXghPYfl u6/oYctKdOqqzFo7cLBGVfdiEgtvOQQao2TZ4aMoyby+336GZkIYO3juYuxt+HfzEjEy wLEQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id y5si9067029ioq.102.2021.06.26.08.44.57; Sat, 26 Jun 2021 08:45:09 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230112AbhFZPpT (ORCPT + 99 others); Sat, 26 Jun 2021 11:45:19 -0400 Received: from mail.kernel.org ([198.145.29.99]:51650 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229796AbhFZPpS (ORCPT ); Sat, 26 Jun 2021 11:45:18 -0400 Received: from rorschach.local.home (cpe-66-24-58-225.stny.res.rr.com [66.24.58.225]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 2EE6C61941; Sat, 26 Jun 2021 15:42:55 +0000 (UTC) Date: Sat, 26 Jun 2021 11:41:57 -0400 From: Steven Rostedt To: Tetsuo Handa Cc: Peter Zijlstra , Mathieu Desnoyers , Ingo Molnar , Robert Richter , Gabriel Krisman Bertazi , "Gustavo A. R. Silva" , linux-kernel@vger.kernel.org, Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , netdev , bpf@vger.kernel.org Subject: Re: [PATCH] tracepoint: Do not warn on EEXIST or ENOENT Message-ID: <20210626114157.765d9371@rorschach.local.home> In-Reply-To: <7297f336-70e5-82d3-f8d3-27f08c7d1548@i-love.sakura.ne.jp> References: <20210626135845.4080-1-penguin-kernel@I-love.SAKURA.ne.jp> <20210626101834.55b4ecf1@rorschach.local.home> <7297f336-70e5-82d3-f8d3-27f08c7d1548@i-love.sakura.ne.jp> X-Mailer: Claws Mail 3.17.8 (GTK+ 2.24.33; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sun, 27 Jun 2021 00:13:17 +0900 Tetsuo Handa wrote: > On 2021/06/26 23:18, Steven Rostedt wrote: > > On Sat, 26 Jun 2021 22:58:45 +0900 > > Tetsuo Handa wrote: > > > >> syzbot is hitting WARN_ON_ONCE() at tracepoint_add_func() [1], but > >> func_add() returning -EEXIST and func_remove() returning -ENOENT are > >> not kernel bugs that can justify crashing the system. > > > > There should be no path that registers a tracepoint twice. That's a bug > > in the kernel. Looking at the link below, I see the backtrace: > > > > Call Trace: > > tracepoint_probe_register_prio kernel/tracepoint.c:369 [inline] > > tracepoint_probe_register+0x9c/0xe0 kernel/tracepoint.c:389 > > __bpf_probe_register kernel/trace/bpf_trace.c:2154 [inline] > > bpf_probe_register+0x15a/0x1c0 kernel/trace/bpf_trace.c:2159 > > bpf_raw_tracepoint_open+0x34a/0x720 kernel/bpf/syscall.c:2878 > > __do_sys_bpf+0x2586/0x4f40 kernel/bpf/syscall.c:4435 > > do_syscall_64+0x3a/0xb0 arch/x86/entry/common.c:47 > > > > So BPF is allowing the user to register the same tracepoint more than > > once? That looks to be a bug in the BPF code where it shouldn't be > > allowing user space to register the same tracepoint multiple times. > > I didn't catch your question. > > (1) func_add() can reject an attempt to add same tracepoint multiple times > by returning -EINVAL to the caller. > (2) But tracepoint_add_func() (the caller of func_add()) is calling WARN_ON_ONCE() > if func_add() returned -EINVAL. That's because (before BPF) there's no place in the kernel that tries to register the same tracepoint multiple times, and was considered a bug if it happened, because there's no ref counters to deal with adding them multiple times. If the tracepoint is already registered (with the given function and data), then something likely went wrong. > (3) And tracepoint_add_func() is triggerable via request from userspace. Only via BPF correct? I'm not sure how it works, but can't BPF catch that it is registering the same tracepoint again? We could add this patch, but then we need to add the WARN_ON_ONCE() to all the other callers, because for the other callers it's a bug if the tracepoint was registered twice with the same callback and data. Or we can add another interface that wont warn, and BPF can use that. > (4) tracepoint_probe_register_prio() serializes tracepoint_add_func() call > triggered by concurrent request from userspace using tracepoints_mutex mutex. You keep saying user space. Is it a BPF program? > (5) But tracepoint_add_func() does not check whether same tracepoint multiple > is already registered before calling func_add(). Because it's considered a bug in the kernel if that is the case. > (6) As a result, tracepoint_add_func() receives -EINVAL from func_add(), and > calls WARN_ON_ONCE() and the system crashes due to panic_on_warn == 1. > > Why this is a bug in the BPF code? The BPF code is not allowing userspace to > register the same tracepoint multiple times. I think that tracepoint_add_func() Then how is the same tracepoint being registered multiple times? You keep saying "user space" but the only way user space is doing this is through BPF. Or am I missing something? > is stupid enough to crash the kernel instead of rejecting when an attempt to > register the same tracepoint multiple times is made. Because its a bug in the kernel, and WARN_ON_ONCE() is what is used when you detect something that is considered a bug in the kernel. If you don't want warnings to crash the kernel, you don't add "panic_on_warning". If BPF is expected to register the same tracepoint with the same callback and data more than once, then let's add a call to do that without warning. Like I said, other callers expect the call to succeed unless it's out of memory, which tends to cause other problems. FYI, this warning has caught bugs in my own code, that triggered my tests to fail, and had me go fix that bug before pushing it further. And my tests fail only on a full warning. -- Steve