From: Alexei Starovoitov
Date: Mon, 7 Nov 2022 12:49:01 -0800
Subject: Re: WARNING in bpf_bprintf_prepare
To: Jiri Olsa
Cc: Hao Sun, Alexei Starovoitov, Linux Kernel Mailing List, Andrii Nakryiko,
 bpf, Daniel Borkmann, Hao Luo, John Fastabend, KP Singh, Martin KaFai Lau,
 Stanislav Fomichev, Song Liu, Yonghong Song
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Nov 7, 2022 at 4:31 AM Jiri Olsa wrote:
>
> On Wed, Nov 02, 2022 at 03:28:47PM +0100, Jiri Olsa wrote:
> > On Thu, Oct 27, 2022 at 07:45:16PM +0800, Hao Sun wrote:
> > > On Thu, Oct 27, 2022 at 19:24, Jiri Olsa wrote:
> > > >
> > > > On Thu, Oct 27, 2022 at 10:27:28AM +0800, Hao Sun wrote:
> > > > > Hi,
> > > > >
> > > > > The following warning can be triggered with the C reproducer in the link.
> > > > > Syzbot also reported this several days ago, Jiri posted a patch that
> > > > > uses bpf prog `active` field to fix this by 05b24ff9b2cfab (bpf:
> > > > > Prevent bpf program recursion...) according to syzbot dashboard
> > > > > (https://syzkaller.appspot.com/bug?id=179313fb375161d50a98311a28b8e2fc5f7350f9).
> > > > > But this warning can still be triggered on 247f34f7b803
> > > > > (Linux-v6.1-rc2) that already merged the patch, so it seems that this
> > > > > still is an issue.
> > > > >
> > > > > HEAD commit: 247f34f7b803 Linux 6.1-rc2
> > > > > git tree: upstream
> > > > > console output: https://pastebin.com/raw/kNw8JCu5
> > > > > kernel config: https://pastebin.com/raw/sE5QK5HL
> > > > > C reproducer: https://pastebin.com/raw/X96ASi27
> > > >
> > > > hi,
> > > > right, that fix addressed that issue for single bpf program,
> > > > and it won't prevent if there are multiple programs hook on
> > > > contention_begin tracepoint and calling bpf_trace_printk,
> > > >
> > > > I'm not sure we can do something there.. will check
> > > >
> > > > do you run just the reproducer, or you load the server somehow?
> > > > I cannot hit the issue so far
> > > >
> > >
> > > Hi,
> > >
> > > Last email has format issues, resend it here.
> > >
> > > I built the kernel with the config in the link, which contains
> > > "CONFIG_CMDLINE="earlyprintk=serial net.ifnames=0
> > > sysctl.kernel.hung_task_all_cpu_backtrace=1 panic_on_warn=1 ...", and
> > > boot the kernel with normal qemu setup and then the warning can be
> > > triggered by executing the reproducer.
> > >
> > > Also, I'm willing to test the proposed patch if any.
> >
> > fyi I reproduced that..
> > will check if we can do anything about that
>
> I reproduced this with set of 8 programs all hooked to contention_begin
> tracepoint and here's what I think is happening:
>
> all programs (prog1 .. prog8) call just bpf_trace_printk helper and I'm
> running 'perf bench sched messaging' to load the machine
>
> at some point some contended lock triggers trace_contention_begin:
>
>   trace_contention_begin
>     __traceiter_contention_begin                   <-- iterates all functions attached to tracepoint
>       __bpf_trace_run(prog1)
>         prog1->active = 1
>         bpf_prog_run(prog1)
>           bpf_trace_printk
>             bpf_bprintf_prepare                    <-- takes buffer 1 out of 3
>             raw_spin_lock_irqsave(trace_printk_lock)
>
>               # we have global single trace_printk_lock, so we will trigger
>               # its trace_contention_begin at some point
>
>               trace_contention_begin
>                 __traceiter_contention_begin
>                   __bpf_trace_run(prog1)
>                     prog1->active block            <-- prog1 is already 'running', skipping the execution
>                   __bpf_trace_run(prog2)
>                     prog2->active = 1
>                     bpf_prog_run(prog2)
>                       bpf_trace_printk
>                         bpf_bprintf_prepare        <-- takes buffer 2 out of 3
>                         raw_spin_lock_irqsave(trace_printk_lock)
>                           trace_contention_begin
>                             __traceiter_contention_begin
>                               __bpf_trace_run(prog1)
>                                 prog1->active block    <-- prog1 is already 'running', skipping the execution
>                               __bpf_trace_run(prog2)
>                                 prog2->active block    <-- prog2 is already 'running', skipping the execution
>                               __bpf_trace_run(prog3)
>                                 prog3->active = 1
>                                 bpf_prog_run(prog3)
>                                   bpf_trace_printk
>                                     bpf_bprintf_prepare    <-- takes buffer 3 out of 3
>                                     raw_spin_lock_irqsave(trace_printk_lock)
>                                       trace_contention_begin
>                                         __traceiter_contention_begin
>                                           __bpf_trace_run(prog1)
>                                             prog1->active block    <-- prog1 is already 'running', skipping the execution
>                                           __bpf_trace_run(prog2)
>                                             prog2->active block    <-- prog2 is already 'running', skipping the execution
>                                           __bpf_trace_run(prog3)
>                                             prog3->active block    <-- prog3 is already 'running', skipping the execution
>                                           __bpf_trace_run(prog4)
>                                             prog4->active = 1
>                                             bpf_prog_run(prog4)
>                                               bpf_trace_printk
>                                                 bpf_bprintf_prepare    <-- tries to take buffer 4 out of 3 -> WARNING
>
>
> the code path may vary based on the contention of the trace_printk_lock,
> so I saw different nesting within 8 programs, but all eventually ended up
> at 4 levels of nesting and hit the warning
>
> I think we could perhaps move the 'active' flag protection from program
> to the tracepoint level (in the patch below), to prevent nesting execution
> of the same tracepoint, so it'd look like:
>
>   trace_contention_begin
>     __traceiter_contention_begin
>       __bpf_trace_run(prog1) {
>         contention_begin.active = 1
>         bpf_prog_run(prog1)
>           bpf_trace_printk
>             bpf_bprintf_prepare
>             raw_spin_lock_irqsave(trace_printk_lock)
>               trace_contention_begin
>                 __traceiter_contention_begin
>                   __bpf_trace_run(prog1)
>                     blocked because contention_begin.active == 1
>                   __bpf_trace_run(prog2)
>                     blocked because contention_begin.active == 1
>                   __bpf_trace_run(prog3)
>                   ...
>                   __bpf_trace_run(prog8)
>                     blocked because contention_begin.active == 1
>
>             raw_spin_unlock_irqrestore
>             bpf_bprintf_cleanup
>
>         contention_begin.active = 0
>       }
>
>       __bpf_trace_run(prog2) {
>         contention_begin.active = 1
>         bpf_prog_run(prog2)
>           ...
>         contention_begin.active = 0
>       }
>
> do we need bpf program execution in nested tracepoints?
> we could actually allow 3 nesting levels for this case.. thoughts?
>
> thanks,
> jirka
>
>
> ---
> diff --git a/include/trace/bpf_probe.h b/include/trace/bpf_probe.h
> index 6a13220d2d27..5a354ae096e5 100644
> --- a/include/trace/bpf_probe.h
> +++ b/include/trace/bpf_probe.h
> @@ -78,11 +78,15 @@
>  #define CAST_TO_U64(...) CONCATENATE(__CAST, COUNT_ARGS(__VA_ARGS__))(__VA_ARGS__)
>
>  #define __BPF_DECLARE_TRACE(call, proto, args)                          \
> +static DEFINE_PER_CPU(int, __bpf_trace_tp_active_##call);               \
>  static notrace void                                                     \
>  __bpf_trace_##call(void *__data, proto)                                 \
>  {                                                                       \
>          struct bpf_prog *prog = __data;                                 \
> -        CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(prog, CAST_TO_U64(args));  \
> +                                                                        \
> +        if (likely(this_cpu_inc_return(__bpf_trace_tp_active_##call) == 1))     \
> +                CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(prog, CAST_TO_U64(args));  \
> +        this_cpu_dec(__bpf_trace_tp_active_##call);                     \
>  }

This approach will hurt real use cases where multiple and different
raw_tp progs run on the same cpu.

Instead let's disallow attaching to trace_contention and potentially
any other hook with similar recursion properties.

Another option is to add a recursion check to trace_contention itself.
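
For reference, the nesting in the quoted trace can be reproduced with a small
user-space model. This is only a sketch under stated assumptions, not kernel
code: it assumes the worst case where every lock acquisition is contended, and
all names in it (simulate, prog_body, tracepoint_fire, MAX_NEST, NPROGS, the
guard enum) are hypothetical stand-ins for the per-CPU buffer counter,
prog->active, and the per-tracepoint counter from the proposed patch.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NEST 3   /* printf buffers available, per the trace ("out of 3") */
#define NPROGS   8   /* programs attached to the same tracepoint */

/* Which recursion guard to model (hypothetical names). */
enum guard { GUARD_PER_PROG, GUARD_PER_TRACEPOINT };

static int  nest_level;           /* buffers currently taken */
static int  max_seen;             /* deepest successful nesting */
static bool overflow;             /* set where the kernel would WARN */
static bool prog_active[NPROGS];  /* stands in for prog->active */
static int  tp_active;            /* stands in for the per-tracepoint counter */

static void tracepoint_fire(enum guard g);

/* Stands in for bpf_trace_printk: take a buffer, then hit the lock. */
static void prog_body(enum guard g)
{
    if (++nest_level > MAX_NEST) {
        overflow = true;          /* buffer 4 out of 3 */
        nest_level--;
        return;
    }
    if (nest_level > max_seen)
        max_seen = nest_level;
    tracepoint_fire(g);           /* assumption: the lock is always contended */
    nest_level--;
}

/* Stands in for the tracepoint iterating over all attached programs. */
static void tracepoint_fire(enum guard g)
{
    for (int i = 0; i < NPROGS; i++) {
        if (g == GUARD_PER_PROG) {
            if (prog_active[i])
                continue;         /* this program is already running */
            prog_active[i] = true;
            prog_body(g);
            prog_active[i] = false;
        } else {
            if (++tp_active == 1) /* only the outermost entry runs */
                prog_body(g);
            tp_active--;
        }
    }
}

/* Returns true if the buffer limit was exceeded; reports the max depth. */
bool simulate(enum guard g, int *max_depth)
{
    nest_level = 0;
    max_seen = 0;
    overflow = false;
    tp_active = 0;
    for (int i = 0; i < NPROGS; i++)
        prog_active[i] = false;

    tracepoint_fire(g);
    assert(nest_level == 0);      /* every buffer taken was released */
    *max_depth = max_seen;
    return overflow;
}
```

In this model simulate(GUARD_PER_PROG, &depth) exceeds the buffer limit once a
fourth distinct program nests, while simulate(GUARD_PER_TRACEPOINT, &depth)
never goes past one level. The model does not capture the cost raised in the
reply above: a per-tracepoint guard also suppresses unrelated raw_tp programs
that legitimately run on the same cpu.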