Received: by 2002:a05:7412:8d10:b0:f3:1519:9f41 with SMTP id bj16csp1990065rdb; Thu, 7 Dec 2023 15:02:21 -0800 (PST) X-Google-Smtp-Source: AGHT+IHBM3272sZtb4A4RxadZwcilSBz14vICRI+TKgkVU5KrHHH55cwrgJWCgoXLGJqRj7k+iem X-Received: by 2002:a17:902:ed15:b0:1d0:d18c:bc5c with SMTP id b21-20020a170902ed1500b001d0d18cbc5cmr3213356pld.127.1701990141676; Thu, 07 Dec 2023 15:02:21 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1701990141; cv=none; d=google.com; s=arc-20160816; b=bZ42hTAldj8xOb6AVtB9s9hXipDsoyfQkvbRvUDHpv2HaqIuTshHUQghNoDPnHXprj dQnxCb6Cg0FxF2NpU7FjNGyPrEVLx7qdQZ0I8e43n2BE55o7uBbGmxn4v31G5Rob1u/H MeR+D+sHlSQyTwP02j+Q0NCH2EH7hI1biJbIc6PWCPQdfl5F6Ke2V3FLwrp5QXxrmOet IKOCmUxzVg9WyCg6OhKGMsNZcQhsaKS97ZzN7YTntUlokT5nkCVRiOOIQcP70vQB0+RZ cVNvKR+XePhmFaJq92F2S9F/TBTuMnO5GVNLHROAylClAYyGXwwNgBitKRT4YeL3CjtC evjg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=12HCXgU+vznucLu0wMRthOiLEPRh9ZKY7qobr4QejA4=; fh=8r9Xts/cQV/m9jmhuu1jJNLe13izZejA1xuULgG0UJM=; b=A56DMMNXINUnkuotWOfJiGpifiC07/oW+ZJ9oYnDmllrUa3WYOBZC+Pf2EJpYiialQ z087eSQ8JfMAthxmd4haRwsTZSbhuTgL6M0bJo9cMOTvH99ezvgvT48rIrGYTWMP2R95 ki+tdJmcFGQRcpmEYXRzoxju0dUaPaEvS4jjFe6/IIDrQGtGk2oaXmoBNzdyeJthaKY0 v+MOMk8nd3d5JKFYNuIErULJ7RIssGWpjuDqdvmZhvFO2gbZY3W/Y3ixsHSGNmRY0+NG yXSod0RZeo97yOonWaH1V29WRCfK4ljvUyJMwPXBWeUMt838NWgFvU6aGAAwmh9EEZTa LK8Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kylehuey.com header.s=google header.b=bH29X2Ic; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from snail.vger.email (snail.vger.email. [23.128.96.37]) by mx.google.com with ESMTPS id f11-20020a17090274cb00b001d062135ef8si457131plt.601.2023.12.07.15.02.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 07 Dec 2023 15:02:21 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) client-ip=23.128.96.37; Authentication-Results: mx.google.com; dkim=pass header.i=@kylehuey.com header.s=google header.b=bH29X2Ic; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 820BC802582F; Thu, 7 Dec 2023 15:02:20 -0800 (PST) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.11 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232428AbjLGXCK (ORCPT + 99 others); Thu, 7 Dec 2023 18:02:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50208 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229671AbjLGXCI (ORCPT ); Thu, 7 Dec 2023 18:02:08 -0500 Received: from mail-ed1-x535.google.com (mail-ed1-x535.google.com [IPv6:2a00:1450:4864:20::535]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4F0A410EF for ; Thu, 7 Dec 2023 15:02:14 -0800 (PST) Received: by mail-ed1-x535.google.com with SMTP id 4fb4d7f45d1cf-54dca2a3f16so2283468a12.0 for ; Thu, 07 Dec 2023 15:02:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kylehuey.com; s=google; t=1701990133; x=1702594933; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=12HCXgU+vznucLu0wMRthOiLEPRh9ZKY7qobr4QejA4=; b=bH29X2IcBwNWQ4r5bfbXPPaReUQLSU+k7Kvq+MmnDI/U2/7OFJ9c1LoOQVSg66vYtc 4/++iGKo0SBdd7i6i3D32UQZfnKKQTanOaieNLtfEwIIfQJpaUNZtv5LPTDGmLYIe3D1 am3TvBAC0Z1c4AQRPfaMS5iBH97RE4wAmsvqe5wFvOYd6NxTIlRFQAbGhYSr+7rh6xC5 4A0FiNCrtvcGRh5SPWwnWb0fw4ARjHNwpAF00UT6kn/YJWhIdmSJHX2bpq0xXxrPymqR OLhOSUEG3yg6MXrOye4NAYXBz5aFmsdr/Nm81RAqinbbzJrWwH6yBdD7y7CkMzulXKza kGCg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701990133; x=1702594933; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=12HCXgU+vznucLu0wMRthOiLEPRh9ZKY7qobr4QejA4=; b=t4UTgYPHpEX+VLG4HWldxg1HZdq3ZNIJ6wF+pfblDA9yqVIPnGrSVpOZN0z4aiZUG2 V4rTvQVs816LtELggnERcQi0MhA3tcHY7nE2jriI3D7FLKV8ND9BP3z92eUamXeTnlAs 3nDnB/TujyErqcywsIX/aG1a8o+o863qY9KUa0Up5Z6N4b7l+1FmNIW8Ys6LchnO2uxe 98KrezNdi8MQrp9cvLtvSN514ggnH8Y04gMuDsWv4NrM+qRBkBaXLpQZNohSIAYWsy+V 8HfX4OSbbPLbH761tMD9MywoRQFDmuzNAJFf2qEe7V5mQHSzxmOO4yKSsqcsAOUU+UkD YUCA== X-Gm-Message-State: AOJu0Yyf28gPxescv5jeqGyh5bPNBE+8aVfApWRrw11tyK+cKK4EBZ1L DwEdxZM9xzGw0F1LDXkPEfDkE6Wi2oIaluSHgnC83A== X-Received: by 2002:a17:906:3b99:b0:a1d:4874:1ab7 with SMTP id u25-20020a1709063b9900b00a1d48741ab7mr36186ejf.65.1701990132593; Thu, 07 Dec 2023 15:02:12 -0800 (PST) MIME-Version: 1.0 References: <20231207163458.5554-1-khuey@kylehuey.com> <20231207163458.5554-2-khuey@kylehuey.com> In-Reply-To: From: Kyle Huey Date: Thu, 7 Dec 2023 15:02:01 -0800 Message-ID: Subject: Re: [PATCH v2 1/3] perf: Reorder overflow handler ahead of event_limit/sigtrap To: Namhyung Kim Cc: Marco Elver , Kyle Huey , open list , Andrii Nakryiko , Jiri Olsa , Yonghong Song , "Robert O'Callahan" , Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Mark Rutland , Alexander Shishkin , Ian Rogers , Adrian Hunter , linux-perf-users@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Thu, 07 Dec 2023 15:02:20 -0800 (PST) On Thu, Dec 7, 2023 at 2:38=E2=80=AFPM Namhyung Kim w= rote: > > Hello, > > On Thu, Dec 07, 2023 at 06:53:58PM +0100, Marco Elver wrote: > > On Thu, 7 Dec 2023 at 18:47, Kyle Huey wrote: > > > > > > > > > > > > On Thu, Dec 7, 2023, 9:05=E2=80=AFAM Marco Elver w= rote: > > >> > > >> On Thu, 7 Dec 2023 at 17:35, Kyle Huey wrote: > > >> > > > >> > The perf subsystem already allows an overflow handler to clear pen= ding_kill > > >> > to suppress a fasync signal (although nobody currently does this).= Allow an > > >> > overflow handler to suppress the other visible side effects of an = overflow, > > >> > namely event_limit accounting and SIGTRAP generation. > > Well, I think it can still hit the throttling logic and generate > a PERF_RECORD_THROTTLE. But it should be rare.. > > > >> > > > >> > Signed-off-by: Kyle Huey > > >> > --- > > >> > kernel/events/core.c | 10 +++++++--- > > >> > 1 file changed, 7 insertions(+), 3 deletions(-) > > >> > > > >> > diff --git a/kernel/events/core.c b/kernel/events/core.c > > >> > index b704d83a28b2..19fddfc27a4a 100644 > > >> > --- a/kernel/events/core.c > > >> > +++ b/kernel/events/core.c > > >> > @@ -9541,6 +9541,12 @@ static int __perf_event_overflow(struct per= f_event *event, > > >> > */ > > >> > > > >> > event->pending_kill =3D POLL_IN; > > >> > + > > >> > + READ_ONCE(event->overflow_handler)(event, data, regs); > > >> > + > > >> > + if (!event->pending_kill) > > >> > + return ret; > > >> > > >> It's not at all intuitive that resetting pending_kill to 0 will > > >> suppress everything else, too. There is no relationship between the > > >> fasync signals and SIGTRAP. pending_kill is for the former and > > >> pending_sigtrap is for the latter. One should not affect the other. > > >> > > >> A nicer solution would be to properly undo the various pending_* > > >> states (in the case of pending_sigtrap being set it should be enough > > >> to reset pending_sigtrap to 0, and also decrement > > >> event->ctx->nr_pending). > > > > > > > > > I don't believe it's possible to correctly undo the event_limit decre= ment after the fact (if it's e.g. racing with the ioctl that adds to the ev= ent limit). > > > > > >> Although I can see why this solution is simpler. Perhaps with enough > > >> comments it might be clearer. > > >> > > >> Preferences? > > > > > > > > > The cleanest way would probably be to add a return value to the overf= low handler function that controls this. It requires changing a bunch of ar= ch specific code on arches I don't have access to though. > > > > Hmm. > > > > Maybe wait for perf maintainers to say what is preferrable. (I could > > live with just making sure this has no other weird side effects and > > more comments.) > > What if we can call bpf handler directly and check the return value? > Then I think we can also get rid of the original overflow handler. > > Something like this (untested..) mmm, that's an interesting idea. I'll experiment with it. - Kyle > Thanks, > Namhyung > > > ---8<--- > > diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h > index e85cd1c0eaf3..1eba6f5bb70b 100644 > --- a/include/linux/perf_event.h > +++ b/include/linux/perf_event.h > @@ -809,7 +809,6 @@ struct perf_event { > perf_overflow_handler_t overflow_handler; > void *overflow_handler_context; > #ifdef CONFIG_BPF_SYSCALL > - perf_overflow_handler_t orig_overflow_handler; > struct bpf_prog *prog; > u64 bpf_cookie; > #endif > diff --git a/kernel/events/core.c b/kernel/events/core.c > index 4c72a41f11af..e1a00646dbbe 100644 > --- a/kernel/events/core.c > +++ b/kernel/events/core.c > @@ -9471,6 +9471,12 @@ static inline bool sample_is_allowed(struct perf_e= vent *event, struct pt_regs *r > * Generic event overflow handling, sampling. > */ > > +#ifdef CONFIG_BPF_SYSCALL > +static int bpf_overflow_handler(struct perf_event *event, > + struct perf_sample_data *data, > + struct pt_regs *regs); > +#endif > + > static int __perf_event_overflow(struct perf_event *event, > int throttle, struct perf_sample_data *d= ata, > struct pt_regs *regs) > @@ -9487,6 +9493,11 @@ static int __perf_event_overflow(struct perf_event= *event, > > ret =3D __perf_event_account_interrupt(event, throttle); > > +#ifdef CONFIG_BPF_SYSCALL > + if (event->prog && bpf_overflow_handler(event, data, regs) =3D=3D= 0) > + return ret; > +#endif > + > /* > * XXX event_limit might not quite work as expected on inherited > * events > @@ -10346,7 +10357,7 @@ static void perf_event_free_filter(struct perf_ev= ent *event) > } > > #ifdef CONFIG_BPF_SYSCALL > -static void bpf_overflow_handler(struct perf_event *event, > +static int bpf_overflow_handler(struct perf_event *event, > struct perf_sample_data *data, > struct pt_regs *regs) > { > @@ -10355,7 +10366,7 @@ static void bpf_overflow_handler(struct perf_even= t *event, > .event =3D event, > }; > struct bpf_prog *prog; > - int ret =3D 0; > + int ret =3D 1; > > ctx.regs =3D perf_arch_bpf_user_pt_regs(regs); > if (unlikely(__this_cpu_inc_return(bpf_prog_active) !=3D 1)) > @@ -10369,10 +10380,7 @@ static void bpf_overflow_handler(struct perf_eve= nt *event, > rcu_read_unlock(); > out: > __this_cpu_dec(bpf_prog_active); > - if (!ret) > - return; > - > - event->orig_overflow_handler(event, data, regs); > + return ret; > } > > static int perf_event_set_bpf_handler(struct perf_event *event, > @@ -10408,8 +10416,6 @@ static int perf_event_set_bpf_handler(struct perf= _event *event, > > event->prog =3D prog; > event->bpf_cookie =3D bpf_cookie; > - event->orig_overflow_handler =3D READ_ONCE(event->overflow_handle= r); > - WRITE_ONCE(event->overflow_handler, bpf_overflow_handler); > return 0; > } > > @@ -10420,7 +10426,6 @@ static void perf_event_free_bpf_handler(struct pe= rf_event *event) > if (!prog) > return; > > - WRITE_ONCE(event->overflow_handler, event->orig_overflow_handler)= ; > event->prog =3D NULL; > bpf_prog_put(prog); > } > @@ -11880,13 +11885,11 @@ perf_event_alloc(struct perf_event_attr *attr, = int cpu, > overflow_handler =3D parent_event->overflow_handler; > context =3D parent_event->overflow_handler_context; > #if defined(CONFIG_BPF_SYSCALL) && defined(CONFIG_EVENT_TRACING) > - if (overflow_handler =3D=3D bpf_overflow_handler) { > + if (parent_event->prog) { > struct bpf_prog *prog =3D parent_event->prog; > > bpf_prog_inc(prog); > event->prog =3D prog; > - event->orig_overflow_handler =3D > - parent_event->orig_overflow_handler; > } > #endif > }