From: Florent Revest
Date: Thu, 6 Oct 2022 18:19:12 +0200
Subject: Re: [PATCH bpf-next v2 0/4] Add ftrace direct call for arm64
To: Steven Rostedt
Cc: Xu Kuohai, Mark Rutland, Catalin Marinas, Daniel Borkmann, Xu Kuohai,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 bpf@vger.kernel.org, Will Deacon, Jean-Philippe Brucker, Ingo Molnar,
 Oleg Nesterov, Alexei Starovoitov, Andrii Nakryiko, Martin KaFai Lau,
 Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
 Hao Luo, Jiri Olsa, Zi Shen Lim, Pasha Tatashin, Ard Biesheuvel,
 Marc Zyngier, Guo Ren, Masami Hiramatsu
In-Reply-To: <20221005113019.18aeda76@gandalf.local.home>
References: <20220913162732.163631-1-xukuohai@huaweicloud.com>
 <970a25e4-9b79-9e0c-b338-ed1a934f2770@huawei.com>
 <2cb606b4-aa8b-e259-cdfd-1bfc61fd7c44@huawei.com>
 <7f34d333-3b2a-aea5-f411-d53be2c46eee@huawei.com>
 <20221005110707.55bd9354@gandalf.local.home>
 <20221005113019.18aeda76@gandalf.local.home>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Oct 5, 2022 at 5:30 PM Steven Rostedt wrote:
>
> On Wed, 5 Oct 2022 17:10:33 +0200
> Florent Revest wrote:
>
> > On Wed, Oct 5, 2022 at 5:07 PM Steven Rostedt wrote:
> > >
> > > Can you show the implementation of the indirect call you used?
> >
> > Xu used my development branch here
> > https://github.com/FlorentRevest/linux/commits/fprobe-min-args
>
> That looks like it could be optimized quite a bit too.
>
> Specifically this part:
>
> static bool bpf_fprobe_entry(struct fprobe *fp, unsigned long ip, struct ftrace_regs *regs, void *private)
> {
>         struct bpf_fprobe_call_context *call_ctx = private;
>         struct bpf_fprobe_context *fprobe_ctx = fp->ops.private;
>         struct bpf_tramp_links *links = fprobe_ctx->links;
>         struct bpf_tramp_links *fentry = &links[BPF_TRAMP_FENTRY];
>         struct bpf_tramp_links *fmod_ret = &links[BPF_TRAMP_MODIFY_RETURN];
>         struct bpf_tramp_links *fexit = &links[BPF_TRAMP_FEXIT];
>         int i, ret;
>
>         memset(&call_ctx->ctx, 0, sizeof(call_ctx->ctx));
>         call_ctx->ip = ip;
>         for (i = 0; i < fprobe_ctx->nr_args; i++)
>                 call_ctx->args[i] = ftrace_regs_get_argument(regs, i);
>
>         for (i = 0; i < fentry->nr_links; i++)
>                 call_bpf_prog(fentry->links[i], &call_ctx->ctx, call_ctx->args);
>
>         call_ctx->args[fprobe_ctx->nr_args] = 0;
>         for (i = 0; i < fmod_ret->nr_links; i++) {
>                 ret = call_bpf_prog(fmod_ret->links[i], &call_ctx->ctx,
>                                     call_ctx->args);
>
>                 if (ret) {
>                         ftrace_regs_set_return_value(regs, ret);
>                         ftrace_override_function_with_return(regs);
>
>                         bpf_fprobe_exit(fp, ip, regs, private);
>                         return false;
>                 }
>         }
>
>         return fexit->nr_links;
> }
>
> There's a lot of low hanging fruit to speed up there. I wouldn't be too
> fast to throw out this solution if it hasn't had the care that direct calls
> have had to speed that up.
>
> For example, trampolines currently only allow to attach to functions with 6
> parameters or less (3 on x86_32). You could make 7 specific callbacks, with
> zero to 6 parameters, and unroll the argument loop.

Sure, we can give this a try. I'll work on a macro that generates the
7 callbacks and we can check how much that helps. My belief right now
is that ftrace's iteration over all ops on arm64 is where we lose most
time, but now that we have numbers it's pretty easy to check
hypotheses :)
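
For illustration, the "macro that generates the 7 callbacks" idea could
look roughly like the userspace sketch below. It is not the code from the
fprobe-min-args branch: struct mock_regs, mock_get_argument(), the COPY_n
and DEFINE_ENTRY macros, entry_table and select_entry() are all
hypothetical names standing in for struct ftrace_regs,
ftrace_regs_get_argument() and whatever the real patch ends up using. It
only shows the shape of the unrolling: one callback per argument count,
each copying its arguments without a loop, selected once up front.

```c
#include <assert.h>

#define MAX_ARGS 6

/* Hypothetical stand-in for struct ftrace_regs: only the argument registers. */
struct mock_regs {
	unsigned long arg[MAX_ARGS];
};

/* Hypothetical stand-in for ftrace_regs_get_argument(). */
static inline unsigned long mock_get_argument(const struct mock_regs *regs, int n)
{
	return regs->arg[n];
}

/*
 * COPY_n copies exactly n arguments with no loop. Each level expands the
 * previous one and adds a single store, so the result is straight-line code.
 */
#define COPY_0(r, o) do { (void)(r); (void)(o); } while (0)
#define COPY_1(r, o) do { COPY_0(r, o); (o)[0] = mock_get_argument((r), 0); } while (0)
#define COPY_2(r, o) do { COPY_1(r, o); (o)[1] = mock_get_argument((r), 1); } while (0)
#define COPY_3(r, o) do { COPY_2(r, o); (o)[2] = mock_get_argument((r), 2); } while (0)
#define COPY_4(r, o) do { COPY_3(r, o); (o)[3] = mock_get_argument((r), 3); } while (0)
#define COPY_5(r, o) do { COPY_4(r, o); (o)[4] = mock_get_argument((r), 4); } while (0)
#define COPY_6(r, o) do { COPY_5(r, o); (o)[5] = mock_get_argument((r), 5); } while (0)

/*
 * Generate one entry callback per argument count. 'out' must have room for
 * MAX_ARGS + 1 slots; slot n is zeroed, mirroring args[nr_args] = 0 in the
 * quoted bpf_fprobe_entry().
 */
#define DEFINE_ENTRY(n)							\
static void entry_##n(const struct mock_regs *regs, unsigned long *out)	\
{									\
	COPY_##n(regs, out);						\
	out[n] = 0;							\
}

DEFINE_ENTRY(0)
DEFINE_ENTRY(1)
DEFINE_ENTRY(2)
DEFINE_ENTRY(3)
DEFINE_ENTRY(4)
DEFINE_ENTRY(5)
DEFINE_ENTRY(6)

typedef void (*entry_fn)(const struct mock_regs *regs, unsigned long *out);

/* The 7 generated callbacks, indexed by argument count. */
static const entry_fn entry_table[MAX_ARGS + 1] = {
	entry_0, entry_1, entry_2, entry_3, entry_4, entry_5, entry_6,
};

/* Pick the specialised callback once (for example at attach time). */
static entry_fn select_entry(int nr_args)
{
	assert(nr_args >= 0 && nr_args <= MAX_ARGS);
	return entry_table[nr_args];
}
```

A caller would do something like select_entry(3)(&regs, out) and get
out[0..2] filled from the mock registers and out[3] zeroed. Whether this
saves a measurable amount compared to the ops iteration cost is exactly
what the benchmark numbers would have to show.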