Received: by 2002:a25:c205:0:0:0:0:0 with SMTP id s5csp3887122ybf; Tue, 3 Mar 2020 15:04:50 -0800 (PST) X-Google-Smtp-Source: ADFU+vt+UILXbEyWbUakDSTBCzlQcFQEnpGMmRT4cHHJ2yKSDK81rgq/lC/ez/tZ27ifYImQcTbx X-Received: by 2002:a05:6830:1e9b:: with SMTP id n27mr169602otr.358.1583276690735; Tue, 03 Mar 2020 15:04:50 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1583276690; cv=none; d=google.com; s=arc-20160816; b=eKbzS7GpIpOGFuwANwKOa7SSj/3YjtcQeimj9OmI+NCvI6B4iAC723M+E/jQXnzkQC koAPgvX+5Cl3wVs9e1lB8h3DK2dxDuUNe8gU5Ce2Jbi4wRMbTYDLNjeEimLl/e+bEzQR 1yCYpO96kY86WjqMzpKtXi43zX3WSMfPX0+xljAH+9DacKWucbm297SpXZozkE5GbXCV zbpvvGAHhH1ZFeLP2F3WFU/fFT4PLyM8yIyfXB5Z30m32iZxGdgv/j7zIHruqN6D52Mx Lo1CWtdAXhsCc2/AX1suX8Kqw4htL8QlG2CZLNS8Ngp/86AXgToUCpx9C9X4mKl3abOQ EnlA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=ULQ8Ww0+zX2XI5Cz9WZdhm5XRSdywFUQRM8EyK7ZIlk=; b=sqcHpKQnD0eyiH+sTLloYZzmvCYvAXbWCNaMnn+HGGyKk+r3OelKb56EGyLIE0L04+ ti5q4dnzexbhk2VXffFi7RtRGSOLspd7vvsL86AMWFDz2NHOLQtymDosKLOoNkImEIxU vZBTs+Ny0Az/4Qbp7oENVSh2u+KNJPj7vhTmRXnERD4TB3AznwkWUG1AsUSVtyvyOmCc ANfWzkuyG7errfoU/92qCVETmq7PiIZkKtBRnUHAyxKPVoSh8lKRDuSalneYGs70ZVVk otOVEm07kGTGJB7atScuW2igGIEUL9+o4jGc7Q2e3vI744jSBC+mgMzQECnmnBl7dSYe gb9Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=jzQlv0ij; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id j10si23419otr.268.2020.03.03.15.04.37; Tue, 03 Mar 2020 15:04:50 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=jzQlv0ij; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728057AbgCCXDh (ORCPT + 99 others); Tue, 3 Mar 2020 18:03:37 -0500 Received: from mail-qk1-f194.google.com ([209.85.222.194]:40500 "EHLO mail-qk1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727665AbgCCXDh (ORCPT ); Tue, 3 Mar 2020 18:03:37 -0500 Received: by mail-qk1-f194.google.com with SMTP id m2so5223727qka.7; Tue, 03 Mar 2020 15:03:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=ULQ8Ww0+zX2XI5Cz9WZdhm5XRSdywFUQRM8EyK7ZIlk=; b=jzQlv0ijPviWdjpZX9KaKbCYdr9QLqShtXay6oEFPIGMt06YzQADwwIV6ulDB2ygV4 y/jWCTlKAhUSSid4e0ZwSIdXyG9uLtVA/sLHrj90rwlo1hGw93cBxuFko0n9K7nDV9RJ fZvKBsBh+WAHoFwQ1du5kZI8SB3ed0Nvgpv6Ff+HC/MffCDP61Pa8H37vUvdsppch5wW a/5PkbPgMYmWw9+0OZqccGsJ41q6OeDy51Z5tM82ZOdDhZR94B8ApwtB+LKTB9P2Zc9W uTr/pNwqaEu7YCHtwIHIxptF/qJiw2de7GNp2Hu9eOa8uEui1onevbXjzOZcNm2fn3Kb D/Sw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=ULQ8Ww0+zX2XI5Cz9WZdhm5XRSdywFUQRM8EyK7ZIlk=; b=Js+apCyNab3x59Clz67vBT9iCaiiQWxLIPKDFonQytetbDtyLV27ngTLgNNfdhD8vt PvYcvNDgewhuEuj/MUo/VKRki2346L2HAyMMQUzDT09WV/ppRsLxKVNlH7xf6uK+h64U gce7Fd4nDDg4j5T0P0Hzbktf/85/2YuCbiaM9o90fKNlJtUDMhSDUiBr7O6YXgbiIU6i yI4D48D+WuVPov78Q+HGJWEa6PU62rikMin4zDi/EdAFhBEdQ+daerH9wL4qZffIND8L FXE67gk4N7FuEiBUSzFmVMc9Hxd6hv93F+MgZjztsPRtfiykMvNMwS8ooV9dZaBdtMGr FSpg== X-Gm-Message-State: ANhLgQ2Sk2nPknMzIBGb0tFn0CO8qbMZgUgPEFLRpjMDH3F9XoIP9DbR JPmOD4yJz8/1ko/eB+xdsmQBiT38Dq2/JG+AZs8= X-Received: by 2002:a37:6716:: with SMTP id b22mr345307qkc.437.1583276615676; Tue, 03 Mar 2020 15:03:35 -0800 (PST) MIME-Version: 1.0 References: <20200303140950.6355-1-kpsingh@chromium.org> <20200303140950.6355-2-kpsingh@chromium.org> <20200303222433.GA3272@chromium.org> In-Reply-To: <20200303222433.GA3272@chromium.org> From: Andrii Nakryiko Date: Tue, 3 Mar 2020 15:03:24 -0800 Message-ID: Subject: Re: [PATCH bpf-next 1/7] bpf: Refactor trampoline update code To: KP Singh Cc: open list , bpf , Alexei Starovoitov , Daniel Borkmann , Paul Turner , Florent Revest , Brendan Jackman Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Mar 3, 2020 at 2:24 PM KP Singh wrote: > > On 03-M=C3=A4r 14:12, Andrii Nakryiko wrote: > > On Tue, Mar 3, 2020 at 6:13 AM KP Singh wrote: > > > > > > From: KP Singh > > > > > > As we need to introduce a third type of attachment for trampolines, t= he > > > flattened signature of arch_prepare_bpf_trampoline gets even more > > > complicated. > > > > > > Refactor the prog and count argument to arch_prepare_bpf_trampoline t= o > > > use bpf_tramp_progs to simplify the addition and accounting for new > > > attachment types. > > > > > > Signed-off-by: KP Singh > > > --- > > > arch/x86/net/bpf_jit_comp.c | 31 +++++++++--------- > > > include/linux/bpf.h | 13 ++++++-- > > > kernel/bpf/bpf_struct_ops.c | 13 +++++++- > > > kernel/bpf/trampoline.c | 63 +++++++++++++++++++++--------------= -- > > > 4 files changed, 75 insertions(+), 45 deletions(-) > > > > > > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.= c > > > index 9ba08e9abc09..15c7d28bc05c 100644 > > > --- a/arch/x86/net/bpf_jit_comp.c > > > +++ b/arch/x86/net/bpf_jit_comp.c > > > @@ -1362,12 +1362,12 @@ static void restore_regs(const struct btf_fun= c_model *m, u8 **prog, int nr_args, > > > } > > > > > > static int invoke_bpf(const struct btf_func_model *m, u8 **pprog, > > > - struct bpf_prog **progs, int prog_cnt, int stac= k_size) > > > + struct bpf_tramp_progs *tp, int stack_size) > > > > nit: it's `tp` here, but `tprogs` in arch_prepare_bpf_trampoline. It's > > minor, but would be nice to stick to consistent naming. > > I did this to ~distinguish~ that rather than being an array of > tprogs it's a pointer to one of its members e.g. > &tprogs[BPF_TRAMP_FEXIT]). > > I change it if you feel this is not a valuable disntinction. I think it's important distinction, but naming doesn't really help with it... Not sure how you can make it more clear, though. > > > > > > { > > > u8 *prog =3D *pprog; > > > int cnt =3D 0, i; > > > > > > - for (i =3D 0; i < prog_cnt; i++) { > > > + for (i =3D 0; i < tp->nr_progs; i++) { > > > if (emit_call(&prog, __bpf_prog_enter, prog)) > > > return -EINVAL; > > > /* remember prog start time returned by __bpf_prog_en= ter */ > > > @@ -1376,17 +1376,17 @@ static int invoke_bpf(const struct btf_func_m= odel *m, u8 **pprog, > > > /* arg1: lea rdi, [rbp - stack_size] */ > > > EMIT4(0x48, 0x8D, 0x7D, -stack_size); > > > /* arg2: progs[i]->insnsi for interpreter */ > > > - if (!progs[i]->jited) > > > + if (!tp->progs[i]->jited) > > > emit_mov_imm64(&prog, BPF_REG_2, > > > - (long) progs[i]->insnsi >> 32, > > > - (u32) (long) progs[i]->insnsi)= ; > > > + (long) tp->progs[i]->insnsi >>= 32, > > > + (u32) (long) tp->progs[i]->ins= nsi); > > > /* call JITed bpf program or interpreter */ > > > - if (emit_call(&prog, progs[i]->bpf_func, prog)) > > > + if (emit_call(&prog, tp->progs[i]->bpf_func, prog)) > > > return -EINVAL; > > > > > > /* arg1: mov rdi, progs[i] */ > > > - emit_mov_imm64(&prog, BPF_REG_1, (long) progs[i] >> 3= 2, > > > - (u32) (long) progs[i]); > > > + emit_mov_imm64(&prog, BPF_REG_1, (long) tp->progs[i] = >> 32, > > > + (u32) (long) tp->progs[i]); > > > /* arg2: mov rsi, rbx <- start time in nsec */ > > > emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6); > > > if (emit_call(&prog, __bpf_prog_exit, prog)) > > > @@ -1458,12 +1458,13 @@ static int invoke_bpf(const struct btf_func_m= odel *m, u8 **pprog, > > > */ > > > int arch_prepare_bpf_trampoline(void *image, void *image_end, > > > const struct btf_func_model *m, u32 f= lags, > > > - struct bpf_prog **fentry_progs, int f= entry_cnt, > > > - struct bpf_prog **fexit_progs, int fe= xit_cnt, > > > + struct bpf_tramp_progs *tprogs, > > > void *orig_call) > > > { > > > int cnt =3D 0, nr_args =3D m->nr_args; > > > int stack_size =3D nr_args * 8; > > > + struct bpf_tramp_progs *fentry =3D &tprogs[BPF_TRAMP_FENTRY]; > > > + struct bpf_tramp_progs *fexit =3D &tprogs[BPF_TRAMP_FEXIT]; > > > u8 *prog; > > > > > > /* x86-64 supports up to 6 arguments. 7+ can be added in the = future */ > > > > [...] > > > > > diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.= c > > > index c498f0fffb40..a011a77b21fa 100644 > > > --- a/kernel/bpf/bpf_struct_ops.c > > > +++ b/kernel/bpf/bpf_struct_ops.c > > > @@ -320,6 +320,7 @@ static int bpf_struct_ops_map_update_elem(struct = bpf_map *map, void *key, > > > struct bpf_struct_ops_value *uvalue, *kvalue; > > > const struct btf_member *member; > > > const struct btf_type *t =3D st_ops->type; > > > + struct bpf_tramp_progs *tprogs =3D NULL; > > > void *udata, *kdata; > > > int prog_fd, err =3D 0; > > > void *image; > > > @@ -425,10 +426,19 @@ static int bpf_struct_ops_map_update_elem(struc= t bpf_map *map, void *key, > > > goto reset_unlock; > > > } > > > > > > + tprogs =3D kcalloc(BPF_TRAMP_MAX, sizeof(struct bpf_t= ramp_progs), > > > > nit: sizeof(*tprogs) ? > > Sure. I can fix it. > > > > > > + GFP_KERNEL); > > > + if (!tprogs) { > > > + err =3D -ENOMEM; > > > + goto reset_unlock; > > > + } > > > + > > > + *tprogs[BPF_TRAMP_FENTRY].progs =3D prog; > > > > I'm very confused what's going on here, why * at the beginning here, > > but no * below?.. It seems unnecessary. > > The progs member of bpf_tramp_progs is > > struct bpf_tramp_progs { > struct bpf_prog *progs[BPF_MAX_TRAMP_PROGS]; > int nr_progs; > }; > > Equivalent to the **progs we had before in the signature of > arch_prepare_bpf_trampoline. > > *tprogs[BPF_TRAMP_FENTRY].progs =3D prog; > > is setting the program in the **progs. The one below is setting the > count. Am I missing something :) Ok, so it's setting entry 0 in bpf_tramp_progs->progs array, right? Wouldn't it be less mind-bending and confusing written this way: tprogs[BPF_TRAMP_FENTRY].progs[0] =3D prog; ? Syntax you used treats fixed-length progs array as a pointer, which is valid C, but not the best C either. > > > > > > + tprogs[BPF_TRAMP_FENTRY].nr_progs =3D 1; > > > err =3D arch_prepare_bpf_trampoline(image, > > > st_map->image + PAG= E_SIZE, > > > &st_ops->func_model= s[i], 0, > > > - &prog, 1, NULL, 0, = NULL); > > > + tprogs, NULL); > > > if (err < 0) > > > goto reset_unlock; > > > > > > @@ -469,6 +479,7 @@ static int bpf_struct_ops_map_update_elem(struct = bpf_map *map, void *key, > > > memset(uvalue, 0, map->value_size); > > > memset(kvalue, 0, map->value_size); > > > unlock: > > > + kfree(tprogs); > > > mutex_unlock(&st_map->lock); > > > return err; > > > } > > > diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c > > > index 704fa787fec0..9daeb094f054 100644 > > > --- a/kernel/bpf/trampoline.c > > > +++ b/kernel/bpf/trampoline.c > > > @@ -190,40 +190,50 @@ static int register_fentry(struct bpf_trampolin= e *tr, void *new_addr) > > > return ret; > > > } > > > > > > -/* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit= is ~50 > > > - * bytes on x86. Pick a number to fit into BPF_IMAGE_SIZE / 2 > > > - */ > > > -#define BPF_MAX_TRAMP_PROGS 40 > > > +static struct bpf_tramp_progs * > > > +bpf_trampoline_update_progs(struct bpf_trampoline *tr, int *total) > > > +{ > > > + struct bpf_tramp_progs *tprogs; > > > + struct bpf_prog **progs; > > > + struct bpf_prog_aux *aux; > > > + int kind; > > > + > > > + *total =3D 0; > > > + tprogs =3D kcalloc(BPF_TRAMP_MAX, sizeof(struct bpf_tramp_pro= gs), > > > > same nit as above, sizeof(*tprogs) is shorter and less error-prone > > Sure, can fix it. > > - KP > > > > > > + GFP_KERNEL); > > > + if (!tprogs) > > > + return ERR_PTR(-ENOMEM); > > > + > > > + for (kind =3D 0; kind < BPF_TRAMP_MAX; kind++) { > > > + tprogs[kind].nr_progs =3D tr->progs_cnt[kind]; > > > + *total +=3D tr->progs_cnt[kind]; > > > + progs =3D tprogs[kind].progs; > > > + > > > + hlist_for_each_entry(aux, &tr->progs_hlist[kind], tra= mp_hlist) > > > + *progs++ =3D aux->prog; > > > + } > > > + return tprogs; > > > +} > > > > > > > [...]