Received: by 2002:a05:6a10:22f:0:0:0:0 with SMTP id 15csp107160pxk; Tue, 1 Sep 2020 17:47:55 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyrcDp+uxlUOpXwFxSUzgcaD9ITFpjjAH8uUHWTTMOsLsKzdb06/P2JRLteDkk1Y1RxUV/a X-Received: by 2002:a17:906:3890:: with SMTP id q16mr3879846ejd.107.1599007675605; Tue, 01 Sep 2020 17:47:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1599007675; cv=none; d=google.com; s=arc-20160816; b=DdPqpyOZJ23e3I7mYWlTMylqfG+muN4eDSjryYiSAgrQel8iSetYuSmzUdsftSZDqr ubG+nj3Cs7iorClL5MIzQ/nWtNkrM5CAsR30Y8g2UU1qzXU6DgiK2lrZc8MebmPmeSDc Oa4jLGIA2MOFt/9PkkeYVpNkHFAlvTjwyUtqVKJ7JAOZucL9J0mWN3P7UivTC7z4SWWv YEaAa0W29s5rlyjLzVWfIQczBUHRAYq842Svg4t4wcAvIVIfmf3cplpfyUHG29PRZelQ UaJWwFq3i8QcrFY7NlEUg3a9Y2NDuCpxP65715jUKd73hGwvMZ9EfAfdjLo/PKeZTtuB 1Gbw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=3RFj3YnGQagsUtiAD7e8C2sZcfN3W2rXzRlDvYIWW7s=; b=f0Y4IqSKHN7zlSnYEjuBs+E4Vk++gjeMyS2oBxmXYbsJdCmTkaNuio2MjK/YoXWzks DCFbMm/Hf/H96LAt9sXhOIT2TOnXghYCW/M3fUBtvhjJ0BPR1+iGwlpfvHJaan7Hjz4m bnDIMbVBTqSse344cdGr0GyCZo3phnv5uYBsvs38/RuFkYhIkxkpJ3sAgmpFB9FZHitp u0RM3ONKBaMdwzekf33HBSkMOBjwHoaTKwPYgrf6HXCW3n/4GkybSbutWK0zgPCwMak5 ZI5+X54xndzhJxO1p0jsuqUHOZvILL05EKbrDnkrFGQpioGzD3i/xRhENmjjTXk30+zV 3MpQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=F+tEijnE; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id s20si1519065edr.422.2020.09.01.17.47.31; Tue, 01 Sep 2020 17:47:55 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=F+tEijnE; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726900AbgIBAq6 (ORCPT + 99 others); Tue, 1 Sep 2020 20:46:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55370 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726078AbgIBAq4 (ORCPT ); Tue, 1 Sep 2020 20:46:56 -0400 Received: from mail-ej1-x641.google.com (mail-ej1-x641.google.com [IPv6:2a00:1450:4864:20::641]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 61F71C061245 for ; Tue, 1 Sep 2020 17:46:54 -0700 (PDT) Received: by mail-ej1-x641.google.com with SMTP id q13so4181407ejo.9 for ; Tue, 01 Sep 2020 17:46:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=3RFj3YnGQagsUtiAD7e8C2sZcfN3W2rXzRlDvYIWW7s=; b=F+tEijnEaP6XzWJCoTEnnYfkcnfw9PHhdLwctg0OgJBsJLCwO1q29kG+mKooqkfm9z wOzC8S6Ez7jRmfHjnYU4N+X3IRu9oa/76LjPtTGBhBbwaGeLxAsvvh8XkQtgjTEf3mEB YKRynt7dccB8ewFztug3K7cwOhDsnxd5BibpUa1y1JgF/vGw8LE586JgYCNHrOr2sugi /OvjPKT2oBbxnsRcystaJmpXO4zjXdkMglv7fH/FE9sNpXtdA2GjkWidyWGXOSbpIVzR 2WaU/MQmu79XeZs9220HMwjtl/Ya0oBX0Y//04++14V0MfomABlsI+Pc5YAm9zlNIOfd cOcQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=3RFj3YnGQagsUtiAD7e8C2sZcfN3W2rXzRlDvYIWW7s=; b=obWu2Ws1SJs7k9zAAsyAYqB7zjt5MhD2kCYqrA22Zc+Wr/7+TNSSq8cI4MzGXydch1 2yCCQk9cM/HUUIkd2lAvw+jMKvWJNDryD4tX7yXCCX5N1iXOekVsgIoawcPMup7SK3B7 7LCHEqIP8bezjJ+8CvBytCYnkfbEyWbJ3hxCNMWCCso9hn8strjURizLX5Bm9NAmZYF3 yJdMHEvJDwQ3c8RtLY8NJVkdDuxAcGmMVCUzVLHpB5P4clbEa58Bs/xurORLJ2wxMJSh BTYWSN1v71X4Gbc7qBRmALoWe0MkfoP/0rZrhz0TQgsW9y3pnPFfYcO+wuvZjQIspbT1 yVIg== X-Gm-Message-State: AOAM5301cDhTPdX6z15xmQ3Lq7b6JlZxRWwgfD1ZOhAjNbSihY8t/dBD LYA1uePtgk0VWhqo7bcUM7IJnEMQkAzTuD+Ze/kEIw== X-Received: by 2002:a17:907:72d2:: with SMTP id du18mr4212453ejc.359.1599007613121; Tue, 01 Sep 2020 17:46:53 -0700 (PDT) MIME-Version: 1.0 References: <20200819224030.1615203-1-haoluo@google.com> <20200819224030.1615203-5-haoluo@google.com> In-Reply-To: From: Hao Luo Date: Tue, 1 Sep 2020 17:46:41 -0700 Message-ID: Subject: Re: [PATCH bpf-next v1 4/8] bpf/libbpf: BTF support for typed ksyms To: Andrii Nakryiko Cc: Networking , bpf , open list , "open list:KERNEL SELFTEST FRAMEWORK" , Shuah Khan , Alexei Starovoitov , Andrii Nakryiko , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , John Fastabend , KP Singh , Quentin Monnet , Steven Rostedt , Ingo Molnar , Andrey Ignatov , Jakub Sitnicki Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Sep 1, 2020 at 4:55 PM Andrii Nakryiko wrote: > > On Tue, Sep 1, 2020 at 1:35 PM Hao Luo wrote: > > > > On Tue, Sep 1, 2020 at 11:11 AM Andrii Nakryiko > > wrote: > > > > > > On Thu, Aug 27, 2020 at 3:29 PM Hao Luo wrote: > > > > > > > > On Fri, Aug 21, 2020 at 3:37 PM Andrii Nakryiko > > > > wrote: > > > > > > > > > > On Wed, Aug 19, 2020 at 3:42 PM Hao Luo wrote: > > > > > > > > > > > > If a ksym is defined with a type, libbpf will try to find the ksym's btf > > > > > > information from kernel btf. If a valid btf entry for the ksym is found, > > > > > > libbpf can pass in the found btf id to the verifier, which validates the > > > > > > ksym's type and value. > > > > > > > > > > > > Typeless ksyms (i.e. those defined as 'void') will not have such btf_id, > > > > > > but it has the symbol's address (read from kallsyms) and its value is > > > > > > treated as a raw pointer. > > > > > > > > > > > > Signed-off-by: Hao Luo > > > > > > --- > > > > > > tools/lib/bpf/libbpf.c | 130 ++++++++++++++++++++++++++++++++++++----- > > > > > > 1 file changed, 114 insertions(+), 16 deletions(-) > > > > > > > > > > > > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c > > > > > > index 4a81c6b2d21b..94eff612c7c2 100644 > > > > > > --- a/tools/lib/bpf/libbpf.c > > > > > > +++ b/tools/lib/bpf/libbpf.c > > > > > > @@ -357,7 +357,16 @@ struct extern_desc { > > > > > > bool is_signed; > > > > > > } kcfg; > > > > > > struct { > > > > > > - unsigned long long addr; > > > > > > + /* > > > > > > + * 1. If ksym is typeless, the field 'addr' is valid. > > > > > > + * 2. If ksym is typed, the field 'vmlinux_btf_id' is > > > > > > + * valid. > > > > > > + */ > > > > > > + bool is_typeless; > > > > > > + union { > > > > > > + unsigned long long addr; > > > > > > + int vmlinux_btf_id; > > > > > > + }; > > > > > > > > > > ksym is 16 bytes anyways, union doesn't help to save space. I propose > > > > > to encode all this with just two fields: vmlinux_btf_id and addr. If > > > > > btf_id == 0, then extern is typeless. > > > > > > > > Ack on expanding the union. But I slightly preferred keeping > > > > is_typeless. IIUC, btf_id points a VAR_KIND, we need the following > > > > pointer chasing every time > > > > > > > > t = btf__type_by_id(obj->btf, ext->btf_id); > > > > t->type; > > > > > > > > which I felt is worse than keeping a is_typeless flag. > > > > > > Sorry, I'm not following. In all places where you would check > > > sym->is_typeless, you'd now just do: > > > > > > if (ext->ksym.vmlinux_btf_id) { > > > /* typed, use ext->ksym.vmlinux_btf_id */ > > > } else { > > > /* typeless */ > > > } > > > > > > > My apologies, I should be more specific. > > > > 'vmlinux_btf_id' gets its value in bpf_object__resolve_ksyms_btf_id(). > > Before we call this function, there are three places that need to tell > > whether a ksym is typed, currently in v1. Specifically, > > > > - in bpf_object__collect_externs(), typeless ksyms are rewritten as > > 'int', in contrast, typed ones are left untouched (though this may > > change in v2). > > - bpf_object__load_vmlinux_btf() now is called before > > bpf_object__resolve_ksyms_btf_id(). In v1's design, if there is no > > typed ksym, we could skip loading vmlinux_btf potentially. > > - even bpf_object__resolve_ksyms_btf_id() itself is conditionally > > called, depending on whether there is any typed ksym. > > > > At the time when these places are called, vmlinux_btf_id is > > unavailable and we can't use it for the purpose of telling whether a > > ksym is typed. > > > > However, rather than vmlinux_btf_id, there may be an alternative. We > > can record the ksym extern's type's btf_id and use that as > > 'is_typeless' flag. This also solves the problem below. > > Oh, I was thinking that vmlinux_btf_id contains a local BTF ID this > whole time (clearly ignoring the "vmlinux_" part). > > > > > [...] > > > > > > > > } else { > > > > > > pr_warn("unrecognized extern section '%s'\n", sec_name); > > > > > > return -ENOTSUP; > > > > > > @@ -2992,9 +3006,9 @@ static int bpf_object__collect_externs(struct bpf_object *obj) > > > > > > /* sort externs by type, for kcfg ones also by (align, size, name) */ > > > > > > qsort(obj->externs, obj->nr_extern, sizeof(*ext), cmp_externs); > > > > > > > > > > > > - /* for .ksyms section, we need to turn all externs into allocated > > > > > > - * variables in BTF to pass kernel verification; we do this by > > > > > > - * pretending that each extern is a 8-byte variable > > > > > > + /* for .ksyms section, we need to turn all typeless externs into > > > > > > + * allocated variables in BTF to pass kernel verification; we do > > > > > > + * this by pretending that each typeless extern is a 8-byte variable > > > > > > */ > > > > > > if (ksym_sec) { > > > > > > /* find existing 4-byte integer type in BTF to use for fake > > > > > > @@ -3012,7 +3026,7 @@ static int bpf_object__collect_externs(struct bpf_object *obj) > > > > > > > > > > > > sec = ksym_sec; > > > > > > n = btf_vlen(sec); > > > > > > - for (i = 0, off = 0; i < n; i++, off += sizeof(int)) { > > > > > > + for (i = 0, off = 0; i < n; i++) { > > > > > > struct btf_var_secinfo *vs = btf_var_secinfos(sec) + i; > > > > > > struct btf_type *vt; > > > > > > > > > > > > @@ -3025,9 +3039,14 @@ static int bpf_object__collect_externs(struct bpf_object *obj) > > > > > > return -ESRCH; > > > > > > } > > > > > > btf_var(vt)->linkage = BTF_VAR_GLOBAL_ALLOCATED; > > > > > > - vt->type = int_btf_id; > > > > > > + if (ext->ksym.is_typeless) { > > > > > > + vt->type = int_btf_id; > > > > > > + vs->size = sizeof(int); > > > > > > + } > > > > > > vs->offset = off; > > > > > > - vs->size = sizeof(int); > > > > > > + off += vs->size; > > > > > > + pr_debug("ksym var_secinfo: var '%s', type #%d, size %d, offset %d\n", > > > > > > + ext->name, vt->type, vs->size, vs->offset); > > > > > > > > > > It's a bit of a waste that we still allocate memory for those typed > > > > > ksym externs, as they don't really need space. But modifying BTF is a > > > > > pain right now, so I think we'll have to do it, until we have a better > > > > > BTF API. But let's make them integers for now to take a fixed and > > > > > small amount of space. > > > > > > > > > > > > > Do you mean making typed ksym externs of type integer? If so, we can't > > > > do that, I think. After collect_externs, we later need to compare the > > > > declared extern's type against the type defined in kernel. Better not > > > > rewrite their types in BTf. > > > > > > Then maybe we need to make btf_id to point to the actual type of the > > > variable, not BTF_KIND_VAR? Or just additionally record type's btf_id, > > > not sure which one makes more sense at the moment. > > > > > > > > > > > I am generally against modifying BTF. I initially didn't notice that > > > > all the ksym externs' types are modified to 'int' and the type > > > > comparison I mentioned above always failed. I dumped the btf in > > > > vmlinux and the btf in object file, checked the kernel variable's > > > > source code, printed out everything I could. The experience was very > > > > bad. > > > > > > > > > > It might be confusing, I agree, but the alternative is just a waste of > > > memory just to match the BTF definition of a DATASEC, which describes > > > externs. It seems sloppy to allocate a bunch of unused memory just to > > > match the kernel's variable size, while in reality we either use 8 > > > bytes used (for typeless externs, storing ksym address) or none (for > > > typed externs). > > > > > > Another alternative is to not specify BTF ID for .ksyms map, but it's > > > not great for typeless externs case, as we are losing all type info > > > completely. Trade-offs... > > > > > > > I see. It looks like rewriting all ksym externs' type to integers is > > the most straightforward solution here, though I felt a bit hacky. > > > > I can record the btf_id of the var's type before rewriting, so > > bpf_core_type_are_compat() can find the true type for comparison. One > > good thing about recording the type's btf_id is that it can be used to > > tell whether the ksym extern is typed or not, before vmlinux_btf_id > > that's what I've been getting at, but I missed that vmlinux_btf_id is > kernel BTF type ID. So let's record both local and target BTF type IDs > and use local_btf_id as an indicator of typed vs typeless? > Yup, that sounds great! [...]