Received: by 2002:ab2:69cc:0:b0:1fd:c486:4f03 with SMTP id n12csp242921lqp; Tue, 11 Jun 2024 03:13:57 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCWxSawgo6+mAlFFqdbOmLiR0Vq8nHTvJmCrrvfYcx0QpVV+yLzgfPlFhm8JHIH1bE2ed989fDu2/RqU4LiN/hq9V0M5303trf2h2GBdPQ== X-Google-Smtp-Source: AGHT+IEjZrgwZZDr4eWM+/q5OzDMvkFK0Uwn/t/QcX2m8l1qsS6SqDch5eTeyo6I9iVYh7eeslaL X-Received: by 2002:a67:ebd0:0:b0:48c:47c3:7857 with SMTP id ada2fe7eead31-48c47c37b1bmr6850939137.29.1718100836943; Tue, 11 Jun 2024 03:13:56 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1718100836; cv=pass; d=google.com; s=arc-20160816; b=njVBz0hvcV51okWsTKIfIj/8NcJ84MDBrYSxR/MchQhICRTT+PTEwY3o+XyCSMdAZn XcNvYFxQlen2sWcBnv7a5wUK4QCkSkhPOGrJP/YBjKiWbmuv9J0OZgiKNPZ44uOkpoln XUWnakygVoWVR544vv1C2TdL4Gp6VnAROOkCXOSNR8+B7pP5CE8ScQem7VASIqlMlXHP xYO5UxRMAVAC0tK4tJb2MHYWjzfnPAgGSwFbUaj0BAB1wlc38akTCFP/G3xqJ2S+UzhW jABS1mWN0PGjNij6Vy/3emeoIrkAnoK4tWpUSoJyWDwJyM0wX37CMtLkAr8HLZGQlR3h tSEg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=mime-version:list-unsubscribe:list-subscribe:list-id:precedence :user-agent:content-transfer-encoding:references:in-reply-to:date:cc :to:from:subject:message-id:dkim-signature; bh=kBBK9+emp78zSyx4y6b6AylUTB0AMIDwhsId20BEYMs=; fh=nyGzeokOT19o62Ezgel88UI+K2FaoGG6ctA5yilvySo=; b=OnAeR3lEUCRw8bZWld/jW0inUC00qFTkJgHX4bvhRuPaO+kCKCi0KHRFoTIri1/5jA Mozhlh9O4r6mDUwdEwT8JMKfLyf+obD6LYnTH+8emylIeL9QQiRAaIRKMEAg+jdR+Nkz FJMMxIeQL4NduOe7y8Tkw6Zmve9gXY6OFCwDV3zXKBUTb7Ol1mJrHKMqyOYq9f5fCg4M SANNpypmsUugWajAS6EfEkGtfiSduNgwuKjnqFKzeYiQEDxcMew0LwNFgru2xuSBgFqx Gfkf/O/AtO+F9Tpgi+gzHM576dVqZ7lsONzcp9jTrvK4BAXkz3ArP8doAwq0wIGm7Mg+ 8uvA==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@gmail.com header.s=20230601 header.b=OBb+9zB5; arc=pass (i=1 spf=pass spfdomain=gmail.com dkim=pass dkdomain=gmail.com dmarc=pass fromdomain=gmail.com); spf=pass (google.com: domain of linux-kernel+bounces-209619-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-209619-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [2604:1380:45d1:ec00::1]) by mx.google.com with ESMTPS id af79cd13be357-7954fdba9b4si883690085a.733.2024.06.11.03.13.56 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 11 Jun 2024 03:13:56 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-209619-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) client-ip=2604:1380:45d1:ec00::1; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20230601 header.b=OBb+9zB5; arc=pass (i=1 spf=pass spfdomain=gmail.com dkim=pass dkdomain=gmail.com dmarc=pass fromdomain=gmail.com); spf=pass (google.com: domain of linux-kernel+bounces-209619-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-209619-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 49AE11C23225 for ; Tue, 11 Jun 2024 10:13:35 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 1ECB817839C; Tue, 11 Jun 2024 10:13:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="OBb+9zB5" Received: from mail-pf1-f170.google.com (mail-pf1-f170.google.com [209.85.210.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B0FF8E57E; Tue, 11 Jun 2024 10:13:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.170 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718100805; cv=none; b=O6UOkRViTdAkfcat6bgYtRSJk2wcqW/AsFvJwQ7p9xxqwzZvvvmeXXVbibGQUZP39oKKi3KbJsdEiaRdD1cDE9SABiVtDtDR0GR5FBzVDL+edi6BWAkc1gIJDfleYkAR78fKzQTSwPZsrH1NbdAs1IQo3lWrSm80Gxek37tyU30= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718100805; c=relaxed/simple; bh=NFplBVQS9EeQgek8xEOVuqUTSZBHZKK+8349c2uE6pk=; h=Message-ID:Subject:From:To:Cc:Date:In-Reply-To:References: Content-Type:MIME-Version; b=eCJ+/rhji/FEVGnEN4pHPOHA2kpz/zrLgRNb+RYcs/0LtIboy7Vd0bI7PYVB7t7yv9jqWSia1SHRmT2kCBKFhiMYMuXV4+z8nGl/alCJTneV4I6+JQ190uufWFefoS2nMnKxau8l9toyhM4/HUbW5KxM4ZzheY/Wap6HEs7q/jE= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=OBb+9zB5; arc=none smtp.client-ip=209.85.210.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-pf1-f170.google.com with SMTP id d2e1a72fcca58-70436048c25so2003371b3a.0; Tue, 11 Jun 2024 03:13:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1718100803; x=1718705603; darn=vger.kernel.org; h=mime-version:user-agent:content-transfer-encoding:references :in-reply-to:date:cc:to:from:subject:message-id:from:to:cc:subject :date:message-id:reply-to; bh=kBBK9+emp78zSyx4y6b6AylUTB0AMIDwhsId20BEYMs=; b=OBb+9zB53q/hJGrH9jrt+fEM0j8WhgMHK/O2KpKYRIi4SJ56a3FaJ/DPfiFdYhDWpw qo7JobTDni5boxWNTFIttRuOqeLJOOsBHX4hRRhT+CgYo1TF3Z58m8en05suGJl9uEt6 gR4z3XmWVtUTX4KOPfW5X8hnwyt3h+oOuxDl+CzJyvJmbGBLy4GKAMqk0wVaEUwiGVZM e2pW1jiwztMdTgZ1/81oae3IiN9q1Lwxoq8BkUT9h1s7KyYYGk8i71HHq9aGuvo+4+a1 tO6FNnRRbgGZ1jwUrfHAM+KQVG1SxcBuXNoxj5gQlZGDpgy6ysLevUw2rwCIqkcVyfC8 hS6w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1718100803; x=1718705603; h=mime-version:user-agent:content-transfer-encoding:references :in-reply-to:date:cc:to:from:subject:message-id:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=kBBK9+emp78zSyx4y6b6AylUTB0AMIDwhsId20BEYMs=; b=HUNW+SokhCCIGkPNnfmP/wLejDE1C3cR783gHHEk3qmvRYkUhShcFOfg9bH34jrl6/ RWAeD3f0M8Ww72+I8I+8fYiZHRDrTdTbHvNScdjk7eBRVTPNOlkFKmEWufdNHNAggjP7 EzQE1rpgnGC567MMGehN4LG1n+BDUTxP2qYHFmkwdZEha0w7k0ClOuYqtGdc8xldQ8GP i7toflA0js3Grk1XtquVN+LXH5yE+hd4qMA7YMDZSJgYlKVhZkWUtDeYKpEB38H8dG+L 76oSDiWjNPRsB/+cKC4EtYTf3J64fd69jGYN3MvxdSghwrFTbczOo82dJKGrkTaANDUQ pAEw== X-Forwarded-Encrypted: i=1; AJvYcCWrIhfPij/kf0d5n8QhEHkwGkzcKCxJ566A21lx+pnbkOjbwOUOVP0twgHEMuyjm/hg0qp3SHP0gUqPmMUnQk240ZUKzB2KG3MVzbzKwTaiEiTezsn9453hKXIap60i/WLL X-Gm-Message-State: AOJu0YygWoMVTYfaqkzjoYchxmy9MhXZxdfwfi0dIrJITqSj8PWMr8b7 wUjWaScurGHHwQ8Jby4oIRutfl7cEcmqRHA39dqEP82RKl31xvTp X-Received: by 2002:a05:6a20:d49a:b0:1b7:4adb:1dcc with SMTP id adf61e73a8af0-1b74adb1eaemr3821491637.60.1718100802757; Tue, 11 Jun 2024 03:13:22 -0700 (PDT) Received: from [192.168.0.31] ([38.34.87.7]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-704228c2c74sm5499728b3a.209.2024.06.11.03.13.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 11 Jun 2024 03:13:22 -0700 (PDT) Message-ID: <4f551dc5fc792936ca364ce8324c0adea38162f1.camel@gmail.com> Subject: Re: [RFC PATCH v3] bpf: Using binary search to improve the performance of btf_find_by_name_kind From: Eduard Zingerman To: Donglin Peng , ast@kernel.org, andrii , alan.maguire@oracle.com, acme@kernel.org Cc: daniel@iogearbox.net, mhiramat@kernel.org, song@kernel.org, andrii@kernel.org, haoluo@google.com, yonghong.song@linux.dev, bpf@vger.kernel.org, linux-kernel@vger.kernel.org Date: Tue, 11 Jun 2024 03:13:17 -0700 In-Reply-To: <20240608140835.965949-1-dolinux.peng@gmail.com> References: <20240608140835.965949-1-dolinux.peng@gmail.com> Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable User-Agent: Evolution 3.44.4-0ubuntu2 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 On Sat, 2024-06-08 at 07:08 -0700, Donglin Peng wrote: [...] > Changes in RFC v3: > - Sort the btf types during the build process in order to reduce memory = usage > and decrease boot time. >=20 > RFC v2: > - https://lore.kernel.org/all/20230909091646.420163-1-pengdonglin@sangfo= r.com.cn > --- > include/linux/btf.h | 1 + > kernel/bpf/btf.c | 160 +++++++++++++++++++++++++++++++++--- I think that kernel part is in a good shape, please split it as a separate commit. > tools/lib/bpf/btf.c | 195 ++++++++++++++++++++++++++++++++++++++++++++ > 3 files changed, 345 insertions(+), 11 deletions(-) [...] > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c > index 2d0840ef599a..93c1ab677bfa 100644 I'm not sure that libbpf is the best place to put this functionality, as there might be different kinds of orderings (e.g. see a fresh commit to bpftool to output stable vmlinux.h: 94133cf24bb3 "bpftool: Introduce btf c dump sorting"). I'm curious what Andrii, Alan and Arnaldo think on libbpf vs pahole for this feature. Also, I have a selftests build failure with this patch-set (and I suspect that a bunch of dedup test cases would need an update): $ pwd /home/eddy/work/bpf-next/tools/testing/selftests/bpf $ make -j14 test_progs ... GEN-SKEL [test_progs] access_map_in_map.skel.h Binary files /home/eddy/work/bpf-next/tools/testing/selftests/bpf/access_ma= p_in_map.bpf.linked2.o and /home/eddy/work/bpf-next/tools/testing/selftests= /bpf/access_map_in_map.bpf.linked3.o differ make: *** [Makefile:658: /home/eddy/work/bpf-next/tools/testing/selftests/b= pf/access_map_in_map.skel.h] Error 1 make: *** Waiting for unfinished jobs.... If this change remains in libbpf, I think it would be great to update btf_find_by_name_kind() to work the same way as kernel one. > --- a/tools/lib/bpf/btf.c > +++ b/tools/lib/bpf/btf.c [...] > +static int btf_sort_type_by_name(struct btf *btf) > +{ > + struct btf_type *bt; > + __u32 *new_type_offs =3D NULL, *new_type_offs_noname =3D NULL; > + __u32 *maps =3D NULL, *found_offs; > + void *new_types_data =3D NULL, *loc_data; > + int i, j, k, type_cnt, ret =3D 0, type_size; > + __u32 data_size; > + > + if (btf_ensure_modifiable(btf)) > + return libbpf_err(-ENOMEM); > + > + type_cnt =3D btf->nr_types; > + data_size =3D btf->type_offs_cap * sizeof(*new_type_offs); > + > + maps =3D (__u32 *)malloc(type_cnt * sizeof(__u32)); > + if (!maps) { > + ret =3D -ENOMEM; > + goto err_out; > + } > + > + new_type_offs =3D (__u32 *)malloc(data_size); > + if (!new_type_offs) { > + ret =3D -ENOMEM; > + goto err_out; > + } > + > + new_type_offs_noname =3D (__u32 *)malloc(data_size); > + if (!new_type_offs_noname) { > + ret =3D -ENOMEM; > + goto err_out; > + } What is the point of separating offsets in new_type_offs vs new_type_offs_noname? It should be possible to use a single offsets array and have a comparison function that puts all named types before unnamed. > + > + new_types_data =3D malloc(btf->types_data_cap); > + if (!new_types_data) { > + ret =3D -ENOMEM; > + goto err_out; > + } > + > + memset(new_type_offs, 0, data_size); > + > + for (i =3D 0, j =3D 0, k =3D 0; i < type_cnt; i++) { > + const char *name; > + > + bt =3D (struct btf_type *)(btf->types_data + btf->type_offs[i]); > + name =3D btf__str_by_offset(btf, bt->name_off); > + if (!name || !name[0]) > + new_type_offs_noname[k++] =3D btf->type_offs[i]; > + else > + new_type_offs[j++] =3D btf->type_offs[i]; > + } > + > + memmove(new_type_offs + j, new_type_offs_noname, sizeof(__u32) * k); > + > + qsort_r(new_type_offs, j, sizeof(*new_type_offs), > + btf_compare_type_name, btf); > + > + for (i =3D 0; i < type_cnt; i++) { > + found_offs =3D bsearch(&new_type_offs[i], btf->type_offs, type_cnt, > + sizeof(__u32), btf_compare_offs); > + if (!found_offs) { > + ret =3D -EINVAL; > + goto err_out; > + } > + maps[found_offs - btf->type_offs] =3D i; > + } > + > + loc_data =3D new_types_data; > + for (i =3D 0; i < type_cnt; i++, loc_data +=3D type_size) { > + bt =3D (struct btf_type *)(btf->types_data + new_type_offs[i]); > + type_size =3D btf_type_size(bt); > + if (type_size < 0) { > + ret =3D type_size; > + goto err_out; > + } > + > + memcpy(loc_data, bt, type_size); > + > + bt =3D (struct btf_type *)loc_data; > + switch (btf_kind(bt)) { Please take a look at btf_dedup_remap_types(): it uses newly added iterator interface to enumerate all ID references in the type. It could be used here to avoid enumerating every BTF kind. Also, the d->hypot_map could be used instead of `maps`. And if so, I think that it should be possible to put this pass before btf_dedup_remap_types() in order for it to do the remapping. Alternatively, it might make sense to merge this pass with btf_dedup_compact_types() in order to minimize number of operations, e.g. as in my crude attempt: https://github.com/eddyz87/bpf/tree/binsort-btf-dedup (fails with similar selftests issue). > + case BTF_KIND_PTR: > + case BTF_KIND_CONST: > + case BTF_KIND_VOLATILE: > + case BTF_KIND_RESTRICT: > + case BTF_KIND_TYPEDEF: > + case BTF_KIND_TYPE_TAG: > + case BTF_KIND_FUNC: > + case BTF_KIND_VAR: > + case BTF_KIND_DECL_TAG: > + bt->type =3D btf_get_mapped_type(btf, maps, bt->type); > + break; [...]