Received: by 2002:a05:7412:31a9:b0:e2:908c:2ebd with SMTP id et41csp1212965rdb; Sat, 9 Sep 2023 14:25:49 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFSfsoG67WiM72C0gNKwXohzb4tIZKRH/Ej/Fob5v1dq+mEOJFVvgQE0lT6n412XVC1Qd1y X-Received: by 2002:a05:6512:2213:b0:4ff:70d2:4512 with SMTP id h19-20020a056512221300b004ff70d24512mr5363406lfu.23.1694294749169; Sat, 09 Sep 2023 14:25:49 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1694294749; cv=none; d=google.com; s=arc-20160816; b=GoJkqJCVHYRRdZ+gw0b218xSVrtvhreIZ90FK/fCpcAy8GC2lFPwzki8RTIERXnjrL CCxfkjbZSfjWDXcfYWISeXxaUR4/GOu76MIYL//yTqp4KWSTW70H0IPq2SVcANUzOUNw N0ZR32zdHgiGv/9zYlXRVMXCTicAPUifM0/spOls0ntwONzs9u1iIa7utDbr77UdtjEJ a4GduAsayS4LlbGz6SBh9fyzLZAaL7yoy02Nk6GCfj1Wr3ALkJJWnLog5Yc2TX3q8tOJ LVeOgR9AANcZEObAFcTLCmPVB4tRrEp4aAVFSv5z+pJbw8+QjrOAbP891dnmPm9bAsOt dJng== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from; bh=vA0kYCYY5vhSjBKMFoFJj/CX8gGygH/Ej9sPDO9R1+o=; fh=CQbFFnKpU9WIWszm+6SxT0HcuNdmSBv3z541lZKA7zg=; b=ljv7B2pbbhloY6YTw9vSiFj1um+SPQT/4eBV6O23jfWDIZabiXupUoPwn9lNEE5iQQ tQ+dwlRyMMqwBfzmLQIwWfuMMhD5NoQomIOvY+O8Ik0AsHA6ytSlvVFGuOkNpELF9R78 1p4TtzLjFtKamDzgFqnhNNUCcyArSdK1VbPaK/hNPCw5bXjxbXgaUSoFkgwp6DMERzyX uYacsnqYfv6cC9ePeLz6BISbsoQ17/uuuXg+DeuOWEawUG0vqSu+522sB97ruv0JdFMi 8gGDNz01l1u/8Z+QCHtEWSlHr9FdScfiCMJqaE02G2+0JVEWjf64T7TASOstB76Cj12G wzsw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=sangfor.com.cn Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id t18-20020aa7d4d2000000b0052a47104f9bsi3747552edr.184.2023.09.09.14.25.13; Sat, 09 Sep 2023 14:25:48 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=sangfor.com.cn Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237792AbjIIJ0M (ORCPT + 99 others); Sat, 9 Sep 2023 05:26:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48894 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229957AbjIIJ0L (ORCPT ); Sat, 9 Sep 2023 05:26:11 -0400 X-Greylist: delayed 545 seconds by postgrey-1.37 at lindbergh.monkeyblade.net; Sat, 09 Sep 2023 02:26:03 PDT Received: from mail-m12770.qiye.163.com (mail-m12770.qiye.163.com [115.236.127.70]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 848C19C; Sat, 9 Sep 2023 02:26:03 -0700 (PDT) Received: from ubuntu.localdomain (unknown [123.120.52.233]) by mail-m15579.qiye.163.com (Hmail) with ESMTPA id D7EC8A801F5; Sat, 9 Sep 2023 17:16:49 +0800 (CST) From: Donglin Peng To: martin.lau@linux.dev, ast@kernel.org Cc: song@kernel.org, yhs@fb.com, rostedt@goodmis.org, mhiramat@kernel.org, dinghui@sangfor.com.cn, huangcun@sangfor.com.cn, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, Donglin Peng Subject: [RFC PATCH v2] bpf: Using binary search to improve the performance of btf_find_by_name_kind Date: Sat, 9 Sep 2023 02:16:46 -0700 Message-Id: <20230909091646.420163-1-pengdonglin@sangfor.com.cn> X-Mailer: git-send-email 2.25.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-HM-Spam-Status: e1kfGhgUHx5ZQUpXWQgPGg8OCBgUHx5ZQUlOS1dZFg8aDwILHllBWSg2Ly tZV1koWUFITzdXWS1ZQUlXWQ8JGhUIEh9ZQVlDGBoZVhgdGEIZSBhJQxlLTFUTARMWGhIXJBQOD1 lXWRgSC1lBWUpJSFVKSUtVTklVSUhIWVdZFhoPEhUdFFlBWU9LSFVKTU9JTE5VSktLVUpCS0tZBg ++ X-HM-Tid: 0a8a793ac9012e9ckusnd7ec8a801f5 X-HM-MType: 1 X-HM-Sender-Digest: e1kMHhlZQR0aFwgeV1kSHx4VD1lBWUc6MC46PBw5Lz1IQhAoSA4MMgM6 NhEwCgJVSlVKTUJPSU5KS0pKSk5IVTMWGhIXVQseFRwfFBUcFxIVOwgaFRwdFAlVGBQWVRgVRVlX WRILWUFZSklIVUpJS1VOSVVJSEhZV1kIAVlBSkpMS083Bg++ X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Currently, we are only using the linear search method to find the type id by the name, which has a time complexity of O(n). This change involves sorting the names of btf types in ascending order and using binary search, which has a time complexity of O(log(n)). This idea was inspired by the following patch: 60443c88f3a8 ("kallsyms: Improve the performance of kallsyms_lookup_name()"). At present, this improvement is only for searching in vmlinux's and module's BTFs, and the kind should only be BTF_KIND_FUNC or BTF_KIND_STRUCT. Another change is the search direction, where we search the BTF first and then its base, the type id of the first matched btf_type will be returned. Here is a time-consuming result that finding all the type ids of 67,819 kernel functions in vmlinux's BTF by their names: Before: 17000 ms After: 10 ms The average lookup performance has improved about 1700x at the above scenario. However, this change will consume more memory, for example, 67,819 kernel functions will allocate about 530KB memory. Signed-off-by: Donglin Peng --- Changes in RFC v2: - Fix the build issue reported by kernel test robot --- include/linux/btf.h | 1 + kernel/bpf/btf.c | 300 ++++++++++++++++++++++++++++++++++++++++++-- 2 files changed, 291 insertions(+), 10 deletions(-) diff --git a/include/linux/btf.h b/include/linux/btf.h index cac9f304e27a..6260a0668773 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -201,6 +201,7 @@ bool btf_is_kernel(const struct btf *btf); bool btf_is_module(const struct btf *btf); struct module *btf_try_get_module(const struct btf *btf); u32 btf_nr_types(const struct btf *btf); +u32 btf_type_cnt(const struct btf *btf); bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s, const struct btf_member *m, u32 expected_offset, u32 expected_size); diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 817204d53372..51aa9f27853b 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -240,6 +240,26 @@ struct btf_id_dtor_kfunc_tab { struct btf_id_dtor_kfunc dtors[]; }; +enum { + BTF_ID_NAME_FUNC, /* function */ + BTF_ID_NAME_STRUCT, /* struct */ + BTF_ID_NAME_MAX +}; + +struct btf_id_name { + int id; + u32 name_off; +}; + +struct btf_id_name_map { + struct btf_id_name *id_name; + u32 count; +}; + +struct btf_id_name_maps { + struct btf_id_name_map map[BTF_ID_NAME_MAX]; +}; + struct btf { void *data; struct btf_type **types; @@ -257,6 +277,7 @@ struct btf { struct btf_kfunc_set_tab *kfunc_set_tab; struct btf_id_dtor_kfunc_tab *dtor_kfunc_tab; struct btf_struct_metas *struct_meta_tab; + struct btf_id_name_maps *id_name_maps; /* split BTF support */ struct btf *base_btf; @@ -532,22 +553,142 @@ u32 btf_nr_types(const struct btf *btf) return total; } +u32 btf_type_cnt(const struct btf *btf) +{ + return btf->start_id + btf->nr_types; +} + +static inline u8 btf_id_name_idx_to_kind(int index) +{ + u8 kind; + + switch (index) { + case BTF_ID_NAME_FUNC: + kind = BTF_KIND_FUNC; + break; + case BTF_ID_NAME_STRUCT: + kind = BTF_KIND_STRUCT; + break; + default: + kind = BTF_KIND_UNKN; + break; + } + + return kind; +} + +static inline int btf_id_name_kind_to_idx(u8 kind) +{ + int index; + + switch (kind) { + case BTF_KIND_FUNC: + index = BTF_ID_NAME_FUNC; + break; + case BTF_KIND_STRUCT: + index = BTF_ID_NAME_STRUCT; + break; + default: + index = -1; + break; + } + + return index; +} + +static s32 btf_find_by_name_bsearch(struct btf_id_name *id_name, + u32 size, const char *name, + struct btf_id_name **start, + struct btf_id_name **end, + const struct btf *btf) +{ + int ret; + int low, mid, high; + const char *name_buf; + + low = 0; + high = size - 1; + + while (low <= high) { + mid = low + (high - low) / 2; + name_buf = btf_name_by_offset(btf, id_name[mid].name_off); + ret = strcmp(name, name_buf); + if (ret > 0) + low = mid + 1; + else if (ret < 0) + high = mid - 1; + else + break; + } + + if (low > high) + return -ESRCH; + + if (start) { + low = mid; + while (low) { + name_buf = btf_name_by_offset(btf, id_name[low-1].name_off); + if (strcmp(name, name_buf)) + break; + low--; + } + *start = &id_name[low]; + } + + if (end) { + high = mid; + while (high < size - 1) { + name_buf = btf_name_by_offset(btf, id_name[high+1].name_off); + if (strcmp(name, name_buf)) + break; + high++; + } + *end = &id_name[high]; + } + + return id_name[mid].id; +} + s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind) { + const struct btf_id_name_maps *maps; + const struct btf_id_name_map *map; + struct btf_id_name *start; const struct btf_type *t; const char *tname; - u32 i, total; + int index = btf_id_name_kind_to_idx(kind); + s32 id, total; - total = btf_nr_types(btf); - for (i = 1; i < total; i++) { - t = btf_type_by_id(btf, i); - if (BTF_INFO_KIND(t->info) != kind) - continue; + do { + maps = btf->id_name_maps; + if (index >= 0 && maps && maps->map[index].id_name) { + /* binary search */ + map = &maps->map[index]; + id = btf_find_by_name_bsearch(map->id_name, + map->count, name, &start, NULL, btf); + if (id > 0) { + /* + * Return the first one that + * matched + */ + return start->id; + } + } else { + /* linear search */ + total = btf_type_cnt(btf); + for (id = btf->start_id; id < total; id++) { + t = btf_type_by_id(btf, id); + if (BTF_INFO_KIND(t->info) != kind) + continue; + + tname = btf_name_by_offset(btf, t->name_off); + if (!strcmp(tname, name)) + return id; + } + } - tname = btf_name_by_offset(btf, t->name_off); - if (!strcmp(tname, name)) - return i; - } + btf = btf->base_btf; + } while (btf); return -ENOENT; } @@ -1639,6 +1780,32 @@ static void btf_free_id(struct btf *btf) spin_unlock_irqrestore(&btf_idr_lock, flags); } +static void btf_destroy_id_name(struct btf *btf, int index) +{ + struct btf_id_name_maps *maps = btf->id_name_maps; + struct btf_id_name_map *map = &maps->map[index]; + + if (map->id_name) { + kvfree(map->id_name); + map->id_name = NULL; + map->count = 0; + } +} + +static void btf_destroy_id_name_map(struct btf *btf) +{ + int i; + + if (!btf->id_name_maps) + return; + + for (i = 0; i < BTF_ID_NAME_MAX; i++) + btf_destroy_id_name(btf, i); + + kfree(btf->id_name_maps); + btf->id_name_maps = NULL; +} + static void btf_free_kfunc_set_tab(struct btf *btf) { struct btf_kfunc_set_tab *tab = btf->kfunc_set_tab; @@ -1689,6 +1856,7 @@ static void btf_free_struct_meta_tab(struct btf *btf) static void btf_free(struct btf *btf) { + btf_destroy_id_name_map(btf); btf_free_struct_meta_tab(btf); btf_free_dtor_kfunc_tab(btf); btf_free_kfunc_set_tab(btf); @@ -5713,6 +5881,107 @@ int get_kern_ctx_btf_id(struct bpf_verifier_log *log, enum bpf_prog_type prog_ty return kctx_type_id; } +static int btf_compare_id_name(const void *a, const void *b, const void *priv) +{ + const struct btf_id_name *ia = (const struct btf_id_name *)a; + const struct btf_id_name *ib = (const struct btf_id_name *)b; + const struct btf *btf = priv; + int ret; + + /* + * Sort names in ascending order, if the name is same, sort ids in + * ascending order. + */ + ret = strcmp(btf_name_by_offset(btf, ia->name_off), + btf_name_by_offset(btf, ib->name_off)); + if (!ret) + ret = ia->id - ib->id; + + return ret; +} + +static int btf_create_id_name(struct btf *btf, int index) +{ + struct btf_id_name_maps *maps = btf->id_name_maps; + struct btf_id_name_map *map = &maps->map[index]; + const struct btf_type *t; + struct btf_id_name *id_name; + const char *name; + int i, j = 0; + u32 total, count = 0; + u8 kind; + + kind = btf_id_name_idx_to_kind(index); + if (kind == BTF_KIND_UNKN) + return -EINVAL; + + if (map->id_name || map->count != 0) + return -EINVAL; + + total = btf_type_cnt(btf); + for (i = btf->start_id; i < total; i++) { + t = btf_type_by_id(btf, i); + if (BTF_INFO_KIND(t->info) != kind) + continue; + name = btf_name_by_offset(btf, t->name_off); + if (str_is_empty(name)) + continue; + count++; + } + + if (count == 0) + return 0; + + id_name = kvcalloc(count, sizeof(struct btf_id_name), + GFP_KERNEL); + if (!id_name) + return -ENOMEM; + + for (i = btf->start_id; i < total; i++) { + t = btf_type_by_id(btf, i); + if (BTF_INFO_KIND(t->info) != kind) + continue; + name = btf_name_by_offset(btf, t->name_off); + if (str_is_empty(name)) + continue; + + id_name[j].id = i; + id_name[j].name_off = t->name_off; + j++; + } + + sort_r(id_name, count, sizeof(id_name[0]), btf_compare_id_name, + NULL, btf); + + map->id_name = id_name; + map->count = count; + + return 0; +} + +static int btf_create_id_name_map(struct btf *btf) +{ + int err, i; + struct btf_id_name_maps *maps; + + if (btf->id_name_maps) + return -EBUSY; + + maps = kzalloc(sizeof(struct btf_id_name_maps), GFP_KERNEL); + if (!maps) + return -ENOMEM; + + btf->id_name_maps = maps; + + for (i = 0; i < BTF_ID_NAME_MAX; i++) { + err = btf_create_id_name(btf, i); + if (err < 0) + break; + } + + return err; +} + BTF_ID_LIST(bpf_ctx_convert_btf_id) BTF_ID(struct, bpf_ctx_convert) @@ -5760,6 +6029,10 @@ struct btf *btf_parse_vmlinux(void) if (err) goto errout; + err = btf_create_id_name_map(btf); + if (err) + goto errout; + /* btf_parse_vmlinux() runs under bpf_verifier_lock */ bpf_ctx_convert.t = btf_type_by_id(btf, bpf_ctx_convert_btf_id[0]); @@ -5777,6 +6050,7 @@ struct btf *btf_parse_vmlinux(void) errout: btf_verifier_env_free(env); if (btf) { + btf_destroy_id_name_map(btf); kvfree(btf->types); kfree(btf); } @@ -5844,13 +6118,19 @@ static struct btf *btf_parse_module(const char *module_name, const void *data, u if (err) goto errout; + err = btf_create_id_name_map(btf); + if (err) + goto errout; + btf_verifier_env_free(env); refcount_set(&btf->refcnt, 1); + return btf; errout: btf_verifier_env_free(env); if (btf) { + btf_destroy_id_name_map(btf); kvfree(btf->data); kvfree(btf->types); kfree(btf); -- 2.25.1