Received: by 2002:a05:7412:31a9:b0:e2:908c:2ebd with SMTP id et41csp3153459rdb; Wed, 13 Sep 2023 03:58:56 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHAdBlcikZS+cP/k+Aql4rfHtdYr3Ep4Zs+IsaPCM3wNkH5cIOYOdr72Jax8UUrAgKJb/3l X-Received: by 2002:a05:620a:4310:b0:76c:bfeb:97d0 with SMTP id u16-20020a05620a431000b0076cbfeb97d0mr2431130qko.58.1694602736435; Wed, 13 Sep 2023 03:58:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1694602736; cv=none; d=google.com; s=arc-20160816; b=zmAWoHSrbYVv34hOwvfEF0p//T7KuKvGSPc1M3RTbwyaQaxzEYiRXrhzs9BqvDB5cf qCtbP4Ht/xCDCiTt8JNiikAyMoIPbw84sHKzEb34V1DtOfSMICXgVsCWgydYLWttpNYO LYnurt3FSZfhD/ufWpGolZxFcBHVSwfDka1v2UrXeiNvP6UD60MYQIHt3Pxe7WlhJLa0 f+7SiBxBSyJyv9bVoYQw0K6plIgnlpDU35SNb1hu/oEZ6Hb88E4iH64TV4W8xpYN880+ c8Giau3/2uWbzVeDcJ8Sh2EU1nExqcnIixy8kXlLjwx7o5U4Ig4EnDVZP+igXdeUei9/ gU4A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :references:cc:to:subject:user-agent:mime-version:date:message-id; bh=tROsju7XCXJjZUAJhD3FowxpO/tNWb0MJgT0QzV15Fo=; fh=96LgAWwgpb7dRelRnlPpSE3q5eaSTQeareOPQC7IOEE=; b=u1/956F6h4Yp261S+/DsK8/Tn5A0/D9aPoLWyS1cssUy2udF2X1YY1FJxCzBdd1vHf +pqsOvM0YlefEZS/NEW/w9spoY8pT59NuDNYiwSxESqi8vE3CGipxo4ka67aOu5Q5p/r Fj3F+Rk5Nvc1QZG+Vy9CcxoHEmfAYtY90qdej9L0i2G+qhtOQQRri5jyJlhu3KgnbZ+f fC6iV2VfHx5/Wy0Kp+Z/SYHo8WjTvJuOk5+xVBg9BtR/ox9OiHyedSz6Clc+R4/1AIMX 7p0Fg43Ehsoz/spsb+DWzEtDADZFnbxNjBs7beGXyt8og0+Fqi1KIscmpfZnKxnD0hZW KIkg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.35 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=sangfor.com.cn Return-Path: Received: from groat.vger.email (groat.vger.email. [23.128.96.35]) by mx.google.com with ESMTPS id q200-20020a632ad1000000b005775d4973e5si6692195pgq.728.2023.09.13.03.58.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 Sep 2023 03:58:56 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.35 as permitted sender) client-ip=23.128.96.35; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.35 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=sangfor.com.cn Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by groat.vger.email (Postfix) with ESMTP id 7127B807665A; Wed, 13 Sep 2023 03:53:38 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at groat.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237781AbjIMKxf (ORCPT + 99 others); Wed, 13 Sep 2023 06:53:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45040 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239774AbjIMKxe (ORCPT ); Wed, 13 Sep 2023 06:53:34 -0400 Received: from mail-m127156.xmail.ntesmail.com (mail-m127156.xmail.ntesmail.com [115.236.127.156]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7DE6ACA; Wed, 13 Sep 2023 03:53:29 -0700 (PDT) Received: from [10.128.10.193] (unknown [123.120.52.233]) by mail-m11873.qiye.163.com (Hmail) with ESMTPA id F37C09007E7; Wed, 13 Sep 2023 18:52:46 +0800 (CST) Message-ID: <0d8b7948-12b4-4af5-83df-b6998980398a@sangfor.com.cn> Date: Wed, 13 Sep 2023 18:52:43 +0800 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [RFC PATCH v2] bpf: Using binary search to improve the performance of btf_find_by_name_kind To: Alexei Starovoitov , Eduard Zingerman Cc: Martin KaFai Lau , Alexei Starovoitov , Song Liu , Yonghong Song , Steven Rostedt , Masami Hiramatsu , dinghui@sangfor.com.cn, huangcun@sangfor.com.cn, bpf , LKML References: <20230909091646.420163-1-pengdonglin@sangfor.com.cn> <20ef8441084c9d5fd54f84987afa77eed7fe148e.camel@gmail.com> From: pengdonglin In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-HM-Spam-Status: e1kfGhgUHx5ZQUpXWQgPGg8OCBgUHx5ZQUlOS1dZFg8aDwILHllBWSg2Ly tZV1koWUFITzdXWS1ZQUlXWQ8JGhUIEh9ZQVlCTktIVk4eQx5CGU5KGEJLTVUTARMWGhIXJBQOD1 lXWRgSC1lBWUpJSFVKSUtVTklVSUhIWVdZFhoPEhUdFFlBWU9LSFVKSEpOTUlVSktLVUtZBg++ X-HM-Tid: 0a8a8e2c122f2eafkusnf37c09007e7 X-HM-MType: 1 X-HM-Sender-Digest: e1kMHhlZQR0aFwgeV1kSHx4VD1lBWUc6NAg6FAw*Cz1NNw9NHiJJFRFO GDFPCitVSlVKTUJPTUtJSE1DS0lMVTMWGhIXVQseFRwfFBUcFxIVOwgaFRwdFAlVGBQWVRgVRVlX WRILWUFZSklIVUpJS1VOSVVJSEhZV1kIAVlBTkxCSjcG Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (groat.vger.email [0.0.0.0]); Wed, 13 Sep 2023 03:53:38 -0700 (PDT) X-Spam-Status: No, score=-0.8 required=5.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on groat.vger.email On 2023/9/13 0:40, Alexei Starovoitov wrote: > On Tue, Sep 12, 2023 at 7:19 AM Eduard Zingerman wrote: >> >> On Tue, 2023-09-12 at 16:51 +0300, Eduard Zingerman wrote: >>> On Sat, 2023-09-09 at 02:16 -0700, Donglin Peng wrote: >>>> Currently, we are only using the linear search method to find the type id >>>> by the name, which has a time complexity of O(n). This change involves >>>> sorting the names of btf types in ascending order and using binary search, >>>> which has a time complexity of O(log(n)). This idea was inspired by the >>>> following patch: >>>> >>>> 60443c88f3a8 ("kallsyms: Improve the performance of kallsyms_lookup_name()"). >>>> >>>> At present, this improvement is only for searching in vmlinux's and >>>> module's BTFs, and the kind should only be BTF_KIND_FUNC or BTF_KIND_STRUCT. >>>> >>>> Another change is the search direction, where we search the BTF first and >>>> then its base, the type id of the first matched btf_type will be returned. >>>> >>>> Here is a time-consuming result that finding all the type ids of 67,819 kernel >>>> functions in vmlinux's BTF by their names: >>>> >>>> Before: 17000 ms >>>> After: 10 ms >>>> >>>> The average lookup performance has improved about 1700x at the above scenario. >>>> >>>> However, this change will consume more memory, for example, 67,819 kernel >>>> functions will allocate about 530KB memory. >>> >>> Hi Donglin, >>> >>> I think this is a good improvement. However, I wonder, why did you >>> choose to have a separate name map for each BTF kind? >>> >>> I did some analysis for my local testing kernel config and got such numbers: >>> - total number of BTF objects: 97350 >>> - number of FUNC and STRUCT objects: 51597 >>> - number of FUNC, STRUCT, UNION, ENUM, ENUM64, TYPEDEF, DATASEC objects: 56817 >>> (these are all kinds for which lookup by name might make sense) >>> - number of named objects: 54246 >>> - number of name collisions: >>> - unique names: 53985 counts >>> - 2 objects with the same name: 129 counts >>> - 3 objects with the same name: 3 counts >>> >>> So, it appears that having a single map for all named objects makes >>> sense and would also simplify the implementation, what do you think? >> >> Some more numbers for my config: >> - 13241 types (struct, union, typedef, enum), log2 13241 = 13.7 >> - 43575 funcs, log2 43575 = 15.4 >> Thus, having separate map for types vs functions might save ~1.7 >> search iterations. Is this a significant slowdown in practice? > > What do you propose to do in case of duplicates ? > func and struct can have the same name, but they will have two different > btf_ids. How do we store them ? > Also we might add global vars to BTF. Such request came up several times. > So we need to make sure our search approach scales to > func, struct, vars. I don't recall whether we search any other kinds. > Separate arrays for different kinds seems ok. > It's a bit of code complexity, but it's not an increase in memory. > With 13k structs and 43k funcs it's 56k * (4 + 4) that's 0.5 Mbyte > extra memory. That's quite a bit. Anything we can do to compress it? > Folks requested vmlinux BTF to be a module, so it's loaded on demand. > BTF memory consumption is a concern to many. > I think before we add these per-kind search arrays we better make > BTF optional as a module. I think that making BTF as a module may not have much significance, since the function bpf_get_btf_vmlinux is invoked in many places, such as during loading a kernel module. >