From: Feng Zhou
To: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
    kafai@fb.com, songliubraving@fb.com, yhs@fb.com,
    john.fastabend@gmail.com, kpsingh@kernel.org
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org,
    linux-kernel@vger.kernel.org, duanxiongchun@bytedance.com,
    songmuchun@bytedance.com, wangdongdong.6@bytedance.com,
    cong.wang@bytedance.com, zhouchengming@bytedance.com,
    zhoufeng.zf@bytedance.com
Subject: [PATCH v6 0/2] Optimize performance of update hash-map when free is zero
Date: Fri, 10 Jun 2022 10:33:06 +0800
Message-Id: <20220610023308.93798-1-zhoufeng.zf@bytedance.com>

From: Feng Zhou

We encountered a bad case on a large system with 96 CPUs in which
alloc_htab_elem() took as long as 1ms. The reason is that once a
preallocated hashtab has no free elems, an update still grabs the
spin_locks of all CPUs. With multiple concurrent updaters, the lock
contention is severe.

0001: Use head->first to check whether the free list is empty before
      taking the lock.
0002: Add a benchmark to reproduce this worst case.

Changelog:
v5->v6: Addressed comments from Alexei Starovoitov.
- Adjust the commit log.
Some details here:
https://lore.kernel.org/all/20220608021050.47279-1-zhoufeng.zf@bytedance.com/

v4->v5: Addressed comments from Alexei Starovoitov.
- Use head->first.
- Use cpu+max_entries.
Some details here:
https://lore.kernel.org/bpf/20220601084149.13097-1-zhoufeng.zf@bytedance.com/

v3->v4: Addressed comments from Daniel Borkmann.
- Use READ_ONCE/WRITE_ONCE.
Some details here:
https://lore.kernel.org/all/20220530091340.53443-1-zhoufeng.zf@bytedance.com/

v2->v3: Addressed comments from Alexei Starovoitov, Andrii Nakryiko.
- Adjust the way the benchmark is tested.
- Adjust the code format.
Some details here:
https://lore.kernel.org/all/20220524075306.32306-1-zhoufeng.zf@bytedance.com/T/

v1->v2: Addressed comments from Alexei Starovoitov.
- Add a benchmark to reproduce the issue.
- Adjust the code format to avoid adding indentation.
Some details here:
https://lore.kernel.org/all/877ac441-045b-1844-6938-fcaee5eee7f2@bytedance.com/T/

Feng Zhou (2):
  bpf: avoid grabbing spin_locks of all cpus when no free elems
  selftest/bpf/benchs: Add bpf_map benchmark

 kernel/bpf/percpu_freelist.c                  | 20 ++--
 tools/testing/selftests/bpf/Makefile          |  4 +-
 tools/testing/selftests/bpf/bench.c           |  2 +
 .../benchs/bench_bpf_hashmap_full_update.c    | 96 +++++++++++++++++++
 .../run_bench_bpf_hashmap_full_update.sh      | 11 +++
 .../bpf/progs/bpf_hashmap_full_update_bench.c | 40 ++++++++
 6 files changed, 166 insertions(+), 7 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/benchs/bench_bpf_hashmap_full_update.c
 create mode 100755 tools/testing/selftests/bpf/benchs/run_bench_bpf_hashmap_full_update.sh
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_hashmap_full_update_bench.c

-- 
2.20.1
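
For readers who want the idea behind 0001 in isolation, below is a rough
userspace sketch, not the kernel code. It assumes a fixed number of per-CPU
list heads, uses C11 atomics in place of the kernel's spin_locks and
READ_ONCE/WRITE_ONCE, and adds a lock counter purely for illustration. All
names here (NR_CPUS_SKETCH, freelist_pop, and so on) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 4 /* hypothetical; the reported machine had 96 CPUs */

struct node {
	struct node *next;
};

struct freelist_head {
	_Atomic(struct node *) first;    /* read without the lock in pop */
	atomic_flag lock;                /* stand-in for a kernel spin_lock */
	unsigned long lock_acquisitions; /* instrumentation for this sketch only */
};

static struct freelist_head heads[NR_CPUS_SKETCH];

static void head_lock(struct freelist_head *h)
{
	while (atomic_flag_test_and_set_explicit(&h->lock, memory_order_acquire))
		; /* spin until the flag is released */
	h->lock_acquisitions++;
}

static void head_unlock(struct freelist_head *h)
{
	atomic_flag_clear_explicit(&h->lock, memory_order_release);
}

static void freelist_init(void)
{
	for (int i = 0; i < NR_CPUS_SKETCH; i++) {
		atomic_store_explicit(&heads[i].first, NULL, memory_order_relaxed);
		atomic_flag_clear(&heads[i].lock);
		heads[i].lock_acquisitions = 0;
	}
}

static void freelist_push(int cpu, struct node *n)
{
	struct freelist_head *h = &heads[cpu];

	assert(n != NULL);
	head_lock(h);
	n->next = atomic_load_explicit(&h->first, memory_order_relaxed);
	atomic_store_explicit(&h->first, n, memory_order_relaxed);
	head_unlock(h);
}

/* Walk every head starting at start_cpu; lock only heads that look non-empty. */
static struct node *freelist_pop(int start_cpu)
{
	for (int i = 0; i < NR_CPUS_SKETCH; i++) {
		struct freelist_head *h = &heads[(start_cpu + i) % NR_CPUS_SKETCH];
		struct node *n;

		/* Lockless peek: skip an empty head without touching its lock. */
		if (!atomic_load_explicit(&h->first, memory_order_relaxed))
			continue;

		head_lock(h);
		/* Re-check under the lock; another popper may have won the race. */
		n = atomic_load_explicit(&h->first, memory_order_relaxed);
		if (n)
			atomic_store_explicit(&h->first, n->next, memory_order_relaxed);
		head_unlock(h);
		if (n)
			return n;
	}
	return NULL;
}

/* Only meaningful while no other thread is pushing or popping. */
static unsigned long total_lock_acquisitions(void)
{
	unsigned long sum = 0;

	for (int i = 0; i < NR_CPUS_SKETCH; i++)
		sum += heads[i].lock_acquisitions;
	return sum;
}
```

In this sketch, calling freelist_pop() when every head is empty returns NULL
without taking any lock, which is the case the cover letter describes as the
worst one. The re-check under the lock is needed because the lockless peek is
only a hint.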