From: Andrii Nakryiko
Date: Fri, 22 Oct 2021 10:01:34 -0700
Subject: Re: [RFC] bpf: Implement prealloc for task_local_storage
To: Tejun Heo, Song Liu
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Ingo Molnar, Peter Zijlstra, bpf, Kernel Team, open list
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Oct 20, 2021 at 1:16 PM Tejun Heo wrote:
>
> task_local_storage currently does not support pre-allocation and the memory
> is allocated on demand using the GFP_ATOMIC mask. While atomic allocations
> succeed most of the time and the occasional failures aren't a problem for
> many use-cases, there are some which can benefit from reliable allocations -
> e.g. tracking acquisitions and releases of specific resources to root cause
> long-term reference leaks.
>
> This patchset implements prealloc support for task_local_storage so that
> once a map is created, it's guaranteed that the storage area is always
> available for all tasks unless explicitly deleted.

Song, Martin, can you please take a look at this? It might be
worthwhile to consider having pre-allocated local storage for all
supported types: socket, cgroup, task.
Especially for cases where a BPF app is going to touch all or almost
all system entities (sockets, cgroups, tasks, respectively).

Song, in ced47e30ab8b ("bpf: runqslower: Use task local storage") you
did some benchmarking of task-local storage vs hashmap and it was
faster in all cases but the first allocation of task-local storage.
I'd be curious to see how the numbers change if task-local storage is
pre-allocated, if you get a chance to benchmark it with Tejun's
changes. Thanks!

>
> The only tricky part is synchronizing against the fork path. Fortunately,
> cgroup needs to do the same thing and is already using
> cgroup_threadgroup_rwsem and we can use the same mechanism without adding
> extra overhead. This patchset generalizes the rwsem and makes cgroup and bpf
> select it.
>
> This patchset is on top of bpf-next 223f903e9c83 ("bpf: Rename BTF_KIND_TAG
> to BTF_KIND_DECL_TAG") and contains the following patches:
>
>  0001-cgroup-Drop-cgroup_-prefix-from-cgroup_threadgroup_r.patch
>  0002-sched-cgroup-Generalize-threadgroup_rwsem.patch
>  0003-bpf-Implement-prealloc-for-task_local_storage.patch
>
> and is also available in the following git branch:
>
>  git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git bpf-task-local-storage-prealloc
>
> diffstat follows. Thanks.
>
>  fs/exec.c                                                   |    7 +
>  include/linux/bpf.h                                         |    6 +
>  include/linux/bpf_local_storage.h                           |   12 +++
>  include/linux/cgroup-defs.h                                 |   33 ---------
>  include/linux/cgroup.h                                      |   11 +--
>  include/linux/sched/threadgroup_rwsem.h                     |   46 ++++++++++++
>  init/Kconfig                                                |    4 +
>  kernel/bpf/Kconfig                                          |    1
>  kernel/bpf/bpf_local_storage.c                              |  112 ++++++++++++++++++++++--------
>  kernel/bpf/bpf_task_storage.c                               |  138 +++++++++++++++++++++++++++++++++++++-
>  kernel/cgroup/cgroup-internal.h                             |    4 -
>  kernel/cgroup/cgroup-v1.c                                   |    9 +-
>  kernel/cgroup/cgroup.c                                      |   74 ++++++++++++--------
>  kernel/cgroup/pids.c                                        |    2
>  kernel/fork.c                                               |   16 ++++
>  kernel/sched/core.c                                         |    4 +
>  kernel/sched/sched.h                                        |    1
>  kernel/signal.c                                             |    7 +
>  tools/testing/selftests/bpf/prog_tests/task_local_storage.c |  101 +++++++++++++++++++++++++++
>  tools/testing/selftests/bpf/progs/task_ls_prealloc.c        |   15 ++++
>  20 files changed, 489 insertions(+), 114 deletions(-)
>
> --
> tejun
>