Message-ID: <20210924033547.939554938@goodmis.org>
User-Agent: quilt/0.66
Date: Thu, 23 Sep 2021 23:35:47 -0400
From: Steven Rostedt
To: linux-kernel@vger.kernel.org
Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, Mathieu Desnoyers,
 linux-trace-devel@vger.kernel.org
Subject: [PATCH 0/2] tracing: Have trace_pid_list be a sparse array
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

When the trace_pid_list was created, the default pid_max was 32768.
A bitmask holding one bit for each of those 32768 pids took up 4096
bytes (one page). A one page bitmask was not much of a problem, and
that is what was used for mapping pids. But today, systems are bigger
and can run more tasks, and the default pid_max is usually set to
4194304.
Which means that handling that many pids requires 524288 bytes (512K).
Worse yet, pid_max can be set to 2^30 (1073741824 or 1G), which would
take 134217728 bytes (128M) of memory to store this array.

Since the pid_list array is very sparsely populated, it is a huge waste
of memory to store all possible bits for each pid when most will not be
set. Instead, use a page table scheme to store the array, and allow this
to handle up to 32 bit pids.

The pid_list will start out with 1024 entries for the first 10 MSB bits.
This will cost 4K for 32 bit architectures and 8K for 64 bit. Each of
these entries will point to a 1024 entry array that covers the next 10
bits of the pid (another 4K or 8K). Each of those entries in turn will
point to a 512 byte bitmask, which covers the 12 LSB bits (4096 bits).

When the trace_pid_list is allocated, it will have the 4K/8K array for
the upper bits allocated, and then it will allocate a cache of the next
upper chunks and the lower chunks (default 6 of each). Then when a bit
is "set", these chunks will be pulled from the free list and added to
the array. If the free list gets down to a low level (default 2), it
will trigger an irq_work that will refill the cache back up.

On clearing a bit, if the clear causes the bitmask to become zero, that
chunk will be placed back into the free cache for later use, keeping
the need to allocate more chunks down to a minimum.

Steven Rostedt (VMware) (2):
      tracing: Place trace_pid_list logic into abstract functions
      tracing: Create a sparse bitmask for pid filtering

----
 kernel/trace/Makefile       |   1 +
 kernel/trace/ftrace.c       |   6 +-
 kernel/trace/pid_list.c     | 551 ++++++++++++++++++++++++++++++++++++++++++++
 kernel/trace/trace.c        |  78 +++----
 kernel/trace/trace.h        |  14 +-
 kernel/trace/trace_events.c |   6 +-
 6 files changed, 595 insertions(+), 61 deletions(-)
 create mode 100644 kernel/trace/pid_list.c
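
For readers who want to see the 10/10/12 bit split in concrete form, here
is a minimal user-space sketch of the three-level lookup described above.
It is not the code from the patches: every name in it (sparse_pid_list,
upper_chunk, lower_chunk, pid_split, flat_bitmask_bytes and so on) is
hypothetical. The free-list cache and the irq_work refill are left out,
and plain calloc()/free() stand in for them. The per-chunk counters are
also an assumption of this sketch, used only to detect an empty chunk
cheaply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define UPPER_BITS  10                    /* used twice: two upper levels */
#define LOWER_BITS  12                    /* bits covered by one bitmask */
#define UPPER_MAX   (1u << UPPER_BITS)    /* 1024 entries per upper level */
#define LOWER_MAX   (1u << LOWER_BITS)    /* 4096 bits per lower chunk */
#define LOWER_BYTES (LOWER_MAX / 8)       /* 512 byte bitmask */

static_assert(2 * UPPER_BITS + LOWER_BITS == 32, "levels must cover 32 bit pids");

/* Hypothetical types, not the structures used by the patch. */
struct lower_chunk { uint8_t bits[LOWER_BYTES]; unsigned int nr_set; };
struct upper_chunk { struct lower_chunk *lower[UPPER_MAX]; unsigned int nr_used; };
struct sparse_pid_list { struct upper_chunk *upper[UPPER_MAX]; };

/* Bytes a flat one-bit-per-pid bitmask would need, for comparison. */
uint64_t flat_bitmask_bytes(uint64_t pid_max)
{
	return pid_max / 8;
}

/* Split a pid into the two 10 bit upper indexes and the 12 bit bit number. */
void pid_split(uint32_t pid, unsigned int *u1, unsigned int *u2, unsigned int *lo)
{
	*u1 = pid >> (UPPER_BITS + LOWER_BITS);
	*u2 = (pid >> LOWER_BITS) & (UPPER_MAX - 1);
	*lo = pid & (LOWER_MAX - 1);
}

bool pid_list_is_set(const struct sparse_pid_list *list, uint32_t pid)
{
	unsigned int u1, u2, lo;

	pid_split(pid, &u1, &u2, &lo);
	if (!list->upper[u1] || !list->upper[u1]->lower[u2])
		return false;	/* chunk never allocated, so the bit is clear */
	return list->upper[u1]->lower[u2]->bits[lo / 8] & (1u << (lo % 8));
}

/* Returns 0 on success, -1 if a chunk could not be allocated. */
int pid_list_set(struct sparse_pid_list *list, uint32_t pid)
{
	unsigned int u1, u2, lo;
	struct upper_chunk *up;
	struct lower_chunk *low;
	uint8_t mask;

	pid_split(pid, &u1, &u2, &lo);
	if (!list->upper[u1]) {
		/* The patch takes chunks from a pre-filled cache; this sketch allocates. */
		list->upper[u1] = calloc(1, sizeof(*up));
		if (!list->upper[u1])
			return -1;
	}
	up = list->upper[u1];
	if (!up->lower[u2]) {
		up->lower[u2] = calloc(1, sizeof(*low));
		if (!up->lower[u2])
			return -1;
		up->nr_used++;
	}
	low = up->lower[u2];
	mask = (uint8_t)(1u << (lo % 8));
	if (!(low->bits[lo / 8] & mask)) {
		low->bits[lo / 8] |= mask;
		low->nr_set++;
	}
	return 0;
}

void pid_list_clear(struct sparse_pid_list *list, uint32_t pid)
{
	unsigned int u1, u2, lo;
	struct upper_chunk *up;
	struct lower_chunk *low;
	uint8_t mask;

	pid_split(pid, &u1, &u2, &lo);
	up = list->upper[u1];
	if (!up || !up->lower[u2])
		return;
	low = up->lower[u2];
	mask = (uint8_t)(1u << (lo % 8));
	if (!(low->bits[lo / 8] & mask))
		return;
	low->bits[lo / 8] &= (uint8_t)~mask;
	if (--low->nr_set == 0) {
		/* Empty chunks are released (the patch returns them to its cache). */
		free(low);
		up->lower[u2] = NULL;
		if (--up->nr_used == 0) {
			free(up);
			list->upper[u1] = NULL;
		}
	}
}
```

A caller would zero-allocate one struct sparse_pid_list (1024 pointers,
so 4K or 8K depending on pointer size) and then use pid_list_set(),
pid_list_is_set() and pid_list_clear() on it. With a single pid set,
this sketch holds the top-level array, one upper chunk and one 512 byte
bitmask, compared to the 524288 bytes that flat_bitmask_bytes(4194304)
reports for the flat layout.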