From: Alexey Budankov
Subject: [PATCH v3 0/3] perf record: adapt NUMA awareness to machines with #CPUs > 1K
To: Arnaldo Carvalho de Melo
Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel
Organization: Intel Corp.
Message-ID: <6b2be869-28c1-ae9b-92e8-5ababf143308@linux.intel.com>
Date: Tue, 26 Nov 2019 14:15:38 +0300
X-Mailing-List: linux-kernel@vger.kernel.org

The current glibc implementation of the cpu_set_t type limits the CPU
mask to at most 1024 CPUs. On machines with more CPUs, this limit
confines the NUMA awareness of the Perf tool in record mode (the
--affinity option) to the first 1024 CPUs.

This patch set lets the Perf tool overcome the 1024 CPU limit by
introducing a dedicated struct mmap_cpu_mask type and using the tools
bitmap API to manipulate the affinity masks of the tool's thread and of
the mmaped data buffers.

The tools bitmap API has been extended with a bitmap_free() function and
a bitmap_equal() operation, whose implementation is derived from the
kernel's.
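
For illustration only, and not the code from the patches: a minimal
user-space sketch of a CPU mask whose length is chosen at run time, so it
is not bound by a fixed 1024-bit type. The struct cpu_mask_sketch type,
the SKETCH_* macros and all cpu_mask_sketch_*() helpers are hypothetical
names made up for this example; they only mirror the idea of a bits
pointer plus a bit count, with an equality check in the spirit of
bitmap_equal().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Bits held by one unsigned long word. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
/* Number of words needed to hold nbits bits. */
#define SKETCH_BITS_TO_LONGS(nbits) \
	(((nbits) + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Hypothetical stand-in for the struct mmap_cpu_mask mentioned above. */
struct cpu_mask_sketch {
	unsigned long *bits;	/* heap-allocated, zero-initialized words */
	size_t nbits;		/* number of CPUs the mask can describe */
};

/* Allocate an empty mask for nbits CPUs; returns 0 on success, -1 on failure. */
static int cpu_mask_sketch_init(struct cpu_mask_sketch *mask, size_t nbits)
{
	assert(nbits > 0);
	mask->bits = calloc(SKETCH_BITS_TO_LONGS(nbits), sizeof(unsigned long));
	if (!mask->bits)
		return -1;
	mask->nbits = nbits;
	return 0;
}

/* Release the mask storage and reset the descriptor. */
static void cpu_mask_sketch_free(struct cpu_mask_sketch *mask)
{
	free(mask->bits);
	mask->bits = NULL;
	mask->nbits = 0;
}

/* Mark one CPU in the mask; out-of-range CPU numbers are ignored. */
static void cpu_mask_sketch_set(struct cpu_mask_sketch *mask, size_t cpu)
{
	if (cpu < mask->nbits)
		mask->bits[cpu / SKETCH_BITS_PER_LONG] |=
			1UL << (cpu % SKETCH_BITS_PER_LONG);
}

/* Report whether one CPU is marked in the mask. */
static bool cpu_mask_sketch_test(const struct cpu_mask_sketch *mask, size_t cpu)
{
	if (cpu >= mask->nbits)
		return false;
	return (mask->bits[cpu / SKETCH_BITS_PER_LONG] >>
		(cpu % SKETCH_BITS_PER_LONG)) & 1UL;
}

/*
 * Compare two masks. Whole words are compared, which is valid here only
 * because storage is zero-initialized and cpu_mask_sketch_set() never
 * touches bits at or beyond nbits.
 */
static bool cpu_mask_sketch_equal(const struct cpu_mask_sketch *a,
				  const struct cpu_mask_sketch *b)
{
	if (a->nbits != b->nbits)
		return false;
	return memcmp(a->bits, b->bits,
		      SKETCH_BITS_TO_LONGS(a->nbits) * sizeof(unsigned long)) == 0;
}
```

A caller would initialize two such masks with the same bit count (for
example 4096), set CPU numbers above 1023 in them, and use
cpu_mask_sketch_equal() to decide whether the thread mask already matches
a buffer's mask before switching affinity.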
---
Alexey Budankov (3):
  tools bitmap: implement bitmap_equal() operation at bitmap API
  perf mmap: declare type for cpu mask of arbitrary length
  perf record: adapt affinity to machines with #CPUs > 1K

 tools/include/linux/bitmap.h | 30 +++++++++++++++++++++++++++
 tools/lib/bitmap.c           | 15 ++++++++++++++
 tools/perf/builtin-record.c  | 27 ++++++++++++++++++------
 tools/perf/util/mmap.c       | 40 ++++++++++++++++++++++++++++++------
 tools/perf/util/mmap.h       | 13 +++++++++++-
 5 files changed, 112 insertions(+), 13 deletions(-)

---
Changes in v3:
- implemented perf_mmap__print_cpu_mask() function
- use perf_mmap__print_cpu_mask() to log thread and mmap cpus masks
  when verbose level is equal to 2

Changes in v2:
- implemented bitmap_free() for symmetry with bitmap_alloc()
- capitalized MMAP_CPU_MASK_BYTES() macro
- returned -1 from perf_mmap__setup_affinity_mask()
- implemented releasing of masks using bitmap_free()
- moved debug printing under -vv option

---
Testing:

  tools/perf/perf record -vv --affinity=cpu -- ls
  thread mask[8]: empty
  ...
  ------------------------------------------------------------
  perf_event_attr:
  ...
  mmap size 528384B
  0x7fddecf760b8: mmap mask[8]: 0
  0x7fddecf86180: mmap mask[8]: 1
  0x7fddecf96248: mmap mask[8]: 2
  0x7fddecfa6310: mmap mask[8]: 3
  0x7fddecfb63d8: mmap mask[8]: 4
  0x7fddecfc64a0: mmap mask[8]: 5
  0x7fddecfd6568: mmap mask[8]: 6
  0x7fddecfe6630: mmap mask[8]: 7
  ------------------------------------------------------------
  perf_event_attr:
  ...
  Synthesizing TSC conversion information
  0x7fddecf760b8: thread mask[8]: 0
  0x7fddecf86180: thread mask[8]: 1
  0x7fddecf96248: thread mask[8]: 2
  arch copy Documentation init kernel MAINTAINERS modules.builtin.modinfo perf.data scripts System.map vmlinux
  block COPYING drivers ipc lbuild Makefile modules.order perf.data.old security tools vmlinux.o
  certs CREDITS fs Kbuild lib mm Module.symvers README sound usr
  config-5.2.7-100.fc29.x86_64 crypto include Kconfig LICENSES modules.builtin net samples stdio virt
  0x7fddecfa6310: thread mask[8]: 3
  0x7fddecfb63d8: thread mask[8]: 4
  0x7fddecfc64a0: thread mask[8]: 5
  0x7fddecfd6568: thread mask[8]: 6
  0x7fddecfe6630: thread mask[8]: 7
  0x7fddecf760b8: thread mask[8]: 0
  0x7fddecf86180: thread mask[8]: 1
  0x7fddecf96248: thread mask[8]: 2
  0x7fddecfa6310: thread mask[8]: 3
  0x7fddecfb63d8: thread mask[8]: 4
  0x7fddecfc64a0: thread mask[8]: 5
  0x7fddecfd6568: thread mask[8]: 6
  0x7fddecfe6630: thread mask[8]: 7
  [ perf record: Woken up 0 times to write data ]
  0x7fddecf760b8: thread mask[8]: 0
  0x7fddecf86180: thread mask[8]: 1
  0x7fddecf96248: thread mask[8]: 2
  0x7fddecfa6310: thread mask[8]: 3
  0x7fddecfb63d8: thread mask[8]: 4
  0x7fddecfc64a0: thread mask[8]: 5
  0x7fddecfd6568: thread mask[8]: 6
  0x7fddecfe6630: thread mask[8]: 7
  Looking at the vmlinux_path (8 entries long)
  Using vmlinux for symbols
  [ perf record: Captured and wrote 0.014 MB perf.data (8 samples) ]

-- 
2.20.1