From: Alexey Budankov
Subject: [PATCH v1 0/3] perf record: adapt NUMA awareness to machines with #CPUs > 1K
To: Arnaldo Carvalho de Melo
Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel
Organization: Intel Corp.
Message-ID: <26d1512a-9dea-bf7e-d18e-705846a870c4@linux.intel.com>
Date: Wed, 20 Nov 2019 12:33:10 +0300
X-Mailing-List: linux-kernel@vger.kernel.org

The current glibc implementation of the cpu_set_t type limits the CPU
mask to at most 1024 CPUs. On machines with more CPUs, this limitation
confines the NUMA awareness of the perf tool in record mode, enabled
through the --affinity option, to the first 1024 CPUs.

This patch set lets the perf tool overcome the 1024-CPU limitation by
using a dedicated struct mmap_cpu_mask type and applying the tools
bitmap API operations to manipulate the affinity masks of the tool's
thread and of the mmaped data buffers.

The tools bitmap API has been extended with a bitmap_equal() operation
whose implementation is derived from the kernel one.

---
Alexey Budankov (3):
  tools bitmap: extend bitmap API with bitmap_equal()
  perf mmap: declare type for cpu mask of arbitrary length
  perf record: adapt affinity to machines with #CPUs > 1K

 tools/include/linux/bitmap.h | 21 +++++++++++++++++++++
 tools/lib/bitmap.c           | 15 +++++++++++++++
 tools/perf/builtin-record.c  | 28 ++++++++++++++++++++++------
 tools/perf/util/mmap.c       | 28 ++++++++++++++++++++++------
 tools/perf/util/mmap.h       | 11 ++++++++++-
 5 files changed, 90 insertions(+), 13 deletions(-)

---
Testing:

  $ tools/perf/perf record -v --affinity=cpu -- ls
  thread mask[8]: empty
  Using CPUID GenuineIntel-6-5E-3
  ...
  mmap size 528384B
  0x7f95f8f85010: mmap mask[8]: 0
  0x7f95f8f950d8: mmap mask[8]: 1
  0x7f95f8fa51a0: mmap mask[8]: 2
  0x7f95f8fb5268: mmap mask[8]: 3
  0x7f95f8fc5330: mmap mask[8]: 4
  0x7f95f8fd53f8: mmap mask[8]: 5
  0x7f95f8fe54c0: mmap mask[8]: 6
  0x7f95f8ff5588: mmap mask[8]: 7
  ...
  thread mask[8]: 0
  thread mask[8]: 1
  thread mask[8]: 2
  thread mask[8]: 3
  arch                          copy     Documentation  init     kernel    MAINTAINERS      modules.builtin.modinfo  perf.data      scripts   System.map  vmlinux
  block                         COPYING  drivers        ipc      lbuild    Makefile         modules.order            perf.data.old  security  tools       vmlinux.o
  certs                         CREDITS  fs             Kbuild   lib       mm               Module.symvers           README         sound     usr
  config-5.2.7-100.fc29.x86_64  crypto   include        Kconfig  LICENSES  modules.builtin  net                      samples        stdio     virt
  thread mask[8]: 4
  thread mask[8]: 5
  thread mask[8]: 6
  thread mask[8]: 7
  thread mask[8]: 0
  thread mask[8]: 1
  thread mask[8]: 2
  thread mask[8]: 3
  thread mask[8]: 4
  thread mask[8]: 5
  thread mask[8]: 6
  thread mask[8]: 7
  [ perf record: Woken up 0 times to write data ]
  thread mask[8]: 0
  thread mask[8]: 1
  thread mask[8]: 2
  thread mask[8]: 3
  thread mask[8]: 4
  thread mask[8]: 5
  thread mask[8]: 6
  thread mask[8]: 7
  ...
  [ perf record: Captured and wrote 0.014 MB perf.data (11 samples) ]

--
2.20.1
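
For readers who do not have the patches at hand, the idea in the cover letter (a CPU mask whose length is chosen at run time, compared with a bitmap equality check) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the series: struct cpu_mask_sketch, the cpu_mask_sketch_*() helpers and the SKETCH_* macros are hypothetical names, and the real struct mmap_cpu_mask and bitmap_equal() may differ in layout and semantics.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helpers, not the tools/include/linux/bitmap.h macros. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_BITS_TO_LONGS(n) \
	(((n) + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Hypothetical mask type: a heap-allocated bitmap sized at run time,
 * so it is not bound by the fixed size of glibc's cpu_set_t. */
struct cpu_mask_sketch {
	unsigned long *bits;
	size_t nbits;
};

/* Allocate a zeroed mask for nbits CPUs; returns 0 on success, -1 on failure. */
static inline int cpu_mask_sketch_init(struct cpu_mask_sketch *m, size_t nbits)
{
	m->bits = calloc(SKETCH_BITS_TO_LONGS(nbits), sizeof(unsigned long));
	if (!m->bits)
		return -1;
	m->nbits = nbits;
	return 0;
}

static inline void cpu_mask_sketch_free(struct cpu_mask_sketch *m)
{
	free(m->bits);
	m->bits = NULL;
	m->nbits = 0;
}

/* Mark one CPU as present in the mask. */
static inline void cpu_mask_sketch_set(struct cpu_mask_sketch *m, size_t cpu)
{
	assert(cpu < m->nbits);
	m->bits[cpu / SKETCH_BITS_PER_LONG] |= 1UL << (cpu % SKETCH_BITS_PER_LONG);
}

static inline bool cpu_mask_sketch_test(const struct cpu_mask_sketch *m, size_t cpu)
{
	if (cpu >= m->nbits)
		return false;
	return (m->bits[cpu / SKETCH_BITS_PER_LONG] >>
		(cpu % SKETCH_BITS_PER_LONG)) & 1UL;
}

/* Equality check. Comparing whole words is safe here only because
 * init zeroes the storage and set never touches bits at or past nbits. */
static inline bool cpu_mask_sketch_equal(const struct cpu_mask_sketch *a,
					 const struct cpu_mask_sketch *b)
{
	if (a->nbits != b->nbits)
		return false;
	return memcmp(a->bits, b->bits,
		      SKETCH_BITS_TO_LONGS(a->nbits) * sizeof(unsigned long)) == 0;
}
```

With such a type, a mask for, say, 2048 CPUs can hold CPU 1500, which lies beyond the 1024-CPU limit described above, and an equality check like this one lets a caller tell whether two masks already match.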