From: Andi Kleen
To: acme@kernel.org
Cc: jolsa@kernel.org, eranian@google.com, kan.liang@linux.intel.com,
	peterz@infradead.org, linux-kernel@vger.kernel.org, Andi Kleen
Subject: [PATCH v1 4/9] perf affinity: Add infrastructure to save/restore affinity
Date: Sun, 20 Oct 2019 09:13:41 -0700
Message-Id: <20191020161346.18938-5-andi@firstfloor.org>
In-Reply-To: <20191020161346.18938-1-andi@firstfloor.org>
References: <20191020161346.18938-1-andi@firstfloor.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
From: Andi Kleen

The kernel perf subsystem has to send an IPI to the target CPU for many
operations. On systems with many CPUs, and when managing many events, the
overhead can be dominated by these IPIs.

An alternative is to set up the perf tool's CPU affinity to the target CPU,
set up all the events for that CPU, and then move on to the next CPU.

Add some affinity management infrastructure to enable such a model.
It is used in follow-on patches.

Signed-off-by: Andi Kleen
---
 tools/perf/util/Build      |  1 +
 tools/perf/util/affinity.c | 71 ++++++++++++++++++++++++++++++++++++++
 tools/perf/util/affinity.h | 15 ++++++++
 3 files changed, 87 insertions(+)
 create mode 100644 tools/perf/util/affinity.c
 create mode 100644 tools/perf/util/affinity.h

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 5477f6afe735..302c7fda1e3a 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -74,6 +74,7 @@ perf-y += sort.o
 perf-y += hist.o
 perf-y += util.o
 perf-y += cpumap.o
+perf-y += affinity.o
 perf-y += cputopo.o
 perf-y += cgroup.o
 perf-y += target.o
diff --git a/tools/perf/util/affinity.c b/tools/perf/util/affinity.c
new file mode 100644
index 000000000000..12e8024a6300
--- /dev/null
+++ b/tools/perf/util/affinity.c
@@ -0,0 +1,71 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Manage affinity to optimize IPIs inside the kernel perf API. */
+#define _GNU_SOURCE 1
+#include <sched.h>
+#include <stdlib.h>
+#include <linux/zalloc.h>
+#include "perf.h"
+#include "cpumap.h"
+#include "affinity.h"
+
+static int get_cpu_set_size(void)
+{
+	int sz = (cpu__max_cpu() + 8 - 1) / 8;
+
+	/*
+	 * sched_getaffinity doesn't like masks smaller than the kernel's
+	 * own cpu mask. Hopefully that's big enough.
+	 */
+	if (sz < 4096/8)
+		sz = 4096/8;
+	return sz;
+}
+
+int affinity__setup(struct affinity *a)
+{
+	int cpu_set_size = get_cpu_set_size();
+
+	a->orig_cpus = malloc(cpu_set_size);
+	if (!a->orig_cpus)
+		return -1;
+	sched_getaffinity(0, cpu_set_size, (cpu_set_t *)a->orig_cpus);
+	a->sched_cpus = zalloc(cpu_set_size);
+	if (!a->sched_cpus) {
+		free(a->orig_cpus);
+		return -1;
+	}
+	a->changed = false;
+	return 0;
+}
+
+/*
+ * perf_event_open does an IPI internally to the target CPU.
+ * It is more efficient to change perf's affinity to the target
+ * CPU and then set up all events on that CPU, so we amortize
+ * CPU communication.
+ */
+void affinity__set(struct affinity *a, int cpu)
+{
+	int cpu_set_size = get_cpu_set_size();
+
+	if (cpu == -1)
+		return;
+	a->changed = true;
+	a->sched_cpus[cpu / 8] |= 1 << (cpu % 8);
+	/*
+	 * We ignore errors because affinity is just an optimization.
+	 * This could happen for example with isolated CPUs or cpusets.
+	 * In this case the IPIs inside the kernel's perf API still work.
+	 */
+	sched_setaffinity(0, cpu_set_size, (cpu_set_t *)a->sched_cpus);
+	a->sched_cpus[cpu / 8] ^= 1 << (cpu % 8);
+}
+
+void affinity__cleanup(struct affinity *a)
+{
+	int cpu_set_size = get_cpu_set_size();
+
+	if (a->changed)
+		sched_setaffinity(0, cpu_set_size, (cpu_set_t *)a->orig_cpus);
+	free(a->sched_cpus);
+	free(a->orig_cpus);
+}
diff --git a/tools/perf/util/affinity.h b/tools/perf/util/affinity.h
new file mode 100644
index 000000000000..e56148607e33
--- /dev/null
+++ b/tools/perf/util/affinity.h
@@ -0,0 +1,15 @@
+// SPDX-License-Identifier: GPL-2.0
+#ifndef AFFINITY_H
+#define AFFINITY_H 1
+
+struct affinity {
+	unsigned char	*orig_cpus;
+	unsigned char	*sched_cpus;
+	bool		changed;
+};
+
+void affinity__cleanup(struct affinity *a);
+void affinity__set(struct affinity *a, int cpu);
+int affinity__setup(struct affinity *a);
+
+#endif
-- 
2.21.0