Date: Thu, 20 Oct 2016 20:03:06 -0700
From: Davidlohr Bueso <dave@stgolabs.net>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
        Jiri Olsa <jolsa@redhat.com>, Jiri Olsa <jolsa@kernel.org>,
        Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@redhat.com>,
        linux-kernel@vger.kernel.org, Davidlohr Bueso <dbueso@suse.de>
Subject: Re: [PATCH 2/2 v2] perf bench futex: add NUMA support
Message-ID: <20161021030306.GD30561@linux-80c1.suse>
References: <20161016190803.3392-1-bigeasy@linutronix.de>
 <20161016190803.3392-2-bigeasy@linutronix.de>
 <20161017143821.GO12815@kernel.org>
 <20161017150123.GA18595@krava>
 <20161017150442.GQ12815@kernel.org>
 <20161017153331.pkzaij5a3ma5c5s5@linutronix.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Disposition: inline
In-Reply-To: <20161017153331.pkzaij5a3ma5c5s5@linutronix.de>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 1811
Lines: 75

On Mon, 17 Oct 2016, Sebastian Andrzej Siewior wrote:

>+#ifdef HAVE_LIBNUMA_SUPPORT
>+#include <numa.h>
>+#endif

In futex.h

>+static int numa_node = -1;

In futex.h (perhaps rename to futexbench_numa_node?)

>+#ifndef HAVE_LIBNUMA_SUPPORT
>+static int numa_run_on_node(int node __maybe_unused) { return 0; }
>+static int numa_node_of_cpu(int node __maybe_unused) { return 0; }
>+static void *numa_alloc_local(size_t size) { return malloc(size); }
>+static void numa_free(void *p, size_t size __maybe_unused) { return free(p); }
>+#endif
>+
>+static bool cpu_is_local(int cpu)
>+{
>+	if (numa_node < 0)
>+		return true;
>+	if (numa_node_of_cpu(cpu) == numa_node)
>+		return true;
>+	return false;
>+}

In futex.h

>-	if (!nthreads) /* default to the number of CPUs */
>-		nthreads = ncpus;
>+	if (!nthreads) {
>+		/* default to the number of CPUs per NUMA node */

This comment should go...

>+		if (numa_node < 0) {
>+			nthreads = ncpus;
>+		} else {

here.

>+			for (i = 0; i < ncpus; i++) {
>+				if (cpu_is_local(i))
>+					nthreads++;
>+			}
>+			if (!nthreads)
>+				err(EXIT_FAILURE, "No online CPUs for this node");
>+		}
>+	} else {
>+		int cpu_available = 0;
>
>-	worker = calloc(nthreads, sizeof(*worker));
>+		for (i = 0; i < ncpus && !cpu_available; i++) {
>+			if (cpu_is_local(i))
>+				cpu_available = 1;
>+		}

Is this really necessary? If the user passes the number of threads, then we shouldn't
care about ncpus; we just run all the threads on the specified node. Wouldn't the
below numa_run_on_node() ensure that the node is not, for example, CPU-less?

>+		if (!cpu_available)
>+			err(EXIT_FAILURE, "No online CPUs for this node");
>+	}
>+
>+	if (numa_node >= 0) {
>+		ret = numa_run_on_node(numa_node);
>+		if (ret < 0)
>+			err(EXIT_FAILURE, "numa_run_on_node");


Thanks,
Davidlohr