Date: Wed, 19 Oct 2016 20:13:08 +0200
From: Sebastian Andrzej Siewior
To: Davidlohr Bueso
Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
    linux-kernel@vger.kernel.org, Davidlohr Bueso
Subject: Re: [PATCH] perf/bench-futex: Avoid worker cacheline bouncing
Message-ID: <20161019181308.maacqqzdx4ep5yld@linutronix.de>
References: <20161016190803.3392-1-bigeasy@linutronix.de>
    <20161018010949.GD29373@linux-80c1.suse>
    <20161019130722.t7viruflpg2xu5sx@linutronix.de>
    <20161019175933.GA28074@linux-80c1.suse>
In-Reply-To: <20161019175933.GA28074@linux-80c1.suse>

On 2016-10-19 10:59:33 [-0700], Davidlohr Bueso wrote:
> Sebastian noted that overhead for worker thread ops (throughput)
> accounting was producing 'perf' to appear in the profiles, consuming
> a non-trivial (ie 13%) amount of CPU. This is due to cacheline
> bouncing due to the increment of w->ops. We can easily fix this by
> just working on a local copy and updating the actual worker once
> done running, and ready to show the program summary. There is no
> danger of the worker being concurrent, so we can trust that no stale
> value is being seen by another thread.
>
> Reported-by: Sebastian Andrzej Siewior

Acked-by: Sebastian Andrzej Siewior

> --- a/tools/perf/bench/futex-hash.c
> +++ b/tools/perf/bench/futex-hash.c
> @@ -63,8 +63,9 @@ static const char * const bench_futex_hash_usage[] = {
>  static void *workerfn(void *arg)
>  {
>  	int ret;
> -	unsigned int i;
>  	struct worker *w = (struct worker *) arg;
> +	unsigned int i;
> +	unsigned long ops = w->ops; /* avoid cacheline bouncing */

We start at 0, so there is probably no need to init it with w->ops.

Sebastian
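
For reference, the pattern discussed above could look roughly like the standalone sketch below. It is not the futex-hash.c code: struct worker here is a cut-down stand-in, and run_workers(), MAX_WORKERS and the iters field are made up for illustration. Each thread counts in a local variable that starts at 0 and stores it to w->ops exactly once, so neighbouring workers[] entries that share a cacheline are not written in the hot loop.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 64	/* hypothetical limit, for this sketch only */

/* Cut-down stand-in for the benchmark's per-thread state (assumption). */
struct worker {
	pthread_t thread;
	unsigned long ops;	/* published once, when the worker is done */
	unsigned long iters;	/* number of operations to perform */
};

static void *workerfn(void *arg)
{
	struct worker *w = arg;
	unsigned long ops = 0;	/* local copy, starts at 0 */
	unsigned long i;

	for (i = 0; i < w->iters; i++)
		ops++;		/* stands in for one benchmark operation */

	w->ops = ops;		/* single store to the shared array */
	return NULL;
}

/*
 * Start nthreads workers, wait for them and sum their counters.
 * Returns 0 on bad arguments or if a thread could not be created.
 */
unsigned long run_workers(size_t nthreads, unsigned long iters)
{
	struct worker workers[MAX_WORKERS];
	unsigned long total = 0;
	size_t i, started = 0;
	int failed = 0;

	if (nthreads == 0 || nthreads > MAX_WORKERS)
		return 0;

	for (i = 0; i < nthreads; i++) {
		workers[i].ops = 0;
		workers[i].iters = iters;
		if (pthread_create(&workers[i].thread, NULL, workerfn,
				   &workers[i]) != 0) {
			failed = 1;
			break;
		}
		started++;
	}

	/* pthread_join() orders each worker's final store before our read. */
	for (i = 0; i < started; i++) {
		pthread_join(workers[i].thread, NULL);
		total += workers[i].ops;
	}

	return failed ? 0 : total;
}
```

Calling run_workers(4, 1000000) from a small main() and building with -pthread should be enough to try it. The summary only reads w->ops after the join, which is why the single store at the end is sufficient here.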