From: Michael Ellerman
To: Peter Zijlstra
Cc: Linus Torvalds, Paul McKenney, Alan Stern,
    andrea.parri@amarulasolutions.com, Will Deacon, Akira Yokosawa,
    Boqun Feng, Daniel Lustig, David Howells, Jade Alglave, Luc Maranget,
    Nick Piggin, Linux Kernel Mailing List
Subject: Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire
Date: Tue, 17 Jul 2018 00:40:19 +1000
Message-ID: <87601fz1kc.fsf@concordia.ellerman.id.au>

Peter Zijlstra writes:
> On Fri, Jul 13, 2018 at 11:15:26PM +1000, Michael Ellerman wrote:
...
>>
>>
>> So 18-32% slower, or 23-47 cycles.
>
> Very good info. Note that another option is to put the SYNC in lock() it
> doesn't really matter which of the two primitives gets it. I don't
> suppose it really matters for timing either way around.

If the numbers can be trusted it is actually slower to put the sync in
lock, at least on one of the machines:

                 Time
  lwsync_sync    84,932,987,977
  sync_lwsync    93,185,930,333

On the other machine it's slower but only by 0.1%, so that's slightly
weird.

The other advantage of putting the sync in unlock is we could get rid of
our SYNC_IO logic, which conditionally puts a sync in unlock to order IO
accesses vs unlock.

>> Next week I can do some macro benchmarks, to see if it's actually
>> detectable at all.

I guess arguably it's not a very macro benchmark, but we have a
context_switch benchmark in the tree[1] which we often use to tune
things, and it degrades badly. It just spins up two threads and has them
ping-pong using yield.

The numbers are context switch iterations, so more == better.

          | Before     | After      | Change     | Change %
          +------------+------------+------------+----------
          | 35,601,160 | 32,371,164 | -3,229,996 |   -9.07%
          | 35,762,126 | 32,438,798 | -3,323,328 |   -9.29%
          | 35,690,870 | 32,353,676 | -3,337,194 |   -9.35%
          | 35,440,346 | 32,336,750 | -3,103,596 |   -8.76%
          | 35,614,868 | 32,676,378 | -2,938,490 |   -8.25%
          | 35,659,690 | 32,462,624 | -3,197,066 |   -8.97%
          | 35,594,058 | 32,403,922 | -3,190,136 |   -8.96%
          | 35,682,682 | 32,353,146 | -3,329,536 |   -9.33%
          | 35,954,454 | 32,306,168 | -3,648,286 |  -10.15%
          | 35,849,314 | 32,291,094 | -3,558,220 |   -9.93%
----------+------------+------------+------------+----------
Average   | 35,684,956 | 32,399,372 | -3,285,584 |   -9.21%
Std Dev   |    143,877 |    111,385 |
Std Dev % |      0.40% |      0.34% |

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/powerpc/benchmarks/context_switch.c

I'll do some kernbench runs tomorrow and see if it shows up there.

>> My personal preference would be to switch to sync, we don't want to be
>> the only arch finding (or not finding!) exotic ordering bugs.
>>
>> But we'd also rather not make our slow locks any slower than they have
>> to be.
>
> I completely understand, but I'll get you beer (lots) if you do manage
> to make SYNC happen :-) :-)

Just so we're clear Fosters is not beer :)

cheers
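
For anyone wanting to re-derive the summary rows of the context_switch
table above, here is a minimal sketch. The ten before/after values are
copied from the table; the function name summarize() and the choice of
the sample (n-1) standard deviation are assumptions, since the mail does
not say which estimator produced the "Std Dev" row.

```python
from statistics import mean, stdev

# Context switch iterations per run, copied from the table (more == better).
before = [35601160, 35762126, 35690870, 35440346, 35614868,
          35659690, 35594058, 35682682, 35954454, 35849314]
after = [32371164, 32438798, 32353676, 32336750, 32676378,
         32462624, 32403922, 32353146, 32306168, 32291094]


def summarize(before, after):
    """Return mean, sample std dev and relative change for two run sets.

    Hypothetical helper for illustration only; it is not part of the
    context_switch selftest.
    """
    mean_before = mean(before)
    mean_after = mean(after)
    change = mean_after - mean_before
    return {
        "mean_before": mean_before,
        "mean_after": mean_after,
        "change": change,
        # Relative change of the averages, in percent.
        "change_pct": 100.0 * change / mean_before,
        # Assumption: sample standard deviation (divides by n - 1).
        "stdev_before": stdev(before),
        "stdev_after": stdev(after),
    }


summary = summarize(before, after)
for key, value in summary.items():
    print(f"{key:>13}: {value:,.2f}")
```

The relative change is taken between the two averages rather than by
averaging the per-run percentages; with runs this tightly clustered the
two approaches give nearly the same figure.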