From: Dexuan Cui <decui@microsoft.com>
To: "linux-x86_64@vger.kernel.org" <linux-x86_64@vger.kernel.org>,
        "Thomas Gleixner" <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>,
        "H. Peter Anvin" <hpa@zytor.com>, David Howells <dhowells@redhat.com>,
        "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
CC: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: x86 memory barrier: why does Linux prefer MFENCE to Locked ADD?
Date: Thu, 3 Mar 2016 14:33:15 +0000
Message-ID: <BLUPR03MB1410A48DDA4C0A4902A8E163BFBD0@BLUPR03MB1410.namprd03.prod.outlook.com>
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,
My understanding of arch/x86/include/asm/barrier.h is that Linux clearly
prefers {L,S,M}FENCE: locked ADD is only used on x86_32 platforms that
don't support XMM2 (i.e., SSE2).
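
To make the two alternatives concrete, here is a minimal user-space sketch
of both barrier styles. It is not the kernel's code: the names
barrier_mfence(), barrier_lock_add(), store_then_load(), flag_a and flag_b
are made up for illustration, GCC/Clang inline asm is assumed, and the
non-x86 fallback exists only so the sketch builds elsewhere.

```c
/*
 * Illustrative sketch only; not taken from arch/x86/include/asm/barrier.h.
 * All names below are hypothetical.
 */
#if defined(__x86_64__) || defined(__i386__)

/* Full barrier via MFENCE (needs SSE2 on 32-bit CPUs). */
static inline void barrier_mfence(void)
{
	__asm__ __volatile__("mfence" ::: "memory");
}

/* Full barrier via a locked ADD of 0 to the top of the stack. */
static inline void barrier_lock_add(void)
{
#if defined(__x86_64__)
	__asm__ __volatile__("lock; addl $0,0(%%rsp)" ::: "memory", "cc");
#else
	__asm__ __volatile__("lock; addl $0,0(%%esp)" ::: "memory", "cc");
#endif
}

#else /* non-x86: fall back to the compiler's full barrier builtin */

static inline void barrier_mfence(void)   { __sync_synchronize(); }
static inline void barrier_lock_add(void) { __sync_synchronize(); }

#endif

static volatile int flag_a, flag_b;

/*
 * The StoreLoad case: the store to flag_a must be globally visible
 * before the load from flag_b is performed.
 */
static int store_then_load(void (*fence)(void))
{
	flag_a = 1;
	fence();
	return flag_b;
}
```

Either function can be passed to store_then_load(); adding 0 leaves the
stack slot unchanged, so the locked ADD only acts as an ordering point.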

However, people seem to say that locked ADD is much faster than the FENCE
instructions, even on modern Intel CPUs like Haswell; e.g., please see
these three sources:

" 11.5.1 Locked Instructions as Memory Barriers
Optimization
Use locked instructions to implement Store/Store and Store/Load barriers.
"
http://support.amd.com/TechDocs/47414_15h_sw_opt_guide.pdf

"lock addl %(rsp), 0 is a better solution for StoreLoad barrier ":
http://shipilev.net/blog/2014/on-the-fence-with-dependencies/

"...locked instruction are more efficient barriers...":
http://www.pvk.ca/Blog/2014/10/19/performance-optimisation-~-writing-an-essay/
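
A rough way to check such claims on a given machine is a timing loop like
the sketch below. This is only an assumption-laden micro-benchmark: the
names fence_mfence(), fence_lock_add() and ns_per_fence() are hypothetical,
it measures a single uncontended thread with no pending stores around the
barrier, and the numbers will differ between microarchitectures.

```c
#include <time.h>

typedef void (*fence_fn)(void);

#if defined(__x86_64__) || defined(__i386__)
static void fence_mfence(void)
{
	__asm__ __volatile__("mfence" ::: "memory");
}

static void fence_lock_add(void)
{
#if defined(__x86_64__)
	__asm__ __volatile__("lock; addl $0,0(%%rsp)" ::: "memory", "cc");
#else
	__asm__ __volatile__("lock; addl $0,0(%%esp)" ::: "memory", "cc");
#endif
}
#else
/* Non-x86 fallback so the sketch still builds; not a meaningful comparison. */
static void fence_mfence(void)   { __sync_synchronize(); }
static void fence_lock_add(void) { __sync_synchronize(); }
#endif

/*
 * Average CPU time per call, in nanoseconds, over iters calls.
 * The indirect call overhead is included for both variants.
 */
static double ns_per_fence(fence_fn fence, long iters)
{
	clock_t start, end;
	long i;

	start = clock();
	for (i = 0; i < iters; i++)
		fence();
	end = clock();

	return (double)(end - start) / CLOCKS_PER_SEC * 1e9 / (double)iters;
}
```

Calling ns_per_fence(fence_mfence, n) and ns_per_fence(fence_lock_add, n)
with a large n gives a first-order comparison; a realistic test would put
stores before the barrier and run on several CPUs.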

I also found that FreeBSD prefers Locked Add.

So, I'm curious why Linux prefers MFENCE.
I guess I may be missing something.

I tried to google the question, but didn't find an answer.

Thanks,
-- Dexuan