Subject: Re: [PATCH] Define wc_wmb, a write barrier for PCI write combining
From: "Bryan O'Sullivan" <bos@pathscale.com>
To: Andi Kleen <ak@suse.de>, Benjamin LaHaise <bcrl@kvack.org>
Cc: linux-kernel <linux-kernel@vger.kernel.org>
In-Reply-To: <200602282033.48570.ak@suse.de>
References: <1140841250.2587.33.camel@localhost.localdomain>
	 <20060228190354.GE24306@kvack.org>
	 <1141154424.20227.11.camel@serpentine.pathscale.com>
	 <200602282033.48570.ak@suse.de>
Content-Type: text/plain
Date: Wed, 01 Mar 2006 11:20:23 -0800
Message-Id: <1141240823.2899.84.camel@localhost.localdomain>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 1808
Lines: 43

On Tue, 2006-02-28 at 20:33 +0100, Andi Kleen wrote:

> Anyways if MFENCE improved performance you're probably relying
> on some very specific artifact of the microarchitecture of your 
> CPU or Northbridge. I don't think it's an architecturally guaranteed
> feature.

I looked this up, and you appear to be wrong here.

Here's the relevant quote from page 246 of the PDF of "AMD64
Architecture Programmer's Manual Volume 2: System Programming":

        http://www.amd.com/us-en/assets/content_type/DownloadableAssets/dwamd_24593.pdf

Section 7.4.1 specifically describes what happens to write buffers:

        [...] the processor completely empties the write buffer by
        writing the contents to memory as a result of performing any of
        the following operations:
        
        SFENCE Instruction
        Executing a store-fence (SFENCE) instruction forces all memory
        writes before the SFENCE (in program order) to be written into
        memory before memory writes that follow the SFENCE instruction.
        The memory-fence (MFENCE) instruction has a similar effect, but
        it forces the ordering of loads in addition to stores.
        [...]

So SFENCE is in fact the appropriate, architecturally guaranteed way
for us to empty the write buffers on x86_64.

As for Ben's contention that wmb() will suffice instead: that isn't
true either, even on x86-class hardware.  Unless we fence them, the
writes do travel over the HyperTransport (HT) bus in non-ascending order
on AMD64 systems; we've verified this using an HT bus analyser.
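
To make the ordering point concrete, here is a minimal user-space sketch
of where such a fence would sit. It is not the patch under discussion:
wc_wmb_sketch() and copy_then_flush() are hypothetical names made up for
illustration, and the destination here is ordinary memory rather than a
write-combining PCI mapping, so it shows only the placement of the fence,
not the reordering itself.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a write-combining write barrier.
 * Assumption: on x86_64 it is a plain SFENCE; elsewhere we fall
 * back to the compiler's full barrier builtin (GCC/Clang).
 */
static inline void wc_wmb_sketch(void)
{
#if defined(__x86_64__)
	__asm__ __volatile__("sfence" ::: "memory");
#else
	__sync_synchronize();
#endif
}

/*
 * Copy n words to dst in ascending order, then fence so that the
 * stores above are pushed out before any store that follows.
 * Returns the number of words written.
 */
static size_t copy_then_flush(volatile uint32_t *dst,
			      const uint32_t *src, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i] = src[i];

	wc_wmb_sketch();	/* writes above complete before later ones */
	return n;
}
```

In a real driver, dst would point into a write-combining mapping, and
the store that follows the fence would typically be the one that tells
the device the preceding data is complete.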

	<b
