2020-03-02 16:13:24

by Jan Kiszka

[permalink] [raw]
Subject: x2apic_wrmsr_fence vs. Intel manual

Hi all,

as I generated a nice bug around fence vs. x2apic icr writes, I studied
the kernel code and the Intel manual in this regard more closely. But
there is a discrepancy:

arch/x86/include/asm/apic.h:

/*
* Make previous memory operations globally visible before
* sending the IPI through x2apic wrmsr. We need a serializing instruction or
* mfence for this.
*/
static inline void x2apic_wrmsr_fence(void)
{
asm volatile("mfence" : : : "memory");
}

Intel SDM, 10.12.3 MSR Access in x2APIC Mode:

"A WRMSR to an APIC register may complete before all preceding stores
are globally visible; software can prevent this by inserting a
serializing instruction or the sequence MFENCE;LFENCE before the WRMSR."

The former dates back to ce4e240c279a, but that commit does not mention
why lfence is not needed. Did the manual read differently back then? Or
why are we safe? To my reading of lfence, it also has a certain
instruction serializing effect that mfence does not have.

Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux


2020-03-02 16:21:05

by Thomas Gleixner

[permalink] [raw]
Subject: Re: x2apic_wrmsr_fence vs. Intel manual

Jan Kiszka <[email protected]> writes:
> as I generated a nice bug around fence vs. x2apic icr writes, I studied
> the kernel code and the Intel manual in this regard more closely. But
> there is a discrepancy:
>
> arch/x86/include/asm/apic.h:
>
> /*
> * Make previous memory operations globally visible before
> * sending the IPI through x2apic wrmsr. We need a serializing instruction or
> * mfence for this.
> */
> static inline void x2apic_wrmsr_fence(void)
> {
> asm volatile("mfence" : : : "memory");
> }
>
> Intel SDM, 10.12.3 MSR Access in x2APIC Mode:
>
> "A WRMSR to an APIC register may complete before all preceding stores
> are globally visible; software can prevent this by inserting a
> serializing instruction or the sequence MFENCE;LFENCE before the WRMSR."
>
> The former dates back to ce4e240c279a, but that commit does not mention
> why lfence is not needed. Did the manual read differently back then? Or
> why are we safe? To my reading of lfence, it also has a certain
> instruction serializing effect that mfence does not have.

The 2011 SDM says:

A WRMSR to an APIC register may complete before all preceding stores
are globally visible; software can prevent this by inserting a
serializing instruction, an SFENCE, or an MFENCE before the WRMSR.

Sigh....

2020-03-02 16:34:25

by Jan Kiszka

[permalink] [raw]
Subject: Re: x2apic_wrmsr_fence vs. Intel manual

On 02.03.20 17:20, Thomas Gleixner wrote:
> Jan Kiszka <[email protected]> writes:
>> as I generated a nice bug around fence vs. x2apic icr writes, I studied
>> the kernel code and the Intel manual in this regard more closely. But
>> there is a discrepancy:
>>
>> arch/x86/include/asm/apic.h:
>>
>> /*
>> * Make previous memory operations globally visible before
>> * sending the IPI through x2apic wrmsr. We need a serializing instruction or
>> * mfence for this.
>> */
>> static inline void x2apic_wrmsr_fence(void)
>> {
>> asm volatile("mfence" : : : "memory");
>> }
>>
>> Intel SDM, 10.12.3 MSR Access in x2APIC Mode:
>>
>> "A WRMSR to an APIC register may complete before all preceding stores
>> are globally visible; software can prevent this by inserting a
>> serializing instruction or the sequence MFENCE;LFENCE before the WRMSR."
>>
>> The former dates back to ce4e240c279a, but that commit does not mention
>> why lfence is not needed. Did the manual read differently back then? Or
>> why are we safe? To my reading of lfence, it also has a certain
>> instruction serializing effect that mfence does not have.
>
> The 2011 SDM says:
>
> A WRMSR to an APIC register may complete before all preceding stores
> are globally visible; software can prevent this by inserting a
> serializing instruction, an SFENCE, or an MFENCE before the WRMSR.
>
> Sigh....
>

OK, that explains it. Curious since when (date and CPU generation) we
unknowingly started to play roulette here.

Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux

2020-03-02 16:35:51

by Peter Zijlstra

[permalink] [raw]
Subject: Re: x2apic_wrmsr_fence vs. Intel manual

On Mon, Mar 02, 2020 at 05:20:26PM +0100, Thomas Gleixner wrote:
> Jan Kiszka <[email protected]> writes:
> > as I generated a nice bug around fence vs. x2apic icr writes, I studied
> > the kernel code and the Intel manual in this regard more closely. But
> > there is a discrepancy:
> >
> > arch/x86/include/asm/apic.h:
> >
> > /*
> > * Make previous memory operations globally visible before
> > * sending the IPI through x2apic wrmsr. We need a serializing instruction or
> > * mfence for this.
> > */
> > static inline void x2apic_wrmsr_fence(void)
> > {
> > asm volatile("mfence" : : : "memory");
> > }
> >
> > Intel SDM, 10.12.3 MSR Access in x2APIC Mode:
> >
> > "A WRMSR to an APIC register may complete before all preceding stores
> > are globally visible; software can prevent this by inserting a
> > serializing instruction or the sequence MFENCE;LFENCE before the WRMSR."
> >
> > The former dates back to ce4e240c279a, but that commit does not mention
> > why lfence is not needed. Did the manual read differently back then? Or
> > why are we safe? To my reading of lfence, it also has a certain
> > instruction serializing effect that mfence does not have.
>
> The 2011 SDM says:
>
> A WRMSR to an APIC register may complete before all preceding stores
> are globally visible; software can prevent this by inserting a
> serializing instruction, an SFENCE, or an MFENCE before the WRMSR.
>
> Sigh....

*groan*, The only way I can explain this is...

... because we changed the definition of LFENCE from:

- wait until completion of all prior LOADs

to

- wait until completion of all prior instructions

Because Spectre (and because apparently it was implemented that way,
mostly). It could be that MFENCE, which is basically a completion
barrier for all prior LOADs *AND* STOREs, is no longer a stict superset
of LFENCE anymore...

Which makes the otherwise perverted sequence: MFENCE;LFENCE, actually
mean something :/

la-la-la

Would be good to have that clarified though.


2020-03-04 18:28:10

by Dave Hansen

[permalink] [raw]
Subject: Re: x2apic_wrmsr_fence vs. Intel manual

On 3/2/20 8:11 AM, Jan Kiszka wrote:
> The former dates back to ce4e240c279a, but that commit does not mention
> why lfence is not needed. Did the manual read differently back then? Or
> why are we safe? To my reading of lfence, it also has a certain
> instruction serializing effect that mfence does not have.

I asked around Intel about this.

The old "SFENCE, or MFENCE" recommendation was deemed insufficient
because it has no impact on the ordering of WRMSR since it is not a
"load or store instruction". LFENCE's instruction-ordering semantic is
needed because it ensures later ordering of all instructions, not just
loads and stores.

Jan, do you think you're seeing a bug resulting from WRMSR ordering?

2020-03-04 18:41:04

by Jan Kiszka

[permalink] [raw]
Subject: Re: x2apic_wrmsr_fence vs. Intel manual

On 04.03.20 19:27, Dave Hansen wrote:
> On 3/2/20 8:11 AM, Jan Kiszka wrote:
>> The former dates back to ce4e240c279a, but that commit does not mention
>> why lfence is not needed. Did the manual read differently back then? Or
>> why are we safe? To my reading of lfence, it also has a certain
>> instruction serializing effect that mfence does not have.
>
> I asked around Intel about this.
>
> The old "SFENCE, or MFENCE" recommendation was deemed insufficient
> because it has no impact on the ordering of WRMSR since it is not a
> "load or store instruction". LFENCE's instruction-ordering semantic is
> needed because it ensures later ordering of all instructions, not just
> loads and stores.
>
> Jan, do you think you're seeing a bug resulting from WRMSR ordering?
>

Nope, not so far. I'm hunting a race between two guests over Jailhouse
where the kick (sent as IPI) seems to come before the data, but changing
the fences didn't solve it, unfortunately. Along that, I was reading
code and manuals up and down and ran into this inconsistency. That's the
story.

Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux