Hello, I'm experiencing a serious bug on AMD Zen2 that is causing
registers to "roll back" to previous values after a context switch.
I know that sounds unbelievable - but I have a reliable reproducer!
$ grep -m1 'model name' /proc/cpuinfo
model name : AMD Ryzen Threadripper PRO 3945WX 12-Cores
The bug occurs when there is a context switch after a ucomiss
instruction. The YMM registers are "rolled back" to some previous state.
It's not clear to me how or why this is happening, or if it can happen
across a process boundary.
To reproduce:
$ nasm -felf64 -O0 zenymmasm.asm
$ ld -o zenymmasm zenymmasm.o
If you run it it should just print some nuls:
$ ./zenymmasm
$
That is the expected, correct result.
However, if you run the attached program hammer.c (it just runs
sched_yield() in a loop), and then pin this testcase to the same core as
that:
$ taskset -c 1 ./zenymmasm
SECRETSECRET
The previous register values are restored.
I think this should be impossible.
The code does this:
vmovdqu ymm0, [rel secret] <--- put SECRET into ymm0
mov rax, SYS_sched_yield
syscall
vpxor ymm0, ymm0, ymm0 <--- Here the value of ymm0 should be lost
It's not related to to VPXOR, you can use VZEROALL or whatever else.
ucomiss xmm0, dword [rel space]
mov rax, SYS_sched_yield
syscall
It's the UCOMISS r128,m32 instruction that triggers the bug, the value
of the m32 must be < 0x80000. As far as I know, this should only ever
change condition flags, but if we dump the value of ymm0:
mov rax, SYS_write
mov rdi, 1
lea rsi, [rel regstate]
mov rdx, 32
syscall
It will have reverted back to the pre-vpxor SECRET value !?!?
In the original C program, we were seeing register values randomly
restored from significantly earlier in the program execution (like, from
ld.so), sometimes values we don't recognize and can't explain.
We've reproduced on multiple Zen2 machines.
Obviously, not great if your registers are randomly time travelling :)
Thanks, Tavis.
--
_o) $ lynx lock.cmpxchg8b.com
/\\ _o) _o) $ finger [email protected]
_\_V _( ) _( ) @taviso
On Tue, Feb 21, 2023 at 10:40:07PM -0800, Tavis Ormandy wrote:
> Hello, I'm experiencing a serious bug on AMD Zen2 that is causing
> registers to "roll back" to previous values after a context switch.
It probably doesn't matter all too much but which kernel are you running
this on?
Latest and greatest I hope...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Feb 22, 2023 at 09:33:19AM +0100, Borislav Petkov wrote:
> It probably doesn't matter all too much but which kernel are you running
> this on?
>
> Latest and greatest I hope...
Also, what does
$ grep microcode /proc/cpuinfo | uniq
say?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On 22/02/2023 9:14 am, Borislav Petkov wrote:
> On Wed, Feb 22, 2023 at 09:33:19AM +0100, Borislav Petkov wrote:
>> It probably doesn't matter all too much but which kernel are you running
>> this on?
>>
>> Latest and greatest I hope...
> Also, what does
>
> $ grep microcode /proc/cpuinfo | uniq
>
> say?
This sounds suspiciously like an errata which was fixed with a ucode
update last year.
Check that you are up to date with ucode. If you are, perhaps the fix
needs another look at...
~Andrew
On Wed, Feb 22, 2023 at 09:38:09AM +0000, Andrew Cooper wrote:
> This sounds suspiciously like an errata which was fixed with a ucode
> update last year.
Yes, it looks like it.
Alternatively, you can try booting with "clearcpuid=xsaves" - that
should take care of your observation too but yeah, you should rather
update your microcode.
HTH.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Feb 22, 2023 at 11:09:51AM +0100, Borislav Petkov wrote:
> On Wed, Feb 22, 2023 at 09:38:09AM +0000, Andrew Cooper wrote:
> > This sounds suspiciously like an errata which was fixed with a ucode
> > update last year.
>
> Yes, it looks like it.
>
> Alternatively, you can try booting with "clearcpuid=xsaves" - that
> should take care of your observation too but yeah, you should rather
> update your microcode.
>
Thanks - confirmed, it *doesn't* repro with 0x8301055, but does repro
with 0x830104d.
Annoyingly, I thought I was using the most recent microcode, but it seems
like there is some bug and debian wasn't applying it at boot.
That seems like a scary errata :-/
Tavis.
--
_o) $ lynx lock.cmpxchg8b.com
/\\ _o) _o) $ finger [email protected]
_\_V _( ) _( ) @taviso
On 22/02/2023 9:26 pm, Tavis Ormandy wrote:
> On Wed, Feb 22, 2023 at 11:09:51AM +0100, Borislav Petkov wrote:
>> On Wed, Feb 22, 2023 at 09:38:09AM +0000, Andrew Cooper wrote:
>>> This sounds suspiciously like an errata which was fixed with a ucode
>>> update last year.
>> Yes, it looks like it.
>>
>> Alternatively, you can try booting with "clearcpuid=xsaves" - that
>> should take care of your observation too but yeah, you should rather
>> update your microcode.
>>
> Thanks - confirmed, it *doesn't* repro with 0x8301055, but does repro
> with 0x830104d.
For completeness, this is erratum 1386, details of which can be found in
https://www.amd.com/system/files/TechDocs/56323-PUB_1.00.pdf amongst others.
That said, looking at the list of preconditions to tickle this erratum,
I don't see anything in your code which would cause a change in MXCSR.
Which most likely means there's a path in the kernel modifying MXCSR and
that doesn't sound like a thing that ought to be happening by default.
> Annoyingly, I thought I was using the most recent microcode, but it seems
> like there is some bug and debian wasn't applying it at boot.
>
> That seems like a scary errata :-/
Honestly, this is tame compared to some of the things which get fixed in
ucode...
~Andrew
On Wed, Feb 22, 2023 at 01:26:37PM -0800, Tavis Ormandy wrote:
> Thanks - confirmed, it *doesn't* repro with 0x8301055, but does repro
> with 0x830104d.
Good.
> Annoyingly, I thought I was using the most recent microcode, but it seems
> like there is some bug and debian wasn't applying it at boot.
Well, actually, microcode loading is so simple - you don't absolutely
need to rely on the distros to do it for ya.
You simply get it from
git://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
and copy all the bin files from amd-ucode/ into /lib/firmware/amd-ucode/
Technically, you'd need only the one for your family but the tools can
deal with multiple files in there.
And then you're all set - dracut or whatever creates your initrd will
add it. You can even add it by hand, see
Documentation/x86/microcode.rst
> That seems like a scary errata :-/
Yeah, modern x86 hardware is crazy complex. And we've had worse. :-\
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Feb 22, 2023 at 10:17:35PM +0000, Andrew Cooper wrote:
> I don't see anything in your code which would cause a change in MXCSR.
> Which most likely means there's a path in the kernel modifying MXCSR and
> that doesn't sound like a thing that ought to be happening by default.
We were just talking about this internally too. We could trace it to see
what fiddles with MXCSR but meh, bigger fish to fry.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, 22 Feb 2023, Borislav Petkov wrote:
> On Wed, Feb 22, 2023 at 09:38:09AM +0000, Andrew Cooper wrote:
> > This sounds suspiciously like an errata which was fixed with a ucode
> > update last year.
>
> Yes, it looks like it.
>
> Alternatively, you can try booting with "clearcpuid=xsaves" - that
> should take care of your observation too but yeah, you should rather
> update your microcode.
Hi folks,
I can reproduce this bug on AMD Renoir SoC:
vendor_id : AuthenticAMD
cpu family : 23
model : 96
model name : AMD Ryzen 5 4600G with Radeon Graphics
stepping : 1
microcode : 0x8600104
for which there's no microcode update, the microcode_amd_fam17h.bin file
in the linux-firmware.git repo carries only the following patches:
$ ./amd_ucode_info.py microcode_amd_fam17h.bin
Microcode patches in microcode_amd_fam17h.bin:
Family=0x17 Model=0x08 Stepping=0x02: Patch=0x0800820d Length=3200 bytes
Family=0x17 Model=0x01 Stepping=0x02: Patch=0x0800126e Length=3200 bytes
Family=0x17 Model=0x31 Stepping=0x00: Patch=0x08301055 Length=3200 bytes
I've seen microcode version increase after a BIOS update, so it seems like
internally microcode patches exist for Renoir too, but it's up to hardware
vendors to pick them up as part of a BIOS update. Is there any chance of
a conventional release like for the above three CPU models?
Thanks.
Alexander
On Tue, Feb 28, 2023 at 09:47:22PM +0300, Alexander Monakov wrote:
> I've seen microcode version increase after a BIOS update,
Are you saying there's no BIOS update for your board?
Are you running the latest BIOS?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Tue, 28 Feb 2023, Borislav Petkov wrote:
> On Tue, Feb 28, 2023 at 09:47:22PM +0300, Alexander Monakov wrote:
> > I've seen microcode version increase after a BIOS update,
>
> Are you saying there's no BIOS update for your board?
No, on the contrary, if there were no updates I would not be able to
see microcode version increase after updating.
> Are you running the latest BIOS?
Yes.
Alexander
On Tue, Feb 28, 2023 at 10:24:40PM +0300, Alexander Monakov wrote:
> No, on the contrary, if there were no updates I would not be able to
> see microcode version increase after updating.
So what are you saying then?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Tue, 28 Feb 2023, Borislav Petkov wrote:
> On Tue, Feb 28, 2023 at 10:24:40PM +0300, Alexander Monakov wrote:
> > No, on the contrary, if there were no updates I would not be able to
> > see microcode version increase after updating.
>
> So what are you saying then?
That I can reproduce the bug even with the latest BIOS, i.e. the microcode
in the latest BIOS update does not have the fix, and the linux-firmware.git
microcode does not have a patch for this CPU family at all.
Hence the question how to get fixed microcode for this CPU family.
Alexander
On 28/02/2023 7:29 pm, Alexander Monakov wrote:
> On Tue, 28 Feb 2023, Borislav Petkov wrote:
>
>> On Tue, Feb 28, 2023 at 10:24:40PM +0300, Alexander Monakov wrote:
>>> No, on the contrary, if there were no updates I would not be able to
>>> see microcode version increase after updating.
>> So what are you saying then?
> That I can reproduce the bug even with the latest BIOS, i.e. the microcode
> in the latest BIOS update does not have the fix, and the linux-firmware.git
> microcode does not have a patch for this CPU family at all.
>
> Hence the question how to get fixed microcode for this CPU family.
I shouldn't be recommending this, but I've begged AMD many times to be
better at publishing microcode...
https://github.com/platomav/CPUMicrocodes
and you'll need
---8<---
#!/usr/bin/env python3
# -*- coding:utf-8 -*-
import sys
from pathlib import Path
from struct import pack
fout = sys.stdout
def stream_write(_):
"""Write to the output"""
return fout.buffer.write(_)
blobs = []
for ucode in sorted(sys.argv[1:]):
# cpu00xx0Fxx_
fm = int(ucode[3:11], 16)
eq = int("".join([ucode[5], ucode[6], ucode[9], ucode[10]]), 16)
sz = Path(ucode).stat().st_size
print("%08x, %s => %04x" % (fm, ucode[3:11], eq), file=sys.stderr)
blobs.append((fm, eq, sz, open(ucode, "rb").read()))
equiv = []
for x in blobs:
equiv.append(pack(b"=IIIHH", x[0], 0, 0, x[1], 0))
equiv.append(pack("=IIIHH", 0, 0, 0, 0, 0))
stream_write(pack("I", 0x00414d44)) # Magic
equiv_bin = b"".join(equiv)
stream_write(pack("II", 0, len(equiv_bin))) # Equiv table
stream_write(equiv_bin)
for blob in blobs:
stream_write(pack("II", 1, len(blob[3])))
stream_write(blob[3])
---8<---
to wrap the blob(s) from that git repo into the container format that
Linux and others know how to parse.
~Andrew
On Tue, Feb 28, 2023 at 10:29:23PM +0300, Alexander Monakov wrote:
> That I can reproduce the bug even with the latest BIOS,
Can you reproduce if you boot with
clearcpuid=xsaves
?
CONFIG_X86_FEATURE_NAMES must be enabled in your .config for that to
work.
Otherwise, try
clearcpuid=323
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Tue, 28 Feb 2023, Borislav Petkov wrote:
> On Tue, Feb 28, 2023 at 10:29:23PM +0300, Alexander Monakov wrote:
> > That I can reproduce the bug even with the latest BIOS,
>
> Can you reproduce if you boot with
>
> clearcpuid=xsaves
>
> ?
No, with this option it is not reproducible.
Alexander
On Tue, 28 Feb 2023, Andrew Cooper wrote:
> On 28/02/2023 7:29 pm, Alexander Monakov wrote:
> > On Tue, 28 Feb 2023, Borislav Petkov wrote:
> >
> > That I can reproduce the bug even with the latest BIOS, i.e. the microcode
> > in the latest BIOS update does not have the fix, and the linux-firmware.git
> > microcode does not have a patch for this CPU family at all.
> >
> > Hence the question how to get fixed microcode for this CPU family.
>
> I shouldn't be recommending this, but I've begged AMD many times to be
> better at publishing microcode...
>
> https://github.com/platomav/CPUMicrocodes
>
> and you'll need
[snip script]
Thank you, this is good to have. Unfortunately, even with microcode 0x08600109
taken from that repo the issue is reproducible (it is dated 2022-03-28, so
perhaps not really surprising).
Alexander
On 28/02/2023 9:16 pm, Alexander Monakov wrote:
> On Tue, 28 Feb 2023, Borislav Petkov wrote:
>> On Tue, Feb 28, 2023 at 10:29:23PM +0300, Alexander Monakov wrote:
>>> That I can reproduce the bug even with the latest BIOS,
>> Can you reproduce if you boot with
>>
>> clearcpuid=xsaves
>>
>> ?
> No, with this option it is not reproducible.
Ok.
Given that AMD do appear to have screwed up here, and the exploit does
reliably work on modern versions of Linux and up-to-date firmware, the
next course of action is to clobber XSAVES by default.
So we need a table for all Zen2 parts of ucode revisions below which we
force hide XSAVES as the erratum workaround.
That, or we skip the table and just hide XSAVES unconditionally on all
Fam17/18h CPUs... The Zen1/2 uarches have no supervisor states to
manage (AFAICT - the first supervisor states are CET in Zen3 I think),
and Linux already knows how to use XSAVEC (from virt usecases) which is
equivalent given no supervisor states.
Thoughts?
~Andrew
On Wed, Mar 01, 2023 at 12:23:01AM +0000, Andrew Cooper wrote:
> So we need a table for all Zen2 parts of ucode revisions below which we
> force hide XSAVES as the erratum workaround.
Working on it - stay tuned.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Mar 01, 2023 at 09:54:14AM +0100, Borislav Petkov wrote:
> On Wed, Mar 01, 2023 at 12:23:01AM +0000, Andrew Cooper wrote:
> > So we need a table for all Zen2 parts of ucode revisions below which we
> > force hide XSAVES as the erratum workaround.
>
> Working on it - stay tuned.
>
FYI, I heard from someone offlist that they can reproduce with the
latest microcode available on this cpu:
cpu family : 23
model : 17
model name : AMD Ryzen 3 2200G with Radeon Vega Graphics
stepping : 0
microcode : 0x8101016
It seems likely many zen1 systems are affected without any microcode
patch, which seems not great to me.
I guess it would be nice if AMD could provide a list of devices they're
not going to provide a patch for...
Tavis.
--
_o) $ lynx lock.cmpxchg8b.com
/\\ _o) _o) $ finger [email protected]
_\_V _( ) _( ) @taviso
AMD Erratum 1386 is summarised as:
XSAVES Instruction May Fail to Save XMM Registers to the Provided
State Save Area
This piece of accidental chronomancy causes the %xmm registers to
occasionally reset back to an older value.
Ignore the XSAVES feature on all AMD Zen1/2 hardware. The XSAVEC
instruction (which works fine) is equivalent on affected parts.
Reported-by: Tavis Ormandy <[email protected]>
Signed-off-by: Andrew Cooper <[email protected]>
---
CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: Borislav Petkov <[email protected]>
CC: Dave Hansen <[email protected]>
CC: [email protected]
CC: "H. Peter Anvin" <[email protected]>
CC: [email protected]
CC: Tavis Ormandy <[email protected]>
CC: Alexander Monakov <[email protected]>
Only compile tested.
This wants backporting to all stable trees that understand XSAVES, but
before 5.19(?) needs the XSAVEC support backporting too...
---
arch/x86/kernel/cpu/amd.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 380753b14cab..f3a4bb479fd5 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -890,6 +890,17 @@ static void init_amd_zn(struct cpuinfo_x86 *c)
node_reclaim_distance = 32;
#endif
+ /*
+ * Work around Erratum 1386. The XSAVES instruction malfunctions in
+ * certain circumstances on Zen1/2 uarch, and not all parts have had
+ * updated microcode at the time of writing (March 2023).
+ *
+ * Affected parts all have no supervisor XSAVE states, meaning that
+ * the XSAVEC instruction (which works fine) is equivelent.
+ */
+ if (c->x86 == 0x17)
+ clear_cpu_cap(c, X86_FEATURE_XSAVES);
+
/* Fix up CPUID bits, but only if not virtualised. */
if (!cpu_has(c, X86_FEATURE_HYPERVISOR)) {
--
2.30.2
On Tue, Mar 07, 2023 at 05:46:43PM +0000, Andrew Cooper wrote:
> + * Work around Erratum 1386. The XSAVES instruction malfunctions in
> + * certain circumstances on Zen1/2 uarch, and not all parts have had
> + * updated microcode at the time of writing (March 2023).
Any reason for rushing this and not waiting for me to clarify affected
models first?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On 07/03/2023 5:50 pm, Borislav Petkov wrote:
> On Tue, Mar 07, 2023 at 05:46:43PM +0000, Andrew Cooper wrote:
>> + * Work around Erratum 1386. The XSAVES instruction malfunctions in
>> + * certain circumstances on Zen1/2 uarch, and not all parts have had
>> + * updated microcode at the time of writing (March 2023).
> Any reason for rushing this and not waiting for me to clarify affected
> models first?
Well yes - more and more reports of impacted systems.
Having the full list of affected models is great and all, but how is it
going to change this patch as a workaround ?
~Andrew
On Tue, Mar 07, 2023 at 06:22:01PM +0000, Andrew Cooper wrote:
> Well yes - more and more reports of impacted systems.
>
> Having the full list of affected models is great and all, but how is it
> going to change this patch as a workaround ?
We don't have to clear the feature flag on those systems which have
a fix.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On 07/03/2023 6:56 pm, Borislav Petkov wrote:
> On Tue, Mar 07, 2023 at 06:22:01PM +0000, Andrew Cooper wrote:
>> Well yes - more and more reports of impacted systems.
>>
>> Having the full list of affected models is great and all, but how is it
>> going to change this patch as a workaround ?
> We don't have to clear the feature flag on those systems which have
> a fix.
Sure, but why is that helpful?
XSAVES and XSAVEC are functionally identical on Zen1/2 because these
CPUs don't advertise any supervisor XSAVE states.
It is only Zen3 where XSAVES starts doing more than XSAVEC (and even
then, only after the CET series actually gets merged...)
~Andrew
On Tue, Mar 07, 2023 at 08:01:36PM +0000, Andrew Cooper wrote:
> Sure, but why is that helpful?
>
> XSAVES and XSAVEC are functionally identical on Zen1/2 because these
> CPUs don't advertise any supervisor XSAVE states.
I guess... We'll know soon enough.
> It is only Zen3 where XSAVES starts doing more than XSAVEC (and even
> then, only after the CET series actually gets merged...)
Yeah, latter should probably happen this time around. :-)
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: b0563468eeac88ebc70559d52a0b66efc37e4e9d
Gitweb: https://git.kernel.org/tip/b0563468eeac88ebc70559d52a0b66efc37e4e9d
Author: Andrew Cooper <[email protected]>
AuthorDate: Tue, 07 Mar 2023 17:46:43
Committer: Borislav Petkov (AMD) <[email protected]>
CommitterDate: Wed, 08 Mar 2023 16:56:08 +01:00
x86/CPU/AMD: Disable XSAVES on AMD family 0x17
AMD Erratum 1386 is summarised as:
XSAVES Instruction May Fail to Save XMM Registers to the Provided
State Save Area
This piece of accidental chronomancy causes the %xmm registers to
occasionally reset back to an older value.
Ignore the XSAVES feature on all AMD Zen1/2 hardware. The XSAVEC
instruction (which works fine) is equivalent on affected parts.
[ bp: Typos, move it into the F17h-specific function. ]
Reported-by: Tavis Ormandy <[email protected]>
Signed-off-by: Andrew Cooper <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/kernel/cpu/amd.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 380753b..95cdd08 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -880,6 +880,15 @@ void init_spectral_chicken(struct cpuinfo_x86 *c)
}
}
#endif
+ /*
+ * Work around Erratum 1386. The XSAVES instruction malfunctions in
+ * certain circumstances on Zen1/2 uarch, and not all parts have had
+ * updated microcode at the time of writing (March 2023).
+ *
+ * Affected parts all have no supervisor XSAVE states, meaning that
+ * the XSAVEC instruction (which works fine) is equivalent.
+ */
+ clear_cpu_cap(c, X86_FEATURE_XSAVES);
}
static void init_amd_zn(struct cpuinfo_x86 *c)
On 3/7/23 12:01, Andrew Cooper wrote:
> On 07/03/2023 6:56 pm, Borislav Petkov wrote:
>> On Tue, Mar 07, 2023 at 06:22:01PM +0000, Andrew Cooper wrote:
>>> Well yes - more and more reports of impacted systems.
>>>
>>> Having the full list of affected models is great and all, but how is it
>>> going to change this patch as a workaround ?
>> We don't have to clear the feature flag on those systems which have
>> a fix.
> Sure, but why is that helpful?
>
> XSAVES and XSAVEC are functionally identical on Zen1/2 because these
> CPUs don't advertise any supervisor XSAVE states.
On Intel at least, XSAVES uses the modified optimization and XSAVEC does
not. I'm not sure if you include this in your definition of
"functionally identical", but it is one of those little XSAVE
architecture oddities that's bitten me a time or two.
The AMD docs don't talk about the modified optimization verbatim, but I
think this nugget is alluding to the same behavior:
> 4. XRSTOR loads an internal state value XRSTOR_INFO that can be used to further optimize a
> subsequent XSAVEOPT or XSAVES. This reflects the current privilege level and virtualization
> mode as well as the save area's base address and XCOMP_BV field.