2018-12-31 10:04:17

by Paul Menzel

Subject: tsc: Fast TSC calibration failed with AMD B350M/Ryzen 3 2200G

Dear Linux folks,


Linux 4.19.13 from Debian Sid/unstable logs the message below on the
board MSI MS-7A37/B350M MORTAR with the processor AMD Ryzen 3 2200G.

As a result, the early time stamps do not seem to be working.

> [ 0.000000] Linux version 4.19.0-1-amd64 ([email protected]) (gcc version 8.2.0 (Debian 8.2.0-13)) #1 SMP Debian 4.19.13-1 (2018-12-30)
> [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.19.0-1-amd64 root=UUID=8883f733-2248-47e0-90b9-ee5384f18d62 ro quiet
> [ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
> [ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
> [ 0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
> [ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
> [ 0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'compacted' format.
> [ 0.000000] BIOS-provided physical RAM map:
> [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009ffff] usable
> [ 0.000000] BIOS-e820: [mem 0x00000000000a0000-0x00000000000fffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000009d7ffff] usable
> [ 0.000000] BIOS-e820: [mem 0x0000000009d80000-0x0000000009ffffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x000000000a000000-0x000000000a1fffff] usable
> [ 0.000000] BIOS-e820: [mem 0x000000000a200000-0x000000000a209fff] ACPI NVS
> [ 0.000000] BIOS-e820: [mem 0x000000000a20a000-0x000000000affffff] usable
> [ 0.000000] BIOS-e820: [mem 0x000000000b000000-0x000000000b01ffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x000000000b020000-0x000000005d11afff] usable
> [ 0.000000] BIOS-e820: [mem 0x000000005d11b000-0x000000005d20bfff] reserved
> [ 0.000000] BIOS-e820: [mem 0x000000005d20c000-0x000000005d389fff] usable
> [ 0.000000] BIOS-e820: [mem 0x000000005d38a000-0x000000005d790fff] ACPI NVS
> [ 0.000000] BIOS-e820: [mem 0x000000005d791000-0x000000005e5b9fff] reserved
> [ 0.000000] BIOS-e820: [mem 0x000000005e5ba000-0x000000005e663fff] type 20
> [ 0.000000] BIOS-e820: [mem 0x000000005e664000-0x000000005effffff] usable
> [ 0.000000] BIOS-e820: [mem 0x000000005f000000-0x00000000dfffffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000f8000000-0x00000000fbffffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000fd100000-0x00000000fdffffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000fea00000-0x00000000fea0ffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000feb80000-0x00000000fec01fff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000fec10000-0x00000000fec10fff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000fec30000-0x00000000fec30fff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000fed00000-0x00000000fed00fff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000fed40000-0x00000000fed44fff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000fed80000-0x00000000fed8ffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000fedc2000-0x00000000fedcffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000fedd4000-0x00000000fedd5fff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000fee00000-0x00000000feefffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000041f33ffff] usable
> [ 0.000000] NX (Execute Disable) protection: active
> [ 0.000000] efi: EFI v2.70 by American Megatrends
> [ 0.000000] efi: ACPI 2.0=0x5d711000 ACPI=0x5d711000 SMBIOS=0x5e483000 MEMATTR=0x597ce018 ESRT=0x597d2f98
> [ 0.000000] secureboot: Secure boot could not be determined (mode 0)
> [ 0.000000] SMBIOS 2.8 present.
> [ 0.000000] DMI: Micro-Star International Co., Ltd. MS-7A37/B350M MORTAR (MS-7A37), BIOS 1.I0 11/06/2018
> [ 0.000000] tsc: Fast TSC calibration failed
> [ 0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
> [ 0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable

Please find the Linux messages attached.


Kind regards,

Paul


Attachments:
20181231-105503--initcall.dmesg (74.33 kB)

2019-01-07 15:35:37

by Paul Menzel

Subject: Re: tsc: Fast TSC calibration failed with AMD B350M/Ryzen 3 2200G

Dear Thomas,


On 01/07/19 16:24, Thomas Gleixner wrote:

> On Mon, 31 Dec 2018, Paul Menzel wrote:
>
>> Linux 4.19.13 from Debian Sid/unstable logs the message below on the board MSI
>> MS-7A37/B350M MORTAR with the processor AMD Ryzen 3 2200G.
>>
>> As a result, the early time stamps do not seem to be working.
>
>>> [ 0.000000] DMI: Micro-Star International Co., Ltd. MS-7A37/B350M MORTAR (MS-7A37), BIOS 1.I0 11/06/2018
>>> [ 0.000000] tsc: Fast TSC calibration failed
>
> And the further boot log says:
>
> [ 0.036000] tsc: Unable to calibrate against PIT
> [ 0.036000] tsc: using HPET reference calibration
> [ 0.036000] tsc: Detected 3500.117 MHz processor
>
> So the quick calibration in early boot fails because the PIT seems not to
> do what the kernel expects. Nothing we can cure :(

I see. Can AMD confirm that this is the expected behavior? If yes, should
the fast TSC calibration be skipped on these devices?


Kind regards,

Paul


Attachments:
smime.p7s (5.05 kB)
S/MIME Cryptographic Signature

2019-01-07 18:47:31

by Thomas Gleixner

Subject: Re: tsc: Fast TSC calibration failed with AMD B350M/Ryzen 3 2200G

Paul,

On Mon, 31 Dec 2018, Paul Menzel wrote:

> Linux 4.19.13 from Debian Sid/unstable logs the message below on the board MSI
> MS-7A37/B350M MORTAR with the processor AMD Ryzen 3 2200G.
>
> As a result, the early time stamps do not seem to be working.

> > [ 0.000000] DMI: Micro-Star International Co., Ltd. MS-7A37/B350M MORTAR
> > (MS-7A37), BIOS 1.I0 11/06/2018
> > [ 0.000000] tsc: Fast TSC calibration failed

And the further boot log says:

[ 0.036000] tsc: Unable to calibrate against PIT
[ 0.036000] tsc: using HPET reference calibration
[ 0.036000] tsc: Detected 3500.117 MHz processor

So the quick calibration in early boot fails because the PIT seems not to
do what the kernel expects. Nothing we can cure :(

Thanks,

tglx



2019-01-11 21:18:46

by Thomas Gleixner

Subject: Re: tsc: Fast TSC calibration failed with AMD B350M/Ryzen 3 2200G

Paul,

On Mon, 7 Jan 2019, Paul Menzel wrote:
> On 01/07/19 16:24, Thomas Gleixner wrote:
> >> Linux 4.19.13 from Debian Sid/unstable logs the message below on the board MSI
> >> MS-7A37/B350M MORTAR with the processor AMD Ryzen 3 2200G.
> >>
> >> As a result, the early time stamps do not seem to be working.
> >
> >>> [ 0.000000] DMI: Micro-Star International Co., Ltd. MS-7A37/B350M MORTAR (MS-7A37), BIOS 1.I0 11/06/2018
> >>> [ 0.000000] tsc: Fast TSC calibration failed
> >
> > And the further boot log says:
> >
> > [ 0.036000] tsc: Unable to calibrate against PIT
> > [ 0.036000] tsc: using HPET reference calibration
> > [ 0.036000] tsc: Detected 3500.117 MHz processor
> >
> > So the quick calibration in early boot fails because the PIT seems not to
> > do what the kernel expects. Nothing we can cure :(
>
> I see. Can AMD confirm that this is the expected behavior? If yes, should
> the fast TSC calibration be skipped on these devices?

It should work and we really don't want to add cpu family/model based
decisions whether we invoke something or not. Those tables are stale before
they hit mainline.

Thanks,

tglx

2019-01-14 10:11:14

by Paul Menzel

Subject: Re: tsc: Fast TSC calibration failed with AMD B350M/Ryzen 3 2200G

Dear Thomas,


On 01/11/19 21:43, Thomas Gleixner wrote:

> On Mon, 7 Jan 2019, Paul Menzel wrote:
>> On 01/07/19 16:24, Thomas Gleixner wrote:
>>>> Linux 4.19.13 from Debian Sid/unstable logs the message below on the board MSI
>>>> MS-7A37/B350M MORTAR with the processor AMD Ryzen 3 2200G.
>>>>
>>>> As a result, the early time stamps do not seem to be working.
>>>
>>>>> [ 0.000000] DMI: Micro-Star International Co., Ltd. MS-7A37/B350M MORTAR (MS-7A37), BIOS 1.I0 11/06/2018
>>>>> [ 0.000000] tsc: Fast TSC calibration failed
>>>
>>> And the further boot log says:
>>>
>>> [ 0.036000] tsc: Unable to calibrate against PIT
>>> [ 0.036000] tsc: using HPET reference calibration
>>> [ 0.036000] tsc: Detected 3500.117 MHz processor
>>>
>>> So the quick calibration in early boot fails because the PIT seems not to
>>> do what the kernel expects. Nothing we can cure :(
>>
>> I see. Can AMD confirm that this is the expected behavior? If yes, should
>> the fast TSC calibration be skipped on these devices?
>
> It should work and we really don't want to add cpu family/model based
> decisions whether we invoke something or not. Those tables are stale before
> they hit mainline.

Understood. If it’s supposed to work, any hints on how to debug this?

Do any Linux kernel developers have an AMD Ryzen system on which they can
reproduce the issue?

It seems to fail with an AMD Ryzen 2400G too [1].

It also fails on an AMD Ryzen 7 1700 [2].

```
[ 0.000000] Linux version 4.15.0-kali3-amd64 ([email protected]) (gcc version 7.3.0 (Debian 7.3.0-16)) #1 SMP Debian 4.15.17-1kali1 (2018-04-25)
[…]
[ 0.008000] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 0.028000] tsc: Fast TSC calibration failed
[ 0.032000] tsc: PIT calibration matches HPET. 1 loops
[ 0.032000] tsc: Detected 2994.246 MHz processor
[…]
[ 0.044000] smpboot: CPU0: AMD Ryzen 7 1700 Eight-Core Processor (family: 0x17, model: 0x1, stepping: 0x1)
```

It *works* here on one system with AMD Ryzen 5 PRO 1500 and Linux 4.14.87.

```
[ 0.000000] Linux version 4.14.87.mx64.236 ([email protected]) (gcc version 7.3.0 (GCC)) #1 SMP Mon Dec 10 09:48:57 CET 2018
[…]
[ 0.000000] tsc: Fast TSC calibration using PIT
[…]
[ 0.035000] smpboot: CPU0: AMD Ryzen 5 PRO 1500 Quad-Core Processor (family: 0x17, model: 0x1, stepping: 0x1)
```


Kind regards,

Paul


[1]: https://bbs.archlinux.org/viewtopic.php?pid=1781282#p1781282
[2]: https://forums.kali.org/showthread.php?40444-error-loading-amdgpu-drivers-AMD-RX580-driver


Attachments:
smime.p7s (5.05 kB)
S/MIME Cryptographic Signature

2019-01-22 16:56:23

by Paul Menzel

Subject: Re: tsc: Fast TSC calibration failed with AMD B350M/Ryzen 3 2200G

[Adding Tom to CC]

Dear Thomas, dear Tom,


On 01/14/19 11:09, Paul Menzel wrote:

> On 01/11/19 21:43, Thomas Gleixner wrote:
>
>> On Mon, 7 Jan 2019, Paul Menzel wrote:
>>> On 01/07/19 16:24, Thomas Gleixner wrote:
>>>>> Linux 4.19.13 from Debian Sid/unstable logs the message below on the board MSI
>>>>> MS-7A37/B350M MORTAR with the processor AMD Ryzen 3 2200G.
>>>>>
>>>>> As a result, the early time stamps do not seem to be working.
>>>>
>>>>>> [ 0.000000] DMI: Micro-Star International Co., Ltd. MS-7A37/B350M MORTAR (MS-7A37), BIOS 1.I0 11/06/2018
>>>>>> [ 0.000000] tsc: Fast TSC calibration failed
>>>>
>>>> And the further boot log says:
>>>>
>>>> [ 0.036000] tsc: Unable to calibrate against PIT
>>>> [ 0.036000] tsc: using HPET reference calibration
>>>> [ 0.036000] tsc: Detected 3500.117 MHz processor
>>>>
>>>> So the quick calibration in early boot fails because the PIT seems not to
>>>> do what the kernel expects. Nothing we can cure :(
>>>
>>> I see. Can AMD confirm that this is the expected behavior? If yes, should
>>> the fast TSC calibration be skipped on these devices?
>>
>> It should work and we really don't want to add cpu family/model based
>> decisions whether we invoke something or not. Those tables are stale before
>> they hit mainline.
>
> Understood. If it’s supposed to work, any hints on how to debug this?
>
> Does some Linux kernel developers have an AMD Ryzen system, and can reproduce
> the issue?
>
> It seems to fail with an AMD Ryzen 2400G too [1].

We now have an HP EliteDesk 705 G4 MT with that processor, showing the same
problem.

```
[ 0.000000] Linux version 4.20.0.mx64.238 ([email protected]) (gcc version 7.3.0 (GCC)) #1 SMP Mon Dec 24 14:50:00 CET 2018
[…]
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] SMBIOS 3.1 present.
[ 0.000000] DMI: HP HP EliteDesk 705 G4 MT/83E7, BIOS Q06 Ver. 02.04.01 09/14/2018
[ 0.000000] tsc: Fast TSC calibration failed
[ 0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[…]
[ 0.017860] smpboot: CPU0: AMD Ryzen 5 PRO 2400G with Radeon Vega Graphics (family: 0x17, model: 0x11, stepping: 0x0)
[…]
```

> It also fails on an AMD Ryzen 7 1700 [2].
>
> ```
> [ 0.000000] Linux version 4.15.0-kali3-amd64 ([email protected]) (gcc version 7.3.0 (Debian 7.3.0-16)) #1 SMP Debian 4.15.17-1kali1 (2018-04-25)
> […]
> [ 0.008000] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
> [ 0.028000] tsc: Fast TSC calibration failed
> [ 0.032000] tsc: PIT calibration matches HPET. 1 loops
> [ 0.032000] tsc: Detected 2994.246 MHz processor
> […]
> [ 0.044000] smpboot: CPU0: AMD Ryzen 7 1700 Eight-Core Processor (family: 0x17, model: 0x1, stepping: 0x1)
> ```
>
> It *works* here on one system with AMD Ryzen 5 PRO 1500 and Linux 4.14.87.
>
> ```
> [ 0.000000] Linux version 4.14.87.mx64.236 ([email protected]) (gcc version 7.3.0 (GCC)) #1 SMP Mon Dec 10 09:48:57 CET 2018
> […]
> [ 0.000000] tsc: Fast TSC calibration using PIT
> […]
> [ 0.035000] smpboot: CPU0: AMD Ryzen 5 PRO 1500 Quad-Core Processor (family: 0x17, model: 0x1, stepping: 0x1)
> ```

How should we continue from here? Is documentation for that available from AMD?
I didn’t find a BKDG (BIOS and Kernel Developer's Guide) at [3].


Kind regards,

Paul


> [1]: https://bbs.archlinux.org/viewtopic.php?pid=1781282#p1781282
> [2]: https://forums.kali.org/showthread.php?40444-error-loading-amdgpu-drivers-AMD-RX580-driver
[3]: https://developer.amd.com/resources/developer-guides-manuals/


Attachments:
smime.p7s (5.05 kB)
S/MIME Cryptographic Signature

2019-01-22 20:26:26

by Tom Lendacky

Subject: Re: tsc: Fast TSC calibration failed with AMD B350M/Ryzen 3 2200G

On 1/22/19 10:53 AM, Paul Menzel wrote:
> [Adding Tom to CC]
>
> Dear Thomas, dear Tom,
>
>
> On 01/14/19 11:09, Paul Menzel wrote:
>
>> On 01/11/19 21:43, Thomas Gleixner wrote:
>>
>>> On Mon, 7 Jan 2019, Paul Menzel wrote:
>>>> On 01/07/19 16:24, Thomas Gleixner wrote:
>>>>>> Linux 4.19.13 from Debian Sid/unstable logs the message below on the board MSI
>>>>>> MS-7A37/B350M MORTAR with the processor AMD Ryzen 3 2200G.
>>>>>>
>>>>>> As a result, the early time stamps do not seem to be working.
>>>>>
>>>>>>> [ 0.000000] DMI: Micro-Star International Co., Ltd. MS-7A37/B350M MORTAR (MS-7A37), BIOS 1.I0 11/06/2018
>>>>>>> [ 0.000000] tsc: Fast TSC calibration failed
>>>>>
>>>>> And the further boot log says:
>>>>>
>>>>> [ 0.036000] tsc: Unable to calibrate against PIT
>>>>> [ 0.036000] tsc: using HPET reference calibration
>>>>> [ 0.036000] tsc: Detected 3500.117 MHz processor
>>>>>
>>>>> So the quick calibration in early boot fails because the PIT seems not to
>>>>> do what the kernel expects. Nothing we can cure :(
>>>>
>>>> I see. Can AMD confirm that this is the expected behavior? If yes, should
>>>> the fast TSC calibration be skipped on these devices?

Hi Paul,

It's not expected behavior. All of the systems that I have access to do
not exhibit this issue. Having said that, I have a limited number of
systems available to me.

I don't have much experience in this area, but if it is something that
consistently occurs, you might try to see if you can better identify why
it fails. The message is issued in pit_hpet_ptimer_calibrate_cpu() in file
arch/x86/kernel/tsc.c.

Thanks,
Tom

>>>
>>> It should work and we really don't want to add cpu family/model based
>>> decisions whether we invoke something or not. Those tables are stale before
>>> they hit mainline.
>>
>> Understood. If it’s supposed to work, any hints on how to debug this?
>>
>> Does some Linux kernel developers have an AMD Ryzen system, and can reproduce
>> the issue?
>>
>> It seems to fail with an AMD Ryzen 2400G too [1].
>
> We now have an HP EliteDesk 705 G4 MT with that processsor, showing the same
> problem.
>
> ```
> [ 0.000000] Linux version 4.20.0.mx64.238 ([email protected]) (gcc version 7.3.0 (GCC)) #1 SMP Mon Dec 24 14:50:00 CET 2018
> […]
> [ 0.000000] NX (Execute Disable) protection: active
> [ 0.000000] SMBIOS 3.1 present.
> [ 0.000000] DMI: HP HP EliteDesk 705 G4 MT/83E7, BIOS Q06 Ver. 02.04.01 09/14/2018
> [ 0.000000] tsc: Fast TSC calibration failed
> [ 0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
> […]
> [ 0.017860] smpboot: CPU0: AMD Ryzen 5 PRO 2400G with Radeon Vega Graphics (family: 0x17, model: 0x11, stepping: 0x0)
> […]
> ```
>
>> It also fails on an AMD Ryzen 7 1700 [2].
>>
>> ```
>> [ 0.000000] Linux version 4.15.0-kali3-amd64 ([email protected]) (gcc version 7.3.0 (Debian 7.3.0-16)) #1 SMP Debian 4.15.17-1kali1 (2018-04-25)
>> […]
>> [ 0.008000] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
>> [ 0.028000] tsc: Fast TSC calibration failed
>> [ 0.032000] tsc: PIT calibration matches HPET. 1 loops
>> [ 0.032000] tsc: Detected 2994.246 MHz processor
>> […]
>> [ 0.044000] smpboot: CPU0: AMD Ryzen 7 1700 Eight-Core Processor (family: 0x17, model: 0x1, stepping: 0x1)
>> ```
>>
>> It *works* here on one system with AMD Ryzen 5 PRO 1500 and Linux 4.14.87.
>>
>> ```
>> [ 0.000000] Linux version 4.14.87.mx64.236 ([email protected]) (gcc version 7.3.0 (GCC)) #1 SMP Mon Dec 10 09:48:57 CET 2018
>> […]
>> [ 0.000000] tsc: Fast TSC calibration using PIT
>> […]
>> [ 0.035000] smpboot: CPU0: AMD Ryzen 5 PRO 1500 Quad-Core Processor (family: 0x17, model: 0x1, stepping: 0x1)
>> ```
>
> How to continue from here? Is documentation for that available from AMD?
> I didn’t find a BKDG (Bios Kernel Developer Guide) at [3].
>
>
> Kind regards,
>
> Paul
>
>
>> [1]: https://bbs.archlinux.org/viewtopic.php?pid=1781282#p1781282
>> [2]: https://forums.kali.org/showthread.php?40444-error-loading-amdgpu-drivers-AMD-RX580-driver
> [3]: https://developer.amd.com/resources/developer-guides-manuals/
>

2019-01-23 12:59:05

by Paul Menzel

Subject: Re: tsc: Fast TSC calibration failed with several AMD Ryzen processors (2200G, 2400G, Ryzen 7 1700)

Dear Tom,


On 01/22/19 21:24, Lendacky, Thomas wrote:
> On 1/22/19 10:53 AM, Paul Menzel wrote:
>> [Adding Tom to CC]

>> On 01/14/19 11:09, Paul Menzel wrote:
>>
>>> On 01/11/19 21:43, Thomas Gleixner wrote:
>>>
>>>> On Mon, 7 Jan 2019, Paul Menzel wrote:
>>>>> On 01/07/19 16:24, Thomas Gleixner wrote:
>>>>>>> Linux 4.19.13 from Debian Sid/unstable logs the message below on the board MSI
>>>>>>> MS-7A37/B350M MORTAR with the processor AMD Ryzen 3 2200G.
>>>>>>>
>>>>>>> As a result, the early time stamps do not seem to be working.
>>>>>>
>>>>>>>> [ 0.000000] DMI: Micro-Star International Co., Ltd. MS-7A37/B350M MORTAR (MS-7A37), BIOS 1.I0 11/06/2018
>>>>>>>> [ 0.000000] tsc: Fast TSC calibration failed
>>>>>>
>>>>>> And the further boot log says:
>>>>>>
>>>>>> [ 0.036000] tsc: Unable to calibrate against PIT
>>>>>> [ 0.036000] tsc: using HPET reference calibration
>>>>>> [ 0.036000] tsc: Detected 3500.117 MHz processor
>>>>>>
>>>>>> So the quick calibration in early boot fails because the PIT seems not to
>>>>>> do what the kernel expects. Nothing we can cure :(
>>>>>
>>>>> I see. Can AMD confirm that this is the expected behavior? If yes, should
>>>>> the fast TSC calibration be skipped on these devices?

> It's not expected behavior. All of the systems that I have access to do
> not exhibit this issue. Having said that, I have a limited number of
> systems available to me.

As a data point, which Ryzen systems did you test with? It would help to know
whether there are configurations where the same processor behaves inconsistently.

Can you request one of the failing systems mentioned below to reproduce the
problem?

> I don't have much experience in this area, but if it is something that
> consistently occurs, you might try to see if you can better identify why
> it fails. The message is issued in pit_hpet_ptimer_calibrate_cpu() in file
> arch/x86/kernel/tsc.c.

With the attached patch applied, I get:

[ 0.000000] tsc: quick_pit_calibrate: break in if !pit_expect_msb, i = 42
[ 0.000000] tsc: Fast TSC calibration failed, i = 42
[ 0.000000] tsc: Using PIT calibration value

The functions `pit_verify_msb()` and `pit_expect_msb()` are:

```
static inline int pit_verify_msb(unsigned char val)
{
	/* Ignore LSB */
	inb(0x42);
	return inb(0x42) == val;
}

static inline int pit_expect_msb(unsigned char val, u64 *tscp, unsigned long *deltap)
{
	int count;
	u64 tsc = 0, prev_tsc = 0;

	for (count = 0; count < 50000; count++) {
		if (!pit_verify_msb(val))
			break;
		prev_tsc = tsc;
		tsc = get_cycles();
	}
	*deltap = get_cycles() - prev_tsc;
	*tscp = tsc;

	/*
	 * We require _some_ success, but the quality control
	 * will be based on the error terms on the TSC values.
	 */
	return count > 5;
}
```

So count is smaller than or equal to 5, and `pit_verify_msb(val)` failed early,
right?
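
To make the `count > 5` condition concrete, here is a minimal userspace sketch of the same loop shape. It is not the kernel code: `struct sim_pit`, `sim_verify_msb()` and `sim_expect_msb_count()` are hypothetical names, the counter is modelled as a plain 16-bit down-counter starting at 0xffff, and the cost of two PIT ticks per verify call is an assumption chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical model of PIT channel 2: a 16-bit down-counter that
 * starts at 0xffff and loses one count per PIT tick. */
struct sim_pit {
	uint64_t now;       /* PIT ticks elapsed since the counter was loaded */
	uint64_t read_cost; /* ticks consumed by one verify call (assumed) */
};

/* Model of pit_verify_msb(): compare the MSB, then let time pass. */
static int sim_verify_msb(struct sim_pit *pit, unsigned char val)
{
	uint16_t counter = (uint16_t)(0xffff - pit->now);

	pit->now += pit->read_cost;
	return (counter >> 8) == val;
}

/* Same loop shape as pit_expect_msb(), without the TSC bookkeeping.
 * Returns the number of consecutive reads that saw the expected MSB. */
static int sim_expect_msb_count(struct sim_pit *pit, unsigned char val)
{
	int count;

	for (count = 0; count < 50000; count++) {
		if (!sim_verify_msb(pit, val))
			break;
	}
	return count;
}
```

In this model, starting at `now = 0` with a read cost of 2 gives 128 matching reads before the MSB changes. Starting at `now = 250`, close to the end of the window, leaves only 3 matching reads, which would not pass the `count > 5` check.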

>>>> It should work and we really don't want to add cpu family/model based
>>>> decisions whether we invoke something or not. Those tables are stale before
>>>> they hit mainline.
>>>
>>> Understood. If it’s supposed to work, any hints on how to debug this?
>>>
>>> Does some Linux kernel developers have an AMD Ryzen system, and can reproduce
>>> the issue?
>>>
>>> It seems to fail with an AMD Ryzen 2400G too [1].
>>
>> We now have an HP EliteDesk 705 G4 MT with that processsor, showing the same
>> problem.
>>
>> ```
>> [ 0.000000] Linux version 4.20.0.mx64.238 ([email protected]) (gcc version 7.3.0 (GCC)) #1 SMP Mon Dec 24 14:50:00 CET 2018
>> […]
>> [ 0.000000] NX (Execute Disable) protection: active
>> [ 0.000000] SMBIOS 3.1 present.
>> [ 0.000000] DMI: HP HP EliteDesk 705 G4 MT/83E7, BIOS Q06 Ver. 02.04.01 09/14/2018
>> [ 0.000000] tsc: Fast TSC calibration failed
>> [ 0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
>> […]
>> [ 0.017860] smpboot: CPU0: AMD Ryzen 5 PRO 2400G with Radeon Vega Graphics (family: 0x17, model: 0x11, stepping: 0x0)
>> […]
>> ```
>>
>>> It also fails on an AMD Ryzen 7 1700 [2].
>>>
>>> ```
>>> [ 0.000000] Linux version 4.15.0-kali3-amd64 ([email protected]) (gcc version 7.3.0 (Debian 7.3.0-16)) #1 SMP Debian 4.15.17-1kali1 (2018-04-25)
>>> […]
>>> [ 0.008000] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
>>> [ 0.028000] tsc: Fast TSC calibration failed
>>> [ 0.032000] tsc: PIT calibration matches HPET. 1 loops
>>> [ 0.032000] tsc: Detected 2994.246 MHz processor
>>> […]
>>> [ 0.044000] smpboot: CPU0: AMD Ryzen 7 1700 Eight-Core Processor (family: 0x17, model: 0x1, stepping: 0x1)
>>> ```
>>>
>>> It *works* here on one system with AMD Ryzen 5 PRO 1500 and Linux 4.14.87.
>>>
>>> ```
>>> [ 0.000000] Linux version 4.14.87.mx64.236 ([email protected]) (gcc version 7.3.0 (GCC)) #1 SMP Mon Dec 10 09:48:57 CET 2018
>>> […]
>>> [ 0.000000] tsc: Fast TSC calibration using PIT
>>> […]
>>> [ 0.035000] smpboot: CPU0: AMD Ryzen 5 PRO 1500 Quad-Core Processor (family: 0x17, model: 0x1, stepping: 0x1)
>>> ```
>>
>> How to continue from here? Is documentation for that available from AMD?
>> I didn’t find a BKDG (Bios Kernel Developer Guide) at [3].


Kind regards,

Paul


>>> [1]: https://bbs.archlinux.org/viewtopic.php?pid=1781282#p1781282
>>> [2]: https://forums.kali.org/showthread.php?40444-error-loading-amdgpu-drivers-AMD-RX580-driver
>> [3]: https://developer.amd.com/resources/developer-guides-manuals/


Attachments:
0001-x86-kernel-tsc-Debug-early-TSC-calibration.patch (1.38 kB)
smime.p7s (5.05 kB)
S/MIME Cryptographic Signature

2019-01-23 23:34:22

by Tom Lendacky

Subject: Re: tsc: Fast TSC calibration failed with several AMD Ryzen processors (2200G, 2400G, Ryzen 7 1700)

On 1/23/19 6:56 AM, Paul Menzel wrote:
> Dear Tom,
>
>
> On 01/22/19 21:24, Lendacky, Thomas wrote:
>> On 1/22/19 10:53 AM, Paul Menzel wrote:
>>> [Adding Tom to CC]
>
>>> On 01/14/19 11:09, Paul Menzel wrote:
>>>
>>>> On 01/11/19 21:43, Thomas Gleixner wrote:
>>>>
>>>>> On Mon, 7 Jan 2019, Paul Menzel wrote:
>>>>>> On 01/07/19 16:24, Thomas Gleixner wrote:
>>>>>>>> Linux 4.19.13 from Debian Sid/unstable logs the message below on the board MSI
>>>>>>>> MS-7A37/B350M MORTAR with the processor AMD Ryzen 3 2200G.
>>>>>>>>
>>>>>>>> As a result, the early time stamps do not seem to be working.
>>>>>>>
>>>>>>>>> [ 0.000000] DMI: Micro-Star International Co., Ltd. MS-7A37/B350M MORTAR (MS-7A37), BIOS 1.I0 11/06/2018
>>>>>>>>> [ 0.000000] tsc: Fast TSC calibration failed
>>>>>>>
>>>>>>> And the further boot log says:
>>>>>>>
>>>>>>> [ 0.036000] tsc: Unable to calibrate against PIT
>>>>>>> [ 0.036000] tsc: using HPET reference calibration
>>>>>>> [ 0.036000] tsc: Detected 3500.117 MHz processor
>>>>>>>
>>>>>>> So the quick calibration in early boot fails because the PIT seems not to
>>>>>>> do what the kernel expects. Nothing we can cure :(
>>>>>>
>>>>>> I see. Can AMD confirm that this is the expected behavior? If yes, should
>>>>>> the fast TSC calibration be skipped on these devices?
>
>> It's not expected behavior. All of the systems that I have access to do
>> not exhibit this issue. Having said that, I have a limited number of
>> systems available to me.
>
> But as a data point, what Ryzen systems did you test with? Just to know, if
> there are configurations where the same processor behaves inconsistently.
>
> Can you request one of the failing systems mentioned below to reproduce the
> problem?
>
>> I don't have much experience in this area, but if it is something that
>> consistently occurs, you might try to see if you can better identify why
>> it fails. The message is issued in pit_hpet_ptimer_calibrate_cpu() in file
>> arch/x86/kernel/tsc.c.
>
> With the attached patch applied, I get:
>
> [ 0.000000] tsc: quick_pit_calibrate: break in if !pit_expect_msb, i = 42
> [ 0.000000] tsc: Fast TSC calibration failed, i = 42
> [ 0.000000] tsc: Using PIT calibration value
>
> The functions `pit_verify_msb()` and `pit_expect_msb()` are:
>
> ```
> static inline int pit_verify_msb(unsigned char val)
> {
> 	/* Ignore LSB */
> 	inb(0x42);
> 	return inb(0x42) == val;
> }
>
> static inline int pit_expect_msb(unsigned char val, u64 *tscp, unsigned long *deltap)
> {
> 	int count;
> 	u64 tsc = 0, prev_tsc = 0;
>
> 	for (count = 0; count < 50000; count++) {
> 		if (!pit_verify_msb(val))
> 			break;
> 		prev_tsc = tsc;
> 		tsc = get_cycles();
> 	}
> 	*deltap = get_cycles() - prev_tsc;
> 	*tscp = tsc;
>
> 	/*
> 	 * We require _some_ success, but the quality control
> 	 * will be based on the error terms on the TSC values.
> 	 */
> 	return count > 5;
> }
> ```
>
> So count is smaller than or equal to 5, and `pit_verify_msb(val)` failed early,
> right?

Depends on what you mean by "failed early." The loop iterated 42 times
as it tried to reach the acceptable error rate, so pit_expect_msb() had
succeeded a number of times before failing.

One of the things I noticed while searching for info on the PIT is that
one of the specs [1] mentioned that just reading the counter values (as
opposed to using the counter latch command or the read-back command) could
return undefined values if the counters are being updated while being
read. I'm not sure whether that is what is occurring, or whether it still
matters in this day and age; I'm not familiar with this area.

There's also talk of SMIs messing with this calibration. Could a
long-running SMI (something close to 214us) keep "count" under 6? Again,
just speculation on my part.
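
The 214us figure presumably corresponds to one change of the counter's MSB, which takes 256 ticks of the PIT input clock. A rough sketch of that arithmetic follows, assuming the conventional PIT rate of 1193182 Hz. `msb_period_ns()` and `reads_left_after_stall()` are hypothetical helpers, and the per-read cost passed in is a guess, not a measurement.

```c
#include <stdint.h>

#define PIT_TICK_RATE 1193182ULL /* PIT input clock in Hz (assumed) */

/* Time for the counter MSB to change once: 256 PIT ticks, in ns. */
static uint64_t msb_period_ns(void)
{
	return 256ULL * 1000000000ULL / PIT_TICK_RATE;
}

/* Number of verify calls that still fit in one MSB window after a
 * stall of stall_ns (for example an SMI), if each call costs read_ns.
 * Both inputs are assumptions supplied by the caller. */
static uint64_t reads_left_after_stall(uint64_t stall_ns, uint64_t read_ns)
{
	uint64_t window = msb_period_ns();

	if (stall_ns >= window || read_ns == 0)
		return 0;
	return (window - stall_ns) / read_ns;
}
```

`msb_period_ns()` evaluates to 214552, that is about 214.6us. With an assumed cost of 2000 ns per verify call, an undisturbed window fits 107 calls, while a 205us stall leaves room for only 4, which is below the threshold of 6.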

Thanks,
Tom

[1] http://www.scs.stanford.edu/10wi-cs140/pintos/specs/8254.pdf

>
>>>>> It should work and we really don't want to add cpu family/model based
>>>>> decisions whether we invoke something or not. Those tables are stale before
>>>>> they hit mainline.
>>>>
>>>> Understood. If it’s supposed to work, any hints on how to debug this?
>>>>
>>>> Does some Linux kernel developers have an AMD Ryzen system, and can reproduce
>>>> the issue?
>>>>
>>>> It seems to fail with an AMD Ryzen 2400G too [1].
>>>
>>> We now have an HP EliteDesk 705 G4 MT with that processsor, showing the same
>>> problem.
>>>
>>> ```
>>> [ 0.000000] Linux version 4.20.0.mx64.238 ([email protected]) (gcc version 7.3.0 (GCC)) #1 SMP Mon Dec 24 14:50:00 CET 2018
>>> […]
>>> [ 0.000000] NX (Execute Disable) protection: active
>>> [ 0.000000] SMBIOS 3.1 present.
>>> [ 0.000000] DMI: HP HP EliteDesk 705 G4 MT/83E7, BIOS Q06 Ver. 02.04.01 09/14/2018
>>> [ 0.000000] tsc: Fast TSC calibration failed
>>> [ 0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
>>> […]
>>> [ 0.017860] smpboot: CPU0: AMD Ryzen 5 PRO 2400G with Radeon Vega Graphics (family: 0x17, model: 0x11, stepping: 0x0)
>>> […]
>>> ```
>>>
>>>> It also fails on an AMD Ryzen 7 1700 [2].
>>>>
>>>> ```
>>>> [ 0.000000] Linux version 4.15.0-kali3-amd64 ([email protected]) (gcc version 7.3.0 (Debian 7.3.0-16)) #1 SMP Debian 4.15.17-1kali1 (2018-04-25)
>>>> […]
>>>> [ 0.008000] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
>>>> [ 0.028000] tsc: Fast TSC calibration failed
>>>> [ 0.032000] tsc: PIT calibration matches HPET. 1 loops
>>>> [ 0.032000] tsc: Detected 2994.246 MHz processor
>>>> […]
>>>> [ 0.044000] smpboot: CPU0: AMD Ryzen 7 1700 Eight-Core Processor (family: 0x17, model: 0x1, stepping: 0x1)
>>>> ```
>>>>
>>>> It *works* here on one system with AMD Ryzen 5 PRO 1500 and Linux 4.14.87.
>>>>
>>>> ```
>>>> [ 0.000000] Linux version 4.14.87.mx64.236 ([email protected]) (gcc version 7.3.0 (GCC)) #1 SMP Mon Dec 10 09:48:57 CET 2018
>>>> […]
>>>> [ 0.000000] tsc: Fast TSC calibration using PIT
>>>> […]
>>>> [ 0.035000] smpboot: CPU0: AMD Ryzen 5 PRO 1500 Quad-Core Processor (family: 0x17, model: 0x1, stepping: 0x1)
>>>> ```
>>>
>>> How to continue from here? Is documentation for that available from AMD?
>>> I didn’t find a BKDG (Bios Kernel Developer Guide) at [3].
>
>
> Kind regards,
>
> Paul
>
>
>>>> [1]: https://bbs.archlinux.org/viewtopic.php?pid=1781282#p1781282
>>>> [2]: https://forums.kali.org/showthread.php?40444-error-loading-amdgpu-drivers-AMD-RX580-driver
>>> [3]: https://developer.amd.com/resources/developer-guides-manuals/

2019-01-28 17:19:39

by Paul Menzel

Subject: Re: tsc: Fast TSC calibration failed with several AMD Ryzen processors (2200G, 2400G, Ryzen 7 1700)

Dear Tom,


On 01/24/19 00:33, Lendacky, Thomas wrote:
> On 1/23/19 6:56 AM, Paul Menzel wrote:

>> On 01/22/19 21:24, Lendacky, Thomas wrote:
>>> On 1/22/19 10:53 AM, Paul Menzel wrote:
>>>> [Adding Tom to CC]
>>
>>>> On 01/14/19 11:09, Paul Menzel wrote:
>>>>
>>>>> On 01/11/19 21:43, Thomas Gleixner wrote:
>>>>>
>>>>>> On Mon, 7 Jan 2019, Paul Menzel wrote:
>>>>>>> On 01/07/19 16:24, Thomas Gleixner wrote:
>>>>>>>>> Linux 4.19.13 from Debian Sid/unstable logs the message below on the board MSI
>>>>>>>>> MS-7A37/B350M MORTAR with the processor AMD Ryzen 3 2200G.
>>>>>>>>>
>>>>>>>>> As a result, the early time stamps do not seem to be working.
>>>>>>>>
>>>>>>>>>> [ 0.000000] DMI: Micro-Star International Co., Ltd. MS-7A37/B350M MORTAR (MS-7A37), BIOS 1.I0 11/06/2018
>>>>>>>>>> [ 0.000000] tsc: Fast TSC calibration failed
>>>>>>>>
>>>>>>>> And the further boot log says:
>>>>>>>>
>>>>>>>> [ 0.036000] tsc: Unable to calibrate against PIT
>>>>>>>> [ 0.036000] tsc: using HPET reference calibration
>>>>>>>> [ 0.036000] tsc: Detected 3500.117 MHz processor
>>>>>>>>
>>>>>>>> So the quick calibration in early boot fails because the PIT seems not to
>>>>>>>> do what the kernel expects. Nothing we can cure :(
>>>>>>>
>>>>>>> I see. Can AMD confirm that this is the expected behavior? If yes, should
>>>>>>> the fast TSC calibration be skipped on these devices?
>>
>>> It's not expected behavior. All of the systems that I have access to do
>>> not exhibit this issue. Having said that, I have a limited number of
>>> systems available to me.
>>
>> But as a data point, what Ryzen systems did you test with? Just to know, if
>> there are configurations where the same processor behaves inconsistently.
>>
>> Can you request one of the failing systems mentioned below to reproduce the
>> problem?
>>
>>> I don't have much experience in this area, but if it is something that
>>> consistently occurs, you might try to see if you can better identify why
>>> it fails. The message is issued in quick_pit_calibrate() in file
>>> arch/x86/kernel/tsc.c.
>>
>> With the attached patch applied, I get:
>>
>> [ 0.000000] tsc: quick_pit_calibrate: break in if !pit_expect_msb, i = 42
>> [ 0.000000] tsc: Fast TSC calibration failed, i = 42
>> [ 0.000000] tsc: Using PIT calibration value
>>
>> The functions `pit_verify_msb()` and `pit_expect_msb()` are:
>>
>> ```
>> static inline int pit_verify_msb(unsigned char val)
>> {
>> 	/* Ignore LSB */
>> 	inb(0x42);
>> 	return inb(0x42) == val;
>> }
>>
>> static inline int pit_expect_msb(unsigned char val, u64 *tscp, unsigned long *deltap)
>> {
>> 	int count;
>> 	u64 tsc = 0, prev_tsc = 0;
>>
>> 	for (count = 0; count < 50000; count++) {
>> 		if (!pit_verify_msb(val))
>> 			break;
>> 		prev_tsc = tsc;
>> 		tsc = get_cycles();
>> 	}
>> 	*deltap = get_cycles() - prev_tsc;
>> 	*tscp = tsc;
>>
>> 	/*
>> 	 * We require _some_ success, but the quality control
>> 	 * will be based on the error terms on the TSC values.
>> 	 */
>> 	return count > 5;
>> }
>> ```
>>
>> So count is smaller than or equal to 5, and `pit_verify_msb(val)` failed early,
>> right?
>
> Depends on what you mean by "failed early." The loop iterated 42 times
> as it tried to reach the acceptable error rate, so pit_expect_msb() had
> succeeded a number of times before failing.

The maximum is 233, and in my tests the counter `i` never surpassed 55 or so.
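
For reference, both numbers in this discussion follow from the PIT input
clock. Below is a minimal sketch of the arithmetic, assuming the constants
`PIT_TICK_RATE` (1193182 Hz) and `MAX_QUICK_PIT_MS` (50) as I recall them
from the kernel sources. The two helper names are made up for illustration
and do not exist in the kernel.

```c
#include <stdint.h>

#define PIT_TICK_RATE		1193182UL	/* PIT input clock in Hz */
#define MAX_QUICK_PIT_MS	50		/* time budget of the quick calibration */

/*
 * Hypothetical helper: number of MSB steps (256 PIT ticks each) that
 * fit into the MAX_QUICK_PIT_MS budget. This is the loop limit for `i`.
 */
static inline unsigned long max_quick_pit_iterations(void)
{
	return MAX_QUICK_PIT_MS * PIT_TICK_RATE / 1000 / 256;
}

/*
 * Hypothetical helper: duration of one MSB step in nanoseconds,
 * rounded down.
 */
static inline uint64_t msb_step_ns(void)
{
	return 256ULL * 1000000000ULL / PIT_TICK_RATE;
}
```

With these values the loop limit comes out as 233, and one MSB step lasts
roughly 214.5 us, which is the figure Tom mentions below.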

> One of the things I noticed while searching for info on the PIT is that
> one of the specs [1] mentioned that just reading the counter values (as
> opposed to using the counter latch command or the read-back command) could
> return undefined values if the counters are being updated while being
> read. I'm not sure if that is what is occurring or if that matters in
> this day and age, I'm not familiar with this area.
>
> There's also talk of SMIs messing with this calibration. Could a long
> running SMI (something close to 214us) run that results in keeping "count"
> under 6? Again, just speculation on my part.

I now print the values read from `inb(0x42)` and log them whenever they differ
from the expected value. Please find the patch attached.

Here are the results of five boots.

[ 0.000000] DMI: HP HP EliteDesk 705 G4 MT/83E7, BIOS Q06 Ver. 02.04.01 09/14/2018
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ff != 0 = val
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fe != ff = val
[ 0.000000] tsc: pit_expect_msb: val = 0xff: count = 52
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fd != fe = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfe: count = 51
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fc != fd = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfd: count = 47
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fb != fc = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfc: count = 47
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fa != fb = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfb: count = 47
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f9 != fa = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfa: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f8 != f9 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf9: count = 47
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f7 != f8 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf8: count = 47
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f6 != f7 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf7: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f5 != f6 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf6: count = 47
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f4 != f5 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf5: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f3 != f4 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf4: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f2 != f3 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf3: count = 47
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f1 != f2 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf2: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f0 != f1 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf1: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ef != f0 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf0: count = 47
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ee != ef = val
[ 0.000000] tsc: pit_expect_msb: val = 0xef: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ed != ee = val
[ 0.000000] tsc: pit_expect_msb: val = 0xee: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ec != ed = val
[ 0.000000] tsc: pit_expect_msb: val = 0xed: count = 47
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = eb != ec = val
[ 0.000000] tsc: pit_expect_msb: val = 0xec: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ea != eb = val
[ 0.000000] tsc: pit_expect_msb: val = 0xeb: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = e9 != ea = val
[ 0.000000] tsc: pit_expect_msb: val = 0xea: count = 47
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = e8 != e9 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xe9: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = e7 != e8 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xe8: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = e6 != e7 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xe7: count = 47
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = e5 != e6 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xe6: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = e4 != e5 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xe5: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = e3 != e4 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xe4: count = 47
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = e2 != e3 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xe3: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = e1 != e2 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xe2: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = e0 != e1 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xe1: count = 47
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = df != e0 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xe0: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = de != df = val
[ 0.000000] tsc: pit_expect_msb: val = 0xdf: count = 47
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = dd != de = val
[ 0.000000] tsc: pit_expect_msb: val = 0xde: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = dc != dd = val
[ 0.000000] tsc: pit_expect_msb: val = 0xdd: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = da != dc = val
[ 0.000000] tsc: pit_expect_msb: val = 0xdc: count = 4
[ 0.000000] tsc: quick_pit_calibrate: break in if !pit_expect_msb, i = 35
[ 0.000000] tsc: Fast TSC calibration failed, i = 35 from 233

[ 0.000000] DMI: HP HP EliteDesk 705 G4 MT/83E7, BIOS Q06 Ver. 02.04.01 09/14/2018
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ff != 0 = val
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fe != ff = val
[ 0.000000] tsc: pit_expect_msb: val = 0xff: count = 53
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fd != fe = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfe: count = 53
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fc != fd = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfd: count = 52
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fb != fc = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfc: count = 51
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fa != fb = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfb: count = 50
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f9 != fa = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfa: count = 45
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f8 != f9 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf9: count = 46
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f7 != f8 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf8: count = 45
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f6 != f7 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf7: count = 45
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f5 != f6 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf6: count = 46
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f4 != f5 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf5: count = 45
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f3 != f4 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf4: count = 45
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f2 != f3 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf3: count = 46
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f1 != f2 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf2: count = 45
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f0 != f1 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf1: count = 45
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ef != f0 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf0: count = 46
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ee != ef = val
[ 0.000000] tsc: pit_expect_msb: val = 0xef: count = 45
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ed != ee = val
[ 0.000000] tsc: pit_expect_msb: val = 0xee: count = 45
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ec != ed = val
[ 0.000000] tsc: pit_expect_msb: val = 0xed: count = 46
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = eb != ec = val
[ 0.000000] tsc: pit_expect_msb: val = 0xec: count = 45
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ea != eb = val
[ 0.000000] tsc: pit_expect_msb: val = 0xeb: count = 45
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = e9 != ea = val
[ 0.000000] tsc: pit_expect_msb: val = 0xea: count = 46
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = e8 != e9 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xe9: count = 45
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = e7 != e8 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xe8: count = 45
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = e5 != e7 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xe7: count = 14
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = e5 != e6 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xe6: count = 0
[ 0.000000] tsc: quick_pit_calibrate: break in if !pit_expect_msb, i = 25
[ 0.000000] tsc: Fast TSC calibration failed, i = 25 from 233

[ 0.000000] DMI: HP HP EliteDesk 705 G4 MT/83E7, BIOS Q06 Ver. 02.04.01 09/14/2018
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ff != 0 = val
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fe != ff = val
[ 0.000000] tsc: pit_expect_msb: val = 0xff: count = 53
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fd != fe = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfe: count = 53
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fc != fd = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfd: count = 52
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fb != fc = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfc: count = 44
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fa != fb = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfb: count = 51
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f7 != fa = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfa: count = 13
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f7 != f9 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf9: count = 0
[ 0.000000] tsc: quick_pit_calibrate: break in if !pit_expect_msb, i = 6
[ 0.000000] tsc: Fast TSC calibration failed, i = 6 from 233

[ 0.000000] DMI: HP HP EliteDesk 705 G4 MT/83E7, BIOS Q06 Ver. 02.04.01 09/14/2018
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ff != 0 = val
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fe != ff = val
[ 0.000000] tsc: pit_expect_msb: val = 0xff: count = 39
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fd != fe = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfe: count = 39
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fc != fd = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfd: count = 39
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fb != fc = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfc: count = 39
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fa != fb = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfb: count = 39
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f9 != fa = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfa: count = 39
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f8 != f9 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf9: count = 40
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f7 != f8 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf8: count = 39
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f6 != f7 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf7: count = 39
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f5 != f6 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf6: count = 39
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f4 != f5 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf5: count = 39
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f3 != f4 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf4: count = 39
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f2 != f3 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf3: count = 39
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f1 != f2 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf2: count = 40
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f0 != f1 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf1: count = 39
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ef != f0 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf0: count = 39
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ed != ef = val
[ 0.000000] tsc: pit_expect_msb: val = 0xef: count = 13
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ed != ee = val
[ 0.000000] tsc: pit_expect_msb: val = 0xee: count = 0
[ 0.000000] tsc: quick_pit_calibrate: break in if !pit_expect_msb, i = 17
[ 0.000000] tsc: Fast TSC calibration failed, i = 17 from 233

[ 0.000000] DMI: HP HP EliteDesk 705 G4 MT/83E7, BIOS Q06 Ver. 02.04.01 09/14/2018
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ff != 0 = val
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fe != ff = val
[ 0.000000] tsc: pit_expect_msb: val = 0xff: count = 52
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fd != fe = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfe: count = 51
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fc != fd = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfd: count = 44
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fb != fc = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfc: count = 46
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = fa != fb = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfb: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f9 != fa = val
[ 0.000000] tsc: pit_expect_msb: val = 0xfa: count = 47
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f8 != f9 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf9: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f7 != f8 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf8: count = 47
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f6 != f7 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf7: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f5 != f6 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf6: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f4 != f5 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf5: count = 47
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f3 != f4 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf4: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f2 != f3 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf3: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f1 != f2 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf2: count = 47
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = f0 != f1 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf1: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ef != f0 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xf0: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ee != ef = val
[ 0.000000] tsc: pit_expect_msb: val = 0xef: count = 47
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ed != ee = val
[ 0.000000] tsc: pit_expect_msb: val = 0xee: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ec != ed = val
[ 0.000000] tsc: pit_expect_msb: val = 0xed: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = eb != ec = val
[ 0.000000] tsc: pit_expect_msb: val = 0xec: count = 47
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = ea != eb = val
[ 0.000000] tsc: pit_expect_msb: val = 0xeb: count = 48
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = e9 != ea = val
[ 0.000000] tsc: pit_expect_msb: val = 0xea: count = 47
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = e6 != e9 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xe9: count = 42
[ 0.000000] tsc: pit_verify_msb: inb(0x42) = e6 != e8 = val
[ 0.000000] tsc: pit_expect_msb: val = 0xe8: count = 0
[ 0.000000] tsc: quick_pit_calibrate: break in if !pit_expect_msb, i = 23
[ 0.000000] tsc: Fast TSC calibration failed, i = 23 from 233

So, assuming my patch has no side effects, you can see that at the point
of failure the value read is smaller than the expected value by more than
one, that is, the MSB skipped at least one step.

In the last boot, for example, the MSB jumps from e9 to e6, while the code
decrements the expected value by one and so expects e8 in the next iteration.
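
To put a number on such a jump: every MSB value that was skipped entirely
accounts for one full step of 256 PIT ticks. The sketch below turns one
`pit_verify_msb` log line into a lower bound for the time that passed
unobserved. It assumes a PIT clock of 1193182 Hz, and the helper name
`min_stall_ns` is made up for illustration.

```c
#include <stdint.h>

/* 256 PIT ticks at an assumed 1193182 Hz, in nanoseconds, rounded down. */
#define MSB_STEP_NS	214552ULL

/*
 * Hypothetical helper: lower bound, in nanoseconds, for the gap between
 * the last read that still returned `expected` and the read that
 * returned `seen`. The MSB counts down, so a regular transition
 * differs by exactly one and yields 0.
 */
static inline uint64_t min_stall_ns(unsigned char expected, unsigned char seen)
{
	/* Distance in MSB steps, modulo 256 because the counter wraps. */
	unsigned int distance = (unsigned char)(expected - seen);

	if (distance < 2)
		return 0;
	/* The values strictly between `expected` and `seen` elapsed unseen. */
	return (uint64_t)(distance - 1) * MSB_STEP_NS;
}
```

For the jump from e9 to e6 this gives two full steps, so at least about
429 us must have passed between the two reads.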

Of course, your explanation that an SMI could cause this could still
apply. Another explanation would be that the PIT(?) does not count down
at a constant rate.
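
The SMI hypothesis can at least be checked for plausibility in a toy model.
The sketch below is a user-space simulation, not kernel code: it models a
PIT counting down at a constant 1193182 Hz, charges an assumed 2.2 us per
port read (chosen only so that an undisturbed step yields counts in the
high 40s, as in the logs above), and optionally steals the CPU once for a
given time. All names (`struct sim`, `sim_*`) are hypothetical.

```c
#include <stdint.h>

#define PIT_TICK_RATE	1193182ULL	/* PIT input clock in Hz */
#define INB_NS		2200ULL		/* assumed cost of one port read */

/* Simulated machine state; every field is part of the toy model. */
struct sim {
	uint64_t now_ns;	/* simulated time since the PIT was started */
	uint64_t stall_at_ns;	/* time at which the CPU is stolen once */
	uint64_t stall_ns;	/* length of that stall, 0 for none */
	int stalled;		/* set once the stall has been applied */
};

/* Model of reading the counter MSB: advance time, then sample the PIT. */
static inline unsigned char sim_inb_msb(struct sim *s)
{
	uint64_t ticks;

	s->now_ns += INB_NS;
	if (s->stall_ns && !s->stalled && s->now_ns >= s->stall_at_ns) {
		s->now_ns += s->stall_ns;
		s->stalled = 1;
	}
	ticks = s->now_ns * PIT_TICK_RATE / 1000000000ULL;
	/* The counter starts at 0xffff and counts down. */
	return (unsigned char)((0xffffULL - ticks) >> 8);
}

/* Same shape as pit_verify_msb(): one ignored LSB read, one MSB read. */
static inline int sim_verify_msb(struct sim *s, unsigned char val)
{
	s->now_ns += INB_NS;	/* LSB read, value ignored */
	return sim_inb_msb(s) == val;
}

/* Same loop as pit_expect_msb(), but returns the raw count. */
static inline int sim_expect_msb(struct sim *s, unsigned char val)
{
	int count;

	for (count = 0; count < 50000; count++) {
		if (!sim_verify_msb(s, val))
			break;
	}
	return count;
}

/*
 * Walk the expected MSB down from 0xff for `steps` steps. Returns the
 * index of the first step with count <= 5, or -1 if every step passes.
 */
static inline int sim_first_failure(struct sim *s, int steps)
{
	int i;

	for (i = 0; i < steps; i++) {
		if (sim_expect_msb(s, (unsigned char)(0xff - i)) <= 5)
			return i;
	}
	return -1;
}
```

In this model, a run without a stall passes every step, while a single
stall of 500 us (longer than two MSB steps) always produces a failing step.
That only shows that a stall of this length is sufficient to explain the
logs. It says nothing about whether the PIT rate itself is stable.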

I remembered that there was also a TSC-related problem in coreboot,
which Google fixed with the commit below [2].

> soc/amd/stoneyridge: remove dependence on TSC
>
> The TSC rate is empirically swinging during early boot. That
> leaves timestamps and udelay()s to not be correct. To rectify this
> stop using TSC for all of these time sources. Instead use the
> performance TSC which is at a fixed 100MHz clock. That provides
> stable time sources and legit timestamps.
>
> BUG=b:72378235,b:72170796

Aaron replied in #[email protected]:

> I don't know that the lkml report applies to this ryzen, but my
> recollection of that bug on stoney was that TSC wasn't exactly constant
> rate until deeper into the boot flow. I think there's something that
> happens in SMU at a point in the boot that stabilizes the clock rate.
> Ryzen could be the same thing or a completely different bug all
> together.

The HP and MSI firmware takes more than eight seconds to boot, though,
compared to less than one second for coreboot, so it might be something else.

Unfortunately, I have neither the resources nor the knowledge to look into
this, and it looks to me like a hardware issue that only AMD can debug. Tom,
could you forward this to the appropriate departments at AMD?


Kind regards,

Paul


> [1] http://www.scs.stanford.edu/10wi-cs140/pintos/specs/8254.pdf
[2] https://review.coreboot.org/23424

>>>>>> It should work and we really don't want to add cpu family/model based
>>>>>> decisions whether we invoke something or not. Those tables are stale before
>>>>>> they hit mainline.
>>>>>
>>>>> Understood. If it’s supposed to work, any hints on how to debug this?
>>>>>
>>>>> Do any Linux kernel developers have an AMD Ryzen system on which they can
>>>>> reproduce the issue?
>>>>>
>>>>> It seems to fail with an AMD Ryzen 2400G too [1].
>>>>
>>>> We now have an HP EliteDesk 705 G4 MT with that processor, showing the same
>>>> problem.
>>>>
>>>> ```
>>>> [ 0.000000] Linux version 4.20.0.mx64.238 ([email protected]) (gcc version 7.3.0 (GCC)) #1 SMP Mon Dec 24 14:50:00 CET 2018
>>>> […]
>>>> [ 0.000000] NX (Execute Disable) protection: active
>>>> [ 0.000000] SMBIOS 3.1 present.
>>>> [ 0.000000] DMI: HP HP EliteDesk 705 G4 MT/83E7, BIOS Q06 Ver. 02.04.01 09/14/2018
>>>> [ 0.000000] tsc: Fast TSC calibration failed
>>>> [ 0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
>>>> […]
>>>> [ 0.017860] smpboot: CPU0: AMD Ryzen 5 PRO 2400G with Radeon Vega Graphics (family: 0x17, model: 0x11, stepping: 0x0)
>>>> […]
>>>> ```
>>>>
>>>>> It also fails on an AMD Ryzen 7 1700 [2].
>>>>>
>>>>> ```
>>>>> [ 0.000000] Linux version 4.15.0-kali3-amd64 ([email protected]) (gcc version 7.3.0 (Debian 7.3.0-16)) #1 SMP Debian 4.15.17-1kali1 (2018-04-25)
>>>>> […]
>>>>> [ 0.008000] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
>>>>> [ 0.028000] tsc: Fast TSC calibration failed
>>>>> [ 0.032000] tsc: PIT calibration matches HPET. 1 loops
>>>>> [ 0.032000] tsc: Detected 2994.246 MHz processor
>>>>> […]
>>>>> [ 0.044000] smpboot: CPU0: AMD Ryzen 7 1700 Eight-Core Processor (family: 0x17, model: 0x1, stepping: 0x1)
>>>>> ```
>>>>>
>>>>> It *works* here on one system with AMD Ryzen 5 PRO 1500 and Linux 4.14.87.
>>>>>
>>>>> ```
>>>>> [ 0.000000] Linux version 4.14.87.mx64.236 ([email protected]) (gcc version 7.3.0 (GCC)) #1 SMP Mon Dec 10 09:48:57 CET 2018
>>>>> […]
>>>>> [ 0.000000] tsc: Fast TSC calibration using PIT
>>>>> […]
>>>>> [ 0.035000] smpboot: CPU0: AMD Ryzen 5 PRO 1500 Quad-Core Processor (family: 0x17, model: 0x1, stepping: 0x1)
>>>>> ```
>>>>
>>>> How to continue from here? Is documentation for that available from AMD?
>>>> I didn’t find a BKDG (BIOS and Kernel Developer’s Guide) at [3].
>>
>>
>> Kind regards,
>>
>> Paul
>>
>>
>>>>> [1]: https://bbs.archlinux.org/viewtopic.php?pid=1781282#p1781282
>>>>> [2]: https://forums.kali.org/showthread.php?40444-error-loading-amdgpu-drivers-AMD-RX580-driver
>>>>> [3]: https://developer.amd.com/resources/developer-guides-manuals


Attachments:
0001-x86-kernel-tsc-Debug-early-TSC-calibration.patch (2.24 kB)
smime.p7s (5.05 kB)
S/MIME Cryptographic Signature