On Fri, 9 Dec 2016, Roland Scheidegger wrote:
>
> I saw some system lockups though:
> When doing a cold boot, this kernel never managed to boot up. The last
> message seen is:
> x86: Booting SMP configuration:
> .... node #0, CPUs: #1
Weird. That really would be interesting to figure out what goes wrong
there. What bothers me is that we don't see something like this:
> [ 0.172334] TSC ADJUST differs: Reference CPU0: -577421768610 CPU1:
> -577423766270
Can you please apply the debug patch below and provide the output ?
> [ 0.094492] x86: Booting SMP configuration:
> [ 0.094534] .... node #0, CPUs: #1
> [ 0.172334] TSC ADJUST differs: Reference CPU0: -577421768610 CPU1:
> -577423766270
What on earth is this BIOS doing? That's a couple of minutes back in time.
And the difference between the max and min adjust value is 2050932 cycles.
> Without the patches on cold boot it just was as expected:
> [ 0.093700] x86: Booting SMP configuration:
> [ 0.093737] .... node #0, CPUs: #1
> [ 0.174304] TSC synchronization [CPU#0 -> CPU#1]:
> [ 0.174375] Measured 1837188 cycles TSC warp between CPUs, turning
> off TSC clock.
Not surprising given the above numbers.
Thanks,
tglx
8<-------------------------
--- a/arch/x86/kernel/tsc_sync.c
+++ b/arch/x86/kernel/tsc_sync.c
@@ -95,6 +95,8 @@ bool tsc_store_and_check_tsc_adjust(void
cur->nextcheck = jiffies + HZ;
cur->warned = false;
+ pr_info("TSC ADJUST: CPU%u: %lld\n", cpu, bootval);
+
/*
* Check whether this CPU is the first in a package to come up. In
* this case do not check the boot value against another package
@@ -282,17 +284,22 @@ void check_tsc_sync_source(int cpu)
else
atomic_set(&test_runs, 3);
retry:
+ pr_info("TSC source sync %d -> %d runs %d\n", smp_processor_id(),
+ cpu, atomic_read(&test_runs));
/*
* Wait for the target to start or to skip the test:
*/
while (atomic_read(&start_count) != cpus - 1) {
if (atomic_read(&skip_test) > 0) {
+ pr_info("TSC source sync skipped\n");
atomic_set(&skip_test, 0);
return;
}
cpu_relax();
}
+ pr_info("TSC source sync starting\n");
+
/*
* Trigger the target to continue into the measurement too:
*/
@@ -300,6 +307,7 @@ void check_tsc_sync_source(int cpu)
check_tsc_warp(loop_timeout(cpu));
+ pr_info("TSC source sync wait for stop\n");
while (atomic_read(&stop_count) != cpus-1)
cpu_relax();
@@ -311,7 +319,7 @@ void check_tsc_sync_source(int cpu)
if (!nr_warps) {
atomic_set(&test_runs, 0);
- pr_debug("TSC synchronization [CPU#%d -> CPU#%d]: passed\n",
+ pr_info("TSC synchronization [CPU#%d -> CPU#%d]: passed\n",
smp_processor_id(), cpu);
} else if (atomic_dec_and_test(&test_runs) || random_warps) {
@@ -367,6 +375,7 @@ void check_tsc_sync_target(void)
* successful skip the test.
*/
if (tsc_store_and_check_tsc_adjust()) {
+ pr_info("TSC target sync skip\n");
atomic_inc(&skip_test);
return;
}
@@ -376,6 +385,7 @@ void check_tsc_sync_target(void)
* Register this CPU's participation and wait for the
* source CPU to start the measurement:
*/
+ pr_info("TSC target sync starting\n");
atomic_inc(&start_count);
while (atomic_read(&start_count) != cpus)
cpu_relax();
@@ -392,6 +402,7 @@ void check_tsc_sync_target(void)
*/
atomic_inc(&stop_count);
+ pr_info("TSC target sync stop\n");
/*
* Wait for the source CPU to print stuff:
*/
@@ -403,6 +414,8 @@ void check_tsc_sync_target(void)
*/
atomic_set(&stop_count, 0);
+ pr_info("TSC target sync remaining runs %d\n", atomic_read(&test_runs));
+
/*
* Check the number of remaining test runs. If not zero, the test
* failed and a retry with adjusted TSC is possible. If zero the
Am 09.12.2016 um 10:59 schrieb Thomas Gleixner:
> On Fri, 9 Dec 2016, Roland Scheidegger wrote:
>>
>> I saw some system lockups though:
>> When doing a cold boot, this kernel never managed to boot up. The last
>> message seen is:
>> x86: Booting SMP configuration:
>> .... node #0, CPUs: #1
>
> Weird. That really would be interesting to figure out what goes wrong
> there. What bothers me is that we don't see something like this:
>
>> [ 0.172334] TSC ADJUST differs: Reference CPU0: -577421768610 CPU1:
>> -577423766270
>
> Can you please apply the debug patch below and provide the output ?
Ok, this is the output (minus some typos maybe...):
x86: Booting SMP configuration:
.... node #0, CPUs: #1
TSC ADJUST: CPU1: -2806491604
TSC source sync 0 -> 1 runs 3
TSC ADJUST differs: Reference CPU0: -2805503200 CPU1: -2806491604
TSC ADJUST synchronize: Reference CPU0: -2805503200 CPU1: -2806491604
TSC target sync skip
TSC source sync skipped
And that's it.
>
>
>> [ 0.094492] x86: Booting SMP configuration:
>> [ 0.094534] .... node #0, CPUs: #1
>> [ 0.172334] TSC ADJUST differs: Reference CPU0: -577421768610 CPU1:
>> -577423766270
>
> What on earth is this BIOS doing? That's a couple of minutes back in time.
Looks like that's only after a reset, after a cold boot the numbers are
significantly smaller (1 sec or so?). Though the difference between
these two is nearly a million cycles too, so might be similar between
min and max for all.
>
> And the difference between the max and min adjust value is 2050932 cycles.
>
>> Without the patches on cold boot it just was as expected:
>> [ 0.093700] x86: Booting SMP configuration:
>> [ 0.093737] .... node #0, CPUs: #1
>> [ 0.174304] TSC synchronization [CPU#0 -> CPU#1]:
>> [ 0.174375] Measured 1837188 cycles TSC warp between CPUs, turning
>> off TSC clock.
>
> Not surprising given the above numbers.
>
Hope that helps,
Roland
On Fri, 9 Dec 2016, Roland Scheidegger wrote:
> Am 09.12.2016 um 10:59 schrieb Thomas Gleixner:
> > On Fri, 9 Dec 2016, Roland Scheidegger wrote:
> >>
> >> I saw some system lockups though:
> >> When doing a cold boot, this kernel never managed to boot up. The last
> >> message seen is:
> >> x86: Booting SMP configuration:
> >> .... node #0, CPUs: #1
> >
> > Weird. That really would be interesting to figure out what goes wrong
> > there. What bothers me is that we don't see something like this:
> >
> >> [ 0.172334] TSC ADJUST differs: Reference CPU0: -577421768610 CPU1:
> >> -577423766270
> >
> > Can you please apply the debug patch below and provide the output ?
> Ok, this is the output (minus some typos maybe...):
>
> x86: Booting SMP configuration:
> .... node #0, CPUs: #1
> TSC ADJUST: CPU1: -2806491604
> TSC source sync 0 -> 1 runs 3
> TSC ADJUST differs: Reference CPU0: -2805503200 CPU1: -2806491604
> TSC ADJUST synchronize: Reference CPU0: -2805503200 CPU1: -2806491604
> TSC target sync skip
> TSC source sync skipped
>
> And that's it.
That's a cold boot (power on), right?
That looks like I expected. As we know that the CPUs are in the same
package simply resynchronizing the TSC ADJUST MSR is enough. And the sync
test is skipped as we have resynced TSC ADJUST already to be the same
value.
What's unexpected is that the thing stops working :(
I'm pretty sure the boot cpu hangs in that endless loop waiting for CPU1 to
set the online bit, but for whatever reason this does not happen.
Can you add the patch below to gather more information? There is a hunk in
there with an '#if 0' which sets the TSC ADJUST to 0 on boot, which you can
turn on as second step.
This BIOS seems to be rather strange ...
Thanks.
tglx
----
--- a/arch/x86/kernel/tsc_sync.c
+++ b/arch/x86/kernel/tsc_sync.c
@@ -95,7 +95,7 @@ bool tsc_store_and_check_tsc_adjust(void
cur->nextcheck = jiffies + HZ;
cur->warned = false;
- pr_info("TSC ADJUST: CPU%u: %lld\n", cpu, bootval);
+ pr_info("TSC ADJUST: CPU%u: %lld %lld\n", cpu, bootval, rdtsc());
/*
* Check whether this CPU is the first in a package to come up. In
@@ -114,8 +114,15 @@ bool tsc_store_and_check_tsc_adjust(void
* the sync mechanism. If that fails we still can yell
* about boot values not being consistent.
*/
- cur->adjusted = bootval;
pr_info_once("TSC ADJUST: Boot CPU%u: %lld\n", cpu, bootval);
+#if 0
+ if (bootval) {
+ pr_warn("TSC ADJUST: CPU%u: Setting to 0\n", cpu);
+ bootval = 0;
+ wrmsrl(MSR_IA32_TSC_ADJUST, bootval);
+ }
+#endif
+ cur->adjusted = bootval;
return false;
}
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -245,7 +245,9 @@ static void notrace start_secondary(void
* from seeing a half valid vector space.
*/
lock_vector_lock();
+ pr_info("Vector locked\n");
setup_vector_irq(smp_processor_id());
+ pr_info("Vector setup done\n");
set_cpu_online(smp_processor_id(), true);
unlock_vector_lock();
cpu_set_state_online(smp_processor_id());
@@ -257,8 +259,10 @@ static void notrace start_secondary(void
/* to prevent fake stack check failure in clock setup */
boot_init_stack_canary();
+ pr_info("Clock setup\n");
x86_cpuinit.setup_percpu_clockev();
+ pr_info("Enter idle startup\n");
wmb();
cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
}
@@ -1133,7 +1137,7 @@ int native_cpu_up(unsigned int cpu, stru
cpu_relax();
touch_nmi_watchdog();
}
-
+ pr_info("Target CPU is online\n");
return 0;
}
Am 09.12.2016 um 18:33 schrieb Thomas Gleixner:
> On Fri, 9 Dec 2016, Roland Scheidegger wrote:
>> Am 09.12.2016 um 10:59 schrieb Thomas Gleixner:
>>> On Fri, 9 Dec 2016, Roland Scheidegger wrote:
>>>>
>>>> I saw some system lockups though:
>>>> When doing a cold boot, this kernel never managed to boot up. The last
>>>> message seen is:
>>>> x86: Booting SMP configuration:
>>>> .... node #0, CPUs: #1
>>>
>>> Weird. That really would be interesting to figure out what goes wrong
>>> there. What bothers me is that we don't see something like this:
>>>
>>>> [ 0.172334] TSC ADJUST differs: Reference CPU0: -577421768610 CPU1:
>>>> -577423766270
>>>
>>> Can you please apply the debug patch below and provide the output ?
>> Ok, this is the output (minus some typos maybe...):
>>
>> x86: Booting SMP configuration:
>> .... node #0, CPUs: #1
>> TSC ADJUST: CPU1: -2806491604
>> TSC source sync 0 -> 1 runs 3
>> TSC ADJUST differs: Reference CPU0: -2805503200 CPU1: -2806491604
>> TSC ADJUST synchronize: Reference CPU0: -2805503200 CPU1: -2806491604
>> TSC target sync skip
>> TSC source sync skipped
>>
>> And that's it.
>
> That's a cold boot (power on), right?
>
> That looks like I expected. As we know that the CPUs are in the same
> package simply resynchronizing the TSC ADJUST MSR is enough. And the sync
> test is skipped as we have resynced TSC ADJUST already to be the same
> value.
>
> What's unexpected is that the thing stops working :(
>
> I'm pretty sure the boot cpu hangs in that endless loop waiting for CPU1 to
> set the online bit, but for whatever reason this does not happen.
>
> Can you add the patch below to gather more information? There is a hunk in
> there with an '#if 0' which sets the TSC ADJUST to 0 on boot, which you can
> turn on as second step.
Ok, here's the results:
...
TSC ADJUST synchronize: Reference CPU0: -2820267100 CPU1: -2822498296
TSC target sync skipped
smpboot: Vector locked
smpboot: Vector setup done
smpboot: Clock setup
TSC source sync skipped
smpboot: Target CPU is online
With the #if 0 block activated, it boots up fine, the output was:
[ 1.038892] x86: Booting SMP configuration:
[ 1.038930] .... node #0, CPUs: #1
[ 0.171851] TSC ADJUST: CPU1: -2830353064 218577682002
[ 1.117495] TSC source sync 0 -> 1 runs 3
[ 0.171852] TSC ADJUST differs: Reference CPU0: -2828600940 CPU1:
-2830353064
[ 0.171853] TSC ADJUST synchronize: Reference CPU0: 0 CPU1: -2830353064
[ 1.117497] TSC target sync skip
[ 1.117497] smpboot: Vector locked
[ 1.117506] smpboot: Vector setup done
[ 1.117508] smpboot: Clock setup
[ 1.117510] smpboot: Enter idle startup
[ 1.117860] TSC source sync skipped
[ 1.117896] smpboot: Target CPU is online
[ 1.117990] #2
[ 0.252336] TSC ADJUST: CPU2: -2828948550 218818578394
[ 1.197511] TSC source sync 0 -> 2 runs 3
[ 0.252337] TSC ADJUST differs: Reference CPU0: -2828600940 CPU2:
-2828948550
[ 0.252338] TSC ADJUST synchronize: Reference CPU0: 0 CPU2: -2828948550
[ 1.197513] TSC target sync skip
[ 1.197513] smpboot: Vector locked
[ 1.197523] smpboot: Vector setup done
[ 1.197524] smpboot: Clock setup
[ 1.197527] smpboot: Enter idle startup
[ 1.197866] TSC source sync skipped
[ 1.197902] smpboot: Target CPU is online
[ 1.197997] #3
[ 0.332199] TSC ADJUST: CPU3: -2829409422 219057612986
[ 1.277528] TSC source sync 0 -> 3 runs 3
[ 0.332201] TSC ADJUST differs: Reference CPU0: -2828600940 CPU3:
-2829409422
[ 0.332202] TSC ADJUST synchronize: Reference CPU0: 0 CPU3: -2829409422
[ 1.277530] TSC target sync skip
[ 1.277530] smpboot: Vector locked
[ 1.277540] smpboot: Vector setup done
[ 1.277542] smpboot: Clock setup
[ 1.277544] smpboot: Enter idle startup
[ 1.277884] TSC source sync skipped
[ 1.277920] smpboot: Target CPU is online
[ 1.278016] #4
[ 0.411955] TSC ADJUST: CPU4: -2830194520 219296322250
[ 1.357546] TSC source sync 0 -> 4 runs 3
[ 0.411956] TSC ADJUST differs: Reference CPU0: -2828600940 CPU4:
-2830194520
[ 0.411957] TSC ADJUST synchronize: Reference CPU0: 0 CPU4: -2830194520
[ 1.357548] TSC target sync skip
[ 1.357548] smpboot: Vector locked
[ 1.357558] smpboot: Vector setup done
[ 1.357560] smpboot: Clock setup
[ 1.357562] smpboot: Enter idle startup
[ 1.357901] TSC source sync skipped
[ 1.357937] smpboot: Target CPU is online
[ 1.358030] #5
[ 0.491957] TSC ADJUST: CPU5: -2830235658 219535775620
[ 1.437562] TSC source sync 0 -> 5 runs 3
[ 0.491959] TSC ADJUST differs: Reference CPU0: -2828600940 CPU5:
-2830235658
[ 0.491960] TSC ADJUST synchronize: Reference CPU0: 0 CPU5: -2830235658
[ 1.437564] TSC target sync skip
[ 1.437565] smpboot: Vector locked
[ 1.437574] smpboot: Vector setup done
[ 1.437575] smpboot: Clock setup
[ 1.437578] smpboot: Enter idle startup
[ 1.437917] TSC source sync skipped
[ 1.437953] smpboot: Target CPU is online
[ 1.438047] #6
[ 0.571943] TSC ADJUST: CPU6: -2830332712 219775177676
[ 1.517581] TSC source sync 0 -> 6 runs 3
[ 0.571945] TSC ADJUST differs: Reference CPU0: -2828600940 CPU6:
-2830332712
[ 0.571946] TSC ADJUST synchronize: Reference CPU0: 0 CPU6: -2830332712
[ 1.517583] TSC target sync skip
[ 1.517583] smpboot: Vector locked
[ 1.517592] smpboot: Vector setup done
[ 1.517594] smpboot: Clock setup
[ 1.517597] smpboot: Enter idle startup
[ 1.517935] TSC source sync skipped
[ 1.517971] smpboot: Target CPU is online
[ 1.518066] #7
[ 0.652035] TSC ADJUST: CPU7: -2830107566 220014895498
[ 1.597597] TSC source sync 0 -> 7 runs 3
[ 0.652037] TSC ADJUST differs: Reference CPU0: -2828600940 CPU7:
-2830107566
[ 0.652037] TSC ADJUST synchronize: Reference CPU0: 0 CPU7: -2830107566
[ 1.597599] TSC target sync skip
[ 1.597600] smpboot: Vector locked
[ 1.597609] smpboot: Vector setup done
[ 1.597610] smpboot: Clock setup
[ 1.597613] smpboot: Enter idle startup
[ 1.597952] TSC source sync skipped
[ 1.597988] smpboot: Target CPU is online
[ 1.598083] #8
[ 0.732530] TSC ADJUST: CPU8: -2828717392 220255817924
[ 1.677628] TSC source sync 0 -> 8 runs 3
[ 0.732532] TSC ADJUST differs: Reference CPU0: -2828600940 CPU8:
-2828717392
[ 0.732532] TSC ADJUST synchronize: Reference CPU0: 0 CPU8: -2828717392
[ 1.677630] TSC target sync skip
[ 1.677630] smpboot: Vector locked
[ 1.677643] smpboot: Vector setup done
[ 1.677644] smpboot: Clock setup
[ 1.677647] smpboot: Enter idle startup
[ 1.677984] TSC source sync skipped
[ 1.678020] smpboot: Target CPU is online
[ 1.678123] #9
[ 0.812034] TSC ADJUST: CPU9: -2830240300 220493781182
[ 1.757641] TSC source sync 0 -> 9 runs 3
[ 0.812036] TSC ADJUST differs: Reference CPU0: -2828600940 CPU9:
-2830240300
[ 0.812036] TSC ADJUST synchronize: Reference CPU0: 0 CPU9: -2830240300
[ 1.757642] TSC target sync skip
[ 1.757643] smpboot: Vector locked
[ 1.757652] smpboot: Vector setup done
[ 1.757654] smpboot: Clock setup
[ 1.757656] smpboot: Enter idle startup
[ 1.757996] TSC source sync skipped
[ 1.758032] smpboot: Target CPU is online
[ 1.758130] #10
[ 0.892439] TSC ADJUST: CPU10: -2829079162 220734436186
[ 1.837658] TSC source sync 0 -> 10 runs 3
[ 0.892441] TSC ADJUST differs: Reference CPU0: -2828600940 CPU10:
-2829079162
[ 0.892441] TSC ADJUST synchronize: Reference CPU0: 0 CPU10: -2829079162
[ 1.837659] TSC target sync skip
[ 1.837660] smpboot: Vector locked
[ 1.837669] smpboot: Vector setup done
[ 1.837671] smpboot: Clock setup
[ 1.837673] smpboot: Enter idle startup
[ 1.838023] TSC source sync skipped
[ 1.838059] smpboot: Target CPU is online
[ 1.838155] #11
[ 0.972104] TSC ADJUST: CPU11: -2830133522 220972875474
[ 1.917674] TSC source sync 0 -> 11 runs 3
[ 0.972105] TSC ADJUST differs: Reference CPU0: -2828600940 CPU11:
-2830133522
[ 0.972106] TSC ADJUST synchronize: Reference CPU0: 0 CPU11: -2830133522
[ 1.917676] TSC target sync skip
[ 1.917677] smpboot: Vector locked
[ 1.917685] smpboot: Vector setup done
[ 1.917686] smpboot: Clock setup
[ 1.917689] smpboot: Enter idle startup
[ 1.918039] TSC source sync skipped
[ 1.918075] smpboot: Target CPU is online
[ 1.918171] #12
[ 1.052297] TSC ADJUST: CPU12: -2829606124 221212897186
[ 1.997691] TSC source sync 0 -> 12 runs 3
[ 1.052298] TSC ADJUST differs: Reference CPU0: -2828600940 CPU12:
-2829606124
[ 1.052299] TSC ADJUST synchronize: Reference CPU0: 0 CPU12: -2829606124
[ 1.997693] TSC target sync skip
[ 1.997693] smpboot: Vector locked
[ 1.997702] smpboot: Vector setup done
[ 1.997704] smpboot: Clock setup
[ 1.997706] smpboot: Enter idle startup
[ 1.998057] TSC source sync skipped
[ 1.998093] smpboot: Target CPU is online
[ 1.998189] #13
[ 1.132267] TSC ADJUST: CPU13: -2829742434 221452252884
[ 2.077708] TSC source sync 0 -> 13 runs 3
[ 1.132269] TSC ADJUST differs: Reference CPU0: -2828600940 CPU13:
-2829742434
[ 1.132270] TSC ADJUST synchronize: Reference CPU0: 0 CPU13: -2829742434
[ 2.077709] TSC target sync skip
[ 2.077710] smpboot: Vector locked
[ 2.077719] smpboot: Vector setup done
[ 2.077720] smpboot: Clock setup
[ 2.077722] smpboot: Enter idle startup
[ 2.078073] TSC source sync skipped
[ 2.078109] smpboot: Target CPU is online
[ 2.078203] #14
[ 1.212227] TSC ADJUST: CPU14: -2829914872 221691574212
[ 2.157724] TSC source sync 0 -> 14 runs 3
[ 1.212228] TSC ADJUST differs: Reference CPU0: -2828600940 CPU14:
-2829914872
[ 1.212229] TSC ADJUST synchronize: Reference CPU0: 0 CPU14: -2829914872
[ 2.157726] TSC target sync skip
[ 2.157726] smpboot: Vector locked
[ 2.157735] smpboot: Vector setup done
[ 2.157737] smpboot: Clock setup
[ 2.157740] smpboot: Enter idle startup
[ 2.159900] TSC source sync skipped
[ 2.159936] smpboot: Target CPU is online
[ 2.160032] #15
[ 1.292113] TSC ADJUST: CPU15: -2830310478 221930677086
[ 2.237743] TSC source sync 0 -> 15 runs 3
[ 1.292115] TSC ADJUST differs: Reference CPU0: -2828600940 CPU15:
-2830310478
[ 1.292115] TSC ADJUST synchronize: Reference CPU0: 0 CPU15: -2830310478
[ 2.237745] TSC target sync skip
[ 2.237745] smpboot: Vector locked
[ 2.237755] smpboot: Vector setup done
[ 2.237756] smpboot: Clock setup
[ 2.237759] smpboot: Enter idle startup
[ 2.238112] TSC source sync skipped
[ 2.238148] smpboot: Target CPU is online
[ 2.238186] x86: Booted up 1 node, 16 CPUs
[ 2.238224] smpboot: Total of 16 processors activated (95793.84 BogoMIPS)
(And fwiw with my quick hack the lockups disappear to when I change that
back to blast a zero into TSC_ADJ for all cpus.)
The system also came back up fine from suspend with this (well - still
minus graphics...), however disabled tsc clocksource:
[ 579.931739] Enabling non-boot CPUs ...
[ 579.943106] x86: Booting SMP configuration:
[ 579.943107] smpboot: Booting Node 0 Processor 1 APIC 0x2
[ 579.943189] TSC ADJUST: CPU1: -1504429974 21601834126
[ 579.943837] TSC source sync 0 -> 1 runs 3
[ 579.943191] TSC ADJUST differs: Reference CPU0: -2828600940 CPU1:
-1504429974
[ 579.943191] TSC ADJUST synchronize: Reference CPU0: 0 CPU1: -1504429974
[ 580.445833] TSC target sync skip
[ 580.445834] smpboot: Vector locked
[ 579.943839] TSC source sync skipped
[ 580.445845] smpboot: Vector setup done
[ 579.943850] smpboot: Target CPU is online
[ 580.445847] smpboot: Clock setup
[ 580.445852] smpboot: Enter idle startup
[ 580.445914] cache: parent cpu1 should not be sleeping
[ 579.944093] CPU1 is up
[ 580.458983] clocksource: timekeeping watchdog on CPU1: Marking
clocksource 'tsc' as unstable because the skew is too large:
[ 580.458985] clocksource: 'hpet' wd_now: 587c1
wd_last: 437c7 mask: ffffffff
[ 580.458986] clocksource: 'tsc' cs_now:
563963cd8 cs_last: 508f5a02a mask: ffffffffffffffff
[ 579.959249] smpboot: Booting Node 0 Processor 2 APIC 0x4
[ 579.959454] TSC ADJUST: CPU2: -1504299836 21650515856
[ 579.960058] TSC source sync 0 -> 2 runs 3
[ 579.959456] TSC ADJUST differs: Reference CPU0: -2828600940 CPU2:
-1504299836
[ 579.959456] TSC ADJUST synchronize: Reference CPU0: 0 CPU2: -1504299836
[ 580.462055] TSC target sync skip
[ 579.960060] TSC source sync skipped
[ 580.462056] smpboot: Vector locked
[ 580.462067] smpboot: Vector setup done
[ 579.960072] smpboot: Target CPU is online
[ 580.462068] smpboot: Clock setup
[ 580.462071] smpboot: Enter idle startup
[ 580.462136] cache: parent cpu2 should not be sleeping
[ 579.960327] CPU2 is up
[ 580.503230] smpboot: Booting Node 0 Processor 3 APIC 0x6
[ 580.503782] TSC ADJUST: CPU3: -1504357188 23279711288
[ 580.504405] TSC source sync 0 -> 3 runs 3
[ 580.503784] TSC ADJUST differs: Reference CPU0: -2828600940 CPU3:
-1504357188
[ 580.503784] TSC ADJUST synchronize: Reference CPU0: 0 CPU3: -1504357188
[ 581.006402] TSC target sync skip
[ 581.006403] smpboot: Vector locked
[ 580.504408] TSC source sync skipped
[ 581.006416] smpboot: Vector setup done
[ 580.504421] smpboot: Target CPU is online
[ 581.006418] smpboot: Clock setup
[ 581.006424] smpboot: Enter idle startup
[ 581.006506] cache: parent cpu3 should not be sleeping
[ 580.504741] CPU3 is up
[ 581.006760] [Firmware Bug]: TSC ADJUST differs: CPU0 0 -->
-1502494750. Restoring
[ 581.019271] smpboot: Booting Node 0 Processor 4 APIC 0x8
[ 580.517802] TSC ADJUST: CPU4: -1504419274 23321675136
[ 581.020442] TSC source sync 0 -> 4 runs 3
[ 580.517804] TSC ADJUST differs: Reference CPU0: -2828600940 CPU4:
-1504419274
[ 580.517805] TSC ADJUST synchronize: Reference CPU0: 0 CPU4: -1504419274
[ 581.020444] TSC target sync skip
[ 581.020444] TSC source sync skipped
[ 581.020445] smpboot: Vector locked
[ 581.020459] smpboot: Vector setup done
[ 581.020459] smpboot: Target CPU is online
[ 581.020461] smpboot: Clock setup
[ 581.020465] smpboot: Enter idle startup
[ 581.020548] cache: parent cpu4 should not be sleeping
[ 581.020785] CPU4 is up
[ 581.035383] smpboot: Booting Node 0 Processor 5 APIC 0xa
[ 580.534095] TSC ADJUST: CPU5: -1503892568 23370438110
[ 581.036558] TSC source sync 0 -> 5 runs 3
[ 580.534096] TSC ADJUST differs: Reference CPU0: -2828600940 CPU5:
-1503892568
[ 580.534097] TSC ADJUST synchronize: Reference CPU0: 0 CPU5: -1503892568
[ 581.036560] TSC target sync skip
[ 581.036560] TSC source sync skipped
[ 581.036561] smpboot: Vector locked
[ 581.036574] smpboot: Vector setup done
[ 581.036575] smpboot: Target CPU is online
[ 581.036576] smpboot: Clock setup
[ 581.036580] smpboot: Enter idle startup
[ 581.036664] cache: parent cpu5 should not be sleeping
[ 581.036913] CPU5 is up
[ 581.051460] smpboot: Booting Node 0 Processor 6 APIC 0xc
[ 580.550097] TSC ADJUST: CPU6: -1504107826 23418332356
[ 581.052632] TSC source sync 0 -> 6 runs 3
[ 580.550098] TSC ADJUST differs: Reference CPU0: -2828600940 CPU6:
-1504107826
[ 580.550099] TSC ADJUST synchronize: Reference CPU0: 0 CPU6: -1504107826
[ 581.052634] TSC target sync skip
[ 581.052634] smpboot: Vector locked
[ 581.052634] TSC source sync skipped
[ 581.052647] smpboot: Vector setup done
[ 581.052648] smpboot: Target CPU is online
[ 581.052650] smpboot: Clock setup
[ 581.052654] smpboot: Enter idle startup
[ 581.052738] cache: parent cpu6 should not be sleeping
[ 581.053002] CPU6 is up
[ 581.071544] smpboot: Booting Node 0 Processor 7 APIC 0xe
[ 580.570040] TSC ADJUST: CPU7: -1504535440 23478024002
[ 581.072718] TSC source sync 1 -> 7 runs 3
[ 580.570041] TSC ADJUST differs: Reference CPU0: -2828600940 CPU7:
-1504535440
[ 580.570042] TSC ADJUST synchronize: Reference CPU0: 0 CPU7: -1504535440
[ 581.072720] TSC target sync skip
[ 581.072720] smpboot: Vector locked
[ 581.072720] TSC source sync skipped
[ 581.072733] smpboot: Vector setup done
[ 581.072734] smpboot: Target CPU is online
[ 581.072736] smpboot: Clock setup
[ 581.072740] smpboot: Enter idle startup
[ 581.072822] cache: parent cpu7 should not be sleeping
[ 581.073092] CPU7 is up
[ 581.095281] smpboot: Booting Node 0 Processor 8 APIC 0x1
[ 580.594433] TSC ADJUST: CPU8: -1502571034 23551035204
[ 581.096455] TSC source sync 1 -> 8 runs 3
[ 580.594435] TSC ADJUST differs: Reference CPU0: -2828600940 CPU8:
-1502571034
[ 580.594436] TSC ADJUST synchronize: Reference CPU0: 0 CPU8: -1502571034
[ 581.096457] TSC target sync skip
[ 581.096457] smpboot: Vector locked
[ 581.096458] TSC source sync skipped
[ 581.096470] smpboot: Vector setup done
[ 581.096470] smpboot: Target CPU is online
[ 581.096472] smpboot: Clock setup
[ 581.096476] smpboot: Enter idle startup
[ 581.096593] cache: parent cpu8 should not be sleeping
[ 581.096881] CPU8 is up
[ 581.111327] smpboot: Booting Node 0 Processor 9 APIC 0x3
[ 580.609809] TSC ADJUST: CPU9: -1504556458 23597051168
[ 581.112494] TSC source sync 1 -> 9 runs 3
[ 580.609810] TSC ADJUST differs: Reference CPU0: -2828600940 CPU9:
-1504556458
[ 580.609811] TSC ADJUST synchronize: Reference CPU0: 0 CPU9: -1504556458
[ 581.112496] TSC target sync skip
[ 581.112496] TSC source sync skipped
[ 581.112496] smpboot: Vector locked
[ 581.112515] smpboot: Vector setup done
[ 581.112515] smpboot: Target CPU is online
[ 581.112518] smpboot: Clock setup
[ 581.112522] smpboot: Enter idle startup
[ 581.112635] cache: parent cpu9 should not be sleeping
[ 581.112919] CPU9 is up
[ 581.127367] smpboot: Booting Node 0 Processor 10 APIC 0x5
[ 580.626298] TSC ADJUST: CPU10: -1503281384 23646406682
[ 581.128558] TSC source sync 1 -> 10 runs 3
[ 580.626300] TSC ADJUST differs: Reference CPU0: -2828600940 CPU10:
-1503281384
[ 580.626301] TSC ADJUST synchronize: Reference CPU0: 0 CPU10: -1503281384
[ 581.128559] TSC target sync skip
[ 581.128560] TSC source sync skipped
[ 581.128560] smpboot: Vector locked
[ 581.128573] smpboot: Vector setup done
[ 581.128573] smpboot: Target CPU is online
[ 581.128575] smpboot: Clock setup
[ 581.128579] smpboot: Enter idle startup
[ 581.128689] cache: parent cpu10 should not be sleeping
[ 581.128997] CPU10 is up
[ 581.143754] smpboot: Booting Node 0 Processor 11 APIC 0x7
[ 580.642580] TSC ADJUST: CPU11: -1503566952 23695138352
[ 581.144935] TSC source sync 1 -> 11 runs 3
[ 580.642582] TSC ADJUST differs: Reference CPU0: -2828600940 CPU11:
-1503566952
[ 580.642583] TSC ADJUST synchronize: Reference CPU0: 0 CPU11: -1503566952
[ 581.144937] TSC target sync skip
[ 581.144937] smpboot: Vector locked
[ 581.144938] TSC source sync skipped
[ 581.144950] smpboot: Vector setup done
[ 581.144950] smpboot: Target CPU is online
[ 581.144952] smpboot: Clock setup
[ 581.144956] smpboot: Enter idle startup
[ 581.145069] cache: parent cpu11 should not be sleeping
[ 581.145493] CPU11 is up
[ 581.163769] smpboot: Booting Node 0 Processor 12 APIC 0x9
[ 580.662529] TSC ADJUST: CPU12: -1503752538 23754846788
[ 581.164945] TSC source sync 1 -> 12 runs 3
[ 580.662531] TSC ADJUST differs: Reference CPU0: -2828600940 CPU12:
-1503752538
[ 580.662531] TSC ADJUST synchronize: Reference CPU0: 0 CPU12: -1503752538
[ 581.164947] TSC target sync skip
[ 581.164948] TSC source sync skipped
[ 581.164949] smpboot: Vector locked
[ 581.164961] smpboot: Vector setup done
[ 581.164962] smpboot: Target CPU is online
[ 581.164963] smpboot: Clock setup
[ 581.164967] smpboot: Enter idle startup
[ 581.165084] cache: parent cpu12 should not be sleeping
[ 581.165564] CPU12 is up
[ 581.191866] smpboot: Booting Node 0 Processor 13 APIC 0xb
[ 580.690388] TSC ADJUST: CPU13: -1504470824 23838230778
[ 581.193045] TSC source sync 1 -> 13 runs 3
[ 580.690390] TSC ADJUST differs: Reference CPU0: -2828600940 CPU13:
-1504470824
[ 580.690391] TSC ADJUST synchronize: Reference CPU0: 0 CPU13: -1504470824
[ 581.193047] TSC target sync skip
[ 581.193047] smpboot: Vector locked
[ 581.193048] TSC source sync skipped
[ 581.193060] smpboot: Vector setup done
[ 581.193061] smpboot: Target CPU is online
[ 581.193063] smpboot: Clock setup
[ 581.193066] smpboot: Enter idle startup
[ 581.193184] cache: parent cpu13 should not be sleeping
[ 581.193637] CPU13 is up
[ 581.215919] smpboot: Booting Node 0 Processor 14 APIC 0xd
[ 580.714423] TSC ADJUST: CPU14: -1504515582 23910167978
[ 581.217095] TSC source sync 1 -> 14 runs 3
[ 580.714425] TSC ADJUST differs: Reference CPU0: -2828600940 CPU14:
-1504515582
[ 580.714426] TSC ADJUST synchronize: Reference CPU0: 0 CPU14: -1504515582
[ 581.217097] TSC target sync skip
[ 581.217097] smpboot: Vector locked
[ 581.217098] TSC source sync skipped
[ 581.217110] smpboot: Vector setup done
[ 581.217111] smpboot: Target CPU is online
[ 581.217113] smpboot: Clock setup
[ 581.217116] smpboot: Enter idle startup
[ 581.217234] cache: parent cpu14 should not be sleeping
[ 581.217706] CPU14 is up
[ 581.251975] smpboot: Booting Node 0 Processor 15 APIC 0xf
[ 580.750545] TSC ADJUST: CPU15: -1504326376 24018283718
[ 581.253154] TSC source sync 1 -> 15 runs 3
[ 580.750547] TSC ADJUST differs: Reference CPU0: -2828600940 CPU15:
-1504326376
[ 580.750548] TSC ADJUST synchronize: Reference CPU0: 0 CPU15: -1504326376
[ 581.253156] TSC target sync skip
[ 581.253156] TSC source sync skipped
[ 581.253157] smpboot: Vector locked
[ 581.253169] smpboot: Vector setup done
[ 581.253169] smpboot: Target CPU is online
[ 581.253171] smpboot: Clock setup
[ 581.253175] smpboot: Enter idle startup
[ 581.253294] cache: parent cpu15 should not be sleeping
[ 581.253774] CPU15 is up
[ 581.308954] ACPI: Waking up from system sleep state S3
[ 581.308984] clocksource: Switched to clocksource hpet
>
> This BIOS seems to be rather strange ...
Don't tell me...
No idea what it's doing, but I think it's safe to say whatever it's
trying to do, it's doing it wrong...
Roland
On Fri, 9 Dec 2016, Roland Scheidegger wrote:
Cc'ed someone from Dell.
> Am 09.12.2016 um 18:33 schrieb Thomas Gleixner:
> > Can you add the patch below to gather more information? There is a hunk in
> > there with an '#if 0' which sets the TSC ADJUST to 0 on boot, which you can
> > turn on as second step.
>
> Ok, here's the results:
> ...
> TSC ADJUST synchronize: Reference CPU0: -2820267100 CPU1: -2822498296
> TSC target sync skipped
> smpboot: Vector locked
> smpboot: Vector setup done
> smpboot: Clock setup
> TSC source sync skipped
> smpboot: Target CPU is online
I did not expect that to happen. Now I'm puzzled and curious where the
machine gets lost after that. See below.
> With the #if 0 block activated, it boots up fine, the output was:
That does not make any sense at all, but yes, nothing in this context makes
sense.
> [ 1.038892] x86: Booting SMP configuration:
> [ 1.038930] .... node #0, CPUs: #1
> [ 0.171851] TSC ADJUST: CPU1: -2830353064 218577682002
> [ 1.117495] TSC source sync 0 -> 1 runs 3
> [ 0.171852] TSC ADJUST differs: Reference CPU0: -2828600940 CPU1:
> -2830353064
> [ 0.171853] TSC ADJUST synchronize: Reference CPU0: 0 CPU1: -2830353064
> [ 1.117497] TSC target sync skip
> (And fwiw with my quick hack the lockups disappear to when I change that
> back to blast a zero into TSC_ADJ for all cpus.)
Right, That's what that hunk does as well.
Now what's interesting is that the adjustement of CPU1 in the non write to
zero case results in the following:
TSC ADJUST: CPU1: -2830353064 218577682002 <-- TSC value
TSC ADJUST differs: Reference CPU0: -2828600940 CPU1: -2830353064
We write CPU1 adjust register to -2828600940 which makes the TSC on CPU1
jump forwards by -2828600940 - -2830353064 = 1752124 cycles.
In the write to zero case the jump is forward as well, but this time it's
huge, i.e. 2830353064 cycles.
I tried to wreckage the TSC by writing similar values to the adjust MSR on
early boot, but independent of the values and independent of the write to
zero part the machine comes up happily all the time.
The only difference is that my machine has a somewhat saner BIOS. So the
thing might just die in the value add SMM crap, but who knows.
In the patch below is another bunch of debug prints which emit the state
information of CPU1 during bringup. Maybe that gives a hint where the
system gets stuck when you disable the 'write to zero' magic again.
The NMI watchdog does not catch anything, right?
> The system also came back up fine from suspend with this (well - still
> minus graphics...), however disabled tsc clocksource:
>
> [ 579.931739] Enabling non-boot CPUs ...
> [ 579.943107] smpboot: Booting Node 0 Processor 1 APIC 0x2
> [ 579.943189] TSC ADJUST: CPU1: -1504429974 21601834126
Fun, yet another adjust value. Are they set by a random number generator?
> [ 579.944093] CPU1 is up
> [ 580.458983] clocksource: timekeeping watchdog on CPU1: Marking
> clocksource 'tsc' as unstable because the skew is too large:
> [ 580.458985] clocksource: 'hpet' wd_now: 587c1
> wd_last: 437c7 mask: ffffffff
> [ 580.458986] clocksource: 'tsc' cs_now:
> 563963cd8 cs_last: 508f5a02a mask: ffffffffffffffff
Ok, that's caused by the fact that we do not sanitize the TSC adjust
register on the boot CPU in the resume path.
> [ 581.006760] [Firmware Bug]: TSC ADJUST differs: CPU0 0 -->
> -1502494750. Restoring
We only detect it later and correct it, but that's too late. The untested
patch below should cure that.
> > This BIOS seems to be rather strange ...
> Don't tell me...
>
> No idea what it's doing, but I think it's safe to say whatever it's
> trying to do, it's doing it wrong...
Amen to that. I've seen a lot of Value Add BIOSes which broke things left
and right, but this one takes it to a new level of insanity.
Did you report that back to DELL already?
Thanks,
tglx
8<--------------------
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -47,7 +47,7 @@ extern int tsc_clocksource_reliable;
*/
#ifdef CONFIG_X86_TSC
extern bool tsc_store_and_check_tsc_adjust(void);
-extern void tsc_verify_tsc_adjust(void);
+extern void tsc_verify_tsc_adjust(bool resume);
extern void check_tsc_sync_source(int cpu);
extern void check_tsc_sync_target(void);
#else
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -277,7 +277,7 @@ void exit_idle(void)
void arch_cpu_idle_enter(void)
{
- tsc_verify_tsc_adjust();
+ tsc_verify_tsc_adjust(false);
local_touch_nmi();
enter_idle();
}
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1080,6 +1080,11 @@ static void detect_art(void)
static struct clocksource clocksource_tsc;
+static void tsc_resume(struct clocksource *cs)
+{
+ tsc_verify_tsc_adjust(true);
+}
+
/*
* We used to compare the TSC to the cycle_last value in the clocksource
* structure to avoid a nasty time-warp. This can be observed in a
@@ -1112,6 +1117,7 @@ static struct clocksource clocksource_ts
.flags = CLOCK_SOURCE_IS_CONTINUOUS |
CLOCK_SOURCE_MUST_VERIFY,
.archdata = { .vclock_mode = VCLOCK_TSC },
+ .resume = tsc_resume,
};
void mark_tsc_unstable(char *reason)
--- a/arch/x86/kernel/tsc_sync.c
+++ b/arch/x86/kernel/tsc_sync.c
@@ -30,7 +30,7 @@ struct tsc_adjust {
static DEFINE_PER_CPU(struct tsc_adjust, tsc_adjust);
-void tsc_verify_tsc_adjust(void)
+void tsc_verify_tsc_adjust(bool resume)
{
struct tsc_adjust *adj = this_cpu_ptr(&tsc_adjust);
s64 curval;
@@ -39,7 +39,7 @@ void tsc_verify_tsc_adjust(void)
return;
/* Rate limit the MSR check */
- if (time_before(jiffies, adj->nextcheck))
+ if (!resume && time_before(jiffies, adj->nextcheck))
return;
adj->nextcheck = jiffies + HZ;
@@ -51,7 +51,7 @@ void tsc_verify_tsc_adjust(void)
/* Restore the original value */
wrmsrl(MSR_IA32_TSC_ADJUST, adj->adjusted);
- if (!adj->warned) {
+ if (!adj->warned || resume) {
pr_warn(FW_BUG "TSC ADJUST differs: CPU%u %lld --> %lld. Restoring\n",
smp_processor_id(), adj->adjusted, curval);
adj->warned = true;
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -256,6 +256,7 @@ static void notrace __restore_processor_
mtrr_bp_restore();
perf_restore_debug_store();
msr_restore_context(ctxt);
+ tsc_verify_tsc_adjust(true);
}
/* Needed by apm.c */
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -133,7 +133,9 @@ static int cpuhp_invoke_callback(unsigne
if (!cb)
return 0;
trace_cpuhp_enter(cpu, st->target, state, cb);
+ pr_info("Invoking state %d CB %pF\n", state, cb);
ret = cb(cpu);
+ pr_info("CB returned %d\n", ret);
trace_cpuhp_exit(cpu, st->state, state, ret);
return ret;
}
@@ -153,7 +155,10 @@ static int cpuhp_invoke_callback(unsigne
cnt = 0;
hlist_for_each(node, &step->list) {
trace_cpuhp_multi_enter(cpu, st->target, state, cbm, node);
+ pr_info("Invoking state %d CB %pF instance %p\n", state, cbm,
+ node);
ret = cbm(cpu, node);
+ pr_info("CB returned %d\n", ret);
trace_cpuhp_exit(cpu, st->state, state, ret);
if (ret)
goto err;
@@ -550,6 +555,7 @@ static void cpuhp_thread_fun(unsigned in
st->should_run = false;
+ pr_info("CPUHP thread running\n");
/* Single callback invocation for [un]install ? */
if (st->single) {
if (st->cb_state < CPUHP_AP_ONLINE) {
Am 09.12.2016 um 23:59 schrieb Thomas Gleixner:
> On Fri, 9 Dec 2016, Roland Scheidegger wrote:
>
> Cc'ed someone from Dell.
>
>> Am 09.12.2016 um 18:33 schrieb Thomas Gleixner:
>>> Can you add the patch below to gather more information? There is a hunk in
>>> there with an '#if 0' which sets the TSC ADJUST to 0 on boot, which you can
>>> turn on as second step.
>>
>> Ok, here's the results:
>> ...
>> TSC ADJUST synchronize: Reference CPU0: -2820267100 CPU1: -2822498296
>> TSC target sync skipped
>> smpboot: Vector locked
>> smpboot: Vector setup done
>> smpboot: Clock setup
>> TSC source sync skipped
>> smpboot: Target CPU is online
>
> I did not expect that to happen. Now I'm puzzled and curious where the
> machine gets lost after that. See below.
>
>> With the #if 0 block activated, it boots up fine, the output was:
>
> That does not make any sense at all, but yes, nothing in this context makes
> sense.
>
>> [ 1.038892] x86: Booting SMP configuration:
>> [ 1.038930] .... node #0, CPUs: #1
>> [ 0.171851] TSC ADJUST: CPU1: -2830353064 218577682002
>> [ 1.117495] TSC source sync 0 -> 1 runs 3
>> [ 0.171852] TSC ADJUST differs: Reference CPU0: -2828600940 CPU1:
>> -2830353064
>> [ 0.171853] TSC ADJUST synchronize: Reference CPU0: 0 CPU1: -2830353064
>> [ 1.117497] TSC target sync skip
>
>> (And fwiw with my quick hack the lockups disappear to when I change that
>> back to blast a zero into TSC_ADJ for all cpus.)
>
> Right, That's what that hunk does as well.
>
> Now what's interesting is that the adjustement of CPU1 in the non write to
> zero case results in the following:
>
> TSC ADJUST: CPU1: -2830353064 218577682002 <-- TSC value
> TSC ADJUST differs: Reference CPU0: -2828600940 CPU1: -2830353064
>
> We write CPU1 adjust register to -2828600940 which makes the TSC on CPU1
> jump forwards by -2828600940 - -2830353064 = 1752124 cycles.
>
> In the write to zero case the jump is forward as well, but this time it's
> huge, i.e. 2830353064 cycles.
>
> I tried to wreckage the TSC by writing similar values to the adjust MSR on
> early boot, but independent of the values and independent of the write to
> zero part the machine comes up happily all the time.
>
> The only difference is that my machine has a somewhat saner BIOS. So the
> thing might just die in the value add SMM crap, but who knows.
>
> In the patch below is another bunch of debug prints which emit the state
> information of CPU1 during bringup. Maybe that gives a hint where the
> system gets stuck when you disable the 'write to zero' magic again.
>
> The NMI watchdog does not catch anything, right?
Nope. (Though as mentioned earlier, with my hack when not writing zero
it did - but the lockup there was later after all 16 cpus were online,
and I only really tried that with the ubuntu 4.4 kernel. I never got to
see the full output from that NMI though due to limited screen space, my
attempts to try anything different than text mode were met with a blank
screen, and from the parts I did see I didn't really see anything
interesting albeit that's not saying much as I really have no idea about
that code...)
With the new patch here's the output (albeit the typing gets a bit
annoying...)
...
Invoking state 32 CB replay_prepare_cpu+0x0/0xe0
CB returned 0
Invoking state 35 CB rcutree_prepare_cpu+0x0/0x50
CB returned 0
Invoking state 41 CB notify_prepare+0x0/0xa0
CBreturned 0
Invoking state 48 CB bringup_cpu+0x0/0x90
x86: Booting SMP configuration:
.... node #0, CPUs: #1
Invoking state 51 CB sched_cpu_starting+0x0/0x60
CB returned 0
Invoking state 62 CB x86_pmu_starting_cpu+0x0/0x20
CB returned 0
TSC ADJUST: CPU1: -2846131604 175264328618
TSC ADJUST differs: Reference CPU0: -2843967660 CPU1: -2846131604
TSC ADJUST synchronize: Reference CPU0: -2843967660 CPU1: -2846131604
TSC target sync skip
smpboot: Vector locked
TSC source sync 0 -> 1 runs 3
smpboot: Vector setup done
smpboot: Clock setup
TSC source sync skipped
smpboot: Target CPU is online
>
>> The system also came back up fine from suspend with this (well - still
>> minus graphics...), however disabled tsc clocksource:
>>
>> [ 579.931739] Enabling non-boot CPUs ...
>> [ 579.943107] smpboot: Booting Node 0 Processor 1 APIC 0x2
>> [ 579.943189] TSC ADJUST: CPU1: -1504429974 21601834126
>
> Fun, yet another adjust value. Are they set by a random number generator?
>
>> [ 579.944093] CPU1 is up
>
>> [ 580.458983] clocksource: timekeeping watchdog on CPU1: Marking
>> clocksource 'tsc' as unstable because the skew is too large:
>> [ 580.458985] clocksource: 'hpet' wd_now: 587c1
>> wd_last: 437c7 mask: ffffffff
>> [ 580.458986] clocksource: 'tsc' cs_now:
>> 563963cd8 cs_last: 508f5a02a mask: ffffffffffffffff
>
> Ok, that's caused by the fact that we do not sanitize the TSC adjust
> register on the boot CPU in the resume path.
Yep. (Which is why my hack hacked the restore path as well - I am still
really confused about boot/resume boot cpu initialization...)
>
>> [ 581.006760] [Firmware Bug]: TSC ADJUST differs: CPU0 0 -->
>> -1502494750. Restoring
>
> We only detect it later and correct it, but that's too late. The untested
> patch below should cure that.
>
>>> This BIOS seems to be rather strange ...
>> Don't tell me...
>>
>> No idea what it's doing, but I think it's safe to say whatever it's
>> trying to do, it's doing it wrong...
>
> Amen to that. I've seen a lot of Value Add BIOSes which broke things left
> and right, but this one takes it to a new level of insanity.
>
> Did you report that back to DELL already?
No, I never did (even though the box actually shipped with ubuntu
(albeit a really old version)). Since it was basically working - just
without TSCs...
I was quite happy with using my hack for quite some time, it just got
annoying I'll either have to stick to one kernel or reapply it all the
time :-).
Roland
>
> Thanks,
>
> tglx
>
> 8<--------------------
> --- a/arch/x86/include/asm/tsc.h
> +++ b/arch/x86/include/asm/tsc.h
> @@ -47,7 +47,7 @@ extern int tsc_clocksource_reliable;
> */
> #ifdef CONFIG_X86_TSC
> extern bool tsc_store_and_check_tsc_adjust(void);
> -extern void tsc_verify_tsc_adjust(void);
> +extern void tsc_verify_tsc_adjust(bool resume);
> extern void check_tsc_sync_source(int cpu);
> extern void check_tsc_sync_target(void);
> #else
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -277,7 +277,7 @@ void exit_idle(void)
>
> void arch_cpu_idle_enter(void)
> {
> - tsc_verify_tsc_adjust();
> + tsc_verify_tsc_adjust(false);
> local_touch_nmi();
> enter_idle();
> }
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -1080,6 +1080,11 @@ static void detect_art(void)
>
> static struct clocksource clocksource_tsc;
>
> +static void tsc_resume(struct clocksource *cs)
> +{
> + tsc_verify_tsc_adjust(true);
> +}
> +
> /*
> * We used to compare the TSC to the cycle_last value in the clocksource
> * structure to avoid a nasty time-warp. This can be observed in a
> @@ -1112,6 +1117,7 @@ static struct clocksource clocksource_ts
> .flags = CLOCK_SOURCE_IS_CONTINUOUS |
> CLOCK_SOURCE_MUST_VERIFY,
> .archdata = { .vclock_mode = VCLOCK_TSC },
> + .resume = tsc_resume,
> };
>
> void mark_tsc_unstable(char *reason)
> --- a/arch/x86/kernel/tsc_sync.c
> +++ b/arch/x86/kernel/tsc_sync.c
> @@ -30,7 +30,7 @@ struct tsc_adjust {
>
> static DEFINE_PER_CPU(struct tsc_adjust, tsc_adjust);
>
> -void tsc_verify_tsc_adjust(void)
> +void tsc_verify_tsc_adjust(bool resume)
> {
> struct tsc_adjust *adj = this_cpu_ptr(&tsc_adjust);
> s64 curval;
> @@ -39,7 +39,7 @@ void tsc_verify_tsc_adjust(void)
> return;
>
> /* Rate limit the MSR check */
> - if (time_before(jiffies, adj->nextcheck))
> + if (!resume && time_before(jiffies, adj->nextcheck))
> return;
>
> adj->nextcheck = jiffies + HZ;
> @@ -51,7 +51,7 @@ void tsc_verify_tsc_adjust(void)
> /* Restore the original value */
> wrmsrl(MSR_IA32_TSC_ADJUST, adj->adjusted);
>
> - if (!adj->warned) {
> + if (!adj->warned || resume) {
> pr_warn(FW_BUG "TSC ADJUST differs: CPU%u %lld --> %lld. Restoring\n",
> smp_processor_id(), adj->adjusted, curval);
> adj->warned = true;
> --- a/arch/x86/power/cpu.c
> +++ b/arch/x86/power/cpu.c
> @@ -256,6 +256,7 @@ static void notrace __restore_processor_
> mtrr_bp_restore();
> perf_restore_debug_store();
> msr_restore_context(ctxt);
> + tsc_verify_tsc_adjust(true);
> }
>
> /* Needed by apm.c */
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -133,7 +133,9 @@ static int cpuhp_invoke_callback(unsigne
> if (!cb)
> return 0;
> trace_cpuhp_enter(cpu, st->target, state, cb);
> + pr_info("Invoking state %d CB %pF\n", state, cb);
> ret = cb(cpu);
> + pr_info("CB returned %d\n", ret);
> trace_cpuhp_exit(cpu, st->state, state, ret);
> return ret;
> }
> @@ -153,7 +155,10 @@ static int cpuhp_invoke_callback(unsigne
> cnt = 0;
> hlist_for_each(node, &step->list) {
> trace_cpuhp_multi_enter(cpu, st->target, state, cbm, node);
> + pr_info("Invoking state %d CB %pF instance %p\n", state, cbm,
> + node);
> ret = cbm(cpu, node);
> + pr_info("CB returned %d\n", ret);
> trace_cpuhp_exit(cpu, st->state, state, ret);
> if (ret)
> goto err;
> @@ -550,6 +555,7 @@ static void cpuhp_thread_fun(unsigned in
>
> st->should_run = false;
>
> + pr_info("CPUHP thread running\n");
> /* Single callback invocation for [un]install ? */
> if (st->single) {
> if (st->cb_state < CPUHP_AP_ONLINE) {
>
Am 10.12.2016 um 02:55 schrieb Roland Scheidegger:
> Am 09.12.2016 um 23:59 schrieb Thomas Gleixner:
>> On Fri, 9 Dec 2016, Roland Scheidegger wrote:
>>
>> Cc'ed someone from Dell.
>>
>>> Am 09.12.2016 um 18:33 schrieb Thomas Gleixner:
>>>> Can you add the patch below to gather more information? There is a hunk in
>>>> there with an '#if 0' which sets the TSC ADJUST to 0 on boot, which you can
>>>> turn on as second step.
>>>
>>> Ok, here's the results:
>>> ...
>>> TSC ADJUST synchronize: Reference CPU0: -2820267100 CPU1: -2822498296
>>> TSC target sync skipped
>>> smpboot: Vector locked
>>> smpboot: Vector setup done
>>> smpboot: Clock setup
>>> TSC source sync skipped
>>> smpboot: Target CPU is online
>>
>> I did not expect that to happen. Now I'm puzzled and curious where the
>> machine gets lost after that. See below.
>>
>>> With the #if 0 block activated, it boots up fine, the output was:
>>
>> That does not make any sense at all, but yes, nothing in this context makes
>> sense.
>>
>>> [ 1.038892] x86: Booting SMP configuration:
>>> [ 1.038930] .... node #0, CPUs: #1
>>> [ 0.171851] TSC ADJUST: CPU1: -2830353064 218577682002
>>> [ 1.117495] TSC source sync 0 -> 1 runs 3
>>> [ 0.171852] TSC ADJUST differs: Reference CPU0: -2828600940 CPU1:
>>> -2830353064
>>> [ 0.171853] TSC ADJUST synchronize: Reference CPU0: 0 CPU1: -2830353064
>>> [ 1.117497] TSC target sync skip
>>
>>> (And fwiw with my quick hack the lockups disappear to when I change that
>>> back to blast a zero into TSC_ADJ for all cpus.)
>>
>> Right, That's what that hunk does as well.
>>
>> Now what's interesting is that the adjustement of CPU1 in the non write to
>> zero case results in the following:
>>
>> TSC ADJUST: CPU1: -2830353064 218577682002 <-- TSC value
>> TSC ADJUST differs: Reference CPU0: -2828600940 CPU1: -2830353064
>>
>> We write CPU1 adjust register to -2828600940 which makes the TSC on CPU1
>> jump forwards by -2828600940 - -2830353064 = 1752124 cycles.
>>
>> In the write to zero case the jump is forward as well, but this time it's
>> huge, i.e. 2830353064 cycles.
>>
>> I tried to wreckage the TSC by writing similar values to the adjust MSR on
>> early boot, but independent of the values and independent of the write to
>> zero part the machine comes up happily all the time.
>>
>> The only difference is that my machine has a somewhat saner BIOS. So the
>> thing might just die in the value add SMM crap, but who knows.
>>
>> In the patch below is another bunch of debug prints which emit the state
>> information of CPU1 during bringup. Maybe that gives a hint where the
>> system gets stuck when you disable the 'write to zero' magic again.
>>
>> The NMI watchdog does not catch anything, right?
> Nope. (Though as mentioned earlier, with my hack when not writing zero
> it did - but the lockup there was later after all 16 cpus were online,
> and I only really tried that with the ubuntu 4.4 kernel. I never got to
> see the full output from that NMI though due to limited screen space, my
> attempts to try anything different than text mode were met with a blank
> screen, and from the parts I did see I didn't really see anything
> interesting albeit that's not saying much as I really have no idea about
> that code...)
>
> With the new patch here's the output (albeit the typing gets a bit
> annoying...)
> ...
> Invoking state 32 CB replay_prepare_cpu+0x0/0xe0
> CB returned 0
> Invoking state 35 CB rcutree_prepare_cpu+0x0/0x50
> CB returned 0
> Invoking state 41 CB notify_prepare+0x0/0xa0
> CBreturned 0
> Invoking state 48 CB bringup_cpu+0x0/0x90
> x86: Booting SMP configuration:
> .... node #0, CPUs: #1
> Invoking state 51 CB sched_cpu_starting+0x0/0x60
> CB returned 0
> Invoking state 62 CB x86_pmu_starting_cpu+0x0/0x20
> CB returned 0
> TSC ADJUST: CPU1: -2846131604 175264328618
> TSC ADJUST differs: Reference CPU0: -2843967660 CPU1: -2846131604
> TSC ADJUST synchronize: Reference CPU0: -2843967660 CPU1: -2846131604
> TSC target sync skip
> smpboot: Vector locked
> TSC source sync 0 -> 1 runs 3
> smpboot: Vector setup done
> smpboot: Clock setup
> TSC source sync skipped
> smpboot: Target CPU is online
Ok I did some more digging. Since it appeared it never returned from
x86_cpuinit.setup_percpu_clockev() I followed that a bit more. This is
using the tsc deadline timer, ending up in clockevents_register_device()
finally. This executes all well except the raw_spin_unlock_irqrestore()
at the end which we never get past.
I disabled the tsc deadline timer (lapic=notscdeadline) and indeed, no
more lockups!
So could there be something be wrong with setting this up? Warping past
some event due to resynchronization or something?
Or hitting some bugs with TSC deadline interrupts?
Anyway, that's definitely out of my area of knowledge, hope it helps...
Roland