Date: Tue, 11 Oct 2016 14:11:22 -0700
From: Bin Gao
To: Thomas Gleixner
Cc: Ingo Molnar, "H. Peter Anvin", John Stultz, Peter Zijlstra, x86@kernel.org, linux-kernel@vger.kernel.org, bin.gao@intel.com
Subject: Re: [PATCH v2] x86/tsc: Set X86_FEATURE_TSC_RELIABLE to skip refined calibration
Message-ID: <20161011211121.GA15041@worksta>
References: <20160816174240.GA33372@worksta> <20160825164350.GA245186@worksta>

On Fri, Aug 26, 2016 at 12:14:58PM +0200, Thomas Gleixner wrote:
> On Fri, 26 Aug 2016, Thomas Gleixner wrote:
> > On Thu, 25 Aug 2016, Bin Gao wrote:
> > > On Wed, Aug 24, 2016 at 10:51:20AM +0200, Thomas Gleixner wrote:
> > > > On Tue, 16 Aug 2016, Bin Gao wrote:
> > > > > On some newer Intel x86 processors/SoCs the TSC frequency can be directly
> > > > > calculated by factors read from specific MSR registers or from a cpuid
> > > > > leaf (0x15). TSC frequency calculated by native msr/cpuid is absolutely
> > > > > accurate so we should always skip calibrating TSC against another clock,
> > > > > e.g. PIT, HPET, etc. So we want to skip the refined calibration by setting
> > > > > the X86_FEATURE_TSC_RELIABLE flag. Existing code setting the flag by
> > > > > set_cpu_cap() doesn't work as the flag is cleared later in identify_cpu().
> > > > > A cpu caps flag is not cleared only if it's set by setup_force_cpu_cap().
> > > > > This patch converted set_cpu_cap() to setup_force_cpu_cap() to ensure
> > > > > refined calibration is skipped.
> > > > >
> > > > > We had a test on Intel CherryTrail platform: the 24 hours time drift is
> > > > > 3.6 seconds if refined calibration was not skipped while the drift is less
> > > > > than 0.6 second when refined calibration was skipped.
> > > > >
> > > > > Correctly setting the X86_FEATURE_TSC_RELIABLE flag also guarantees TSC is
> > > > > not monitored by timekeeping watchdog because on most of these system TSC
> > > > > is the only reliable clocksource. HPET, for instance, works but may not
> > > > > be reliable. So kernel may report a physically reliable TSC is not reliable
> > > > > just because a physically not reliable HPET is acting as timekeeping
> > > > > watchdog.
> > > >
> > > > What about non SoC systems where the MSR is available, but we still see that
> > > > cross socket TSC wreckage? This change will prevent the watchdog from
> > > > detecting that.
> > >
> > > MSR is only available on Intel Atom SoCs. There is no such a multi-socket system.
> >
> > Fair enough.
>
> Second thoughts. We should separate the calibration aspect from the reliability
> aspect.
>
> If a MSR/CPUID readout provides reliable calibration then this does not tell
> us about the reliability (i.e. no watchdog required). So having two flags for
> this - and sure you can set both on those SoCs is the proper solution.
>
> Thanks,
>
> 	tglx

Hi Thomas,

The Linux kernel currently treats a reliable calibration as implying
reliability (i.e. no watchdog required). I'm posting some code pieces to
explain.

X86_FEATURE_TSC_RELIABLE is referenced in only two places, as shown below.
As you can see from init_tsc_clocksource(), X86_FEATURE_TSC_RELIABLE acts
as a switch for launching the delayed calibration work: the delayed
calibration is skipped if X86_FEATURE_TSC_RELIABLE is set and runs otherwise.
In check_system_tsc_reliable(), X86_FEATURE_TSC_RELIABLE helps to set
tsc_clocksource_reliable, which in turn lets TSC act as a clocksource
watchdog, i.e. watching others instead of being watched by others.

So X86_FEATURE_TSC_RELIABLE really means two things:
1) The calibrated (or directly calculated) result can be trusted, so the
   delayed calibration is skipped.
2) TSC is reliable, so the clocksource framework won't monitor it; instead
   TSC acts as a watchdog monitoring other clocksources.

X86_FEATURE_TSC_RELIABLE is also set from only two places:
arch/x86/platform/intel-mid/mfld.c and arch/x86/platform/intel-mid/mrfld.c,
which are for Intel Atom SoCs with MSR based TSC frequency calculation.
The patch I'm doing sets this flag (X86_FEATURE_TSC_RELIABLE) for another
case: Intel processors/SoCs with CPUID based TSC frequency calculation.

arch/x86/kernel/tsc.c:
static void __init check_system_tsc_reliable(void)
{
	...... (lines ignored)

	if (boot_cpu_has(X86_FEATURE_TSC_RELIABLE))
		/* This flag is used by init_tsc_clocksource(), see below. */
		tsc_clocksource_reliable = 1;
}

arch/x86/kernel/tsc.c:
static int __init init_tsc_clocksource(void)
{
	...... (lines ignored)

	if (tsc_clocksource_reliable)
		clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY;

	...... (lines ignored)

	/*
	 * Trust the results of the earlier calibration on systems
	 * exporting a reliable TSC.
	 */
	if (boot_cpu_has(X86_FEATURE_TSC_RELIABLE)) {
		clocksource_register_khz(&clocksource_tsc, tsc_khz);
		return 0;
	}

	schedule_delayed_work(&tsc_irqwork, 0);
	return 0;
}

Thanks,
Bin
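
To make the two meanings of the flag concrete, here is a small userspace model of the decisions taken in the two functions quoted above, next to a rough sketch of the two-flag split tglx suggests. This is not kernel code: struct tsc_decisions, decide_single_flag(), decide_two_flags() and the freq_known parameter are names made up for illustration, and the model ignores any other way tsc_clocksource_reliable might get set.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model only. The two booleans stand for the outcomes
 * discussed in this mail; they are not kernel data structures.
 */
struct tsc_decisions {
	bool watched_by_watchdog;	/* CLOCK_SOURCE_MUST_VERIFY left set */
	bool refined_calibration;	/* delayed calibration work scheduled */
};

/* Behaviour described in the mail: one flag drives both decisions. */
struct tsc_decisions decide_single_flag(bool tsc_reliable)
{
	struct tsc_decisions d;

	d.watched_by_watchdog = !tsc_reliable;
	d.refined_calibration = !tsc_reliable;
	return d;
}

/*
 * Hypothetical split: a second, invented "frequency is known" input
 * gates the refined calibration, while the reliable flag only decides
 * whether the watchdog monitors TSC.
 */
struct tsc_decisions decide_two_flags(bool tsc_reliable, bool freq_known)
{
	struct tsc_decisions d;

	d.watched_by_watchdog = !tsc_reliable;
	d.refined_calibration = !freq_known;
	return d;
}

/* Small self-check showing how the two models differ. */
void check_model(void)
{
	struct tsc_decisions d;

	/* Single flag set: no watchdog and no refined calibration. */
	d = decide_single_flag(true);
	assert(!d.watched_by_watchdog && !d.refined_calibration);

	/* Two flags: known frequency, but TSC still watched. */
	d = decide_two_flags(false, true);
	assert(d.watched_by_watchdog && !d.refined_calibration);
}
```

With a single flag there is no way to express "frequency is known exactly, but keep the watchdog on TSC"; the second function shows that combination becoming representable once the two aspects are separated.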