Date: Wed, 13 Jun 2018 15:29:46 +0800
From: Feng Tang
To: Peter Zijlstra, Petr Mladek
Cc: Ingo Molnar, Thomas Gleixner, "H. Peter Anvin", Alan Cox,
	linux-kernel@vger.kernel.org, alek.du@intel.com,
	pasha.tatashin@oracle.com, arjan@linux.intel.com,
	len.brown@intel.com, feng.tang@intel.com
Subject: Re: [RFC 2/2] x86, tsc: Enable clock for ealry printk timestamp
Message-ID: <20180613072946.2riinain6g2r7pmg@shbuild888>
References: <1527672059-6225-1-git-send-email-feng.tang@intel.com>
	<1527672059-6225-2-git-send-email-feng.tang@intel.com>
	<20180531135542.4j7w7bxsw43ydx3j@pathway.suse.cz>
	<20180531155210.GL12180@hirez.programming.kicks-ass.net>
	<20180601161213.tm44nhrhwfxa2767@shbuild888>
	<20180606093833.vqg47yhdq7mnj2kp@shbuild888>
In-Reply-To: <20180606093833.vqg47yhdq7mnj2kp@shbuild888>

On Wed, Jun 06, 2018 at 05:38:33PM +0800, Feng Tang wrote:
> On Sat, Jun 02, 2018 at 12:12:13AM +0800, Feng Tang wrote:
> > Hi Peter and all,
> >
> > Hi Peter and Petr,
> >
> > Thanks for your suggestions, will try to find a cleaner and less hacky way,
> > and it may take some time as dealing with all kinds of TSC is tricky :)
> >
> > - Feng
> >
> > On Thu, May 31, 2018 at 05:52:10PM +0200, Peter Zijlstra wrote:
> > > On Thu, May 31, 2018 at 03:55:42PM +0200, Petr Mladek wrote:
> > > > I wonder if we could get some cleaner integration into the timer and
> > > > printk code.
> > >
> > > Yes, these patches are particularly horrific..
> > >
> > > There were some earlier patches by Pavel Tatashin, which attempted to
> > > get things running earlier.
> > >
> > > http://lkml.kernel.org/r/20180209211143.16215-1-pasha.tatashin@oracle.com
> > >
> > > I'm not entirely happy with that, but I never did get around to
> > > reviewing that last version :-( In particular, now that you made me
> > > look, I dislike his patch 6 almost as much as these patches.
> > >
> > > The idea was to get regular sched_clock() running earlier, not to botch
> > > some early_sched_clock() into it.
> > >
> > > Basically run calibrate_tsc() earlier (like _waaay_ earlier, it doesn't
> > > rely on anything other than CPUID) and if you have a recent part (with
> > > exception of SKX) you'll get a usable tsc rate (and TSC_RELIABLE) and
> > > things will work.
> >
> I just did a hacky experiment by moving tsc_init() earlier into
> setup_arch() and removing tsc_early_delay_calibrate(). The printk timestamp
> does start working much earlier!
>
> But __use_tsc and __sched_clock_stable rely on jump_label,
> which can't be used so early (I tried to call jump_label_init() before
> tsc_init(), but the kernel crashes, and I worked around it for now).

Just figured out the kernel crash when moving jump_label_init() earlier into
setup_arch(): tsc_init() enables the static key __use_tsc through this call
chain:

	static_key_enable
	  __jump_label_update
	    arch_jump_label_transform
	      __jump_label_transform
	        text_poke_bp
	          text_poke

text_poke() depends on paging, but paging is not initialized this early,
so it triggers a panic.

Besides __use_tsc, sched_clock also has one static key: __sched_clock_stable.

Thanks,
Feng

>
> Please review the debug patch, thanks!
>
> ---
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 5c623df..b636888 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -1201,7 +1201,8 @@ void __init setup_arch(char **cmdline_p)
>  	kvmclock_init();
>  #endif
>
> -	tsc_early_delay_calibrate();
> +	tsc_init();
> +
>  	if (!early_xdbc_setup_hardware())
>  		early_xdbc_register_console();
>
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index 4008dd6..8288f39 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -33,6 +33,7 @@ EXPORT_SYMBOL(cpu_khz);
>  unsigned int __read_mostly tsc_khz;
>  EXPORT_SYMBOL(tsc_khz);
>
> +int tsc_inited;
>  /*
>   * TSC can be unstable due to cpufreq or due to unsynced TSCs
>   */
> @@ -192,7 +193,7 @@ static void set_cyc2ns_scale(unsigned long khz, int cpu, unsigned long long tsc_
>   */
>  u64 native_sched_clock(void)
>  {
> -	if (static_branch_likely(&__use_tsc)) {
> +	if (static_branch_likely(&__use_tsc) || tsc_inited) {
>  		u64 tsc_now = rdtsc();
>
>  		/* return the value in ns */
> @@ -1387,30 +1391,16 @@ static int __init init_tsc_clocksource(void)
>   */
>  device_initcall(init_tsc_clocksource);
>
> -void __init tsc_early_delay_calibrate(void)
> -{
> -	unsigned long lpj;
> -
> -	if (!boot_cpu_has(X86_FEATURE_TSC))
> -		return;
> -
> -	cpu_khz = x86_platform.calibrate_cpu();
> -	tsc_khz = x86_platform.calibrate_tsc();
> -
> -	tsc_khz = tsc_khz ? : cpu_khz;
> -	if (!tsc_khz)
> -		return;
> -
> -	lpj = tsc_khz * 1000;
> -	do_div(lpj, HZ);
> -	loops_per_jiffy = lpj;
> -}
> -
>  void __init tsc_init(void)
>  {
>  	u64 lpj, cyc;
>  	int cpu;
>
> +	if (tsc_inited)
> +		return;
> +
> +	tsc_inited = 1;
> +
>  	if (!boot_cpu_has(X86_FEATURE_TSC)) {
>  		setup_clear_cpu_cap(X86_FEATURE_TSC_DEADLINE_TIMER);
>  		return;
> @@ -1474,11 +1464,15 @@ void __init tsc_init(void)
>  	lpj = ((u64)tsc_khz * 1000);
>  	do_div(lpj, HZ);
>  	lpj_fine = lpj;
> +	loops_per_jiffy = lpj;
>
>  	use_tsc_delay();
>
>  	check_system_tsc_reliable();
>
> +	extern void early_set_sched_clock_stable(u64 sched_clock_offset);
> +	early_set_sched_clock_stable(div64_u64(rdtsc() * 1000, tsc_khz));
> +
>  	if (unsynchronized_tsc()) {
>  		mark_tsc_unstable("TSCs unsynchronized");
>  		return;
> diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
> index 10c83e7..6c5c22d 100644
> --- a/kernel/sched/clock.c
> +++ b/kernel/sched/clock.c
> @@ -119,6 +119,13 @@ static void __scd_stamp(struct sched_clock_data *scd)
>  	scd->tick_raw = sched_clock();
>  }
>
> +
> +void early_set_sched_clock_stable(u64 sched_clock_offset)
> +{
> +	__sched_clock_offset = sched_clock_offset;
> +	static_branch_enable(&__sched_clock_stable);
> +}
> +
>  static void __set_sched_clock_stable(void)
>  {
>  	struct sched_clock_data *scd;
> @@ -342,12 +349,14 @@ static u64 sched_clock_remote(struct sched_clock_data *scd)
>   *
>   * See cpu_clock().
>   */
> +
> +extern int tsc_inited;
>  u64 sched_clock_cpu(int cpu)
>  {
>  	struct sched_clock_data *scd;
>  	u64 clock;
>
> -	if (sched_clock_stable())
> +	if (sched_clock_stable() || tsc_inited)
>  		return sched_clock() + __sched_clock_offset;
>
>  	if (unlikely(!sched_clock_running))
>
> > >
> > > If you have a dodgy part (sorry SKX), you'll just have to live with
> > > sched_clock starting late(r).
> > >
> > > Do not cobble things on the side, try and get the normal things running
> > > earlier.
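
For anyone following the thread, the workaround in the debug patch can be
modelled in user space roughly as below. This is only a sketch under stated
assumptions, not kernel code: every model_* name and MODEL_HZ is made up for
illustration, a plain bool stands in for the __use_tsc static key, and the
calibrated frequency is passed in by the caller instead of being read from
hardware. It shows the two ideas the patch relies on: tsc_init() guarded by a
plain flag so it can be called early and again later, and the clock path
falling back to that flag while the static key cannot be patched yet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed tick rate for this model; the real HZ is a kernel config value. */
#define MODEL_HZ 250u

static bool     model_use_tsc_key;     /* stands in for the __use_tsc static key */
static bool     model_tsc_inited;      /* stands in for the patch's tsc_inited flag */
static uint64_t model_tsc_khz;         /* frequency recorded by the first init call */
static uint64_t model_loops_per_jiffy; /* delay-loop calibration derived from it */

/* lpj = tsc_khz * 1000 / HZ, the same arithmetic the patch uses. */
static uint64_t model_lpj(uint64_t tsc_khz)
{
	return tsc_khz * 1000u / MODEL_HZ;
}

/*
 * Idempotent init: only the first call does any work, so calling it
 * early and then again from the usual place is harmless.
 * Returns true if this call performed the initialization.
 */
static bool model_tsc_init(uint64_t calibrated_khz)
{
	if (model_tsc_inited)
		return false;

	model_tsc_inited = true;
	model_tsc_khz = calibrated_khz;
	model_loops_per_jiffy = model_lpj(calibrated_khz);
	return true;
}

/*
 * The clock path takes the TSC branch if either the static key is set
 * or the plain flag says init already ran. The flag covers the window
 * in which the key cannot be flipped because code patching is unavailable.
 */
static bool model_clock_uses_tsc(void)
{
	return model_use_tsc_key || model_tsc_inited;
}
```

Calling model_tsc_init() once with a calibrated value and then asking
model_clock_uses_tsc() mirrors what the patched native_sched_clock() does
before jump labels work. The cost of this shape is an extra load and compare
on every clock read, which is presumably part of why it was posted as a debug
patch only.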