Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934028AbXJPPCU (ORCPT ); Tue, 16 Oct 2007 11:02:20 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S933025AbXJPPCE (ORCPT ); Tue, 16 Oct 2007 11:02:04 -0400 Received: from [12.38.223.190] ([12.38.223.190]:1076 "EHLO mail.sw.starentnetworks.com" rhost-flags-FAIL-FAIL-OK-FAIL) by vger.kernel.org with ESMTP id S1759311AbXJPPCD (ORCPT ); Tue, 16 Oct 2007 11:02:03 -0400 X-Greylist: delayed 678 seconds by postgrey-1.27 at vger.kernel.org; Tue, 16 Oct 2007 11:02:03 EDT MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <18196.53154.100115.92459@zeus.sw.starentnetworks.com> Date: Tue, 16 Oct 2007 10:50:10 -0400 From: Dave Johnson To: linux-kernel@vger.kernel.org Subject: [PATCH] i386: fix TSC clock source calibration error X-Mailer: VM 7.17 under 21.4 (patch 17) "Jumbo Shrimp" XEmacs Lucid Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3267 Lines: 96 From: Dave Johnson I ran into this problem on a system that was unable to obtain NTP sync because the clock was running very slow (over 10000ppm slow). ntpd had declared all of its peers 'reject' with 'peer_dist' reason. On investigation, the tsc_khz variable was significantly incorrect causing xtime to run slow. After a reboot tsc_khz was correct so I did a reboot test to see how often the problem occurred: Test was done on a 2000 Mhz Xeon system. Of 689 reboots, 8 of them had unacceptable tsc_khz values (>500ppm): range of tsc_khz # of boots % of boots ----------------- ---------- ---------- < 1999750 0 0.000% 1999750 - 1999800 21 3.048% 1999800 - 1999850 166 24.128% 1999850 - 1999900 241 35.029% 1999900 - 1999950 211 30.669% 1999950 - 2000000 42 6.105% 2000000 - 2000000 0 0.000% 2000050 - 2000100 0 0.000% [...] 2000100 - 2015000 1 0.145% << BAD 2015000 - 2030000 6 0.872% << BAD 2030000 - 2045000 1 0.145% << BAD 2045000 < 0 0.000% The worst boot was 2032.577 Mhz, over 1.5% off! It appears that on rare occasions, mach_countup() is taking longer to complete than necessary. I suspect that this is caused by the CPU taking a periodic SMI interrupt right at the end of the 30ms calibration loop. This would cause the loop to delay while the SMI BIOS hander runs. The resulting TSC value is beyond what it actually should be resulting in a higher tsc_khz. The below patch makes native_calculate_cpu_khz() take the best (shortest duration, lowest khz) run of it's 3 calibration loops. If a SMI goes off causing a bad result (long duration, higher khz) it will be discarded. With the patch applied, 300 boots of the same system produce good results: range of tsc_khz # of boots % of boots ----------------- ---------- ---------- < 1999750 0 0.000% 1999750 - 1999800 30 10.000% 1999800 - 1999850 166 55.333% 1999850 - 1999900 89 29.667% 1999900 - 1999950 15 5.000% 1999950 < 0 0.000% Problem was found and tested against 2.6.18. Patch is against 2.6.22. Signed-off-by: Dave Johnson ===== arch/i386/kernel/tsc.c 1.27 vs edited ===== --- 1.27/arch/i386/kernel/tsc.c 2007-05-02 13:27:18 -04:00 +++ edited/arch/i386/kernel/tsc.c 2007-10-15 16:31:04 -04:00 @@ -122,7 +122,7 @@ { unsigned long long start, end; unsigned long count; - u64 delta64; + u64 delta64 = (u64)ULLONG_MAX; int i; unsigned long flags; @@ -134,6 +134,7 @@ rdtscll(start); mach_countup(&count); rdtscll(end); + delta64 = min(delta64, (end - start)); } /* * Error: ECTCNEVERSET @@ -143,8 +144,6 @@ */ if (count <= 1) goto err; - - delta64 = end - start; /* cpu freq too fast: */ if (delta64 > (1ULL<<32)) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/