Date: Thu, 14 Apr 2016 02:25:03 +0200
From: Peter Zijlstra
To: Waiman Long
Cc: Ingo Molnar, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", linux-kernel@vger.kernel.org, x86@kernel.org, Jiang Liu, Borislav Petkov, Andy Lutomirski, Scott J Norton, Douglas Hatch, Randy Wright
Subject: Re: [PATCH v4] x86/hpet: Reduce HPET counter read contention
Message-ID: <20160414002503.GQ2906@worktop>
References: <1460486768-34024-1-git-send-email-Waiman.Long@hpe.com> <20160413061813.GB4705@gmail.com> <570E67B1.3000708@hpe.com>
In-Reply-To: <570E67B1.3000708@hpe.com>

On Wed, Apr 13, 2016 at 11:37:21AM -0400, Waiman Long wrote:

> The TSC clocksource, on the other hand, is per cpu. So there won't be much
> contention in accessing it. Normally TSC will be used the default clock
> source. However, if there is too much variation in the actual clock speeds
> of the individual CPUs,

Does the system actually have a clock rate skew? Not an offset?

> it will cause the TSC calibration to fail and revert
> to use hpet as the clock source. During bootup, hpet will usually be
> selected as the default clock source first. After a short time, the TSC will
> take over as the default clock source. Problem can happen during that short
> period of transition time too. In fact, we have 16-socket Broadwell-EX
> systems that has this soft lockup problem once in a few reboot cycles which
> prompted me to find a solution to fix it.

This 16 socket system is a completely broken trainwreck. Trying to use HPET
with _that_ many CPUs is absolutely insane.

Please tell your hardware engineers to fix the TSC clock domain.