Date:   Wed, 10 Nov 2021 19:07:21 -0800
From:   "Paul E. McKenney" <paulmck@kernel.org>
To:     Feng Tang <feng.tang@intel.com>
Cc:     Waiman Long <longman@redhat.com>,
        John Stultz <john.stultz@linaro.org>,
        Thomas Gleixner <tglx@linutronix.de>,
        Stephen Boyd <sboyd@kernel.org>, linux-kernel@vger.kernel.org,
        Peter Zijlstra <peterz@infradead.org>,
        Cassio Neri <cassio.neri@gmail.com>,
        Linus Walleij <linus.walleij@linaro.org>,
        Frederic Weisbecker <frederic@kernel.org>
Subject: Re: [PATCH 0/2] clocksource: Avoid incorrect hpet fallback
Message-ID: <20211111030721.GJ641268@paulmck-ThinkPad-P17-Gen-1>
Reply-To: paulmck@kernel.org
References: <20211110221732.272986-1-longman@redhat.com>
 <20211111012301.GA15663@shbuild999.sh.intel.com>
 <551c33a1-25d0-a840-f26b-238bee51d4ec@redhat.com>
 <20211111015331.GA15724@shbuild999.sh.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20211111015331.GA15724@shbuild999.sh.intel.com>
Precedence: bulk

On Thu, Nov 11, 2021 at 09:53:31AM +0800, Feng Tang wrote:
> On Wed, Nov 10, 2021 at 08:30:10PM -0500, Waiman Long wrote:
> > 
> > On 11/10/21 20:23, Feng Tang wrote:
> > > Hi Waiman, Paul,
> > > 
> > > On Wed, Nov 10, 2021 at 05:17:30PM -0500, Waiman Long wrote:
> > > > It was found that when an x86 system was being stressed by running
> > > > various different benchmark suites, the clocksource watchdog might
> > > > occasionally mark TSC as unstable and fall back to hpet, which has
> > > > a significant impact on system performance.
> > > We've seen similar cases while running 'netperf' and 'lockbus/ioport'
> > > cases of 'stress-ng' tool.
> > > 
> > > In those scenarios, the clocksource used by the kernel is tsc, while
> > > hpet is used as the watchdog. When the skew happens, we found that
> > > it is mostly hpet's 'fault': when the system is under extreme
> > > pressure, a read of hpet can take a long time, and even two
> > > consecutive reads of hpet can have a big gap (up to 1ms+) in between.
> > > So the skew we saw is actually caused by hpet instead of tsc, as
> > > a tsc read is a lightweight cpu operation.
> > > 
> > > I tried the following patch to detect skew in the watchdog itself
> > > and avoid wrongly judging the tsc to be unstable. It does help in
> > > our tests; please help review it.
> > > 
> > > One further idea is to also add two consecutive reads of the current
> > > clocksource, compare their gap with the watchdog's, and skip the check
> > > if the watchdog's is bigger.
> > 
> > That is what I found too. And I also did a 2nd watchdog read to compare the
> > consecutive delay versus half the threshold and skip the test if it exceeds
> > it. My patch is actually similar in concept to what your patch does.
>  
> Aha, yes, I missed that.
> 
> I just got to the office, saw the discussion around the 0/2 patch, and replied
> without going through the patches, sorry about that.
> 
> 0day reported some cases about stress-ng testing, and we are still testing
> the different cases we've seen.

Whichever way it happens, I agree that an extra read to correctly
attribute the latency could be very effective!  Good on you both!
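
For anyone following along, here is a rough userspace sketch of the
extra-read idea discussed above. It is not taken from either patch: the
names (wd_check, clock_reader, MAX_SKEW_NS, the scripted reader) and the
100us threshold are all made up for illustration. The watchdog is read
twice back to back, and if that gap alone exceeds half the skew
threshold, the check is skipped rather than blaming the clocksource.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical skew threshold in nanoseconds (an assumption, not a kernel value). */
#define MAX_SKEW_NS 100000ULL

enum wd_verdict { WD_OK, WD_SKIP, WD_UNSTABLE };

/* Abstract clock: a read callback plus its private context. */
struct clock_reader {
	uint64_t (*read)(void *ctx);
	void *ctx;
};

/* Scripted clock for experiments: returns canned values in order. */
struct script {
	const uint64_t *vals;
	size_t n;
	size_t pos;
};

static uint64_t script_read(void *ctx)
{
	struct script *s = ctx;

	assert(s->n > 0);
	uint64_t v = s->vals[s->pos];
	if (s->pos + 1 < s->n)
		s->pos++;	/* stay on the last value once exhausted */
	return v;
}

/*
 * One watchdog pass. cs_last and wd_last are the readings saved from
 * the previous pass, in nanoseconds.
 */
static enum wd_verdict wd_check(const struct clock_reader *cs,
				const struct clock_reader *wd,
				uint64_t cs_last, uint64_t wd_last)
{
	uint64_t wd1 = wd->read(wd->ctx);
	uint64_t wd2 = wd->read(wd->ctx);	/* extra back-to-back read */
	uint64_t cs_now = cs->read(cs->ctx);

	/* A slow watchdog read means the watchdog itself is the suspect. */
	if (wd2 - wd1 > MAX_SKEW_NS / 2)
		return WD_SKIP;

	uint64_t wd_delta = wd2 - wd_last;
	uint64_t cs_delta = cs_now - cs_last;
	uint64_t diff = wd_delta > cs_delta ? wd_delta - cs_delta
					    : cs_delta - wd_delta;

	return diff > MAX_SKEW_NS ? WD_UNSTABLE : WD_OK;
}
```

With scripted watchdog readings of 1000000 and 2000000 ns, the 1ms gap
between the two reads trips the half-threshold test and the pass is
skipped, which is the misattribution case described above.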

							Thanx, Paul