From: "Lofstedt, Marta"
To: Peter Zijlstra, "tglx@linutronix.de", "mingo@kernel.org"
CC: "linux-kernel@vger.kernel.org", "ville.syrjala@linux.intel.com", "daniel.lezcano@linaro.org", "Wysocki, Rafael J", "martin.peres@linux.intel.com", "pasha.tatashin@oracle.com", "daniel.vetter@ffwll.ch"
Subject: RE: [PATCH 0/9] sched_clock fixes
Date: Tue, 25 Apr 2017 09:31:40 +0000
References: <20170421145756.305735607@infradead.org>
In-Reply-To: <20170421145756.305735607@infradead.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Peterz,

I tested your patch-set on the same Core2 machine where we discovered the
regression. With the tsc=unstable boot param the pass rate has improved
significantly: 350 fails -> 15 fails.

BR,
Marta

> -----Original Message-----
> From: Peter Zijlstra [mailto:peterz@infradead.org]
> Sent: Friday, April 21, 2017 5:58 PM
> To: tglx@linutronix.de; mingo@kernel.org
> Cc: linux-kernel@vger.kernel.org; ville.syrjala@linux.intel.com;
> daniel.lezcano@linaro.org; Wysocki, Rafael J;
> Lofstedt, Marta; martin.peres@linux.intel.com;
> pasha.tatashin@oracle.com; peterz@infradead.org; daniel.vetter@ffwll.ch
> Subject: [PATCH 0/9] sched_clock fixes
>
> Hi,
>
> These patches were inspired by (and hopefully fix) two independent bug
> reports on Core2 machines.
>
> I never could quite reproduce one, but my Core2 machine no longer switches
> to stable sched_clock and therefore no longer tickles the problematic
> stable -> unstable transition either.
>
> Before I dug up my Core2 machine, I tried emulating TSC wreckage by poking
> random values into the TSC MSR from userspace. Behaviour in that case is
> improved as well.
>
> People have to realize that if we manage to boot with TSC 'stable' (both
> sched_clock and clocksource) and we later find out we were mistaken (we
> observe a TSC wobble) the clocks that are derived from it _will_ have had an
> observable hiccup. This is fundamentally unfixable.
>
> If you own a machine where the BIOS tries to hide SMI latencies by
> rewinding TSC (yes, this is a thing), the very best we can do is mark TSC
> unstable with a boot parameter.
>
> For example, this is me writing a stupid value into the TSC:
>
> [ 46.745082] random: crng init done
> [18443029775.010069] clocksource: timekeeping watchdog on CPU0: Marking clocksource 'tsc' as unstable because the skew is too large:
> [18443029775.023141] clocksource: 'hpet' wd_now: 3ebec538 wd_last: 3e486ec9 mask: ffffffff
> [18443029775.034214] clocksource: 'tsc' cs_now: 5025acce9 cs_last: 24dc3bd21c88ee mask: ffffffffffffffff
> [18443029775.046651] tsc: Marking TSC unstable due to clocksource watchdog
> [18443029775.054211] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
> [18443029775.064434] sched_clock: Marking unstable (70569005835, -17833788)<-(-3714295689546517, -2965802361)
> [ 70.573700] clocksource: Switched to clocksource hpet
>
> With some trace_printk()s (not included) I could tell that the wobble
> occurred at 69.965474. The clock now resumes where it 'should' have been.
>
> But an unfortunate scheduling event could have resulted in one task having
> seen a runtime of ~584 years with 'obvious' effects. Similar jumps can also
> be observed from userspace GTOD usage.
>
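
A note on the ~584 years figure in the quoted text: it matches the span of a 64-bit nanosecond counter, since 2^64 ns is roughly 584.5 years. The sketch below is only illustrative arithmetic, not kernel code. The helper names (u64, ns_to_years) and the 365.25-day year are assumptions made for the example, and so is reading the first value of the second pair in the sched_clock line as a signed nanosecond delta.

```python
NS_PER_SEC = 1_000_000_000
# Assumption: a 365.25-day year is used for the conversion.
SEC_PER_YEAR = 365.25 * 24 * 3600


def u64(value):
    """Reinterpret an integer as an unsigned 64-bit quantity (wraps modulo 2**64)."""
    return value & (2**64 - 1)


def ns_to_years(ns):
    """Convert nanoseconds to years."""
    return ns / NS_PER_SEC / SEC_PER_YEAR


# Full range of a 64-bit nanosecond counter, in years.
full_range_years = ns_to_years(2**64)

# A clock that steps backwards by 1 ms yields a negative delta; stored in an
# unsigned 64-bit field it looks like an enormous positive runtime.
wrapped_small = u64(-1_000_000)

# Assumption: the value below is taken from the quoted sched_clock line and
# treated as a signed nanosecond offset.
wrapped_log = u64(-3714295689546517)

if __name__ == "__main__":
    print(f"2**64 ns       = {full_range_years:.1f} years")
    print(f"u64(-1 ms)     = {ns_to_years(wrapped_small):.1f} years")
    print(f"u64(log delta) = {wrapped_log // NS_PER_SEC} s")
```

Under these assumptions the wrapped log value lands within a few seconds of the 18443029775.x timestamps shown in the quoted dmesg output, which is consistent with a backwards step being seen as a near-2^64 ns jump. Whether the kernel derives the printed timestamps exactly this way is not stated in the message.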