From: Gratian Crisan
To: Josh Hunt
CC: Gratian Crisan, Peter Zijlstra, Thomas Gleixner, LKML, Ingo Molnar, "H . Peter Anvin", Borislav Petkov, Josh Cartwright
Subject: Re: [RFC PATCH] tsc: synchronize TSCs on buggy Intel Xeon E5 CPUs with offset error
Date: Wed, 11 Nov 2015 09:41:25 -0600
Message-ID: <87io58tufu.fsf@spline.amer.corp.natinst.com>
References: <1447099142-10220-1-git-send-email-gratian.crisan@ni.com> <20151109220232.GO17308@twins.programming.kicks-ass.net> <87mvultz5f.fsf@spline.amer.corp.natinst.com>

Josh Hunt writes:

> On Tue, Nov 10, 2015 at 1:47 PM, Gratian Crisan wrote:
>>
>> The observed behavior does seem to match BT81 errata i.e. the TSC does
>> not get reset on warm reboots and it is otherwise stable.
>>
> If you have a simple testcase to reproduce the problem I'd be
> interested in seeing it.

We first hit this bug on a 4.1 PREEMPT_RT kernel, where it actually
causes a boot hang on warm reboots. I haven't yet got to the bottom of
why the TSC having a large offset vs. CPU0 would cause the hang. I have
some stack traces around somewhere that I can dig up.

I also wrote a small C utility[1], with a bit of code borrowed from the
kernel, for reading the TSC on all CPUs. It starts a high priority
thread per CPU, tries to synchronize them and prints out the TSC values
and their offsets relative to CPU0. It can be called from a SysV init
shell script[2] at the beginning of the boot process and right before a
reboot to save the values in a file. I've pasted the results after 3
reboots [3].
You can see CPU0's TSC getting reset on reboot while the other cores
happily keep ticking throughout the reboot.

-Gratian

[1] read-tsc.c
--8<---------------cut here---------------start------------->8---
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define DECLARE_ARGS(val, low, high)	unsigned low, high
#define EAX_EDX_VAL(val, low, high)	((low) | ((uint64_t)(high) << 32))
#define EAX_EDX_ARGS(val, low, high)	"a" (low), "d" (high)
#define EAX_EDX_RET(val, low, high)	"=a" (low), "=d" (high)

/* Number of threads that have not yet reached the rendezvous point. */
static int thread_sync;
/* One TSC sample per CPU, indexed by CPU number. */
static unsigned long long *tsc_data = NULL;

#define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))

static inline void rep_nop(void)
{
	asm volatile("rep; nop" ::: "memory");
}

static inline void cpu_relax(void)
{
	rep_nop();
}

/* Read the TSC, with an lfence so the read is not speculated early. */
static inline unsigned long long rdtsc_ordered(void)
{
	DECLARE_ARGS(val, low, high);

	asm volatile("lfence" : : : "memory");
	asm volatile("rdtsc" : EAX_EDX_RET(val, low, high));

	return EAX_EDX_VAL(val, low, high);
}

static void* threadfn(void *param)
{
	long cpu = (long)param;
	cpu_set_t mask;
	struct sched_param schedp;

	/* Pin this thread to the CPU it is meant to sample. */
	CPU_ZERO(&mask);
	CPU_SET(cpu, &mask);
	if (sched_setaffinity(0, sizeof(mask), &mask) == -1) {
		perror("error: Failed to set the CPU affinity");
		return NULL;
	}

	/*
	 * Set the thread priority just below the migration thread's. The idea
	 * is to minimize the chances of being preempted while running the test.
	 */
	memset(&schedp, 0, sizeof(schedp));
	schedp.sched_priority = sched_get_priority_max(SCHED_FIFO) - 1;
	if (sched_setscheduler(0, SCHED_FIFO, &schedp) == -1) {
		perror("error: Failed to set the thread priority");
		return NULL;
	}

	/* Check in, then spin until every thread has checked in. */
	__sync_sub_and_fetch(&thread_sync, 1);
	while (ACCESS_ONCE(thread_sync))
		cpu_relax();

	tsc_data[cpu] = rdtsc_ordered();

	return NULL;
}

int main(int argc, char* argv[])
{
	long i;
	unsigned long n_cpus;
	pthread_t *th = NULL;
	int ret = EXIT_SUCCESS;

	n_cpus = sysconf(_SC_NPROCESSORS_ONLN);
	thread_sync = n_cpus;
	__sync_synchronize();

	tsc_data = (unsigned long long*)malloc(n_cpus *
					       sizeof(unsigned long long));
	if (!tsc_data) {
		fprintf(stderr, "error: Failed to allocate memory for TSC data\n");
		ret = EXIT_FAILURE;
		goto out;
	}

	th = (pthread_t *)malloc(n_cpus * sizeof(pthread_t));
	if (!th) {
		fprintf(stderr, "error: Failed to allocate memory for thread data\n");
		ret = EXIT_FAILURE;
		goto out;
	}

	for (i = 0; i < n_cpus; i++)
		pthread_create(&th[i], NULL, threadfn, (void*)i);

	for (i = 0; i < n_cpus; i++)
		pthread_join(th[i], NULL);

	/* Optional label (e.g. "start" or "stop") followed by value[offset]. */
	if (argc > 1)
		printf("%s: ", argv[1]);
	for (i = 0; i < n_cpus; i++)
		printf("%llu[%lld] ", tsc_data[i],
		       (long long)(tsc_data[i] - tsc_data[0]));
	printf("\n");

out:
	free(tsc_data);
	free(th);

	return ret;
}
--8<---------------cut here---------------end--------------->8---

[2] /etc/init.d/save-tsc
--8<---------------cut here---------------start------------->8---
#!/bin/sh

read-tsc "$1" >> /tsc.dat

exit 0
--8<---------------cut here---------------end--------------->8---

[3] tsc.dat
--8<---------------cut here---------------start------------->8---
stop: 222292260504[0] 146566095145777[146343802885273] 146566095145866[146343802885362] 146566095145817[146343802885313] 146566095145895[146343802885391] 146566095145840[146343802885336] 146566095145751[146343802885247] 146566095145707[146343802885203]
start: 42437383741[0] 146626054987730[146583617603989] 146626054987813[146583617604072] 146626054987873[146583617604132] 146626054987444[146583617603703] 146626054987557[146583617603816] 146626054987703[146583617603962] 146626054987922[146583617604181]
stop: 175075718467[0] 146758693322318[146583617603851] 146758693322251[146583617603784] 146758693322294[146583617603827] 146758693322276[146583617603809] 146758693322197[146583617603730] 146758693322228[146583617603761] 146758693322116[146583617603649]
start: 42318111746[0] 146818573335855[146776255224109] 146818573336118[146776255224372] 146818573335988[146776255224242] 146818573335796[146776255224050] 146818573335930[146776255224184] 146818573335738[146776255223992] 146818573335619[146776255223873]
stop: 117186647162[0] 146893441871380[146776255224218] 146893441871412[146776255224250] 146893441871361[146776255224199] 146893441871287[146776255224125] 146893441871335[146776255224173] 146893441871439[146776255224277] 146893441871269[146776255224107]
start: 42577639385[0] 146953539519284[146910961879899] 146953539519333[146910961879948] 146953539519268[146910961879883] 146953539536718[146910961897333] 146953539519223[146910961879838] 146953539519068[146910961879683] 146953539519185[146910961879800]
--8<---------------cut here---------------end--------------->8---