From: Jason Vas Dias
To: x86@kernel.org, LKML , Thomas Gleixner , andi , Peter Zijlstra
Subject: [PATCH v4.16-rc4 2/2] x86/vdso: on Intel, VDSO should handle CLOCK_MONOTONIC_RAW
Date: Mon, 12 Mar 2018 09:14:28 +0000
X-Mailing-List: linux-kernel@vger.kernel.org

Currently the vDSO does not handle
  clock_gettime(CLOCK_MONOTONIC_RAW, &ts)
on Intel / AMD: it calls vdso_fallback_gettime() for this clock, which
issues a syscall. That path has an unacceptably high latency (minimum
measurable time, or time between measurements) of 300-700ns on two
2.8-3.9GHz Haswell x86_64 machines (Family_Model 06_3C) under various
versions of Linux.

Sometimes, particularly when correlating elapsed time with performance
counter values, code needs to know elapsed time from the perspective of
the CPU, no matter how "hot" / fast or "cold" / slow it might be running
with respect to NTP / PTP; when code needs this, the latency of a syscall
is often unacceptably high.

I reported this as Bug #198161 :
  'https://bugzilla.kernel.org/show_bug.cgi?id=198961'
and in previous posts with subjects matching 'CLOCK_MONOTONIC_RAW'.
This patch handles CLOCK_MONOTONIC_RAW clock_gettime() in the vDSO by
exporting the raw clock calibration, last cycles, last xtime_nsec, and
last raw_sec value in the vsyscall_gtod_data during update_vsyscall().

Now the new do_monotonic_raw() function in the vDSO has a latency of
about 24ns on average, and the test program:
  tools/testing/selftests/timers/inconsistency-check.c
succeeds with arguments: '-c 4 -t 120' or any arbitrary -t value.

The patch is against Linus' latest 4.16-rc5 tree, current HEAD of:
  git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

This patch affects only these files:
  arch/x86/include/asm/msr.h
  arch/x86/include/asm/vgtod.h
  arch/x86/entry/vdso/vclock_gettime.c
  arch/x86/entry/vsyscall/vsyscall_gtod.c

This is the second patch in the series, which adds use of rdtscp.

Best Regards,
Jason Vas Dias
---
diff -up linux-4.16-rc5/arch/x86/entry/vdso/vclock_gettime.c.4.16-rc5-p1 linux-4.16-rc5/arch/x86/entry/vdso/vclock_gettime.c
--- linux-4.16-rc5/arch/x86/entry/vdso/vclock_gettime.c.4.16-rc5-p1	2018-03-12 08:12:17.110120433 +0000
+++ linux-4.16-rc5/arch/x86/entry/vdso/vclock_gettime.c	2018-03-12 08:59:21.135475862 +0000
@@ -187,7 +187,7 @@ notrace static u64 vread_tsc_raw(void)
 	u64 tsc , last = gtod->raw_cycle_last;
-	tsc = rdtsc_ordered();
+	tsc = gtod->has_rdtscp ? rdtscp((void*)0UL) : rdtsc_ordered();
 	if (likely(tsc >= last))
 		return tsc;
 	asm volatile ("");
diff -up linux-4.16-rc5/arch/x86/entry/vsyscall/vsyscall_gtod.c.4.16-rc5-p1 linux-4.16-rc5/arch/x86/entry/vsyscall/vsyscall_gtod.c
--- linux-4.16-rc5/arch/x86/entry/vsyscall/vsyscall_gtod.c.4.16-rc5-p1	2018-03-12 07:58:07.974214168 +0000
+++ linux-4.16-rc5/arch/x86/entry/vsyscall/vsyscall_gtod.c	2018-03-12 08:54:07.490267640 +0000
@@ -16,6 +16,7 @@
 #include
 #include
 #include
+#include
 int vclocks_used __read_mostly;
@@ -49,6 +50,7 @@ void update_vsyscall(struct timekeeper *
 	vdata->raw_mask = tk->tkr_raw.mask;
 	vdata->raw_mult = tk->tkr_raw.mult;
 	vdata->raw_shift = tk->tkr_raw.shift;
+	vdata->has_rdtscp = static_cpu_has(X86_FEATURE_RDTSCP);
 	vdata->wall_time_sec = tk->xtime_sec;
 	vdata->wall_time_snsec = tk->tkr_mono.xtime_nsec;
diff -up linux-4.16-rc5/arch/x86/include/asm/msr.h.4.16-rc5-p1 linux-4.16-rc5/arch/x86/include/asm/msr.h
--- linux-4.16-rc5/arch/x86/include/asm/msr.h.4.16-rc5-p1	2018-03-12 00:25:09.000000000 +0000
+++ linux-4.16-rc5/arch/x86/include/asm/msr.h	2018-03-12 09:06:03.902728749 +0000
@@ -218,6 +218,36 @@ static __always_inline unsigned long lon
 	return rdtsc();
 }
+/**
+ * rdtscp() - read the current TSC and (optionally) CPU number, with built-in
+ *            cancellation point replacing barrier - only available
+ *            if static_cpu_has(X86_FEATURE_RDTSCP) .
+ * returns: The 64-bit Time Stamp Counter (TSC) value.
+ * Optionally, 'cpu_out' can be non-null, and on return it will contain
+ * the number (Intel CPU ID) of the CPU that the task is currently running on.
+ * As does EAX_EDX_RET, this uses the "open-coded asm" style to
+ * force the compiler + assembler to always use (eax, edx, ecx) registers,
+ * NOT whole (rax, rdx, rcx) on x86_64 , because only 32-bit
+ * variables are used - exactly the same code should be generated
+ * for this instruction on 32-bit as on 64-bit when this asm stanza is used.
+ * See: SDM , Vol #2, RDTSCP instruction.
+ */
+static __always_inline u64 rdtscp(u32 *cpu_out)
+{
+	u32 tsc_lo, tsc_hi, tsc_cpu;
+	asm volatile
+	( "rdtscp"
+	: "=a" (tsc_lo)
+	, "=d" (tsc_hi)
+	, "=c" (tsc_cpu)
+	);
+	if ( unlikely(cpu_out != ((void*)0)) )
+		*cpu_out = tsc_cpu;
+	return ((((u64)tsc_hi) << 32) |
+		(((u64)tsc_lo) & 0x0ffffffffULL )
+	);
+}
+
 /* Deprecated, keep it for a cycle for easier merging: */
 #define rdtscll(now) do { (now) = rdtsc_ordered(); } while (0)
diff -up linux-4.16-rc5/arch/x86/include/asm/vgtod.h.4.16-rc5-p1 linux-4.16-rc5/arch/x86/include/asm/vgtod.h
--- linux-4.16-rc5/arch/x86/include/asm/vgtod.h.4.16-rc5-p1	2018-03-12 07:44:17.910539760 +0000
+++ linux-4.16-rc5/arch/x86/include/asm/vgtod.h	2018-03-12 08:51:48.204845624 +0000
@@ -26,6 +26,7 @@ struct vsyscall_gtod_data {
 	u64 raw_mask;
 	u32 raw_mult;
 	u32 raw_shift;
+	u32 has_rdtscp;
 	/* open coded 'struct timespec' */
 	u64 wall_time_snsec;
---