Date: Fri, 17 Jul 2015 16:23:13 +0000 (UTC)
From: Mathieu Desnoyers
To: Nikolay Borisov
Cc: Paul Turner, linux-kernel@vger.kernel.org, Andrew Hunter, Peter Zijlstra,
    Ingo Molnar, Ben Maurer, rostedt, "Paul E. McKenney", Josh Triplett,
    Linus Torvalds, Andrew Morton, linux-api
Message-ID: <1277152121.1054.1437150193382.JavaMail.zimbra@efficios.com>
In-Reply-To: <55A8F9B2.2070008@siteground.com>
References: <1437076851-14848-1-git-send-email-mathieu.desnoyers@efficios.com>
    <55A8F9B2.2070008@siteground.com>
Subject: Re: [RFC PATCH] thread_local_abi system call: caching current CPU number (x86)
X-Mailing-List: linux-kernel@vger.kernel.org

----- On Jul 17, 2015, at 8:48 AM, Nikolay Borisov n.borisov@siteground.com wrote:

> On 07/16/2015 11:00 PM, Mathieu Desnoyers wrote:
>> Expose a new system call allowing threads to register a userspace memory
>> area where to store the current CPU number. Scheduler migration sets the
>> TIF_NOTIFY_RESUME flag on the current thread. Upon return to user-space,
>> a notify-resume handler updates the current CPU value within that
>> user-space memory area.
>>
>> This getcpu cache is an alternative to the sched_getcpu() vdso which has
>> a few benefits:
>> - It is faster to do a memory read than to call a vDSO,
>> - This cache value can be read from within an inline assembly, which
>>   makes it a useful building block for restartable sequences.
>>
>> This approach is inspired by Paul Turner and Andrew Hunter's work
>> on percpu atomics, which lets the kernel handle restart of critical
>> sections:
>> Ref.:
>> * https://lkml.org/lkml/2015/6/24/665
>> * https://lwn.net/Articles/650333/
>> *
>> http://www.linuxplumbersconf.org/2013/ocw/system/presentations/1695/original/LPC%20-%20PerCpu%20Atomics.pdf
>>
>> Benchmarking sched_getcpu() vs tls cache approach. Getting the
>> current CPU number:
>>
>> - With Linux vdso: 12.7 ns
>> - With TLS-cached cpu number: 0.3 ns
>>
>> The system call can be extended by registering a larger structure in
>> the future.
>>
[...]
>> +/*
>> + * sys_thread_local_abi - setup thread-local ABI for caller thread
>> + */
>> +SYSCALL_DEFINE3(thread_local_abi, struct thread_local_abi __user *, tlap,
>> +		size_t, len, int, flags)
>> +{
>> +	size_t minlen;
>> +
>> +	if (flags)
>> +		return -EINVAL;
>> +	if (current->thread_local_abi && tlap)
>> +		return -EBUSY;
>> +	/* Agree on the intersection of userspace and kernel features */
>> +	minlen = min_t(size_t, len, sizeof(struct thread_local_abi));
>> +	current->thread_local_abi_len = minlen;
>> +	current->thread_local_abi = tlap;
>> +	if (!tlap)
>> +		return 0;
>> +	/*
>> +	 * Migration checks ->thread_local_abi to see if notify_resume
>> +	 * flag should be set. Therefore, we need to ensure that
>> +	 * the scheduler sees ->thread_local_abi before we update its content.
>> +	 */
>> +	barrier();	/* Store thread_local_abi before update content */
>> +	if (getcpu_cache_active(current)) {
>
> Just checking whether my understanding of the code is correct, but this
> 'if' is necessary in case we have been moved to a different CPU after
> the store of the thread_local_abi?

No, this is not correct. Currently, only the getcpu_cache feature is
implemented, but if struct thread_local_abi eventually grows with more
fields, userspace could call the kernel with a "len" argument that does
not cover some of the features. Therefore, the generic way to check
whether getcpu_cache is enabled for the current thread is to call
"getcpu_cache_active()". If it is enabled, then we need to update the
getcpu_cache content for the current thread.

The barrier() above is required because thread_local_abi (and
thread_local_abi_len) must be stored before we get the current CPU
number and store it into the getcpu_cache. With CONFIG_PREEMPT=y, the
scheduler could migrate us at any point between the moment we read the
current CPU number within getcpu_cache_update() and the return to
userspace. Having thread_local_abi and thread_local_abi_len set before
fetching the current CPU number ensures that the scheduler's own
getcpu_cache_active() check will pass, so it will raise the resume
notifier flag upon migration, which will then fix the CPU number before
resuming userspace.

Thanks,

Mathieu

>
>> +		if (getcpu_cache_update(current))
>> +			return -EFAULT;
>> +	}
>> +	return minlen;
>> +}

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
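
To make the ordering argument in the reply concrete, here is a minimal
userspace sketch that models it. It is not the kernel code. The
struct task_model type, the migrate() and resume_userspace() helpers,
the cpu_seen_after_register() function and the "cpu" field name are all
hypothetical stand-ins chosen for illustration. The sketch assumes that
getcpu_cache_active() amounts to checking that the negotiated length
covers the CPU field, and it injects a migration right after the CPU
number is read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: a single 32-bit CPU field (the field name is a guess). */
struct thread_local_abi {
	int32_t cpu;
};

/* Hypothetical stand-in for the per-task state the patch adds. */
struct task_model {
	struct thread_local_abi *tlabi;	/* registered area, NULL if none */
	size_t tlabi_len;		/* negotiated length */
	int cpu;			/* CPU the task is running on */
	bool notify_resume;		/* models TIF_NOTIFY_RESUME */
};

/* The feature is active only if the negotiated length covers the field. */
static bool getcpu_cache_active(const struct task_model *t)
{
	return t->tlabi != NULL &&
	       t->tlabi_len >= offsetof(struct thread_local_abi, cpu) +
			       sizeof(t->tlabi->cpu);
}

/* Models scheduler migration: flag the task only if the cache is active. */
static void migrate(struct task_model *t, int new_cpu)
{
	t->cpu = new_cpu;
	if (getcpu_cache_active(t))
		t->notify_resume = true;
}

/* Models the notify-resume handler run on return to userspace. */
static void resume_userspace(struct task_model *t)
{
	if (t->notify_resume) {
		t->notify_resume = false;
		t->tlabi->cpu = t->cpu;
	}
}

/*
 * Models registration followed by return to userspace. A migration to
 * migrate_to (if >= 0) is injected right after the CPU number is read.
 * Returns the CPU number userspace would then see in its cache, or -1
 * if the cache was never written.
 */
int cpu_seen_after_register(size_t len, int start_cpu, int migrate_to,
			    bool publish_first)
{
	struct thread_local_abi area = { .cpu = -1 };
	struct task_model t = { .cpu = start_cpu };
	size_t minlen = len < sizeof(area) ? len : sizeof(area);
	int cpu;

	assert(start_cpu >= 0);
	if (publish_first) {		/* order used by the patch */
		t.tlabi_len = minlen;
		t.tlabi = &area;
	}
	cpu = t.cpu;			/* read current CPU number */
	if (migrate_to >= 0)
		migrate(&t, migrate_to);	/* preempted and migrated here */
	if (!publish_first) {		/* broken order, for comparison */
		t.tlabi_len = minlen;
		t.tlabi = &area;
	}
	if (getcpu_cache_active(&t))
		area.cpu = cpu;		/* possibly stale store */
	resume_userspace(&t);
	return area.cpu;
}
```

In this model, calling cpu_seen_after_register(sizeof(struct
thread_local_abi), 0, 3, true) yields 3: the stale value 0 is stored,
but the migration raised the flag and the resume handler corrects it.
With publish_first set to false the same call yields the stale 0,
because the migration ran before the area was visible and no flag was
raised. With len set to 0 the cache is never written, which is the case
the getcpu_cache_active() check in the syscall guards against.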