Date: Thu, 7 May 2015 14:18:49 +0200
From: Frederic Weisbecker
To: Ingo Molnar
Cc: Rik van Riel, Andy Lutomirski, Mike Galbraith,
    "linux-kernel@vger.kernel.org", X86 ML, williams@redhat.com,
    Andrew Lutomirski, fweisbec@redhat.com, Peter Zijlstra,
    Heiko Carstens, Thomas Gleixner, Ingo Molnar, Paolo Bonzini
Subject: Re: [PATCH 3/3] context_tracking,x86: remove extraneous irq disable & enable from context tracking on syscall entry
Message-ID: <20150507121848.GB32271@lerouge>
In-Reply-To: <20150507104845.GB14924@gmail.com>

On Thu, May 07, 2015 at 12:48:45PM +0200, Ingo Molnar wrote:
>
> * Rik van Riel wrote:
>
> > > If, on the other hand, you're just going to remotely sample the
> > > in-memory context, that sounds good.
> >
> > It's the latter.
> >
> > If you look at /proc//{stack,syscall,wchan} and other files,
> > you will see we already have ways to determine, from in memory
> > content, where a program is running at a certain point in time.
> >
> > In fact, the timer interrupt based accounting does a similar thing.
> > It has a task examine its own in-memory state to figure out what it
> > was doing before the timer interrupt happened.
> >
> > The kernel side stack pointer is probably enough to tell us whether
> > a task is active in kernel space, on an irq stack, or (maybe) in
> > user space. Not convinced about the latter, we may need to look at
> > the same state the RCU code keeps track of to see what mode a task
> > is in...
> >
> > I am looking at the code to see what locks we need to grab.
> >
> > I suspect the runqueue lock may be enough, to ensure that the task
> > struct, and stack do not go away while we are looking at them.
>
> That will be enough, especially if you get to the task reference via
> rq->curr.
>
> > We cannot take the lock_trace(task) from irq context, and we
> > probably do not need to anyway, since we do not care about a precise
> > stack trace for the task.
>
> So one worry with this and similar approaches of statistically
> detecting user mode would be the fact that on the way out to
> user-space we don't really destroy the previous call trace - we just
> pop off the stack (non-destructively), restore RIPs and are gone.
>
> We'll need that percpu flag I suspect.

Note we have the context tracking state which tells where the current task
is: user/system/guest.

>
> And once we have the flag, we can get rid of the per syscall RCU
> callback as well, relatively easily: with CMPXCHG (in
> synchronize_rcu()!) we can reliably sample whether a CPU is in user
> mode right now, while the syscall entry/exit path does not use any
> atomics, we can just use a simple MOV.
>
> Once we observe 'user mode', then we have observed quiescent state and
> synchronize_rcu() can continue. If we've observed kernel mode we can
> frob the remote task's TIF_ flags to make it go into a quiescent state
> publishing routine on syscall-return.
>
> The only hard requirement of this scheme from the RCU synchronization
> POV is that all kernel contexts that may touch RCU state need to flip
> this flag reliably to 'kernel mode': i.e. all irq handlers, traps,
> NMIs and all syscall variants need to do this.
>
> But once it's there, it's really neat.
>
> Thanks,
>
> 	Ingo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
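
For illustration only, the flag scheme quoted above (plain store on kernel
entry/exit, CMPXCHG from the remote sampler, a deferred "publish a quiescent
state" request when kernel mode is observed) could be modelled in user space
roughly as below. This is a sketch under assumptions, not kernel code: the
names cpu_ctx, ctx_init, ctx_kernel_enter, ctx_kernel_exit, sample_remote_cpu
and need_qs are all hypothetical, need_qs merely stands in for a TIF_ flag,
and the model only shows the state transitions. It says nothing about whether
the memory ordering of a plain MOV racing with a remote CMPXCHG is sufficient
on real hardware.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-CPU context flag values. */
enum ctx_state { CTX_USER = 0, CTX_KERNEL = 1 };

/* Hypothetical per-CPU record; not a real kernel structure. */
struct cpu_ctx {
	_Atomic int state;      /* CTX_USER or CTX_KERNEL */
	atomic_bool need_qs;    /* stand-in for a TIF_ flag set remotely */
	unsigned long qs_count; /* quiescent states published by this CPU */
};

static void ctx_init(struct cpu_ctx *c)
{
	atomic_init(&c->state, CTX_USER);
	atomic_init(&c->need_qs, false);
	c->qs_count = 0;
}

/* Entry path (syscall, irq, trap, NMI): a simple store, no locked op. */
static void ctx_kernel_enter(struct cpu_ctx *c)
{
	atomic_store_explicit(&c->state, CTX_KERNEL, memory_order_relaxed);
}

/* Exit path: publish a quiescent state only if a sampler asked for one. */
static void ctx_kernel_exit(struct cpu_ctx *c)
{
	if (atomic_load_explicit(&c->need_qs, memory_order_relaxed)) {
		/* Slow path, taken only after a remote request. */
		atomic_store(&c->need_qs, false);
		c->qs_count++;
	}
	atomic_store_explicit(&c->state, CTX_USER, memory_order_release);
}

/*
 * Remote sampler, as a grace-period waiter might use it. The
 * compare-exchange USER -> USER succeeds only while the CPU is in user
 * mode; in that case a quiescent state has been observed. Otherwise
 * the CPU is asked to publish one on its way back out.
 */
static bool sample_remote_cpu(struct cpu_ctx *c)
{
	int expected = CTX_USER;

	if (atomic_compare_exchange_strong(&c->state, &expected, CTX_USER))
		return true;

	atomic_store(&c->need_qs, true);
	return false;
}
```

In this model a waiter would call sample_remote_cpu() once per CPU and treat
a true return as an observed quiescent state; a false return means it has to
wait until that CPU passes through ctx_kernel_exit().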