Date: Thu, 30 Jul 2015 10:31:47 -0400
From: Luiz Capitulino
To: Frederic Weisbecker
Cc: Chris Metcalf, LKML, Peter Zijlstra, Thomas Gleixner, Preeti U Murthy,
    Christoph Lameter, Ingo Molnar, Viresh Kumar, Rik van Riel
Subject: Re: [PATCH 08/10] posix-cpu-timers: Migrate to use new tick dependency mask model
Message-ID: <20150730103147.116f98ba@redhat.com>
In-Reply-To: <20150730004444.GA14744@lerouge>
References: <1437669735-8786-1-git-send-email-fweisbec@gmail.com>
    <1437669735-8786-9-git-send-email-fweisbec@gmail.com>
    <55B26E74.5040803@ezchip.com>
    <20150729132343.GC11554@lerouge>
    <55B90C40.5090000@ezchip.com>
    <20150730004444.GA14744@lerouge>
Organization: Red Hat

On Thu, 30 Jul 2015 02:44:45 +0200
Frederic Weisbecker wrote:

>
> On Wed, Jul 29, 2015 at 01:24:16PM -0400, Chris Metcalf wrote:
> > On 07/29/2015 09:23 AM, Frederic Weisbecker wrote:
> > >>At a higher level, is the posix-cpu-timers code here really providing the
> > >>>right semantics? It seems like before, the code was checking a struct
> > >>>task-specific state, and now you are setting a global state such that if ANY
> > >>>task anywhere in the system (even on housekeeping cores) has a pending posix
> > >>>cpu timer, then nothing can go into nohz_full mode.
> > >>>
> > >>>Perhaps what is needed is a task_struct->tick_dependency to go along with
> > >>>the system-wide and per-cpu flag words?
> > >That's an excellent point!
> > >Indeed the tick dependency check on posix-cpu-timers
> > >was made on task granularity before and now it's a global dependency.
> > >
> > >Which means that if any task in the system has a posix-cpu-timer enqueued, it
> > >prevents all CPUs from shutting down the tick. I need to mention that in the
> > >changelog.
> > >
> > >Now here is the rationale: I expect that nohz full users are not interested in
> > >posix cpu timers at all. The only chance for one to run without breaking the
> > >isolation is on housekeeping CPUs. So perhaps there is a corner case somewhere
> > >but I assume there isn't until somebody reports an issue.
> > >
> > >Keeping a task level dependency check means that we need to update it on context
> > >switch. Plus it's not only about task but also process. So that means two
> > >states to update on context switch and to check from interrupts. I don't think
> > >it's worth the effort if there is no user at all.
> >
> > I really worry about this! The vision EZchip offers our customers is
> > that they can run whatever they want on the slow path housekeeping
> > cores, i.e. random control-plane code. Then, on the fast-path cores,
> > they run their nohz_full stuff without interruption. Often they don't
> > even know what the hell is running on their control plane cores - SNMP
> > or random third-party crap or god knows what. And there is a decent
> > likelihood that some posix cpu timer code might sneak in.

I share this thinking. We do exactly the same thing for KVM-RT and I
wouldn't be surprised at all if a posix timer pops up on the housekeeping
CPUs.

> I see. But note that installing a posix cpu timer ends up triggering an
> IPI to all nohz full CPUs. That's how nohz full has always behaved.
> So users running posix timers on nohz should already suffer issues anyway.

I haven't checked how this would affect us, but it seems a lot less serious
than not having nohz at all.

>
> >
> > You mentioned needing two fields, for task and for process, but in
> > fact let's just add the one field to the one thing that needs it and
> > not worry about additional possible future needs. And note that it's
> > the task_struct->signal where we need to add the field for posix cpu
> > timers (the signal_struct) since that's where the sharing occurs, and
> > given CLONE_SIGHAND I imagine it could be different from the general
> > "process" model anyway.
>
> Well, posix cpu timers can be installed per process (signal struct) or
> per thread (task struct).
>
> But we can certainly simplify that with a per process flag and expand
> the thread dependency to the process scope.
>
> Still there is the issue of telling the CPUs where a process runs when
> a posix timer is installed there. There is no process-like tsk->cpus_allowed.
> Either we send an IPI everywhere like we do now or we iterate through all
> threads in the process to OR all their cpumasks in order to send that IPI.
>
> >
> > In any case it seems like we don't need to do work at context switch.
> > Updates to the task's tick_dependency are just done as normal in the
> > task context via "current->signal->". When we are returning to user
> > space and we want to check the tick, again, we can just read via
> > "current->signal->". Why would we need to copy the value around at
> > task switch time? That's only necessary if you want to do something
> > like read/write the task tick_dependency via the cpu index, I would think.
>
> Yeah you're right, at least the context switch should be fine.
>
> Thanks.
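
For anyone following along, the per-process idea discussed above can be
modelled roughly as below. This is only a userspace sketch under stated
assumptions, not kernel code and not what the patch does: struct
model_process, the model_* functions, TICK_DEP_POSIX_TIMER, and the 8-CPU
uint32_t bitmask are all hypothetical names invented for illustration. It
shows a tick check that consults a global word, a per-CPU word, and a word
on the current process (standing in for a field reached via
current->signal), plus the "OR all the threads' cpumasks" option for
deciding which CPUs to IPI.

```c
/* Userspace model only: none of these names are real kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NR_CPUS 8
#define TICK_DEP_POSIX_TIMER (1u << 0)  /* hypothetical dependency bit */

/* Stands in for the signal_struct shared by all threads of a process. */
struct model_process {
    uint32_t tick_dependency;            /* hypothetical per-process flag word */
    size_t nr_threads;
    const uint32_t *thread_cpus_allowed; /* one allowed-CPU bitmask per thread */
};

static uint32_t global_tick_dependency;             /* system-wide flag word */
static uint32_t cpu_tick_dependency[MODEL_NR_CPUS]; /* per-CPU flag words */

/* The tick may stop on `cpu` only if no level holds a dependency. */
bool model_can_stop_tick(int cpu, const struct model_process *curr)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    if (global_tick_dependency || cpu_tick_dependency[cpu])
        return false;
    /* Read through the current process, so nothing is copied at context switch. */
    return !(curr && curr->tick_dependency);
}

/* OR together the allowed-CPU masks of every thread in the process. */
uint32_t model_process_cpus(const struct model_process *p)
{
    uint32_t mask = 0;
    for (size_t i = 0; i < p->nr_threads; i++)
        mask |= p->thread_cpus_allowed[i];
    return mask;
}

/* Set the per-process bit and return the CPUs that would need an IPI. */
uint32_t model_install_posix_cpu_timer(struct model_process *p)
{
    p->tick_dependency |= TICK_DEP_POSIX_TIMER;
    return model_process_cpus(p);
}
```

In this model, a timer installed by a process confined to the housekeeping
CPUs yields an IPI mask covering only those CPUs, and a process with no
timers running on another CPU still passes the tick check.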
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/