From: Andy Lutomirski
Date: Tue, 12 May 2015 21:35:25 -0700
Subject: Re: [PATCH 4/6] nohz: support PR_DATAPLANE_QUIESCE
To: Ingo Molnar
Cc: Peter Zijlstra, Chris Metcalf, Gilad Ben Yossef, Steven Rostedt,
    Ingo Molnar, Andrew Morton, Rik van Riel, Tejun Heo, Thomas Gleixner,
    Frederic Weisbecker, "Paul E. McKenney", Christoph Lameter,
    "Srivatsa S. Bhat", linux-doc@vger.kernel.org, Linux API,
    linux-kernel@vger.kernel.org
In-Reply-To: <20150512125200.GB17244@gmail.com>
References: <1431107927-13998-1-git-send-email-cmetcalf@ezchip.com>
    <1431107927-13998-5-git-send-email-cmetcalf@ezchip.com>
    <20150512093349.GH21418@twins.programming.kicks-ass.net>
    <20150512095030.GD11477@gmail.com>
    <20150512103805.GJ21418@twins.programming.kicks-ass.net>
    <20150512125200.GB17244@gmail.com>

On Tue, May 12, 2015 at 5:52 AM, Ingo Molnar wrote:
>
> * Peter Zijlstra wrote:
>
>> > So if then a prctl() (or other system call) could be a shortcut
>> > to:
>> >
>> >  - move the task to an isolated CPU
>> >  - make sure there _is_ such an isolated domain available
>> >
>> > I.e. have some programmatic, kernel provided way for an
>> > application to be sure it's running in the right environment.
>> > Relying on random administration flags here and there won't cut
>> > it.
>>
>> No, we already have sched_setaffinity() and we should not duplicate
>> its ability to move tasks about.
>
> But sched_setaffinity() does not guarantee isolation - it's just a
> syscall to move a task to a set of CPUs, which might be isolated or
> not.
>
> What I suggested is that it might make sense to offer a system call,
> for example a sched_setparam() variant, that makes such guarantees.
>
> Say if user-space does:
>
>    ret = sched_setscheduler(0, BIND_ISOLATED, &isolation_params);
>
> ... then we would get the task moved to an isolated domain and get a 0
> return code if the kernel is able to do all that and if the current
> uid/namespace/etc. has the required permissions and such.
>
> ( BIND_ISOLATED will not replace the current p->policy value, so it's
>   still possible to use the regular policies as well on top of this. )

I think we shouldn't have magic selection of an isolated domain.
Anyone using this has already configured some isolated CPUs and
probably wants to choose the CPU and, especially, NUMA node
themselves.  Also, maybe it should be a special type of realtime
class/priority -- doing this should require RT permission IMO.

--Andy
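
[Illustrative sketch, not part of the original message.] For readers
unfamiliar with the existing interface, this is roughly what "choose the
CPU yourself" looks like with plain sched_setaffinity(). The helper name
pin_to_cpu() is hypothetical, and the CPU number is whatever the caller
decides. Nothing here checks or guarantees that the CPU is isolated
(that is the gap raised in the quoted text), and NUMA placement is not
covered.

```c
#define _GNU_SOURCE
#include <sched.h>

/*
 * Hypothetical helper: restrict the calling thread to a single CPU
 * picked by the caller.  Returns 0 on success, -1 on failure with
 * errno set by sched_setaffinity().
 *
 * This only moves the task; it says nothing about whether `cpu` is
 * actually isolated from other work.
 */
static int pin_to_cpu(int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);

	/* pid 0 means "the calling thread". */
	return sched_setaffinity(0, sizeof(set), &set);
}
```

A caller would pick a CPU it has already set aside (for example via boot
parameters or cpusets, which are outside this sketch), call
pin_to_cpu(cpu), and check the return value; sched_getcpu() can then be
used to confirm where the thread is running.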