From: Chris Metcalf
Date: Thu, 14 May 2015 16:54:32 -0400
To: Peter Zijlstra
CC: Gilad Ben Yossef, Steven Rostedt, Ingo Molnar, Andrew Morton, Rik van Riel, Tejun Heo, Thomas Gleixner, Frederic Weisbecker, "Paul E. McKenney", Christoph Lameter, Viresh Kumar
Subject: Re: [PATCH 4/6] nohz: support PR_DATAPLANE_QUIESCE
Message-ID: <55550B88.8030306@ezchip.com>
In-Reply-To: <20150512093349.GH21418@twins.programming.kicks-ass.net>

On 05/12/2015 05:33 AM, Peter Zijlstra wrote:
> On Fri, May 08, 2015 at 01:58:45PM -0400, Chris Metcalf wrote:
>> This prctl() flag for PR_SET_DATAPLANE sets a mode that requires the
>> kernel to quiesce any pending timer interrupts prior to returning
>> to userspace. When running with this mode set, sys calls (and page
>> faults, etc.) can be inordinately slow. However, user applications
>> that want to guarantee that no unexpected interrupts will occur
>> (even if they call into the kernel) can set this flag to guarantee
>> that semantics.
> Currently people hot-unplug and hot-plug the CPU to do this. Obviously
> that's a wee bit horrible :-)
>
> Not sure if a prctl like this is any better though. This is a CPU
> properly not a process one.

The CPU property aspects, I think, should be largely handled by
fixing kernel bugs that let work end up running on nohz_full cores
without having been explicitly requested to run there.

As you said in a follow-up email:

On 05/12/2015 06:38 AM, Peter Zijlstra wrote:
> Ideally we'd never have to clear the state because it should be
> impossible to get into this predicament in the first place.

What my prctl() proposal does is quiesce things that end up
happening specifically because the user process deliberately called
into the kernel. For example, perhaps RCU was invoked in the kernel,
and the core has to wait for a timer tick to quiesce RCU. Whatever
the cause, the intent is that you're not allowed back into userspace
until everything has settled down from your call into the kernel; the
presumption is that it's all due to the kernel entry that was just
made, and not to other stray work. In that sense, it's very
appropriate for it to be a process property.

> ISTR people talking about 'quiesce' sysfs file, along side the hotplug
> stuff, I can't quite remember.

It seems somewhat similar (adding Viresh to the cc's) but does
seem like it might have been intended more to address CPU
properties than process properties:

https://lkml.org/lkml/2014/4/4/99

One thing the original Tilera dataplane code did was to allow
setting dataplane flags to succeed only on dataplane cores, and only
when the task had been affinitized to that single core. This did not
protect the task from later being re-affinitized in a way that broke
those assumptions, but I suppose you could also imagine making
sched_setaffinity() fail for such a process. This is somewhat
unrelated, but it occurred to me in the context of this reply, so
what do you think? I can certainly add this to the patch series if
it seems like it makes setting the prctl() flags more conservative.

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com
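
For illustration only, the "affinitized to that single core" precondition
mentioned above could be approximated from userspace roughly as follows.
This is a sketch under stated assumptions, not code from the patch series
or from the Tilera tree: pin_to_cpu() and single_affinity_cpu() are
hypothetical helper names, and the sketch does not check whether the CPU
is a dataplane (nohz_full) core, which the kernel-side check described
above would also require. It uses only the existing sched_setaffinity(2)
and sched_getaffinity(2) interfaces.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/*
 * Hypothetical helper: restrict the calling task to a single CPU.
 * Returns 0 on success, -1 on a bad CPU number or a failed syscall.
 */
static int pin_to_cpu(int cpu)
{
	cpu_set_t set;

	if (cpu < 0 || cpu >= CPU_SETSIZE)
		return -1;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	/* pid 0 means "the calling thread" */
	return sched_setaffinity(0, sizeof(set), &set) == 0 ? 0 : -1;
}

/*
 * Hypothetical helper: if the calling task's affinity mask contains
 * exactly one CPU, return that CPU number; otherwise return -1.
 * A task that fails this check would not meet the single-core
 * precondition described in the message.
 */
static int single_affinity_cpu(void)
{
	cpu_set_t set;
	int cpu;

	CPU_ZERO(&set);
	if (sched_getaffinity(0, sizeof(set), &set) != 0)
		return -1;
	if (CPU_COUNT(&set) != 1)
		return -1;

	for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, &set))
			return cpu;
	return -1;
}
```

A task would call pin_to_cpu() first and then confirm with
single_affinity_cpu() before requesting the proposed prctl() mode. As the
message notes, a userspace check like this cannot stop a later
re-affinitization, which is why failing sched_setaffinity() in the kernel
is raised as a possible follow-up.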