Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2992538AbbEOWRm (ORCPT ); Fri, 15 May 2015 18:17:42 -0400 Received: from www.linutronix.de ([62.245.132.108]:43407 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2992435AbbEOWRj (ORCPT ); Fri, 15 May 2015 18:17:39 -0400 Date: Sat, 16 May 2015 00:17:30 +0200 (CEST) From: Thomas Gleixner To: Chris Metcalf cc: Gilad Ben Yossef , Steven Rostedt , Ingo Molnar , Peter Zijlstra , Andrew Morton , Rik van Riel , Tejun Heo , Frederic Weisbecker , "Paul E. McKenney" , Christoph Lameter , Viresh Kumar , linux-doc@vger.kernel.org, linux-api@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v2 1/5] nohz_full: add support for "cpu_isolated" mode In-Reply-To: <1431725251-20943-1-git-send-email-cmetcalf@ezchip.com> Message-ID: References: <1431725178-20876-1-git-send-email-cmetcalf@ezchip.com> <1431725251-20943-1-git-send-email-cmetcalf@ezchip.com> User-Agent: Alpine 2.11 (DEB 23 2013-08-11) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Linutronix-Spam-Score: -1.0 X-Linutronix-Spam-Level: - X-Linutronix-Spam-Status: No , -1.0 points, 5.0 required, ALL_TRUSTED=-1,SHORTCIRCUIT=-0.0001 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3059 Lines: 91 On Fri, 15 May 2015, Chris Metcalf wrote: > +/* > + * We normally return immediately to userspace. > + * > + * In "cpu_isolated" mode we wait until no more interrupts are > + * pending. Otherwise we nap with interrupts enabled and wait for the > + * next interrupt to fire, then loop back and retry. > + * > + * Note that if you schedule two "cpu_isolated" processes on the same > + * core, neither will ever leave the kernel, and one will have to be > + * killed manually. And why are we not preventing that situation in the first place? The scheduler should be able to figure that out easily.. > + Otherwise in situations where another process is > + * in the runqueue on this cpu, this task will just wait for that > + * other task to go idle before returning to user space. > + */ > +void tick_nohz_cpu_isolated_enter(void) > +{ > + struct clock_event_device *dev = > + __this_cpu_read(tick_cpu_device.evtdev); > + struct task_struct *task = current; > + unsigned long start = jiffies; > + bool warned = false; > + > + /* Drain the pagevecs to avoid unnecessary IPI flushes later. */ > + lru_add_drain(); > + > + while (ACCESS_ONCE(dev->next_event.tv64) != KTIME_MAX) { What's the ACCESS_ONCE for? > + if (!warned && (jiffies - start) >= (5 * HZ)) { > + pr_warn("%s/%d: cpu %d: cpu_isolated task blocked for %ld jiffies\n", > + task->comm, task->pid, smp_processor_id(), > + (jiffies - start)); What additional value has the jiffies delta over a plain human readable '5sec' ? > + warned = true; > + } > + if (should_resched()) > + schedule(); > + if (test_thread_flag(TIF_SIGPENDING)) > + break; > + > + /* Idle with interrupts enabled and wait for the tick. */ > + set_current_state(TASK_INTERRUPTIBLE); > + arch_cpu_idle(); Oh NO! Not another variant of fake idle task. The idle implementations can call into code which rightfully expects that the CPU is actually IDLE. I wasted enough time already debugging the resulting wreckage. Feel free to use it for experimental purposes, but this is not going anywhere near to a mainline kernel. I completely understand WHY you want to do that, but we need proper mechanisms for that and not some duct tape engineering band aids which will create hard to debug side effects. Hint: It's a scheduler job to make sure that the machine has quiesced _BEFORE_ letting the magic task off to user land. > + set_current_state(TASK_RUNNING); > + } > + if (warned) { > + pr_warn("%s/%d: cpu %d: cpu_isolated task unblocked after %ld jiffies\n", > + task->comm, task->pid, smp_processor_id(), > + (jiffies - start)); > + dump_stack(); And that dump_stack() tells us which important information? tick_nohz_cpu_isolated_enter context_tracking_enter context_tracking_user_enter arch_return_to_user_code Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/