Date: Mon, 3 Aug 2015 20:01:09 +0200
From: Frederic Weisbecker
To: Chris Metcalf
Cc: LKML, Peter Zijlstra, Thomas Gleixner, Preeti U Murthy,
    Christoph Lameter, Ingo Molnar, Viresh Kumar, Rik van Riel
Subject: Re: [PATCH 08/10] posix-cpu-timers: Migrate to use new tick
    dependency mask model
Message-ID: <20150803180108.GD26022@lerouge>
In-Reply-To: <55BF8FCB.6060409@ezchip.com>

On Mon, Aug 03, 2015 at 11:59:07AM -0400, Chris Metcalf wrote:
> On 07/31/2015 10:49 AM, Frederic Weisbecker wrote:
> >Instead of doing a per signal dependency, I'm going to use a per task
> >one. Which means that if a per-process timer is enqueued, every thread
> >of that process will have the tick dependency. But if the timer is
> >enqueued to a single thread, only the thread is concerned.
> >
> >We'll see if offloading becomes really needed. It's not quite free because
> >the housekeepers will have to poll on all nohz CPUs at a Hz frequency.
>
> Seems reasonable for now!
>
> Why would we need the Hz frequency polling, though?
> I would think it should be possible to just arrange it such that the
> timer for posix cpu timers would just always be placed either on the
> core that requested it, or if that core is nohz_full, on a housekeeping
> core. Then it would eventually fire from the housekeeping core,
> and the logic could be such that (for a process-wide timer) it
> would preferentially interrupt threads from that process that
> were running on the housekeeping cores. No polling.

But you need to periodically poll for timer expiration from a
housekeeper. It's not only about firing the timer, it's about elapsing
it against the target cputime. Since there is no tick on a nohz full CPU
to account the time spent by the task, you must do that elsewhere. And
if you don't poll at a sufficient frequency, the time accounted is less
precise (a quick round-trip to kernel space can be missed if the polling
frequency is too low).

Or you can combine it with VIRT_CPU_ACCOUNTING_GEN, which we are
currently using and which records the time spent in user and kernel
space using hooks. Still, you must check periodically that the timer
hasn't expired, with a polling period that never exceeds the time left
before expiration. That's easy in the case of a timer attached to a
single task, but what about a timer attached to a process? You must poll
at least every expiration/nr_threads, so you must handle thread creation
as well.

Offloading posix timers sounds like a big headache if we don't poll at
Hz frequency.

That said, Rik has posted patches that offload cputime accounting. I'm
not yet sure this patchset is a good idea, but offloading posix timers
can be done on top of that.

Another thing: now I recall why I turned posix timers into a global tick
dependency. In the case of a per task/process dependency we still need
the context switch hook, because if we enqueue a timer to a sleeping
task, the tick must be restarted when the task wakes up. And that
requires a check on context switch.
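
To illustrate the expiration/nr_threads bound mentioned above, here is a
minimal sketch in plain userspace C. The helper name
max_poll_interval_ns() and its parameters are hypothetical and exist
only for this example; this is not kernel code or an existing kernel
API. It assumes the worst case where every thread of the process runs in
parallel, so process cputime advances nr_threads times faster than wall
clock time.

```c
#include <stdint.h>

/*
 * Hypothetical helper (illustration only, not a kernel API).
 *
 * remaining_ns: cputime left before the timer expires.
 * nr_threads:   number of threads whose cputime is charged to the timer
 *               (1 for a timer attached to a single task).
 *
 * Returns the longest wall clock delay a housekeeper could wait before
 * the next expiry check without risking going past the expiration,
 * assuming all threads may run concurrently.
 */
static uint64_t max_poll_interval_ns(uint64_t remaining_ns,
				     unsigned int nr_threads)
{
	/* No thread to account: nothing tightens the bound. */
	if (nr_threads == 0)
		return remaining_ns;

	return remaining_ns / nr_threads;
}
```

With 1 s of cputime remaining, a process with 4 threads would have to be
checked within 250 ms. If a fifth thread is created, the bound drops to
200 ms, which is why thread creation has to be handled too.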