From: Frederic Weisbecker
Date: Wed, 9 Nov 2016 11:07:56 +0000
Subject: Re: task isolation discussion at Linux Plumbers
To: Chris Metcalf
Cc: Gilad Ben Yossef, Steven Rostedt, Ingo Molnar, Peter Zijlstra,
    Andrew Morton, Rik van Riel, Tejun Heo, Thomas Gleixner,
    "Paul E. McKenney", Christoph Lameter, Viresh Kumar,
    Catalin Marinas, Will Deacon, Andy Lutomirski, Daniel Lezcano,
    Francis Giraldeau, Andi Kleen, Arnd Bergmann, LKML

2016-11-05 4:04 GMT+00:00 Chris Metcalf <cmetcalf@mellanox.com>:
> A bunch of people got together this week at the Linux Plumbers
> Conference to discuss nohz_full, task isolation, and related stuff.
> (Thanks to Thomas for getting everyone gathered at one place and time!)
>
> Here are the notes I took; I welcome any corrections and follow-up.

Thanks for that report Chris!

> == rcu_nocbs ==
>
> We started out by discussing this option. It is automatically enabled
> by nohz_full, but we spent a little while side-tracking on the
> implementation of one kthread per rcu flavor per core. The suggestion
> was made (by Peter or Andy; I forget) that each kthread could handle
> all flavors per core by using a dedicated worklist. It certainly
> seems like removing potentially dozens or hundreds of kthreads from
> larger systems will be a win if this works out.
>
> Paul said he would look into this possibility.

Sounds good.

> == Remote statistics ==
>
> We discussed the possibility of remote statistics gathering, i.e. load
> average etc. The idea would be that we could have housekeeping
> core(s) periodically iterate over the nohz cores to load their rq
> remotely and do update_current etc. Presumably it should be possible
> for a single housekeeping core to handle doing this for all the
> nohz_full cores, as we only need to do it quite infrequently.
>
> Thomas suggested that this might be the last remaining thing that
> needed to be done to allow disabling the current behavior of falling
> back to a 1 Hz clock in nohz_full.
>
> I believe Thomas said he had a patch to do this already.

There are also some other details around update_curr to take care of,
but that's certainly a big piece of it. I wish we could find a solution
that doesn't involve remote accounting, but at least it would be a
first step. I have let that idea rot for too long; I need to get my
hands dirty with it for good.
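Just to make that direction concrete, the housekeeping side could be
little more than a low frequency scan of this shape (hand-written
sketch only, not against any tree; it would have to live in
kernel/sched/ since it pokes at rq internals, the function name is made
up, and the interesting work hides behind the comment):

/*
 * Illustrative sketch: something a housekeeping CPU could run every
 * second or so (from a delayed work or an hrtimer).  Assumes it sits
 * in kernel/sched/ where cpu_rq() and update_rq_clock() are visible.
 */
static void remote_stats_scan(void)
{
	int cpu;

	for_each_cpu(cpu, tick_nohz_full_mask) {
		struct rq *rq = cpu_rq(cpu);
		unsigned long flags;

		/* Take the remote runqueue lock so its clock and the
		 * current task's accounting can be advanced safely. */
		raw_spin_lock_irqsave(&rq->lock, flags);
		update_rq_clock(rq);
		/*
		 * Here we would fold in the elapsed runtime of rq->curr
		 * (what update_curr() does for CFS) plus whatever load
		 * and statistics updates the residual 1 Hz tick does
		 * today.
		 */
		raw_spin_unlock_irqrestore(&rq->lock, flags);
	}
}

Since this only has to run infrequently for all nohz_full CPUs, a
single housekeeping CPU should absorb it easily.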
> == Disabling the dyn tick ==
>
> One issue that the current task isolation patch series encounters is
> when we request disabling the dyntick, but it doesn't happen. At the
> moment we just wait until the tick is properly disabled, by
> busy-waiting in the kernel (calling schedule etc as needed). No one
> is particularly fond of this scheme. The consensus seems to be to try
> harder to figure out what is going on, fix whatever problems exist,
> then consider it a regression going forward if something causes the
> dyntick to become difficult to disable again in the future. I will
> take a look at this and try to gather more data on if and when this is
> happening in 4.9.

We could enhance the dynticks tracing, for example by expanding the
tick stop failure codes, in order to report more details about what's
going on.

> == Missing oneshot_stopped callbacks ==
>
> I raised the issue that various clock_event_device sources don't
> always support oneshot_stopped, which can cause an additional final
> interrupt to occur after the timer infrastructure believes the
> interrupt has been stopped. I have patches to fix this for tile and
> arm64 in my patch series; Thomas volunteered to look at adding
> equivalent support for x86.
>
> Many thanks to all those who participated in the discussion.
> Frederic, we wished you had been there!

I wish I had too!
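P.S. For whoever ends up doing the x86 side of the oneshot_stopped
work, the per-driver part is mostly a matter of wiring up the callback
so the device really stops firing. Roughly (illustrative only;
"myclkevt" and the masking detail are made up, Chris's tile and arm64
patches are the real reference):

#include <linux/clockchips.h>

/*
 * Sketch of a clock_event_device implementing the ONESHOT_STOPPED
 * state.  The callback must silence the device so no stray final
 * interrupt arrives after the tick code believes the timer is stopped.
 */
static int myclkevt_set_oneshot_stopped(struct clock_event_device *evt)
{
	/* Device specific: mask or cancel the pending timer interrupt. */
	return 0;
}

static struct clock_event_device myclkevt = {
	.name				= "myclkevt",
	.features			= CLOCK_EVT_FEAT_ONESHOT,
	.set_state_oneshot_stopped	= myclkevt_set_oneshot_stopped,
	/* .set_next_event, .rating, .cpumask, etc. as usual */
};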