From: Luca Abeni
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli, Claudio Scordino, Steven Rostedt, Luca Abeni
Subject: [RFC v3 0/6] CPU reclaiming for SCHED_DEADLINE
Date: Mon, 24 Oct 2016 16:06:32 +0200
Message-Id: <1477317998-7487-1-git-send-email-luca.abeni@unitn.it>

Hi all,

this patchset implements CPU reclaiming (using the GRUB algorithm[1])
for SCHED_DEADLINE: basically, this feature allows SCHED_DEADLINE tasks
to consume more than their reserved runtime, up to a maximum fraction
of the CPU time (so that other tasks are left some spare CPU time to
execute), as long as this does not break the guarantees of other
SCHED_DEADLINE tasks.
The patchset applies on top of tip/master.

The implemented CPU reclaiming algorithm is based on tracking the
utilization U_act of active tasks (first 2 patches), and modifying the
runtime accounting rule (see patch 0004). The original GRUB algorithm is
modified as described in [2] to support multiple CPUs (the original
algorithm considered only a single CPU; this one tracks U_act per
runqueue) and to leave an "unreclaimable" fraction of CPU time to
non-SCHED_DEADLINE tasks (see patch 0005: the original algorithm can
consume 100% of the CPU time, starving all the other tasks).
Patch 0003 uses the "inactive timer" (introduced in patch 0002) to fix
dl_overflow() and __setparam_dl().
Patch 0006 makes it possible to enable CPU reclaiming only on selected
tasks.
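To make the accounting idea above concrete, here is a minimal user-space
sketch (not the code from the patches). The struct and function names are
hypothetical. charge_grub() uses the original GRUB rule from [1],
dq = -U_act dt. charge_grub_capped() shows one possible way to leave an
unreclaimable fraction, dq = -(U_act / U_max) dt; that formula is an
assumption for illustration and is not stated in this cover letter.

```c
#include <assert.h>

/* Hypothetical per-runqueue state: sum of the utilizations
 * (runtime / period) of the tasks that are currently active. */
struct rq_sketch {
	double u_act;
};

/* A task becomes active (for example, it wakes up). */
void add_active_util(struct rq_sketch *rq, double u)
{
	assert(u >= 0.0 && u <= 1.0);
	rq->u_act += u;
}

/* A task stops being active (for example, its inactive timer fires). */
void sub_active_util(struct rq_sketch *rq, double u)
{
	assert(u >= 0.0 && u <= 1.0);
	rq->u_act -= u;
	if (rq->u_act < 0.0)	/* guard against rounding below zero */
		rq->u_act = 0.0;
}

/* Plain accounting: executing for delta consumes delta of runtime. */
double charge_plain(double delta)
{
	return delta;
}

/* Original GRUB rule, dq = -U_act dt: the less active utilization
 * there is on the runqueue, the slower the runtime is consumed. */
double charge_grub(const struct rq_sketch *rq, double delta)
{
	return rq->u_act * delta;
}

/* Assumed capped variant, dq = -(U_act / u_max) dt, where u_max is
 * the maximum fraction of CPU time that may be reclaimed. */
double charge_grub_capped(const struct rq_sketch *rq, double delta,
			  double u_max)
{
	assert(u_max > 0.0 && u_max <= 1.0);
	return rq->u_act / u_max * delta;
}
```

With a single active task of utilization 0.25, charge_grub() charges only a
quarter of the executed time, so the task could reclaim the whole CPU; with
u_max = 0.5 the capped variant charges half of the executed time, so the
same task can use at most half of the CPU.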

Changes since v2: in general, I tried to address all the comments I
received, and to add some more comments in "critical" parts of the code.
In particular, I:
- Updated to latest tip/master. This required some changes (for example,
  using "struct rq_flags" instead of "unsigned long" for task_rq_lock())
- Merged patches 0001 and 0002, as suggested by Juri
- Added some comments about GRUB in the changelog of the patch adding
  GRUB accounting
- Exchanged the order of two patches ("Make GRUB a task's flag" and
  "Do not reclaim the whole CPU bandwidth"), as suggested by Juri
- Removed unused code ("if (task_on_rq_queued(p))" in task_dead_dl(),
  noticed by Peter) from patch 0001
- Properly consider the migrations of queued dl tasks when updating the
  active utilization. This should address Peter's concern from
  http://lkml.iu.edu/hypermail/linux/kernel/1604.0/02612.html
- Simplified the code for setting up the inactive timer, as suggested
  by Peter:
  http://lkml.iu.edu/hypermail/linux/kernel/1604.0/02620.html
- Use hrtimer_is_queued() instead of
  "(hrtimer_active() && !hrtimer_callback_running())", as pointed out
  by Peter:
  http://lkml.iu.edu/hypermail/linux/kernel/1604.0/02805.html
- Fix select_task_rq_dl() (using "task_cpu(p) != cpu" instead of
  "rq != cpu_rq(cpu)"), as pointed out by Peter:
  http://lkml.iu.edu/hypermail/linux/kernel/1604.0/02822.html
  I also changed the logic used in select_task_rq_dl() (that now does
  not increase the active utilization in the selected runqueue)
- Because of the changes in the code, I am not sure if the race
  condition pointed out by Peter can still happen. I tried to trigger
  it in many ways, but I failed... If it turns out that the race is
  still possible, I'll fix it in the next round of patches, by
  introducing a new "is_contending" field (protected by pi_mutex) in
  the dl scheduling entity.

[1] Lipari, G., & Baruah, S. (2000). Greedy reclamation of unused
    bandwidth in constant-bandwidth servers. In Real-Time Systems, 2000.
    Euromicro RTS 2000.
    12th Euromicro Conference on (pp. 193-200). IEEE.
[2] Abeni, L., Lelli, J., Scordino, C., & Palopoli, L. (2014, October).
    Greedy CPU reclaiming for SCHED DEADLINE. In Proceedings of the
    Real-Time Linux Workshop (RTLWS), Dusseldorf, Germany.

Luca Abeni (6):
  Track the active utilisation
  Improve the tracking of active utilisation
  Fix the update of the total -deadline utilization
  GRUB accounting
  Do not reclaim the whole CPU bandwidth
  Make GRUB a task's flag

 include/linux/sched.h      |   1 +
 include/uapi/linux/sched.h |   1 +
 kernel/sched/core.c        |  44 ++++-----
 kernel/sched/deadline.c    | 220 ++++++++++++++++++++++++++++++++++++++++-----
 kernel/sched/sched.h       |  13 +++
 5 files changed, 234 insertions(+), 45 deletions(-)

-- 
2.7.4