From: Joonwoo Park <joonwoop@codeaurora.org>
To: Peter Zijlstra
Cc: Joonwoo Park, Ingo Molnar, Thomas Gleixner, linux-kernel@vger.kernel.org
Subject: [PATCH 2/2] sched/fair: avoid unnecessary hrtick rearm when it's possible
Date: Tue, 18 Oct 2016 18:41:19 -0700
Message-Id: <1476841279-21958-2-git-send-email-joonwoop@codeaurora.org>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1476841279-21958-1-git-send-email-joonwoop@codeaurora.org>
References: <1476841279-21958-1-git-send-email-joonwoop@codeaurora.org>

At present, the scheduler always rearms the hrtick when nr_running is
low enough to matter.  We can also skip rearming the hrtick when the
newly enqueued or dequeued task is in a different cgroup than the
current task's: as long as the enqueue/dequeue did not change the
nr_running of the current task's cfs_rq, the current task's slice does
not change.

Find the topmost ancestor cfs_rq affected by the enqueue/dequeue and
rearm the hrtick only when the current task runs under that cfs_rq.

A modified hackbench which creates the sender and receiver groups in
separate parent and child cgroups showed a 13% reduction in hrtick
rearms with this optimization, while showing no measurable throughput
degradation when both sender and receiver groups are in the same
cgroup.

Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
---
 kernel/sched/fair.c | 67 +++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 60 insertions(+), 7 deletions(-)
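Not part of the patch, just an illustrative user-space sketch of the idea
above.  The types and helpers (toy_cfs_rq, toy_se, first_contended_level,
needs_new_slice) are made up for the example and only model what
task_needs_new_slice() in the hunk below does:

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct toy_cfs_rq {
	unsigned int nr_running;
};

struct toy_se {
	struct toy_cfs_rq *cfs_rq;	/* runqueue this entity is queued on */
	struct toy_se *parent;		/* group entity one level up, NULL at root */
};

/*
 * Walk up from the (de)queued task's entity and return the first level that
 * still has more than one runnable entity; levels above it did not see their
 * nr_running change, so their slices are unaffected.
 */
static struct toy_cfs_rq *first_contended_level(struct toy_se *se)
{
	for (; se; se = se->parent)
		if (se->cfs_rq->nr_running > 1)
			return se->cfs_rq;
	return NULL;	/* the change was visible all the way up to the root */
}

/* The current task needs a new slice only if that level is one of its ancestors. */
static bool needs_new_slice(struct toy_se *curr, struct toy_se *pnew)
{
	struct toy_cfs_rq *stop = first_contended_level(pnew);

	if (!stop)
		return true;
	for (; curr; curr = curr->parent)
		if (curr->cfs_rq == stop)
			return true;
	return false;
}

int main(void)
{
	/* Hierarchy: root -> {A, B}, B -> {C}; curr runs in A, pnew was enqueued in C. */
	struct toy_cfs_rq root = { .nr_running = 2 };	/* entities for A and B */
	struct toy_cfs_rq a = { .nr_running = 1 };	/* curr */
	struct toy_cfs_rq b = { .nr_running = 2 };	/* entity for C plus another task */
	struct toy_cfs_rq c = { .nr_running = 1 };	/* pnew */

	struct toy_se a_se = { .cfs_rq = &root, .parent = NULL };
	struct toy_se b_se = { .cfs_rq = &root, .parent = NULL };
	struct toy_se c_se = { .cfs_rq = &b, .parent = &b_se };
	struct toy_se curr = { .cfs_rq = &a, .parent = &a_se };
	struct toy_se pnew = { .cfs_rq = &c, .parent = &c_se };

	/*
	 * B already had a runnable task, so the enqueue in C never changed
	 * root's nr_running and curr's slice is unaffected: prints 0.
	 */
	printf("%d\n", needs_new_slice(&curr, &pnew));
	return 0;
}

The sketch stops the upward walk at the first level that kept more than one
runnable entity, since everything above that level kept its nr_running, and
reports a new slice only when that level is shared with the current task.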
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f465448..2eb091b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4495,19 +4495,72 @@ static void hrtick_start_fair(struct rq *rq, struct task_struct *p)
 	}
 }
 
+#ifdef CONFIG_FAIR_GROUP_SCHED
+/*
+ * Find whether slice of 'p' has changed since 'pnew' enqueued or dequeued.
+ */
+static bool task_needs_new_slice(struct task_struct *p,
+				 struct task_struct *pnew)
+{
+	struct cfs_rq *cfs_rq;
+	struct sched_entity *se;
+
+	SCHED_WARN_ON(task_cpu(p) != task_cpu(pnew));
+
+	se = &pnew->se;
+	for_each_sched_entity(se) {
+		cfs_rq = cfs_rq_of(se);
+		/*
+		 * We can stop walking up the hierarchy at the ancestor level
+		 * which has more than 1 nr_running because enqueue/dequeue
+		 * of new task won't affect cfs_rq's task_group load_avg from
+		 * that level through the root cfs_rq.
+		 */
+		if (cfs_rq->nr_running > 1)
+			break;
+	}
+
+	/*
+	 * The new task enqueue/dequeue ended up adding or removing an se in
+	 * the root cfs_rq.  All the ses now have a new slice.
+	 */
+	if (!se)
+		return true;
+
+	se = &p->se;
+	for_each_sched_entity(se) {
+		/*
+		 * All the ses under 'cfs_rq' now have a new slice.  Find if
+		 * 'cfs_rq' is an ancestor of 'p'.
+		 */
+		if (cfs_rq == cfs_rq_of(se))
+			return true;
+	}
+
+	return false;
+}
+#else /* !CONFIG_FAIR_GROUP_SCHED */
+static inline bool
+task_needs_new_slice(struct task_struct *p, struct task_struct *pnew)
+{
+	return true;
+}
+#endif
+
 /*
  * called from enqueue/dequeue and updates the hrtick when the
- * current task is from our class and nr_running is low enough
- * to matter.
+ * current task is from our class, nr_running is low enough to matter and
+ * current task's slice can be changed by enqueue or dequeue of 'p'.
  */
-static void hrtick_update(struct rq *rq)
+static void hrtick_update(struct rq *rq, struct task_struct *p)
 {
 	struct task_struct *curr = rq->curr;
 
 	if (!hrtick_enabled(rq) || curr->sched_class != &fair_sched_class)
 		return;
 
-	if (cfs_rq_of(&curr->se)->nr_running <= sched_nr_latency)
+	if (task_cfs_rq(curr)->nr_running <= sched_nr_latency &&
+	    task_needs_new_slice(curr, p))
 		hrtick_start_fair(rq, curr);
 }
 #else /* !CONFIG_SCHED_HRTICK */
@@ -4516,7 +4569,7 @@ hrtick_start_fair(struct rq *rq, struct task_struct *p)
 {
 }
 
-static inline void hrtick_update(struct rq *rq)
+static inline void hrtick_update(struct rq *rq, struct task_struct *p)
 {
 }
 #endif
@@ -4573,7 +4626,7 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 	if (!se)
 		add_nr_running(rq, 1);
 
-	hrtick_update(rq);
+	hrtick_update(rq, p);
 }
 
 static void set_next_buddy(struct sched_entity *se);
@@ -4632,7 +4685,7 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 	if (!se)
 		sub_nr_running(rq, 1);
 
-	hrtick_update(rq);
+	hrtick_update(rq, p);
 }
 
 #ifdef CONFIG_SMP
-- 
2.9.3