From: Jan H. Schönherr <jschoenh@amazon.de>
To: Ingo Molnar, Peter Zijlstra
Cc: Jan H. Schönherr, linux-kernel@vger.kernel.org
Subject: [RFC 49/60] cosched: Adjust locking for enqueuing and dequeueing
Date: Fri, 7 Sep 2018 23:40:36 +0200
Message-Id: <20180907214047.26914-50-jschoenh@amazon.de>
In-Reply-To: <20180907214047.26914-1-jschoenh@amazon.de>
References: <20180907214047.26914-1-jschoenh@amazon.de>

Enqueuing and dequeuing of tasks (or entities) are general activities that span leader boundaries. They start at the bottom of the runqueue hierarchy and bubble upwards until they hit their terminating condition (for example, enqueuing stops when the parent entity is already enqueued).

We employ chain-locking in these cases to minimize lock contention. For example, once enqueuing has moved past a hierarchy level of a different leader, that leader can already make scheduling decisions again. This also opens up the possibility of combining concurrent enqueues/dequeues to some extent, so that only one of multiple CPUs has to walk up the hierarchy.
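As a rough illustration of the chain-locking described above, here is a minimal, self-contained userspace sketch of hand-over-hand locking. The types and helper signatures are simplified stand-ins, not the actual rq_chain implementation from this series (the real helpers take the bottom-level rq and a sched_entity and work on kernel runqueue locks), so treat it as a sketch of the idea only:

#include <pthread.h>
#include <stddef.h>

/* Stand-in for one level of the runqueue hierarchy. */
struct rq {
        pthread_mutex_t lock;
        struct rq *parent;      /* next level up, NULL at the top */
};

/* Tracks which runqueue in the chain is currently locked. */
struct rq_chain {
        struct rq *curr;
};

static void rq_chain_init(struct rq_chain *rc)
{
        rc->curr = NULL;        /* nothing locked yet */
}

/*
 * Hand-over-hand step: take the lock of the next level first, then
 * release the previous one. Levels the walk has moved past become
 * available to other CPUs immediately.
 */
static void rq_chain_lock(struct rq_chain *rc, struct rq *rq)
{
        if (rc->curr == rq)
                return;         /* still on the same runqueue */

        pthread_mutex_lock(&rq->lock);
        if (rc->curr)
                pthread_mutex_unlock(&rc->curr->lock);
        rc->curr = rq;
}

static void rq_chain_unlock(struct rq_chain *rc)
{
        if (rc->curr) {
                pthread_mutex_unlock(&rc->curr->lock);
                rc->curr = NULL;
        }
}

/*
 * Example walk: bubble upwards while holding at most one level's lock
 * (plus, briefly, the next one during the handover).
 */
static void walk_up(struct rq *bottom)
{
        struct rq_chain rc;
        struct rq *rq;

        rq_chain_init(&rc);
        for (rq = bottom; rq; rq = rq->parent) {
                rq_chain_lock(&rc, rq);
                /* per-level enqueue/dequeue work would go here */
        }
        rq_chain_unlock(&rc);
}

The point of the ordering in rq_chain_lock is that the lock of the level being left is only dropped after the next level's lock is held, so the walk itself never runs unlocked, while the levels below the current position are immediately free for their leaders to schedule on again.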
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
---
 kernel/sched/fair.c | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6d64f4478fda..0dc4d289497c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4510,6 +4510,7 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
         struct sched_entity *se;
         long task_delta, dequeue = 1;
         bool empty;
+        struct rq_chain rc;
 
         /*
          * FIXME: We can only handle CPU runqueues at the moment.
@@ -4532,8 +4533,11 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
         rcu_read_unlock();
 
         task_delta = cfs_rq->h_nr_running;
+        rq_chain_init(&rc, rq);
         for_each_sched_entity(se) {
                 struct cfs_rq *qcfs_rq = cfs_rq_of(se);
+
+                rq_chain_lock(&rc, se);
                 /* throttled entity or throttle-on-deactivate */
                 if (!se->on_rq)
                         break;
@@ -4549,6 +4553,8 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
         if (!se)
                 sub_nr_running(rq, task_delta);
 
+        rq_chain_unlock(&rc);
+
         cfs_rq->throttled = 1;
         cfs_rq->throttled_clock = rq_clock(rq);
         raw_spin_lock(&cfs_b->lock);
@@ -4577,6 +4583,7 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
         struct sched_entity *se;
         int enqueue = 1;
         long task_delta;
+        struct rq_chain rc;
 
         SCHED_WARN_ON(!is_cpu_rq(rq));
 
@@ -4598,7 +4605,9 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
                 return;
 
         task_delta = cfs_rq->h_nr_running;
+        rq_chain_init(&rc, rq);
         for_each_sched_entity(se) {
+                rq_chain_lock(&rc, se);
                 if (se->on_rq)
                         enqueue = 0;
 
@@ -4614,6 +4623,8 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
         if (!se)
                 add_nr_running(rq, task_delta);
 
+        rq_chain_unlock(&rc);
+
         /* Determine whether we need to wake up potentially idle CPU: */
         if (rq->curr == rq->idle && nr_cfs_tasks(rq))
                 resched_curr(rq);
@@ -5136,8 +5147,11 @@ bool enqueue_entity_fair(struct rq *rq, struct sched_entity *se, int flags,
                          unsigned int task_delta)
 {
         struct cfs_rq *cfs_rq;
+        struct rq_chain rc;
 
+        rq_chain_init(&rc, rq);
         for_each_sched_entity(se) {
+                rq_chain_lock(&rc, se);
                 if (se->on_rq)
                         break;
                 cfs_rq = cfs_rq_of(se);
@@ -5157,6 +5171,8 @@ bool enqueue_entity_fair(struct rq *rq, struct sched_entity *se, int flags,
         }
 
         for_each_sched_entity(se) {
+                /* FIXME: taking locks up to the top is bad */
+                rq_chain_lock(&rc, se);
                 cfs_rq = cfs_rq_of(se);
                 cfs_rq->h_nr_running += task_delta;
 
@@ -5167,6 +5183,8 @@ bool enqueue_entity_fair(struct rq *rq, struct sched_entity *se, int flags,
                 update_cfs_group(se);
         }
 
+        rq_chain_unlock(&rc);
+
         return se != NULL;
 }
 
@@ -5211,9 +5229,12 @@ bool dequeue_entity_fair(struct rq *rq, struct sched_entity *se, int flags,
                          unsigned int task_delta)
 {
         struct cfs_rq *cfs_rq;
+        struct rq_chain rc;
         int task_sleep = flags & DEQUEUE_SLEEP;
 
+        rq_chain_init(&rc, rq);
         for_each_sched_entity(se) {
+                rq_chain_lock(&rc, se);
                 cfs_rq = cfs_rq_of(se);
                 dequeue_entity(cfs_rq, se, flags);
 
@@ -5231,6 +5252,9 @@ bool dequeue_entity_fair(struct rq *rq, struct sched_entity *se, int flags,
                 if (cfs_rq->load.weight) {
                         /* Avoid re-evaluating load for this entity: */
                         se = parent_entity(se);
+                        if (se)
+                                rq_chain_lock(&rc, se);
+
                         /*
                          * Bias pick_next to pick a task from this cfs_rq, as
                          * p is sleeping when it is within its sched_slice.
@@ -5243,6 +5267,8 @@ bool dequeue_entity_fair(struct rq *rq, struct sched_entity *se, int flags,
         }
 
         for_each_sched_entity(se) {
+                /* FIXME: taking locks up to the top is bad */
+                rq_chain_lock(&rc, se);
                 cfs_rq = cfs_rq_of(se);
                 cfs_rq->h_nr_running -= task_delta;
 
@@ -5253,6 +5279,8 @@ bool dequeue_entity_fair(struct rq *rq, struct sched_entity *se, int flags,
                 update_cfs_group(se);
         }
 
+        rq_chain_unlock(&rc);
+
         return se != NULL;
 }
 
@@ -9860,11 +9888,15 @@ static inline bool vruntime_normalized(struct task_struct *p)
 static void propagate_entity_cfs_rq(struct sched_entity *se)
 {
         struct cfs_rq *cfs_rq;
+        struct rq_chain rc;
+
+        rq_chain_init(&rc, hrq_of(cfs_rq_of(se)));
 
         /* Start to propagate at parent */
         se = parent_entity(se);
 
         for_each_sched_entity(se) {
+                rq_chain_lock(&rc, se);
                 cfs_rq = cfs_rq_of(se);
 
                 if (cfs_rq_throttled(cfs_rq))
@@ -9872,6 +9904,7 @@ static void propagate_entity_cfs_rq(struct sched_entity *se)
 
                 update_load_avg(cfs_rq, se, UPDATE_TG);
         }
+        rq_chain_unlock(&rc);
 }
 #else
 static void propagate_entity_cfs_rq(struct sched_entity *se) { }
-- 
2.9.3.1.gcba166c.dirty