Subject: Re: [PATCH 08/10] sched/fair: Steal work from an overloaded CPU when CPU goes idle
To: Steve Sistare, mingo@redhat.com, peterz@infradead.org
Cc: subhra.mazumdar@oracle.com, dhaval.giani@oracle.com, rohit.k.jain@oracle.com,
    daniel.m.jordan@oracle.com, pavel.tatashin@microsoft.com, matt@codeblueprint.co.uk,
    umgwanakikbuti@gmail.com, riel@redhat.com, jbacik@fb.com, juri.lelli@redhat.com,
    linux-kernel@vger.kernel.org
References: <1540220381-424433-1-git-send-email-steven.sistare@oracle.com>
 <1540220381-424433-9-git-send-email-steven.sistare@oracle.com>
From: Valentin Schneider
Message-ID: <223bfcdb-bb0c-b25e-04dc-26226a7c3ab3@arm.com>
Date: Thu, 25 Oct 2018 14:48:00 +0100
In-Reply-To: <1540220381-424433-9-git-send-email-steven.sistare@oracle.com>

Hi Steve,

On 22/10/2018 15:59, Steve Sistare wrote:
[...]
> @@ -9683,6 +9698,141 @@ void trigger_load_balance(struct rq *rq)
>  		nohz_balancer_kick(rq);
>  }
>  
> +/*
> + * Search the runnable tasks in @cfs_rq in order of next to run, and find
> + * the first one that can be migrated to @dst_rq. @cfs_rq is locked on entry.
> + * On success, dequeue the task from @cfs_rq and return it, else return NULL.
> + */
> +static struct task_struct *
> +detach_next_task(struct cfs_rq *cfs_rq, struct rq *dst_rq)
> +{
> +	int dst_cpu = dst_rq->cpu;
> +	struct task_struct *p;
> +	struct rq *rq = rq_of(cfs_rq);
> +
> +	lockdep_assert_held(&rq_of(cfs_rq)->lock);
> +
> +	list_for_each_entry_reverse(p, &rq->cfs_tasks, se.group_node) {
> +		if (can_migrate_task_llc(p, rq, dst_rq)) {
> +			detach_task(p, rq, dst_cpu);
> +			return p;
> +		}
> +	}
> +	return NULL;
> +}
> +
> +/*
> + * Attempt to migrate a CFS task from @src_cpu to @dst_rq. @locked indicates
> + * whether @dst_rq is already locked on entry. This function may lock or
> + * unlock @dst_rq, and updates @locked to indicate the locked state on return.
> + * The locking protocol is based on idle_balance().
> + * Returns 1 on success and 0 on failure.
> + */
> +static int steal_from(struct rq *dst_rq, struct rq_flags *dst_rf, bool *locked,
> +		      int src_cpu)
> +{
> +	struct task_struct *p;
> +	struct rq_flags rf;
> +	int stolen = 0;
> +	int dst_cpu = dst_rq->cpu;
> +	struct rq *src_rq = cpu_rq(src_cpu);
> +
> +	if (dst_cpu == src_cpu || src_rq->cfs.h_nr_running < 2)
> +		return 0;
> +
> +	if (*locked) {
> +		rq_unpin_lock(dst_rq, dst_rf);
> +		raw_spin_unlock(&dst_rq->lock);
> +		*locked = false;
> +	}
> +	rq_lock_irqsave(src_rq, &rf);
> +	update_rq_clock(src_rq);
> +
> +	if (src_rq->cfs.h_nr_running < 2 || !cpu_active(src_cpu))
> +		p = NULL;
> +	else
> +		p = detach_next_task(&src_rq->cfs, dst_rq);
> +
> +	rq_unlock(src_rq, &rf);
> +
> +	if (p) {
> +		raw_spin_lock(&dst_rq->lock);
> +		rq_repin_lock(dst_rq, dst_rf);
> +		*locked = true;
> +		update_rq_clock(dst_rq);
> +		attach_task(dst_rq, p);
> +		stolen = 1;
> +	}
> +	local_irq_restore(rf.flags);
> +
> +	return stolen;
> +}
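(For reference: modulo the pin/unpin and rq clock bookkeeping, the locking
protocol in steal_from() above condenses to roughly the following - a sketch
of the ordering only, not the actual patch code, reusing the helpers from
the hunk:)

	raw_spin_unlock(&dst_rq->lock);		/* drop dst lock if held      */
	rq_lock_irqsave(src_rq, &rf);		/* lock src, IRQs off         */
	p = detach_next_task(&src_rq->cfs, dst_rq); /* dequeue under src lock */
	rq_unlock(src_rq, &rf);			/* drop src lock              */
	if (p) {
		raw_spin_lock(&dst_rq->lock);	/* retake dst only on success */
		attach_task(dst_rq, p);
	}
	local_irq_restore(rf.flags);		/* IRQs off since src lock    */

i.e. never hold both rq locks while scanning the remote rq, same trick as
idle_balance().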
> +
> +/*
> + * Try to steal a runnable CFS task from a CPU in the same LLC as @dst_rq,
> + * and migrate it to @dst_rq. rq_lock is held on entry and return, but
> + * may be dropped in between. Return 1 on success, 0 on failure, and -1
> + * if a task in a different scheduling class has become runnable on @dst_rq.
> + */
> +static int try_steal(struct rq *dst_rq, struct rq_flags *dst_rf)
> +{
> +	int src_cpu;
> +	int dst_cpu = dst_rq->cpu;
> +	bool locked = true;
> +	int stolen = 0;
> +	struct sparsemask *overload_cpus;
> +
> +	if (!sched_feat(STEAL))
> +		return 0;
> +
> +	if (!cpu_active(dst_cpu))
> +		return 0;
> +
> +	/* Get bitmap of overloaded CPUs in the same LLC as @dst_rq */
> +
> +	rcu_read_lock();
> +	overload_cpus = rcu_dereference(dst_rq->cfs_overload_cpus);
> +	if (!overload_cpus) {
> +		rcu_read_unlock();
> +		return 0;
> +	}
> +
> +#ifdef CONFIG_SCHED_SMT
> +	/*
> +	 * First try overloaded CPUs on the same core to preserve cache warmth.
> +	 */
> +	if (static_branch_likely(&sched_smt_present)) {
> +		for_each_cpu(src_cpu, cpu_smt_mask(dst_cpu)) {
> +			if (sparsemask_test_elem(src_cpu, overload_cpus) &&
> +			    steal_from(dst_rq, dst_rf, &locked, src_cpu)) {
> +				stolen = 1;
> +				goto out;
> +			}
> +		}
> +	}
> +#endif	/* CONFIG_SCHED_SMT */
> +
> +	/* Accept any suitable task in the LLC */
> +
> +	for_each_sparse_wrap(src_cpu, overload_cpus, dst_cpu) {
> +		if (steal_from(dst_rq, dst_rf, &locked, src_cpu)) {
> +			stolen = 1;
> +			break;
			  ^^^^^^

You might want a 'goto out' there instead, both for consistency and to keep
GCC happy under !CONFIG_SCHED_SMT (I get a "warning: label ‘out’ defined but
not used").
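i.e. something like this (untested sketch; I'm assuming 'out' just does the
rcu_read_unlock() and returns, which isn't visible in this hunk):

	/* Accept any suitable task in the LLC */

	for_each_sparse_wrap(src_cpu, overload_cpus, dst_cpu) {
		if (steal_from(dst_rq, dst_rf, &locked, src_cpu)) {
			stolen = 1;
			goto out;
		}
	}

[...]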