Subject: Re: [RFC PATCH V2] sched: Improve scalability of select_idle_sibling using SMT balance
From: Steven Sistare
To: subhra mazumdar
Cc: Peter Zijlstra, linux-kernel@vger.kernel.org, mingo@redhat.com, dhaval.giani@oracle.com
Date: Tue, 9 Jan 2018 09:50:44 -0500
Message-ID: <8d6b2a4a-6fc0-3dd9-1d47-3b1d0e5af066@oracle.com>
In-Reply-To: <20180108221851.GV29822@worktop.programming.kicks-ass.net>
References: <20180108221237.31761-1-subhra.mazumdar@oracle.com> <20180108221851.GV29822@worktop.programming.kicks-ass.net>

On 1/8/2018 5:18 PM, Peter Zijlstra wrote:
> On Mon, Jan 08, 2018 at 02:12:37PM -0800, subhra mazumdar wrote:
>> @@ -2751,6 +2763,31 @@ context_switch(struct rq *rq, struct task_struct *prev,
>>  	       struct task_struct *next, struct rq_flags *rf)
>>  {
>>  	struct mm_struct *mm, *oldmm;
>> +	int this_cpu = rq->cpu;
>> +	struct sched_domain *sd;
>> +	int prev_busy, next_busy;
>> +
>> +	if (rq->curr_util == UTIL_UNINITIALIZED)
>> +		prev_busy = 0;
>> +	else
>> +		prev_busy = (prev != rq->idle);
>> +	next_busy = (next != rq->idle);
>> +
>> +	/*
>> +	 * From sd_llc downward update the SMT utilization.
>> +	 * Skip the lowest level 0.
>> +	 */
>> +	sd = rcu_dereference_sched(per_cpu(sd_llc, this_cpu));
>> +	if (next_busy != prev_busy) {
>> +		for_each_lower_domain(sd) {
>> +			if (sd->level == 0)
>> +				break;
>> +			sd_context_switch(sd, rq, next_busy - prev_busy);
>> +		}
>> +	}
>> +
>
> No, we're not going to be adding atomic ops here. We've been arguing
> over adding a single memory barrier to this path; atomics are just not
> going to happen.
>
> Also this is entirely the wrong way to do this; we already have code
> paths that _know_ if they're going into or coming out of idle.

Yes, it would be more efficient to adjust the busy-CPU count at each
level of the hierarchy in pick_next_task_idle() and put_prev_task_idle().

- Steve
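
[For concreteness, a minimal sketch of the idle-transition approach Steve
describes, not the posted patch: it reuses sd_context_switch(), sd_llc, and
for_each_lower_domain() from the quoted hunk, the idle-class callback
signatures are abbreviated from the v4.15-era kernel/sched/idle_task.c, and
update_smt_util() is a hypothetical helper name.]

static void update_smt_util(struct rq *rq, int delta)
{
	struct sched_domain *sd;

	/* Walk from sd_llc downward, skipping level 0, as in the patch. */
	sd = rcu_dereference_sched(per_cpu(sd_llc, rq->cpu));
	for_each_lower_domain(sd) {
		if (sd->level == 0)
			break;
		sd_context_switch(sd, rq, delta);
	}
}

static struct task_struct *
pick_next_task_idle(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
{
	put_prev_task(rq, prev);
	update_smt_util(rq, -1);	/* this CPU is entering idle */
	return rq->idle;
}

static void put_prev_task_idle(struct rq *rq, struct task_struct *prev)
{
	update_smt_util(rq, 1);		/* this CPU is leaving idle */
}

An idle-to-idle reschedule nets to zero (+1 in put_prev_task_idle(), -1 in
pick_next_task_idle()), so no guard is needed, and the counter updates run
only on idle transitions rather than on every context switch, which addresses
Peter's objection to atomics in the common path.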