Subject: Re: [RFC PATCH] sched: select_idle_core should select least utilized core
From: subhra mazumdar
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, mingo@kernel.org
Date: Thu, 8 Jun 2017 15:06:39 -0700
References: <1496949992-629076-1-git-send-email-subhra.mazumdar@oracle.com> <20170608195900.GA8337@worktop.programming.kicks-ass.net>
In-Reply-To: <20170608195900.GA8337@worktop.programming.kicks-ass.net>

On 06/08/2017 12:59 PM, Peter Zijlstra wrote:
> On Thu, Jun 08, 2017 at 03:26:32PM -0400, Subhra Mazumdar wrote:
>> Current select_idle_core tries to find a fully idle core and if it fails
>> select_idle_cpu next returns any idle cpu in the llc domain. This is not optimal
>> for architectures with many (more than 2) hyperthreads in a core. This patch
>> changes select_idle_core to find the core with least number of busy
>> hyperthreads and return an idle cpu in that core.
> Yeah, I think not. That makes select_idle_siblings _vastly_ more
> expensive.

I am not sure if the cost will increase vastly. Firstly I removed the
select_idle_cpu for archs that have SMT. For them select_idle_core (called
from select_idle_sibling) should return the final cpu.
For archs without SMT there is no select_idle_core, and select_idle_cpu
will return it.

If there are 8 hyperthreads per core (as on some existing archs), it is
worth paying some extra cost to find the most idle core, since threads can
run for longer than the time spent searching for it. Also, when almost all
cpus are busy, the current select_idle_cpu pays almost the same cost as
the new select_idle_core (both iterate over almost all cpus). Only for
2 threads/core can I see the cost increasing somewhat when the system is
semi-utilized; in that case iterating over all cores will not find
anything better.

Do you suggest keeping the old way for 2 threads and finding the least
utilized core for archs with more hyperthreads?

I ran hackbench at a few points on an x86 socket with 18 cores and didn't
see any statistically significant change in performance or sys/usr %.

Thanks,
Subhra
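
P.S. To make the policy discussed above concrete, here is a minimal
userspace sketch. It is not the patch itself and uses no kernel APIs. The
topology constants (NR_CORES, THREADS_PER_CORE), the busy[][] occupancy
array and the helper pick_least_busy_cpu() are all hypothetical names
chosen for illustration. It scans every core, counts busy hyperthreads,
and returns an idle cpu on the core with the fewest busy siblings,
stopping early if a fully idle core is found.

```c
#include <stdbool.h>

#define NR_CORES 4          /* hypothetical topology, not from the patch */
#define THREADS_PER_CORE 8  /* e.g. an arch with 8 hyperthreads per core */

/*
 * busy[core][thread] is true when that hardware thread is running a task.
 * Returns the cpu number (core * THREADS_PER_CORE + thread) of an idle
 * thread on the core with the fewest busy threads, or -1 if no cpu is idle.
 */
static int pick_least_busy_cpu(bool busy[NR_CORES][THREADS_PER_CORE])
{
    int best_cpu = -1;
    int best_busy = THREADS_PER_CORE; /* a fully busy core never qualifies */

    for (int core = 0; core < NR_CORES; core++) {
        int nbusy = 0;
        int idle_thread = -1;

        for (int t = 0; t < THREADS_PER_CORE; t++) {
            if (busy[core][t])
                nbusy++;
            else if (idle_thread < 0)
                idle_thread = t; /* remember the first idle sibling */
        }

        /* nbusy < THREADS_PER_CORE here implies idle_thread >= 0 */
        if (nbusy < best_busy) {
            best_busy = nbusy;
            best_cpu = core * THREADS_PER_CORE + idle_thread;
        }

        if (nbusy == 0)
            break; /* fully idle core: cannot do better, stop early */
    }

    return best_cpu;
}
```

In this model the scan touches each cpu in the domain at most once, so
when almost every cpu is busy it costs about as much as a linear search
for any idle cpu.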