Received: by 2002:a25:ad19:0:0:0:0:0 with SMTP id y25csp2697083ybi; Mon, 1 Jul 2019 17:09:02 -0700 (PDT) X-Google-Smtp-Source: APXvYqwOoG557aDS36X/V4jlAlM+5jdycdmXcDvJZRl1Y0Bs32ZuzK9Ns1ANTPJjt6QMPzUK79sW X-Received: by 2002:a17:90a:d595:: with SMTP id v21mr2167519pju.34.1562026141940; Mon, 01 Jul 2019 17:09:01 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1562026141; cv=none; d=google.com; s=arc-20160816; b=OxH4l25i3wcOPU5n8RSAJZukP1gCTc/i6O2KEZpsVoapaTn1JTQzg4CFJKRjwn+w48 3j9cxKRZGMlfvrg3mu6X7fn6f7haphb7CG8LA46QljbLj/oCfKyEkBW+kwu9wMdrs/i9 vszdeS7zN7FB8UwW0U+Y4E9vJwgMiH7B6GkE+28m78ZUMcSbwoktn6aEofcyijHg8SD2 oMuuPuJYEtEDQN9SaLGOmq1Pjz004Wf1CRdrUTL/d7vcroIK4clZWM2/iSy8Mu9xfzxp hNDEN2T+oQzifY6onnPNSCMBkkB/P6aHr3wO/G6QicTPh1GABXllOBKdcrr89Z/W2u4M 2r5Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-language :content-transfer-encoding:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:dkim-signature; bh=W5Q+NkHpzRK/P7xEh54iSkYNc3/t6agDnWcFSfwc5W4=; b=0aoG6fDEt4enFCgJXsuDlf3f3Dk9Pt4cX0g46qhwRjIxcsJ9hXZIQedt+xQxrX+5e/ REmXsmn7MUUcJAWxpv/sgX4BXkw2CQbl5f/y2WmCuoO9sWOyPvcwuZpmkzyikFhb+lj4 lTQbJHKKRIbTJLly5r0JGH7Nos2SvphHTaNaSjbt5NzzRXgzsOPpJeqaoxq12ZAd3UOa ydnE65ICS24+Fclu6h50mZAEsPayXxD9+bOyGeaPJ171JkMWZXruMEIaR+TlOWm28GP7 cztkjITTHmkIAa52PppEeM0DP7b9S1ih4oMNbwaBzEfl7HilPsN9Xf3H7KYCUGlR9RTy rB0Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@oracle.com header.s=corp-2018-07-02 header.b=lUfzdCzu; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=oracle.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id q195si11324294pgq.119.2019.07.01.17.08.46; Mon, 01 Jul 2019 17:09:01 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@oracle.com header.s=corp-2018-07-02 header.b=lUfzdCzu; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=oracle.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727063AbfGBAIJ (ORCPT + 99 others); Mon, 1 Jul 2019 20:08:09 -0400 Received: from aserp2120.oracle.com ([141.146.126.78]:34520 "EHLO aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727010AbfGBAIJ (ORCPT ); Mon, 1 Jul 2019 20:08:09 -0400 Received: from pps.filterd (aserp2120.oracle.com [127.0.0.1]) by aserp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6204287022201; Tue, 2 Jul 2019 00:06:27 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : to : cc : references : from : message-id : date : mime-version : in-reply-to : content-type : content-transfer-encoding; s=corp-2018-07-02; bh=W5Q+NkHpzRK/P7xEh54iSkYNc3/t6agDnWcFSfwc5W4=; b=lUfzdCzurW0l+g+6kS6bCNBdI0RrWuhRpJFso1wTg8RYtLSoTbfF00NPWM9akaXhrY7C J9J7YUwApCsC3rz6+CpfD+TBdEtup1nPc72DjIpjWBs0xq4S8+PpqPeY6t67OdulQzNA aI8/B6ZLPwF1ziyvqHAYkXIG5Xr3nYP0MkSIVu/WzdVfAIUhA3zdt50bqo92OiMOLwH0 ibjcy7EHlhBKr7PHGp6JeyI9oZ97l/FoCkMjpAjTmO/eccd/zpbc5s2WhWmDOn7IhB0y OM5fn4FdJgzVgtLBXaFeg+wszT2B13niyJLE4Q2fKZVTxFmQDoMApsuTBOYRBFVg0z5b cg== Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by aserp2120.oracle.com with ESMTP id 2te5tbgdsw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 02 Jul 2019 00:06:27 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6202hMB077713; Tue, 2 Jul 2019 00:06:26 GMT Received: from userv0122.oracle.com (userv0122.oracle.com [156.151.31.75]) by userp3030.oracle.com with ESMTP id 2tebqg6vjp-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 02 Jul 2019 00:06:26 +0000 Received: from abhmp0017.oracle.com (abhmp0017.oracle.com [141.146.116.23]) by userv0122.oracle.com (8.14.4/8.14.4) with ESMTP id x6206HWB029180; Tue, 2 Jul 2019 00:06:22 GMT Received: from [10.132.91.175] (/10.132.91.175) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Mon, 01 Jul 2019 17:06:17 -0700 Subject: Re: [RESEND PATCH v3 0/7] Improve scheduler scalability for fast path To: Patrick Bellasi , Peter Zijlstra Cc: linux-kernel@vger.kernel.org, mingo@redhat.com, tglx@linutronix.de, steven.sistare@oracle.com, dhaval.giani@oracle.com, daniel.lezcano@linaro.org, vincent.guittot@linaro.org, viresh.kumar@linaro.org, tim.c.chen@linux.intel.com, mgorman@techsingularity.net, Paul Turner , riel@surriel.com, morten.rasmussen@arm.com References: <20190627012919.4341-1-subhra.mazumdar@oracle.com> <20190701090204.GQ3402@hirez.programming.kicks-ass.net> <20190701135552.kb4os6bxxhh2lyw6@e110439-lin> From: Subhra Mazumdar Message-ID: <81b2288a-579d-8dd1-f179-d672cf1edd68@oracle.com> Date: Mon, 1 Jul 2019 17:01:07 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.5.1 MIME-Version: 1.0 In-Reply-To: <20190701135552.kb4os6bxxhh2lyw6@e110439-lin> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Content-Language: en-US X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9305 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907010280 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9305 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1011 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907010281 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 7/1/19 6:55 AM, Patrick Bellasi wrote: > On 01-Jul 11:02, Peter Zijlstra wrote: >> On Wed, Jun 26, 2019 at 06:29:12PM -0700, subhra mazumdar wrote: >>> Hi, >>> >>> Resending this patchset, will be good to get some feedback. Any suggestions >>> that will make it more acceptable are welcome. We have been shipping this >>> with Unbreakable Enterprise Kernel in Oracle Linux. >>> >>> Current select_idle_sibling first tries to find a fully idle core using >>> select_idle_core which can potentially search all cores and if it fails it >>> finds any idle cpu using select_idle_cpu. select_idle_cpu can potentially >>> search all cpus in the llc domain. This doesn't scale for large llc domains >>> and will only get worse with more cores in future. >>> >>> This patch solves the scalability problem by: >>> - Setting an upper and lower limit of idle cpu search in select_idle_cpu >>> to keep search time low and constant >>> - Adding a new sched feature SIS_CORE to disable select_idle_core >>> >>> Additionally it also introduces a new per-cpu variable next_cpu to track >>> the limit of search so that every time search starts from where it ended. >>> This rotating search window over cpus in LLC domain ensures that idle >>> cpus are eventually found in case of high load. >> Right, so we had a wee conversation about this patch series at OSPM, and >> I don't see any of that reflected here :-( >> >> Specifically, given that some people _really_ want the whole L3 mask >> scanned to reduce tail latency over raw throughput, while you guys >> prefer the other way around, it was proposed to extend the task model. >> >> Specifically something like a latency-nice was mentioned (IIRC) where a > Right, AFAIR PaulT suggested to add support for the concept of a task > being "latency tolerant": meaning we can spend more time to search for > a CPU and/or avoid preempting the current task. > Wondering if searching and preempting needs will ever be conflicting? Otherwise sounds like a good direction to me. For the searching aspect, can we map latency nice values to the % of cores we search in select_idle_cpu? Thus the search cost can be controlled by latency nice value. But the issue is if more latency tolerant workloads set to less search, we still need some mechanism to achieve good spread of threads. Can we keep the sliding window mechanism in that case? Also will latency nice do anything for select_idle_core and select_idle_smt?