Subject: Re: [PATCH v3 0/3] Introduce per-task latency_nice for scheduler hints
From: chris hyser
Date: Wed, 19 Feb 2020 09:15:31 -0500
To: Parth Shah, vincent.guittot@linaro.org, patrick.bellasi@matbug.net, valentin.schneider@arm.com, dhaval.giani@oracle.com, dietmar.eggemann@arm.com
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com, qais.yousef@arm.com, pavel@ucw.cz, qperret@qperret.net, David.Laight@ACULAB.COM, pjt@google.com, tj@kernel.org
Message-ID: <8e984496-e89b-d96c-d84e-2be7f0958ea4@oracle.com>
In-Reply-To: <971909ed-d4e0-6afa-d20b-365ede5a195e@linux.ibm.com>
References: <20200116120230.16759-1-parth@linux.ibm.com> <8ed0f40c-eeb4-c487-5420-a8eb185b5cdd@linux.ibm.com> <971909ed-d4e0-6afa-d20b-365ede5a195e@linux.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2/19/20 5:09 AM, Parth Shah wrote:
> Hi Chris,
>
> On 2/19/20 4:30 AM, chris hyser wrote:
>> On 2/17/20 3:57 AM, Parth Shah wrote:
>>>
>>>
>>> On 1/16/20 5:32 PM, Parth Shah wrote:
>>>> This is the 3rd revision of the patch set to introduce
>>>> latency_{nice/tolerance} as a per task attribute.
>>>>
>>>> The previous version can be found at:
>>>> v1: https://lkml.org/lkml/2019/11/25/151
>>>> v2: https://lkml.org/lkml/2019/12/8/10
>>>>
>>>> Changes in this revision are:
>>>> v2 -> v3:
>>>> - This series changes the longer attribute name to "latency_nice" as per
>>>>    the comment from Dietmar Eggemann https://lkml.org/lkml/2019/12/5/394
>>>> v1 -> v2:
>>>> - Addressed comments from Qais Yousef
>>>> - As per suggestion from Dietmar, moved content from newly created
>>>>    include/linux/sched/latency_tolerance.h to kernel/sched/sched.h
>>>> - Extend sched_setattr() to support latency_tolerance in tools headers UAPI
>>>>
>>>>
>>>> Introduction:
>>>> ==============
>>>> This patch series introduces a new per-task attribute latency_nice to
>>>> provide the scheduler hints about the latency requirements of the task [1].
>>>>
>>>> Latency_nice is a ranged attribute of a task with the value ranging
>>>> from [-20, 19] both inclusive which makes it align with the task nice
>>>> value.
>>>>
>>>> The value should provide scheduler hints about the relative latency
>>>> requirements of tasks, meaning a task with "latency_nice = -20"
>>>> should require lower latency than tasks with higher values.
>>>> Similarly, a task with "latency_nice = 19" can tolerate higher
>>>> latency and hence may not care much about latency.
>>>>
>>>> The default value is set to 0. The usecases discussed below can use this
>>>> range of [-20, 19] for latency_nice for their specific purposes. This
>>>> patch does not implement any use cases for the attribute, so that any
>>>> change in naming or range does not much affect the other (future)
>>>> patches using it. The actual use of latency_nice during task wakeup
>>>> and load-balancing is yet to be coded for each of those usecases.
>>>>
>>>> In my view, this attribute can be used in the following ways for
>>>> some of the usecases:
>>>> 1 Reduce search scan time for select_idle_cpu():
>>>> - Reduce search scans for finding an idle CPU for a waking task with lower
>>>>    latency_nice values.
>>>>
>>>> 2 TurboSched:
>>>> - Classify tasks with higher latency_nice values as small
>>>>    background tasks, given that their historic utilization is very low, for
>>>>    which the scheduler can search more cores to do task
>>>>    packing.  A task with latency_nice >= some_threshold (e.g., == 19)
>>>>    and util <= 12.5% can be a background task.
>>>>
>>>> 3 Optimize AVX512 based workload:
>>>> - Bias scheduler to not put a task having (latency_nice == -20) on a
>>>>    core occupied by an AVX512 based workload.
>>>>
>>>>
>>>> Series Organization:
>>>> ====================
>>>> - Patch 1: Add new attribute latency_nice to task_struct.
>>>> - Patch 2: Clone parent task's attribute to the child task on fork
>>>> - Patch 3: Add support for sched_{set,get}attr syscall to modify
>>>>             latency_nice of the task
>>>>
>>>>
>>>> The patch series can be applied on tip/sched/core at the
>>>> commit 804d402fb6f6 ("sched/rt: Make RT capacity-aware")
>>>>
>>>>
>>>> References:
>>>> ============
>>>> [1]. Usecases for the per-task latency-nice attribute,
>>>>       https://lkml.org/lkml/2019/9/30/215
>>>> [2]. Task Latency-nice, "Subhra Mazumdar",
>>>>       https://lkml.org/lkml/2019/8/30/829
>>>> [3]. Introduce per-task latency_tolerance for scheduler hints,
>>>>       https://lkml.org/lkml/2019/12/8/10
>>>>
>>>>
>>>> Parth Shah (3):
>>>>    sched: Introduce latency-nice as a per-task attribute
>>>>    sched/core: Propagate parent task's latency requirements to the child
>>>>      task
>>>>    sched: Allow sched_{get,set}attr to change latency_nice of the task
>>>>
>>>>   include/linux/sched.h            |  1 +
>>>>   include/uapi/linux/sched.h       |  4 +++-
>>>>   include/uapi/linux/sched/types.h | 19 +++++++++++++++++++
>>>>   kernel/sched/core.c              | 21 +++++++++++++++++++++
>>>>   kernel/sched/sched.h             | 18 ++++++++++++++++++
>>>>   tools/include/uapi/linux/sched.h |  4 +++-
>>>>   6 files changed, 65 insertions(+), 2 deletions(-)
>>>>
>>>
>>> It's been a long time and a few revisions since the beginning of the
>>> discussion around latency-nice. Hence I thought of asking whether there is
>>> any further work that needs to be done for adding the latency-nice
>>> attribute, or am I missing any piece here?
>>
>> All, I was asked to take a look at the original latency_nice patchset.
>> First, to clarify objectives, Oracle is not interested in trading
>> throughput for latency. What we found is that the DB has specific tasks
>> which do very little but need to do this as absolutely quickly as possible,
>> i.e., extreme latency sensitivity.
>> Second, the key to latency reduction in the
>> task wakeup path seems to be limiting variations of "idle cpu" search. The
>> latter particularly interests me as an example of "platform size based
>> latency" which I believe to be important given all the varying size VMs and
>> containers.
>>
>> Parth, I've been using your v3 patchset as the basis of an investigation
>> into the measurable effects of short-circuiting this search. I'm not quite
>> ready to put anything out, but the patchset is working well. The only
>
> That's good news, as you are able to get a usecase out of this patch-set.
>
>> feedback I have is that currently non-root can set the value negative which
>> is inconsistent with 'nice' and I would think a security hole.
>>
>
> I would assume you mean 'latency_nice' here.
>
> From my testing, I was not able to set values for any root owned task's
> latency_nice value by the non-root user. Also, my patch-set just piggybacks
> on the already existing sched_setattr syscall and hence it should not allow
> a non-root user to do any modifications. Can you confirm this by changing
> the nice (renice) value of a root task from a non-root user?
>
> I have done the sanity check in the code and am thinking about where it
> could possibly have gone wrong. So, can you please specify what values you
> were able to set outside the [-20, 19] range?

The checks do prevent values outside that range. But negative numbers -20 to -1
did not need root. Let me dig some more. I verified this explicitly before
sending the email, so something is up.

-chrish
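
For illustration only, and not taken from the patch set: the behaviour I would
expect if latency_nice followed the same rule of thumb as 'nice' can be
sketched as a small userspace model. Everything named here
(check_latency_nice_change(), the "privileged" flag standing in for a
capability check such as CAP_SYS_NICE) is hypothetical, and the sketch leaves
out rlimit handling entirely. It only shows the two separate checks being
discussed: the range check, and the "lowering needs privilege" check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Range quoted in the cover letter: [-20, 19], both inclusive. */
#define LATENCY_NICE_MIN (-20)
#define LATENCY_NICE_MAX 19

/* True when v lies inside the documented latency_nice range. */
static bool latency_nice_in_range(int v)
{
	return v >= LATENCY_NICE_MIN && v <= LATENCY_NICE_MAX;
}

/*
 * Hypothetical userspace model, NOT kernel code.
 *
 * cur:        the task's current latency_nice
 * req:        the requested latency_nice
 * privileged: assumed stand-in for a capability check (e.g. CAP_SYS_NICE)
 *
 * Returns 0 if the change is allowed, -EINVAL if req is out of range,
 * and -EPERM if an unprivileged caller tries to lower the value.
 */
static int check_latency_nice_change(int cur, int req, bool privileged)
{
	/* The stored value is assumed to always be in range. */
	assert(latency_nice_in_range(cur));

	if (!latency_nice_in_range(req))
		return -EINVAL;

	/* Raising (or keeping) the value is always allowed in this model. */
	if (req < cur && !privileged)
		return -EPERM;

	return 0;
}
```

In this model an unprivileged request to go from 0 to -5 is refused with
-EPERM, while 0 to 5 is accepted. What I observed is the range check working
but the second check apparently not being applied.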