From: Qais Yousef
To: Ingo Molnar, Peter Zijlstra
Cc: Valentin Schneider, Qais Yousef, Juri Lelli, Vincent Guittot,
    Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
    Patrick Bellasi, Chris Redpath, Lukasz Luba,
    linux-kernel@vger.kernel.org
Subject: [PATCH v6 0/2] sched: Optionally skip uclamp logic in fast path
Date: Tue, 30 Jun 2020 12:21:21 +0100
Message-Id: <20200630112123.12076-1-qais.yousef@arm.com>

This series attempts to address the report that uclamp logic could be
expensive sometimes and shows a regression in netperf UDP_STREAM under
certain conditions.

The first patch is a fix for how struct uclamp_rq is initialized, which is
required by the 2nd patch, which contains the real 'fix'.

Worth noting that the root cause of the overhead is believed to be
system-specific or related to certain code/data layout issues, leading to
worse instruction/data cache (I/D $) performance.

Different systems exhibited different behaviors, and the regression
disappeared in certain kernel versions while attempting to reproduce it.

More info can be found here:

  https://lore.kernel.org/lkml/20200616110824.dgkkbyapn3io6wik@e107158-lin/

Having the static key seemed the best thing to do to ensure the effect of
uclamp is minimized for kernels that compile it in but don't have a
userspace that uses it. This will allow distros to distribute
uclamp-capable kernels by default without having to compromise on
performance for some systems that could be affected.

Changes in v6:
* s/uclamp_is_enabled/uclamp_is_used/ + add comment
* Improve the bailout condition for the case where we could end up with
  an unbalanced call of uclamp_rq_dec_id()
* Clarify some comments.

Changes in v5:
* Fix a race that could happen when the enqueue/dequeue of tasks A and B
  is not done in order, and sched_uclamp_used is enabled in between.
* Add more comments explaining the race and the behavior of
  uclamp_rq_util_with(), which is now protected with a static key to be a
  NOP. When no uclamp aggregation at rq level is done, this function
  can't do much.

Changes in v4:
* Fix broken boosting of RT tasks when the static key is disabled.

Changes in v3:
* Avoid double negatives and rename the static key to uclamp_used
* Unconditionally enable the static key through any of the paths where
  the user can modify the default uclamp value.
* Use C99 named struct initializer for struct uclamp_rq, which is easier
  to read than the memset().

Changes in v2:
* Add more info in the commit message about the result of perf diff to
  demonstrate that the activate/deactivate_task pressure is reduced in
  the fast path.
* Fix sparse warning reported by the test robot.
* Add an extra commit about using static_branch_likely() instead of
  static_branch_unlikely().

Thanks

--
Qais Yousef

Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Steven Rostedt
Cc: Ben Segall
Cc: Mel Gorman
Cc: Patrick Bellasi
Cc: Chris Redpath
Cc: Lukasz Luba
Cc: linux-kernel@vger.kernel.org

Qais Yousef (2):
  sched/uclamp: Fix initialization of struct uclamp_rq
  sched/uclamp: Protect uclamp fast path code with static key

 kernel/sched/core.c              | 95 ++++++++++++++++++++++++++++++--
 kernel/sched/cpufreq_schedutil.c |  2 +-
 kernel/sched/sched.h             | 47 +++++++++++++++-
 3 files changed, 135 insertions(+), 9 deletions(-)

--
2.17.1
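
For readers unfamiliar with the pattern, the gating and the bailout for an unbalanced decrement mentioned in the v5/v6 notes can be pictured with a small userspace sketch. This is not the code from the patches. A plain bool stands in for the kernel static key, and struct rq_bucket, struct task, rq_inc(), rq_dec() and demo_race() are hypothetical names chosen for illustration. The interleaving in demo_race() is only one plausible reading of the race described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: a plain flag stands in for the static key; off by default. */
static bool uclamp_used;

/* Hypothetical per-runqueue accounting bucket. */
struct rq_bucket {
	unsigned int tasks;	/* tasks currently accounted in this bucket */
};

/* Hypothetical per-task state. */
struct task {
	bool active;		/* true only if the enqueue side accounted it */
};

/* Enqueue-side accounting; a no-op while the feature is unused. */
static void rq_inc(struct rq_bucket *b, struct task *p)
{
	if (!uclamp_used)
		return;
	b->tasks++;
	p->active = true;
}

/* Dequeue-side accounting; bails out if the matching inc never ran. */
static void rq_dec(struct rq_bucket *b, struct task *p)
{
	if (!uclamp_used)
		return;
	if (!p->active)		/* enqueued before the flag was enabled */
		return;
	assert(b->tasks > 0);
	b->tasks--;
	p->active = false;
}

/*
 * Replay one interleaving: A is enqueued while the flag is off, the flag
 * is then enabled, B is enqueued, and both tasks are dequeued.
 * Returns the final bucket count, which should be balanced.
 */
unsigned int demo_race(void)
{
	struct rq_bucket b = { .tasks = 0 };
	struct task task_a = { .active = false };
	struct task task_b = { .active = false };

	uclamp_used = false;
	rq_inc(&b, &task_a);	/* skipped: flag is off */
	uclamp_used = true;	/* userspace starts using the feature */
	rq_inc(&b, &task_b);	/* accounted */
	rq_dec(&b, &task_a);	/* must not decrement: A was never counted */
	rq_dec(&b, &task_b);	/* balanced decrement */
	return b.tasks;
}
```

Without the per-task active check, the dequeue of A would decrement a count that its enqueue never incremented. In the kernel the flag check would be a static branch, so the disabled case costs close to nothing in the fast path.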