References: <20200528132327.GB706460@hirez.programming.kicks-ass.net>
 <20200528155800.yjrmx3hj72xreryh@e107158-lin.cambridge.arm.com>
 <20200528161112.GI2483@worktop.programming.kicks-ass.net>
 <20200529100806.GA3070@suse.de>
 <87v9k84knx.derkling@matbug.net>
 <20200603101022.GG3070@suse.de>
 <20200603165200.v2ypeagziht7kxdw@e107158-lin.cambridge.arm.com>
 <20200608123102.6sdhdhit7lac5cfl@e107158-lin.cambridge.arm.com>
User-agent: mu4e 0.9.17; emacs 26.3
From: Valentin Schneider
To: Qais Yousef
Cc: Vincent Guittot, Mel Gorman, Patrick Bellasi, Dietmar Eggemann,
 Peter Zijlstra, Ingo Molnar, Randy Dunlap, Jonathan Corbet, Juri Lelli,
 Steven Rostedt, Ben Segall, Luis Chamberlain, Kees Cook, Iurii Zaikin,
 Quentin Perret, Pavan Kondeti, linux-doc@vger.kernel.org, linux-kernel,
 linux-fs
Subject: Re: [PATCH 1/2]
 sched/uclamp: Add a new sysctl to control RT default boost value
In-reply-to: <20200608123102.6sdhdhit7lac5cfl@e107158-lin.cambridge.arm.com>
Date: Mon, 08 Jun 2020 14:06:13 +0100
X-Mailing-List: linux-kernel@vger.kernel.org

On 08/06/20 13:31, Qais Yousef wrote:
> With uclamp enabled but no fair group I get
>
> *** uclamp enabled/fair group disabled ***
>
> # Executed 50000 pipe operations between two threads
>      Total time: 0.856 [sec]
>
>       17.125740 usecs/op
>           58391 ops/sec
>
> The drop is 5.5% in ops/sec, or 1 usec/op.
>
> I don't know what the expectation is here. 1 us could be a lot, but I don't
> think we expect the new code to take more than a few 100s of ns anyway. If
> you add potential caching effects, reaching 1 us wouldn't be that hard.
>

I don't think it's fair to look at the absolute delta. This being a very hot
path, cumulative overhead gets scary real quick. A 5.5% drop in work done is
a big hour lost over a full day of processing.

> Note that in my runs I chose the performance governor and used `taskset 0x2`
> to force running on a big core to make sure the runs are repeatable.
>
> On Juno-r2 I managed to scrape back most of the 1 us with the below patch.
> It seems there was weird branching behaviour that affects the I$ in my case.
> It'd be good to try it out to see if it makes a difference for you.
>
> The I$ effect is my best educated guess. Perf doesn't catch this path and
> I couldn't convince it to look at cache and branch misses between 2 specific
> points.
>
> Other subtle code shuffling had weird effects on the result too. One worthy
> one is that making uclamp_rq_dec() noinline gains back ~400 ns.
> Making uclamp_rq_inc() noinline *too* cancels this gain out :-/
>
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 0464569f26a7..0835ee20a3c7 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1071,13 +1071,11 @@ static inline void uclamp_rq_dec_id(struct rq *rq, struct task_struct *p,
>
>  static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p)
>  {
> -	enum uclamp_id clamp_id;
> -
>  	if (unlikely(!p->sched_class->uclamp_enabled))
>  		return;
>
> -	for_each_clamp_id(clamp_id)
> -		uclamp_rq_inc_id(rq, p, clamp_id);
> +	uclamp_rq_inc_id(rq, p, UCLAMP_MIN);
> +	uclamp_rq_inc_id(rq, p, UCLAMP_MAX);
>
>  	/* Reset clamp idle holding when there is one RUNNABLE task */
>  	if (rq->uclamp_flags & UCLAMP_FLAG_IDLE)
> @@ -1086,13 +1084,11 @@ static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p)
>
>  static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p)
>  {
> -	enum uclamp_id clamp_id;
> -
>  	if (unlikely(!p->sched_class->uclamp_enabled))
>  		return;
>
> -	for_each_clamp_id(clamp_id)
> -		uclamp_rq_dec_id(rq, p, clamp_id);
> +	uclamp_rq_dec_id(rq, p, UCLAMP_MIN);
> +	uclamp_rq_dec_id(rq, p, UCLAMP_MAX);
>  }
>

That's... surprising. Did you look at the difference in the generated code?

>  static inline void
>

> FWIW I fail to see activate/deactivate_task in perf record. They don't show
> up on the list, which means this micro benchmark doesn't stress them the way
> Mel's test does.
>

You're not going to see them in perf on the Juno. They're in IRQ-disabled
sections, so AFAICT they won't get sampled, since you don't have NMIs. You
can turn on ARM64_PSEUDO_NMI, but you'll need a GICv3 (Ampere eMAG, Cavium
ThunderX2).
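[Editor's note: to make the cumulative-overhead argument above concrete, here is a back-of-the-envelope sketch. Only the 58391 ops/sec and 5.5% figures come from the quoted benchmark; the baseline throughput is inferred from them, not taken from a separate run.]

```python
# Back-of-the-envelope cost of the quoted 5.5% ops/sec drop.
# 58391 ops/sec is the measured post-change figure from the perf pipe
# run above; the baseline below is inferred from the 5.5% drop.

drop = 0.055
after_ops = 58391                      # ops/sec with uclamp enabled
before_ops = after_ops / (1 - drop)    # implied baseline ops/sec (~61789)

# Extra cost per operation, in microseconds (~1 us, matching the quote).
extra_us_per_op = 1e6 / after_ops - 1e6 / before_ops

# Work lost if this hot path ran flat out for 24 hours: roughly
# "a big hour lost over a full day of processing".
hours_lost_per_day = drop * 24

print(f"{before_ops:.0f} ops/sec baseline, "
      f"{extra_us_per_op:.2f} us/op extra, "
      f"{hours_lost_per_day:.2f} h lost per day")
```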