Date: Thu, 14 Mar 2019 12:13:15 +0000
From: Patrick Bellasi
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
 linux-api@vger.kernel.org, Ingo Molnar, Tejun Heo, "Rafael J. Wysocki",
 Vincent Guittot, Viresh Kumar, Paul Turner, Quentin Perret,
 Dietmar Eggemann, Morten Rasmussen, Juri Lelli, Todd Kjos,
 Joel Fernandes, Steve Muckle, Suren Baghdasaryan
Subject: Re: [PATCH v7 01/15] sched/core: uclamp: Add CPU's clamp buckets refcounting
Message-ID: <20190314121315.juqpsqu5cwouuqpp@e110439-lin>
In-Reply-To: <20190313194838.GS2482@worktop.programming.kicks-ass.net>
References: <20190208100554.32196-1-patrick.bellasi@arm.com>
 <20190208100554.32196-2-patrick.bellasi@arm.com>
 <20190313134022.GB5922@hirez.programming.kicks-ass.net>
 <20190313161229.pkib2tmjass5chtb@e110439-lin>
 <20190313194838.GS2482@worktop.programming.kicks-ass.net>

On 13-Mar 20:48, Peter Zijlstra wrote:
> On Wed, Mar 13, 2019 at 04:12:29PM +0000, Patrick Bellasi wrote:
> > On 13-Mar 14:40, Peter Zijlstra wrote:
> > > On Fri, Feb 08, 2019 at 10:05:40AM +0000, Patrick Bellasi wrote:
> > > > +static inline unsigned int uclamp_bucket_id(unsigned int clamp_value)
> > > > +{
> > > > +	return clamp_value / UCLAMP_BUCKET_DELTA;
> > > > +}
> > > > +
> > > > +static inline unsigned int uclamp_bucket_value(unsigned int clamp_value)
> > > > +{
> > > > +	return UCLAMP_BUCKET_DELTA * uclamp_bucket_id(clamp_value);
> > >
> > > 	return clamp_value - (clamp_value % UCLAMP_BUCKET_DELTA);
> > >
> > > might generate better code; just a single division, instead of a div and
> > > mult.
> >
> > Wondering if compilers cannot do these optimizations... but yes, looks
> > cool and will do it in v8, thanks.
>
> I'd be most impressed if they pull this off. Check the generated code
> and see I suppose :-)

On x86 the code generated looks exactly the same:

   https://godbolt.org/z/PjmA7k

While on arm64 it seems the difference boils down to:
 - one single "mul" instruction
   vs
 - two instructions: "sub" _plus_ one "multiply subtract"

   https://godbolt.org/z/0shU0S

So, if I didn't get something wrong... perhaps the original version is
even better, isn't it?

Test code:

---8<---
#define UCLAMP_BUCKET_DELTA 52

static inline unsigned int uclamp_bucket_id(unsigned int clamp_value)
{
	return clamp_value / UCLAMP_BUCKET_DELTA;
}

static inline unsigned int uclamp_bucket_value1(unsigned int clamp_value)
{
	return UCLAMP_BUCKET_DELTA * uclamp_bucket_id(clamp_value);
}

static inline unsigned int uclamp_bucket_value2(unsigned int clamp_value)
{
	return clamp_value - (clamp_value % UCLAMP_BUCKET_DELTA);
}

int test1(int argc, char *argv[])
{
	return uclamp_bucket_value1(argc);
}

int test2(int argc, char *argv[])
{
	return uclamp_bucket_value2(argc);
}

int test3(int argc, char *argv[])
{
	return uclamp_bucket_value1(argc) - uclamp_bucket_value2(argc);
}
---8<---

which gives on arm64:

---8<---
test1:
        mov     w1, 60495
        movk    w1, 0x4ec4, lsl 16
        umull   x0, w0, w1
        lsr     x0, x0, 36
        mov     w1, 52
        mul     w0, w0, w1
        ret
test2:
        mov     w1, 60495
        movk    w1, 0x4ec4, lsl 16
        umull   x1, w0, w1
        lsr     x1, x1, 36
        mov     w2, 52
        msub    w1, w1, w2, w0
        sub     w0, w0, w1
        ret
test3:
        mov     w0, 0
        ret
---8<---

--
#include

Patrick Bellasi
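
For anyone who wants to double check the listing without godbolt, here is a minimal sketch (not part of the patch) that compares the two rounding forms and also replays the division the way the arm64 code does it. The helper names bucket_value_mul(), bucket_value_mod(), div52_by_reciprocal() and verify_up_to() are made up for this example. The constant 0x4ec4ec4f is simply what "mov w1, 60495" followed by "movk w1, 0x4ec4, lsl 16" builds, and the shift by 36 is taken from the "lsr" in the listing.

```c
#include <assert.h>
#include <stdint.h>

#define UCLAMP_BUCKET_DELTA 52

/* Original form from the patch: one division, then one multiplication. */
static inline unsigned int bucket_value_mul(unsigned int clamp_value)
{
	return UCLAMP_BUCKET_DELTA * (clamp_value / UCLAMP_BUCKET_DELTA);
}

/* Suggested form: subtract the remainder from the value. */
static inline unsigned int bucket_value_mod(unsigned int clamp_value)
{
	return clamp_value - (clamp_value % UCLAMP_BUCKET_DELTA);
}

/*
 * Division by 52 as in the arm64 listing: a 32x32->64 bit multiply
 * (umull) by 0x4ec4ec4f, followed by a right shift of 36 (lsr).
 * Hypothetical helper, written only to mirror the generated code.
 */
static inline uint32_t div52_by_reciprocal(uint32_t x)
{
	return (uint32_t)(((uint64_t)x * 0x4ec4ec4fULL) >> 36);
}

/*
 * Assert that all forms agree for every value in [0, limit] and
 * return how many values were checked. limit must be below UINT32_MAX
 * so that the returned count does not wrap.
 */
static uint32_t verify_up_to(uint32_t limit)
{
	uint32_t v = 0, checked = 0;

	for (;;) {
		/* Both rounding forms must give the same bucket value. */
		assert(bucket_value_mul(v) == bucket_value_mod(v));
		/* The multiply-and-shift must match a plain division. */
		assert(div52_by_reciprocal(v) == v / UCLAMP_BUCKET_DELTA);
		checked++;
		if (v == limit)
			break;
		v++;
	}
	return checked;
}
```

Calling verify_up_to(1024) from a small main() covers the clamp range, assuming clamp values never exceed SCHED_CAPACITY_SCALE (1024). Since the two forms are arithmetically identical for unsigned values, this also matches test3 folding to a constant 0; the only difference left is which instructions the compiler picks after the shared multiply-and-shift.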