Date: Wed, 29 Aug 2018 11:04:35 +0100
From: Patrick Bellasi
To: Quentin Perret
Cc: peterz@infradead.org, rjw@rjwysocki.net, linux-kernel@vger.kernel.org,
	linux-pm@vger.kernel.org, gregkh@linuxfoundation.org, mingo@redhat.com,
	dietmar.eggemann@arm.com, morten.rasmussen@arm.com,
	chris.redpath@arm.com, valentin.schneider@arm.com,
	vincent.guittot@linaro.org, thara.gopinath@linaro.org,
	viresh.kumar@linaro.org, tkjos@google.com, joel@joelfernandes.org,
	smuckle@google.com, adharmap@codeaurora.org, skannan@codeaurora.org,
	pkondeti@codeaurora.org, juri.lelli@redhat.com, edubezval@gmail.com,
	srinivas.pandruvada@linux.intel.com, currojerez@riseup.net,
	javi.merino@kernel.org
Subject: Re: [PATCH v6 03/14] PM: Introduce an Energy Model management framework
Message-ID: 
<20180829100435.GP2960@e110439-lin>
References: <20180820094420.26590-1-quentin.perret@arm.com>
	<20180820094420.26590-4-quentin.perret@arm.com>
In-Reply-To: <20180820094420.26590-4-quentin.perret@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Quentin,
a few possible optimizations related to the following (simplified) code:

On 20-Aug 10:44, Quentin Perret wrote:

[...]

> +struct em_perf_domain {
> +	struct em_cap_state *table; /* Capacity states, in ascending order. */
> +	int nr_cap_states;
> +	unsigned long cpus[0]; /* CPUs of the frequency domain. */
> +};

[...]

> +static DEFINE_PER_CPU(struct em_perf_domain *, em_data);

[...]

> +struct em_perf_domain *em_cpu_get(int cpu)
> +{
> +	return READ_ONCE(per_cpu(em_data, cpu));
> +}

[...]

> +int em_register_perf_domain(cpumask_t *span, unsigned int nr_states,
> +			     struct em_data_callback *cb)
> +{

[...]

> +	mutex_lock(&em_pd_mutex);
> +

[...]

> +	for_each_cpu(cpu, span) {
> +		if (READ_ONCE(per_cpu(em_data, cpu))) {
> +			ret = -EEXIST;
> +			goto unlock;
> +		}

[...]

> +	for_each_cpu(cpu, span) {
> +		/*
> +		 * The per-cpu array can be concurrently accessed from
> +		 * em_cpu_get().
> +		 */
> +		smp_store_release(per_cpu_ptr(&em_data, cpu), pd);
> +	}

[...]

> +unlock:
> +	mutex_unlock(&em_pd_mutex);
> +}

In the loop above we use smp_store_release() to publish the pointer
stored in the per-CPU em_data, whose ultimate goal is to protect
em_register_perf_domain() against multiple clients registering the same
power domain.

I think there are two possible optimizations there:

1. use of a single memory barrier

   Since we are already protected by em_pd_mutex, i.e. there cannot be
   concurrent writers, we can use a single memory barrier after the
   loop, i.e.

	for_each_cpu(cpu, span)
		WRITE_ONCE()
	smp_wmb()

   which should be just enough to ensure that all other CPUs will see
   the pointer set once we release the mutex.

2. avoid using PER_CPU variables

   Apart from the initialization code, i.e. boot time, em_data is
   expected to be read-only, isn't it?

   If that's the case, I think that using PER_CPU variables is not
   strictly required, while it unnecessarily increases the cache
   pressure.

   In the worst case we can end up with one cache line per CPU hosting
   just an 8B pointer, instead of using a single cache line to host up
   to 8 pointers if we use just an array, i.e.

	struct em_perf_domain *em_data[NR_CPUS]
		____cacheline_aligned_in_smp __read_mostly;

   Consider also that up to 8 pointers in a single cache line means
   that a single cache line can be enough to access the EM from all the
   CPUs of almost every modern mobile phone SoC.

   Not entirely sure if PER_CPU uses less overall memory in case you
   have far fewer CPUs than the compile-time defined NR_CPUS.
   But still, if the above makes sense, you still have an 8x gain
   factor between the number of write-allocated .data..percpu sections
   and the value of NR_CPUS. Meaning that in the worst case we allocate
   the same amount of memory using NR_CPUS=64 (the default on arm64)
   while running on an 8-CPU system... but we should still get less
   cluster cache pressure at run-time with the array approach,
   1 cache line vs 4.

Best,
Patrick

-- 
#include

Patrick Bellasi
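
P.S. To make the cache-line arithmetic in point 2 concrete, here is a
small userspace sketch, not kernel code. The 64-byte cache line size
and both helper names are assumptions made only for illustration; the
8-byte pointer size is passed in explicitly rather than taken from the
build target.

```c
#include <stddef.h>

/* Assumed cache line size in bytes; common on arm64, but not universal. */
#define CACHE_LINE_BYTES 64u

/*
 * Hypothetical helper: number of cache lines spanned by a packed,
 * cache-line-aligned array holding one pointer per CPU.
 */
static inline size_t array_cache_lines(size_t nr_cpus, size_t ptr_bytes)
{
	size_t bytes = nr_cpus * ptr_bytes;

	/* Round up to a whole number of cache lines. */
	return (bytes + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
}

/*
 * Hypothetical helper: worst case for per-CPU storage, where each
 * CPU's pointer ends up in a cache line of its own.
 */
static inline size_t percpu_worst_case_cache_lines(size_t nr_cpus)
{
	return nr_cpus;
}
```

Under these assumptions, array_cache_lines(8, 8) gives one line for an
8-CPU system, against up to eight lines in the per-CPU worst case, and
array_cache_lines(64, 8) gives eight lines for NR_CPUS=64.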