Subject: Re: [PATCH 4/4] powercap/drivers/dtpm: Add CPU energy model based support
To: Lukasz Luba
Cc: rafael@kernel.org, srinivas.pandruvada@linux.intel.com,
    linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, rui.zhang@intel.com,
Wysocki" , Palmer Dabbelt , Anup Patel , Atish Patra , Marc Zyngier , Andrew Jones , Michael Kelley , Mike Leach , Kajol Jain , Daniel Jordan , Steven Price References: <20201006122024.14539-1-daniel.lezcano@linaro.org> <20201006122024.14539-5-daniel.lezcano@linaro.org> From: Daniel Lezcano Message-ID: Date: Wed, 4 Nov 2020 11:47:45 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Lukasz, On 23/10/2020 15:27, Lukasz Luba wrote: > Hi Daniel, > > > On 10/6/20 1:20 PM, Daniel Lezcano wrote: >> With the powercap dtpm controller, we are able to plug devices with >> power limitation features in the tree. >> >> The following patch introduces the CPU power limitation based on the >> energy model and the performance states. >> >> The power limitation is done at the performance domain level. If some >> CPUs are unplugged, the corresponding power will be substracted from >> the performance domain total power. >> >> It is up to the platform to initialize the dtpm tree and add the CPU. >> >> Here is an example to create a simple tree with one root node called >> "pkg" and the cpu's performance domains. [ ... ] >> +static int set_pd_power_limit(struct powercap_zone *pcz, int cid, >> +                  u64 power_limit) >> +{ >> +    struct dtpm *dtpm = to_dtpm(pcz); >> +    struct dtpm_cpu *dtpm_cpu = dtpm->private; >> +    struct em_perf_domain *pd; >> +    unsigned long freq; >> +    int i, nr_cpus; >> + >> +    spin_lock(&dtpm->lock); >> + >> +    power_limit = clamp_val(power_limit, dtpm->power_min, >> dtpm->power_max); >> + >> +    pd = em_cpu_get(dtpm_cpu->cpu); >> + >> +    nr_cpus = cpumask_weight(to_cpumask(pd->cpus)); >> + >> +    for (i = 0; i < pd->nr_perf_states; i++) { >> + >> +        u64 power = pd->table[i].power * MICROWATT_PER_MILLIWATT; >> + >> +        if ((power * nr_cpus) > power_limit) > > We have one node in that DTPM hierarchy tree, which represents all CPUs > which are in 'related_cpus' mask. I saw below that we just remove the > node in hotplug. The last CPU hotplugged will remove the node. > I have put a comment below asking if we could change the registration, > which will affect power calculation. 
>
>
>> +            break;
>> +    }
>> +
>> +    freq = pd->table[i - 1].frequency;
>> +
>> +    freq_qos_update_request(&dtpm_cpu->qos_req, freq);
>> +
>> +    dtpm->power_limit = power_limit;
>> +
>> +    spin_unlock(&dtpm->lock);
>> +
>> +    return 0;
>> +}
>> +
>> +static int get_pd_power_limit(struct powercap_zone *pcz, int cid, u64 *data)
>> +{
>> +    struct dtpm *dtpm = to_dtpm(pcz);
>> +
>> +    spin_lock(&dtpm->lock);
>> +    *data = dtpm->power_max;
>> +    spin_unlock(&dtpm->lock);
>> +
>> +    return 0;
>> +}
>> +
>> +static int get_pd_power_uw(struct powercap_zone *pcz, u64 *power_uw)
>> +{
>> +    struct dtpm *dtpm = to_dtpm(pcz);
>> +    struct dtpm_cpu *dtpm_cpu = dtpm->private;
>> +    struct em_perf_domain *pd;
>> +    unsigned long freq;
>> +    int i, nr_cpus;
>> +
>> +    freq = cpufreq_quick_get(dtpm_cpu->cpu);
>> +    pd = em_cpu_get(dtpm_cpu->cpu);
>> +    nr_cpus = cpumask_weight(to_cpumask(pd->cpus));
>> +
>> +    for (i = 0; i < pd->nr_perf_states; i++) {
>> +
>> +        if (pd->table[i].frequency < freq)
>> +            continue;
>> +
>> +        *power_uw = pd->table[i].power *
>> +            MICROWATT_PER_MILLIWATT * nr_cpus;
>
> Same here, we have 'nr_cpus'.
>
>> +
>> +        return 0;
>> +    }
>> +
>> +    return -EINVAL;
>> +}
>> +
>> +static int cpu_release_zone(struct powercap_zone *pcz)
>> +{
>> +    struct dtpm *dtpm = to_dtpm(pcz);
>> +    struct dtpm_cpu *dtpm_cpu = dtpm->private;
>> +
>> +    freq_qos_remove_request(&dtpm_cpu->qos_req);
>> +
>> +    return dtpm_release_zone(pcz);
>> +}
>> +
>> +static struct powercap_zone_constraint_ops pd_constraint_ops = {
>> +    .set_power_limit_uw = set_pd_power_limit,
>> +    .get_power_limit_uw = get_pd_power_limit,
>> +};
>> +
>> +static struct powercap_zone_ops pd_zone_ops = {
>> +    .get_power_uw = get_pd_power_uw,
>> +    .release = cpu_release_zone,
>> +};
>> +
>> +static int cpuhp_dtpm_cpu_offline(unsigned int cpu)
>> +{
>> +    struct cpufreq_policy *policy;
>> +    struct em_perf_domain *pd;
>> +    struct dtpm *dtpm;
>> +
>> +    policy = cpufreq_cpu_get(cpu);
>> +
>> +    if (!policy)
>> +        return 0;
>> +
>> +    pd = em_cpu_get(cpu);
>> +    if (!pd)
>> +        return -EINVAL;
>> +
>> +    dtpm = per_cpu(dtpm_per_cpu, cpu);
>> +
>> +    power_sub(dtpm, pd);
>> +
>> +    if (cpumask_weight(policy->cpus) != 1)
>> +        return 0;
>> +
>> +    for_each_cpu(cpu, policy->related_cpus)
>> +        per_cpu(dtpm_per_cpu, cpu) = NULL;
>
> Hotplugging one CPU would affect others. I would keep them
> all but mark somehow that the CPU is offline.

No, the last one will remove the node. This is checked in the test above:
cpumask_weight(policy->cpus) != 1.

>> +
>> +    dtpm_unregister(dtpm);
>
> Could we keep the node in the hierarchy on CPU hotplug?

[ ... ]

>> diff --git a/include/linux/dtpm.h b/include/linux/dtpm.h
>> index 6696bdcfdb87..b62215a13baa 100644
>> --- a/include/linux/dtpm.h
>> +++ b/include/linux/dtpm.h
>> @@ -70,4 +70,7 @@ int dtpm_register_parent(const char *name, struct dtpm *dtpm,
>>   int dtpm_register(const char *name, struct dtpm *dtpm, struct dtpm *parent,
>>             struct powercap_zone_ops *ops, int nr_constraints,
>>             struct powercap_zone_constraint_ops *const_ops);
>> +
>> +int dtpm_register_cpu(struct dtpm *parent);
>> +
>>   #endif
>>
>
> I have a few comments for this DTPM CPU.
>
> 1. Maybe we can register these CPUs differently. First register
> the parent node as a separate dtpm based on 'policy->related_cpus'. Then
> register new children nodes, one for each CPU. When the CPU is up, mark
> it as 'active'.
>
> 2. We don't remove the node when the CPU is hotplugged out, but we mark it
> '!active' or 'offline'. The power calculation could be done in the upper
> node, which takes that flag into account for its children.
>
> 3. We would only remove the node when its module is unloaded (e.g. GPU)
>
> That would make the tree more stable and also more detailed.
> We would also account the power properly when one CPU goes offline but
> the others are still there.
>
> What do you think?

The paradigm of the DTPM is that the intermediate nodes (the ones with
children) aggregate the power of their children and do not represent real
devices. The leaves are the real devices, which are power manageable.

In our case, the CPU DTPM is based on the performance domain, which is a
group of CPUs, hence it is a leaf of the tree.

I think you misunderstood: the power is recomputed when a CPU is switched
on/off, and the node is removed only when the last CPU of the domain is
unplugged.

e.g. 1000mW max per CPU, a performance domain with 4 CPUs:

With all CPUs on, max power is 4000mW.
With 3 CPUs on and 1 CPU off, max power is 3000mW.
etc...
With 4 CPUs off, the node is removed.

If the hardware evolves to a performance domain per CPU, we will end up
with a leaf per CPU and a "cluster" node on top of them.

--
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog
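[The following is not part of the thread; it is a small standalone C sketch that
makes the arithmetic above concrete. The struct names, OPP table values and
helper functions are hypothetical, not the patch code: it only mirrors the idea
that a domain's max power scales with the number of online CPUs, and that a
power limit is mapped to the highest OPP whose aggregated power still fits.]

```c
/*
 * Illustration only (userspace, hypothetical names): per-CPU power of each
 * performance state is scaled by the number of online CPUs in the domain,
 * and a power limit picks the highest frequency that still fits under it.
 */
#include <stdio.h>

struct perf_state {
	unsigned long freq_khz;	/* frequency of the OPP */
	unsigned long power_mw;	/* per-CPU power at that OPP (made-up values) */
};

static const struct perf_state table[] = {
	{  500000,  250 },
	{ 1000000,  500 },
	{ 1500000, 1000 },
};

#define NR_STATES (sizeof(table) / sizeof(table[0]))

/* Max power of the domain scales with the number of online CPUs */
static unsigned long domain_max_power_mw(int nr_online_cpus)
{
	return table[NR_STATES - 1].power_mw * nr_online_cpus;
}

/* Highest frequency whose aggregated power fits under the limit */
static unsigned long freq_for_limit_mw(unsigned long limit_mw, int nr_online_cpus)
{
	size_t i;

	for (i = 0; i < NR_STATES; i++)
		if (table[i].power_mw * nr_online_cpus > limit_mw)
			break;

	return table[i ? i - 1 : 0].freq_khz;
}

int main(void)
{
	int cpus;

	/* 4, 3, 2, 1 online CPUs: max power drops by 1000mW per offline CPU */
	for (cpus = 4; cpus >= 1; cpus--)
		printf("%d CPUs online: max power %lu mW\n",
		       cpus, domain_max_power_mw(cpus));

	/* A 2500mW limit with 4 CPUs online caps the domain at the
	 * 1000000 kHz OPP (500mW * 4 = 2000mW fits, 1000mW * 4 does not). */
	printf("limit 2500 mW, 4 CPUs -> freq %lu kHz\n",
	       freq_for_limit_mw(2500, 4));

	return 0;
}
```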