Date: Thu, 20 Nov 2014 14:54:27 -0400
From: Eduardo Valentin
To: Lukasz Majewski
Cc: Zhang Rui, Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
 Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
 Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren, Thierry Reding,
 Alexandre Courbot, linux-tegra@vger.kernel.org,
 linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/8] thermal:cpu cooling:fix: Provide thermal core fixes
 with deferred probe for several drivers
Message-ID: <20141120185420.GA26794@developer>
References: <1411547232-21493-1-git-send-email-l.majewski@samsung.com>
 <1415898165-27406-1-git-send-email-l.majewski@samsung.com>
In-Reply-To: <1415898165-27406-1-git-send-email-l.majewski@samsung.com>

Lukasz,

Thanks for keeping this up, and apologies for the late answer.

On Thu, Nov 13, 2014 at 06:02:37PM +0100, Lukasz Majewski wrote:
> Presented fixes are a response for problem described below:
> http://thread.gmane.org/gmane.linux.kernel/1793821/match=thermal+core+fix+initialize+max_state+variable+0
>
> In short - it turned out that two trivial fixes (included in this patch set)
> require support for deferred probe in thermal drivers.
>
> This situation shows up when CPU frequency reduction is used as a thermal cooling
> device for a thermal zone.
> It happens that during initialization, the call to thermal probe will be executed
> before cpufreq probe (it can be observed at ./drivers/Makefile).
> In such a situation thermal will not be properly configured until cpufreq policy
> is setup.
>
> In the current code (without included fixes) there is a time window in which thermal
> can try to use not configured cpufreq and possibly crash the system.
>
>
> Proposed solution was based on the code already available in the imx_thermal.c file.
>
> /db8500_thermal.c:                  -> NOT NEEDED
> /intel_powerclamp.c:                -> NOT NEEDED - INTEL (x86)
> /intel_powerclamp.c:                -> NOT NEEDED - INTEL (x86)
> /ti-soc-thermal/ti-bandgap.c:       -> FIXED [omap2plus_defconfig]
> /dove_thermal.c:                    -> NOT NEEDED - CPU_COOLING NOT AVAILABLE
>                                        [dove_defconfig]
> /spear_thermal.c:                   -> FIXED [spear3xx_defconfig]
> /samsung/exynos_tmu.c:              -> NOT NEEDED (nasty hack - will be reworked in later patches)
> /imx_thermal.c:                     -> OK (deferred probe already in place)
> /int340x_thermal/int3402_thermal.c: -> NOT NEEDED - ACPI x86 - Intel specific
> /int340x_thermal/int3400_thermal.c: -> NOT NEEDED - ACPI x86 - Intel specific
> /tegra_soctherm.c:                  -> FIXED [tegra_defconfig]
> /kirkwood_thermal.c:                -> FIXED [multi_v5_defconfig]
> /armada_thermal.c:                  -> FIXED [multi_v7_defconfig]
> /rcar_thermal.c:                    -> FIXED [shmobile_defconfig]
> /db8500_cpufreq_cooling.c:          -> OK (deferred probe already in place) [multi_v7_defconfig]
> /st/st_thermal_syscfg.c:            -> NOT NEEDED (Those two are enabled by e.g. ARMADA)
> /st/st_thermal_memmap.c:
>
>

Doing the same check in every driver that needs cpu cooling looks like a
promiscuous solution. What if we do this check in cpu cooling itself and
propagate the error in the callers' code?

From what I see, only exynos does not propagate the error.
And we would need a tweak in the cpufreq-dt code. Something like the
following (not tested):

diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index f657c57..f139247 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -181,7 +181,6 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 {
 	struct cpufreq_dt_platform_data *pd;
 	struct cpufreq_frequency_table *freq_table;
-	struct thermal_cooling_device *cdev;
 	struct device_node *np;
 	struct private_data *priv;
 	struct device *cpu_dev;
@@ -264,20 +263,6 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 		goto out_free_priv;
 	}
 
-	/*
-	 * For now, just loading the cooling device;
-	 * thermal DT code takes care of matching them.
-	 */
-	if (of_find_property(np, "#cooling-cells", NULL)) {
-		cdev = of_cpufreq_cooling_register(np, cpu_present_mask);
-		if (IS_ERR(cdev))
-			dev_err(cpu_dev,
-				"running cpufreq without cooling device: %ld\n",
-				PTR_ERR(cdev));
-		else
-			priv->cdev = cdev;
-	}
-
 	priv->cpu_dev = cpu_dev;
 	priv->cpu_reg = cpu_reg;
 	policy->driver_data = priv;
@@ -287,7 +272,7 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 	if (ret) {
 		dev_err(cpu_dev, "%s: invalid frequency table: %d\n", __func__,
 			ret);
-		goto out_cooling_unregister;
+		goto free_table;
 	}
 
 	policy->cpuinfo.transition_latency = transition_latency;
@@ -300,8 +285,7 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 
 	return 0;
 
-out_cooling_unregister:
-	cpufreq_cooling_unregister(priv->cdev);
+free_table:
 	dev_pm_opp_free_cpufreq_table(cpu_dev, &freq_table);
 out_free_priv:
 	kfree(priv);
@@ -342,11 +326,14 @@ static struct cpufreq_driver dt_cpufreq_driver = {
 
 static int dt_cpufreq_probe(struct platform_device *pdev)
 {
+	struct device_node *np;
 	struct device *cpu_dev;
 	struct regulator *cpu_reg;
 	struct clk *cpu_clk;
 	int ret;
 
+	/* at this point we checked the pointer already right? */
+	np = of_node_get(pdev->dev.of_node);
 	/*
 	 * All per-cluster (CPUs sharing clock/voltages) initialization is done
 	 * from ->init(). In probe(), we just need to make sure that clk and
@@ -368,6 +355,28 @@ static int dt_cpufreq_probe(struct platform_device *pdev)
 	if (ret)
 		dev_err(cpu_dev, "failed register driver: %d\n", ret);
 
+	/*
+	 * For now, just loading the cooling device;
+	 * thermal DT code takes care of matching them.
+	 */
+	if (of_find_property(np, "#cooling-cells", NULL)) {
+		struct cpufreq_policy policy;
+		struct private_data *priv;
+		struct thermal_cooling_device *cdev;
+
+		/* TODO: can cpu0 be always used ? */
+		cpufreq_get_policy(&policy, 0);
+		priv = policy.driver_data;
+		cdev = of_cpufreq_cooling_register(np, cpu_present_mask);
+		if (IS_ERR(cdev))
+			dev_err(cpu_dev,
+				"running cpufreq without cooling device: %ld\n",
+				PTR_ERR(cdev));
+		else
+			priv->cdev = cdev;
+	}
+	of_node_put(np);
+
 	return ret;
 }
 
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 1ab0018..342eb9e 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -440,6 +440,11 @@ __cpufreq_cooling_register(struct device_node *np,
 	int ret = 0, i;
 	struct cpufreq_policy policy;
 
+	if (!cpufreq_get_current_driver()) {
+		pr_warn("no cpufreq driver, deferring\n");
+		return ERR_PTR(-EPROBE_DEFER);
+	}
+
 	/* Verify that all the clip cpus have same freq_min, freq_max limit */
 	for_each_cpu(i, clip_cpus) {
 		/* continue if cpufreq policy not found and not return error */
diff --git a/drivers/thermal/samsung/exynos_thermal_common.c b/drivers/thermal/samsung/exynos_thermal_common.c
index 3f5ad25..f84975e 100644
--- a/drivers/thermal/samsung/exynos_thermal_common.c
+++ b/drivers/thermal/samsung/exynos_thermal_common.c
@@ -373,7 +373,7 @@ int exynos_register_thermal(struct thermal_sensor_conf *sensor_conf)
 		if (IS_ERR(th_zone->cool_dev[th_zone->cool_dev_size])) {
 			dev_err(sensor_conf->dev,
 				"Failed to register cpufreq cooling device\n");
-			ret = -EINVAL;
+			ret = PTR_ERR(th_zone->cool_dev[th_zone->cool_dev_size]);
 			goto err_unregister;
 		}
 		th_zone->cool_dev_size++;

This way, we avoid having the same test in every driver that needs it.
Besides, it makes sense for the cpu_cooling code to take care of this
check, as it is the first part with a direct dependency on cpufreq.

> I only possess Exynos boards and Beagle Bone Black, so I'd be grateful for
> testing proposed solution on other boards. The posted code is compile tested.
>
> This code applies on Eduardo's ti-soc-thermal-next tree:
> SHA1: 208a97042d66d9bfbcfab0d4a00c9fe317bb73d3
>
> Lukasz Majewski (8):
>   thermal:cpu cooling:armada: Provide deferred probing for armada driver
>   thermal:cpu cooling:kirkwood: Provide deferred probing for kirkwood
>     driver
>   thermal:cpu cooling:rcar: Provide deferred probing for rcar driver
>   thermal:cpu cooling:spear: Provide deferred probing for spear driver
>   thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
>   thermal:cpu cooling:ti: Provide deferred probing for ti drivers
>   thermal:core:fix: Initialize the max_state variable to 0
>   thermal:core:fix: Check return code of the ->get_max_state() callback
>
>  drivers/thermal/armada_thermal.c            | 7 +++++++
>  drivers/thermal/kirkwood_thermal.c          | 7 +++++++
>  drivers/thermal/rcar_thermal.c              | 7 +++++++
>  drivers/thermal/spear_thermal.c             | 7 +++++++
>  drivers/thermal/tegra_soctherm.c            | 7 +++++++
>  drivers/thermal/thermal_core.c              | 8 +++++---
>  drivers/thermal/ti-soc-thermal/ti-bandgap.c | 7 +++++++
>  7 files changed, 47 insertions(+), 3 deletions(-)
>
> --
> 2.0.0.rc2
>