The final step in regulator_register() is to call
regulator_resolve_supply() for each registered regulator
(including the one in the process of being registered). The
regulator_resolve_supply() function first checks if rdev->supply
is NULL, then it performs various steps to try to find the supply.
If successful, rdev->supply is set inside of set_supply().
This procedure can encounter a race condition if two concurrent
tasks call regulator_register() near to each other on separate CPUs
and one of the regulators has rdev->supply_name specified. There
is currently nothing guaranteeing atomicity between the rdev->supply
check and set steps. Thus, both tasks can observe rdev->supply==NULL
in their regulator_resolve_supply() calls. This then results in
both creating a struct regulator for the supply. One ends up
actually stored in rdev->supply and the other is lost (though still
present in the supply's consumer_list).
Here is a kernel log snippet showing the issue:
[ 12.421768] gpu_cc_gx_gdsc: supplied by pm8350_s5_level
[ 12.425854] gpu_cc_gx_gdsc: supplied by pm8350_s5_level
[ 12.429064] debugfs: Directory 'regulator.4-SUPPLY' with parent
'17a00000.rsc:rpmh-regulator-gfxlvl-pm8350_s5_level'
already present!
Avoid this race condition by holding the rdev->mutex lock inside
of regulator_resolve_supply() while checking and setting
rdev->supply.
Signed-off-by: David Collins <[email protected]>
---
drivers/regulator/core.c | 39 ++++++++++++++++++++++++++++-----------
1 file changed, 28 insertions(+), 11 deletions(-)
diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index fee9241..3ae5ccd 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -1813,23 +1813,34 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
{
struct regulator_dev *r;
struct device *dev = rdev->dev.parent;
- int ret;
+ int ret = 0;
/* No supply to resolve? */
if (!rdev->supply_name)
return 0;
- /* Supply already resolved? */
+ /* Supply already resolved? (fast-path without locking contention) */
if (rdev->supply)
return 0;
+ /*
+ * Recheck rdev->supply with rdev->mutex lock held to avoid a race
+ * between rdev->supply null check and setting rdev->supply in
+ * set_supply() from concurrent tasks.
+ */
+ regulator_lock(rdev);
+
+ /* Supply just resolved by a concurrent task? */
+ if (rdev->supply)
+ goto out;
+
r = regulator_dev_lookup(dev, rdev->supply_name);
if (IS_ERR(r)) {
ret = PTR_ERR(r);
/* Did the lookup explicitly defer for us? */
if (ret == -EPROBE_DEFER)
- return ret;
+ goto out;
if (have_full_constraints()) {
r = dummy_regulator_rdev;
@@ -1837,15 +1848,18 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
} else {
dev_err(dev, "Failed to resolve %s-supply for %s\n",
rdev->supply_name, rdev->desc->name);
- return -EPROBE_DEFER;
+ ret = -EPROBE_DEFER;
+ goto out;
}
}
if (r == rdev) {
dev_err(dev, "Supply for %s (%s) resolved to itself\n",
rdev->desc->name, rdev->supply_name);
- if (!have_full_constraints())
- return -EINVAL;
+ if (!have_full_constraints()) {
+ ret = -EINVAL;
+ goto out;
+ }
r = dummy_regulator_rdev;
get_device(&r->dev);
}
@@ -1859,7 +1873,8 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
if (r->dev.parent && r->dev.parent != rdev->dev.parent) {
if (!device_is_bound(r->dev.parent)) {
put_device(&r->dev);
- return -EPROBE_DEFER;
+ ret = -EPROBE_DEFER;
+ goto out;
}
}
@@ -1867,13 +1882,13 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
ret = regulator_resolve_supply(r);
if (ret < 0) {
put_device(&r->dev);
- return ret;
+ goto out;
}
ret = set_supply(rdev, r);
if (ret < 0) {
put_device(&r->dev);
- return ret;
+ goto out;
}
/*
@@ -1886,11 +1901,13 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
if (ret < 0) {
_regulator_put(rdev->supply);
rdev->supply = NULL;
- return ret;
+ goto out;
}
}
- return 0;
+out:
+ regulator_unlock(rdev);
+ return ret;
}
/* Internal regulator request function */
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
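To make the check-then-set race from the patch description concrete, here is a minimal userspace sketch using pthreads. It is not the kernel code: struct fake_rdev, fake_resolve_supply() and run_concurrent_resolves() are hypothetical stand-ins invented for illustration, and the unlocked fast-path check from the patch is left out because a plain unlocked read would be a data race in userspace C.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct regulator_dev. */
struct fake_rdev {
	pthread_mutex_t mutex;	/* plays the role of rdev->mutex */
	void *supply;		/* plays the role of rdev->supply */
	int supplies_created;	/* how many times a "supply" was created */
};

static int dummy_supply;

/* Check and set under one lock, mirroring the shape of the patched function. */
static int fake_resolve_supply(struct fake_rdev *rdev)
{
	int ret = 0;

	pthread_mutex_lock(&rdev->mutex);

	/* Supply just resolved by a concurrent task? */
	if (rdev->supply)
		goto out;

	/* Stands in for the lookup and set_supply() steps. */
	rdev->supplies_created++;
	rdev->supply = &dummy_supply;
out:
	pthread_mutex_unlock(&rdev->mutex);
	return ret;
}

static void *resolve_thread(void *arg)
{
	fake_resolve_supply(arg);
	return NULL;
}

/* Run up to 16 concurrent resolves; return how many supplies were created. */
static int run_concurrent_resolves(int nthreads)
{
	struct fake_rdev rdev = { .supply = NULL, .supplies_created = 0 };
	pthread_t tid[16];
	int i, started = 0;

	if (nthreads > 16)
		nthreads = 16;

	pthread_mutex_init(&rdev.mutex, NULL);
	for (i = 0; i < nthreads; i++)
		if (pthread_create(&tid[started], NULL, resolve_thread, &rdev) == 0)
			started++;
	for (i = 0; i < started; i++)
		pthread_join(tid[i], NULL);
	pthread_mutex_destroy(&rdev.mutex);

	return rdev.supplies_created;
}
```

With the check and the set inside the same critical section, only the first thread to take the lock creates the supply; without the lock, two threads could both pass the NULL check before either one stores the pointer.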
On Thu, 7 Jan 2021 17:16:02 -0800, David Collins wrote:
> The final step in regulator_register() is to call
> regulator_resolve_supply() for each registered regulator
> (including the one in the process of being registered). The
> regulator_resolve_supply() function first checks if rdev->supply
> is NULL, then it performs various steps to try to find the supply.
> If successful, rdev->supply is set inside of set_supply().
>
> [...]
Applied to
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator.git for-next
Thanks!
[1/1] regulator: core: avoid regulator_resolve_supply() race condition
commit: eaa7995c529b54d68d97a30f6344cc6ca2f214a7
All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.
You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.
If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.
Please add any relevant lists and maintainers to the CCs when replying
to this mail.
Thanks,
Mark
Hi,
On 08.01.2021 02:16, David Collins wrote:
> The final step in regulator_register() is to call
> regulator_resolve_supply() for each registered regulator
> (including the one in the process of being registered). The
> regulator_resolve_supply() function first checks if rdev->supply
> is NULL, then it performs various steps to try to find the supply.
> If successful, rdev->supply is set inside of set_supply().
>
> This procedure can encounter a race condition if two concurrent
> tasks call regulator_register() near to each other on separate CPUs
> and one of the regulators has rdev->supply_name specified. There
> is currently nothing guaranteeing atomicity between the rdev->supply
> check and set steps. Thus, both tasks can observe rdev->supply==NULL
> in their regulator_resolve_supply() calls. This then results in
> both creating a struct regulator for the supply. One ends up
> actually stored in rdev->supply and the other is lost (though still
> present in the supply's consumer_list).
>
> Here is a kernel log snippet showing the issue:
>
> [ 12.421768] gpu_cc_gx_gdsc: supplied by pm8350_s5_level
> [ 12.425854] gpu_cc_gx_gdsc: supplied by pm8350_s5_level
> [ 12.429064] debugfs: Directory 'regulator.4-SUPPLY' with parent
> '17a00000.rsc:rpmh-regulator-gfxlvl-pm8350_s5_level'
> already present!
>
> Avoid this race condition by holding the rdev->mutex lock inside
> of regulator_resolve_supply() while checking and setting
> rdev->supply.
>
> Signed-off-by: David Collins <[email protected]>
This patch landed in linux next-20210112 as commit eaa7995c529b
("regulator: core: avoid regulator_resolve_supply() race condition"). I
found that it triggers the following lockdep warning during the DWC3
driver registration on some Exynos-based boards (this log is from the
Samsung Exynos5420-based Peach-Pit board):
======================================================
WARNING: possible circular locking dependency detected
5.11.0-rc1-00008-geaa7995c529b #10095 Not tainted
------------------------------------------------------
swapper/0/1 is trying to acquire lock:
c12e1b80 (regulator_list_mutex){+.+.}-{3:3}, at: regulator_lock_dependent+0x4c/0x2b0
but task is already holding lock:
df7190c0 (regulator_ww_class_mutex){+.+.}-{3:3}, at: regulator_resolve_supply+0x44/0x318
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (regulator_ww_class_mutex){+.+.}-{3:3}:
ww_mutex_lock+0x48/0x88
regulator_lock_recursive+0x84/0x1f4
regulator_lock_dependent+0x184/0x2b0
regulator_enable+0x30/0xe4
dwc3_exynos_probe+0x17c/0x2c0
platform_probe+0x80/0xc0
really_probe+0x1c4/0x4e4
driver_probe_device+0x78/0x1d8
device_driver_attach+0x58/0x60
__driver_attach+0xfc/0x160
bus_for_each_dev+0x6c/0xb8
bus_add_driver+0x170/0x20c
driver_register+0x78/0x10c
do_one_initcall+0x88/0x438
kernel_init_freeable+0x18c/0x1dc
kernel_init+0x8/0x118
ret_from_fork+0x14/0x38
0x0
-> #1 (regulator_ww_class_acquire){+.+.}-{0:0}:
regulator_enable+0x30/0xe4
dwc3_exynos_probe+0x17c/0x2c0
platform_probe+0x80/0xc0
really_probe+0x1c4/0x4e4
driver_probe_device+0x78/0x1d8
device_driver_attach+0x58/0x60
__driver_attach+0xfc/0x160
bus_for_each_dev+0x6c/0xb8
bus_add_driver+0x170/0x20c
driver_register+0x78/0x10c
do_one_initcall+0x88/0x438
kernel_init_freeable+0x18c/0x1dc
kernel_init+0x8/0x118
ret_from_fork+0x14/0x38
0x0
-> #0 (regulator_list_mutex){+.+.}-{3:3}:
lock_acquire+0x2e4/0x5dc
__mutex_lock+0xa4/0xb60
mutex_lock_nested+0x1c/0x24
regulator_lock_dependent+0x4c/0x2b0
regulator_enable+0x30/0xe4
regulator_resolve_supply+0x1cc/0x318
regulator_register_resolve_supply+0x14/0x78
class_for_each_device+0x68/0xe8
regulator_register+0xa2c/0xc9c
devm_regulator_register+0x40/0x70
tps65090_regulator_probe+0x150/0x648
platform_probe+0x80/0xc0
really_probe+0x1c4/0x4e4
driver_probe_device+0x78/0x1d8
bus_for_each_drv+0x78/0xbc
__device_attach+0xe8/0x180
bus_probe_device+0x88/0x90
device_add+0x4c4/0x7e8
platform_device_add+0x120/0x25c
mfd_add_devices+0x580/0x60c
tps65090_i2c_probe+0xb8/0x184
i2c_device_probe+0x234/0x2a4
really_probe+0x1c4/0x4e4
driver_probe_device+0x78/0x1d8
bus_for_each_drv+0x78/0xbc
__device_attach+0xe8/0x180
bus_probe_device+0x88/0x90
device_add+0x4c4/0x7e8
i2c_new_client_device+0x15c/0x27c
of_i2c_register_devices+0x114/0x184
i2c_register_adapter+0x1d8/0x6dc
ec_i2c_probe+0xc8/0x124
platform_probe+0x80/0xc0
really_probe+0x1c4/0x4e4
driver_probe_device+0x78/0x1d8
bus_for_each_drv+0x78/0xbc
__device_attach+0xe8/0x180
bus_probe_device+0x88/0x90
device_add+0x4c4/0x7e8
of_platform_device_create_pdata+0x90/0xc8
of_platform_bus_create+0x1a0/0x4ec
of_platform_populate+0x88/0x120
devm_of_platform_populate+0x40/0x80
cros_ec_register+0x174/0x308
cros_ec_spi_probe+0x16c/0x1ec
spi_probe+0x88/0xac
really_probe+0x1c4/0x4e4
driver_probe_device+0x78/0x1d8
device_driver_attach+0x58/0x60
__driver_attach+0xfc/0x160
bus_for_each_dev+0x6c/0xb8
bus_add_driver+0x170/0x20c
driver_register+0x78/0x10c
do_one_initcall+0x88/0x438
kernel_init_freeable+0x18c/0x1dc
kernel_init+0x8/0x118
ret_from_fork+0x14/0x38
0x0
other info that might help us debug this:
Chain exists of:
  regulator_list_mutex --> regulator_ww_class_acquire --> regulator_ww_class_mutex
 Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(regulator_ww_class_mutex);
                               lock(regulator_ww_class_acquire);
                               lock(regulator_ww_class_mutex);
  lock(regulator_list_mutex);
*** DEADLOCK ***
5 locks held by swapper/0/1:
#0: dfb6e4c8 (&dev->mutex){....}-{3:3}, at: device_driver_attach+0x18/0x60
#1: c1fedcd8 (&dev->mutex){....}-{3:3}, at: __device_attach+0x34/0x180
#2: df53a4e8 (&dev->mutex){....}-{3:3}, at: __device_attach+0x34/0x180
#3: df5224d8 (&dev->mutex){....}-{3:3}, at: __device_attach+0x34/0x180
#4: df7190c0 (regulator_ww_class_mutex){+.+.}-{3:3}, at: regulator_resolve_supply+0x44/0x318
stack backtrace:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.11.0-rc1-00008-geaa7995c529b #10095
Hardware name: Samsung Exynos (Flattened Device Tree)
[<c01116e8>] (unwind_backtrace) from [<c010cf58>] (show_stack+0x10/0x14)
[<c010cf58>] (show_stack) from [<c0b38ffc>] (dump_stack+0xa4/0xc4)
[<c0b38ffc>] (dump_stack) from [<c0193458>] (check_noncircular+0x14c/0x164)
[<c0193458>] (check_noncircular) from [<c0196b90>]
(__lock_acquire+0x1830/0x31cc)
[<c0196b90>] (__lock_acquire) from [<c01991e4>] (lock_acquire+0x2e4/0x5dc)
[<c01991e4>] (lock_acquire) from [<c0b4043c>] (__mutex_lock+0xa4/0xb60)
[<c0b4043c>] (__mutex_lock) from [<c0b40f14>] (mutex_lock_nested+0x1c/0x24)
[<c0b40f14>] (mutex_lock_nested) from [<c05ccd94>]
(regulator_lock_dependent+0x4c/0x2b0)
[<c05ccd94>] (regulator_lock_dependent) from [<c05d220c>]
(regulator_enable+0x30/0xe4)
[<c05d220c>] (regulator_enable) from [<c05d248c>]
(regulator_resolve_supply+0x1cc/0x318)
[<c05d248c>] (regulator_resolve_supply) from [<c05d2974>]
(regulator_register_resolve_supply+0x14/0x78)
[<c05d2974>] (regulator_register_resolve_supply) from [<c06a3000>]
(class_for_each_device+0x68/0xe8)
[<c06a3000>] (class_for_each_device) from [<c05d3e20>]
(regulator_register+0xa2c/0xc9c)
[<c05d3e20>] (regulator_register) from [<c05d5c70>]
(devm_regulator_register+0x40/0x70)
[<c05d5c70>] (devm_regulator_register) from [<c05dea58>]
(tps65090_regulator_probe+0x150/0x648)
[<c05dea58>] (tps65090_regulator_probe) from [<c06a3fe8>]
(platform_probe+0x80/0xc0)
[<c06a3fe8>] (platform_probe) from [<c06a1114>] (really_probe+0x1c4/0x4e4)
[<c06a1114>] (really_probe) from [<c06a14ac>]
(driver_probe_device+0x78/0x1d8)
[<c06a14ac>] (driver_probe_device) from [<c069f1a4>]
(bus_for_each_drv+0x78/0xbc)
[<c069f1a4>] (bus_for_each_drv) from [<c06a0eb0>]
(__device_attach+0xe8/0x180)
[<c06a0eb0>] (__device_attach) from [<c069ff50>]
(bus_probe_device+0x88/0x90)
[<c069ff50>] (bus_probe_device) from [<c069dbac>] (device_add+0x4c4/0x7e8)
[<c069dbac>] (device_add) from [<c06a3bac>]
(platform_device_add+0x120/0x25c)
[<c06a3bac>] (platform_device_add) from [<c06d5c7c>]
(mfd_add_devices+0x580/0x60c)
[<c06d5c7c>] (mfd_add_devices) from [<c06d80e8>]
(tps65090_i2c_probe+0xb8/0x184)
[<c06d80e8>] (tps65090_i2c_probe) from [<c0822520>]
(i2c_device_probe+0x234/0x2a4)
[<c0822520>] (i2c_device_probe) from [<c06a1114>] (really_probe+0x1c4/0x4e4)
[<c06a1114>] (really_probe) from [<c06a14ac>]
(driver_probe_device+0x78/0x1d8)
[<c06a14ac>] (driver_probe_device) from [<c069f1a4>]
(bus_for_each_drv+0x78/0xbc)
[<c069f1a4>] (bus_for_each_drv) from [<c06a0eb0>]
(__device_attach+0xe8/0x180)
[<c06a0eb0>] (__device_attach) from [<c069ff50>]
(bus_probe_device+0x88/0x90)
[<c069ff50>] (bus_probe_device) from [<c069dbac>] (device_add+0x4c4/0x7e8)
[<c069dbac>] (device_add) from [<c0824aec>]
(i2c_new_client_device+0x15c/0x27c)
[<c0824aec>] (i2c_new_client_device) from [<c08285e0>]
(of_i2c_register_devices+0x114/0x184)
[<c08285e0>] (of_i2c_register_devices) from [<c08254b8>]
(i2c_register_adapter+0x1d8/0x6dc)
[<c08254b8>] (i2c_register_adapter) from [<c082dd1c>]
(ec_i2c_probe+0xc8/0x124)
[<c082dd1c>] (ec_i2c_probe) from [<c06a3fe8>] (platform_probe+0x80/0xc0)
[<c06a3fe8>] (platform_probe) from [<c06a1114>] (really_probe+0x1c4/0x4e4)
[<c06a1114>] (really_probe) from [<c06a14ac>]
(driver_probe_device+0x78/0x1d8)
[<c06a14ac>] (driver_probe_device) from [<c069f1a4>]
(bus_for_each_drv+0x78/0xbc)
[<c069f1a4>] (bus_for_each_drv) from [<c06a0eb0>]
(__device_attach+0xe8/0x180)
[<c06a0eb0>] (__device_attach) from [<c069ff50>]
(bus_probe_device+0x88/0x90)
[<c069ff50>] (bus_probe_device) from [<c069dbac>] (device_add+0x4c4/0x7e8)
[<c069dbac>] (device_add) from [<c08b140c>]
(of_platform_device_create_pdata+0x90/0xc8)
[<c08b140c>] (of_platform_device_create_pdata) from [<c08b15f0>]
(of_platform_bus_create+0x1a0/0x4ec)
[<c08b15f0>] (of_platform_bus_create) from [<c08b1af0>]
(of_platform_populate+0x88/0x120)
[<c08b1af0>] (of_platform_populate) from [<c08b1bdc>]
(devm_of_platform_populate+0x40/0x80)
[<c08b1bdc>] (devm_of_platform_populate) from [<c08b72fc>]
(cros_ec_register+0x174/0x308)
[<c08b72fc>] (cros_ec_register) from [<c08b868c>]
(cros_ec_spi_probe+0x16c/0x1ec)
[<c08b868c>] (cros_ec_spi_probe) from [<c071b2f4>] (spi_probe+0x88/0xac)
[<c071b2f4>] (spi_probe) from [<c06a1114>] (really_probe+0x1c4/0x4e4)
[<c06a1114>] (really_probe) from [<c06a14ac>]
(driver_probe_device+0x78/0x1d8)
[<c06a14ac>] (driver_probe_device) from [<c06a19c4>]
(device_driver_attach+0x58/0x60)
[<c06a19c4>] (device_driver_attach) from [<c06a1ac8>]
(__driver_attach+0xfc/0x160)
[<c06a1ac8>] (__driver_attach) from [<c069f0cc>]
(bus_for_each_dev+0x6c/0xb8)
[<c069f0cc>] (bus_for_each_dev) from [<c06a0204>]
(bus_add_driver+0x170/0x20c)
[<c06a0204>] (bus_add_driver) from [<c06a2968>] (driver_register+0x78/0x10c)
[<c06a2968>] (driver_register) from [<c0102428>]
(do_one_initcall+0x88/0x438)
[<c0102428>] (do_one_initcall) from [<c1101104>]
(kernel_init_freeable+0x18c/0x1dc)
[<c1101104>] (kernel_init_freeable) from [<c0b3c65c>]
(kernel_init+0x8/0x118)
[<c0b3c65c>] (kernel_init) from [<c010011c>] (ret_from_fork+0x14/0x38)
Exception stack(0xc1ce3fb0 to 0xc1ce3ff8)
3fa0: 00000000 00000000 00000000 00000000
3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
3fe0: 00000000 00000000 00000000 00000000 00000013 00000000
I haven't yet analyzed whether this warning is a real issue or just a
false positive. If you have any hints or comments, let me know.
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
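The circular dependency in the report above can be replayed with a small lock-order graph. The following is only a rough model of the idea behind the warning, not how lockdep itself is implemented; the enum, held_before[][], record_order() and replay_report() are hypothetical names made up for this sketch. Each edge means "the second lock was taken while the first was held", and a new edge is suspicious when the reverse direction is already reachable.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* The three lock classes named in the report. */
enum lock_class { LIST_MUTEX, WW_ACQUIRE, WW_MUTEX, NR_CLASSES };

/* held_before[a][b]: some code path took b while already holding a. */
static bool held_before[NR_CLASSES][NR_CLASSES];

/* Depth-first search: is "to" reachable from "from" along recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
	int next;

	if (from == to)
		return true;
	seen[from] = true;
	for (next = 0; next < NR_CLASSES; next++)
		if (held_before[from][next] && !seen[next] &&
		    reachable(next, to, seen))
			return true;
	return false;
}

/* Record "held -> acquired"; return true if that edge closes a cycle. */
static bool record_order(int held, int acquired)
{
	bool seen[NR_CLASSES] = { false };
	bool cycle = reachable(acquired, held, seen);

	held_before[held][acquired] = true;
	return cycle;
}

/* Replay the orderings from the report; true if only the last one cycles. */
static bool replay_report(void)
{
	bool a, b, c;

	memset(held_before, 0, sizeof(held_before));

	/* Chain #1 and #2: regulator_enable() called from dwc3_exynos_probe(). */
	a = record_order(LIST_MUTEX, WW_ACQUIRE);
	b = record_order(WW_ACQUIRE, WW_MUTEX);

	/*
	 * Chain #0: regulator_resolve_supply() holds the rdev lock and then
	 * regulator_enable() takes regulator_list_mutex.
	 */
	c = record_order(WW_MUTEX, LIST_MUTEX);

	return !a && !b && c;
}
```

In this model the first two edges are harmless on their own; it is the third, introduced by taking the rdev lock earlier in regulator_resolve_supply(), that completes the loop.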
On Tue, Jan 12, 2021 at 10:34:19PM +0100, Marek Szyprowski wrote:
> ======================================================
> WARNING: possible circular locking dependency detected
> 5.11.0-rc1-00008-geaa7995c529b #10095 Not tainted
> ------------------------------------------------------
> swapper/0/1 is trying to acquire lock:
> c12e1b80 (regulator_list_mutex){+.+.}-{3:3}, at:
> regulator_lock_dependent+0x4c/0x2b0
If you're sending backtraces or other enormous reports like this please
run them through addr2line first so that things are a bit more legible.
> but task is already holding lock:
> df7190c0 (regulator_ww_class_mutex){+.+.}-{3:3}, at:
> regulator_resolve_supply+0x44/0x318
>
> which lock already depends on the new lock.
Does this help (completely untested):
diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index 3ae5ccd9277d..7d1422b00974 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -1823,17 +1823,6 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
if (rdev->supply)
return 0;
- /*
- * Recheck rdev->supply with rdev->mutex lock held to avoid a race
- * between rdev->supply null check and setting rdev->supply in
- * set_supply() from concurrent tasks.
- */
- regulator_lock(rdev);
-
- /* Supply just resolved by a concurrent task? */
- if (rdev->supply)
- goto out;
-
r = regulator_dev_lookup(dev, rdev->supply_name);
if (IS_ERR(r)) {
ret = PTR_ERR(r);
@@ -1885,10 +1874,23 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
goto out;
}
+ /*
+ * Recheck rdev->supply with rdev->mutex lock held to avoid a race
+ * between rdev->supply null check and setting rdev->supply in
+ * set_supply() from concurrent tasks.
+ */
+ regulator_lock(rdev);
+
+ /* Supply just resolved by a concurrent task? */
+ if (rdev->supply) {
+ put_device(&r->dev);
+ goto out_rdev_lock;
+ }
+
ret = set_supply(rdev, r);
if (ret < 0) {
put_device(&r->dev);
- goto out;
+ goto out_rdev_lock;
}
/*
@@ -1901,12 +1903,13 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
if (ret < 0) {
_regulator_put(rdev->supply);
rdev->supply = NULL;
- goto out;
+ goto out_rdev_lock;
}
}
-out:
+out_rdev_lock:
regulator_unlock(rdev);
+out:
return ret;
}
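The shape of the change above (look up the supply without the rdev lock, then lock, recheck, and drop the extra reference if another task got there first) can be sketched in userspace as follows. This is an illustration under assumptions, not the kernel code: struct fake_supply, fake_lookup(), fake_put(), fake_commit_supply() and replay_two_tasks() are hypothetical names, the plain int refcount is only safe because the replay is single-threaded, and the later regulator_enable() call that the real function makes under the lock is not modeled.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the supply's struct device reference count. */
struct fake_supply {
	int refcount;
};

/* Hypothetical stand-in for struct regulator_dev. */
struct fake_rdev {
	pthread_mutex_t mutex;		/* plays the role of rdev->mutex */
	struct fake_supply *supply;	/* plays the role of rdev->supply */
};

/* Stands in for regulator_dev_lookup(): takes a reference, no rdev lock. */
static struct fake_supply *fake_lookup(struct fake_supply *s)
{
	s->refcount++;
	return s;
}

/* Stands in for put_device(). */
static void fake_put(struct fake_supply *s)
{
	s->refcount--;
}

/* Returns 1 if this caller installed the supply, 0 if one was already set. */
static int fake_commit_supply(struct fake_rdev *rdev, struct fake_supply *r)
{
	int installed = 0;

	pthread_mutex_lock(&rdev->mutex);

	/* Supply just resolved by a concurrent task? Drop our reference. */
	if (rdev->supply) {
		fake_put(r);
		goto out_rdev_lock;
	}

	/* Stands in for set_supply(). */
	rdev->supply = r;
	installed = 1;

out_rdev_lock:
	pthread_mutex_unlock(&rdev->mutex);
	return installed;
}

/* Replay two tasks that both finish the lookup before either commits. */
static int replay_two_tasks(void)
{
	struct fake_supply s = { .refcount = 0 };
	struct fake_rdev rdev = { .supply = NULL };
	struct fake_supply *r1, *r2;
	int installs;

	pthread_mutex_init(&rdev.mutex, NULL);
	r1 = fake_lookup(&s);
	r2 = fake_lookup(&s);
	installs = fake_commit_supply(&rdev, r1);
	installs += fake_commit_supply(&rdev, r2);
	pthread_mutex_destroy(&rdev.mutex);

	return installs == 1 && s.refcount == 1 && rdev.supply == &s;
}
```

The losing task must give back the reference it took during the lookup, which is why the recheck branch calls put_device() in the diff above. Whether narrowing the locked region this way also avoids the reported lock ordering is a separate question that this sketch does not answer.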
Hi,
On Wed, 13 Jan 2021 at 03:21, Marek Szyprowski <[email protected]> wrote:
>
> Hi,
>
<trim>
>
> This patch landed in linux next-20210112 as commit eaa7995c529b
> ("regulator: core: avoid regulator_resolve_supply() race condition"). I
> found that it triggers the following lockdep warning during the DWC3
> driver registration on some Exynos-based boards (this log is from the
> Samsung Exynos5420-based Peach-Pit board):
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 5.11.0-rc1-00008-geaa7995c529b #10095 Not tainted
> ------------------------------------------------------
> swapper/0/1 is trying to acquire lock:
> c12e1b80 (regulator_list_mutex){+.+.}-{3:3}, at:
> regulator_lock_dependent+0x4c/0x2b0
>
> but task is already holding lock:
> df7190c0 (regulator_ww_class_mutex){+.+.}-{3:3}, at:
> regulator_resolve_supply+0x44/0x318
LKFT testing also found this lockdep warning on
arm64 - hi6220-hikey while booting.
[ 0.635532] WARNING: possible recursive locking detected
[ 0.635558] 5.11.0-rc3-next-20210118 #1 Not tainted
[ 0.635585] --------------------------------------------
[ 0.635611] swapper/0/1 is trying to acquire lock:
[ 0.635636] ffff000000a13158 (regulator_ww_class_mutex){+.+.}-{3:3}, at: regulator_lock_recursive+0x9c/0x1e8
[ 0.635721]
[ 0.635721] but task is already holding lock:
[ 0.635749] ffff000000a13958 (regulator_ww_class_mutex){+.+.}-{3:3}, at: regulator_resolve_supply+0x70/0x2f0
[ 0.635817]
[ 0.635817] other info that might help us debug this:
[ 0.635847] Possible unsafe locking scenario:
[ 0.635847]
[ 0.635875] CPU0
[ 0.635892] ----
[ 0.635909] lock(regulator_ww_class_mutex);
[ 0.635942] lock(regulator_ww_class_mutex);
[ 0.635974]
[ 0.635974] *** DEADLOCK ***
[ 0.635974]
[ 0.636002] May be due to missing lock nesting notation
[ 0.636002]
[ 0.636033] 4 locks held by swapper/0/1:
[ 0.636057] #0: ffff000000a02988 (&dev->mutex){....}-{3:3}, at: __device_driver_lock+0x38/0x70
[ 0.636131] #1: ffff000000a13958 (regulator_ww_class_mutex){+.+.}-{3:3}, at: regulator_resolve_supply+0x70/0x2f0
[ 0.636205] #2: ffff800012b102c0 (regulator_list_mutex){+.+.}-{3:3}, at: regulator_lock_dependent+0x5c/0x290
[ 0.636280] #3: ffff8000137e3918 (regulator_ww_class_acquire){+.+.}-{0:0}, at: regulator_enable+0x40/0xe0
[ 0.636352]
[ 0.636352] stack backtrace:
[ 0.636378] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.11.0-rc3-next-20210118 #1
[ 0.636415] Hardware name: HiKey Development Board (DT)
[ 0.636443] Call trace:
[ 0.636460] dump_backtrace+0x0/0x1f0
[ 0.636490] show_stack+0x2c/0x80
[ 0.636516] dump_stack+0xf8/0x160
[ 0.636543] __lock_acquire+0xa3c/0x1718
[ 0.636571] lock_acquire+0x3d8/0x4f0
[ 0.636596] __ww_mutex_lock.constprop.14+0xbc/0xf68
[ 0.636628] ww_mutex_lock+0x6c/0x3e8
[ 0.636653] regulator_lock_recursive+0x9c/0x1e8
[ 0.636683] regulator_lock_dependent+0x198/0x290
[ 0.636713] regulator_enable+0x40/0xe0
[ 0.636739] regulator_resolve_supply+0x1e8/0x2f0
[ 0.636767] regulator_register_resolve_supply+0x24/0x80
[ 0.636797] class_for_each_device+0x78/0xf8
[ 0.636825] regulator_register+0x840/0xbb0
[ 0.636851] devm_regulator_register+0x50/0xa8
[ 0.636879] reg_fixed_voltage_probe+0x224/0x410
[ 0.636908] platform_probe+0x6c/0xd8
[ 0.636932] really_probe+0x2b8/0x520
[ 0.636960] driver_probe_device+0xf4/0x168
[ 0.636988] device_driver_attach+0x74/0x98
[ 0.637014] __driver_attach+0xc4/0x178
[ 0.637039] bus_for_each_dev+0x84/0xd8
[ 0.637066] driver_attach+0x30/0x40
[ 0.637092] bus_add_driver+0x170/0x258
[ 0.637119] driver_register+0x64/0x118
[ 0.637144] __platform_driver_register+0x34/0x40
[ 0.637172] regulator_fixed_voltage_init+0x20/0x28
[ 0.637205] do_one_initcall+0x94/0x4a0
[ 0.637231] kernel_init_freeable+0x2f0/0x344
[ 0.637261] kernel_init+0x18/0x120
Reported-by: Naresh Kamboju <[email protected]>
Full boot log here:
https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20210118/testrun/3771538/suite/linux-log-parser/test/check-kernel-warning-2159912/log
metadata:
git branch: master
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
git describe: next-20210112
kernel-config:
http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/hikey/lkft/linux-next/935/config
--
Linaro LKFT
https://lkft.linaro.org
Hi Mark,
On 18.01.2021 21:49, Mark Brown wrote:
> On Tue, Jan 12, 2021 at 10:34:19PM +0100, Marek Szyprowski wrote:
>> ======================================================
>> WARNING: possible circular locking dependency detected
>> 5.11.0-rc1-00008-geaa7995c529b #10095 Not tainted
>> ------------------------------------------------------
>> swapper/0/1 is trying to acquire lock:
>> c12e1b80 (regulator_list_mutex){+.+.}-{3:3}, at:
>> regulator_lock_dependent+0x4c/0x2b0
> If you're sending backtraces or other enormous reports like this please
> run them through addr2line first so that things are a bit more legible.
Well, I had little time to process that issue, so I just copy-pasted
the kernel log in the hope that it would be useful. The trace is really
long, but the function call stack is imho readable.
If you need more details about any specific trace, just ask. I don't
know any good method of processing the raw kernel logs with addr2line
and keeping things readable.
>> but task is already holding lock:
>> df7190c0 (regulator_ww_class_mutex){+.+.}-{3:3}, at:
>> regulator_resolve_supply+0x44/0x318
>>
>> which lock already depends on the new lock.
> Does this help (completely untested):
Sadly, no. I get the same warning:
======================================================
WARNING: possible circular locking dependency detected
5.11.0-rc3-next-20210118-00005-g56a65ff7ca8b #10162 Not tainted
------------------------------------------------------
swapper/0/1 is trying to acquire lock:
c12e1e40 (regulator_list_mutex){+.+.}-{3:3}, at: regulator_lock_dependent+0x4c/0x2b4
but task is already holding lock:
df4fe8c0 (regulator_ww_class_mutex){+.+.}-{3:3}, at: regulator_resolve_supply+0x98/0x320
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (regulator_ww_class_mutex){+.+.}-{3:3}:
ww_mutex_lock+0x48/0x88
regulator_lock_recursive+0x84/0x1f4
regulator_lock_dependent+0x188/0x2b4
regulator_enable+0x30/0xe4
dwc3_exynos_probe+0x17c/0x2c0
platform_probe+0x80/0xc0
really_probe+0x1d4/0x4ec
driver_probe_device+0x78/0x1d8
device_driver_attach+0x58/0x60
__driver_attach+0xfc/0x160
bus_for_each_dev+0x6c/0xb8
bus_add_driver+0x170/0x20c
driver_register+0x78/0x10c
do_one_initcall+0x88/0x438
kernel_init_freeable+0x190/0x1e0
kernel_init+0x8/0x118
ret_from_fork+0x14/0x38
0x0
-> #1 (regulator_ww_class_acquire){+.+.}-{0:0}:
regulator_enable+0x30/0xe4
dwc3_exynos_probe+0x17c/0x2c0
platform_probe+0x80/0xc0
really_probe+0x1d4/0x4ec
driver_probe_device+0x78/0x1d8
device_driver_attach+0x58/0x60
__driver_attach+0xfc/0x160
bus_for_each_dev+0x6c/0xb8
bus_add_driver+0x170/0x20c
driver_register+0x78/0x10c
do_one_initcall+0x88/0x438
kernel_init_freeable+0x190/0x1e0
kernel_init+0x8/0x118
ret_from_fork+0x14/0x38
0x0
-> #0 (regulator_list_mutex){+.+.}-{3:3}:
lock_acquire+0x314/0x5d0
__mutex_lock+0xa4/0xb60
mutex_lock_nested+0x1c/0x24
regulator_lock_dependent+0x4c/0x2b4
regulator_enable+0x30/0xe4
regulator_resolve_supply+0x1d0/0x320
regulator_register_resolve_supply+0x14/0x78
class_for_each_device+0x68/0xe8
regulator_register+0xa30/0xca0
devm_regulator_register+0x40/0x70
tps65090_regulator_probe+0x150/0x648
platform_probe+0x80/0xc0
really_probe+0x1d4/0x4ec
driver_probe_device+0x78/0x1d8
bus_for_each_drv+0x78/0xbc
__device_attach+0xe8/0x180
bus_probe_device+0x88/0x90
device_add+0x4c8/0x7ec
platform_device_add+0x120/0x25c
mfd_add_devices+0x580/0x60c
tps65090_i2c_probe+0xb8/0x184
i2c_device_probe+0x234/0x2a4
really_probe+0x1d4/0x4ec
driver_probe_device+0x78/0x1d8
bus_for_each_drv+0x78/0xbc
__device_attach+0xe8/0x180
bus_probe_device+0x88/0x90
device_add+0x4c8/0x7ec
i2c_new_client_device+0x15c/0x27c
of_i2c_register_devices+0x114/0x184
i2c_register_adapter+0x1d8/0x6dc
ec_i2c_probe+0xc8/0x124
platform_probe+0x80/0xc0
really_probe+0x1d4/0x4ec
driver_probe_device+0x78/0x1d8
bus_for_each_drv+0x78/0xbc
__device_attach+0xe8/0x180
bus_probe_device+0x88/0x90
device_add+0x4c8/0x7ec
of_platform_device_create_pdata+0x90/0xc8
of_platform_bus_create+0x1a0/0x4ec
of_platform_populate+0x88/0x120
devm_of_platform_populate+0x40/0x80
cros_ec_register+0x174/0x308
cros_ec_spi_probe+0x16c/0x1ec
spi_probe+0x88/0xac
really_probe+0x1d4/0x4ec
driver_probe_device+0x78/0x1d8
device_driver_attach+0x58/0x60
__driver_attach+0xfc/0x160
bus_for_each_dev+0x6c/0xb8
bus_add_driver+0x170/0x20c
driver_register+0x78/0x10c
do_one_initcall+0x88/0x438
kernel_init_freeable+0x190/0x1e0
kernel_init+0x8/0x118
ret_from_fork+0x14/0x38
0x0
other info that might help us debug this:
Chain exists of:
  regulator_list_mutex --> regulator_ww_class_acquire --> regulator_ww_class_mutex
 Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(regulator_ww_class_mutex);
                               lock(regulator_ww_class_acquire);
                               lock(regulator_ww_class_mutex);
  lock(regulator_list_mutex);
*** DEADLOCK ***
5 locks held by swapper/0/1:
#0: dfbef0c8 (&dev->mutex){....}-{3:3}, at: device_driver_attach+0x18/0x60
#1: df4f84d8 (&dev->mutex){....}-{3:3}, at: __device_attach+0x34/0x180
#2: df4f98e8 (&dev->mutex){....}-{3:3}, at: __device_attach+0x34/0x180
#3: df509cd8 (&dev->mutex){....}-{3:3}, at: __device_attach+0x34/0x180
#4: df4fe8c0 (regulator_ww_class_mutex){+.+.}-{3:3}, at: regulator_resolve_supply+0x98/0x320
stack backtrace:
CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.11.0-rc3-next-20210118-00005-g56a65ff7ca8b #10162
Hardware name: Samsung Exynos (Flattened Device Tree)
[<c01116e8>] (unwind_backtrace) from [<c010cf58>] (show_stack+0x10/0x14)
[<c010cf58>] (show_stack) from [<c0b443c0>] (dump_stack+0xa4/0xc4)
[<c0b443c0>] (dump_stack) from [<c01932e0>] (check_noncircular+0x14c/0x164)
[<c01932e0>] (check_noncircular) from [<c0196a08>] (__lock_acquire+0x181c/0x3204)
[<c0196a08>] (__lock_acquire) from [<c01990cc>] (lock_acquire+0x314/0x5d0)
[<c01990cc>] (lock_acquire) from [<c0b4bd54>] (__mutex_lock+0xa4/0xb60)
[<c0b4bd54>] (__mutex_lock) from [<c0b4c82c>] (mutex_lock_nested+0x1c/0x24)
[<c0b4c82c>] (mutex_lock_nested) from [<c05d4544>] (regulator_lock_dependent+0x4c/0x2b4)
[<c05d4544>] (regulator_lock_dependent) from [<c05d99c0>] (regulator_enable+0x30/0xe4)
[<c05d99c0>] (regulator_enable) from [<c05d9c44>] (regulator_resolve_supply+0x1d0/0x320)
[<c05d9c44>] (regulator_resolve_supply) from [<c05da130>] (regulator_register_resolve_supply+0x14/0x78)
[<c05da130>] (regulator_register_resolve_supply) from [<c06aba80>] (class_for_each_device+0x68/0xe8)
[<c06aba80>] (class_for_each_device) from [<c05db5e0>] (regulator_register+0xa30/0xca0)
[<c05db5e0>] (regulator_register) from [<c05dd430>] (devm_regulator_register+0x40/0x70)
[<c05dd430>] (devm_regulator_register) from [<c05e6218>] (tps65090_regulator_probe+0x150/0x648)
[<c05e6218>] (tps65090_regulator_probe) from [<c06aca70>] (platform_probe+0x80/0xc0)
[<c06aca70>] (platform_probe) from [<c06a9b9c>] (really_probe+0x1d4/0x4ec)
[<c06a9b9c>] (really_probe) from [<c06a9f2c>] (driver_probe_device+0x78/0x1d8)
[<c06a9f2c>] (driver_probe_device) from [<c06a7c24>] (bus_for_each_drv+0x78/0xbc)
[<c06a7c24>] (bus_for_each_drv) from [<c06a9928>] (__device_attach+0xe8/0x180)
[<c06a9928>] (__device_attach) from [<c06a89d0>] (bus_probe_device+0x88/0x90)
[<c06a89d0>] (bus_probe_device) from [<c06a662c>] (device_add+0x4c8/0x7ec)
[<c06a662c>] (device_add) from [<c06ac634>] (platform_device_add+0x120/0x25c)
[<c06ac634>] (platform_device_add) from [<c06de87c>] (mfd_add_devices+0x580/0x60c)
[<c06de87c>] (mfd_add_devices) from [<c06e0ce8>] (tps65090_i2c_probe+0xb8/0x184)
[<c06e0ce8>] (tps65090_i2c_probe) from [<c082d2b8>] (i2c_device_probe+0x234/0x2a4)
[<c082d2b8>] (i2c_device_probe) from [<c06a9b9c>] (really_probe+0x1d4/0x4ec)
[<c06a9b9c>] (really_probe) from [<c06a9f2c>] (driver_probe_device+0x78/0x1d8)
[<c06a9f2c>] (driver_probe_device) from [<c06a7c24>] (bus_for_each_drv+0x78/0xbc)
[<c06a7c24>] (bus_for_each_drv) from [<c06a9928>] (__device_attach+0xe8/0x180)
[<c06a9928>] (__device_attach) from [<c06a89d0>] (bus_probe_device+0x88/0x90)
[<c06a89d0>] (bus_probe_device) from [<c06a662c>] (device_add+0x4c8/0x7ec)
[<c06a662c>] (device_add) from [<c082f884>] (i2c_new_client_device+0x15c/0x27c)
[<c082f884>] (i2c_new_client_device) from [<c08332dc>] (of_i2c_register_devices+0x114/0x184)
[<c08332dc>] (of_i2c_register_devices) from [<c0830250>] (i2c_register_adapter+0x1d8/0x6dc)
[<c0830250>] (i2c_register_adapter) from [<c0838a1c>] (ec_i2c_probe+0xc8/0x124)
[<c0838a1c>] (ec_i2c_probe) from [<c06aca70>] (platform_probe+0x80/0xc0)
[<c06aca70>] (platform_probe) from [<c06a9b9c>] (really_probe+0x1d4/0x4ec)
[<c06a9b9c>] (really_probe) from [<c06a9f2c>] (driver_probe_device+0x78/0x1d8)
[<c06a9f2c>] (driver_probe_device) from [<c06a7c24>] (bus_for_each_drv+0x78/0xbc)
[<c06a7c24>] (bus_for_each_drv) from [<c06a9928>] (__device_attach+0xe8/0x180)
[<c06a9928>] (__device_attach) from [<c06a89d0>] (bus_probe_device+0x88/0x90)
[<c06a89d0>] (bus_probe_device) from [<c06a662c>] (device_add+0x4c8/0x7ec)
[<c06a662c>] (device_add) from [<c08bba20>] (of_platform_device_create_pdata+0x90/0xc8)
[<c08bba20>] (of_platform_device_create_pdata) from [<c08bbc04>] (of_platform_bus_create+0x1a0/0x4ec)
[<c08bbc04>] (of_platform_bus_create) from [<c08bc104>] (of_platform_populate+0x88/0x120)
[<c08bc104>] (of_platform_populate) from [<c08bc1f0>] (devm_of_platform_populate+0x40/0x80)
[<c08bc1f0>] (devm_of_platform_populate) from [<c08c1910>] (cros_ec_register+0x174/0x308)
[<c08c1910>] (cros_ec_register) from [<c08c2ca0>] (cros_ec_spi_probe+0x16c/0x1ec)
[<c08c2ca0>] (cros_ec_spi_probe) from [<c07240fc>] (spi_probe+0x88/0xac)
[<c07240fc>] (spi_probe) from [<c06a9b9c>] (really_probe+0x1d4/0x4ec)
[<c06a9b9c>] (really_probe) from [<c06a9f2c>] (driver_probe_device+0x78/0x1d8)
[<c06a9f2c>] (driver_probe_device) from [<c06aa444>] (device_driver_attach+0x58/0x60)
[<c06aa444>] (device_driver_attach) from [<c06aa548>] (__driver_attach+0xfc/0x160)
[<c06aa548>] (__driver_attach) from [<c06a7b4c>] (bus_for_each_dev+0x6c/0xb8)
[<c06a7b4c>] (bus_for_each_dev) from [<c06a8c84>] (bus_add_driver+0x170/0x20c)
[<c06a8c84>] (bus_add_driver) from [<c06ab3e8>] (driver_register+0x78/0x10c)
[<c06ab3e8>] (driver_register) from [<c0102428>] (do_one_initcall+0x88/0x438)
[<c0102428>] (do_one_initcall) from [<c11010d4>] (kernel_init_freeable+0x190/0x1e0)
[<c11010d4>] (kernel_init_freeable) from [<c0b47db0>] (kernel_init+0x8/0x118)
[<c0b47db0>] (kernel_init) from [<c010011c>] (ret_from_fork+0x14/0x38)
Exception stack(0xc1ce3fb0 to 0xc1ce3ff8)
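For readers less familiar with lockdep output, the report above can be reduced to a toy model. The following is only a sketch under simplifying assumptions: it collapses the three-element chain into two hypothetical lock classes (LOCK_LIST standing in for regulator_list_mutex, LOCK_RDEV for the per-regulator ww mutex), and toy_acquire(), toy_release() and replay_inversion() are invented names, not kernel or lockdep APIs. It records which lock was taken while another was held and flags the first acquisition that reverses a previously seen order.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes; the real report involves three. */
enum { LOCK_LIST, LOCK_RDEV, NLOCKS };

static bool order[NLOCKS][NLOCKS];	/* order[a][b]: b taken while a held */
static bool held[NLOCKS];

/* Record an acquisition; return false if it inverts a known order. */
bool toy_acquire(int lock)
{
	bool ok = true;

	for (int h = 0; h < NLOCKS; h++) {
		if (!held[h] || h == lock)
			continue;
		if (order[lock][h])	/* lock -> h seen earlier, now h -> lock */
			ok = false;
		order[h][lock] = true;
	}
	held[lock] = true;
	return ok;
}

void toy_release(int lock)
{
	held[lock] = false;
}

/* Replay both code paths; return true if an inversion is detected. */
bool replay_inversion(void)
{
	bool ok;

	memset(order, 0, sizeof(order));
	memset(held, 0, sizeof(held));

	/* Path 1: list lock first, then the per-regulator lock. */
	ok = toy_acquire(LOCK_LIST);
	ok = toy_acquire(LOCK_RDEV) && ok;
	toy_release(LOCK_RDEV);
	toy_release(LOCK_LIST);
	if (!ok)
		return false;	/* unexpected: first path alone is consistent */

	/* Path 2: per-regulator lock held, then the list lock is taken. */
	ok = toy_acquire(LOCK_RDEV);
	ok = toy_acquire(LOCK_LIST) && ok;
	toy_release(LOCK_LIST);
	toy_release(LOCK_RDEV);
	return !ok;
}
```

Calling replay_inversion() from a small main() should report an inversion on the second path, which is the same shape of problem as the regulator_enable() call made while the per-regulator lock is held.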
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
On Thu, Jan 21, 2021 at 10:41:59AM +0100, Marek Szyprowski wrote:
> On 18.01.2021 21:49, Mark Brown wrote:
> > Does this help (completely untested):
> Sadly nope. I get same warning:
Try this instead:
diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index 3ae5ccd9277d..31503776dbd7 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -1823,17 +1823,6 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
if (rdev->supply)
return 0;
- /*
- * Recheck rdev->supply with rdev->mutex lock held to avoid a race
- * between rdev->supply null check and setting rdev->supply in
- * set_supply() from concurrent tasks.
- */
- regulator_lock(rdev);
-
- /* Supply just resolved by a concurrent task? */
- if (rdev->supply)
- goto out;
-
r = regulator_dev_lookup(dev, rdev->supply_name);
if (IS_ERR(r)) {
ret = PTR_ERR(r);
@@ -1885,12 +1874,29 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
goto out;
}
+ /*
+ * Recheck rdev->supply with rdev->mutex lock held to avoid a race
+ * between rdev->supply null check and setting rdev->supply in
+ * set_supply() from concurrent tasks.
+ */
+ regulator_lock(rdev);
+
+ /* Supply just resolved by a concurrent task? */
+ if (rdev->supply) {
+ regulator_unlock(rdev);
+ put_device(&r->dev);
+ return ret;
+ }
+
ret = set_supply(rdev, r);
if (ret < 0) {
+ regulator_unlock(rdev);
put_device(&r->dev);
- goto out;
+ return ret;
}
+ regulator_unlock(rdev);
+
/*
* In set_machine_constraints() we may have turned this regulator on
* but we couldn't propagate to the supply if it hadn't been resolved
@@ -1901,12 +1907,11 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
if (ret < 0) {
_regulator_put(rdev->supply);
rdev->supply = NULL;
- goto out;
+ goto out_rdev_lock;
}
}
out:
- regulator_unlock(rdev);
return ret;
}
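The locking shape this patch is aiming for can be sketched in userspace. This is an illustration under stated assumptions, not the kernel code: struct fake_rdev, fake_enable(), fake_resolve_supply() and run_two_resolvers() are hypothetical names, a pthread mutex stands in for rdev->mutex, and the unlocked fast-path check is left out to keep the example free of data races. The lookup runs without the lock, the lock covers only the recheck and the assignment, and the enable step (which takes another lock) runs after the lock is dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct regulator_dev; not the kernel type. */
struct fake_rdev {
	pthread_mutex_t lock;	/* plays the role of rdev->mutex */
	int *supply;		/* plays the role of rdev->supply */
	int links;		/* times a supply link was created */
	int enables;		/* times the supply was enabled */
};

static int supply_obj;	/* the single supply both tasks look up */
static pthread_mutex_t enable_lock = PTHREAD_MUTEX_INITIALIZER;

/* Takes another lock, so it is only called with r->lock dropped. */
static void fake_enable(struct fake_rdev *r)
{
	pthread_mutex_lock(&enable_lock);
	r->enables++;
	pthread_mutex_unlock(&enable_lock);
}

static void *fake_resolve_supply(void *arg)
{
	struct fake_rdev *r = arg;
	int *found = &supply_obj;	/* lookup happens without r->lock */

	pthread_mutex_lock(&r->lock);
	if (r->supply) {		/* a concurrent task already set it */
		pthread_mutex_unlock(&r->lock);
		return NULL;
	}
	r->supply = found;
	r->links++;
	pthread_mutex_unlock(&r->lock);

	fake_enable(r);			/* outside the critical section */
	return NULL;
}

/* Runs two concurrent resolvers and reports what happened. */
void run_two_resolvers(int *links, int *enables)
{
	struct fake_rdev r = { .supply = NULL };
	pthread_t a, b;
	int rc;

	pthread_mutex_init(&r.lock, NULL);
	rc = pthread_create(&a, NULL, fake_resolve_supply, &r);
	assert(rc == 0);
	rc = pthread_create(&b, NULL, fake_resolve_supply, &r);
	assert(rc == 0);
	(void)rc;
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	pthread_mutex_destroy(&r.lock);

	*links = r.links;
	*enables = r.enables;
}
```

Whichever thread wins, only one supply link is created and only the winner goes on to the enable step, so no lock is held across the call that takes further locks.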
Hi Mark,
On 21.01.2021 16:44, Mark Brown wrote:
> On Thu, Jan 21, 2021 at 10:41:59AM +0100, Marek Szyprowski wrote:
>> On 18.01.2021 21:49, Mark Brown wrote:
>>> Does this help (completely untested):
>> Sadly nope. I get same warning:
> Try this instead:
>
> diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
> index 3ae5ccd9277d..31503776dbd7 100644
> --- a/drivers/regulator/core.c
> +++ b/drivers/regulator/core.c
> @@ -1823,17 +1823,6 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
> if (rdev->supply)
> return 0;
>
> - /*
> - * Recheck rdev->supply with rdev->mutex lock held to avoid a race
> - * between rdev->supply null check and setting rdev->supply in
> - * set_supply() from concurrent tasks.
> - */
> - regulator_lock(rdev);
> -
> - /* Supply just resolved by a concurrent task? */
> - if (rdev->supply)
> - goto out;
> -
> r = regulator_dev_lookup(dev, rdev->supply_name);
> if (IS_ERR(r)) {
> ret = PTR_ERR(r);
> @@ -1885,12 +1874,29 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
> goto out;
> }
>
> + /*
> + * Recheck rdev->supply with rdev->mutex lock held to avoid a race
> + * between rdev->supply null check and setting rdev->supply in
> + * set_supply() from concurrent tasks.
> + */
> + regulator_lock(rdev);
> +
> + /* Supply just resolved by a concurrent task? */
> + if (rdev->supply) {
> + regulator_unlock(rdev);
> + put_device(&r->dev);
> + return ret;
> + }
> +
> ret = set_supply(rdev, r);
> if (ret < 0) {
> + regulator_unlock(rdev);
> put_device(&r->dev);
> - goto out;
> + return ret;
> }
>
> + regulator_unlock(rdev);
> +
> /*
> * In set_machine_constraints() we may have turned this regulator on
> * but we couldn't propagate to the supply if it hadn't been resolved
> @@ -1901,12 +1907,11 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
> if (ret < 0) {
> _regulator_put(rdev->supply);
> rdev->supply = NULL;
> - goto out;
> + goto out_rdev_lock;
drivers/regulator/core.c:1910:4: error: label ‘out_rdev_lock’ used but not defined
> }
> }
>
> out:
> - regulator_unlock(rdev);
> return ret;
> }
>
It looks like this finally fixes the locking issue, once the above goto
is removed completely to fix the build. Feel free to add:
Reported-by: Marek Szyprowski <[email protected]>
Tested-by: Marek Szyprowski <[email protected]>
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
Hello Mark,
On 1/21/21 12:30 PM, Marek Szyprowski wrote:
> Hi Mark,
>
> On 21.01.2021 16:44, Mark Brown wrote:
>> On Thu, Jan 21, 2021 at 10:41:59AM +0100, Marek Szyprowski wrote:
>>> On 18.01.2021 21:49, Mark Brown wrote:
>>>> Does this help (completely untested):
>>> Sadly nope. I get same warning:
>> Try this instead:
>>
>> @@ -1901,12 +1907,11 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
>> if (ret < 0) {
>> _regulator_put(rdev->supply);
>> rdev->supply = NULL;
>> - goto out;
>> + goto out_rdev_lock;
>
> drivers/regulator/core.c:1910:4: error: label ‘out_rdev_lock’ used but not defined
>
>> }
>> }
>>
>> out:
>> - regulator_unlock(rdev);
>> return ret;
>> }
>>
>
> It looks like this finally fixes the locking issue, once the above goto
> is removed completely to fix the build. Feel free to add:
>
> Reported-by: Marek Szyprowski <[email protected]>
>
> Tested-by: Marek Szyprowski <[email protected]>
Thank you for making this fix. I'm sorry that I missed the potential
deadlock resulting from the regulator_enable() call inside
regulator_resolve_supply() with rdev->mutex locked. Your fix avoids the
deadlock while still ensuring that there isn't a set-supply race
condition.
Take care,
David
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project