Hi!
The Tegra20 requires an enabled VDE power domain during startup. As the
VDE is currently not used, it is disabled during runtime.
Since 8f0c714ad9be, there is a workaround for the "normal restart path"
which enables the VDE before doing PMC's warm reboot. This workaround is
not executed in the "emergency restart path", leading to a hang-up
during start.
This series implements and registers a new pmic-based restart handler
for boards with tps6586x. This cold reboot ensures that the VDE power
domain is enabled during startup on tegra20-based boards.
Since bae1d3a05a8b, i2c transfers are non-atomic while preemption is
disabled (which is e.g. done during panic()). This could lead to
warnings ("Voluntary context switch within RCU") in i2c-based restart
handlers during emergency restart. The state of preemption should be
detected by i2c_in_atomic_xfer_mode() to use atomic i2c xfer when
required. Beside the new system_state check, the check is the same as
the one pre v5.2.
v3: https://lore.kernel.org/r/[email protected]
v2: https://lore.kernel.org/all/[email protected]/
system_state: https://lore.kernel.org/all/[email protected]/
v1: https://lore.kernel.org/all/[email protected]/
v4:
- 1,2: add "Fixes" and adapt commit messages
- 4: reduce delay after requesting the restart (as suggested by Dmitry)
v3:
- bring system_state back in this series
- do atomic i2c xfer if not preemptible (as suggested by Dmitry)
- fix style issues mentioned by Dmitry
- add cc stable as suggested by Dmitry
- add explanation why this is needed for Jon
v2:
- use devm-based restart handler
- convert the existing power_off handler to a devm-based handler
- handle system_state in extra series
---
Benjamin Bara (4):
kernel/reboot: emergency_restart: set correct system_state
i2c: core: run atomic i2c xfer when !preemptible
mfd: tps6586x: use devm-based power off handler
mfd: tps6586x: register restart handler
drivers/i2c/i2c-core.h | 2 +-
drivers/mfd/tps6586x.c | 45 +++++++++++++++++++++++++++++++++++++--------
kernel/reboot.c | 1 +
3 files changed, 39 insertions(+), 9 deletions(-)
---
base-commit: 197b6b60ae7bc51dd0814953c562833143b292aa
change-id: 20230327-tegra-pmic-reboot-4175ff814a4b
Best regards,
--
Benjamin Bara <[email protected]>
From: Benjamin Bara <[email protected]>
Since bae1d3a05a8b, i2c transfers are non-atomic if preemption is
disabled. However, non-atomic i2c transfers require preemption (e.g. in
wait_for_completion() while waiting for the DMA).
panic() calls preempt_disable_notrace() before calling
emergency_restart(). Therefore, if an i2c device is used for the
restart, the xfer should be atomic. This avoids warnings like:
[ 12.667612] WARNING: CPU: 1 PID: 1 at kernel/rcu/tree_plugin.h:318 rcu_note_context_switch+0x33c/0x6b0
[ 12.676926] Voluntary context switch within RCU read-side critical section!
...
[ 12.742376] schedule_timeout from wait_for_completion_timeout+0x90/0x114
[ 12.749179] wait_for_completion_timeout from tegra_i2c_wait_completion+0x40/0x70
...
[ 12.994527] atomic_notifier_call_chain from machine_restart+0x34/0x58
[ 13.001050] machine_restart from panic+0x2a8/0x32c
Use !preemptible() instead, which is basically the same check as
pre-v5.2.
Fixes: bae1d3a05a8b ("i2c: core: remove use of in_atomic()")
Cc: [email protected] # v5.2+
Suggested-by: Dmitry Osipenko <[email protected]>
Signed-off-by: Benjamin Bara <[email protected]>
---
drivers/i2c/i2c-core.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/i2c/i2c-core.h b/drivers/i2c/i2c-core.h
index 1247e6e6e975..05b8b8dfa9bd 100644
--- a/drivers/i2c/i2c-core.h
+++ b/drivers/i2c/i2c-core.h
@@ -29,7 +29,7 @@ int i2c_dev_irq_from_resources(const struct resource *resources,
*/
static inline bool i2c_in_atomic_xfer_mode(void)
{
- return system_state > SYSTEM_RUNNING && irqs_disabled();
+ return system_state > SYSTEM_RUNNING && !preemptible();
}
static inline int __i2c_lock_bus_helper(struct i2c_adapter *adap)
--
2.34.1
From: Benjamin Bara <[email protected]>
There are a couple of boards which use a tps6586x as
"ti,system-power-controller", e.g. the tegra20-tamonten.dtsi.
For these, the only registered restart handler is the warm reboot via
tegra's PMC. As the bootloader of the tegra20 requires the VDE, it must
be ensured that VDE is enabled (which is the case after a cold reboot).
For the "normal reboot", this is basically the case since 8f0c714ad9be.
However, this workaround is not executed in case of an emergency restart.
In case of an emergency restart, the system now simply hangs in the
bootloader, as VDE is not enabled (because it is not used).
The TPS658629-Q1 (unfortunately the only TPS6586x with public data sheet)
provides a SOFT RST bit in the SUPPLYENE reg to request a (cold) reboot,
which takes at least 10ms (as the data sheet states).
This avoids the hang-up.
Tested on a TPS658640.
Signed-off-by: Benjamin Bara <[email protected]>
---
drivers/mfd/tps6586x.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/drivers/mfd/tps6586x.c b/drivers/mfd/tps6586x.c
index 93f1bf440191..91754f30e26b 100644
--- a/drivers/mfd/tps6586x.c
+++ b/drivers/mfd/tps6586x.c
@@ -30,6 +30,7 @@
#include <linux/mfd/tps6586x.h>
#define TPS6586X_SUPPLYENE 0x14
+#define SOFT_RST_BIT BIT(0)
#define EXITSLREQ_BIT BIT(1)
#define SLEEP_MODE_BIT BIT(3)
@@ -470,6 +471,19 @@ static int tps6586x_power_off_handler(struct sys_off_data *data)
return tps6586x_set_bits(tps6586x_dev, TPS6586X_SUPPLYENE, SLEEP_MODE_BIT);
}
+static int tps6586x_restart_handler(struct sys_off_data *data)
+{
+ struct device *tps6586x_dev = data->cb_data;
+ int ret;
+
+ /* bring pmic into HARD REBOOT state. this takes at least 10ms. */
+ ret = tps6586x_set_bits(tps6586x_dev, TPS6586X_SUPPLYENE, SOFT_RST_BIT);
+ mdelay(15);
+
+ dev_err(tps6586x_dev, "restart failed: timeout\n");
+ return ret;
+}
+
static void tps6586x_print_version(struct i2c_client *client, int version)
{
const char *name;
@@ -570,6 +584,13 @@ static int tps6586x_i2c_probe(struct i2c_client *client)
dev_err(&client->dev, "register power off handler failed: %d\n", ret);
goto err_add_devs;
}
+
+ ret = devm_register_restart_handler(&client->dev, &tps6586x_restart_handler,
+ &client->dev);
+ if (ret) {
+ dev_err(&client->dev, "register restart handler failed: %d\n", ret);
+ goto err_add_devs;
+ }
}
return 0;
--
2.34.1
From: Benjamin Bara <[email protected]>
Convert the power off handler to a devm-based power off handler.
Reviewed-by: Dmitry Osipenko <[email protected]>
Signed-off-by: Benjamin Bara <[email protected]>
---
drivers/mfd/tps6586x.c | 24 ++++++++++++++++--------
1 file changed, 16 insertions(+), 8 deletions(-)
diff --git a/drivers/mfd/tps6586x.c b/drivers/mfd/tps6586x.c
index 2d947f3f606a..93f1bf440191 100644
--- a/drivers/mfd/tps6586x.c
+++ b/drivers/mfd/tps6586x.c
@@ -22,6 +22,7 @@
#include <linux/err.h>
#include <linux/i2c.h>
#include <linux/platform_device.h>
+#include <linux/reboot.h>
#include <linux/regmap.h>
#include <linux/of.h>
@@ -457,13 +458,16 @@ static const struct regmap_config tps6586x_regmap_config = {
.cache_type = REGCACHE_RBTREE,
};
-static struct device *tps6586x_dev;
-static void tps6586x_power_off(void)
+static int tps6586x_power_off_handler(struct sys_off_data *data)
{
- if (tps6586x_clr_bits(tps6586x_dev, TPS6586X_SUPPLYENE, EXITSLREQ_BIT))
- return;
+ struct device *tps6586x_dev = data->cb_data;
+ int ret;
+
+ ret = tps6586x_clr_bits(tps6586x_dev, TPS6586X_SUPPLYENE, EXITSLREQ_BIT);
+ if (ret)
+ return ret;
- tps6586x_set_bits(tps6586x_dev, TPS6586X_SUPPLYENE, SLEEP_MODE_BIT);
+ return tps6586x_set_bits(tps6586x_dev, TPS6586X_SUPPLYENE, SLEEP_MODE_BIT);
}
static void tps6586x_print_version(struct i2c_client *client, int version)
@@ -559,9 +563,13 @@ static int tps6586x_i2c_probe(struct i2c_client *client)
goto err_add_devs;
}
- if (pdata->pm_off && !pm_power_off) {
- tps6586x_dev = &client->dev;
- pm_power_off = tps6586x_power_off;
+ if (pdata->pm_off) {
+ ret = devm_register_power_off_handler(&client->dev, &tps6586x_power_off_handler,
+ &client->dev);
+ if (ret) {
+ dev_err(&client->dev, "register power off handler failed: %d\n", ret);
+ goto err_add_devs;
+ }
}
return 0;
--
2.34.1
On Thu, Apr 13, 2023 at 09:46:40AM +0200, Benjamin Bara wrote:
> From: Benjamin Bara <[email protected]>
>
> Since bae1d3a05a8b, i2c transfers are non-atomic if preemption is
> disabled. However, non-atomic i2c transfers require preemption (e.g. in
> wait_for_completion() while waiting for the DMA).
>
> panic() calls preempt_disable_notrace() before calling
> emergency_restart(). Therefore, if an i2c device is used for the
> restart, the xfer should be atomic. This avoids warnings like:
>
> [ 12.667612] WARNING: CPU: 1 PID: 1 at kernel/rcu/tree_plugin.h:318 rcu_note_context_switch+0x33c/0x6b0
> [ 12.676926] Voluntary context switch within RCU read-side critical section!
> ...
> [ 12.742376] schedule_timeout from wait_for_completion_timeout+0x90/0x114
> [ 12.749179] wait_for_completion_timeout from tegra_i2c_wait_completion+0x40/0x70
> ...
> [ 12.994527] atomic_notifier_call_chain from machine_restart+0x34/0x58
> [ 13.001050] machine_restart from panic+0x2a8/0x32c
>
> Use !preemptible() instead, which is basically the same check as
> pre-v5.2.
>
> Fixes: bae1d3a05a8b ("i2c: core: remove use of in_atomic()")
> Cc: [email protected] # v5.2+
> Suggested-by: Dmitry Osipenko <[email protected]>
> Signed-off-by: Benjamin Bara <[email protected]>
So, with Peter's input and me checking again:
Acked-by: Wolfram Sang <[email protected]>
I assume this shall go in via the mfd-tree. Let me know if I should pick
it instead.
On 4/13/23 10:46, Benjamin Bara wrote:
> +static int tps6586x_power_off_handler(struct sys_off_data *data)
> {
> - if (tps6586x_clr_bits(tps6586x_dev, TPS6586X_SUPPLYENE, EXITSLREQ_BIT))
> - return;
> + struct device *tps6586x_dev = data->cb_data;
> + int ret;
> +
> + ret = tps6586x_clr_bits(tps6586x_dev, TPS6586X_SUPPLYENE, EXITSLREQ_BIT);
> + if (ret)
> + return ret;
>
> - tps6586x_set_bits(tps6586x_dev, TPS6586X_SUPPLYENE, SLEEP_MODE_BIT);
> + return tps6586x_set_bits(tps6586x_dev, TPS6586X_SUPPLYENE, SLEEP_MODE_BIT);
Handlers must return NOTIFY_DONE or notifier_from_errno(). Sorry for
missing this previously.
--
Best regards,
Dmitry
On Thu, 13 Apr 2023, 22:37 Dmitry Osipenko,
<[email protected]> wrote:
> Handlers must return NOTIFY_DONE or notifier_from_errno(). Sorry for
> missing this previously.
Thanks!
AFAIU, notifier_from_errno() sets NOTIFY_STOP_MASK, which stops
atomic_notifier_call_chain() immediately. So I think NOTIFY_DONE is the
only valid return value for sys_off handlers, to not skip others. So I
think letting sys_off_notify() [1] always return NOTIFY_DONE might be a
good idea.
If so, we could return a "notify return errno" (or also a "normal
errno") from the handler, which is checked, but then replaced to
NOTIFY_DONE, in [1]. This would enable us to have a common place to
check for failed handlers.
Handlers then should only return NOTIFY_DONE when they are skipped (e.g.
when the requested reboot mode is not supported by the handler).
Otherwise, I think ETIME, ENOSYS or ENOTSUPP might fit when the
communication was successful, a possible delay awaited, but the return
was still reached. What do you think?
Thanks and best regards,
Benjamin
[1] https://elixir.bootlin.com/linux/v6.3-rc6/source/kernel/reboot.c#L327
On 4/14/23 09:15, Benjamin Bara wrote:
> On Thu, 13 Apr 2023, 22:37 Dmitry Osipenko,
> <[email protected]> wrote:
>> Handlers must return NOTIFY_DONE or notifier_from_errno(). Sorry for
>> missing this previously.
>
> Thanks!
>
> AFAIU, notifier_from_errno() sets NOTIFY_STOP_MASK, which stops
> atomic_notifier_call_chain() immediately. So I think NOTIFY_DONE is the
> only valid return value for sys_off handlers, to not skip others. So I
> think letting sys_off_notify() [1] always return NOTIFY_DONE might be a
> good idea.
>
> If so, we could return a "notify return errno" (or also a "normal
> errno") from the handler, which is checked, but then replaced to
> NOTIFY_DONE, in [1]. This would enable us to have a common place to
> check for failed handlers.
>
> Handlers then should only return NOTIFY_DONE when they are skipped (e.g.
> when the requested reboot mode is not supported by the handler).
> Otherwise, I think ETIME, ENOSYS or ENOTSUPP might fit when the
> communication was successful, a possible delay awaited, but the return
> was still reached. What do you think?
The behaviour may depend on a particular platform and driver. In general
and in case of this driver, it should be more reliable and cleaner to
abort the reboot on a error that shall never happen.
--
Best regards,
Dmitry
On Mon, 24 Apr 2023 at 12:42, Dmitry Osipenko <[email protected]> wrote:
> In general and in case of this driver, it should be more reliable and
> cleaner to abort the reboot on a error that shall never happen.
Thanks! Then I will drop my 4/6 of v5 [1].
[1] https://lore.kernel.org/all/[email protected]/