Hi!
The Tegra20 requires an enabled VDE power domain during startup. As the
VDE is currently not used, it is disabled during runtime.
Since 8f0c714ad9be, there is a workaround for the "normal restart path"
which enables the VDE before doing PMC's warm reboot. This workaround is
not executed in the "emergency restart path", leading to a hang-up
during start.
This series implements and registers a new pmic-based restart handler
for boards with tps6586x. This cold reboot ensures that the VDE power
domain is enabled during startup on tegra20-based boards.
Since bae1d3a05a8b, i2c transfers are non-atomic while preemption is
disabled (which is e.g. done during panic()). This could lead to
warnings ("Voluntary context switch within RCU") in i2c-based restart
handlers during emergency restart. The state of preemption should be
detected by i2c_in_atomic_xfer_mode() to use atomic i2c xfer when
required. Beside the new system_state check, the check is the same as
the one pre v5.2.
---
v7:
- 5/5: drop mode check (suggested by Dmitry)
- Link to v6: https://lore.kernel.org/r/[email protected]
v6:
- drop 4/6 to abort restart on unexpected failure (suggested by Dmitry)
- 4,5: fix comments in handlers (suggested by Lee)
- 4,5: same delay for both handlers (suggested by Lee)
v5:
- introduce new 3 & 4, therefore 3 -> 5, 4 -> 6
- 3: provide dev to sys_off handler, if it is known
- 4: return NOTIFY_DONE from sys_off_notify, to avoid skipping
- 5: drop Reviewed-by from Dmitry, add poweroff timeout
- 5,6: return notifier value instead of direct errno from handler
- 5,6: use new dev field instead of passing dev as cb_data
- 5,6: increase timeout values based on error observations
- 6: skip unsupported reboot modes in restart handler
---
Benjamin Bara (5):
kernel/reboot: emergency_restart: set correct system_state
i2c: core: run atomic i2c xfer when !preemptible
kernel/reboot: add device to sys_off_handler
mfd: tps6586x: use devm-based power off handler
mfd: tps6586x: register restart handler
drivers/i2c/i2c-core.h | 2 +-
drivers/mfd/tps6586x.c | 50 ++++++++++++++++++++++++++++++++++++++++++--------
include/linux/reboot.h | 3 +++
kernel/reboot.c | 4 ++++
4 files changed, 50 insertions(+), 9 deletions(-)
---
base-commit: 197b6b60ae7bc51dd0814953c562833143b292aa
change-id: 20230327-tegra-pmic-reboot-4175ff814a4b
Best regards,
--
Benjamin Bara <[email protected]>
From: Benjamin Bara <[email protected]>
As the emergency restart does not call kernel_restart_prepare(), the
system_state stays in SYSTEM_RUNNING.
Since bae1d3a05a8b, this hinders i2c_in_atomic_xfer_mode() from becoming
active, and therefore might lead to avoidable warnings in the restart
handlers, e.g.:
[ 12.667612] WARNING: CPU: 1 PID: 1 at kernel/rcu/tree_plugin.h:318 rcu_note_context_switch+0x33c/0x6b0
[ 12.676926] Voluntary context switch within RCU read-side critical section!
...
[ 12.742376] schedule_timeout from wait_for_completion_timeout+0x90/0x114
[ 12.749179] wait_for_completion_timeout from tegra_i2c_wait_completion+0x40/0x70
...
[ 12.994527] atomic_notifier_call_chain from machine_restart+0x34/0x58
[ 13.001050] machine_restart from panic+0x2a8/0x32c
Avoid these by setting the correct system_state.
Fixes: bae1d3a05a8b ("i2c: core: remove use of in_atomic()")
Cc: [email protected] # v5.2+
Reviewed-by: Dmitry Osipenko <[email protected]>
Tested-by: Nishanth Menon <[email protected]>
Signed-off-by: Benjamin Bara <[email protected]>
---
kernel/reboot.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/reboot.c b/kernel/reboot.c
index 3bba88c7ffc6..6ebef11c8876 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -74,6 +74,7 @@ void __weak (*pm_power_off)(void);
void emergency_restart(void)
{
kmsg_dump(KMSG_DUMP_EMERG);
+ system_state = SYSTEM_RESTART;
machine_emergency_restart();
}
EXPORT_SYMBOL_GPL(emergency_restart);
--
2.34.1
From: Benjamin Bara <[email protected]>
Since bae1d3a05a8b, i2c transfers are non-atomic if preemption is
disabled. However, non-atomic i2c transfers require preemption (e.g. in
wait_for_completion() while waiting for the DMA).
panic() calls preempt_disable_notrace() before calling
emergency_restart(). Therefore, if an i2c device is used for the
restart, the xfer should be atomic. This avoids warnings like:
[ 12.667612] WARNING: CPU: 1 PID: 1 at kernel/rcu/tree_plugin.h:318 rcu_note_context_switch+0x33c/0x6b0
[ 12.676926] Voluntary context switch within RCU read-side critical section!
...
[ 12.742376] schedule_timeout from wait_for_completion_timeout+0x90/0x114
[ 12.749179] wait_for_completion_timeout from tegra_i2c_wait_completion+0x40/0x70
...
[ 12.994527] atomic_notifier_call_chain from machine_restart+0x34/0x58
[ 13.001050] machine_restart from panic+0x2a8/0x32c
Use !preemptible() instead, which is basically the same check as
pre-v5.2.
Fixes: bae1d3a05a8b ("i2c: core: remove use of in_atomic()")
Cc: [email protected] # v5.2+
Suggested-by: Dmitry Osipenko <[email protected]>
Acked-by: Wolfram Sang <[email protected]>
Reviewed-by: Dmitry Osipenko <[email protected]>
Tested-by: Nishanth Menon <[email protected]>
Signed-off-by: Benjamin Bara <[email protected]>
---
drivers/i2c/i2c-core.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/i2c/i2c-core.h b/drivers/i2c/i2c-core.h
index 1247e6e6e975..05b8b8dfa9bd 100644
--- a/drivers/i2c/i2c-core.h
+++ b/drivers/i2c/i2c-core.h
@@ -29,7 +29,7 @@ int i2c_dev_irq_from_resources(const struct resource *resources,
*/
static inline bool i2c_in_atomic_xfer_mode(void)
{
- return system_state > SYSTEM_RUNNING && irqs_disabled();
+ return system_state > SYSTEM_RUNNING && !preemptible();
}
static inline int __i2c_lock_bus_helper(struct i2c_adapter *adap)
--
2.34.1
From: Benjamin Bara <[email protected]>
There are a couple of boards which use a tps6586x as
"ti,system-power-controller", e.g. the tegra20-tamonten.dtsi.
For these, the only registered restart handler is the warm reboot via
tegra's PMC. As the bootloader of the tegra20 requires the VDE, it must
be ensured that VDE is enabled (which is the case after a cold reboot).
For the "normal reboot", this is basically the case since 8f0c714ad9be.
However, this workaround is not executed in case of an emergency restart.
In case of an emergency restart, the system now simply hangs in the
bootloader, as VDE is not enabled (because it is not used).
The TPS658629-Q1 provides a SOFT RST bit in the SUPPLYENE reg to request
a (cold) reboot, which takes at least 20ms (as the data sheet states).
This avoids the hang-up.
Tested on a TPS658640.
Reviewed-by: Dmitry Osipenko <[email protected]>
Acked-for-MFD-by: Lee Jones <[email protected]>
Signed-off-by: Benjamin Bara <[email protected]>
---
drivers/mfd/tps6586x.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/drivers/mfd/tps6586x.c b/drivers/mfd/tps6586x.c
index b12c9e18970a..1777d8d3a990 100644
--- a/drivers/mfd/tps6586x.c
+++ b/drivers/mfd/tps6586x.c
@@ -30,6 +30,7 @@
#include <linux/mfd/tps6586x.h>
#define TPS6586X_SUPPLYENE 0x14
+#define SOFT_RST_BIT BIT(0)
#define EXITSLREQ_BIT BIT(1)
#define SLEEP_MODE_BIT BIT(3)
@@ -475,6 +476,19 @@ static int tps6586x_power_off_handler(struct sys_off_data *data)
return notifier_from_errno(-ETIME);
}
+static int tps6586x_restart_handler(struct sys_off_data *data)
+{
+ int ret;
+
+ /* Put the PMIC into hard reboot state. This takes at least 20ms. */
+ ret = tps6586x_set_bits(data->dev, TPS6586X_SUPPLYENE, SOFT_RST_BIT);
+ if (ret)
+ return notifier_from_errno(ret);
+
+ mdelay(50);
+ return notifier_from_errno(-ETIME);
+}
+
static void tps6586x_print_version(struct i2c_client *client, int version)
{
const char *name;
@@ -575,6 +589,13 @@ static int tps6586x_i2c_probe(struct i2c_client *client)
dev_err(&client->dev, "register power off handler failed: %d\n", ret);
goto err_add_devs;
}
+
+ ret = devm_register_restart_handler(&client->dev, &tps6586x_restart_handler,
+ NULL);
+ if (ret) {
+ dev_err(&client->dev, "register restart handler failed: %d\n", ret);
+ goto err_add_devs;
+ }
}
return 0;
--
2.34.1
From: Benjamin Bara <[email protected]>
Convert the power off handler to a devm-based power off handler.
Acked-for-MFD-by: Lee Jones <[email protected]>
Reviewed-by: Dmitry Osipenko <[email protected]>
Signed-off-by: Benjamin Bara <[email protected]>
---
drivers/mfd/tps6586x.c | 29 +++++++++++++++++++++--------
1 file changed, 21 insertions(+), 8 deletions(-)
diff --git a/drivers/mfd/tps6586x.c b/drivers/mfd/tps6586x.c
index 2d947f3f606a..b12c9e18970a 100644
--- a/drivers/mfd/tps6586x.c
+++ b/drivers/mfd/tps6586x.c
@@ -22,6 +22,7 @@
#include <linux/err.h>
#include <linux/i2c.h>
#include <linux/platform_device.h>
+#include <linux/reboot.h>
#include <linux/regmap.h>
#include <linux/of.h>
@@ -457,13 +458,21 @@ static const struct regmap_config tps6586x_regmap_config = {
.cache_type = REGCACHE_RBTREE,
};
-static struct device *tps6586x_dev;
-static void tps6586x_power_off(void)
+static int tps6586x_power_off_handler(struct sys_off_data *data)
{
- if (tps6586x_clr_bits(tps6586x_dev, TPS6586X_SUPPLYENE, EXITSLREQ_BIT))
- return;
+ int ret;
+
+ /* Put the PMIC into sleep state. This takes at least 20ms. */
+ ret = tps6586x_clr_bits(data->dev, TPS6586X_SUPPLYENE, EXITSLREQ_BIT);
+ if (ret)
+ return notifier_from_errno(ret);
+
+ ret = tps6586x_set_bits(data->dev, TPS6586X_SUPPLYENE, SLEEP_MODE_BIT);
+ if (ret)
+ return notifier_from_errno(ret);
- tps6586x_set_bits(tps6586x_dev, TPS6586X_SUPPLYENE, SLEEP_MODE_BIT);
+ mdelay(50);
+ return notifier_from_errno(-ETIME);
}
static void tps6586x_print_version(struct i2c_client *client, int version)
@@ -559,9 +568,13 @@ static int tps6586x_i2c_probe(struct i2c_client *client)
goto err_add_devs;
}
- if (pdata->pm_off && !pm_power_off) {
- tps6586x_dev = &client->dev;
- pm_power_off = tps6586x_power_off;
+ if (pdata->pm_off) {
+ ret = devm_register_power_off_handler(&client->dev, &tps6586x_power_off_handler,
+ NULL);
+ if (ret) {
+ dev_err(&client->dev, "register power off handler failed: %d\n", ret);
+ goto err_add_devs;
+ }
}
return 0;
--
2.34.1
15.07.2023 10:53, Benjamin Bara пишет:
> From: Benjamin Bara <[email protected]>
>
> There are a couple of boards which use a tps6586x as
> "ti,system-power-controller", e.g. the tegra20-tamonten.dtsi.
> For these, the only registered restart handler is the warm reboot via
> tegra's PMC. As the bootloader of the tegra20 requires the VDE, it must
> be ensured that VDE is enabled (which is the case after a cold reboot).
> For the "normal reboot", this is basically the case since 8f0c714ad9be.
> However, this workaround is not executed in case of an emergency restart.
> In case of an emergency restart, the system now simply hangs in the
> bootloader, as VDE is not enabled (because it is not used).
>
> The TPS658629-Q1 provides a SOFT RST bit in the SUPPLYENE reg to request
> a (cold) reboot, which takes at least 20ms (as the data sheet states).
> This avoids the hang-up.
>
> Tested on a TPS658640.
>
> Reviewed-by: Dmitry Osipenko <[email protected]>
> Acked-for-MFD-by: Lee Jones <[email protected]>
Acked-for-MFD-by isn't a valid tag, scripts/checkpatch.pl should tell
you about it.
In general you may add a comment to a tag, like this:
Acked-by: Lee Jones <[email protected]> # for MFD
In this particular case, the comment is unnecessary because Lee is the
MFD maintainer, hence his ack itself implies the MFD subsys.
--
Best regards,
Dmitry
Hi Dmitry,
thanks for the feedback!
On Tue, 18 Jul 2023 at 06:46, Dmitry Osipenko <[email protected]> wrote:
> 15.07.2023 10:53, Benjamin Bara пишет:
> >
> > Reviewed-by: Dmitry Osipenko <[email protected]>
> > Acked-for-MFD-by: Lee Jones <[email protected]>
>
> Acked-for-MFD-by isn't a valid tag, scripts/checkpatch.pl should tell
> you about it.
>
> In general you may add a comment to a tag, like this:
>
> Acked-by: Lee Jones <[email protected]> # for MFD
>
> In this particular case, the comment is unnecessary because Lee is the
> MFD maintainer, hence his ack itself implies the MFD subsys.
I saw the warning, but Lee requested to add it like this [1].
@Konstantin:
Do you think it makes sense to print a warning when adding "non-standard
trailers" during running "b4 trailers -u", maybe around the
find_trailers() checks? I could provide a RFC, if considered useful.
Best regards,
Benjamin
[1] https://lore.kernel.org/all/[email protected]/
On Wed, 19 Jul 2023, Benjamin Bara wrote:
> Hi Dmitry,
>
> thanks for the feedback!
>
> On Tue, 18 Jul 2023 at 06:46, Dmitry Osipenko <[email protected]> wrote:
> > 15.07.2023 10:53, Benjamin Bara пишет:
> > >
> > > Reviewed-by: Dmitry Osipenko <[email protected]>
> > > Acked-for-MFD-by: Lee Jones <[email protected]>
> >
> > Acked-for-MFD-by isn't a valid tag, scripts/checkpatch.pl should tell
> > you about it.
> >
> > In general you may add a comment to a tag, like this:
> >
> > Acked-by: Lee Jones <[email protected]> # for MFD
> >
> > In this particular case, the comment is unnecessary because Lee is the
> > MFD maintainer, hence his ack itself implies the MFD subsys.
>
> I saw the warning, but Lee requested to add it like this [1].
>
> @Konstantin:
> Do you think it makes sense to print a warning when adding "non-standard
> trailers" during running "b4 trailers -u", maybe around the
> find_trailers() checks? I could provide a RFC, if considered useful.
> [1] https://lore.kernel.org/all/[email protected]/
Dmitry, Benjamin,
The warning is valid. The patch will not be applied like this.
I will remove it when I merge the patch.
--
Lee Jones [李琼斯]
July 19, 2023 at 4:22 AM, "Benjamin Bara" <[email protected]> wrote:
> @Konstantin:
> Do you think it makes sense to print a warning when adding "non-standard
> trailers" during running "b4 trailers -u", maybe around the
> find_trailers() checks? I could provide a RFC, if considered useful.
With b4 being used for other projects than just the Linux kernel, I don't think it makes sense for us to track what is a valid and what is an invalid "person-trailer". I know that we could make it configurable, but I don't think this will actually improve the situation.
One goal for b4 is to allow defining validation tests and requiring them prior to "b4 send", so I think this is a better mechanism for dealing with such situations.
-K
On Fri, 28 Jul 2023, Lee Jones wrote:
> On Sat, 15 Jul 2023 09:53:22 +0200, Benjamin Bara wrote:
> > The Tegra20 requires an enabled VDE power domain during startup. As the
> > VDE is currently not used, it is disabled during runtime.
> > Since 8f0c714ad9be, there is a workaround for the "normal restart path"
> > which enables the VDE before doing PMC's warm reboot. This workaround is
> > not executed in the "emergency restart path", leading to a hang-up
> > during start.
> >
> > [...]
>
> Applied, thanks!
>
> [1/5] kernel/reboot: emergency_restart: set correct system_state
> commit: 60466c067927abbcaff299845abd4b7069963139
> [2/5] i2c: core: run atomic i2c xfer when !preemptible
> commit: aa49c90894d06e18a1ee7c095edbd2f37c232d02
> [3/5] kernel/reboot: add device to sys_off_handler
> commit: db2d6038c5e795cab4f0a8d3e86b4f7e33338629
> [4/5] mfd: tps6586x: use devm-based power off handler
> commit: 8bd141b17cedcbcb7d336df6e0462e4f4a528ab1
> [5/5] mfd: tps6586x: register restart handler
> commit: 510f276df2b91efd73f6c53be62b7e692ff533c1
Pull-request to follow after built tests have completed.
--
Lee Jones [李琼斯]
On Sat, 15 Jul 2023 09:53:22 +0200, Benjamin Bara wrote:
> The Tegra20 requires an enabled VDE power domain during startup. As the
> VDE is currently not used, it is disabled during runtime.
> Since 8f0c714ad9be, there is a workaround for the "normal restart path"
> which enables the VDE before doing PMC's warm reboot. This workaround is
> not executed in the "emergency restart path", leading to a hang-up
> during start.
>
> [...]
Applied, thanks!
[1/5] kernel/reboot: emergency_restart: set correct system_state
commit: 60466c067927abbcaff299845abd4b7069963139
[2/5] i2c: core: run atomic i2c xfer when !preemptible
commit: aa49c90894d06e18a1ee7c095edbd2f37c232d02
[3/5] kernel/reboot: add device to sys_off_handler
commit: db2d6038c5e795cab4f0a8d3e86b4f7e33338629
[4/5] mfd: tps6586x: use devm-based power off handler
commit: 8bd141b17cedcbcb7d336df6e0462e4f4a528ab1
[5/5] mfd: tps6586x: register restart handler
commit: 510f276df2b91efd73f6c53be62b7e692ff533c1
--
Lee Jones [李琼斯]
On Thu, 07 Sep 2023, Benjamin Bara wrote:
> Hi Lee,
>
> On Fri, 28 Jul 2023 at 12:34, Lee Jones <[email protected]> wrote:
> > On Fri, 28 Jul 2023, Lee Jones wrote:
> > > On Sat, 15 Jul 2023 09:53:22 +0200, Benjamin Bara wrote:
> > > > The Tegra20 requires an enabled VDE power domain during startup. As the
> > > > VDE is currently not used, it is disabled during runtime.
> > > > Since 8f0c714ad9be, there is a workaround for the "normal restart path"
> > > > which enables the VDE before doing PMC's warm reboot. This workaround is
> > > > not executed in the "emergency restart path", leading to a hang-up
> > > > during start.
> > > >
> > > > [...]
> > >
> > > Applied, thanks!
> > >
> > > [1/5] kernel/reboot: emergency_restart: set correct system_state
> > > commit: 60466c067927abbcaff299845abd4b7069963139
> > > [2/5] i2c: core: run atomic i2c xfer when !preemptible
> > > commit: aa49c90894d06e18a1ee7c095edbd2f37c232d02
> > > [3/5] kernel/reboot: add device to sys_off_handler
> > > commit: db2d6038c5e795cab4f0a8d3e86b4f7e33338629
> > > [4/5] mfd: tps6586x: use devm-based power off handler
> > > commit: 8bd141b17cedcbcb7d336df6e0462e4f4a528ab1
> > > [5/5] mfd: tps6586x: register restart handler
> > > commit: 510f276df2b91efd73f6c53be62b7e692ff533c1
> >
> > Pull-request to follow after built tests have completed.
>
> What's the current state of this series?
Looks like the build-tests didn't complete properly, so they stayed on
one of my development branches. I'll re-submit them for testing and get
back to you about merging for this cycle.
--
Lee Jones [李琼斯]
Enjoy!
The following changes since commit 06c2afb862f9da8dc5efa4b6076a0e48c3fbaaa5:
Linux 6.5-rc1 (2023-07-09 13:53:13 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd.git ib-mfd-i2c-reboot-v6.7
for you to fetch changes up to 510f276df2b91efd73f6c53be62b7e692ff533c1:
mfd: tps6586x: Register restart handler (2023-07-28 11:33:20 +0100)
----------------------------------------------------------------
Immutable branch between MFD, I2C and Reboot due for the v6.7 merge window
----------------------------------------------------------------
Benjamin Bara (5):
kernel/reboot: emergency_restart: Set correct system_state
i2c: core: Run atomic i2c xfer when !preemptible
kernel/reboot: Add device to sys_off_handler
mfd: tps6586x: Use devm-based power off handler
mfd: tps6586x: Register restart handler
drivers/i2c/i2c-core.h | 2 +-
drivers/mfd/tps6586x.c | 50 ++++++++++++++++++++++++++++++++++++++++++--------
include/linux/reboot.h | 3 +++
kernel/reboot.c | 4 ++++
4 files changed, 50 insertions(+), 9 deletions(-)
--
Lee Jones [李琼斯]
On 11/13/23 17:54, Chris Morgan wrote:
..
> I can confirm I no longer get any of the errors with this patch. Tested
> on both an Anbernic RG353P (RK3566 with an RK817 PMIC) and an Odroid
> Go Advance (RK3326 with an RK817 PMIC). The device appears to shut
> down consistently again and I no longer see these messages in my dmesg
> log when I shut down.
I'll prepare the proper patch, thanks.
--
Best regards,
Dmitry
Hi,
> Since bae1d3a05a8b, i2c transfers are non-atomic if preemption is
> disabled. However, non-atomic i2c transfers require preemption (e.g. in
> wait_for_completion() while waiting for the DMA).
>
> panic() calls preempt_disable_notrace() before calling
> emergency_restart(). Therefore, if an i2c device is used for the
> restart, the xfer should be atomic. This avoids warnings like:
>
> [ 12.667612] WARNING: CPU: 1 PID: 1 at kernel/rcu/tree_plugin.h:318 rcu_note_context_switch+0x33c/0x6b0
> [ 12.676926] Voluntary context switch within RCU read-side critical section!
> ...
> [ 12.742376] schedule_timeout from wait_for_completion_timeout+0x90/0x114
> [ 12.749179] wait_for_completion_timeout from tegra_i2c_wait_completion+0x40/0x70
> ...
> [ 12.994527] atomic_notifier_call_chain from machine_restart+0x34/0x58
> [ 13.001050] machine_restart from panic+0x2a8/0x32c
>
> Use !preemptible() instead, which is basically the same check as
> pre-v5.2.
>
> Fixes: bae1d3a05a8b ("i2c: core: remove use of in_atomic()")
> Cc: [email protected] # v5.2+
> Suggested-by: Dmitry Osipenko <[email protected]>
> Acked-by: Wolfram Sang <[email protected]>
> Reviewed-by: Dmitry Osipenko <[email protected]>
> Tested-by: Nishanth Menon <[email protected]>
> Signed-off-by: Benjamin Bara <[email protected]>
> ---
> drivers/i2c/i2c-core.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/i2c/i2c-core.h b/drivers/i2c/i2c-core.h
> index 1247e6e6e975..05b8b8dfa9bd 100644
> --- a/drivers/i2c/i2c-core.h
> +++ b/drivers/i2c/i2c-core.h
> @@ -29,7 +29,7 @@ int i2c_dev_irq_from_resources(const struct resource *resources,
> */
> static inline bool i2c_in_atomic_xfer_mode(void)
> {
> - return system_state > SYSTEM_RUNNING && irqs_disabled();
> + return system_state > SYSTEM_RUNNING && !preemptible();
With preemption disabled, this boils down to
return system_state > SYSTEM_RUNNING (&& !0)
and will then generate a backtrace splash on each reboot on our
board:
# reboot -f
[ 12.687169] No atomic I2C transfer handler for 'i2c-0'
[ 12.692313] WARNING: CPU: 6 PID: 275 at drivers/i2c/i2c-core.h:40 i2c_smbus_xfer+0x100/0x118
[ 12.700745] Modules linked in:
[ 12.703788] CPU: 6 PID: 275 Comm: reboot Not tainted 6.7.0-rc6-next-20231222+ #2494
[ 12.711431] Hardware name: Kontron 3.5"-SBC-i1200 (DT)
[ 12.716555] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 12.723504] pc : i2c_smbus_xfer+0x100/0x118
[ 12.727674] lr : i2c_smbus_xfer+0x100/0x118
[ 12.731844] sp : ffff80008389b7c0
[ 12.735146] x29: ffff80008389b7c0 x28: ffff0000c561b4c0 x27: ffff0000c2b06400
[ 12.742268] x26: ffff0000c65aec4a x25: 000000000000000b x24: 0000000000000001
[ 12.749389] x23: 0000000000000000 x22: 0000000000000064 x21: 0000000000000008
[ 12.756510] x20: ffff80008389b836 x19: ffff0000c2dda080 x18: ffffffffffffffff
[ 12.763631] x17: ffff800080813f48 x16: ffff80008081bd38 x15: 0730072d07630732
[ 12.770752] x14: 0769072707200772 x13: 0730072d07630732 x12: 0769072707200772
[ 12.777873] x11: 0720072007200720 x10: ffff8000827dc828 x9 : ffff80008012e154
[ 12.784994] x8 : 00000000ffffefff x7 : ffff8000827dc828 x6 : 80000000fffff000
[ 12.792116] x5 : 000000000000bff4 x4 : 0000000000000000 x3 : 0000000000000000
[ 12.799236] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000c561b4c0
[ 12.806359] Call trace:
[ 12.808793] i2c_smbus_xfer+0x100/0x118
[ 12.812616] i2c_smbus_read_i2c_block_data+0x60/0xb0
[ 12.817569] mt6360_regmap_read+0xa4/0x1b8
[ 12.821654] _regmap_raw_read+0xfc/0x270
[ 12.825566] _regmap_bus_read+0x4c/0x90
[ 12.829389] _regmap_read+0x6c/0x168
[ 12.832953] _regmap_update_bits+0xf8/0x150
[ 12.837124] regmap_update_bits_base+0x6c/0xa8
[ 12.841555] regulator_disable_regmap+0x48/0x68
[ 12.846073] _regulator_do_disable+0x130/0x1e8
[ 12.850504] _regulator_disable+0x154/0x1d8
[ 12.854676] regulator_disable+0x44/0x90
[ 12.858587] mmc_regulator_set_ocr+0x8c/0x118
[ 12.862933] msdc_ops_set_ios+0x3f4/0x938
[ 12.866932] mmc_set_initial_state+0x90/0xa8
[ 12.871191] mmc_power_off+0x40/0x68
[ 12.874754] _mmc_sd_suspend+0x5c/0x190
[ 12.878577] mmc_sd_suspend+0x20/0x78
[ 12.882227] mmc_bus_shutdown+0x48/0x90
[ 12.886051] device_shutdown+0x134/0x298
[ 12.889961] kernel_restart+0x48/0xd0
[ 12.893613] __do_sys_reboot+0x1e0/0x278
[ 12.897523] __arm64_sys_reboot+0x2c/0x40
[ 12.901519] invoke_syscall+0x50/0x128
[ 12.905257] el0_svc_common.constprop.0+0x48/0xf0
[ 12.909950] do_el0_svc+0x24/0x38
[ 12.913253] el0_svc+0x38/0xd8
[ 12.916296] el0t_64_sync_handler+0x100/0x130
[ 12.920639] el0t_64_sync+0x1a4/0x1a8
[ 12.924289] ---[ end trace 0000000000000000 ]---
...
I'm not sure if this is now the expected behavior or not. There will be
no backtraces, if I build a preemptible kernel, nor will there be
backtraces if I revert this patch.
OTOH, the driver I'm using (drivers/i2c/busses/i2c-mt65xx.c) has no
*_atomic(). So the warning is correct. There is also [1], which seems to
be the same issue I'm facing.
-michael
[1] https://lore.kernel.org/linux-i2c/[email protected]/
Hi Michael,
On Tue, 2 Jan 2024 at 16:03, Michael Walle <[email protected]> wrote:
> With preemption disabled, this boils down to
> return system_state > SYSTEM_RUNNING (&& !0)
>
> and will then generate a backtrace splash on each reboot on our
> board:
>
> # reboot -f
> [ 12.687169] No atomic I2C transfer handler for 'i2c-0'
> ...
> [ 12.806359] Call trace:
> [ 12.808793] i2c_smbus_xfer+0x100/0x118
> ...
>
> I'm not sure if this is now the expected behavior or not. There will be
> no backtraces, if I build a preemptible kernel, nor will there be
> backtraces if I revert this patch.
thanks for the report.
In your case, the warning comes from shutting down a regulator during
device_shutdown(), so nothing really problematic here. However, later in
the "restart sequence", IRQs are disabled before the restart handlers
are called. If the reboot handlers would rely on irq-based
("non-atomic") i2c transfer, they might not work properly.
> OTOH, the driver I'm using (drivers/i2c/busses/i2c-mt65xx.c) has no
> *_atomic(). So the warning is correct. There is also [1], which seems to
> be the same issue I'm facing.
>
> -michael
>
> [1] https://lore.kernel.org/linux-i2c/[email protected]/
I tried to implement an atomic handler for the mt65xx, but I don't have
the respective hardware available to test it. I decided to use a similar
approach as done in drivers/i2c/busses/i2c-rk3x.c, which calls the IRQ
handler in a while loop if an atomic xfer is requested. IMHO, this
should work with IRQs enabled and disabled, but I am not sure if this is
the best approach...
diff --git a/drivers/i2c/busses/i2c-mt65xx.c b/drivers/i2c/busses/i2c-mt65xx.c
index a8b5719c3372..3c18305e6059 100644
--- a/drivers/i2c/busses/i2c-mt65xx.c
+++ b/drivers/i2c/busses/i2c-mt65xx.c
@@ -16,6 +16,7 @@
#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/iopoll.h>
+#include <linux/jiffies.h>
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/module.h>
@@ -307,6 +308,8 @@ struct mtk_i2c {
bool ignore_restart_irq;
struct mtk_i2c_ac_timing ac_timing;
const struct mtk_i2c_compatible *dev_comp;
+ bool atomic_xfer;
+ bool xfer_complete;
};
/**
@@ -994,6 +997,20 @@ static void i2c_dump_register(struct mtk_i2c *i2c)
readl(i2c->pdmabase + OFFSET_RX_4G_MODE));
}
+static irqreturn_t mtk_i2c_irq(int irqno, void *dev_id);
+
+static int mtk_i2c_wait_xfer_atomic(struct mtk_i2c *i2c)
+{
+ unsigned long timeout = jiffies + i2c->adap.timeout;
+
+ do {
+ udelay(10);
+ mtk_i2c_irq(0, i2c);
+ } while (!i2c->xfer_complete && time_before(jiffies, timeout));
+
+ return i2c->xfer_complete;
+}
+
static int mtk_i2c_do_transfer(struct mtk_i2c *i2c, struct i2c_msg *msgs,
int num, int left_num)
{
@@ -1015,7 +1032,10 @@ static int mtk_i2c_do_transfer(struct mtk_i2c
*i2c, struct i2c_msg *msgs,
if (i2c->auto_restart)
restart_flag = I2C_RS_TRANSFER;
- reinit_completion(&i2c->msg_complete);
+ if (i2c->atomic_xfer)
+ i2c->xfer_complete = false;
+ else
+ reinit_completion(&i2c->msg_complete);
if (i2c->dev_comp->apdma_sync &&
i2c->op != I2C_MASTER_WRRD && num > 1) {
@@ -1195,8 +1215,12 @@ static int mtk_i2c_do_transfer(struct mtk_i2c
*i2c, struct i2c_msg *msgs,
}
mtk_i2c_writew(i2c, start_reg, OFFSET_START);
- ret = wait_for_completion_timeout(&i2c->msg_complete,
- i2c->adap.timeout);
+ if (i2c->atomic_xfer)
+ /* We can't rely on wait_for_completion* calls in atomic mode. */
+ ret = mtk_i2c_wait_xfer_atomic(i2c);
+ else
+ ret = wait_for_completion_timeout(&i2c->msg_complete,
+ i2c->adap.timeout);
/* Clear interrupt mask */
mtk_i2c_writew(i2c, ~(restart_flag | I2C_HS_NACKERR | I2C_ACKERR |
@@ -1238,8 +1262,8 @@ static int mtk_i2c_do_transfer(struct mtk_i2c
*i2c, struct i2c_msg *msgs,
return 0;
}
-static int mtk_i2c_transfer(struct i2c_adapter *adap,
- struct i2c_msg msgs[], int num)
+static int mtk_i2c_transfer_common(struct i2c_adapter *adap,
+ struct i2c_msg msgs[], int num, bool atomic)
{
int ret;
int left_num = num;
@@ -1249,6 +1273,7 @@ static int mtk_i2c_transfer(struct i2c_adapter *adap,
if (ret)
return ret;
+ i2c->atomic_xfer = atomic;
i2c->auto_restart = i2c->dev_comp->auto_restart;
/* checking if we can skip restart and optimize using WRRD mode */
@@ -1303,6 +1328,18 @@ static int mtk_i2c_transfer(struct i2c_adapter *adap,
return ret;
}
+static int mtk_i2c_transfer(struct i2c_adapter *adap,
+ struct i2c_msg msgs[], int num)
+{
+ return mtk_i2c_transfer_common(adap, msgs, num, false);
+}
+
+static int mtk_i2c_transfer_atomic(struct i2c_adapter *adap,
+ struct i2c_msg msgs[], int num)
+{
+ return mtk_i2c_transfer_common(adap, msgs, num, true);
+}
+
static irqreturn_t mtk_i2c_irq(int irqno, void *dev_id)
{
struct mtk_i2c *i2c = dev_id;
@@ -1328,8 +1365,12 @@ static irqreturn_t mtk_i2c_irq(int irqno, void *dev_id)
mtk_i2c_writew(i2c, I2C_RS_MUL_CNFG | I2C_RS_MUL_TRIG |
I2C_TRANSAC_START, OFFSET_START);
} else {
- if (i2c->irq_stat & (I2C_TRANSAC_COMP | restart_flag))
- complete(&i2c->msg_complete);
+ if (i2c->irq_stat & (I2C_TRANSAC_COMP | restart_flag)) {
+ if (i2c->atomic_xfer)
+ i2c->xfer_complete = true;
+ else
+ complete(&i2c->msg_complete);
+ }
}
return IRQ_HANDLED;
@@ -1346,6 +1387,7 @@ static u32 mtk_i2c_functionality(struct i2c_adapter *adap)
static const struct i2c_algorithm mtk_i2c_algorithm = {
.master_xfer = mtk_i2c_transfer,
+ .master_xfer_atomic = mtk_i2c_transfer_atomic,
.functionality = mtk_i2c_functionality,
};
Hi Benjamin,
>> With preemption disabled, this boils down to
>> return system_state > SYSTEM_RUNNING (&& !0)
>>
>> and will then generate a backtrace splash on each reboot on our
>> board:
>>
>> # reboot -f
>> [ 12.687169] No atomic I2C transfer handler for 'i2c-0'
>> ...
>> [ 12.806359] Call trace:
>> [ 12.808793] i2c_smbus_xfer+0x100/0x118
>> ...
>>
>> I'm not sure if this is now the expected behavior or not. There will
>> be
>> no backtraces, if I build a preemptible kernel, nor will there be
>> backtraces if I revert this patch.
>
>
> thanks for the report.
>
> In your case, the warning comes from shutting down a regulator during
> device_shutdown(), so nothing really problematic here.
I tend to disagree. Yes it's not problematic. But from a users point of
view, you get a splash of *many* backtraces on every reboot. Btw, one
should really turn this into a WARN_ONCE(). But even in this case you
might scare users which will eventually lead to more bug reports.
> However, later in
> the "restart sequence", IRQs are disabled before the restart handlers
> are called. If the reboot handlers would rely on irq-based
> ("non-atomic") i2c transfer, they might not work properly.
I get this from a technical point of view and agree that the correct
fix is to add the atomic variant to the i2c driver, which begs the
question, if adding the atomic variant to the driver will be considered
as a Fixes patch.
Do I get it correct, that in my case the interrupts are still enabled?
Otherwise I'd have gotten this warning even before your patch, correct?
Excuse my ignorance, but when are the interrupts actually disabled
during shutdown?
>> OTOH, the driver I'm using (drivers/i2c/busses/i2c-mt65xx.c) has no
>> *_atomic(). So the warning is correct. There is also [1], which seems
>> to
>> be the same issue I'm facing.
>>
>> -michael
>>
>> [1]
>> https://lore.kernel.org/linux-i2c/[email protected]/
>
>
> I tried to implement an atomic handler for the mt65xx, but I don't have
> the respective hardware available to test it. I decided to use a
> similar
> approach as done in drivers/i2c/busses/i2c-rk3x.c, which calls the IRQ
> handler in a while loop if an atomic xfer is requested. IMHO, this
> should work with IRQs enabled and disabled, but I am not sure if this
> is
> the best approach...
Thanks for already looking into that. Do you want to submit it as an
actual patch? If so, you can add
Tested-by: Michael Walle <[email protected]>
But again, it would be nice if we somehow can get rid of this huge
splash
of backtraces on 6.7.x (I guess it's already too late 6.7).
-michael
Hi Michael,
On Wed, 3 Jan 2024 at 10:20, Michael Walle <[email protected]> wrote:
> >> With preemption disabled, this boils down to
> >> return system_state > SYSTEM_RUNNING (&& !0)
> >>
> >> and will then generate a backtrace splash on each reboot on our
> >> board:
> >>
> >> # reboot -f
> >> [ 12.687169] No atomic I2C transfer handler for 'i2c-0'
> >> ...
> >> [ 12.806359] Call trace:
> >> [ 12.808793] i2c_smbus_xfer+0x100/0x118
> >> ...
> >>
> >> I'm not sure if this is now the expected behavior or not. There will
> >> be
> >> no backtraces, if I build a preemptible kernel, nor will there be
> >> backtraces if I revert this patch.
> >
> >
> > thanks for the report.
> >
> > In your case, the warning comes from shutting down a regulator during
> > device_shutdown(), so nothing really problematic here.
>
> I tend to disagree. Yes it's not problematic. But from a users point of
> view, you get a splash of *many* backtraces on every reboot. Btw, one
> should really turn this into a WARN_ONCE(). But even in this case you
> might scare users which will eventually lead to more bug reports.
Sure, but the correct "fix" would be to implement an atomic handler if
the i2c is used during this late stage. I just meant that the
device_shutdown() is less problematic than the actual reboot handler.
Your PMIC seems to not have a reboot handler (registered (yet)), and is
therefore not "affected".
> > However, later in
> > the "restart sequence", IRQs are disabled before the restart handlers
> > are called. If the reboot handlers would rely on irq-based
> > ("non-atomic") i2c transfer, they might not work properly.
>
> I get this from a technical point of view and agree that the correct
> fix is to add the atomic variant to the i2c driver, which begs the
> question, if adding the atomic variant to the driver will be considered
> as a Fixes patch.
I can add a Fixes when I post it. Although the initial patch just makes
the actual problem "noisier".
> Do I get it correct, that in my case the interrupts are still enabled?
> Otherwise I'd have gotten this warning even before your patch, correct?
Yes, device_shutdown() is called during
kernel_{shutdown,restart}_prepare(), before
machine_{power_off,restart}() is called. The interrupts should therefore
still be enabled in your case.
> Excuse my ignorance, but when are the interrupts actually disabled
> during shutdown?
This is usually one of the first things done in machine_restart(),
before the architecture-specific restart handlers are called (which
might use i2c). Same for machine_power_off().
> >> OTOH, the driver I'm using (drivers/i2c/busses/i2c-mt65xx.c) has no
> >> *_atomic(). So the warning is correct. There is also [1], which seems
> >> to
> >> be the same issue I'm facing.
> >>
> >> -michael
> >>
> >> [1]
> >> https://lore.kernel.org/linux-i2c/[email protected]/
> >
> >
> > I tried to implement an atomic handler for the mt65xx, but I don't have
> > the respective hardware available to test it. I decided to use a
> > similar
> > approach as done in drivers/i2c/busses/i2c-rk3x.c, which calls the IRQ
> > handler in a while loop if an atomic xfer is requested. IMHO, this
> > should work with IRQs enabled and disabled, but I am not sure if this
> > is
> > the best approach...
>
> Thanks for already looking into that. Do you want to submit it as an
> actual patch? If so, you can add
>
> Tested-by: Michael Walle <[email protected]>
Yes, I can do that - thanks for the quick feedback.
> But again, it would be nice if we somehow can get rid of this huge
> splash
> of backtraces on 6.7.x (I guess it's already too late 6.7).
IMHO, converting the error to WARN_ONCE() makes sense to reduce the
noise, but helps having more reliable reboot handling via i2c. Do you
think this is a sufficient "short-term solution" to reduce the noise
before the missing atomic handlers are actually implemented?
Hi Benjamin,
>> >> With preemption disabled, this boils down to
>> >> return system_state > SYSTEM_RUNNING (&& !0)
>> >>
>> >> and will then generate a backtrace splash on each reboot on our
>> >> board:
>> >>
>> >> # reboot -f
>> >> [ 12.687169] No atomic I2C transfer handler for 'i2c-0'
>> >> ...
>> >> [ 12.806359] Call trace:
>> >> [ 12.808793] i2c_smbus_xfer+0x100/0x118
>> >> ...
>> >>
>> >> I'm not sure if this is now the expected behavior or not. There will
>> >> be
>> >> no backtraces, if I build a preemptible kernel, nor will there be
>> >> backtraces if I revert this patch.
>> >
>> >
>> > thanks for the report.
>> >
>> > In your case, the warning comes from shutting down a regulator during
>> > device_shutdown(), so nothing really problematic here.
>>
>> I tend to disagree. Yes it's not problematic. But from a users point
>> of
>> view, you get a splash of *many* backtraces on every reboot. Btw, one
>> should really turn this into a WARN_ONCE(). But even in this case you
>> might scare users which will eventually lead to more bug reports.
>
> Sure, but the correct "fix" would be to implement an atomic handler if
> the i2c is used during this late stage. I just meant that the
> device_shutdown() is less problematic than the actual reboot handler.
> Your PMIC seems to not have a reboot handler (registered (yet)), and is
> therefore not "affected".
>
>> > However, later in
>> > the "restart sequence", IRQs are disabled before the restart handlers
>> > are called. If the reboot handlers would rely on irq-based
>> > ("non-atomic") i2c transfer, they might not work properly.
>>
>> I get this from a technical point of view and agree that the correct
>> fix is to add the atomic variant to the i2c driver, which begs the
>> question, if adding the atomic variant to the driver will be
>> considered
>> as a Fixes patch.
>
> I can add a Fixes when I post it. Although the initial patch just makes
> the actual problem "noisier".
As far as I understand, there was no problem (for me at least),
because the interrupts were still enabled at this time. But now,
there is the problem with getting these backtraces and with that
the user reports.
Don't get me wrong, I'm all for the correct fix here. But at the
same time I fear all the reports we'll be getting. And in the meantime
there was already a new one.
>> Do I get it correct, that in my case the interrupts are still enabled?
>> Otherwise I'd have gotten this warning even before your patch,
>> correct?
>
> Yes, device_shutdown() is called during
> kernel_{shutdown,restart}_prepare(), before
> machine_{power_off,restart}() is called. The interrupts should
> therefore
> still be enabled in your case.
>
>> Excuse my ignorance, but when are the interrupts actually disabled
>> during shutdown?
>
> This is usually one of the first things done in machine_restart(),
> before the architecture-specific restart handlers are called (which
> might use i2c). Same for machine_power_off().
Thanks for explaining.
>> >> OTOH, the driver I'm using (drivers/i2c/busses/i2c-mt65xx.c) has no
>> >> *_atomic(). So the warning is correct. There is also [1], which seems
>> >> to
>> >> be the same issue I'm facing.
>> >>
>> >> -michael
>> >>
>> >> [1]
>> >> https://lore.kernel.org/linux-i2c/[email protected]/
>> >
>> >
>> > I tried to implement an atomic handler for the mt65xx, but I don't have
>> > the respective hardware available to test it. I decided to use a
>> > similar
>> > approach as done in drivers/i2c/busses/i2c-rk3x.c, which calls the IRQ
>> > handler in a while loop if an atomic xfer is requested. IMHO, this
>> > should work with IRQs enabled and disabled, but I am not sure if this
>> > is
>> > the best approach...
>>
>> Thanks for already looking into that. Do you want to submit it as an
>> actual patch? If so, you can add
>>
>> Tested-by: Michael Walle <[email protected]>
>
> Yes, I can do that - thanks for the quick feedback.
>
>> But again, it would be nice if we somehow can get rid of this huge
>> splash
>> of backtraces on 6.7.x (I guess it's already too late 6.7).
>
> IMHO, converting the error to WARN_ONCE() makes sense to reduce the
> noise, but helps having more reliable reboot handling via i2c. Do you
> think this is a sufficient "short-term solution" to reduce the noise
> before the missing atomic handlers are actually implemented?
Turning that WARN into a WARN_ONCE is one thing. But it is still odd
that don't I get a warning with preemption enabled. Is that because
preemptible() will still return 1 until interrupts are actually
disabled?
Can we achieve something similar with kernels without preemption
support?
IOW, just warn iff there is an actual error, that is if i2c_xfer()
is called with interrupt off?
-michael