2024-02-09 13:56:23

by Théo Lebrun

Subject: [PATCH v3 0/4] spi: cadence-qspi: Fix runtime PM and system-wide suspend

Hi,

This fixes runtime PM and system-wide suspend for the cadence-qspi
driver. Seeing how runtime PM and autosuspend are enabled by default, I
believe this affects all users of the driver.

This series has been tested on both Mobileye EyeQ5 hardware and the TI
J7200 EVM board, under s2idle.

Thanks all,
Théo

[0]: https://lore.kernel.org/lkml/[email protected]/

Signed-off-by: Théo Lebrun <[email protected]>
---
Changes in v3:
- Move both bugfix patches to the start of the series.
- Remove Fixes: trailer from the function renaming patch.
- Link to v2: https://lore.kernel.org/r/[email protected]

Changes in v2:
- Split the initial change into three separate commits, to make intents
clearer.
- Mark controller as suspended during the system-wide suspend.
- Link to v1: https://lore.kernel.org/r/[email protected]

---
Théo Lebrun (4):
      spi: cadence-qspi: fix pointer reference in runtime PM hooks
      spi: cadence-qspi: remove system-wide suspend helper calls from runtime PM hooks
      spi: cadence-qspi: put runtime in runtime PM hooks names
      spi: cadence-qspi: add system-wide suspend and resume callbacks

 drivers/spi/spi-cadence-quadspi.c | 33 +++++++++++++++++++++------------
 1 file changed, 21 insertions(+), 12 deletions(-)
---
base-commit: 13acce918af915278e49980a3038df31845dbf39
change-id: 20240202-cdns-qspi-pm-fix-29600cc6d7bf

Best regards,
--
Théo Lebrun <[email protected]>



2024-02-09 13:58:30

by Théo Lebrun

Subject: [PATCH v3 2/4] spi: cadence-qspi: remove system-wide suspend helper calls from runtime PM hooks

The ->runtime_suspend() and ->runtime_resume() callbacks are not
expected to call spi_controller_suspend() and spi_controller_resume().
Remove calls to those in the cadence-qspi driver.

Those helpers currently have two roles:
 - They stop/start the queue, including dealing with the kworker.
 - They toggle the SPI controller SPI_CONTROLLER_SUSPENDED flag, which
   requires acquiring ctlr->bus_lock_mutex.

The first role is irrelevant because cadence-qspi is not queued. The
second, however, has two implications:
 - A deadlock occurs, because ->runtime_resume() is called in a context
   where the lock is already taken (in the ->exec_op() callback, where
   the usage count is incremented).
 - All operations would be disallowed once the device is auto-suspended.

Here is a brief call tree highlighting the mutex deadlock:

spi_mem_exec_op()
  ...
  spi_mem_access_start()
    mutex_lock(&ctlr->bus_lock_mutex)

  cqspi_exec_mem_op()
    pm_runtime_resume_and_get()
      cqspi_resume()
        spi_controller_resume()
          mutex_lock(&ctlr->bus_lock_mutex)
    ...

  spi_mem_access_end()
    mutex_unlock(&ctlr->bus_lock_mutex)
  ...
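
For readers who want to see the re-entrancy problem in isolation, here is
a minimal userspace sketch of the call tree. It is only a model: the
names model_mutex, model_lock(), model_runtime_resume() and
model_exec_op() are hypothetical and do not exist in the kernel, and the
model returns -EDEADLK at the point where a real non-recursive mutex
would simply block forever.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a non-recursive lock such as bus_lock_mutex. */
struct model_mutex {
	bool held;
};

/* Returns 0 on success, -EDEADLK where a real mutex would block forever. */
static int model_lock(struct model_mutex *m)
{
	if (m->held)
		return -EDEADLK;
	m->held = true;
	return 0;
}

static void model_unlock(struct model_mutex *m)
{
	assert(m->held);
	m->held = false;
}

/*
 * Models ->runtime_resume(). When with_helper is true it takes the bus
 * lock, as a runtime PM hook calling spi_controller_resume() would.
 */
static int model_runtime_resume(struct model_mutex *bus_lock, bool with_helper)
{
	int ret;

	if (!with_helper)
		return 0;

	ret = model_lock(bus_lock);
	if (ret)
		return ret;
	model_unlock(bus_lock);
	return 0;
}

/* Models spi_mem_exec_op(): the lock is held across the driver callback. */
static int model_exec_op(bool with_helper)
{
	struct model_mutex bus_lock = { .held = false };
	int ret;

	ret = model_lock(&bus_lock);		/* spi_mem_access_start() */
	if (ret)
		return ret;
	ret = model_runtime_resume(&bus_lock, with_helper);
	model_unlock(&bus_lock);		/* spi_mem_access_end() */
	return ret;
}
```

In this model, model_exec_op(true) reports the nested lock attempt while
model_exec_op(false) completes, which mirrors the effect of dropping the
helper calls from the runtime PM hooks.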

Fixes: 0578a6dbfe75 ("spi: spi-cadence-quadspi: add runtime pm support")
Signed-off-by: Théo Lebrun <[email protected]>
---
drivers/spi/spi-cadence-quadspi.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index d19ba024c80b..809bbbb876ad 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -1930,14 +1930,10 @@ static void cqspi_remove(struct platform_device *pdev)
 static int cqspi_suspend(struct device *dev)
 {
 	struct cqspi_st *cqspi = dev_get_drvdata(dev);
-	int ret;

-	ret = spi_controller_suspend(cqspi->host);
 	cqspi_controller_enable(cqspi, 0);
-
 	clk_disable_unprepare(cqspi->clk);
-
-	return ret;
+	return 0;
 }

 static int cqspi_resume(struct device *dev)
@@ -1950,8 +1946,7 @@ static int cqspi_resume(struct device *dev)

 	cqspi->current_cs = -1;
 	cqspi->sclk = 0;
-
-	return spi_controller_resume(cqspi->host);
+	return 0;
 }

 static DEFINE_RUNTIME_DEV_PM_OPS(cqspi_dev_pm_ops, cqspi_suspend,

--
2.43.0


2024-02-09 14:06:44

by Théo Lebrun

Subject: [PATCH v3 1/4] spi: cadence-qspi: fix pointer reference in runtime PM hooks

dev_get_drvdata() gets used to acquire the pointer to cqspi and the SPI
controller. Neither embeds the other; this led to memory corruption.

On a given platform (Mobileye EyeQ5) the memory corruption is hidden
inside cqspi->f_pdata. Also, this uninitialised memory is used as a
mutex (ctlr->bus_lock_mutex) by spi_controller_suspend().
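
As an illustration only, the pattern behind this fix can be sketched in
userspace. The types fake_controller and fake_cqspi, the drvdata
variable and the helper names are hypothetical stand-ins, not the kernel
definitions; the sketch assumes nothing beyond "drvdata holds the cqspi
object, which points at the controller through its host member".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct spi_controller. */
struct fake_controller {
	int suspended;
};

/*
 * Hypothetical stand-in for struct cqspi_st: it points at the
 * controller, it does not embed it.
 */
struct fake_cqspi {
	int current_cs;
	struct fake_controller *host;
};

/* Models the per-device driver data: one untyped pointer. */
static void *drvdata;

/* Models dev_get_drvdata(). */
static void *get_drvdata(void)
{
	return drvdata;
}

/*
 * Correct pattern: drvdata is the cqspi object, so the controller is
 * reached through ->host rather than by reinterpreting the same pointer.
 */
static struct fake_controller *controller_from_drvdata(void)
{
	struct fake_cqspi *cqspi = get_drvdata();

	assert(cqspi != NULL);
	return cqspi->host;
}
```

Treating the value returned by get_drvdata() as a struct fake_controller
pointer would make every controller field access land inside the
fake_cqspi object instead, which is the kind of corruption described
above.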

Fixes: 2087e85bb66e ("spi: cadence-quadspi: fix suspend-resume implementations")
Signed-off-by: Théo Lebrun <[email protected]>
---
drivers/spi/spi-cadence-quadspi.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 74647dfcb86c..d19ba024c80b 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -1930,10 +1930,9 @@ static void cqspi_remove(struct platform_device *pdev)
 static int cqspi_suspend(struct device *dev)
 {
 	struct cqspi_st *cqspi = dev_get_drvdata(dev);
-	struct spi_controller *host = dev_get_drvdata(dev);
 	int ret;

-	ret = spi_controller_suspend(host);
+	ret = spi_controller_suspend(cqspi->host);
 	cqspi_controller_enable(cqspi, 0);

 	clk_disable_unprepare(cqspi->clk);
@@ -1944,7 +1943,6 @@ static int cqspi_suspend(struct device *dev)
 static int cqspi_resume(struct device *dev)
 {
 	struct cqspi_st *cqspi = dev_get_drvdata(dev);
-	struct spi_controller *host = dev_get_drvdata(dev);

 	clk_prepare_enable(cqspi->clk);
 	cqspi_wait_idle(cqspi);
@@ -1953,7 +1951,7 @@ static int cqspi_resume(struct device *dev)
 	cqspi->current_cs = -1;
 	cqspi->sclk = 0;

-	return spi_controller_resume(host);
+	return spi_controller_resume(cqspi->host);
 }

 static DEFINE_RUNTIME_DEV_PM_OPS(cqspi_dev_pm_ops, cqspi_suspend,

--
2.43.0


2024-02-09 14:07:10

by Théo Lebrun

Subject: [PATCH v3 3/4] spi: cadence-qspi: put runtime in runtime PM hooks names

Follow the kernel naming convention with regard to power-management
callback function names.

The convention in the kernel is:
- prefix_suspend means the system-wide suspend callback;
- prefix_runtime_suspend means the runtime PM suspend callback.
The same applies to resume callbacks.

Signed-off-by: Théo Lebrun <[email protected]>
---
drivers/spi/spi-cadence-quadspi.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 809bbbb876ad..ee14965142ba 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -1927,7 +1927,7 @@ static void cqspi_remove(struct platform_device *pdev)
 	pm_runtime_disable(&pdev->dev);
 }

-static int cqspi_suspend(struct device *dev)
+static int cqspi_runtime_suspend(struct device *dev)
 {
 	struct cqspi_st *cqspi = dev_get_drvdata(dev);

@@ -1936,7 +1936,7 @@ static int cqspi_suspend(struct device *dev)
 	return 0;
 }

-static int cqspi_resume(struct device *dev)
+static int cqspi_runtime_resume(struct device *dev)
 {
 	struct cqspi_st *cqspi = dev_get_drvdata(dev);

@@ -1949,8 +1949,8 @@ static int cqspi_resume(struct device *dev)
 	return 0;
 }

-static DEFINE_RUNTIME_DEV_PM_OPS(cqspi_dev_pm_ops, cqspi_suspend,
-				 cqspi_resume, NULL);
+static DEFINE_RUNTIME_DEV_PM_OPS(cqspi_dev_pm_ops, cqspi_runtime_suspend,
+				 cqspi_runtime_resume, NULL);

 static const struct cqspi_driver_platdata cdns_qspi = {
 	.quirks = CQSPI_DISABLE_DAC_MODE,

--
2.43.0


2024-02-09 14:07:41

by Théo Lebrun

Subject: [PATCH v3 4/4] spi: cadence-qspi: add system-wide suspend and resume callbacks

Each SPI controller driver is expected to call the
spi_controller_suspend() and spi_controller_resume() helpers at
system-wide suspend and resume.

They (1) handle the kthread worker for queued controllers and (2) mark
the controller as suspended to have spi_sync() fail while the
controller is unavailable.

Those two operations do not require the controller to be active, so we
do not need to increment the runtime PM usage counter.
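
A rough userspace model of role (2) may help make the point about the
usage counter concrete. Everything here is hypothetical: the
model_controller type, its fields, the function names and the choice of
-ESHUTDOWN as the error value are assumptions of this sketch, not a
description of the SPI core.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the controller state relevant here. */
struct model_controller {
	bool suspended;		/* stands in for SPI_CONTROLLER_SUSPENDED */
	bool clk_enabled;	/* stands in for the runtime PM state */
};

/* System-wide suspend: only marks the controller, clocks are untouched. */
static int model_suspend(struct model_controller *ctlr)
{
	assert(ctlr != NULL);
	ctlr->suspended = true;
	return 0;
}

/* System-wide resume: clears the mark, clocks are still untouched. */
static int model_resume(struct model_controller *ctlr)
{
	assert(ctlr != NULL);
	ctlr->suspended = false;
	return 0;
}

/* Transfers are refused while the controller is marked suspended. */
static int model_sync(const struct model_controller *ctlr)
{
	assert(ctlr != NULL);
	return ctlr->suspended ? -ESHUTDOWN : 0;
}
```

Neither model_suspend() nor model_resume() reads or writes clk_enabled,
which is the model's way of showing that the flag can be toggled while
the device stays runtime-suspended.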

Signed-off-by: Théo Lebrun <[email protected]>
---
drivers/spi/spi-cadence-quadspi.c | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index ee14965142ba..f976681187b0 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -1949,8 +1949,24 @@ static int cqspi_runtime_resume(struct device *dev)
 	return 0;
 }

-static DEFINE_RUNTIME_DEV_PM_OPS(cqspi_dev_pm_ops, cqspi_runtime_suspend,
-				 cqspi_runtime_resume, NULL);
+static int cqspi_suspend(struct device *dev)
+{
+	struct cqspi_st *cqspi = dev_get_drvdata(dev);
+
+	return spi_controller_suspend(cqspi->host);
+}
+
+static int cqspi_resume(struct device *dev)
+{
+	struct cqspi_st *cqspi = dev_get_drvdata(dev);
+
+	return spi_controller_resume(cqspi->host);
+}
+
+static const struct dev_pm_ops cqspi_dev_pm_ops = {
+	SET_RUNTIME_PM_OPS(cqspi_runtime_suspend, cqspi_runtime_resume, NULL)
+	SET_SYSTEM_SLEEP_PM_OPS(cqspi_suspend, cqspi_resume)
+};

 static const struct cqspi_driver_platdata cdns_qspi = {
 	.quirks = CQSPI_DISABLE_DAC_MODE,

--
2.43.0


2024-02-12 05:01:05

by Dhruva Gole

Subject: Re: [PATCH v3 1/4] spi: cadence-qspi: fix pointer reference in runtime PM hooks

Hi Theo!

On Feb 09, 2024 at 14:55:50 +0100, Théo Lebrun wrote:
> dev_get_drvdata() gets used to acquire the pointer to cqspi and the SPI
> controller. Neither embeds the other; this led to memory corruption.
>
> On a given platform (Mobileye EyeQ5) the memory corruption is hidden
> inside cqspi->f_pdata. Also, this uninitialised memory is used as a
> mutex (ctlr->bus_lock_mutex) by spi_controller_suspend().
>
> Fixes: 2087e85bb66e ("spi: cadence-quadspi: fix suspend-resume implementations")
> Signed-off-by: Théo Lebrun <[email protected]>
> ---
> drivers/spi/spi-cadence-quadspi.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
> index 74647dfcb86c..d19ba024c80b 100644
> --- a/drivers/spi/spi-cadence-quadspi.c
> +++ b/drivers/spi/spi-cadence-quadspi.c
> @@ -1930,10 +1930,9 @@ static void cqspi_remove(struct platform_device *pdev)
>  static int cqspi_suspend(struct device *dev)
>  {
>  	struct cqspi_st *cqspi = dev_get_drvdata(dev);
> -	struct spi_controller *host = dev_get_drvdata(dev);
>  	int ret;
>
> -	ret = spi_controller_suspend(host);
> +	ret = spi_controller_suspend(cqspi->host);
>  	cqspi_controller_enable(cqspi, 0);
>
>  	clk_disable_unprepare(cqspi->clk);
> @@ -1944,7 +1943,6 @@ static int cqspi_suspend(struct device *dev)
>  static int cqspi_resume(struct device *dev)
>  {
>  	struct cqspi_st *cqspi = dev_get_drvdata(dev);
> -	struct spi_controller *host = dev_get_drvdata(dev);
>
>  	clk_prepare_enable(cqspi->clk);
>  	cqspi_wait_idle(cqspi);
> @@ -1953,7 +1951,7 @@ static int cqspi_resume(struct device *dev)
>  	cqspi->current_cs = -1;
>  	cqspi->sclk = 0;
>
> -	return spi_controller_resume(host);
> +	return spi_controller_resume(cqspi->host);

Looks good,
Reviewed-by: Dhruva Gole <[email protected]>


--
Best regards,
Dhruva Gole <[email protected]>

2024-02-22 19:24:31

by Mark Brown

Subject: Re: [PATCH v3 0/4] spi: cadence-qspi: Fix runtime PM and system-wide suspend

On Fri, 09 Feb 2024 14:55:49 +0100, Théo Lebrun wrote:
> This fixes runtime PM and system-wide suspend for the cadence-qspi
> driver. Seeing how runtime PM and autosuspend are enabled by default, I
> believe this affects all users of the driver.
>
> This series has been tested on both Mobileye EyeQ5 hardware and the TI
> J7200 EVM board, under s2idle.
>
> [...]

Applied to

https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next

Thanks!

[1/4] spi: cadence-qspi: fix pointer reference in runtime PM hooks
commit: 32ce3bb57b6b402de2aec1012511e7ac4e7449dc
[2/4] spi: cadence-qspi: remove system-wide suspend helper calls from runtime PM hooks
commit: 959043afe53ae80633e810416cee6076da6e91c6
[3/4] spi: cadence-qspi: put runtime in runtime PM hooks names
commit: 4efa1250b59ebf47ce64a7b6b7c3e2e0a2a9d35a
[4/4] spi: cadence-qspi: add system-wide suspend and resume callbacks
commit: 078d62de433b4f4556bb676e5dd670f0d4103376

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark