2023-10-09 10:06:29

by Sumit Gupta

[permalink] [raw]
Subject: [Patch v2 0/2] Fix hang due to CPU BW request as BPMP suspended

This patch set fixes hang during system resume which started coming
after adding Memory Interconnect and OPP support to the Tegra194 CPUFREQ
in below change:
f41e1442ac5b ("cpufreq: tegra194: add OPP support and set bandwidth").

Tegra194 CPUFREQ driver uses 'CPUFREQ_NEED_INITIAL_FREQ_CHECK' flag
which causes a CPU frequency set request from the 'cpuhp_cpufreq_online'
hotplug notifier during resume. The CPU frequency set call also triggers
a DRAM bandwidth set request but the BPMP driver hasn't resumed yet
which results in hang during resume.

Fix this by resetting the BPMP IPC channels inside tegra_bpmp_transfer*()
API if the bandwidth request came from CPU clusters and the BPMP driver
is still suspended.

---
v[1] -> v2:
- add suspend hook instead of reset hook based approach.
- add suspended instead of needs_reset flag in 'struct tegra_bpmp'.
- add flags field to 'struct tegra_bpmp_message'.
- set TEGRA_BPMP_MESSAGE_RESET flag from MC driver for CPU BW request.
- use suspended and flags to reset BPMP IPC early on resume.

Sumit Gupta (1):
firmware: tegra: add suspend hook and reset BPMP IPC early on resume

Thierry Reding (1):
memory: tegra: set BPMP msg flags to reset IPC channels

drivers/firmware/tegra/bpmp.c | 30 ++++++++++++++++++++++++++++++
drivers/memory/tegra/tegra234.c | 4 ++++
include/soc/tegra/bpmp.h | 6 ++++++
3 files changed, 40 insertions(+)

[1] https://lore.kernel.org/linux-tegra/[email protected]/

--
2.17.1


2023-10-09 10:06:38

by Sumit Gupta

[permalink] [raw]
Subject: [Patch v2 1/2] firmware: tegra: add suspend hook and reset BPMP IPC early on resume

Add suspend hook and a 'suspended' field in the 'struct tegra_bpmp'
to mark if BPMP is suspended. Also, add a 'flags' field in the
'struct tegra_bpmp_message' whose 'TEGRA_BPMP_MESSAGE_RESET' bit
can be set from the Tegra MC driver to signal that the reset of BPMP
IPC channels is required before sending MRQ to the BPMP FW.
Together both the fields allow us to handle any requests that might
be sent too soon as they can cause hang during system resume.
One case where we see BPMP requests being sent before the BPMP driver
has resumed is the memory bandwidth requests which are triggered by
onlining the CPUs during system resume. The CPUs are onlined before
the BPMP has resumed and we need to reset the BPMP IPC channels to
handle these requests.
The additional check for 'flags' is done to avoid any un-intended BPMP
IPC reset if the tegra_bpmp_transfer*() API gets called during suspend
sequence after the BPMP driver is suspended.

Fixes: f41e1442ac5b ("cpufreq: tegra194: add OPP support and set bandwidth")
Co-developed-by: Thierry Reding <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
Signed-off-by: Sumit Gupta <[email protected]>
---
drivers/firmware/tegra/bpmp.c | 30 ++++++++++++++++++++++++++++++
include/soc/tegra/bpmp.h | 6 ++++++
2 files changed, 36 insertions(+)

diff --git a/drivers/firmware/tegra/bpmp.c b/drivers/firmware/tegra/bpmp.c
index 51d062e0c3f1..c1590d3aa9cb 100644
--- a/drivers/firmware/tegra/bpmp.c
+++ b/drivers/firmware/tegra/bpmp.c
@@ -313,6 +313,8 @@ static ssize_t tegra_bpmp_channel_write(struct tegra_bpmp_channel *channel,
return __tegra_bpmp_channel_write(channel, mrq, flags, data, size);
}

+static int __maybe_unused tegra_bpmp_resume(struct device *dev);
+
int tegra_bpmp_transfer_atomic(struct tegra_bpmp *bpmp,
struct tegra_bpmp_message *msg)
{
@@ -325,6 +327,14 @@ int tegra_bpmp_transfer_atomic(struct tegra_bpmp *bpmp,
if (!tegra_bpmp_message_valid(msg))
return -EINVAL;

+ if (bpmp->suspended) {
+ /* Reset BPMP IPC channels during resume based on flags passed */
+ if (msg->flags & TEGRA_BPMP_MESSAGE_RESET)
+ tegra_bpmp_resume(bpmp->dev);
+ else
+ return -EAGAIN;
+ }
+
channel = bpmp->tx_channel;

spin_lock(&bpmp->atomic_tx_lock);
@@ -364,6 +374,14 @@ int tegra_bpmp_transfer(struct tegra_bpmp *bpmp,
if (!tegra_bpmp_message_valid(msg))
return -EINVAL;

+ if (bpmp->suspended) {
+ /* Reset BPMP IPC channels during resume based on flags passed */
+ if (msg->flags & TEGRA_BPMP_MESSAGE_RESET)
+ tegra_bpmp_resume(bpmp->dev);
+ else
+ return -EAGAIN;
+ }
+
channel = tegra_bpmp_write_threaded(bpmp, msg->mrq, msg->tx.data,
msg->tx.size);
if (IS_ERR(channel))
@@ -796,10 +814,21 @@ static int tegra_bpmp_probe(struct platform_device *pdev)
return err;
}

+static int __maybe_unused tegra_bpmp_suspend(struct device *dev)
+{
+ struct tegra_bpmp *bpmp = dev_get_drvdata(dev);
+
+ bpmp->suspended = true;
+
+ return 0;
+}
+
static int __maybe_unused tegra_bpmp_resume(struct device *dev)
{
struct tegra_bpmp *bpmp = dev_get_drvdata(dev);

+ bpmp->suspended = false;
+
if (bpmp->soc->ops->resume)
return bpmp->soc->ops->resume(bpmp);
else
@@ -807,6 +836,7 @@ static int __maybe_unused tegra_bpmp_resume(struct device *dev)
}

static const struct dev_pm_ops tegra_bpmp_pm_ops = {
+ .suspend_noirq = tegra_bpmp_suspend,
.resume_noirq = tegra_bpmp_resume,
};

diff --git a/include/soc/tegra/bpmp.h b/include/soc/tegra/bpmp.h
index 5842e38bb288..f5e4ac5b8cce 100644
--- a/include/soc/tegra/bpmp.h
+++ b/include/soc/tegra/bpmp.h
@@ -102,8 +102,12 @@ struct tegra_bpmp {
#ifdef CONFIG_DEBUG_FS
struct dentry *debugfs_mirror;
#endif
+
+ bool suspended;
};

+#define TEGRA_BPMP_MESSAGE_RESET BIT(0)
+
struct tegra_bpmp_message {
unsigned int mrq;

@@ -117,6 +121,8 @@ struct tegra_bpmp_message {
size_t size;
int ret;
} rx;
+
+ unsigned long flags;
};

#if IS_ENABLED(CONFIG_TEGRA_BPMP)
--
2.17.1

2023-10-09 10:07:09

by Sumit Gupta

[permalink] [raw]
Subject: [Patch v2 2/2] memory: tegra: set BPMP msg flags to reset IPC channels

From: Thierry Reding <[email protected]>

Set the 'TEGRA_BPMP_MESSAGE_RESET' bit in newly added 'flags' field
of 'struct tegra_bpmp_message' to request for the reset of BPMP IPC
channels. This is used along with the 'suspended' check in BPMP driver
for handling early bandwidth requests due to the hotplug of CPU's
during system resume before the driver gets resumed.

Fixes: f41e1442ac5b ("cpufreq: tegra194: add OPP support and set bandwidth")
Signed-off-by: Thierry Reding <[email protected]>
Co-developed-by: Sumit Gupta <[email protected]>
Signed-off-by: Sumit Gupta <[email protected]>
---
drivers/memory/tegra/tegra234.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/memory/tegra/tegra234.c b/drivers/memory/tegra/tegra234.c
index 9e5b5dbd9c8d..2845041f32d6 100644
--- a/drivers/memory/tegra/tegra234.c
+++ b/drivers/memory/tegra/tegra234.c
@@ -986,6 +986,10 @@ static int tegra234_mc_icc_set(struct icc_node *src, struct icc_node *dst)
msg.rx.data = &bwmgr_resp;
msg.rx.size = sizeof(bwmgr_resp);

+ if (pclient->bpmp_id >= TEGRA_ICC_BPMP_CPU_CLUSTER0 &&
+ pclient->bpmp_id <= TEGRA_ICC_BPMP_CPU_CLUSTER2)
+ msg.flags = TEGRA_BPMP_MESSAGE_RESET;
+
ret = tegra_bpmp_transfer(mc->bpmp, &msg);
if (ret < 0) {
dev_err(mc->dev, "BPMP transfer failed: %d\n", ret);
--
2.17.1

2023-10-12 11:01:25

by Thierry Reding

[permalink] [raw]
Subject: Re: [Patch v2 2/2] memory: tegra: set BPMP msg flags to reset IPC channels

On Mon, Oct 09, 2023 at 03:35:57PM +0530, Sumit Gupta wrote:
> From: Thierry Reding <[email protected]>
>
> Set the 'TEGRA_BPMP_MESSAGE_RESET' bit in newly added 'flags' field
> of 'struct tegra_bpmp_message' to request for the reset of BPMP IPC
> channels. This is used along with the 'suspended' check in BPMP driver
> for handling early bandwidth requests due to the hotplug of CPU's
> during system resume before the driver gets resumed.
>
> Fixes: f41e1442ac5b ("cpufreq: tegra194: add OPP support and set bandwidth")
> Signed-off-by: Thierry Reding <[email protected]>
> Co-developed-by: Sumit Gupta <[email protected]>
> Signed-off-by: Sumit Gupta <[email protected]>
> ---
> drivers/memory/tegra/tegra234.c | 4 ++++
> 1 file changed, 4 insertions(+)

Krzysztof,

this one has a build-time dependency on patch 1/2, so it'd make sense
for me to pick this up into the Tegra tree along with patch 1/2. That
is slightly easier because I already have a BPMP patch in the tree.

There should be no conflict between this and the Tegra tree, though,
so if you feel strongly about it, you could also pick up both patches,
in which case:

Acked-by: Thierry Reding <[email protected]>

>
> diff --git a/drivers/memory/tegra/tegra234.c b/drivers/memory/tegra/tegra234.c
> index 9e5b5dbd9c8d..2845041f32d6 100644
> --- a/drivers/memory/tegra/tegra234.c
> +++ b/drivers/memory/tegra/tegra234.c
> @@ -986,6 +986,10 @@ static int tegra234_mc_icc_set(struct icc_node *src, struct icc_node *dst)
> msg.rx.data = &bwmgr_resp;
> msg.rx.size = sizeof(bwmgr_resp);
>
> + if (pclient->bpmp_id >= TEGRA_ICC_BPMP_CPU_CLUSTER0 &&
> + pclient->bpmp_id <= TEGRA_ICC_BPMP_CPU_CLUSTER2)
> + msg.flags = TEGRA_BPMP_MESSAGE_RESET;
> +
> ret = tegra_bpmp_transfer(mc->bpmp, &msg);
> if (ret < 0) {
> dev_err(mc->dev, "BPMP transfer failed: %d\n", ret);
> --
> 2.17.1
>


Attachments:
(No filename) (1.87 kB)
signature.asc (849.00 B)
Download all attachments

2023-10-12 11:01:40

by Thierry Reding

[permalink] [raw]
Subject: Re: [Patch v2 1/2] firmware: tegra: add suspend hook and reset BPMP IPC early on resume

On Mon, Oct 09, 2023 at 03:35:56PM +0530, Sumit Gupta wrote:
> Add suspend hook and a 'suspended' field in the 'struct tegra_bpmp'
> to mark if BPMP is suspended. Also, add a 'flags' field in the
> 'struct tegra_bpmp_message' whose 'TEGRA_BPMP_MESSAGE_RESET' bit
> can be set from the Tegra MC driver to signal that the reset of BPMP
> IPC channels is required before sending MRQ to the BPMP FW.
> Together both the fields allow us to handle any requests that might
> be sent too soon as they can cause hang during system resume.
> One case where we see BPMP requests being sent before the BPMP driver
> has resumed is the memory bandwidth requests which are triggered by
> onlining the CPUs during system resume. The CPUs are onlined before
> the BPMP has resumed and we need to reset the BPMP IPC channels to
> handle these requests.
> The additional check for 'flags' is done to avoid any un-intended BPMP
> IPC reset if the tegra_bpmp_transfer*() API gets called during suspend
> sequence after the BPMP driver is suspended.
>
> Fixes: f41e1442ac5b ("cpufreq: tegra194: add OPP support and set bandwidth")
> Co-developed-by: Thierry Reding <[email protected]>
> Signed-off-by: Thierry Reding <[email protected]>
> Signed-off-by: Sumit Gupta <[email protected]>
> ---
> drivers/firmware/tegra/bpmp.c | 30 ++++++++++++++++++++++++++++++
> include/soc/tegra/bpmp.h | 6 ++++++
> 2 files changed, 36 insertions(+)

Krzysztof,

if you want to pick these up instead of me taking them through the Tegra
tree:

Acked-by: Thierry Reding <[email protected]>


Attachments:
(No filename) (1.56 kB)
signature.asc (849.00 B)
Download all attachments

2023-10-12 13:29:10

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [Patch v2 2/2] memory: tegra: set BPMP msg flags to reset IPC channels

On 09/10/2023 12:05, Sumit Gupta wrote:
> From: Thierry Reding <[email protected]>
>
> Set the 'TEGRA_BPMP_MESSAGE_RESET' bit in newly added 'flags' field
> of 'struct tegra_bpmp_message' to request for the reset of BPMP IPC
> channels. This is used along with the 'suspended' check in BPMP driver
> for handling early bandwidth requests due to the hotplug of CPU's
> during system resume before the driver gets resumed.
>
> Fixes: f41e1442ac5b ("cpufreq: tegra194: add OPP support and set bandwidth")
> Signed-off-by: Thierry Reding <[email protected]>
> Co-developed-by: Sumit Gupta <[email protected]>
> Signed-off-by: Sumit Gupta <[email protected]>


Acked-by: Krzysztof Kozlowski <[email protected]>

Best regards,
Krzysztof

2023-10-12 13:33:05

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [Patch v2 2/2] memory: tegra: set BPMP msg flags to reset IPC channels

On 12/10/2023 13:00, Thierry Reding wrote:
> On Mon, Oct 09, 2023 at 03:35:57PM +0530, Sumit Gupta wrote:
>> From: Thierry Reding <[email protected]>
>>
>> Set the 'TEGRA_BPMP_MESSAGE_RESET' bit in newly added 'flags' field
>> of 'struct tegra_bpmp_message' to request for the reset of BPMP IPC
>> channels. This is used along with the 'suspended' check in BPMP driver
>> for handling early bandwidth requests due to the hotplug of CPU's
>> during system resume before the driver gets resumed.
>>
>> Fixes: f41e1442ac5b ("cpufreq: tegra194: add OPP support and set bandwidth")
>> Signed-off-by: Thierry Reding <[email protected]>
>> Co-developed-by: Sumit Gupta <[email protected]>
>> Signed-off-by: Sumit Gupta <[email protected]>
>> ---
>> drivers/memory/tegra/tegra234.c | 4 ++++
>> 1 file changed, 4 insertions(+)
>
> Krzysztof,
>
> this one has a build-time dependency on patch 1/2, so it'd make sense
> for me to pick this up into the Tegra tree along with patch 1/2. That
> is slightly easier because I already have a BPMP patch in the tree.

Sounds good.

Best regards,
Krzysztof

2023-10-13 12:27:10

by Thierry Reding

[permalink] [raw]
Subject: Re: (subset) [Patch v2 0/2] Fix hang due to CPU BW request as BPMP suspended

From: Thierry Reding <[email protected]>


On Mon, 09 Oct 2023 15:35:55 +0530, Sumit Gupta wrote:
> This patch set fixes hang during system resume which started coming
> after adding Memory Interconnect and OPP support to the Tegra194 CPUFREQ
> in below change:
> f41e1442ac5b ("cpufreq: tegra194: add OPP support and set bandwidth").
>
> Tegra194 CPUFREQ driver uses 'CPUFREQ_NEED_INITIAL_FREQ_CHECK' flag
> which causes a CPU frequency set request from the 'cpuhp_cpufreq_online'
> hotplug notifier during resume. The CPU frequency set call also triggers
> a DRAM bandwidth set request but the BPMP driver hasn't resumed yet
> which results in hang during resume.
>
> [...]

Applied, thanks!

[2/2] memory: tegra: set BPMP msg flags to reset IPC channels
(no commit info)

Best regards,
--
Thierry Reding <[email protected]>

2023-10-13 12:37:01

by Thierry Reding

[permalink] [raw]
Subject: Re: (subset) [Patch v2 0/2] Fix hang due to CPU BW request as BPMP suspended

On Fri, Oct 13, 2023 at 02:25:12PM +0200, Thierry Reding wrote:
> From: Thierry Reding <[email protected]>
>
>
> On Mon, 09 Oct 2023 15:35:55 +0530, Sumit Gupta wrote:
> > This patch set fixes hang during system resume which started coming
> > after adding Memory Interconnect and OPP support to the Tegra194 CPUFREQ
> > in below change:
> > f41e1442ac5b ("cpufreq: tegra194: add OPP support and set bandwidth").
> >
> > Tegra194 CPUFREQ driver uses 'CPUFREQ_NEED_INITIAL_FREQ_CHECK' flag
> > which causes a CPU frequency set request from the 'cpuhp_cpufreq_online'
> > hotplug notifier during resume. The CPU frequency set call also triggers
> > a DRAM bandwidth set request but the BPMP driver hasn't resumed yet
> > which results in hang during resume.
> >
> > [...]
>
> Applied, thanks!
>
> [2/2] memory: tegra: set BPMP msg flags to reset IPC channels
> (no commit info)

For the record, I've actually applied both patches, but applying them to
different branches (i.e. resulting in two subsets for the same series)
seems to have confused b4.

Thierry


Attachments:
(No filename) (1.07 kB)
signature.asc (849.00 B)
Download all attachments