coresight_claim_device() is called in cti_starting_cpu() only
when the CTI is enabled, while coresight_disclaim_device() is called
unconditionally in cti_dying_cpu(). This triggers the WARNING below.
Only disclaim the device when the CTI device is enabled to fix it.
[ 75.989643] WARNING: CPU: 1 PID: 14 at
kernel/drivers/hwtracing/coresight/coresight.c:209
coresight_disclaim_device_unlocked+0x10/0x24
[ 75.989697] CPU: 1 PID: 14 Comm: migration/1 Not tainted
5.9.0-rc1-gff1304be0a05-dirty #21
[ 75.989709] Hardware name: Thundercomm Dragonboard 845c (DT)
[ 75.989737] pstate: 80c00085 (Nzcv daIf +PAN +UAO BTYPE=--)
[ 75.989758] pc : coresight_disclaim_device_unlocked+0x10/0x24
[ 75.989775] lr : coresight_disclaim_device+0x24/0x38
[ 75.989783] sp : ffff800011cd3c90
.
[ 75.990018] Call trace:
[ 75.990041] coresight_disclaim_device_unlocked+0x10/0x24
[ 75.990066] cti_dying_cpu+0x34/0x4c
[ 75.990101] cpuhp_invoke_callback+0x84/0x1e0
[ 75.990121] take_cpu_down+0x90/0xe0
[ 75.990154] multi_cpu_stop+0x134/0x160
[ 75.990171] cpu_stopper_thread+0xb0/0x13c
[ 75.990196] smpboot_thread_fn+0x1c4/0x270
[ 75.990222] kthread+0x128/0x154
[ 75.990251] ret_from_fork+0x10/0x18
Fixes: e9b880581d55 ("coresight: cti: Add CPU Hotplug handling to CTI driver")
Signed-off-by: Tingwei Zhang <[email protected]>
---
drivers/hwtracing/coresight/coresight-cti.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-cti.c b/drivers/hwtracing/coresight/coresight-cti.c
index 3ccc703dc940..d6fea6efec71 100644
--- a/drivers/hwtracing/coresight/coresight-cti.c
+++ b/drivers/hwtracing/coresight/coresight-cti.c
@@ -742,7 +742,8 @@ static int cti_dying_cpu(unsigned int cpu)
spin_lock(&drvdata->spinlock);
drvdata->config.hw_powered = false;
- coresight_disclaim_device(drvdata->base);
+ if (drvdata->config.hw_enabled)
+ coresight_disclaim_device(drvdata->base);
spin_unlock(&drvdata->spinlock);
return 0;
}
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
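The claim/disclaim pairing described in this patch can be illustrated with a small user-space model. This is only a sketch: the struct and the model_* helpers are hypothetical stand-ins, not the driver's code, and the warnings counter merely plays the role of the WARNING shown in the log.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space model only; names are hypothetical stand-ins for the driver. */
struct cti_model {
	bool hw_enabled;	/* the CTI has been enabled by a user */
	bool hw_powered;	/* the owning CPU is online */
	bool claimed;		/* stand-in for the CoreSight claim tag */
	int warnings;		/* counts disclaim attempts without a claim */
};

static void model_claim(struct cti_model *m)
{
	m->claimed = true;
}

static void model_disclaim(struct cti_model *m)
{
	if (!m->claimed)
		m->warnings++;	/* plays the role of the WARNING in the log */
	m->claimed = false;
}

/* CPU comes online: the claim is taken only when the CTI is enabled. */
static void model_starting_cpu(struct cti_model *m)
{
	m->hw_powered = true;
	if (m->hw_enabled)
		model_claim(m);
}

/* CPU goes offline; 'guarded' selects the behaviour with the fix. */
static void model_dying_cpu(struct cti_model *m, bool guarded)
{
	m->hw_powered = false;
	if (!guarded || m->hw_enabled)
		model_disclaim(m);
}
```

With a disabled CTI, the unguarded path counts one warning after an online/offline cycle, while the guarded path counts none.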
The deadlock below is triggered when one CPU holds drvdata->spinlock
and calls cti_enable_hw(). smp_call_function_single() is called
in cti_enable_hw() and tries to let another CPU write the CTI registers.
That CPU is trying to take drvdata->spinlock in cti_cpu_pm_notify()
and doesn't respond to the IPI from smp_call_function_single().
[ 988.335937] CPU: 6 PID: 10258 Comm: sh Tainted: G W L
5.8.0-rc6-mainline-16783-gc38daa79b26b-dirty #1
[ 988.346364] Hardware name: Thundercomm Dragonboard 845c (DT)
[ 988.352073] pstate: 20400005 (nzCv daif +PAN -UAO BTYPE=--)
[ 988.357689] pc : smp_call_function_single+0x158/0x1b8
[ 988.362782] lr : smp_call_function_single+0x124/0x1b8
...
[ 988.451638] Call trace:
[ 988.454119] smp_call_function_single+0x158/0x1b8
[ 988.458866] cti_enable+0xb4/0xf8 [coresight_cti]
[ 988.463618] coresight_control_assoc_ectdev+0x6c/0x128 [coresight]
[ 988.469855] coresight_enable+0x1f0/0x364 [coresight]
[ 988.474957] enable_source_store+0x5c/0x9c [coresight]
[ 988.480140] dev_attr_store+0x14/0x28
[ 988.483839] sysfs_kf_write+0x38/0x4c
[ 988.487532] kernfs_fop_write+0x1c0/0x2b0
[ 988.491585] vfs_write+0xfc/0x300
[ 988.494931] ksys_write+0x78/0xe0
[ 988.498283] __arm64_sys_write+0x18/0x20
[ 988.502240] el0_svc_common+0x98/0x160
[ 988.506024] do_el0_svc+0x78/0x80
[ 988.509377] el0_sync_handler+0xd4/0x270
[ 988.513337] el0_sync+0x164/0x180
This change writes the CTI registers directly in cti_enable_hw().
config->hw_powered has been checked to be true with the spinlock held,
so the CTI is powered and can be programmed until the spinlock is released.
Fixes: 6a0953ce7de9 ("coresight: cti: Add CPU idle pm notifer to CTI devices")
Signed-off-by: Tingwei Zhang <[email protected]>
---
drivers/hwtracing/coresight/coresight-cti.c | 17 +----------------
1 file changed, 1 insertion(+), 16 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-cti.c b/drivers/hwtracing/coresight/coresight-cti.c
index 3ccc703dc940..869569eb8c7f 100644
--- a/drivers/hwtracing/coresight/coresight-cti.c
+++ b/drivers/hwtracing/coresight/coresight-cti.c
@@ -86,13 +86,6 @@ void cti_write_all_hw_regs(struct cti_drvdata *drvdata)
CS_LOCK(drvdata->base);
}
-static void cti_enable_hw_smp_call(void *info)
-{
- struct cti_drvdata *drvdata = info;
-
- cti_write_all_hw_regs(drvdata);
-}
-
/* write regs to hardware and enable */
static int cti_enable_hw(struct cti_drvdata *drvdata)
{
@@ -112,15 +105,7 @@ static int cti_enable_hw(struct cti_drvdata *drvdata)
if (rc)
goto cti_err_not_enabled;
- if (drvdata->ctidev.cpu >= 0) {
- rc = smp_call_function_single(drvdata->ctidev.cpu,
- cti_enable_hw_smp_call,
- drvdata, 1);
- if (rc)
- goto cti_err_not_enabled;
- } else {
- cti_write_all_hw_regs(drvdata);
- }
+ cti_write_all_hw_regs(drvdata);
config->hw_enabled = true;
atomic_inc(&drvdata->config.enable_req_count);
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
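The circular wait described in this patch can be pictured as a wait-for graph. The following is a minimal user-space sketch, not kernel code: the two-CPU array and has_wait_cycle() are hypothetical illustrations. Before the change, the CPU calling cti_enable_hw() waits for the IPI to complete on the target CPU, while the target CPU waits for the spinlock held by the caller. After the change, the caller waits for nobody.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_MODEL_CPUS 2
#define NOBODY (-1)

/*
 * waits_for[i] is the CPU that CPU i is blocked on, or NOBODY.
 * Returns true if following the chain from 'start' leads back to 'start'.
 */
static bool has_wait_cycle(const int waits_for[NR_MODEL_CPUS], int start)
{
	int cur = waits_for[start];
	int steps;

	for (steps = 0; steps < NR_MODEL_CPUS && cur != NOBODY; steps++) {
		if (cur == start)
			return true;
		cur = waits_for[cur];
	}
	return false;
}
```

For example, { 1, 0 } models the reported hang (CPU 0 waits on the IPI to CPU 1, CPU 1 waits on the lock held by CPU 0), and { NOBODY, 0 } models the direct register write, where CPU 1 simply waits until CPU 0 drops the lock.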
The BUG below is triggered by calling pm_runtime_get_sync() in
cti_cpuhp_enable_hw(), which runs in a CPU hotplug callback with interrupts
disabled. pm_runtime_get_sync() calls into the clock driver to enable the
clock, which can sleep. Remove pm_runtime_get_sync() from cti_cpuhp_enable_hw(),
since pm_runtime_get_sync() is already called in cti_enable() and
pm_runtime_put() is called in cti_disable(). There is no need to increase the
PM usage count when the CPU comes online since it is not decreased when the
CPU goes offline.
[ 105.800279] BUG: scheduling while atomic: swapper/1/0/0x00000002
[ 105.800290] Modules linked in:
[ 105.800327] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W
5.9.0-rc1-gff1304be0a05-dirty #21
[ 105.800337] Hardware name: Thundercomm Dragonboard 845c (DT)
[ 105.800353] Call trace:
[ 105.800414] dump_backtrace+0x0/0x1d4
[ 105.800439] show_stack+0x14/0x1c
[ 105.800462] dump_stack+0xc0/0x100
[ 105.800490] __schedule_bug+0x58/0x74
[ 105.800523] __schedule+0x590/0x65c
[ 105.800538] schedule+0x78/0x10c
[ 105.800553] schedule_timeout+0x188/0x250
[ 105.800585] qmp_send.constprop.10+0x12c/0x1b0
[ 105.800599] qmp_qdss_clk_prepare+0x18/0x20
[ 105.800622] clk_core_prepare+0x48/0xd4
[ 105.800639] clk_prepare+0x20/0x34
[ 105.800663] amba_pm_runtime_resume+0x54/0x90
[ 105.800695] __rpm_callback+0xdc/0x138
[ 105.800709] rpm_callback+0x24/0x78
[ 105.800724] rpm_resume+0x328/0x47c
[ 105.800739] __pm_runtime_resume+0x50/0x74
[ 105.800768] cti_starting_cpu+0x40/0xa4
[ 105.800795] cpuhp_invoke_callback+0x84/0x1e0
[ 105.800814] notify_cpu_starting+0x9c/0xb8
[ 105.800834] secondary_start_kernel+0xd8/0x164
[ 105.800933] CPU1: Booted secondary processor 0x0000000100 [0x517f803c]
Fixes: e9b880581d55 ("coresight: cti: Add CPU Hotplug handling to CTI driver")
Signed-off-by: Tingwei Zhang <[email protected]>
---
drivers/hwtracing/coresight/coresight-cti.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-cti.c b/drivers/hwtracing/coresight/coresight-cti.c
index 869569eb8c7f..baba6af83440 100644
--- a/drivers/hwtracing/coresight/coresight-cti.c
+++ b/drivers/hwtracing/coresight/coresight-cti.c
@@ -126,9 +126,7 @@ static int cti_enable_hw(struct cti_drvdata *drvdata)
static void cti_cpuhp_enable_hw(struct cti_drvdata *drvdata)
{
struct cti_config *config = &drvdata->config;
- struct device *dev = &drvdata->csdev->dev;
- pm_runtime_get_sync(dev->parent);
spin_lock(&drvdata->spinlock);
config->hw_powered = true;
@@ -148,7 +146,6 @@ static void cti_cpuhp_enable_hw(struct cti_drvdata *drvdata)
/* did not re-enable due to no claim / no request */
cti_hp_not_enabled:
spin_unlock(&drvdata->spinlock);
- pm_runtime_put(dev->parent);
}
/* disable hardware */
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
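The reference-count argument in this patch can be checked with a tiny user-space model. This is a sketch under stated assumptions: struct pm_model and the model_* functions are hypothetical, and the increments only stand in for pm_runtime_get_sync() and pm_runtime_put().

```c
#include <assert.h>

/* Hypothetical stand-in for the device's runtime PM usage count. */
struct pm_model {
	int usage_count;
};

/* Enable path: stands in for pm_runtime_get_sync() in the enable call. */
static void model_cti_enable(struct pm_model *pm)
{
	pm->usage_count++;
}

/* Disable path: stands in for pm_runtime_put() in the disable call. */
static void model_cti_disable(struct pm_model *pm)
{
	pm->usage_count--;
}

/* Hotplug callbacks with the fix applied: the count is left untouched. */
static void model_cpu_offline(struct pm_model *pm)
{
	(void)pm;
}

static void model_cpu_online(struct pm_model *pm)
{
	(void)pm;
}
```

In this model the reference taken at enable time is held across an offline/online cycle, so the hotplug path never needs to call anything that might sleep.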
On Tue, 18 Aug 2020 at 12:11, Tingwei Zhang <[email protected]> wrote:
>
> Below BUG is triggered by call pm_runtime_get_sync() in
> cti_cpuhp_enable_hw(). It's in CPU hotplug callback with interrupt
> disabled. Pm_runtime_get_sync() calls clock driver to enable clock
> which could sleep. Remove pm_runtime_get_sync() in cti_cpuhp_enable_hw()
> since pm_runtime_get_sync() is called in cti_enabld and pm_runtime_put()
> is called in cti_disabled. No need to increase pm count when CPU gets
> online since it's not decreased when CPU is offline.
>
> [ 105.800279] BUG: scheduling while atomic: swapper/1/0/0x00000002
> [ 105.800290] Modules linked in:
> [ 105.800327] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W
> 5.9.0-rc1-gff1304be0a05-dirty #21
> [ 105.800337] Hardware name: Thundercomm Dragonboard 845c (DT)
> [ 105.800353] Call trace:
> [ 105.800414] dump_backtrace+0x0/0x1d4
> [ 105.800439] show_stack+0x14/0x1c
> [ 105.800462] dump_stack+0xc0/0x100
> [ 105.800490] __schedule_bug+0x58/0x74
> [ 105.800523] __schedule+0x590/0x65c
> [ 105.800538] schedule+0x78/0x10c
> [ 105.800553] schedule_timeout+0x188/0x250
> [ 105.800585] qmp_send.constprop.10+0x12c/0x1b0
> [ 105.800599] qmp_qdss_clk_prepare+0x18/0x20
> [ 105.800622] clk_core_prepare+0x48/0xd4
> [ 105.800639] clk_prepare+0x20/0x34
> [ 105.800663] amba_pm_runtime_resume+0x54/0x90
> [ 105.800695] __rpm_callback+0xdc/0x138
> [ 105.800709] rpm_callback+0x24/0x78
> [ 105.800724] rpm_resume+0x328/0x47c
> [ 105.800739] __pm_runtime_resume+0x50/0x74
> [ 105.800768] cti_starting_cpu+0x40/0xa4
> [ 105.800795] cpuhp_invoke_callback+0x84/0x1e0
> [ 105.800814] notify_cpu_starting+0x9c/0xb8
> [ 105.800834] secondary_start_kernel+0xd8/0x164
> [ 105.800933] CPU1: Booted secondary processor 0x0000000100 [0x517f803c]
>
> Fixes: e9b880581d55 ("coresight: cti: Add CPU Hotplug handling to CTI driver")
> Signed-off-by: Tingwei Zhang <[email protected]>
> ---
> drivers/hwtracing/coresight/coresight-cti.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-cti.c b/drivers/hwtracing/coresight/coresight-cti.c
> index 869569eb8c7f..baba6af83440 100644
> --- a/drivers/hwtracing/coresight/coresight-cti.c
> +++ b/drivers/hwtracing/coresight/coresight-cti.c
> @@ -126,9 +126,7 @@ static int cti_enable_hw(struct cti_drvdata *drvdata)
> static void cti_cpuhp_enable_hw(struct cti_drvdata *drvdata)
> {
> struct cti_config *config = &drvdata->config;
> - struct device *dev = &drvdata->csdev->dev;
>
> - pm_runtime_get_sync(dev->parent);
> spin_lock(&drvdata->spinlock);
> config->hw_powered = true;
>
> @@ -148,7 +146,6 @@ static void cti_cpuhp_enable_hw(struct cti_drvdata *drvdata)
> /* did not re-enable due to no claim / no request */
> cti_hp_not_enabled:
> spin_unlock(&drvdata->spinlock);
> - pm_runtime_put(dev->parent);
> }
>
> /* disable hardware */
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>
Reviewed-by: Mike Leach <[email protected]>
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
Hi Tingwei
On Tue, 18 Aug 2020 at 12:11, Tingwei Zhang <[email protected]> wrote:
>
> Deadlock as below is triggered by one CPU holds drvdata->spinlock
> and calls cti_enable_hw(). Smp_call_function_single() is called
> in cti_enable_hw() and tries to let another CPU write CTI registers.
> That CPU is trying to get drvdata->spinlock in cti_cpu_pm_notify()
> and doesn't response to IPI from smp_call_function_single().
>
> [ 988.335937] CPU: 6 PID: 10258 Comm: sh Tainted: G W L
> 5.8.0-rc6-mainline-16783-gc38daa79b26b-dirty #1
> [ 988.346364] Hardware name: Thundercomm Dragonboard 845c (DT)
> [ 988.352073] pstate: 20400005 (nzCv daif +PAN -UAO BTYPE=--)
> [ 988.357689] pc : smp_call_function_single+0x158/0x1b8
> [ 988.362782] lr : smp_call_function_single+0x124/0x1b8
> ...
> [ 988.451638] Call trace:
> [ 988.454119] smp_call_function_single+0x158/0x1b8
> [ 988.458866] cti_enable+0xb4/0xf8 [coresight_cti]
> [ 988.463618] coresight_control_assoc_ectdev+0x6c/0x128 [coresight]
> [ 988.469855] coresight_enable+0x1f0/0x364 [coresight]
> [ 988.474957] enable_source_store+0x5c/0x9c [coresight]
> [ 988.480140] dev_attr_store+0x14/0x28
> [ 988.483839] sysfs_kf_write+0x38/0x4c
> [ 988.487532] kernfs_fop_write+0x1c0/0x2b0
> [ 988.491585] vfs_write+0xfc/0x300
> [ 988.494931] ksys_write+0x78/0xe0
> [ 988.498283] __arm64_sys_write+0x18/0x20
> [ 988.502240] el0_svc_common+0x98/0x160
> [ 988.506024] do_el0_svc+0x78/0x80
> [ 988.509377] el0_sync_handler+0xd4/0x270
> [ 988.513337] el0_sync+0x164/0x180
>
> This change write CTI registers directly in cti_enable_hw().
> Config->hw_powered has been checked to be true with spinlock holded.
> CTI is powered and can be programmed until spinlock is released.
>
Agreed.
Reviewed-by: Mike Leach <[email protected]>
> Fixes: 6a0953ce7de9 ("coresight: cti: Add CPU idle pm notifer to CTI devices")
> Signed-off-by: Tingwei Zhang <[email protected]>
> ---
> drivers/hwtracing/coresight/coresight-cti.c | 17 +----------------
> 1 file changed, 1 insertion(+), 16 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-cti.c b/drivers/hwtracing/coresight/coresight-cti.c
> index 3ccc703dc940..869569eb8c7f 100644
> --- a/drivers/hwtracing/coresight/coresight-cti.c
> +++ b/drivers/hwtracing/coresight/coresight-cti.c
> @@ -86,13 +86,6 @@ void cti_write_all_hw_regs(struct cti_drvdata *drvdata)
> CS_LOCK(drvdata->base);
> }
>
> -static void cti_enable_hw_smp_call(void *info)
> -{
> - struct cti_drvdata *drvdata = info;
> -
> - cti_write_all_hw_regs(drvdata);
> -}
> -
> /* write regs to hardware and enable */
> static int cti_enable_hw(struct cti_drvdata *drvdata)
> {
> @@ -112,15 +105,7 @@ static int cti_enable_hw(struct cti_drvdata *drvdata)
> if (rc)
> goto cti_err_not_enabled;
>
> - if (drvdata->ctidev.cpu >= 0) {
> - rc = smp_call_function_single(drvdata->ctidev.cpu,
> - cti_enable_hw_smp_call,
> - drvdata, 1);
> - if (rc)
> - goto cti_err_not_enabled;
> - } else {
> - cti_write_all_hw_regs(drvdata);
> - }
> + cti_write_all_hw_regs(drvdata);
>
> config->hw_enabled = true;
> atomic_inc(&drvdata->config.enable_req_count);
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
On Tue, 18 Aug 2020 at 12:11, Tingwei Zhang <[email protected]> wrote:
>
> Coresight_claim_device() is called in cti_starting_cpu() only
> when CTI is enabled while coresight_disclaim_device() is called
> uncontionally in cti_dying_cpu(). This triggered below WARNING.
> Only call disclaim device when CTI device is enabled to fix it.
>
> [ 75.989643] WARNING: CPU: 1 PID: 14 at
> kernel/drivers/hwtracing/coresight/coresight.c:209
> coresight_disclaim_device_unlocked+0x10/0x24
> [ 75.989697] CPU: 1 PID: 14 Comm: migration/1 Not tainted
> 5.9.0-rc1-gff1304be0a05-dirty #21
> [ 75.989709] Hardware name: Thundercomm Dragonboard 845c (DT)
> [ 75.989737] pstate: 80c00085 (Nzcv daIf +PAN +UAO BTYPE=--)
> [ 75.989758] pc : coresight_disclaim_device_unlocked+0x10/0x24
> [ 75.989775] lr : coresight_disclaim_device+0x24/0x38
> [ 75.989783] sp : ffff800011cd3c90
> .
> [ 75.990018] Call trace:
> [ 75.990041] coresight_disclaim_device_unlocked+0x10/0x24
> [ 75.990066] cti_dying_cpu+0x34/0x4c
> [ 75.990101] cpuhp_invoke_callback+0x84/0x1e0
> [ 75.990121] take_cpu_down+0x90/0xe0
> [ 75.990154] multi_cpu_stop+0x134/0x160
> [ 75.990171] cpu_stopper_thread+0xb0/0x13c
> [ 75.990196] smpboot_thread_fn+0x1c4/0x270
> [ 75.990222] kthread+0x128/0x154
> [ 75.990251] ret_from_fork+0x10/0x18
>
> Fixes: e9b880581d55 ("coresight: cti: Add CPU Hotplug handling to CTI driver")
> Signed-off-by: Tingwei Zhang <[email protected]>
> ---
> drivers/hwtracing/coresight/coresight-cti.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-cti.c b/drivers/hwtracing/coresight/coresight-cti.c
> index 3ccc703dc940..d6fea6efec71 100644
> --- a/drivers/hwtracing/coresight/coresight-cti.c
> +++ b/drivers/hwtracing/coresight/coresight-cti.c
> @@ -742,7 +742,8 @@ static int cti_dying_cpu(unsigned int cpu)
>
> spin_lock(&drvdata->spinlock);
> drvdata->config.hw_powered = false;
> - coresight_disclaim_device(drvdata->base);
> + if (drvdata->config.hw_enabled)
> + coresight_disclaim_device(drvdata->base);
> spin_unlock(&drvdata->spinlock);
> return 0;
> }
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>
Reviewed-by: Mike Leach <[email protected]>
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
Hi Tingwei,
On Tue, Aug 18, 2020 at 07:10:57PM +0800, Tingwei Zhang wrote:
> Deadlock as below is triggered by one CPU holds drvdata->spinlock
> and calls cti_enable_hw(). Smp_call_function_single() is called
> in cti_enable_hw() and tries to let another CPU write CTI registers.
> That CPU is trying to get drvdata->spinlock in cti_cpu_pm_notify()
> and doesn't response to IPI from smp_call_function_single().
>
> [ 988.335937] CPU: 6 PID: 10258 Comm: sh Tainted: G W L
> 5.8.0-rc6-mainline-16783-gc38daa79b26b-dirty #1
> [ 988.346364] Hardware name: Thundercomm Dragonboard 845c (DT)
> [ 988.352073] pstate: 20400005 (nzCv daif +PAN -UAO BTYPE=--)
> [ 988.357689] pc : smp_call_function_single+0x158/0x1b8
> [ 988.362782] lr : smp_call_function_single+0x124/0x1b8
> ...
> [ 988.451638] Call trace:
> [ 988.454119] smp_call_function_single+0x158/0x1b8
> [ 988.458866] cti_enable+0xb4/0xf8 [coresight_cti]
> [ 988.463618] coresight_control_assoc_ectdev+0x6c/0x128 [coresight]
> [ 988.469855] coresight_enable+0x1f0/0x364 [coresight]
> [ 988.474957] enable_source_store+0x5c/0x9c [coresight]
> [ 988.480140] dev_attr_store+0x14/0x28
> [ 988.483839] sysfs_kf_write+0x38/0x4c
> [ 988.487532] kernfs_fop_write+0x1c0/0x2b0
> [ 988.491585] vfs_write+0xfc/0x300
> [ 988.494931] ksys_write+0x78/0xe0
> [ 988.498283] __arm64_sys_write+0x18/0x20
> [ 988.502240] el0_svc_common+0x98/0x160
> [ 988.506024] do_el0_svc+0x78/0x80
> [ 988.509377] el0_sync_handler+0xd4/0x270
> [ 988.513337] el0_sync+0x164/0x180
>
Was this the full log, or did you cut some of it?
> This change write CTI registers directly in cti_enable_hw().
> Config->hw_powered has been checked to be true with spinlock holded.
> CTI is powered and can be programmed until spinlock is released.
>
From your explanation above it seems that cti_enable_hw() was called from, say,
CPUy to enable the CTI associated with CPUx. CTIx's drvdata->spinlock was taken
and smp_call_function_single() was called right after. That woke up CPUx and
cti_cpu_pm_notify() was executed on CPUx in interrupt context, trying to take
CTIx's drvdata->spinlock. That hung CPUx and the kernel got angry. Is my
assessment correct?
If so I don't think the fix suggested in this patch will work. The same
condition will happen whenever cti_enable_hw() is called on a CPU to enable a
CTI that belongs to another CPU while cti_cpu_pm_notify() is called on the
latter CPU at the same time.
I think a better solution is to grab the lock in cti_enable_hw() and check the
value of ->ctidev.cpu. If it is not a global CTI, i.e. ->ctidev.cpu >= 0, then
release the lock and call smp_call_function_single(). In
cti_enable_hw_smp_call() take the lock again and move forward from there.
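As a rough illustration of the ordering suggested above, here is a user-space sketch. It is not driver code: the model_* names are hypothetical, the lock is a plain flag, and model_smp_call() simply runs the callback synchronously in place of smp_call_function_single().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the driver data. */
struct model_drvdata {
	bool lock_held;		/* models drvdata->spinlock */
	int cpu;		/* >= 0: CPU-bound CTI, < 0: global CTI */
	bool regs_written;
	bool lock_held_at_ipi;	/* recorded so the ordering can be checked */
};

static void model_lock(struct model_drvdata *d)
{
	assert(!d->lock_held);
	d->lock_held = true;
}

static void model_unlock(struct model_drvdata *d)
{
	assert(d->lock_held);
	d->lock_held = false;
}

static void model_write_regs(struct model_drvdata *d)
{
	assert(d->lock_held);	/* registers are only written under the lock */
	d->regs_written = true;
}

/* Runs "on the target CPU" and takes the lock itself. */
static void model_enable_hw_smp_call(void *info)
{
	struct model_drvdata *d = info;

	model_lock(d);
	model_write_regs(d);
	model_unlock(d);
}

/* Stand-in for the cross call: invokes func synchronously. */
static int model_smp_call(struct model_drvdata *d, void (*func)(void *))
{
	d->lock_held_at_ipi = d->lock_held;
	func(d);
	return 0;
}

static int model_enable_hw(struct model_drvdata *d)
{
	model_lock(d);
	if (d->cpu < 0) {
		/* Global CTI: write directly while holding the lock. */
		model_write_regs(d);
		model_unlock(d);
		return 0;
	}
	/* CPU-bound CTI: drop the lock before the cross call. */
	model_unlock(d);
	return model_smp_call(d, model_enable_hw_smp_call);
}
```

The point of the sketch is only the ordering: the caller never holds the lock while it waits for the other CPU.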
I have applied the other two patches in this set so no need to send them again.
Thanks,
Mathieu
> Fixes: 6a0953ce7de9 ("coresight: cti: Add CPU idle pm notifer to CTI devices")
> Signed-off-by: Tingwei Zhang <[email protected]>
> ---
> drivers/hwtracing/coresight/coresight-cti.c | 17 +----------------
> 1 file changed, 1 insertion(+), 16 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-cti.c b/drivers/hwtracing/coresight/coresight-cti.c
> index 3ccc703dc940..869569eb8c7f 100644
> --- a/drivers/hwtracing/coresight/coresight-cti.c
> +++ b/drivers/hwtracing/coresight/coresight-cti.c
> @@ -86,13 +86,6 @@ void cti_write_all_hw_regs(struct cti_drvdata *drvdata)
> CS_LOCK(drvdata->base);
> }
>
> -static void cti_enable_hw_smp_call(void *info)
> -{
> - struct cti_drvdata *drvdata = info;
> -
> - cti_write_all_hw_regs(drvdata);
> -}
> -
> /* write regs to hardware and enable */
> static int cti_enable_hw(struct cti_drvdata *drvdata)
> {
> @@ -112,15 +105,7 @@ static int cti_enable_hw(struct cti_drvdata *drvdata)
> if (rc)
> goto cti_err_not_enabled;
>
> - if (drvdata->ctidev.cpu >= 0) {
> - rc = smp_call_function_single(drvdata->ctidev.cpu,
> - cti_enable_hw_smp_call,
> - drvdata, 1);
> - if (rc)
> - goto cti_err_not_enabled;
> - } else {
> - cti_write_all_hw_regs(drvdata);
> - }
> + cti_write_all_hw_regs(drvdata);
>
> config->hw_enabled = true;
> atomic_inc(&drvdata->config.enable_req_count);
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>
On Tue, Aug 18, 2020 at 07:10:57PM +0800, Tingwei Zhang wrote:
> Deadlock as below is triggered by one CPU holds drvdata->spinlock
> and calls cti_enable_hw(). Smp_call_function_single() is called
> in cti_enable_hw() and tries to let another CPU write CTI registers.
> That CPU is trying to get drvdata->spinlock in cti_cpu_pm_notify()
> and doesn't response to IPI from smp_call_function_single().
>
> [ 988.335937] CPU: 6 PID: 10258 Comm: sh Tainted: G W L
> 5.8.0-rc6-mainline-16783-gc38daa79b26b-dirty #1
> [ 988.346364] Hardware name: Thundercomm Dragonboard 845c (DT)
> [ 988.352073] pstate: 20400005 (nzCv daif +PAN -UAO BTYPE=--)
> [ 988.357689] pc : smp_call_function_single+0x158/0x1b8
> [ 988.362782] lr : smp_call_function_single+0x124/0x1b8
> ...
> [ 988.451638] Call trace:
> [ 988.454119] smp_call_function_single+0x158/0x1b8
> [ 988.458866] cti_enable+0xb4/0xf8 [coresight_cti]
> [ 988.463618] coresight_control_assoc_ectdev+0x6c/0x128 [coresight]
> [ 988.469855] coresight_enable+0x1f0/0x364 [coresight]
> [ 988.474957] enable_source_store+0x5c/0x9c [coresight]
> [ 988.480140] dev_attr_store+0x14/0x28
> [ 988.483839] sysfs_kf_write+0x38/0x4c
> [ 988.487532] kernfs_fop_write+0x1c0/0x2b0
> [ 988.491585] vfs_write+0xfc/0x300
> [ 988.494931] ksys_write+0x78/0xe0
> [ 988.498283] __arm64_sys_write+0x18/0x20
> [ 988.502240] el0_svc_common+0x98/0x160
> [ 988.506024] do_el0_svc+0x78/0x80
> [ 988.509377] el0_sync_handler+0xd4/0x270
> [ 988.513337] el0_sync+0x164/0x180
>
> This change write CTI registers directly in cti_enable_hw().
> Config->hw_powered has been checked to be true with spinlock holded.
> CTI is powered and can be programmed until spinlock is released.
>
> Fixes: 6a0953ce7de9 ("coresight: cti: Add CPU idle pm notifer to CTI devices")
> Signed-off-by: Tingwei Zhang <[email protected]>
> ---
> drivers/hwtracing/coresight/coresight-cti.c | 17 +----------------
> 1 file changed, 1 insertion(+), 16 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-cti.c b/drivers/hwtracing/coresight/coresight-cti.c
> index 3ccc703dc940..869569eb8c7f 100644
> --- a/drivers/hwtracing/coresight/coresight-cti.c
> +++ b/drivers/hwtracing/coresight/coresight-cti.c
> @@ -86,13 +86,6 @@ void cti_write_all_hw_regs(struct cti_drvdata *drvdata)
> CS_LOCK(drvdata->base);
> }
>
> -static void cti_enable_hw_smp_call(void *info)
> -{
> - struct cti_drvdata *drvdata = info;
> -
> - cti_write_all_hw_regs(drvdata);
> -}
> -
> /* write regs to hardware and enable */
> static int cti_enable_hw(struct cti_drvdata *drvdata)
> {
> @@ -112,15 +105,7 @@ static int cti_enable_hw(struct cti_drvdata *drvdata)
> if (rc)
> goto cti_err_not_enabled;
>
> - if (drvdata->ctidev.cpu >= 0) {
> - rc = smp_call_function_single(drvdata->ctidev.cpu,
> - cti_enable_hw_smp_call,
> - drvdata, 1);
> - if (rc)
> - goto cti_err_not_enabled;
> - } else {
> - cti_write_all_hw_regs(drvdata);
> - }
> + cti_write_all_hw_regs(drvdata);
I forgot... Looking at the code in cti_enable_hw() the claim tags are not
released in the event that smp_call_function_single() fails. Please address
that as part of refactoring the function.
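A minimal user-space sketch of the error path being asked for, assuming the claim is taken before the cross call. The struct, the injected return code and the label name are hypothetical and only model the control flow.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the state touched by the enable path. */
struct claim_model {
	bool claimed;		/* stand-in for the claim tag */
	bool hw_enabled;
};

/*
 * smp_rc is the injected result of the cross call, so that both the
 * success and the failure path can be exercised.
 */
static int model_enable_hw(struct claim_model *m, int smp_rc)
{
	int rc;

	m->claimed = true;	/* stands in for claiming the device */

	rc = smp_rc;
	if (rc)
		goto err_disclaim;

	m->hw_enabled = true;
	return 0;

err_disclaim:
	m->claimed = false;	/* release the claim on failure */
	return rc;
}
```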
Thanks,
Mathieu
>
> config->hw_enabled = true;
> atomic_inc(&drvdata->config.enable_req_count);
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>
On Fri, Aug 28, 2020 at 02:12:53AM +0800, Mathieu Poirier wrote:
> Hi Tingwei,
>
> On Tue, Aug 18, 2020 at 07:10:57PM +0800, Tingwei Zhang wrote:
> > Deadlock as below is triggered by one CPU holds drvdata->spinlock
> > and calls cti_enable_hw(). Smp_call_function_single() is called
> > in cti_enable_hw() and tries to let another CPU write CTI registers.
> > That CPU is trying to get drvdata->spinlock in cti_cpu_pm_notify()
> > and doesn't response to IPI from smp_call_function_single().
> >
> > [ 988.335937] CPU: 6 PID: 10258 Comm: sh Tainted: G W L
> > 5.8.0-rc6-mainline-16783-gc38daa79b26b-dirty #1
> > [ 988.346364] Hardware name: Thundercomm Dragonboard 845c (DT)
> > [ 988.352073] pstate: 20400005 (nzCv daif +PAN -UAO BTYPE=--)
> > [ 988.357689] pc : smp_call_function_single+0x158/0x1b8
> > [ 988.362782] lr : smp_call_function_single+0x124/0x1b8
> > ...
> > [ 988.451638] Call trace:
> > [ 988.454119] smp_call_function_single+0x158/0x1b8
> > [ 988.458866] cti_enable+0xb4/0xf8 [coresight_cti]
> > [ 988.463618] coresight_control_assoc_ectdev+0x6c/0x128 [coresight]
> > [ 988.469855] coresight_enable+0x1f0/0x364 [coresight]
> > [ 988.474957] enable_source_store+0x5c/0x9c [coresight]
> > [ 988.480140] dev_attr_store+0x14/0x28
> > [ 988.483839] sysfs_kf_write+0x38/0x4c
> > [ 988.487532] kernfs_fop_write+0x1c0/0x2b0
> > [ 988.491585] vfs_write+0xfc/0x300
> > [ 988.494931] ksys_write+0x78/0xe0
> > [ 988.498283] __arm64_sys_write+0x18/0x20
> > [ 988.502240] el0_svc_common+0x98/0x160
> > [ 988.506024] do_el0_svc+0x78/0x80
> > [ 988.509377] el0_sync_handler+0xd4/0x270
> > [ 988.513337] el0_sync+0x164/0x180
> >
>
> Was this the full log or you did cut some of it?
>
I cut some of the CPU register values since they are long and not relevant.
The call trace is complete.
> > This change write CTI registers directly in cti_enable_hw().
> > Config->hw_powered has been checked to be true with spinlock holded.
> > CTI is powered and can be programmed until spinlock is released.
> >
>
> From your explanation above it seems that cti_enable_hw() was called from, say
> CPUy, to enable the CTI associated to CPUx. CTIx's drvdata->spinlock was taken
> and smp_call_function_single() called right after. That woke up CPUx and
> cti_cpu_pm_notify() was executed on CPUx in interrupt context, trying to take
> CTIx's drvdata->spinlock. That hung CPUx and the kernel got angry. Is my
> assessment correct?
>
Most of it is correct. The only difference is that CPUx is powered on when
cti_enable_hw() is called. Otherwise it would goto cti_state_unchanged
and would not call cti_enable_hw_smp_call(). cti_cpu_pm_notify() is called
when CPUx tries to suspend rather than resume.
> If so I don't think the fix suggested in this patch will work. The same
> condition will happen whenever cti_enable_hw() is called on a CPU to enable a
> CTI that belongs to another CPU and that cti_cpu_pm_notify() is called on latter
> CPU at the same time.
>
I'm not sure I understand this correctly. Let me clarify a little.
It's a deadlock because cti_enable_hw() holds the spinlock and calls
cti_enable_hw_smp_call() from CPUx to enable the CTI associated with CPUy. It
waits for cti_enable_hw_smp_call() to return. The IPI is sent to CPUy while
CPUy is in cti_cpu_pm_notify() waiting for the spinlock. In this patch,
I remove cti_enable_hw_smp_call() and write the CTI registers directly from
CPUx. CPUx no longer waits for CPUy, and it releases the spinlock after
programming the CTI registers. After cti_enable_hw() releases the spinlock,
cti_cpu_pm_notify() will continue to run. Since the spinlock is held and
config->hw_powered is true, we don't need to worry about CPUy powering down
while we program the CTI from CPUx.
> I think a better solution is to grab the lock in cti_enable_hw() and check the
> value of ->ctidev.cpu. If not a global CPU, i.e >= 0, then release the lock and
> call smp_call_function_single(). In cti_enable_hw_smp_call() take the lock
> again and move forward from there.
>
After cti_enable_hw() releases the lock, it's possible that the CPU is taken
offline by the user; the cti_enable_hw_smp_call() cross call will fail in that case.
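The failure mode mentioned here can be sketched in user space as follows. This is an assumption-labelled model, not kernel code: the online map and model_smp_call_function_single() are hypothetical, and the -ENXIO result mirrors what the kernel's smp_call_function_single() is understood to return for a CPU that is not online.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 8

/* Hypothetical online map; all CPUs start offline in this model. */
static bool model_cpu_online[MODEL_NR_CPUS];

/*
 * Stand-in for a synchronous cross call: refuses to run the callback
 * when the target CPU is out of range or offline.
 */
static int model_smp_call_function_single(int cpu, void (*func)(void *),
					  void *info)
{
	if (cpu < 0 || cpu >= MODEL_NR_CPUS || !model_cpu_online[cpu])
		return -ENXIO;
	func(info);
	return 0;
}

/* Callback used for illustration: counts how often it was run. */
static void model_count_call(void *info)
{
	int *calls = info;

	(*calls)++;
}
```

If the CPU goes offline between dropping the lock and issuing the call, the callback never runs and the caller has to handle the error.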
> I have applied the other two patches in this set so no need to send them again.
>
Thanks,
Tingwei
> Thanks,
> Mathieu
>
> > Fixes: 6a0953ce7de9 ("coresight: cti: Add CPU idle pm notifer to CTI devices")
> > Signed-off-by: Tingwei Zhang <[email protected]>
> > ---
> > drivers/hwtracing/coresight/coresight-cti.c | 17 +----------------
> > 1 file changed, 1 insertion(+), 16 deletions(-)
> >
> > diff --git a/drivers/hwtracing/coresight/coresight-cti.c b/drivers/hwtracing/coresight/coresight-cti.c
> > index 3ccc703dc940..869569eb8c7f 100644
> > --- a/drivers/hwtracing/coresight/coresight-cti.c
> > +++ b/drivers/hwtracing/coresight/coresight-cti.c
> > @@ -86,13 +86,6 @@ void cti_write_all_hw_regs(struct cti_drvdata *drvdata)
> > CS_LOCK(drvdata->base);
> > }
> >
> > -static void cti_enable_hw_smp_call(void *info)
> > -{
> > - struct cti_drvdata *drvdata = info;
> > -
> > - cti_write_all_hw_regs(drvdata);
> > -}
> > -
> > /* write regs to hardware and enable */
> > static int cti_enable_hw(struct cti_drvdata *drvdata)
> > {
> > @@ -112,15 +105,7 @@ static int cti_enable_hw(struct cti_drvdata *drvdata)
> > if (rc)
> > goto cti_err_not_enabled;
> >
> > - if (drvdata->ctidev.cpu >= 0) {
> > - rc = smp_call_function_single(drvdata->ctidev.cpu,
> > - cti_enable_hw_smp_call,
> > - drvdata, 1);
> > - if (rc)
> > - goto cti_err_not_enabled;
> > - } else {
> > - cti_write_all_hw_regs(drvdata);
> > - }
> > + cti_write_all_hw_regs(drvdata);
> >
> > config->hw_enabled = true;
> > atomic_inc(&drvdata->config.enable_req_count);
> > --
> > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> > a Linux Foundation Collaborative Project
> >
>
On Fri, Aug 28, 2020 at 11:18:18AM +0800, Tingwei Zhang wrote:
> On Fri, Aug 28, 2020 at 02:12:53AM +0800, Mathieu Poirier wrote:
> > Hi Tingwei,
> >
> > On Tue, Aug 18, 2020 at 07:10:57PM +0800, Tingwei Zhang wrote:
> > > Deadlock as below is triggered by one CPU holds drvdata->spinlock
> > > and calls cti_enable_hw(). Smp_call_function_single() is called
> > > in cti_enable_hw() and tries to let another CPU write CTI registers.
> > > That CPU is trying to get drvdata->spinlock in cti_cpu_pm_notify()
> > > and doesn't response to IPI from smp_call_function_single().
> > >
> > > [ 988.335937] CPU: 6 PID: 10258 Comm: sh Tainted: G W L
> > > 5.8.0-rc6-mainline-16783-gc38daa79b26b-dirty #1
> > > [ 988.346364] Hardware name: Thundercomm Dragonboard 845c (DT)
> > > [ 988.352073] pstate: 20400005 (nzCv daif +PAN -UAO BTYPE=--)
> > > [ 988.357689] pc : smp_call_function_single+0x158/0x1b8
> > > [ 988.362782] lr : smp_call_function_single+0x124/0x1b8
> > > ...
> > > [ 988.451638] Call trace:
> > > [ 988.454119] smp_call_function_single+0x158/0x1b8
> > > [ 988.458866] cti_enable+0xb4/0xf8 [coresight_cti]
> > > [ 988.463618] coresight_control_assoc_ectdev+0x6c/0x128 [coresight]
> > > [ 988.469855] coresight_enable+0x1f0/0x364 [coresight]
> > > [ 988.474957] enable_source_store+0x5c/0x9c [coresight]
> > > [ 988.480140] dev_attr_store+0x14/0x28
> > > [ 988.483839] sysfs_kf_write+0x38/0x4c
> > > [ 988.487532] kernfs_fop_write+0x1c0/0x2b0
> > > [ 988.491585] vfs_write+0xfc/0x300
> > > [ 988.494931] ksys_write+0x78/0xe0
> > > [ 988.498283] __arm64_sys_write+0x18/0x20
> > > [ 988.502240] el0_svc_common+0x98/0x160
> > > [ 988.506024] do_el0_svc+0x78/0x80
> > > [ 988.509377] el0_sync_handler+0xd4/0x270
> > > [ 988.513337] el0_sync+0x164/0x180
> > >
> >
> > Was this the full log or you did cut some of it?
> >
>
> I cut some of the CPU register values since they are too long and not
> relevant. The call trace is complete.
>
> > > This change writes the CTI registers directly in cti_enable_hw().
> > > config->hw_powered has been checked to be true with the spinlock held,
> > > so the CTI is powered and can be programmed until the spinlock is released.
> > >
> >
> > From your explanation above it seems that cti_enable_hw() was called from,
> > say, CPUy to enable the CTI associated with CPUx. CTIx's drvdata->spinlock
> > was taken and smp_call_function_single() called right after. That woke up
> > CPUx and cti_cpu_pm_notify() was executed on CPUx in interrupt context,
> > trying to take CTIx's drvdata->spinlock. That hung CPUx and the kernel got
> > angry. Is my assessment correct?
> >
>
> Most of it is correct. The only difference is that CPUx is powered on when
> cti_enable_hw() is called. Otherwise it would go to cti_state_unchanged:
> and wouldn't call cti_enable_hw_smp_call(). cti_cpu_pm_notify() is called
> when CPUx tries to suspend rather than resume.
>
> > If so I don't think the fix suggested in this patch will work. The same
> > condition will happen whenever cti_enable_hw() is called on a CPU to enable
> > a CTI that belongs to another CPU and cti_cpu_pm_notify() is called on the
> > latter CPU at the same time.
> >
>
> I'm not sure I understand this correctly. Let me clarify it a little bit.
> It's a deadlock because cti_enable_hw() holds the spinlock and calls
> cti_enable_hw_smp_call() from CPUx to enable the CTI associated with CPUy,
> then waits for cti_enable_hw_smp_call() to return. The IPI is sent to CPUy
> while CPUy is in cti_cpu_pm_notify(), waiting for the spinlock. In this
> patch, I remove cti_enable_hw_smp_call() and write the CTI registers
> directly on CPUx. It won't wait for CPUy and releases the spinlock after
> programming the CTI registers. After cti_enable_hw() releases the spinlock,
> cti_cpu_pm_notify() will continue to run. Since the spinlock is held and
> config->hw_powered is true, we don't need to worry about CPUy powering down
> while we program the CTI on CPUx.
>
> > I think a better solution is to grab the lock in cti_enable_hw() and check
> > the value of ->ctidev.cpu. If not a global CPU, i.e. >= 0, then release the
> > lock and call smp_call_function_single(). In cti_enable_hw_smp_call() take
> > the lock again and move forward from there.
> >
>
> After cti_enable_hw() releases the lock, it's possible that the CPU is taken
> offline by the user; cti_enable_hw_smp_call() will fail in that case.
>
>
I took another look at your solution and you are correct, CPUy won't
deadlock but will rather spin, waiting for cti_enable_hw() to finish on
CPUx. To make sure CPUx is not interrupted while executing cti_enable_hw(),
I suggest using spin_lock_irq().
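
For reference, the locking pattern discussed above could be modelled roughly
as follows. This is only a single-threaded userspace sketch, not kernel code:
struct cti_model and the model_*() helpers are hypothetical names made up for
illustration, the boolean flags merely stand in for drvdata->spinlock and the
local interrupt mask, and device claiming and runtime PM are left out. The
point it tries to show is that the registers are programmed from the calling
CPU while the lock is held, with no cross-CPU call inside the locked region.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; these are not the kernel's types. */
struct cti_model {
	bool locked;          /* stands in for drvdata->spinlock */
	bool irqs_off;        /* stands in for local interrupts being masked */
	bool hw_powered;      /* mirrors config->hw_powered */
	bool hw_enabled;      /* mirrors config->hw_enabled */
	int enable_req_count; /* mirrors config.enable_req_count */
	int reg_writes;       /* number of register programming passes */
};

/* Models spin_lock_irq(): mask local interrupts, then take the lock. */
static void model_lock_irq(struct cti_model *m)
{
	assert(!m->locked); /* single-threaded model: no recursive locking */
	m->irqs_off = true;
	m->locked = true;
}

/* Models spin_unlock_irq(): drop the lock, then unmask interrupts. */
static void model_unlock_irq(struct cti_model *m)
{
	assert(m->locked);
	m->locked = false;
	m->irqs_off = false;
}

/* Models cti_write_all_hw_regs(): expects the lock to be held. */
static void model_write_all_hw_regs(struct cti_model *m)
{
	assert(m->locked && m->irqs_off);
	m->reg_writes++;
}

/* Models the patched cti_enable_hw(): no cross-CPU call under the lock. */
static int model_enable_hw(struct cti_model *m)
{
	model_lock_irq(m);

	/* Nothing to do if already enabled or the CTI is unpowered. */
	if (m->hw_enabled || !m->hw_powered)
		goto out;

	/* Program the registers directly from the calling CPU. */
	model_write_all_hw_regs(m);
	m->hw_enabled = true;
	m->enable_req_count++;
out:
	model_unlock_irq(m);
	return 0;
}
```

Since hw_powered is only checked and the registers only written inside the
locked region, a PM notifier contending for the same lock on the other CPU
would simply spin until model_unlock_irq() runs, which is the behaviour
described above.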
Thanks,
Mathieu
>
> > I have applied the other two patches in this set so no need to send them
> > again.
> >
> Thanks,
> Tingwei
> > Thanks,
> > Mathieu
> >
> > > Fixes: 6a0953ce7de9 ("coresight: cti: Add CPU idle pm notifer to CTI devices")
> > > Signed-off-by: Tingwei Zhang <[email protected]>
> > > ---
> > > drivers/hwtracing/coresight/coresight-cti.c | 17 +----------------
> > > 1 file changed, 1 insertion(+), 16 deletions(-)
> > >
> > > diff --git a/drivers/hwtracing/coresight/coresight-cti.c b/drivers/hwtracing/coresight/coresight-cti.c
> > > index 3ccc703dc940..869569eb8c7f 100644
> > > --- a/drivers/hwtracing/coresight/coresight-cti.c
> > > +++ b/drivers/hwtracing/coresight/coresight-cti.c
> > > @@ -86,13 +86,6 @@ void cti_write_all_hw_regs(struct cti_drvdata *drvdata)
> > > CS_LOCK(drvdata->base);
> > > }
> > >
> > > -static void cti_enable_hw_smp_call(void *info)
> > > -{
> > > - struct cti_drvdata *drvdata = info;
> > > -
> > > - cti_write_all_hw_regs(drvdata);
> > > -}
> > > -
> > > /* write regs to hardware and enable */
> > > static int cti_enable_hw(struct cti_drvdata *drvdata)
> > > {
> > > @@ -112,15 +105,7 @@ static int cti_enable_hw(struct cti_drvdata *drvdata)
> > > if (rc)
> > > goto cti_err_not_enabled;
> > >
> > > - if (drvdata->ctidev.cpu >= 0) {
> > > - rc = smp_call_function_single(drvdata->ctidev.cpu,
> > > - cti_enable_hw_smp_call,
> > > - drvdata, 1);
> > > - if (rc)
> > > - goto cti_err_not_enabled;
> > > - } else {
> > > - cti_write_all_hw_regs(drvdata);
> > > - }
> > > + cti_write_all_hw_regs(drvdata);
> > >
> > > config->hw_enabled = true;
> > > atomic_inc(&drvdata->config.enable_req_count);
> > > --
> > > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> > > a Linux Foundation Collaborative Project
> > >
> >
On Tue, Sep 01, 2020 at 04:28:19AM +0800, Mathieu Poirier wrote:
> I took another look at your solution and you are correct, CPUy won't
> deadlock but will rather spin, waiting for cti_enable_hw() to finish on
> CPUx. To make sure CPUx is not interrupted while executing cti_enable_hw(),
> I suggest using spin_lock_irq().
>
> Thanks,
> Mathieu
>
Sure. I've sent v2 according to your comments.