2024-02-28 05:41:57

by Fei Wu

[permalink] [raw]
Subject: [PATCH v2] perf: RISCV: Fix panic on pmu overflow handler

(1 << idx) of int is not desired when setting bits in unsigned long
overflowed_ctrs, use BIT() instead. This panic happens when running
'perf record -e branches' on sophgo sg2042.

[ 273.311852] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000098
[ 273.320851] Oops [#1]
[ 273.323179] Modules linked in:
[ 273.326303] CPU: 0 PID: 1475 Comm: perf Not tainted 6.6.0-rc3+ #9
[ 273.332521] Hardware name: Sophgo Mango (DT)
[ 273.336878] epc : riscv_pmu_ctr_get_width_mask+0x8/0x62
[ 273.342291] ra : pmu_sbi_ovf_handler+0x2e0/0x34e
[ 273.347091] epc : ffffffff80aecd98 ra : ffffffff80aee056 sp : fffffff6e36928b0
[ 273.354454] gp : ffffffff821f82d0 tp : ffffffd90c353200 t0 : 0000002ade4f9978
[ 273.361815] t1 : 0000000000504d55 t2 : ffffffff8016cd8c s0 : fffffff6e3692a70
[ 273.369180] s1 : 0000000000000020 a0 : 0000000000000000 a1 : 00001a8e81800000
[ 273.376540] a2 : 0000003c00070198 a3 : 0000003c00db75a4 a4 : 0000000000000015
[ 273.383901] a5 : ffffffd7ff8804b0 a6 : 0000000000000015 a7 : 000000000000002a
[ 273.391327] s2 : 000000000000ffff s3 : 0000000000000000 s4 : ffffffd7ff8803b0
[ 273.398773] s5 : 0000000000504d55 s6 : ffffffd905069800 s7 : ffffffff821fe210
[ 273.406139] s8 : 000000007fffffff s9 : ffffffd7ff8803b0 s10: ffffffd903f29098
[ 273.413660] s11: 0000000080000000 t3 : 0000000000000003 t4 : ffffffff8017a0ca
[ 273.421022] t5 : ffffffff8023cfc2 t6 : ffffffd9040780e8
[ 273.426437] status: 0000000200000100 badaddr: 0000000000000098 cause: 000000000000000d
[ 273.434512] [<ffffffff80aecd98>] riscv_pmu_ctr_get_width_mask+0x8/0x62
[ 273.441169] [<ffffffff80076bd8>] handle_percpu_devid_irq+0x98/0x1ee
[ 273.447562] [<ffffffff80071158>] generic_handle_domain_irq+0x28/0x36
[ 273.454151] [<ffffffff8047a99a>] riscv_intc_irq+0x36/0x4e
[ 273.459659] [<ffffffff80c944de>] handle_riscv_irq+0x4a/0x74
[ 273.465442] [<ffffffff80c94c48>] do_irq+0x62/0x92
[ 273.470360] Code: 0420 60a2 6402 5529 0141 8082 0013 0000 0013 0000 (6d5c) b783
[ 273.477921] ---[ end trace 0000000000000000 ]---
[ 273.482630] Kernel panic - not syncing: Fatal exception in interrupt

Signed-off-by: Fei Wu <[email protected]>
---
drivers/perf/riscv_pmu_sbi.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
index 16acd4dcdb96..335171db9647 100644
--- a/drivers/perf/riscv_pmu_sbi.c
+++ b/drivers/perf/riscv_pmu_sbi.c
@@ -731,14 +731,14 @@ static irqreturn_t pmu_sbi_ovf_handler(int irq, void *dev)
/* compute hardware counter index */
hidx = info->csr - CSR_CYCLE;
/* check if the corresponding bit is set in sscountovf */
- if (!(overflow & (1 << hidx)))
+ if (!(overflow & BIT(hidx)))
continue;

/*
* Keep a track of overflowed counters so that they can be started
* with updated initial value.
*/
- overflowed_ctrs |= 1 << lidx;
+ overflowed_ctrs |= BIT(lidx);
hw_evt = &event->hw;
riscv_pmu_event_update(event);
perf_sample_data_init(&data, 0, hw_evt->last_period);
--
2.34.1



2024-02-28 08:02:26

by Atish Kumar Patra

[permalink] [raw]
Subject: Re: [PATCH v2] perf: RISCV: Fix panic on pmu overflow handler

On 2/27/24 21:38, Fei Wu wrote:
> (1 << idx) of int is not desired when setting bits in unsigned long
> overflowed_ctrs, use BIT() instead. This panic happens when running
> 'perf record -e branches' on sophgo sg2042.
>
> [ 273.311852] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000098
> [ 273.320851] Oops [#1]
> [ 273.323179] Modules linked in:
> [ 273.326303] CPU: 0 PID: 1475 Comm: perf Not tainted 6.6.0-rc3+ #9
> [ 273.332521] Hardware name: Sophgo Mango (DT)
> [ 273.336878] epc : riscv_pmu_ctr_get_width_mask+0x8/0x62
> [ 273.342291] ra : pmu_sbi_ovf_handler+0x2e0/0x34e
> [ 273.347091] epc : ffffffff80aecd98 ra : ffffffff80aee056 sp : fffffff6e36928b0
> [ 273.354454] gp : ffffffff821f82d0 tp : ffffffd90c353200 t0 : 0000002ade4f9978
> [ 273.361815] t1 : 0000000000504d55 t2 : ffffffff8016cd8c s0 : fffffff6e3692a70
> [ 273.369180] s1 : 0000000000000020 a0 : 0000000000000000 a1 : 00001a8e81800000
> [ 273.376540] a2 : 0000003c00070198 a3 : 0000003c00db75a4 a4 : 0000000000000015
> [ 273.383901] a5 : ffffffd7ff8804b0 a6 : 0000000000000015 a7 : 000000000000002a
> [ 273.391327] s2 : 000000000000ffff s3 : 0000000000000000 s4 : ffffffd7ff8803b0
> [ 273.398773] s5 : 0000000000504d55 s6 : ffffffd905069800 s7 : ffffffff821fe210
> [ 273.406139] s8 : 000000007fffffff s9 : ffffffd7ff8803b0 s10: ffffffd903f29098
> [ 273.413660] s11: 0000000080000000 t3 : 0000000000000003 t4 : ffffffff8017a0ca
> [ 273.421022] t5 : ffffffff8023cfc2 t6 : ffffffd9040780e8
> [ 273.426437] status: 0000000200000100 badaddr: 0000000000000098 cause: 000000000000000d
> [ 273.434512] [<ffffffff80aecd98>] riscv_pmu_ctr_get_width_mask+0x8/0x62
> [ 273.441169] [<ffffffff80076bd8>] handle_percpu_devid_irq+0x98/0x1ee
> [ 273.447562] [<ffffffff80071158>] generic_handle_domain_irq+0x28/0x36
> [ 273.454151] [<ffffffff8047a99a>] riscv_intc_irq+0x36/0x4e
> [ 273.459659] [<ffffffff80c944de>] handle_riscv_irq+0x4a/0x74
> [ 273.465442] [<ffffffff80c94c48>] do_irq+0x62/0x92
> [ 273.470360] Code: 0420 60a2 6402 5529 0141 8082 0013 0000 0013 0000 (6d5c) b783
> [ 273.477921] ---[ end trace 0000000000000000 ]---
> [ 273.482630] Kernel panic - not syncing: Fatal exception in interrupt
>
> Signed-off-by: Fei Wu <[email protected]>
> ---
> drivers/perf/riscv_pmu_sbi.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
> index 16acd4dcdb96..335171db9647 100644
> --- a/drivers/perf/riscv_pmu_sbi.c
> +++ b/drivers/perf/riscv_pmu_sbi.c
> @@ -731,14 +731,14 @@ static irqreturn_t pmu_sbi_ovf_handler(int irq, void *dev)
> /* compute hardware counter index */
> hidx = info->csr - CSR_CYCLE;
> /* check if the corresponding bit is set in sscountovf */
> - if (!(overflow & (1 << hidx)))
> + if (!(overflow & BIT(hidx)))
> continue;
>
> /*
> * Keep a track of overflowed counters so that they can be started
> * with updated initial value.
> */
> - overflowed_ctrs |= 1 << lidx;
> + overflowed_ctrs |= BIT(lidx);
> hw_evt = &event->hw;
> riscv_pmu_event_update(event);
> perf_sample_data_init(&data, 0, hw_evt->last_period);

Thanks for the fix.Can you modify the other two instance of 1<< as
pointed by Alex as well. They are harmless because 32 counters supported
at max but improves the code quality.
With that:
Reviewed-by: Atish Patra <[email protected]>