LinuxLists.cc - [PATCH] clocksource/drivers/arm_arch_timer: Update sched_clock when non-boot CPUs need counter workaround

2023-01-13 11:53:32

Subject: [PATCH] clocksource/drivers/arm_arch_timer: Update sched_clock when non-boot CPUs need counter workaround

When booting on a CPU that has a countertum on the counter read,
we use the arch_counter_get_cnt{v,p}ct_stable() backend which
applies the workaround.

However, we don't do the same thing when an affected CPU is
a secondary CPU, and we're stuck with the standard sched_clock()
backend that knows nothing about the workaround.

Fix it by always indirecting sched_clock(), making arch_timer_read_counter
a function instead of a function pointer. In turn, we update the
pointer (now private to the driver code) when detecting a new
workaround.

Cc: Mark Rutland <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Daniel Lezcano <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Reported-by: Yogesh Lal <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Fixes: 0ea415390cd3 ("clocksource/arm_arch_timer: Use arch_timer_read_counter to access stable counters")
Link: https://lore.kernel.org/r/[email protected]
---
drivers/clocksource/arm_arch_timer.c | 56 +++++++++++++++++-----------
include/clocksource/arm_arch_timer.h | 2 +-
2 files changed, 36 insertions(+), 22 deletions(-)

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index e09d4427f604..5272db86bef5 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -217,7 +217,12 @@ static notrace u64 arch_counter_get_cntvct(void)
* to exist on arm64. arm doesn't use this before DT is probed so even
* if we don't have the cp15 accessors we won't have a problem.
*/
-u64 (*arch_timer_read_counter)(void) __ro_after_init = arch_counter_get_cntvct;
+static u64 (*__arch_timer_read_counter)(void) __ro_after_init = arch_counter_get_cntvct;
+
+u64 arch_timer_read_counter(void)
+{
+ return __arch_timer_read_counter();
+}
EXPORT_SYMBOL_GPL(arch_timer_read_counter);

static u64 arch_counter_read(struct clocksource *cs)
@@ -230,6 +235,28 @@ static u64 arch_counter_read_cc(const struct cyclecounter *cc)
return arch_timer_read_counter();
}

+static bool arch_timer_counter_has_wa(void);
+
+static u64 (*arch_counter_get_read_fn(void))(void)
+{
+ u64 (*rd)(void);
+
+ if ((IS_ENABLED(CONFIG_ARM64) && !is_hyp_mode_available()) ||
+ arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI) {
+ if (arch_timer_counter_has_wa())
+ rd = arch_counter_get_cntvct_stable;
+ else
+ rd = arch_counter_get_cntvct;
+ } else {
+ if (arch_timer_counter_has_wa())
+ rd = arch_counter_get_cntpct_stable;
+ else
+ rd = arch_counter_get_cntpct;
+ }
+
+ return rd;
+}
+
static struct clocksource clocksource_counter = {
.name = "arch_sys_counter",
.id = CSID_ARM_ARCH_COUNTER,
@@ -571,8 +598,10 @@ void arch_timer_enable_workaround(const struct arch_timer_erratum_workaround *wa
per_cpu(timer_unstable_counter_workaround, i) = wa;
}

- if (wa->read_cntvct_el0 || wa->read_cntpct_el0)
- atomic_set(&timer_unstable_counter_workaround_in_use, 1);
+ if (wa->read_cntvct_el0 || wa->read_cntpct_el0) {
+ __arch_timer_read_counter = arch_counter_get_read_fn();
+ atomic_set_release(&timer_unstable_counter_workaround_in_use, 1);
+ }

/*
* Don't use the vdso fastpath if errata require using the
@@ -641,7 +670,7 @@ static bool arch_timer_counter_has_wa(void)
#else
#define arch_timer_check_ool_workaround(t,a) do { } while(0)
#define arch_timer_this_cpu_has_cntvct_wa() ({false;})
-#define arch_timer_counter_has_wa() ({false;})
+static bool arch_timer_counter_has_wa(void) { return false; }
#endif /* CONFIG_ARM_ARCH_TIMER_OOL_WORKAROUND */

static __always_inline irqreturn_t timer_handler(const int access,
@@ -1079,25 +1108,10 @@ static void __init arch_counter_register(unsigned type)

/* Register the CP15 based counter if we have one */
if (type & ARCH_TIMER_TYPE_CP15) {
- u64 (*rd)(void);
-
- if ((IS_ENABLED(CONFIG_ARM64) && !is_hyp_mode_available()) ||
- arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI) {
- if (arch_timer_counter_has_wa())
- rd = arch_counter_get_cntvct_stable;
- else
- rd = arch_counter_get_cntvct;
- } else {
- if (arch_timer_counter_has_wa())
- rd = arch_counter_get_cntpct_stable;
- else
- rd = arch_counter_get_cntpct;
- }
-
- arch_timer_read_counter = rd;
+ __arch_timer_read_counter = arch_counter_get_read_fn();
clocksource_counter.vdso_clock_mode = vdso_default;
} else {
- arch_timer_read_counter = arch_counter_get_cntvct_mem;
+ __arch_timer_read_counter = arch_counter_get_cntvct_mem;
}

width = arch_counter_get_width();
diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
index 057c8964aefb..ec331b65ba23 100644
--- a/include/clocksource/arm_arch_timer.h
+++ b/include/clocksource/arm_arch_timer.h
@@ -85,7 +85,7 @@ struct arch_timer_mem {
#ifdef CONFIG_ARM_ARCH_TIMER

extern u32 arch_timer_get_rate(void);
-extern u64 (*arch_timer_read_counter)(void);
+extern u64 arch_timer_read_counter(void);
extern struct arch_timer_kvm_info *arch_timer_get_kvm_info(void);
extern bool arch_timer_evtstrm_available(void);

--
2.34.1

2023-01-13 13:56:37

by Mark Rutland

[permalink] [raw]

Subject: Re: [PATCH] clocksource/drivers/arm_arch_timer: Update sched_clock when non-boot CPUs need counter workaround

Hi Marc,

On Fri, Jan 13, 2023 at 11:16:48AM +0000, Marc Zyngier wrote:
> When booting on a CPU that has a countertum on the counter read,
> we use the arch_counter_get_cnt{v,p}ct_stable() backend which
> applies the workaround.
>
> However, we don't do the same thing when an affected CPU is
> a secondary CPU, and we're stuck with the standard sched_clock()
> backend that knows nothing about the workaround.
>
> Fix it by always indirecting sched_clock(), making arch_timer_read_counter
> a function instead of a function pointer. In turn, we update the
> pointer (now private to the driver code) when detecting a new
> workaround.

Unfortunately, I don't think this is sufficient.

I'm pretty sure secondary CPUs might call sched_clock() before getting to
arch_counter_register(), so there'll be a window where this could go wrong.

If we consider late onlining on a preemptible kernel we'll also have a race at
runtime:

| sched_clock() {
| arch_timer_read_counter() {
| // reads __arch_timer_read_counter == arch_counter_get_cntvct;
|
| arch_counter_get_cntvct() {
|
| < PREEMPTED >
| < CPU requiring workaround onlined >
| < RESCHEDULED on affected CPU >
|
| MRS xN, CNTVCT_EL0 // reads junk here
| }
| }
| }

I think we need to reconsider the approach.

Since the accessor is out-of-line anyway, we could use a static key *within*
the accessor to handle that, e.g.

| u64 arch_timer_get_cntvct(void)
| {
| u64 val = read_sysreg(cntvct_el0);
| if (!static_branch_unlikely(use_timer_workaround))
| return val;
|
| // do stablisation workaround here
|
| return val;
| }

... and we'd need to transiently enable the workaround when beinging a CPU
online in case it needs the workaround. We could use the static key inc / dec
helpers from the CPU invoking the hotplug to manage that.

With that, we should never perform the first read on an affected core without
also deciding to perform the workaround.

Does that make sound plausible?

Thanks,
Mark.

> Cc: Mark Rutland <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: Daniel Lezcano <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: [email protected]
> Reported-by: Yogesh Lal <[email protected]>
> Signed-off-by: Marc Zyngier <[email protected]>
> Fixes: 0ea415390cd3 ("clocksource/arm_arch_timer: Use arch_timer_read_counter to access stable counters")
> Link: https://lore.kernel.org/r/[email protected]
> ---
> drivers/clocksource/arm_arch_timer.c | 56 +++++++++++++++++-----------
> include/clocksource/arm_arch_timer.h | 2 +-
> 2 files changed, 36 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> index e09d4427f604..5272db86bef5 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -217,7 +217,12 @@ static notrace u64 arch_counter_get_cntvct(void)
> * to exist on arm64. arm doesn't use this before DT is probed so even
> * if we don't have the cp15 accessors we won't have a problem.
> */
> -u64 (*arch_timer_read_counter)(void) __ro_after_init = arch_counter_get_cntvct;
> +static u64 (*__arch_timer_read_counter)(void) __ro_after_init = arch_counter_get_cntvct;
> +
> +u64 arch_timer_read_counter(void)
> +{
> + return __arch_timer_read_counter();

Since the function pointer can be modified concurrently, we'll need to use
return READ_ONCE(__arch_timer_read_counter)();

SInce this could change dynamically, we
> +}
> EXPORT_SYMBOL_GPL(arch_timer_read_counter);
>
> static u64 arch_counter_read(struct clocksource *cs)
> @@ -230,6 +235,28 @@ static u64 arch_counter_read_cc(const struct cyclecounter *cc)
> return arch_timer_read_counter();
> }
>
> +static bool arch_timer_counter_has_wa(void);
> +
> +static u64 (*arch_counter_get_read_fn(void))(void)
> +{
> + u64 (*rd)(void);
> +
> + if ((IS_ENABLED(CONFIG_ARM64) && !is_hyp_mode_available()) ||
> + arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI) {
> + if (arch_timer_counter_has_wa())
> + rd = arch_counter_get_cntvct_stable;
> + else
> + rd = arch_counter_get_cntvct;
> + } else {
> + if (arch_timer_counter_has_wa())
> + rd = arch_counter_get_cntpct_stable;
> + else
> + rd = arch_counter_get_cntpct;
> + }
> +
> + return rd;
> +}
> +
> static struct clocksource clocksource_counter = {
> .name = "arch_sys_counter",
> .id = CSID_ARM_ARCH_COUNTER,
> @@ -571,8 +598,10 @@ void arch_timer_enable_workaround(const struct arch_timer_erratum_workaround *wa
> per_cpu(timer_unstable_counter_workaround, i) = wa;
> }
>
> - if (wa->read_cntvct_el0 || wa->read_cntpct_el0)
> - atomic_set(&timer_unstable_counter_workaround_in_use, 1);
> + if (wa->read_cntvct_el0 || wa->read_cntpct_el0) {
> + __arch_timer_read_counter = arch_counter_get_read_fn();
> + atomic_set_release(&timer_unstable_counter_workaround_in_use, 1);
> + }
>
> /*
> * Don't use the vdso fastpath if errata require using the
> @@ -641,7 +670,7 @@ static bool arch_timer_counter_has_wa(void)
> #else
> #define arch_timer_check_ool_workaround(t,a) do { } while(0)
> #define arch_timer_this_cpu_has_cntvct_wa() ({false;})
> -#define arch_timer_counter_has_wa() ({false;})
> +static bool arch_timer_counter_has_wa(void) { return false; }
> #endif /* CONFIG_ARM_ARCH_TIMER_OOL_WORKAROUND */
>
> static __always_inline irqreturn_t timer_handler(const int access,
> @@ -1079,25 +1108,10 @@ static void __init arch_counter_register(unsigned type)
>
> /* Register the CP15 based counter if we have one */
> if (type & ARCH_TIMER_TYPE_CP15) {
> - u64 (*rd)(void);
> -
> - if ((IS_ENABLED(CONFIG_ARM64) && !is_hyp_mode_available()) ||
> - arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI) {
> - if (arch_timer_counter_has_wa())
> - rd = arch_counter_get_cntvct_stable;
> - else
> - rd = arch_counter_get_cntvct;
> - } else {
> - if (arch_timer_counter_has_wa())
> - rd = arch_counter_get_cntpct_stable;
> - else
> - rd = arch_counter_get_cntpct;
> - }
> -
> - arch_timer_read_counter = rd;
> + __arch_timer_read_counter = arch_counter_get_read_fn();
> clocksource_counter.vdso_clock_mode = vdso_default;
> } else {
> - arch_timer_read_counter = arch_counter_get_cntvct_mem;
> + __arch_timer_read_counter = arch_counter_get_cntvct_mem;
> }
>
> width = arch_counter_get_width();
> diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
> index 057c8964aefb..ec331b65ba23 100644
> --- a/include/clocksource/arm_arch_timer.h
> +++ b/include/clocksource/arm_arch_timer.h
> @@ -85,7 +85,7 @@ struct arch_timer_mem {
> #ifdef CONFIG_ARM_ARCH_TIMER
>
> extern u32 arch_timer_get_rate(void);
> -extern u64 (*arch_timer_read_counter)(void);
> +extern u64 arch_timer_read_counter(void);
> extern struct arch_timer_kvm_info *arch_timer_get_kvm_info(void);
> extern bool arch_timer_evtstrm_available(void);
>
> --
> 2.34.1
>

2023-01-19 08:58:52

by 刘琦

[permalink] [raw]

Subject: Re: [PATCH] clocksource/drivers/arm_arch_timer: Update sched_clock when non-boot CPUs need counter workaround

[Test Report]
Result: Test Pass

A total of two rounds of pending testing
a. The first round of hanging test
Number of machines: 200
Hanging test duration: 48h
Hanging test results: no walt crash problem
b. The second round of hanging test
Number of machines: 200
Hanging test duration: 72h
Hanging test results: no walt crash problem

Tested-by: wangfajie <[email protected]>
Tested-by: liurenwang <[email protected]>
Tested-by: zhanghui <[email protected]>
Tested-by: liangke <[email protected]>

2023-01-19 13:57:06

by 王法杰

[permalink] [raw]

Subject: Re: [PATCH] clocksource/drivers/arm_arch_timer: Update sched_clock when non-boot CPUs need counter workaround

Hi Marc Zyngier

We found the APPS Crash issue on MTBF test.
Brief crash information, APPS Crash - Kernel BUG at /sched/walt/sched_avg.c:281! in sched_update_nr_prod flow

[Test equipment]
1. Number of phone: 200 pcs
2. CPU info of phone: CPU architecture with Quad Cortex-A73 and Quad Cortex-A53

[Preset conditions]
1. Android 13 with kernel 5.15
2. Install application of top 20
3. Connected to Wi-Fi
4. Connect the adapter to charge the phone
5. Test duration 48 hours

[Expected results]
0 crash happened.

[Test results without [PATCH] clocksource/drivers/arm_arch_timer: Update sched_clock when non-boot CPUs need counter workaround ]
1. First round of test, found 3 phones with APPS Crash issue on /sched/walt/sched_avg.c:281! in sched_update_nr_prod flow
2. Second round of test, found 7 phones with APP Crash issue on /sched/walt/sched_avg.c:281! in sched_update_nr_prod flow

[Test results with [PATCH] clocksource/drivers/arm_arch_timer: Update sched_clock when non-boot CPUs need counter workaround ]
1. First round of test, 0 crash happened.
2. Second round of test, 0 crash happened.

Tested-by: wangfajie <[email protected]>
Tested-by: liurenwang <[email protected]>
Tested-by: zhanghui <[email protected]>
Tested-by: liangke <[email protected]>

So we think the PATCH is working and it can fix APPS crash issue. Many thanks your time.

Best regards!
Wangfajie

-----邮件原件-----
发件人: Marc Zyngier [mailto:[email protected]]
发送时间: 2023年1月19日 17:52
收件人: 刘琦 <[email protected]>
抄送: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]; [email protected]; [email protected]; 王法杰 <[email protected]>; 刘仁旺 <[email protected]>; 张辉 <[email protected]>; [email protected]
主题: Re: [PATCH] clocksource/drivers/arm_arch_timer: Update sched_clock when non-boot CPUs need counter workaround

On 2023-01-19 07:59, 刘琦 wrote:
> [Test Report]
> Result: Test Pass
>
> A total of two rounds of pending testing
> a. The first round of hanging test
> Number of machines: 200
> Hanging test duration: 48h
> Hanging test results: no walt crash problem
> b. The second round of hanging test
> Number of machines: 200
> Hanging test duration: 72h
> Hanging test results: no walt crash problem
>
> Tested-by: wangfajie <[email protected]>
> Tested-by: liurenwang <[email protected]>
> Tested-by: zhanghui <[email protected]>
> Tested-by: liangke <[email protected]>

Thanks for this.

The only issue here is that that you don't explain what you tested, nor how you tested it.

It is also a patch that has known defects (you just have to read the thread for the details)... This makes this testing, no matter how thorough it is, rather ineffective.

M.
--
Jazz is not dead. It just smells funny...

2023-01-20 05:52:12

by Marc Zyngier

[permalink] [raw]

Subject: Re: [PATCH] clocksource/drivers/arm_arch_timer: Update sched_clock when non-boot CPUs need counter workaround

On 2023-01-19 07:59, 刘琦 wrote:
> [Test Report]
> Result: Test Pass
>
> A total of two rounds of pending testing
> a. The first round of hanging test
> Number of machines: 200
> Hanging test duration: 48h
> Hanging test results: no walt crash problem
> b. The second round of hanging test
> Number of machines: 200
> Hanging test duration: 72h
> Hanging test results: no walt crash problem
>
> Tested-by: wangfajie <[email protected]>
> Tested-by: liurenwang <[email protected]>
> Tested-by: zhanghui <[email protected]>
> Tested-by: liangke <[email protected]>

Thanks for this.

The only issue here is that that you don't explain what you tested,
nor how you tested it.

It is also a patch that has known defects (you just have to read the
thread for the details)... This makes this testing, no matter how
thorough it is, rather ineffective.

M.
--
Jazz is not dead. It just smells funny...

2023-03-28 14:05:32

by Will Deacon

[permalink] [raw]