2023-11-17 21:09:52

by Mario Limonciello

[permalink] [raw]
Subject: [PATCH 0/4] Extend time to wait for UIP for some callers

A number of users have reported their system will have a failure reading
the RTC around s2idle entry or exit.

This failure manifests as UIP clear taking longer than 10ms. Affected users
have reported that after this happens the clock jumps forward to 2077, which
is presumably from epoch + century bit.

Users who used a debugging patch provided by Mateusz Jończyk demonstrated
that this has taken upwards of 480ms in some cases.

This series adjusts the UIP timeout to be configurable by the caller and
changes some callers which aren't called in an interrupt context to allow
longer timeouts.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217626
Link: https://community.frame.work/t/responded-fw13-amd-fedora-39-system-clock-advances-50-years-during-overnight-suspend
Mario Limonciello (4):
rtc: mc146818-lib: Adjust failure return code for mc146818_get_time()
rtc: Adjust failure return code for cmos_set_alarm()
rtc: Add support for configuring the UIP timeout for RTC reads
rtc: Extend timeout for waiting for UIP to clear to 1s

arch/alpha/kernel/rtc.c | 2 +-
arch/x86/kernel/hpet.c | 2 +-
arch/x86/kernel/rtc.c | 2 +-
drivers/base/power/trace.c | 2 +-
drivers/rtc/rtc-cmos.c | 8 ++++----
drivers/rtc/rtc-mc146818-lib.c | 35 ++++++++++++++++++++++++----------
include/linux/mc146818rtc.h | 3 ++-
7 files changed, 35 insertions(+), 19 deletions(-)

--
2.34.1


2023-11-17 21:09:52

by Mario Limonciello

[permalink] [raw]
Subject: [PATCH 1/4] rtc: mc146818-lib: Adjust failure return code for mc146818_get_time()

mc146818_get_time() calls mc146818_avoid_UIP() to avoid fetching the
time while RTC update is in progress (UIP). When this fails, the return
code is -EIO, but actually there was no IO failure.

The reason for the return from mc146818_avoid_UIP() is that the UIP
wasn't cleared in the time period. Adjust the return code to -ETIMEDOUT
to match the behavior.

Signed-off-by: Mario Limonciello <[email protected]>
---
drivers/rtc/rtc-mc146818-lib.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/rtc/rtc-mc146818-lib.c b/drivers/rtc/rtc-mc146818-lib.c
index f1c09f1db044..43a28e82674e 100644
--- a/drivers/rtc/rtc-mc146818-lib.c
+++ b/drivers/rtc/rtc-mc146818-lib.c
@@ -138,7 +138,7 @@ int mc146818_get_time(struct rtc_time *time)

if (!mc146818_avoid_UIP(mc146818_get_time_callback, &p)) {
memset(time, 0, sizeof(*time));
- return -EIO;
+ return -ETIMEDOUT;
}

if (!(p.ctrl & RTC_DM_BINARY) || RTC_ALWAYS_BCD)
--
2.34.1

2023-11-17 21:09:56

by Mario Limonciello

[permalink] [raw]
Subject: [PATCH 4/4] rtc: Extend timeout for waiting for UIP to clear to 1s

Specs don't say anything about UIP being cleared within 10ms. They
only say that UIP won't occur for another 244uS. If a long NMI occurs
while UIP is still updating it might not be possible to get valid
data in 10ms.

This has been observed in the wild that around s2idle some calls can
take up to 480ms before UIP is clear.

Adjust callers from outside an interrupt context to wait for up to a
1s instead of 10ms.

Reported-by: [email protected]
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217626
Signed-off-by: Mario Limonciello <[email protected]>
---
arch/x86/kernel/rtc.c | 2 +-
drivers/base/power/trace.c | 2 +-
drivers/rtc/rtc-cmos.c | 2 +-
drivers/rtc/rtc-mc146818-lib.c | 2 +-
4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/rtc.c b/arch/x86/kernel/rtc.c
index 961ebc7f1872..2e7066980f3e 100644
--- a/arch/x86/kernel/rtc.c
+++ b/arch/x86/kernel/rtc.c
@@ -67,7 +67,7 @@ void mach_get_cmos_time(struct timespec64 *now)
return;
}

- if (mc146818_get_time(&tm, 10)) {
+ if (mc146818_get_time(&tm, 1000)) {
pr_err("Unable to read current time from RTC\n");
now->tv_sec = now->tv_nsec = 0;
return;
diff --git a/drivers/base/power/trace.c b/drivers/base/power/trace.c
index c2e925357474..cd6e559648b2 100644
--- a/drivers/base/power/trace.c
+++ b/drivers/base/power/trace.c
@@ -120,7 +120,7 @@ static unsigned int read_magic_time(void)
struct rtc_time time;
unsigned int val;

- if (mc146818_get_time(&time, 10) < 0) {
+ if (mc146818_get_time(&time, 1000) < 0) {
pr_err("Unable to read current time from RTC\n");
return 0;
}
diff --git a/drivers/rtc/rtc-cmos.c b/drivers/rtc/rtc-cmos.c
index 16dcbd196ce2..345897dadb91 100644
--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -231,7 +231,7 @@ static int cmos_read_time(struct device *dev, struct rtc_time *t)
if (!pm_trace_rtc_valid())
return -EIO;

- ret = mc146818_get_time(t, 10);
+ ret = mc146818_get_time(t, 1000);
if (ret < 0) {
dev_err_ratelimited(dev, "unable to read current time\n");
return ret;
diff --git a/drivers/rtc/rtc-mc146818-lib.c b/drivers/rtc/rtc-mc146818-lib.c
index 75b2fc1791e6..c6f5db6521dd 100644
--- a/drivers/rtc/rtc-mc146818-lib.c
+++ b/drivers/rtc/rtc-mc146818-lib.c
@@ -87,7 +87,7 @@ EXPORT_SYMBOL_GPL(mc146818_avoid_UIP);
*/
bool mc146818_does_rtc_work(void)
{
- return mc146818_avoid_UIP(NULL, 10, NULL);
+ return mc146818_avoid_UIP(NULL, 1000, NULL);
}
EXPORT_SYMBOL_GPL(mc146818_does_rtc_work);

--
2.34.1

2023-11-17 21:10:31

by Mario Limonciello

[permalink] [raw]
Subject: [PATCH 2/4] rtc: Adjust failure return code for cmos_set_alarm()

When mc146818_avoid_UIP() fails to return a valid value, this is because
UIP didn't clear in the timeout period. Adjust the return code in this
case to -ETIMEDOUT.

Signed-off-by: Mario Limonciello <[email protected]>
---
drivers/rtc/rtc-cmos.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/rtc/rtc-cmos.c b/drivers/rtc/rtc-cmos.c
index 228fb2d11c70..b39890a5531e 100644
--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -557,7 +557,7 @@ static int cmos_set_alarm(struct device *dev, struct rtc_wkalrm *t)
* Use mc146818_avoid_UIP() to avoid this.
*/
if (!mc146818_avoid_UIP(cmos_set_alarm_callback, &p))
- return -EIO;
+ return -ETIMEDOUT;

cmos->alarm_expires = rtc_tm_to_time64(&t->time);

--
2.34.1

2023-11-17 21:11:49

by Mario Limonciello

[permalink] [raw]
Subject: [PATCH 3/4] rtc: Add support for configuring the UIP timeout for RTC reads

The UIP timeout is hardcoded to 10ms for all RTC reads, but in some
contexts this might not be enough time. Add a timeout parameter to
mc146818_get_time() and mc146818_get_time_callback().
Make all callers use 10ms to ensure no functional changes.

Signed-off-by: Mario Limonciello <[email protected]>
---
arch/alpha/kernel/rtc.c | 2 +-
arch/x86/kernel/hpet.c | 2 +-
arch/x86/kernel/rtc.c | 2 +-
drivers/base/power/trace.c | 2 +-
drivers/rtc/rtc-cmos.c | 6 +++---
drivers/rtc/rtc-mc146818-lib.c | 31 +++++++++++++++++++++++--------
include/linux/mc146818rtc.h | 3 ++-
7 files changed, 32 insertions(+), 16 deletions(-)

diff --git a/arch/alpha/kernel/rtc.c b/arch/alpha/kernel/rtc.c
index fb3025396ac9..cfdf90bc8b3f 100644
--- a/arch/alpha/kernel/rtc.c
+++ b/arch/alpha/kernel/rtc.c
@@ -80,7 +80,7 @@ init_rtc_epoch(void)
static int
alpha_rtc_read_time(struct device *dev, struct rtc_time *tm)
{
- int ret = mc146818_get_time(tm);
+ int ret = mc146818_get_time(tm, 10);

if (ret < 0) {
dev_err_ratelimited(dev, "unable to read current time\n");
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index 41eecf180b7f..17adad4cbe78 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -1438,7 +1438,7 @@ irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id)
memset(&curr_time, 0, sizeof(struct rtc_time));

if (hpet_rtc_flags & (RTC_UIE | RTC_AIE)) {
- if (unlikely(mc146818_get_time(&curr_time) < 0)) {
+ if (unlikely(mc146818_get_time(&curr_time, 10) < 0)) {
pr_err_ratelimited("unable to read current time from RTC\n");
return IRQ_HANDLED;
}
diff --git a/arch/x86/kernel/rtc.c b/arch/x86/kernel/rtc.c
index 1309b9b05338..961ebc7f1872 100644
--- a/arch/x86/kernel/rtc.c
+++ b/arch/x86/kernel/rtc.c
@@ -67,7 +67,7 @@ void mach_get_cmos_time(struct timespec64 *now)
return;
}

- if (mc146818_get_time(&tm)) {
+ if (mc146818_get_time(&tm, 10)) {
pr_err("Unable to read current time from RTC\n");
now->tv_sec = now->tv_nsec = 0;
return;
diff --git a/drivers/base/power/trace.c b/drivers/base/power/trace.c
index 72b7a92337b1..c2e925357474 100644
--- a/drivers/base/power/trace.c
+++ b/drivers/base/power/trace.c
@@ -120,7 +120,7 @@ static unsigned int read_magic_time(void)
struct rtc_time time;
unsigned int val;

- if (mc146818_get_time(&time) < 0) {
+ if (mc146818_get_time(&time, 10) < 0) {
pr_err("Unable to read current time from RTC\n");
return 0;
}
diff --git a/drivers/rtc/rtc-cmos.c b/drivers/rtc/rtc-cmos.c
index b39890a5531e..16dcbd196ce2 100644
--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -231,7 +231,7 @@ static int cmos_read_time(struct device *dev, struct rtc_time *t)
if (!pm_trace_rtc_valid())
return -EIO;

- ret = mc146818_get_time(t);
+ ret = mc146818_get_time(t, 10);
if (ret < 0) {
dev_err_ratelimited(dev, "unable to read current time\n");
return ret;
@@ -307,7 +307,7 @@ static int cmos_read_alarm(struct device *dev, struct rtc_wkalrm *t)
*
* Use the mc146818_avoid_UIP() function to avoid this.
*/
- if (!mc146818_avoid_UIP(cmos_read_alarm_callback, &p))
+ if (!mc146818_avoid_UIP(cmos_read_alarm_callback, 10, &p))
return -EIO;

if (!(p.rtc_control & RTC_DM_BINARY) || RTC_ALWAYS_BCD) {
@@ -556,7 +556,7 @@ static int cmos_set_alarm(struct device *dev, struct rtc_wkalrm *t)
*
* Use mc146818_avoid_UIP() to avoid this.
*/
- if (!mc146818_avoid_UIP(cmos_set_alarm_callback, &p))
+ if (!mc146818_avoid_UIP(cmos_set_alarm_callback, 10, &p))
return -ETIMEDOUT;

cmos->alarm_expires = rtc_tm_to_time64(&t->time);
diff --git a/drivers/rtc/rtc-mc146818-lib.c b/drivers/rtc/rtc-mc146818-lib.c
index 43a28e82674e..75b2fc1791e6 100644
--- a/drivers/rtc/rtc-mc146818-lib.c
+++ b/drivers/rtc/rtc-mc146818-lib.c
@@ -8,26 +8,29 @@
#include <linux/acpi.h>
#endif

+#define UIP_RECHECK_DELAY 100 /* usec */
+
/*
* Execute a function while the UIP (Update-in-progress) bit of the RTC is
- * unset.
+ * unset. The timeout is configurable by the caller in ms.
*
* Warning: callback may be executed more then once.
*/
bool mc146818_avoid_UIP(void (*callback)(unsigned char seconds, void *param),
+ int timeout,
void *param)
{
int i;
unsigned long flags;
unsigned char seconds;

- for (i = 0; i < 100; i++) {
+ for (i = 0; i < USEC_PER_MSEC / UIP_RECHECK_DELAY * timeout; i++) {
spin_lock_irqsave(&rtc_lock, flags);

/*
* Check whether there is an update in progress during which the
* readout is unspecified. The maximum update time is ~2ms. Poll
- * every 100 usec for completion.
+ * for completion.
*
* Store the second value before checking UIP so a long lasting
* NMI which happens to hit after the UIP check cannot make
@@ -37,7 +40,7 @@ bool mc146818_avoid_UIP(void (*callback)(unsigned char seconds, void *param),

if (CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP) {
spin_unlock_irqrestore(&rtc_lock, flags);
- udelay(100);
+ udelay(UIP_RECHECK_DELAY);
continue;
}

@@ -56,7 +59,7 @@ bool mc146818_avoid_UIP(void (*callback)(unsigned char seconds, void *param),
*/
if (CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP) {
spin_unlock_irqrestore(&rtc_lock, flags);
- udelay(100);
+ udelay(UIP_RECHECK_DELAY);
continue;
}

@@ -84,7 +87,7 @@ EXPORT_SYMBOL_GPL(mc146818_avoid_UIP);
*/
bool mc146818_does_rtc_work(void)
{
- return mc146818_avoid_UIP(NULL, NULL);
+ return mc146818_avoid_UIP(NULL, 10, NULL);
}
EXPORT_SYMBOL_GPL(mc146818_does_rtc_work);

@@ -130,13 +133,25 @@ static void mc146818_get_time_callback(unsigned char seconds, void *param_in)
p->ctrl = CMOS_READ(RTC_CONTROL);
}

-int mc146818_get_time(struct rtc_time *time)
+/**
+ * mc146818_get_time - Get the current time from the RTC
+ * @time: pointer to struct rtc_time to store the current time
+ * @timeout: timeout value in ms
+ *
+ * This function reads the current time from the RTC and stores it in the
+ * provided struct rtc_time. The timeout parameter specifies the maximum
+ * time to wait for the RTC to become ready.
+ *
+ * Return: 0 on success, -ETIMEDOUT if the RTC did not become ready within
+ * the specified timeout, or another error code if an error occurred.
+ */
+int mc146818_get_time(struct rtc_time *time, int timeout)
{
struct mc146818_get_time_callback_param p = {
.time = time
};

- if (!mc146818_avoid_UIP(mc146818_get_time_callback, &p)) {
+ if (!mc146818_avoid_UIP(mc146818_get_time_callback, timeout, &p)) {
memset(time, 0, sizeof(*time));
return -ETIMEDOUT;
}
diff --git a/include/linux/mc146818rtc.h b/include/linux/mc146818rtc.h
index b0da04fe087b..34dfcc77f505 100644
--- a/include/linux/mc146818rtc.h
+++ b/include/linux/mc146818rtc.h
@@ -126,10 +126,11 @@ struct cmos_rtc_board_info {
#endif /* ARCH_RTC_LOCATION */

bool mc146818_does_rtc_work(void);
-int mc146818_get_time(struct rtc_time *time);
+int mc146818_get_time(struct rtc_time *time, int timeout);
int mc146818_set_time(struct rtc_time *time);

bool mc146818_avoid_UIP(void (*callback)(unsigned char seconds, void *param),
+ int timeout,
void *param);

#endif /* _MC146818RTC_H */
--
2.34.1

2023-11-18 19:02:31

by Mateusz Jończyk

[permalink] [raw]
Subject: Re: [PATCH 2/4] rtc: Adjust failure return code for cmos_set_alarm()

W dniu 17.11.2023 o 07:32, Mario Limonciello pisze:
> When mc146818_avoid_UIP() fails to return a valid value, this is because
> UIP didn't clear in the timeout period. Adjust the return code in this
> case to -ETIMEDOUT.
>
Hello,

Thank you for posting this good patch series.

Why don't you CC stable it?

Fixes: cdedc45c579fa ("rtc: cmos: avoid UIP when reading alarm time")
Fixes: cd17420ebea58 ("rtc: cmos: avoid UIP when writing alarm time")

> Signed-off-by: Mario Limonciello <[email protected]>
> ---
> drivers/rtc/rtc-cmos.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/rtc/rtc-cmos.c b/drivers/rtc/rtc-cmos.c
> index 228fb2d11c70..b39890a5531e 100644
> --- a/drivers/rtc/rtc-cmos.c
> +++ b/drivers/rtc/rtc-cmos.c
> @@ -557,7 +557,7 @@ static int cmos_set_alarm(struct device *dev, struct rtc_wkalrm *t)
> * Use mc146818_avoid_UIP() to avoid this.
> */
> if (!mc146818_avoid_UIP(cmos_set_alarm_callback, &p))
> - return -EIO;
> + return -ETIMEDOUT;
>
> cmos->alarm_expires = rtc_tm_to_time64(&t->time);

This should be changed also in cmos_read_alarm().

Greetings,

Mateusz

2023-11-18 19:07:47

by Mateusz Jończyk

[permalink] [raw]
Subject: Re: [PATCH 3/4] rtc: Add support for configuring the UIP timeout for RTC reads

Hello,

W dniu 17.11.2023 o 07:32, Mario Limonciello pisze:
> The UIP timeout is hardcoded to 10ms for all RTC reads, but in some
> contexts this might not be enough time. Add a timeout parameter to
> mc146818_get_time() and mc146818_get_time_callback().
> Make all callers use 10ms to ensure no functional changes.
>
> Signed-off-by: Mario Limonciello <[email protected]>

Fixes: ec5895c0f2d8 ("rtc: mc146818-lib: extract mc146818_avoid_UIP")

If you would like to Cc: stable this patch,

commit d2a632a8a117 ("rtc: mc146818-lib: reduce RTC_UIP polling period")

is a prerequisite.

> ---
> arch/alpha/kernel/rtc.c | 2 +-
> arch/x86/kernel/hpet.c | 2 +-
> arch/x86/kernel/rtc.c | 2 +-
> drivers/base/power/trace.c | 2 +-
> drivers/rtc/rtc-cmos.c | 6 +++---
> drivers/rtc/rtc-mc146818-lib.c | 31 +++++++++++++++++++++++--------
> include/linux/mc146818rtc.h | 3 ++-
> 7 files changed, 32 insertions(+), 16 deletions(-)
>
[snip]
> --- a/drivers/rtc/rtc-mc146818-lib.c
> +++ b/drivers/rtc/rtc-mc146818-lib.c
> @@ -8,26 +8,29 @@
> #include <linux/acpi.h>
> #endif
>
> +#define UIP_RECHECK_DELAY 100 /* usec */
> +
> /*
> * Execute a function while the UIP (Update-in-progress) bit of the RTC is
> - * unset.
> + * unset. The timeout is configurable by the caller in ms.
> *
> * Warning: callback may be executed more then once.
> */
> bool mc146818_avoid_UIP(void (*callback)(unsigned char seconds, void *param),
> + int timeout,
> void *param)
> {
> int i;
> unsigned long flags;
> unsigned char seconds;
>
> - for (i = 0; i < 100; i++) {
> + for (i = 0; i < USEC_PER_MSEC / UIP_RECHECK_DELAY * timeout; i++) {
> spin_lock_irqsave(&rtc_lock, flags);
>
> /*
> * Check whether there is an update in progress during which the
> * readout is unspecified. The maximum update time is ~2ms. Poll
> - * every 100 usec for completion.
> + * for completion.
> *
> * Store the second value before checking UIP so a long lasting
> * NMI which happens to hit after the UIP check cannot make
> @@ -37,7 +40,7 @@ bool mc146818_avoid_UIP(void (*callback)(unsigned char seconds, void *param),
>
> if (CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP) {
> spin_unlock_irqrestore(&rtc_lock, flags);
> - udelay(100);
> + udelay(UIP_RECHECK_DELAY);
> continue;
> }
>
> @@ -56,7 +59,7 @@ bool mc146818_avoid_UIP(void (*callback)(unsigned char seconds, void *param),
> */
> if (CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP) {
> spin_unlock_irqrestore(&rtc_lock, flags);
> - udelay(100);
> + udelay(UIP_RECHECK_DELAY);
> continue;
> }
>

I think that when reading the RTC is not finished in 100ms or so
(irrespective of the timeout parameter), the code should log
a warning / an error message.

Greetings,

Mateusz

2023-11-18 19:26:27

by Mateusz Jończyk

[permalink] [raw]
Subject: Re: [PATCH 4/4] rtc: Extend timeout for waiting for UIP to clear to 1s

W dniu 17.11.2023 o 07:32, Mario Limonciello pisze:
> Specs don't say anything about UIP being cleared within 10ms. They
> only say that UIP won't occur for another 244uS. If a long NMI occurs
> while UIP is still updating it might not be possible to get valid
> data in 10ms.
>
> This has been observed in the wild that around s2idle some calls can
> take up to 480ms before UIP is clear.
>
> Adjust callers from outside an interrupt context to wait for up to a
> 1s instead of 10ms.
>
> Reported-by: [email protected]
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217626
> Signed-off-by: Mario Limonciello <[email protected]>
> ---
> arch/x86/kernel/rtc.c | 2 +-
> drivers/base/power/trace.c | 2 +-
> drivers/rtc/rtc-cmos.c | 2 +-
> drivers/rtc/rtc-mc146818-lib.c | 2 +-
> 4 files changed, 4 insertions(+), 4 deletions(-)

Consider:

Fixes: ec5895c0f2d8 ("rtc: mc146818-lib: extract mc146818_avoid_UIP")

If you would like to Cc: stable this patch,

commit d2a632a8a117 ("rtc: mc146818-lib: reduce RTC_UIP polling period")

is a prerequisite.

Greetings,

Mateusz