2022-04-24 13:21:43

by Maciej W. Rozycki

[permalink] [raw]
Subject: [PATCH] MIPS: Fix CP0 counter erratum detection for R4k CPUs

Fix the discrepancy between the two places we check for the CP0 counter
erratum in along with the incorrect comparison of the R4400 revision
number against 0x30 which matches none and consistently consider all
R4000 and R4400 processors affected, as documented in processor errata
publications[1][2][3], following the mapping between CP0 PRId register
values and processor models:

PRId | Processor Model
---------+--------------------
00000422 | R4000 Revision 2.2
00000430 | R4000 Revision 3.0
00000440 | R4400 Revision 1.0
00000450 | R4400 Revision 2.0
00000460 | R4400 Revision 3.0

No other revision of either processor has ever been spotted.

Contrary to what has been stated in commit ce202cbb9e0b ("[MIPS] Assume
R4000/R4400 newer than 3.0 don't have the mfc0 count bug") marking the
CP0 counter as buggy does not preclude it from being used as either a
clock event or a clock source device. It just cannot be used as both at
a time, because in that case clock event interrupts will be occasionally
lost, and the use as a clock event device takes precedence.

Compare against 0x4ff in `can_use_mips_counter' so that a single machine
instruction is produced.

References:

[1] "MIPS R4000PC/SC Errata, Processor Revision 2.2 and 3.0", MIPS
Technologies Inc., May 10, 1994, Erratum 53, p.13

[2] "MIPS R4400PC/SC Errata, Processor Revision 1.0", MIPS Technologies
Inc., February 9, 1994, Erratum 21, p.4

[3] "MIPS R4400PC/SC Errata, Processor Revision 2.0 & 3.0", MIPS
Technologies Inc., January 24, 1995, Erratum 14, p.3

Signed-off-by: Maciej W. Rozycki <[email protected]>
Fixes: ce202cbb9e0b ("[MIPS] Assume R4000/R4400 newer than 3.0 don't have the mfc0 count bug")
Cc: [email protected] # v2.6.24+
---
Thomas,

Please review the requirements for SNI platforms. In the case of an
erratic CP0 timer we give precedence to the use as a clock event rather
than clock source device; see `time_init' in arch/mips/kernel/time.c.
Therefore if SNI systems have no alternative timer interrupt source, then
the CP0 timer is supposed to still do regardless of the erratum.

Conversely a system can do without a high-precision clock source, in
which case jiffies will be used. Of course such a system will suffer if
used for precision timekeeping, but such is the price for broken hardware.
Don't SNI systems have any alternative timer available, not even the
venerable 8254?

Long-term I think this code should be factored out and rewritten so that
it lives in one place and can take advantage of compile-time constants,
which will be the case for the majority of platforms. For the time being
fix the immediate breakage however.

With the considerations above in mind, please apply.

Maciej
---
arch/mips/include/asm/timex.h | 8 ++++----
arch/mips/kernel/time.c | 11 +++--------
2 files changed, 7 insertions(+), 12 deletions(-)

linux-mips-r4k-count-erratum-prid.diff
Index: linux-macro/arch/mips/include/asm/timex.h
===================================================================
--- linux-macro.orig/arch/mips/include/asm/timex.h
+++ linux-macro/arch/mips/include/asm/timex.h
@@ -40,9 +40,9 @@
typedef unsigned int cycles_t;

/*
- * On R4000/R4400 before version 5.0 an erratum exists such that if the
- * cycle counter is read in the exact moment that it is matching the
- * compare register, no interrupt will be generated.
+ * On R4000/R4400 an erratum exists such that if the cycle counter is
+ * read in the exact moment that it is matching the compare register,
+ * no interrupt will be generated.
*
* There is a suggested workaround and also the erratum can't strike if
* the compare interrupt isn't being used as the clock source device.
@@ -63,7 +63,7 @@ static inline int can_use_mips_counter(u
if (!__builtin_constant_p(cpu_has_counter))
asm volatile("" : "=m" (cpu_data[0].options));
if (likely(cpu_has_counter &&
- prid >= (PRID_IMP_R4000 | PRID_REV_ENCODE_44(5, 0))))
+ prid > (PRID_IMP_R4000 | PRID_REV_ENCODE_44(15, 15))))
return 1;
else
return 0;
Index: linux-macro/arch/mips/kernel/time.c
===================================================================
--- linux-macro.orig/arch/mips/kernel/time.c
+++ linux-macro/arch/mips/kernel/time.c
@@ -141,15 +141,10 @@ static __init int cpu_has_mfc0_count_bug
case CPU_R4400MC:
/*
* The published errata for the R4400 up to 3.0 say the CPU
- * has the mfc0 from count bug.
- */
- if ((current_cpu_data.processor_id & 0xff) <= 0x30)
- return 1;
-
- /*
- * we assume newer revisions are ok
+ * has the mfc0 from count bug. This seems the last version
+ * produced.
*/
- return 0;
+ return 1;
}

return 0;


2022-04-29 13:20:27

by Philippe Mathieu-Daudé

[permalink] [raw]
Subject: Re: [PATCH] MIPS: Fix CP0 counter erratum detection for R4k CPUs

On Mon, Apr 25, 2022 at 9:39 AM Maciej W. Rozycki <[email protected]> wrote:
>
> Fix the discrepancy between the two places we check for the CP0 counter
> erratum in along with the incorrect comparison of the R4400 revision
> number against 0x30 which matches none and consistently consider all
> R4000 and R4400 processors affected, as documented in processor errata
> publications[1][2][3], following the mapping between CP0 PRId register
> values and processor models:
>
> PRId | Processor Model
> ---------+--------------------
> 00000422 | R4000 Revision 2.2
> 00000430 | R4000 Revision 3.0
> 00000440 | R4400 Revision 1.0
> 00000450 | R4400 Revision 2.0
> 00000460 | R4400 Revision 3.0
>
> No other revision of either processor has ever been spotted.
>
> Contrary to what has been stated in commit ce202cbb9e0b ("[MIPS] Assume
> R4000/R4400 newer than 3.0 don't have the mfc0 count bug") marking the
> CP0 counter as buggy does not preclude it from being used as either a
> clock event or a clock source device. It just cannot be used as both at
> a time, because in that case clock event interrupts will be occasionally
> lost, and the use as a clock event device takes precedence.
>
> Compare against 0x4ff in `can_use_mips_counter' so that a single machine
> instruction is produced.
>
> References:
>
> [1] "MIPS R4000PC/SC Errata, Processor Revision 2.2 and 3.0", MIPS
> Technologies Inc., May 10, 1994, Erratum 53, p.13
>
> [2] "MIPS R4400PC/SC Errata, Processor Revision 1.0", MIPS Technologies
> Inc., February 9, 1994, Erratum 21, p.4
>
> [3] "MIPS R4400PC/SC Errata, Processor Revision 2.0 & 3.0", MIPS
> Technologies Inc., January 24, 1995, Erratum 14, p.3
>
> Signed-off-by: Maciej W. Rozycki <[email protected]>
> Fixes: ce202cbb9e0b ("[MIPS] Assume R4000/R4400 newer than 3.0 don't have the mfc0 count bug")
> Cc: [email protected] # v2.6.24+
> ---
> arch/mips/include/asm/timex.h | 8 ++++----
> arch/mips/kernel/time.c | 11 +++--------
> 2 files changed, 7 insertions(+), 12 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <[email protected]>

2022-04-29 14:42:48

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [PATCH] MIPS: Fix CP0 counter erratum detection for R4k CPUs

On Fri, 29 Apr 2022, Thomas Bogendoerfer wrote:

> > Fix the discrepancy between the two places we check for the CP0 counter
> > erratum in along with the incorrect comparison of the R4400 revision
> > number against 0x30 which matches none and consistently consider all
> > R4000 and R4400 processors affected, as documented in processor errata
> > publications[1][2][3], following the mapping between CP0 PRId register
> > values and processor models:
> >
> > PRId | Processor Model
> > ---------+--------------------
> > 00000422 | R4000 Revision 2.2
> > 00000430 | R4000 Revision 3.0
> > 00000440 | R4400 Revision 1.0
> > 00000450 | R4400 Revision 2.0
> > 00000460 | R4400 Revision 3.0
>
> interesting, where is this documented ? And it's quite funny that so far
> everybody messed up revision printing for R4400 CPUs.

That's just observation combined with past discussions with Ralf.

Basically the PRId implementation number is 0x04 for both the R4000 and
the R4400 (the only difference between the two CPUs is the addition of the
write-back buffer in the latter one, making it weakly ordered). And then
the PRId revision number matches exactly the documented CPU revision for
the R4000, while for the R4400 you need to subtract 3 from the PRId
revision number to get the documented CPU revision (i.e. what would be
R4000 Revision 4.0 actually became R4400 Revision 1.0).

At this time no old MIPSer from the SGI days may be around to confirm or
contradict this observation. Documentation explicitly says[1]:

"The revision number can distinguish some chip revisions, however there is
no guarantee that changes to the chip will necessarily be reflected in the
PRId register, or that changes to the revision number necessarily reflect
real chip changes. For this reason, these values are not listed and
software should not rely on the revision number in the PRId register to
characterize the chip."

but surely the author didn't have errata workarounds in mind plus all CPU
revisions have already been manufactured so the mapping has been fixed.

> > Conversely a system can do without a high-precision clock source, in
> > which case jiffies will be used. Of course such a system will suffer if
> > used for precision timekeeping, but such is the price for broken hardware.
> > Don't SNI systems have any alternative timer available, not even the
> > venerable 8254?
>
> all SNI systems have a i8254 in their EISA/PCI chipsets. But they aren't
> that nice for clock events as their interupts are connected via an i8259
> addresses via ISA PIO.

Interrupts are used for clock events, so you don't need one here. For
clock sources you just read the counter.

That said reading from the 8254 is messy too and you may need a spinlock
(you need to write the Counter Latch or Read-Back command to the control
register and then issue consecutive two reads to the requested timer's
data register[2]). Which is I guess why support for it has been removed
from x86 code. For non-SMP it might be good enough.

> > With the considerations above in mind, please apply.
>
> will do later.

Great, thanks!

References:

[1] Joe Heinrich: "MIPS R4000 Microprocessor User's Manual", Second
Edition, MIPS Technologies, Inc., April 1, 1994, Chapter 4 "Memory
Management", p.90

[2] "8254 Programmable Interval Timer", Intel Corporation, Order Number:
231164-005, September 1993, Section "Read Operations", pp.7-9

Maciej

2022-04-29 23:57:30

by Thomas Bogendoerfer

[permalink] [raw]
Subject: Re: [PATCH] MIPS: Fix CP0 counter erratum detection for R4k CPUs

On Sun, Apr 24, 2022 at 12:46:23PM +0100, Maciej W. Rozycki wrote:
> Fix the discrepancy between the two places we check for the CP0 counter
> erratum in along with the incorrect comparison of the R4400 revision
> number against 0x30 which matches none and consistently consider all
> R4000 and R4400 processors affected, as documented in processor errata
> publications[1][2][3], following the mapping between CP0 PRId register
> values and processor models:
>
> PRId | Processor Model
> ---------+--------------------
> 00000422 | R4000 Revision 2.2
> 00000430 | R4000 Revision 3.0
> 00000440 | R4400 Revision 1.0
> 00000450 | R4400 Revision 2.0
> 00000460 | R4400 Revision 3.0

interesting, where is this documented ? And it's quite funny that so far
everybody messed up revision printing for R4400 CPUs.

> Please review the requirements for SNI platforms. In the case of an
> erratic CP0 timer we give precedence to the use as a clock event rather
> than clock source device; see `time_init' in arch/mips/kernel/time.c.
> Therefore if SNI systems have no alternative timer interrupt source, then
> the CP0 timer is supposed to still do regardless of the erratum.
>
> Conversely a system can do without a high-precision clock source, in
> which case jiffies will be used. Of course such a system will suffer if
> used for precision timekeeping, but such is the price for broken hardware.
> Don't SNI systems have any alternative timer available, not even the
> venerable 8254?

all SNI systems have a i8254 in their EISA/PCI chipsets. But they aren't
that nice for clock events as their interupts are connected via an i8259
addresses via ISA PIO.

> With the considerations above in mind, please apply.

will do later.

Thomas.

--
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea. [ RFC1925, 2.3 ]

2022-05-02 03:35:30

by Thomas Bogendoerfer

[permalink] [raw]
Subject: Re: [PATCH] MIPS: Fix CP0 counter erratum detection for R4k CPUs

On Sun, Apr 24, 2022 at 12:46:23PM +0100, Maciej W. Rozycki wrote:
> Fix the discrepancy between the two places we check for the CP0 counter
> erratum in along with the incorrect comparison of the R4400 revision
> number against 0x30 which matches none and consistently consider all
> R4000 and R4400 processors affected, as documented in processor errata
> publications[1][2][3], following the mapping between CP0 PRId register
> values and processor models:
>
> PRId | Processor Model
> ---------+--------------------
> 00000422 | R4000 Revision 2.2
> 00000430 | R4000 Revision 3.0
> 00000440 | R4400 Revision 1.0
> 00000450 | R4400 Revision 2.0
> 00000460 | R4400 Revision 3.0
>
> No other revision of either processor has ever been spotted.
>
> Contrary to what has been stated in commit ce202cbb9e0b ("[MIPS] Assume
> R4000/R4400 newer than 3.0 don't have the mfc0 count bug") marking the
> CP0 counter as buggy does not preclude it from being used as either a
> clock event or a clock source device. It just cannot be used as both at
> a time, because in that case clock event interrupts will be occasionally
> lost, and the use as a clock event device takes precedence.
>
> Compare against 0x4ff in `can_use_mips_counter' so that a single machine
> instruction is produced.
>
> References:
>
> [1] "MIPS R4000PC/SC Errata, Processor Revision 2.2 and 3.0", MIPS
> Technologies Inc., May 10, 1994, Erratum 53, p.13
>
> [2] "MIPS R4400PC/SC Errata, Processor Revision 1.0", MIPS Technologies
> Inc., February 9, 1994, Erratum 21, p.4
>
> [3] "MIPS R4400PC/SC Errata, Processor Revision 2.0 & 3.0", MIPS
> Technologies Inc., January 24, 1995, Erratum 14, p.3
>
> Signed-off-by: Maciej W. Rozycki <[email protected]>
> Fixes: ce202cbb9e0b ("[MIPS] Assume R4000/R4400 newer than 3.0 don't have the mfc0 count bug")
> Cc: [email protected] # v2.6.24+
> ---
> Thomas,
>
> Please review the requirements for SNI platforms. In the case of an
> erratic CP0 timer we give precedence to the use as a clock event rather
> than clock source device; see `time_init' in arch/mips/kernel/time.c.
> Therefore if SNI systems have no alternative timer interrupt source, then
> the CP0 timer is supposed to still do regardless of the erratum.
>
> Conversely a system can do without a high-precision clock source, in
> which case jiffies will be used. Of course such a system will suffer if
> used for precision timekeeping, but such is the price for broken hardware.
> Don't SNI systems have any alternative timer available, not even the
> venerable 8254?
>
> Long-term I think this code should be factored out and rewritten so that
> it lives in one place and can take advantage of compile-time constants,
> which will be the case for the majority of platforms. For the time being
> fix the immediate breakage however.
>
> With the considerations above in mind, please apply.
>
> Maciej
> ---
> arch/mips/include/asm/timex.h | 8 ++++----
> arch/mips/kernel/time.c | 11 +++--------
> 2 files changed, 7 insertions(+), 12 deletions(-)

applied to mips-fixes.

Thomas.

--
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea. [ RFC1925, 2.3 ]

2022-05-02 23:16:44

by David Laight

[permalink] [raw]
Subject: RE: [PATCH] MIPS: Fix CP0 counter erratum detection for R4k CPUs

> That said reading from the 8254 is messy too and you may need a spinlock
> (you need to write the Counter Latch or Read-Back command to the control
> register and then issue consecutive two reads to the requested timer's
> data register[2]). Which is I guess why support for it has been removed
> from x86 code. For non-SMP it might be good enough.

It is important to 'latch' the counter before reading it.
I tried to optimise some code to avoid the extra accesses.
In principle it ought to have worked, but reading the
unlatched values gave garbage - not just messed up 16bit values.
It was probably returning a value that was being ripple-counted.

The sheer number of IO cycles you need to read the counters
just beggars belief.
So while they can be used to get an accurate timestamp it
really takes too long to be useful.
Even in a modern x86 chipset I think they are still ISA
speed cycles.
(Mind you, we've an fpga based PCIe boards where reads
are ISA speed....)

OTOH I'd have though that for a real 486 (one without a TSC)
that is the only way to get a high res timer count.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)