Subject: [PATCH 0/3] microblaze: Add support for TMR Subsystem

This patch series adds support for the Triple Modular Redundancy (TMR)
subsystem. The TMR MicroBlaze solution provides soft error detection,
correction and recovery for the MicroBlaze cores in the system.
The Xilinx/AMD TMR solution in Vivado provides all the necessary building
blocks to implement a redundant triplicated MicroBlaze subsystem. This
processing subsystem is fault-tolerant and continues to operate nominally
after encountering an error. Together with the capability to detect and
recover from errors, the implementation ensures the reliability of the
entire subsystem. For more details about the IP, please refer to
PG268 [1].

[1]: https://docs.xilinx.com/r/en-US/pg268-tmr/Triple-Modular-Redundancy-TMR-v1.0-LogiCORE-IP-Product-Guide-PG268

Appana Durga Kedareswara rao (3):
microblaze: Add xmb_manager_register function
microblaze: Add custom break vector handler for mb manager
microblaze: Add support for error injection

arch/microblaze/Kconfig | 10 +
.../include/asm/xilinx_mb_manager.h | 29 ++
arch/microblaze/kernel/asm-offsets.c | 7 +
arch/microblaze/kernel/entry.S | 302 +++++++++++++++++-
4 files changed, 347 insertions(+), 1 deletion(-)
create mode 100644 arch/microblaze/include/asm/xilinx_mb_manager.h

--
2.25.1


Subject: [PATCH 3/3] microblaze: Add support for error injection

To inject an error using the TMR Inject IP, the reset vectors need to be
placed in LMB (BRAM) because of a hardware limitation when this code runs
out of DDR. This patch adds the error injection code to the .init.ivt
section so that machine_early_init copies it to the LMB/BRAM location.
C_BASE_VECTORS, which allows moving the reset vectors away from address 0,
is not currently supported by the MicroBlaze architecture, which is why
the reset vectors together with the injection code are always copied to
address 0.

For now, to get this functionality working, the CPU switches to real mode
and simply jumps to BRAM. This triggers a fault, which goes on to call the
_xmb_manager_break break handler. That handler in the end calls the error
count callback function and performs recovery.
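
The call order described above can be sketched as a small userspace model.
This is not kernel code. The two prototypes match the ones in
xilinx_mb_manager.h, but their bodies here are mocks. The names struct
xmb_mock, struct demo_state, demo_err_cb, demo_reset_cb and demo_inject,
and the base address and control value, are hypothetical placeholders.

```c
/* Userspace model only; it does not touch hardware or kernel state. */
#include <stdint.h>
#include <stddef.h>

typedef uint32_t u32;

/* Hypothetical container for what xmb_manager_register() records. */
struct xmb_mock {
	uintptr_t baseaddr;
	u32 cr_val;
	void (*callback)(void *data);
	void (*reset_callback)(void *data);
	void *priv;
};

static struct xmb_mock mock_mgr;

/* Same prototype as in the patch; the body is a mock. */
void xmb_manager_register(uintptr_t phys_baseaddr, u32 cr_val,
			  void (*callback)(void *data),
			  void *priv, void (*reset_callback)(void *data))
{
	mock_mgr.baseaddr = phys_baseaddr;
	mock_mgr.cr_val = cr_val;
	mock_mgr.callback = callback;
	mock_mgr.reset_callback = reset_callback;
	mock_mgr.priv = priv;
}

/*
 * Mock of xmb_inject_err(). In the patch, the jump to 0x200 in BRAM
 * raises a break. Here the two callbacks are simply invoked in the
 * order the break handler and the reset handler would call them.
 */
void xmb_inject_err(void)
{
	if (mock_mgr.callback)
		mock_mgr.callback(mock_mgr.priv);
	if (mock_mgr.reset_callback)
		mock_mgr.reset_callback(mock_mgr.priv);
}

/* Hypothetical driver-side bookkeeping. */
struct demo_state {
	unsigned int err_count;
	unsigned int resets;
};

static void demo_err_cb(void *data)
{
	((struct demo_state *)data)->err_count++;
}

static void demo_reset_cb(void *data)
{
	((struct demo_state *)data)->resets++;
}

/* Register the callbacks, then trigger one injected error. */
void demo_inject(struct demo_state *st)
{
	/* 0x44a10000 and 0x0 are placeholder values, not from the patch. */
	xmb_manager_register((uintptr_t)0x44a10000u, 0x0u,
			     demo_err_cb, st, demo_reset_cb);
	xmb_inject_err();
}
```

In this model, one call to demo_inject() on a zeroed struct demo_state
increments both counters once. On real hardware a suspend and a reset
happen between the two callbacks.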

Signed-off-by: Appana Durga Kedareswara rao <[email protected]>
---
.../include/asm/xilinx_mb_manager.h | 8 +++
arch/microblaze/kernel/entry.S | 52 +++++++++++++++++++
2 files changed, 60 insertions(+)

diff --git a/arch/microblaze/include/asm/xilinx_mb_manager.h b/arch/microblaze/include/asm/xilinx_mb_manager.h
index 392c3aa278dc..7b6995722b0c 100644
--- a/arch/microblaze/include/asm/xilinx_mb_manager.h
+++ b/arch/microblaze/include/asm/xilinx_mb_manager.h
@@ -5,6 +5,8 @@
#ifndef _XILINX_MB_MANAGER_H
#define _XILINX_MB_MANAGER_H

+# ifndef __ASSEMBLY__
+
#include <linux/of_address.h>

/*
@@ -17,5 +19,11 @@
void xmb_manager_register(uintptr_t phys_baseaddr, u32 cr_val,
void (*callback)(void *data),
void *priv, void (*reset_callback)(void *data));
+asmlinkage void xmb_inject_err(void);
+
+# endif /* __ASSEMBLY__ */
+
+/* Error injection offset */
+#define XMB_INJECT_ERR_OFFSET 0x200

#endif /* _XILINX_MB_MANAGER_H */
diff --git a/arch/microblaze/kernel/entry.S b/arch/microblaze/kernel/entry.S
index df367bf94b26..2bb2fea70b3e 100644
--- a/arch/microblaze/kernel/entry.S
+++ b/arch/microblaze/kernel/entry.S
@@ -27,6 +27,7 @@

#include <asm/page.h>
#include <asm/unistd.h>
+#include <asm/xilinx_mb_manager.h>

#include <linux/errno.h>
#include <asm/signal.h>
@@ -1151,6 +1152,41 @@ ENTRY(_switch_to)
nop

#ifdef CONFIG_MB_MANAGER
+.global xmb_inject_err
+.section .text
+.align 2
+.ent xmb_inject_err
+.type xmb_inject_err, @function
+xmb_inject_err:
+ addik r1, r1, -PT_SIZE
+ SAVE_REGS
+
+ /* Switch to real mode */
+ VM_OFF;
+ set_bip;
+ mbar 1
+ mbar 2
+ bralid r15, XMB_INJECT_ERR_OFFSET
+ nop;
+
+ /* enable virtual mode */
+ set_vms;
+ /* barrier for instructions and data accesses */
+ mbar 1
+ mbar 2
+ /*
+ * Enable Interrupts, Virtual Protected Mode, equalize
+ * initial state for all possible entries.
+ */
+ rtbd r0, 1f
+ nop;
+1:
+ RESTORE_REGS
+ addik r1, r1, PT_SIZE
+ rtsd r15, 8;
+ nop;
+.end xmb_inject_err
+
.section .data
.global xmb_manager_dev
.global xmb_manager_baseaddr
@@ -1225,6 +1261,22 @@ ENTRY(_reset)
.org 0x20
brai TOPHYS(_hw_exception_handler); /* HW exception handler */

+#ifdef CONFIG_MB_MANAGER
+ /*
+ * For TMR Inject API which injects the error should
+ * be executed from LMB.
+ * TMR Inject is programmed with address of 0x200 so that
+ * when program counter matches with this address error will
+ * be injected. 0x200 is expected to be next available bram
+ * offset, hence used for this api.
+ */
+ .org XMB_INJECT_ERR_OFFSET
+xmb_inject_error:
+ nop
+ rtsd r15, 8
+ nop
+#endif
+
.section .rodata,"a"
#include "syscall_table.S"

--
2.25.1

Subject: [PATCH 2/3] microblaze: Add custom break vector handler for mb manager

When the TMR Manager detects a fault Lockstep state, it is signaled to the
MicroBlaze processors by asserting a break signal. When a MicroBlaze core
takes this break vector, the break bit in the TMR Manager must be
cleared/blocked before performing recovery.
Recovery requires the following steps:
1) Store all internal MicroBlaze registers in RAM
2) Execute a suspend instruction which asserts the reset signal
3) Restore all registers from RAM and execute an RTBD instruction to
return from the reset handler, to resume execution at the place
where the break occurred.

This API supports getting called from kernel space only.
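
The three steps above can be modeled in plain C as a save, reset and
restore round trip. This is only an illustration with a reduced,
hypothetical register file (struct mb_model_regs and the model_* helpers
are not part of the patch). The real handler additionally saves PID, TLBX,
ZPR, FSR and the TLB entries, and the reset is performed by hardware after
the suspend instruction.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced view of the CPU state. */
struct mb_model_regs {
	uint32_t gpr[32];
	uint32_t msr;
	uint32_t pc;
};

static struct mb_model_regs cpu;	/* live register state */
static struct mb_model_regs ram_copy;	/* context stored in RAM */

/* Step 1: store all registers in RAM. */
static void model_save(void)
{
	ram_copy = cpu;
}

/* Step 2: suspend asserts reset; model it by clearing the live state. */
static void model_reset(void)
{
	memset(&cpu, 0, sizeof(cpu));
}

/* Step 3: restore all registers from RAM before returning. */
static void model_restore(void)
{
	cpu = ram_copy;
}

/* Returns 1 if the state after recovery equals the state at the break. */
int model_break_and_recover(uint32_t pc_at_break)
{
	struct mb_model_regs before;
	int i;

	/* r0 always reads as zero on MicroBlaze, so start at r1. */
	cpu.gpr[0] = 0;
	for (i = 1; i < 32; i++)
		cpu.gpr[i] = 0x1000u + (uint32_t)i;
	cpu.msr = 0x2u;		/* arbitrary test pattern */
	cpu.pc = pc_at_break;

	before = cpu;
	model_save();
	model_reset();
	model_restore();

	return memcmp(&before, &cpu, sizeof(cpu)) == 0 &&
	       cpu.pc == pc_at_break;
}
```

The point of the model is that execution resumes at the saved PC with
unchanged register contents, which the return via RTBD achieves in the
actual handler.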

Signed-off-by: Appana Durga Kedareswara rao <[email protected]>
---
arch/microblaze/kernel/asm-offsets.c | 7 +
arch/microblaze/kernel/entry.S | 206 ++++++++++++++++++++++++++-
2 files changed, 212 insertions(+), 1 deletion(-)

diff --git a/arch/microblaze/kernel/asm-offsets.c b/arch/microblaze/kernel/asm-offsets.c
index 47ee409508b1..104c3ac5f30c 100644
--- a/arch/microblaze/kernel/asm-offsets.c
+++ b/arch/microblaze/kernel/asm-offsets.c
@@ -120,5 +120,12 @@ int main(int argc, char *argv[])
DEFINE(CC_FSR, offsetof(struct cpu_context, fsr));
BLANK();

+ /* struct cpuinfo */
+ DEFINE(CI_DCS, offsetof(struct cpuinfo, dcache_size));
+ DEFINE(CI_DCL, offsetof(struct cpuinfo, dcache_line_length));
+ DEFINE(CI_ICS, offsetof(struct cpuinfo, icache_size));
+ DEFINE(CI_ICL, offsetof(struct cpuinfo, icache_line_length));
+ BLANK();
+
return 0;
}
diff --git a/arch/microblaze/kernel/entry.S b/arch/microblaze/kernel/entry.S
index b8e1dfe02d58..df367bf94b26 100644
--- a/arch/microblaze/kernel/entry.S
+++ b/arch/microblaze/kernel/entry.S
@@ -30,6 +30,7 @@

#include <linux/errno.h>
#include <asm/signal.h>
+#include <asm/mmu.h>

#undef DEBUG

@@ -287,6 +288,44 @@ syscall_debug_table:

.text

+.extern cpuinfo
+
+C_ENTRY(mb_flush_dcache):
+ addik r1, r1, -PT_SIZE
+ SAVE_REGS
+
+ addik r3, r0, cpuinfo
+ lwi r7, r3, CI_DCS
+ lwi r8, r3, CI_DCL
+ sub r9, r7, r8
+1:
+ wdc.flush r9, r0
+ bgtid r9, 1b
+ addk r9, r9, r8
+
+ RESTORE_REGS
+ addik r1, r1, PT_SIZE
+ rtsd r15, 8
+ nop
+
+C_ENTRY(mb_invalidate_icache):
+ addik r1, r1, -PT_SIZE
+ SAVE_REGS
+
+ addik r3, r0, cpuinfo
+ lwi r7, r3, CI_ICS
+ lwi r8, r3, CI_ICL
+ sub r9, r7, r8
+1:
+ wic r9, r0
+ bgtid r9, 1b
+ addk r9, r9, r8
+
+ RESTORE_REGS
+ addik r1, r1, PT_SIZE
+ rtsd r15, 8
+ nop
+
/*
* User trap.
*
@@ -753,6 +792,160 @@ IRQ_return: /* MS: Make global symbol for debugging */
rtid r14, 0
nop

+#ifdef CONFIG_MB_MANAGER
+
+#define PT_PID PT_SIZE
+#define PT_TLBI PT_SIZE + 4
+#define PT_ZPR PT_SIZE + 8
+#define PT_TLBL0 PT_SIZE + 12
+#define PT_TLBH0 PT_SIZE + 16
+
+C_ENTRY(_xtmr_manager_reset):
+ lwi r1, r0, xmb_manager_stackpointer
+
+ /* Restore MSR */
+ lwi r2, r1, PT_MSR
+ mts rmsr, r2
+ bri 4
+
+ /* restore Special purpose registers */
+ lwi r2, r1, PT_PID
+ mts rpid, r2
+
+ lwi r2, r1, PT_TLBI
+ mts rtlbx, r2
+
+ lwi r2, r1, PT_ZPR
+ mts rzpr, r2
+
+#if CONFIG_XILINX_MICROBLAZE0_USE_FPU
+ lwi r2, r1, PT_FSR
+ mts rfsr, r2
+#endif
+
+ /* restore all the tlb's */
+ addik r3, r0, TOPHYS(tlb_skip)
+ addik r6, r0, PT_TLBL0
+ addik r7, r0, PT_TLBH0
+restore_tlb:
+ add r6, r6, r1
+ add r7, r7, r1
+ lwi r2, r6, 0
+ mts rtlblo, r2
+ lwi r2, r7, 0
+ mts rtlbhi, r2
+ addik r6, r6, 4
+ addik r7, r7, 4
+ bgtid r3, restore_tlb
+ addik r3, r3, -1
+
+ lwi r5, r0, TOPHYS(xmb_manager_dev)
+ lwi r8, r0, TOPHYS(xmb_manager_reset_callback)
+ set_vms
+ /* return from reset need -8 to adjust for rtsd r15, 8 */
+ addik r15, r0, ret_from_reset - 8
+ rtbd r8, 0
+ nop
+
+ret_from_reset:
+ set_bip /* Ints masked for state restore */
+ VM_OFF
+ /* MS: Restore all regs */
+ RESTORE_REGS
+ lwi r14, r1, PT_R14
+ lwi r16, r1, PT_PC
+ addik r1, r1, PT_SIZE + 36
+ rtbd r16, 0
+ nop
+
+/*
+ * Break handler for MB Manager. Enter to _xmb_manager_break by
+ * injecting fault in one of the TMR Microblaze core.
+ * FIXME: This break handler supports getting
+ * called from kernel space only.
+ */
+C_ENTRY(_xmb_manager_break):
+ /*
+ * Reserve memory in the stack for context store/restore
+ * (which includes memory for storing tlbs (max two tlbs))
+ */
+ addik r1, r1, -PT_SIZE - 36
+ swi r1, r0, xmb_manager_stackpointer
+ SAVE_REGS
+ swi r14, r1, PT_R14 /* rewrite saved R14 value */
+ swi r16, r1, PT_PC; /* PC and r16 are the same */
+
+ lwi r6, r0, TOPHYS(xmb_manager_baseaddr)
+ lwi r7, r0, TOPHYS(xmb_manager_crval)
+ /*
+ * When the break vector gets asserted because of error injection,
+ * the break signal must be blocked before exiting from the
+ * break handler, below code configures the tmr manager
+ * control register to block break signal.
+ */
+ swi r7, r6, 0
+
+ /* Save the special purpose registers */
+ mfs r2, rpid
+ swi r2, r1, PT_PID
+
+ mfs r2, rtlbx
+ swi r2, r1, PT_TLBI
+
+ mfs r2, rzpr
+ swi r2, r1, PT_ZPR
+
+#if CONFIG_XILINX_MICROBLAZE0_USE_FPU
+ mfs r2, rfsr
+ swi r2, r1, PT_FSR
+#endif
+ mfs r2, rmsr
+ swi r2, r1, PT_MSR
+
+ /* Save all the tlb's */
+ addik r3, r0, TOPHYS(tlb_skip)
+ addik r6, r0, PT_TLBL0
+ addik r7, r0, PT_TLBH0
+save_tlb:
+ add r6, r6, r1
+ add r7, r7, r1
+ mfs r2, rtlblo
+ swi r2, r6, 0
+ mfs r2, rtlbhi
+ swi r2, r7, 0
+ addik r6, r6, 4
+ addik r7, r7, 4
+ bgtid r3, save_tlb
+ addik r3, r3, -1
+
+ lwi r5, r0, TOPHYS(xmb_manager_dev)
+ lwi r8, r0, TOPHYS(xmb_manager_callback)
+ /* return from break need -8 to adjust for rtsd r15, 8 */
+ addik r15, r0, ret_from_break - 8
+ rtbd r8, 0
+ nop
+
+ret_from_break:
+ /* flush the d-cache */
+ bralid r15, mb_flush_dcache
+ nop
+
+ /*
+ * To make sure microblaze i-cache is in a proper state
+ * invalidate the i-cache.
+ */
+ bralid r15, mb_invalidate_icache
+ nop
+
+ set_bip; /* Ints masked for state restore */
+ VM_OFF;
+ mbar 1
+ mbar 2
+ bri 4
+ suspend
+ nop
+#endif
+
/*
* Debug trap for KGDB. Enter to _debug_exception by brki r16, 0x18
* and call handling function with saved pt_regs
@@ -964,6 +1157,7 @@ ENTRY(_switch_to)
.global xmb_manager_crval
.global xmb_manager_callback
.global xmb_manager_reset_callback
+.global xmb_manager_stackpointer
.align 4
xmb_manager_dev:
.long 0
@@ -975,6 +1169,8 @@ xmb_manager_callback:
.long 0
xmb_manager_reset_callback:
.long 0
+xmb_manager_stackpointer:
+ .long 0

/*
* When the break vector gets asserted because of error injection,
@@ -1008,16 +1204,24 @@ ENTRY(_reset)
/* These are compiled and loaded into high memory, then
* copied into place in mach_early_setup */
.section .init.ivt, "ax"
-#if CONFIG_MANUAL_RESET_VECTOR
+#if CONFIG_MANUAL_RESET_VECTOR && !defined(CONFIG_MB_MANAGER)
.org 0x0
brai CONFIG_MANUAL_RESET_VECTOR
+#elif defined(CONFIG_MB_MANAGER)
+ .org 0x0
+ brai TOPHYS(_xtmr_manager_reset);
#endif
.org 0x8
brai TOPHYS(_user_exception); /* syscall handler */
.org 0x10
brai TOPHYS(_interrupt); /* Interrupt handler */
+#ifdef CONFIG_MB_MANAGER
+ .org 0x18
+ brai TOPHYS(_xmb_manager_break); /* microblaze manager break handler */
+#else
.org 0x18
brai TOPHYS(_debug_exception); /* debug trap handler */
+#endif
.org 0x20
brai TOPHYS(_hw_exception_handler); /* HW exception handler */

--
2.25.1

2022-09-26 15:46:48

by Michal Simek

Subject: Re: [PATCH 0/3] microblaze: Add support for TMR Subsystem

On Mon, Jun 27, 2022 at 8:40 AM Appana Durga Kedareswara rao
<[email protected]> wrote:
>
> This patch series adds support for Triple Modular Redundancy Subsystem,
> Triple Modular Redundancy (TMR) Microblaze solution provides soft error
> detection, correction and recovery for Microblaze cores in the system.
> The Xilinx/AMD Triple Modular Redundancy (TMR) solution in Vivado provides
> all the necessary building blocks to implement a redundant triplicated
> MicroBlaze subsystem. This processing subsystem is fault-tolerant and
> continues to operate nominally after encountering an error. Together
> with the capability to detect and recover from errors, the implementation
> ensures the reliability of the entire subsystem, for more details about
> IP please refer PG268[1].
>
> [1]: https://docs.xilinx.com/r/en-US/pg268-tmr/Triple-Modular-Redundancy-TMR-v1.0-LogiCORE-IP-Product-Guide-PG268
>
> Appana Durga Kedareswara rao (3):
> microblaze: Add xmb_manager_register function
> microblaze: Add custom break vector handler for mb manager
> microblaze: Add support for error injection
>
> arch/microblaze/Kconfig | 10 +
> .../include/asm/xilinx_mb_manager.h | 29 ++
> arch/microblaze/kernel/asm-offsets.c | 7 +
> arch/microblaze/kernel/entry.S | 302 +++++++++++++++++-
> 4 files changed, 347 insertions(+), 1 deletion(-)
> create mode 100644 arch/microblaze/include/asm/xilinx_mb_manager.h
>
> --
> 2.25.1
>

Applied.
M


--
Michal Simek, Ing. (M.Eng), OpenPGP -> KeyID: FE3D1F91
w: http://www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Xilinx Microblaze
Maintainer of Linux kernel - Xilinx Zynq ARM and ZynqMP ARM64 SoCs
U-Boot custodian - Xilinx Microblaze/Zynq/ZynqMP/Versal SoCs