Some firmware erroneously unmask IRQs (and potentially other architecture
specific exceptions) during runtime services functions, in violation of both
common sense and the UEFI specification. This can result in a number of issues
if said exceptions are taken when they are expected to be masked, and
additionally can confuse IRQ tracing if the original mask state is not
restored prior to returning from firmware.
In practice it's difficult to check that firmware never unmasks exceptions, but
we can at least check that the IRQ flags are at least consistent upon entry to
and return from a runtime services function call. This series implements said
check in the shared EFI runtime wrappers code, after an initial round of
refactoring such that this can be generic.
I have left ia64 as-is, without this check, as ia64 doesn't currently use the
generic runtime wrappers, has many special cases for the runtime calls which
don't fit well with the generic code, and I don't expect a new, buggy ia64
firmware to appear soon.
The first time corruption of the IRQ flags is detected, we dump a stack trace,
and set TAINT_FIRMWARE_WORKAROUND. Additionally, and in all subsequent cases,
we log (with ratelimiting) the specific corruption of the flags, and restore
the expected flags to avoid redundant warnings elsewhere.
Since v1 [1]:
* Fix thinko: s/local_irq_save/local_save_flags/
* Remove ifdefs after conversion
* Remove reundant semicolon from x86 patch
* Move efi_call_virt_check_flags before first use
* Add Acked-bys and Reviewed-bys
Ard, I assume that your Reviewed-by still stands for the final patch, even
though efi_call_virt_check_flags moved. Please shout if that's not the case!
Hopefully you're also happy to extend that to the new patch removing the
ifdefs once they become superfluous.
Thanks,
Mark.
[1] https://lkml.org/lkml/2016/4/21/260
Mark Rutland (6):
efi/runtime-wrappers: add {__,}efi_call_virt templates
arm64/efi: move to generic {__,}efi_call_virt
arm/efi: move to generic {__,}efi_call_virt
x86/efi: move to generic {__,}efi_call_virt
efi/runtime-wrappers: remove redundant ifdefs
efi/runtime-wrappers: detect FW irq flag corruption
arch/arm/include/asm/efi.h | 20 ++++------------
arch/arm64/include/asm/efi.h | 21 ++++++-----------
arch/x86/include/asm/efi.h | 41 ++++++++++-----------------------
drivers/firmware/efi/runtime-wrappers.c | 38 ++++++++++++++++++++++++++++++
4 files changed, 61 insertions(+), 59 deletions(-)
--
1.9.1
Currently each architecture must implement two macros, efi_call_virt and
__efi_call_virt, which only differ by the presence or absence of a
return type. Otherwise, the logic surrounding the call is identical.
As each architecture must define the entire body of each, we can't place
any generic manipulation (e.g. irq flag validation) in the middle.
This patch adds template implementations of these macros. With these,
arch code can implement three template macros, avoiding reptition for
the void/non-void return cases:
* arch_efi_call_virt_setup
Sets up the environment for the call (e.g. switching page tables,
allowing kernel-mode use of floating point, if required).
* arch_efi_call_virt
Performs the call. The last expression in the macro must be the call
itself, allowing the logic to be shared by the void and non-void
cases.
* arch_efi_call_virt_teardown
Restores the usual kernel environment once the call has returned.
While the savings from repition are minimal, we additionally gain the
ability to add common code around the call with the call environment set
up. This can be used to detect common firmware issues (e.g. bad irq mask
management).
Signed-off-by: Mark Rutland <[email protected]>
Reviewed-by: Ard Biesheuvel <[email protected]>
Cc: Matt Fleming <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
drivers/firmware/efi/runtime-wrappers.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/drivers/firmware/efi/runtime-wrappers.c b/drivers/firmware/efi/runtime-wrappers.c
index de69530..1b9fa54 100644
--- a/drivers/firmware/efi/runtime-wrappers.c
+++ b/drivers/firmware/efi/runtime-wrappers.c
@@ -20,6 +20,27 @@
#include <linux/spinlock.h>
#include <asm/efi.h>
+
+#ifndef efi_call_virt
+#define efi_call_virt(f, args...) \
+({ \
+ efi_status_t __s; \
+ arch_efi_call_virt_setup(); \
+ __s = arch_efi_call_virt(f, args); \
+ arch_efi_call_virt_teardown(); \
+ __s; \
+})
+#endif
+
+#ifndef __efi_call_virt
+#define __efi_call_virt(f, args...) \
+({ \
+ arch_efi_call_virt_setup(); \
+ arch_efi_call_virt(f, args); \
+ arch_efi_call_virt_teardown(); \
+})
+#endif
+
/*
* According to section 7.1 of the UEFI spec, Runtime Services are not fully
* reentrant, and there are particular combinations of calls that need to be
--
1.9.1
Now there's a common template for {__,}efi_call_virt, remove the
duplicate logic from the x86 efi code.
Signed-off-by: Mark Rutland <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Matt Fleming <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
arch/x86/include/asm/efi.h | 41 ++++++++++++-----------------------------
1 file changed, 12 insertions(+), 29 deletions(-)
diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 53748c4..34bd38b 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -33,28 +33,16 @@
extern unsigned long asmlinkage efi_call_phys(void *, ...);
+#define arch_efi_call_virt_setup() kernel_fpu_begin()
+#define arch_efi_call_virt_teardown() kernel_fpu_end()
+
/*
* Wrap all the virtual calls in a way that forces the parameters on the stack.
*/
-
-/* Use this macro if your virtual returns a non-void value */
-#define efi_call_virt(f, args...) \
-({ \
- efi_status_t __s; \
- kernel_fpu_begin(); \
- __s = ((efi_##f##_t __attribute__((regparm(0)))*) \
- efi.systab->runtime->f)(args); \
- kernel_fpu_end(); \
- __s; \
-})
-
-/* Use this macro if your virtual call does not return any value */
-#define __efi_call_virt(f, args...) \
+#define arch_efi_call_virt(f, args...) \
({ \
- kernel_fpu_begin(); \
((efi_##f##_t __attribute__((regparm(0)))*) \
efi.systab->runtime->f)(args); \
- kernel_fpu_end(); \
})
#define efi_ioremap(addr, size, type, attr) ioremap_cache(addr, size)
@@ -78,10 +66,8 @@ struct efi_scratch {
u64 phys_stack;
} __packed;
-#define efi_call_virt(f, ...) \
+#define arch_efi_call_virt_setup() \
({ \
- efi_status_t __s; \
- \
efi_sync_low_kernel_mappings(); \
preempt_disable(); \
__kernel_fpu_begin(); \
@@ -91,9 +77,13 @@ struct efi_scratch {
write_cr3((unsigned long)efi_scratch.efi_pgt); \
__flush_tlb_all(); \
} \
- \
- __s = efi_call((void *)efi.systab->runtime->f, __VA_ARGS__); \
- \
+})
+
+#define arch_efi_call_virt(f, args...) \
+ efi_call((void *)efi.systab->runtime->f, args) \
+
+#define arch_efi_call_virt_teardown() \
+({ \
if (efi_scratch.use_pgd) { \
write_cr3(efi_scratch.prev_cr3); \
__flush_tlb_all(); \
@@ -101,15 +91,8 @@ struct efi_scratch {
\
__kernel_fpu_end(); \
preempt_enable(); \
- __s; \
})
-/*
- * All X86_64 virt calls return non-void values. Thus, use non-void call for
- * virt calls that would be void on X86_32.
- */
-#define __efi_call_virt(f, args...) efi_call_virt(f, args)
-
extern void __iomem *__init efi_ioremap(unsigned long addr, unsigned long size,
u32 type, u64 attribute);
--
1.9.1
Now there's a common template for {__,}efi_call_virt, remove the
duplicate logic from the arm efi code.
Signed-off-by: Mark Rutland <[email protected]>
Reviewed-by: Ard Biesheuvel <[email protected]>
Cc: Leif Lindholm <[email protected]>
Cc: Matt Fleming <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
arch/arm/include/asm/efi.h | 20 ++++----------------
1 file changed, 4 insertions(+), 16 deletions(-)
diff --git a/arch/arm/include/asm/efi.h b/arch/arm/include/asm/efi.h
index e0eea72..e1461fc 100644
--- a/arch/arm/include/asm/efi.h
+++ b/arch/arm/include/asm/efi.h
@@ -23,26 +23,14 @@ void efi_init(void);
int efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md);
-#define efi_call_virt(f, ...) \
-({ \
- efi_##f##_t *__f; \
- efi_status_t __s; \
- \
- efi_virtmap_load(); \
- __f = efi.systab->runtime->f; \
- __s = __f(__VA_ARGS__); \
- efi_virtmap_unload(); \
- __s; \
-})
+#define arch_efi_call_virt_setup() efi_virtmap_load()
+#define arch_efi_call_virt_teardown() efi_virtmap_unload()
-#define __efi_call_virt(f, ...) \
+#define arch_efi_call_virt(f, args...) \
({ \
efi_##f##_t *__f; \
- \
- efi_virtmap_load(); \
__f = efi.systab->runtime->f; \
- __f(__VA_ARGS__); \
- efi_virtmap_unload(); \
+ __f(args); \
})
static inline void efi_set_pgd(struct mm_struct *mm)
--
1.9.1
The UEFI spec allows runtime services to be called with interrupts
masked or unmasked, and if a runtime service function needs to mask
interrupts, it must restore the mask to its original state before
returning (i.e. from the PoV of the OS, this does not change across a
call). Firmware should never unmask exceptions, as these may then be
taken by the OS unexpectedly.
Unfortunately, some firmware has been seen to unmask IRQs (and
potentially other maskable exceptions) across runtime services calls,
leaving irq flags corrupted after returning from a runtime services
function call. This may be detected by the IRQ tracing code, but often
goes unnoticed, leaving a potentially disastrous bug hidden.
This patch detects when the irq flags are corrupted by an EFI runtime
services call, logging the call and specific corruption to the console.
While restoring the expected value of the flags is insufficient to avoid
problems, we do so to avoid redundant warnings from elsewhere (e.g. IRQ
tracing).
Signed-off-by: Mark Rutland <[email protected]>
Reviewed-by: Ard Biesheuvel <[email protected]>
Cc: Matt Fleming <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
drivers/firmware/efi/runtime-wrappers.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
I'm not sure about the LOCKDEP_NOW_UNRELIABLE here. If FW unmasks IRQs there's
the potential for deadlock, but arguably by the time we've detected the flag
corruption the danger has passed. I'm erring on the side of caution here
setting it, but perhaps that's not the best idea?
Mark.
diff --git a/drivers/firmware/efi/runtime-wrappers.c b/drivers/firmware/efi/runtime-wrappers.c
index a2c8e70..19e30b7 100644
--- a/drivers/firmware/efi/runtime-wrappers.c
+++ b/drivers/firmware/efi/runtime-wrappers.c
@@ -16,23 +16,45 @@
#include <linux/bug.h>
#include <linux/efi.h>
+#include <linux/irqflags.h>
#include <linux/mutex.h>
#include <linux/spinlock.h>
+#include <linux/stringify.h>
#include <asm/efi.h>
+static void efi_call_virt_check_flags(unsigned long flags, const char *call)
+{
+ unsigned long cur_flags;
+
+ local_save_flags(cur_flags);
+ if (!WARN_ON_ONCE(cur_flags != flags))
+ return;
+
+ add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_NOW_UNRELIABLE);
+ pr_err_ratelimited(FW_BUG "IRQ flags corrupted (0x%08lx=>0x%08lx) by EFI %s\n",
+ flags, cur_flags, call);
+ local_irq_restore(flags);
+}
+
#define efi_call_virt(f, args...) \
({ \
efi_status_t __s; \
+ unsigned long flags; \
arch_efi_call_virt_setup(); \
+ local_save_flags(flags); \
__s = arch_efi_call_virt(f, args); \
+ efi_call_virt_check_flags(flags, __stringify(f)); \
arch_efi_call_virt_teardown(); \
__s; \
})
#define __efi_call_virt(f, args...) \
({ \
+ unsigned long flags; \
arch_efi_call_virt_setup(); \
+ local_save_flags(flags); \
arch_efi_call_virt(f, args); \
+ efi_call_virt_check_flags(flags, __stringify(f)); \
arch_efi_call_virt_teardown(); \
})
--
1.9.1
Now that all users of the EFI runtime wrappers (arm,arm64,x86) have been
migrated to the new setup/teardown macros, we don't need to support
overridden {__,}efi_call_virt implementations.
This patch removes the unnecessary ifdefs.
Signed-off-by: Mark Rutland <[email protected]>
Cc: Leif Lindholm <[email protected]>
Cc: Matt Fleming <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
drivers/firmware/efi/runtime-wrappers.c | 5 -----
1 file changed, 5 deletions(-)
diff --git a/drivers/firmware/efi/runtime-wrappers.c b/drivers/firmware/efi/runtime-wrappers.c
index 1b9fa54..a2c8e70 100644
--- a/drivers/firmware/efi/runtime-wrappers.c
+++ b/drivers/firmware/efi/runtime-wrappers.c
@@ -20,8 +20,6 @@
#include <linux/spinlock.h>
#include <asm/efi.h>
-
-#ifndef efi_call_virt
#define efi_call_virt(f, args...) \
({ \
efi_status_t __s; \
@@ -30,16 +28,13 @@
arch_efi_call_virt_teardown(); \
__s; \
})
-#endif
-#ifndef __efi_call_virt
#define __efi_call_virt(f, args...) \
({ \
arch_efi_call_virt_setup(); \
arch_efi_call_virt(f, args); \
arch_efi_call_virt_teardown(); \
})
-#endif
/*
* According to section 7.1 of the UEFI spec, Runtime Services are not fully
--
1.9.1
Now there's a common template for {__,}efi_call_virt, remove the
duplicate logic from the arm64 efi code.
Signed-off-by: Mark Rutland <[email protected]>
Acked-by: Will Deacon <[email protected]>
Reviewed-by: Ard Biesheuvel <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Leif Lindholm <[email protected]>
Cc: Matt Fleming <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
arch/arm64/include/asm/efi.h | 21 +++++++--------------
1 file changed, 7 insertions(+), 14 deletions(-)
diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index 8e88a69..f4f71224 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -14,28 +14,21 @@ extern void efi_init(void);
int efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md);
-#define efi_call_virt(f, ...) \
+#define arch_efi_call_virt_setup() \
({ \
- efi_##f##_t *__f; \
- efi_status_t __s; \
- \
kernel_neon_begin(); \
efi_virtmap_load(); \
- __f = efi.systab->runtime->f; \
- __s = __f(__VA_ARGS__); \
- efi_virtmap_unload(); \
- kernel_neon_end(); \
- __s; \
})
-#define __efi_call_virt(f, ...) \
+#define arch_efi_call_virt(f, args...) \
({ \
efi_##f##_t *__f; \
- \
- kernel_neon_begin(); \
- efi_virtmap_load(); \
__f = efi.systab->runtime->f; \
- __f(__VA_ARGS__); \
+ __f(args); \
+})
+
+#define arch_efi_call_virt_teardown() \
+({ \
efi_virtmap_unload(); \
kernel_neon_end(); \
})
--
1.9.1
On 22 April 2016 at 15:51, Mark Rutland <[email protected]> wrote:
> Some firmware erroneously unmask IRQs (and potentially other architecture
> specific exceptions) during runtime services functions, in violation of both
> common sense and the UEFI specification. This can result in a number of issues
> if said exceptions are taken when they are expected to be masked, and
> additionally can confuse IRQ tracing if the original mask state is not
> restored prior to returning from firmware.
>
> In practice it's difficult to check that firmware never unmasks exceptions, but
> we can at least check that the IRQ flags are at least consistent upon entry to
> and return from a runtime services function call. This series implements said
> check in the shared EFI runtime wrappers code, after an initial round of
> refactoring such that this can be generic.
>
> I have left ia64 as-is, without this check, as ia64 doesn't currently use the
> generic runtime wrappers, has many special cases for the runtime calls which
> don't fit well with the generic code, and I don't expect a new, buggy ia64
> firmware to appear soon.
>
> The first time corruption of the IRQ flags is detected, we dump a stack trace,
> and set TAINT_FIRMWARE_WORKAROUND. Additionally, and in all subsequent cases,
> we log (with ratelimiting) the specific corruption of the flags, and restore
> the expected flags to avoid redundant warnings elsewhere.
>
> Since v1 [1]:
> * Fix thinko: s/local_irq_save/local_save_flags/
> * Remove ifdefs after conversion
> * Remove reundant semicolon from x86 patch
> * Move efi_call_virt_check_flags before first use
> * Add Acked-bys and Reviewed-bys
>
> Ard, I assume that your Reviewed-by still stands for the final patch, even
> though efi_call_virt_check_flags moved. Please shout if that's not the case!
>
No, that's fine. Thanks for respinning so quickly.
> Hopefully you're also happy to extend that to the new patch removing the
> ifdefs once they become superfluous.
>
Matt: in case your review bandwidth is limited atm, I'd much prefer
this series making v4.7 than the GOP stuff or the other stuff i have
been posting over the past weeks.
Thanks,
Ard.
On Fri, 22 Apr, at 02:51:18PM, Mark Rutland wrote:
> Currently each architecture must implement two macros, efi_call_virt and
> __efi_call_virt, which only differ by the presence or absence of a
> return type. Otherwise, the logic surrounding the call is identical.
>
> As each architecture must define the entire body of each, we can't place
> any generic manipulation (e.g. irq flag validation) in the middle.
>
> This patch adds template implementations of these macros. With these,
> arch code can implement three template macros, avoiding reptition for
> the void/non-void return cases:
>
> * arch_efi_call_virt_setup
>
> Sets up the environment for the call (e.g. switching page tables,
> allowing kernel-mode use of floating point, if required).
>
> * arch_efi_call_virt
>
> Performs the call. The last expression in the macro must be the call
> itself, allowing the logic to be shared by the void and non-void
> cases.
>
> * arch_efi_call_virt_teardown
>
> Restores the usual kernel environment once the call has returned.
>
> While the savings from repition are minimal, we additionally gain the
> ability to add common code around the call with the call environment set
> up. This can be used to detect common firmware issues (e.g. bad irq mask
> management).
>
> Signed-off-by: Mark Rutland <[email protected]>
> Reviewed-by: Ard Biesheuvel <[email protected]>
> Cc: Matt Fleming <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> ---
> drivers/firmware/efi/runtime-wrappers.c | 21 +++++++++++++++++++++
> 1 file changed, 21 insertions(+)
Applied. I incorporated the explanation of the 3 template macros next
to the code because it is too useful to only have it exist in the git
commit history.
On Fri, 22 Apr, at 02:51:23PM, Mark Rutland wrote:
> The UEFI spec allows runtime services to be called with interrupts
> masked or unmasked, and if a runtime service function needs to mask
> interrupts, it must restore the mask to its original state before
> returning (i.e. from the PoV of the OS, this does not change across a
> call). Firmware should never unmask exceptions, as these may then be
> taken by the OS unexpectedly.
>
> Unfortunately, some firmware has been seen to unmask IRQs (and
> potentially other maskable exceptions) across runtime services calls,
> leaving irq flags corrupted after returning from a runtime services
> function call. This may be detected by the IRQ tracing code, but often
> goes unnoticed, leaving a potentially disastrous bug hidden.
>
> This patch detects when the irq flags are corrupted by an EFI runtime
> services call, logging the call and specific corruption to the console.
> While restoring the expected value of the flags is insufficient to avoid
> problems, we do so to avoid redundant warnings from elsewhere (e.g. IRQ
> tracing).
>
> Signed-off-by: Mark Rutland <[email protected]>
> Reviewed-by: Ard Biesheuvel <[email protected]>
> Cc: Matt Fleming <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> ---
> drivers/firmware/efi/runtime-wrappers.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> I'm not sure about the LOCKDEP_NOW_UNRELIABLE here. If FW unmasks IRQs there's
> the potential for deadlock, but arguably by the time we've detected the flag
> corruption the danger has passed. I'm erring on the side of caution here
> setting it, but perhaps that's not the best idea?
I think it makes sense to leave it as-is, not least to guard against
future changes we make to the kernel side that might have lockdep side
effects when this bug is triggered.
On Fri, 22 Apr, at 04:12:59PM, Ard Biesheuvel wrote:
> On 22 April 2016 at 15:51, Mark Rutland <[email protected]> wrote:
> > Some firmware erroneously unmask IRQs (and potentially other architecture
> > specific exceptions) during runtime services functions, in violation of both
> > common sense and the UEFI specification. This can result in a number of issues
> > if said exceptions are taken when they are expected to be masked, and
> > additionally can confuse IRQ tracing if the original mask state is not
> > restored prior to returning from firmware.
> >
> > In practice it's difficult to check that firmware never unmasks exceptions, but
> > we can at least check that the IRQ flags are at least consistent upon entry to
> > and return from a runtime services function call. This series implements said
> > check in the shared EFI runtime wrappers code, after an initial round of
> > refactoring such that this can be generic.
> >
> > I have left ia64 as-is, without this check, as ia64 doesn't currently use the
> > generic runtime wrappers, has many special cases for the runtime calls which
> > don't fit well with the generic code, and I don't expect a new, buggy ia64
> > firmware to appear soon.
> >
> > The first time corruption of the IRQ flags is detected, we dump a stack trace,
> > and set TAINT_FIRMWARE_WORKAROUND. Additionally, and in all subsequent cases,
> > we log (with ratelimiting) the specific corruption of the flags, and restore
> > the expected flags to avoid redundant warnings elsewhere.
> >
> > Since v1 [1]:
> > * Fix thinko: s/local_irq_save/local_save_flags/
> > * Remove ifdefs after conversion
> > * Remove reundant semicolon from x86 patch
> > * Move efi_call_virt_check_flags before first use
> > * Add Acked-bys and Reviewed-bys
> >
> > Ard, I assume that your Reviewed-by still stands for the final patch, even
> > though efi_call_virt_check_flags moved. Please shout if that's not the case!
> >
>
> No, that's fine. Thanks for respinning so quickly.
>
> > Hopefully you're also happy to extend that to the new patch removing the
> > ifdefs once they become superfluous.
> >
>
> Matt: in case your review bandwidth is limited atm, I'd much prefer
> this series making v4.7 than the GOP stuff or the other stuff i have
> been posting over the past weeks.
I like this series a lot (well, ignoring the fact that the firmware is
trying to eat itself). The runtime call code is much cleaner now, and
this is a great precedent for any future multi-architecture quirks we
may need.
Queued for v4.7, thanks everyone!
On Sun, 24 Apr, at 10:22:41PM, Matt Fleming wrote:
>
> I like this series a lot (well, ignoring the fact that the firmware is
> trying to eat itself). The runtime call code is much cleaner now, and
> this is a great precedent for any future multi-architecture quirks we
> may need.
>
> Queued for v4.7, thanks everyone!
Hmm... Booting this series with Qemu and OVMF results in lots of
warnings,
[ 0.102173] ------------[ cut here ]------------
[ 0.103000] WARNING: CPU: 0 PID: 0 at /dev/shm/mfleming/git/efi/drivers/firmware/efi/runtime-wrappers.c:30 efi_call_virt_check_flags+0x69/0x90
[ 0.103505] Modules linked in:
[ 0.104519] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc4+ #1
[ 0.105000] 0000000000000000 ffffffff81e03e30 ffffffff8132206f 0000000000000000
[ 0.105000] 0000000000000000 ffffffff81e03e70 ffffffff8105a47c 0000001e0000000a
[ 0.105000] 0000000000000246 0000000000000286 ffffffff81bed975 ffffffff81e03f10
[ 0.105000] Call Trace:
[ 0.105000] [<ffffffff8132206f>] dump_stack+0x4d/0x6e
[ 0.105000] [<ffffffff8105a47c>] __warn+0xcc/0xf0
[ 0.105000] [<ffffffff8105a558>] warn_slowpath_null+0x18/0x20
[ 0.105000] [<ffffffff8164e5a9>] efi_call_virt_check_flags+0x69/0x90
[ 0.105000] [<ffffffff8164f9d2>] virt_efi_set_variable+0x82/0x190
[ 0.105000] [<ffffffff81054555>] efi_delete_dummy_variable+0x75/0x80
[ 0.105000] [<ffffffff81f599f6>] efi_enter_virtual_mode+0x463/0x472
[ 0.105000] [<ffffffff81f41f82>] start_kernel+0x38f/0x415
[ 0.105000] [<ffffffff81f419e1>] ? set_init_arg+0x55/0x55
[ 0.105000] [<ffffffff81f415ee>] x86_64_start_reservations+0x2a/0x2c
[ 0.105000] [<ffffffff81f416da>] x86_64_start_kernel+0xea/0xed
[ 0.107181] ---[ end trace 0081cc453369d969 ]---
[ 0.107499] Disabling lock debugging due to kernel taint
[ 0.108226] [Firmware Bug]: IRQ flags corrupted (0x00000246=>0x00000286) by EFI set_variable
Has anyone tested this series on x86 to ensure that this is a rare
case? I'll go and test some physical x86 machines now.
(+ Laszlo)
On 25 April 2016 at 12:15, Matt Fleming <[email protected]> wrote:
> On Sun, 24 Apr, at 10:22:41PM, Matt Fleming wrote:
>>
>> I like this series a lot (well, ignoring the fact that the firmware is
>> trying to eat itself). The runtime call code is much cleaner now, and
>> this is a great precedent for any future multi-architecture quirks we
>> may need.
>>
>> Queued for v4.7, thanks everyone!
>
> Hmm... Booting this series with Qemu and OVMF results in lots of
> warnings,
>
> [ 0.102173] ------------[ cut here ]------------
> [ 0.103000] WARNING: CPU: 0 PID: 0 at /dev/shm/mfleming/git/efi/drivers/firmware/efi/runtime-wrappers.c:30 efi_call_virt_check_flags+0x69/0x90
> [ 0.103505] Modules linked in:
> [ 0.104519] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc4+ #1
> [ 0.105000] 0000000000000000 ffffffff81e03e30 ffffffff8132206f 0000000000000000
> [ 0.105000] 0000000000000000 ffffffff81e03e70 ffffffff8105a47c 0000001e0000000a
> [ 0.105000] 0000000000000246 0000000000000286 ffffffff81bed975 ffffffff81e03f10
> [ 0.105000] Call Trace:
> [ 0.105000] [<ffffffff8132206f>] dump_stack+0x4d/0x6e
> [ 0.105000] [<ffffffff8105a47c>] __warn+0xcc/0xf0
> [ 0.105000] [<ffffffff8105a558>] warn_slowpath_null+0x18/0x20
> [ 0.105000] [<ffffffff8164e5a9>] efi_call_virt_check_flags+0x69/0x90
> [ 0.105000] [<ffffffff8164f9d2>] virt_efi_set_variable+0x82/0x190
> [ 0.105000] [<ffffffff81054555>] efi_delete_dummy_variable+0x75/0x80
> [ 0.105000] [<ffffffff81f599f6>] efi_enter_virtual_mode+0x463/0x472
> [ 0.105000] [<ffffffff81f41f82>] start_kernel+0x38f/0x415
> [ 0.105000] [<ffffffff81f419e1>] ? set_init_arg+0x55/0x55
> [ 0.105000] [<ffffffff81f415ee>] x86_64_start_reservations+0x2a/0x2c
> [ 0.105000] [<ffffffff81f416da>] x86_64_start_kernel+0xea/0xed
> [ 0.107181] ---[ end trace 0081cc453369d969 ]---
> [ 0.107499] Disabling lock debugging due to kernel taint
> [ 0.108226] [Firmware Bug]: IRQ flags corrupted (0x00000246=>0x00000286) by EFI set_variable
>
> Has anyone tested this series on x86 to ensure that this is a rare
> case? I'll go and test some physical x86 machines now.
I suppose that it is quite likely that this issue occurs in the wild
if it is present in OVMF. Could anyone check which flag is actually
clobbered here?
On Mon, 25 Apr, at 12:21:18PM, Ard Biesheuvel wrote:
> (+ Laszlo)
>
> On 25 April 2016 at 12:15, Matt Fleming <[email protected]> wrote:
> > On Sun, 24 Apr, at 10:22:41PM, Matt Fleming wrote:
> >>
> >> I like this series a lot (well, ignoring the fact that the firmware is
> >> trying to eat itself). The runtime call code is much cleaner now, and
> >> this is a great precedent for any future multi-architecture quirks we
> >> may need.
> >>
> >> Queued for v4.7, thanks everyone!
> >
> > Hmm... Booting this series with Qemu and OVMF results in lots of
> > warnings,
> >
> > [ 0.102173] ------------[ cut here ]------------
> > [ 0.103000] WARNING: CPU: 0 PID: 0 at /dev/shm/mfleming/git/efi/drivers/firmware/efi/runtime-wrappers.c:30 efi_call_virt_check_flags+0x69/0x90
> > [ 0.103505] Modules linked in:
> > [ 0.104519] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc4+ #1
> > [ 0.105000] 0000000000000000 ffffffff81e03e30 ffffffff8132206f 0000000000000000
> > [ 0.105000] 0000000000000000 ffffffff81e03e70 ffffffff8105a47c 0000001e0000000a
> > [ 0.105000] 0000000000000246 0000000000000286 ffffffff81bed975 ffffffff81e03f10
> > [ 0.105000] Call Trace:
> > [ 0.105000] [<ffffffff8132206f>] dump_stack+0x4d/0x6e
> > [ 0.105000] [<ffffffff8105a47c>] __warn+0xcc/0xf0
> > [ 0.105000] [<ffffffff8105a558>] warn_slowpath_null+0x18/0x20
> > [ 0.105000] [<ffffffff8164e5a9>] efi_call_virt_check_flags+0x69/0x90
> > [ 0.105000] [<ffffffff8164f9d2>] virt_efi_set_variable+0x82/0x190
> > [ 0.105000] [<ffffffff81054555>] efi_delete_dummy_variable+0x75/0x80
> > [ 0.105000] [<ffffffff81f599f6>] efi_enter_virtual_mode+0x463/0x472
> > [ 0.105000] [<ffffffff81f41f82>] start_kernel+0x38f/0x415
> > [ 0.105000] [<ffffffff81f419e1>] ? set_init_arg+0x55/0x55
> > [ 0.105000] [<ffffffff81f415ee>] x86_64_start_reservations+0x2a/0x2c
> > [ 0.105000] [<ffffffff81f416da>] x86_64_start_kernel+0xea/0xed
> > [ 0.107181] ---[ end trace 0081cc453369d969 ]---
> > [ 0.107499] Disabling lock debugging due to kernel taint
> > [ 0.108226] [Firmware Bug]: IRQ flags corrupted (0x00000246=>0x00000286) by EFI set_variable
> >
> > Has anyone tested this series on x86 to ensure that this is a rare
> > case? I'll go and test some physical x86 machines now.
>
> I suppose that it is quite likely that this issue occurs in the wild
> if it is present in OVMF. Could anyone check which flag is actually
> clobbered here?
X86_EFLAGS_ZF (zero flag) gets turned off and X86_EFLAGS_SF (signed
flag) gets turned on. Which is totally legitimate and isn't grounds
for a warning...
I think the function we want to use instead of local_save_flags() is
irqs_disabled_flags().
On Mon, Apr 25, 2016 at 11:28:21AM +0100, Matt Fleming wrote:
> On Mon, 25 Apr, at 12:21:18PM, Ard Biesheuvel wrote:
> > (+ Laszlo)
> >
> > On 25 April 2016 at 12:15, Matt Fleming <[email protected]> wrote:
> > > On Sun, 24 Apr, at 10:22:41PM, Matt Fleming wrote:
> > >>
> > >> I like this series a lot (well, ignoring the fact that the firmware is
> > >> trying to eat itself). The runtime call code is much cleaner now, and
> > >> this is a great precedent for any future multi-architecture quirks we
> > >> may need.
> > >>
> > >> Queued for v4.7, thanks everyone!
> > >
> > > Hmm... Booting this series with Qemu and OVMF results in lots of
> > > warnings,
> > >
> > > [ 0.102173] ------------[ cut here ]------------
> > > [ 0.103000] WARNING: CPU: 0 PID: 0 at /dev/shm/mfleming/git/efi/drivers/firmware/efi/runtime-wrappers.c:30 efi_call_virt_check_flags+0x69/0x90
> > > [ 0.103505] Modules linked in:
> > > [ 0.104519] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc4+ #1
> > > [ 0.105000] 0000000000000000 ffffffff81e03e30 ffffffff8132206f 0000000000000000
> > > [ 0.105000] 0000000000000000 ffffffff81e03e70 ffffffff8105a47c 0000001e0000000a
> > > [ 0.105000] 0000000000000246 0000000000000286 ffffffff81bed975 ffffffff81e03f10
> > > [ 0.105000] Call Trace:
> > > [ 0.105000] [<ffffffff8132206f>] dump_stack+0x4d/0x6e
> > > [ 0.105000] [<ffffffff8105a47c>] __warn+0xcc/0xf0
> > > [ 0.105000] [<ffffffff8105a558>] warn_slowpath_null+0x18/0x20
> > > [ 0.105000] [<ffffffff8164e5a9>] efi_call_virt_check_flags+0x69/0x90
> > > [ 0.105000] [<ffffffff8164f9d2>] virt_efi_set_variable+0x82/0x190
> > > [ 0.105000] [<ffffffff81054555>] efi_delete_dummy_variable+0x75/0x80
> > > [ 0.105000] [<ffffffff81f599f6>] efi_enter_virtual_mode+0x463/0x472
> > > [ 0.105000] [<ffffffff81f41f82>] start_kernel+0x38f/0x415
> > > [ 0.105000] [<ffffffff81f419e1>] ? set_init_arg+0x55/0x55
> > > [ 0.105000] [<ffffffff81f415ee>] x86_64_start_reservations+0x2a/0x2c
> > > [ 0.105000] [<ffffffff81f416da>] x86_64_start_kernel+0xea/0xed
> > > [ 0.107181] ---[ end trace 0081cc453369d969 ]---
> > > [ 0.107499] Disabling lock debugging due to kernel taint
> > > [ 0.108226] [Firmware Bug]: IRQ flags corrupted (0x00000246=>0x00000286) by EFI set_variable
> > >
> > > Has anyone tested this series on x86 to ensure that this is a rare
> > > case? I'll go and test some physical x86 machines now.
> >
> > I suppose that it is quite likely that this issue occurs in the wild
> > if it is present in OVMF. Could anyone check which flag is actually
> > clobbered here?
>
> X86_EFLAGS_ZF (zero flag) gets turned off and X86_EFLAGS_SF (signed
> flag) gets turned on. Which is totally legitimate and isn't grounds
> for a warning...
>
> I think the function we want to use instead of local_save_flags() is
> irqs_disabled_flags().
It looks like irqs_disabled_flags() will do what you expect, and ignore
everything but the interrupt flag.
However, for ARM that will ignore the other exceptions we've seen FW
erroneously unmask (e.g. FIQ), which is unfortunate, as fiddling with
those is just as disastrous.
Would you be happy with an arch_efi_call_check_flags(before, after),
instead? That way we can make the flags we check arch-specific.
Thanks,
Mark.
On Mon, 25 Apr, at 11:40:09AM, Mark Rutland wrote:
>
> It looks like irqs_disabled_flags() will do what you expect, and ignore
> everything but the interrupt flag.
>
> However, for ARM that will ignore the other exceptions we've seen FW
> erroneously unmask (e.g. FIQ), which is unfortunate, as fiddling with
> those is just as disastrous.
Bah, right.
> Would you be happy with an arch_efi_call_check_flags(before, after),
> instead? That way we can make the flags we check arch-specific.
Could we just make the flag mask arch-specific instead of the call
since the rest of efi_call_virt_check_flags() is good?
Something like this (uncompiled, untested, half-baked),
---
diff --git a/drivers/firmware/efi/runtime-wrappers.c b/drivers/firmware/efi/runtime-wrappers.c
index c38b1cfc26e2..057d00bee7d6 100644
--- a/drivers/firmware/efi/runtime-wrappers.c
+++ b/drivers/firmware/efi/runtime-wrappers.c
@@ -25,9 +25,12 @@
static void efi_call_virt_check_flags(unsigned long flags, const char *call)
{
unsigned long cur_flags;
+ bool mismatch;
local_save_flags(cur_flags);
- if (!WARN_ON_ONCE(cur_flags != flags))
+
+ mismatch = (cur_flags ^ flags) & ARCH_EFI_IRQ_FLAGS_MASK;
+ if (!WARN_ON_ONCE(mismatch))
return;
add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_NOW_UNRELIABLE);
On Mon, Apr 25, 2016 at 11:51:53AM +0100, Matt Fleming wrote:
> On Mon, 25 Apr, at 11:40:09AM, Mark Rutland wrote:
> >
> > It looks like irqs_disabled_flags() will do what you expect, and ignore
> > everything but the interrupt flag.
> >
> > However, for ARM that will ignore the other exceptions we've seen FW
> > erroneously unmask (e.g. FIQ), which is unfortunate, as fiddling with
> > those is just as disastrous.
>
> Bah, right.
>
> > Would you be happy with an arch_efi_call_check_flags(before, after),
> > instead? That way we can make the flags we check arch-specific.
>
> Could we just make the flag mask arch-specific instead of the call
> since the rest of efi_call_virt_check_flags() is good?
Yup, I meant that arch_efi_call_check_flags would only do the flag
comparison, only replacing the bit currently in the WARN_ON_ONCE().
> Something like this (uncompiled, untested, half-baked),
>
> ---
>
> diff --git a/drivers/firmware/efi/runtime-wrappers.c b/drivers/firmware/efi/runtime-wrappers.c
> index c38b1cfc26e2..057d00bee7d6 100644
> --- a/drivers/firmware/efi/runtime-wrappers.c
> +++ b/drivers/firmware/efi/runtime-wrappers.c
> @@ -25,9 +25,12 @@
> static void efi_call_virt_check_flags(unsigned long flags, const char *call)
> {
> unsigned long cur_flags;
> + bool mismatch;
>
> local_save_flags(cur_flags);
> - if (!WARN_ON_ONCE(cur_flags != flags))
> +
> + mismatch = (cur_flags ^ flags) & ARCH_EFI_IRQ_FLAGS_MASK;
> + if (!WARN_ON_ONCE(mismatch))
> return;
This style also works for me.
Should I respin patch 6 as a series doing the above?
I assume that the first 5 patches are fine as-is.
Thanks,
Mark.
On Mon, 25 Apr, at 12:04:55PM, Mark Rutland wrote:
> On Mon, Apr 25, 2016 at 11:51:53AM +0100, Matt Fleming wrote:
> > On Mon, 25 Apr, at 11:40:09AM, Mark Rutland wrote:
> > >
> > > It looks like irqs_disabled_flags() will do what you expect, and ignore
> > > everything but the interrupt flag.
> > >
> > > However, for ARM that will ignore the other exceptions we've seen FW
> > > erroneously unmask (e.g. FIQ), which is unfortunate, as fiddling with
> > > those is just as disastrous.
> >
> > Bah, right.
> >
> > > Would you be happy with an arch_efi_call_check_flags(before, after),
> > > instead? That way we can make the flags we check arch-specific.
> >
> > Could we just make the flag mask arch-specific instead of the call
> > since the rest of efi_call_virt_check_flags() is good?
>
> Yup, I meant that arch_efi_call_check_flags would only do the flag
> comparison, only replacing the bit currently in the WARN_ON_ONCE().
>
> > Something like this (uncompiled, untested, half-baked),
> >
> > ---
> >
> > diff --git a/drivers/firmware/efi/runtime-wrappers.c b/drivers/firmware/efi/runtime-wrappers.c
> > index c38b1cfc26e2..057d00bee7d6 100644
> > --- a/drivers/firmware/efi/runtime-wrappers.c
> > +++ b/drivers/firmware/efi/runtime-wrappers.c
> > @@ -25,9 +25,12 @@
> > static void efi_call_virt_check_flags(unsigned long flags, const char *call)
> > {
> > unsigned long cur_flags;
> > + bool mismatch;
> >
> > local_save_flags(cur_flags);
> > - if (!WARN_ON_ONCE(cur_flags != flags))
> > +
> > + mismatch = (cur_flags ^ flags) & ARCH_EFI_IRQ_FLAGS_MASK;
> > + if (!WARN_ON_ONCE(mismatch))
> > return;
>
> This style also works for me.
Cool. One thing that occurred to me after I sent it is that
technically we should either,
1) make 'mismatch' an int or
2) do mismatch = !!((cur_flags ^ flags) & ARCH_EFI_IRQ_FLAGS_MASK)
Either is fine with me, I just don't want to leave the implicit
conversion to C's type system.
> Should I respin patch 6 as a series doing the above?
>
> I assume that the first 5 patches are fine as-is.
Yep, they're fine. Sure, go ahead and respin patch 6.