2024-04-17 18:01:12

by Xin Li (Intel)

Subject: [PATCH v4 1/1] x86/fred: Fix INT80 emulation for FRED

Add a FRED-specific INT80 handler fred_int80_emulation():

1) As INT instructions and hardware interrupts are separate event
types, FRED does not preclude the use of vector 0x80 for external
interrupts. As a result the FRED setup code does *NOT* reserve
vector 0x80 and calling int80_is_external() is not merely
suboptimal but actively incorrect: it could cause a system call
to be incorrectly ignored.

2) fred_int80_emulation(), only called for handling vector 0x80 of
event type EVENT_TYPE_SWINT, will NEVER be called to handle any
external interrupt (event type EVENT_TYPE_EXTINT).

3) FRED has separate entry flows depending on whether the event came
from user space or kernel space, and because the kernel does not use
INT insns, the FRED kernel entry handler fred_entry_from_kernel()
falls through to fred_bad_type() if the event type is
EVENT_TYPE_SWINT, i.e., INT insns. So if the kernel is handling
an INT insn, it can only be from a user-level application.

4) int80_emulation() does a CLEAR_BRANCH_HISTORY, which is
IDT-specific. FRED will likely take a different approach if it is
ever needed: it *probably* belongs in either fred_intx()/
fred_other() or asm_fred_entrypoint_user(), depending on whether
this ought to be done for all entries from userspace or only system
calls.

5) int $0x80 is the FAST path for 32-bit system calls under FRED.

A dedicated FRED INT80 handler duplicates quite a bit of the code in
do_int80_emulation(), but it avoids sprinkling more tests and seems
more readable.

Fixes: 55617fb991df ("x86/entry: Do not allow external 0x80 interrupts")

Suggested-by: H. Peter Anvin (Intel) <[email protected]>
Signed-off-by: Xin Li (Intel) <[email protected]>
---

Changes since v3:
* Make it clearer why the FRED kernel entry handler
fred_entry_from_kernel() falls through to fred_bad_type() if the event
type is EVENT_TYPE_SWINT, i.e., INT insns (Borislav Petkov).
* Fix the comment about CLEAR_BRANCH_HISTORY (Nikolay Borisov).

Changes since v2:
* Add comments explaining the reasons why a FRED-specific INT80 handler
is required to the head comment of fred_int80_emulation(), not just
the change log (H. Peter Anvin).
* Incorporate extra clarifications from H. Peter Anvin.
* Fix a few typos and wordings (H. Peter Anvin).
* Add a maintainer tip to the change log and head comment: unify common
stuff later, i.e., after the code settles (Borislav Petkov).

Changes since v1:
* Prefer a FRED-specific INT80 handler instead of sprinkling more tests
around (Borislav Petkov).
---
arch/x86/entry/common.c | 71 +++++++++++++++++++++++++++++++++++++
arch/x86/entry/entry_fred.c | 2 +-
2 files changed, 72 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 6de50b80702e..700acda99cc8 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -255,6 +255,77 @@ __visible noinstr void do_int80_emulation(struct pt_regs *regs)
 	instrumentation_end();
 	syscall_exit_to_user_mode(regs);
 }
+
+#ifdef CONFIG_X86_FRED
+/*
+ * A FRED-specific INT80 handler fred_int80_emulation() is required:
+ *
+ * 1) As INT instructions and hardware interrupts are separate event
+ * types, FRED does not preclude the use of vector 0x80 for external
+ * interrupts. As a result the FRED setup code does *NOT* reserve
+ * vector 0x80 and calling int80_is_external() is not merely
+ * suboptimal but actively incorrect: it could cause a system call
+ * to be incorrectly ignored.
+ *
+ * 2) fred_int80_emulation(), only called for handling vector 0x80 of
+ * event type EVENT_TYPE_SWINT, will NEVER be called to handle any
+ * external interrupt (event type EVENT_TYPE_EXTINT).
+ *
+ * 3) FRED has separate entry flows depending on whether the event came
+ * from user space or kernel space, and because the kernel does not
+ * use INT insns, the FRED kernel entry handler
+ * fred_entry_from_kernel() falls through to fred_bad_type() if the
+ * event type is EVENT_TYPE_SWINT, i.e., INT insns. So if the kernel
+ * is handling an INT insn, it can only be from a user-level application.
+ *
+ * 4) int80_emulation() does a CLEAR_BRANCH_HISTORY, which is
+ * IDT-specific. FRED will likely take a different approach if it
+ * is ever needed: it *probably* belongs in either fred_intx()/
+ * fred_other() or asm_fred_entrypoint_user(), depending on whether
+ * this ought to be done for all entries from userspace or only
+ * system calls.
+ *
+ * 5) int $0x80 is the FAST path for 32-bit system calls under FRED.
+ *
+ * A dedicated FRED INT80 handler duplicates quite a bit of the code in
+ * do_int80_emulation(), but it avoids sprinkling more tests and seems
+ * more readable. Just remember that we can always unify common stuff
+ * later if it turns out that it won't diverge anymore, i.e., after the
+ * FRED code settles.
+ */
+DEFINE_FREDENTRY_RAW(int80_emulation)
+{
+	int nr;
+
+	enter_from_user_mode(regs);
+
+	instrumentation_begin();
+	add_random_kstack_offset();
+
+	/*
+	 * FRED pushed 0 into regs::orig_ax and regs::ax contains the
+	 * syscall number.
+	 *
+	 * User tracing code (ptrace or signal handlers) might assume
+	 * that the regs::orig_ax contains a 32-bit number on invoking
+	 * a 32-bit syscall.
+	 *
+	 * Establish the syscall convention by saving the 32bit truncated
+	 * syscall number in regs::orig_ax and by invalidating regs::ax.
+	 */
+	regs->orig_ax = regs->ax & GENMASK(31, 0);
+	regs->ax = -ENOSYS;
+
+	nr = syscall_32_enter(regs);
+
+	local_irq_enable();
+	nr = syscall_enter_from_user_mode_work(regs, nr);
+	do_syscall_32_irqs_on(regs, nr);
+
+	instrumentation_end();
+	syscall_exit_to_user_mode(regs);
+}
+#endif
#else /* CONFIG_IA32_EMULATION */

/* Handles int $0x80 on a 32bit kernel */
diff --git a/arch/x86/entry/entry_fred.c b/arch/x86/entry/entry_fred.c
index ac120cbdaaf2..9fa18b8c7f26 100644
--- a/arch/x86/entry/entry_fred.c
+++ b/arch/x86/entry/entry_fred.c
@@ -66,7 +66,7 @@ static noinstr void fred_intx(struct pt_regs *regs)
 	/* INT80 */
 	case IA32_SYSCALL_VECTOR:
 		if (ia32_enabled())
-			return int80_emulation(regs);
+			return fred_int80_emulation(regs);
 		fallthrough;
 #endif


base-commit: 1e0fd81e4f32a8a383c05d27a672d742b45c1088
--
2.44.0
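
Points 1) to 3) of the change log can be illustrated with a small
userspace model. This is only a sketch, not kernel code: every model_*
and MODEL_* name below is hypothetical, and the real fred_intx() also
handles other INT vectors that the model lumps into "bad event". What
it tries to show is that under FRED the event type, not the vector
number, selects the path, so vector 0x80 can be an ordinary external
interrupt and an INT $0x80 system call at the same time.

```c
#include <stdbool.h>

/*
 * Illustrative event types for this model only; the kernel's
 * EVENT_TYPE_EXTINT and EVENT_TYPE_SWINT play these roles.
 */
enum model_event_type {
	MODEL_EVENT_EXTINT,	/* external (hardware) interrupt */
	MODEL_EVENT_SWINT,	/* INT n instruction */
};

/* Hypothetical handler identifiers returned by the model dispatchers. */
enum model_handler {
	MODEL_HANDLE_EXTERNAL_IRQ,
	MODEL_HANDLE_INT80_SYSCALL,
	MODEL_HANDLE_BAD_EVENT,
};

#define MODEL_SYSCALL_VECTOR 0x80u

/* Entry from user mode: the event type is examined before the vector. */
enum model_handler model_entry_from_user(enum model_event_type type,
					 unsigned int vector,
					 bool ia32_enabled)
{
	switch (type) {
	case MODEL_EVENT_EXTINT:
		/* Vector 0x80 is an ordinary device vector on this path. */
		return MODEL_HANDLE_EXTERNAL_IRQ;
	case MODEL_EVENT_SWINT:
		if (vector == MODEL_SYSCALL_VECTOR && ia32_enabled)
			return MODEL_HANDLE_INT80_SYSCALL;
		return MODEL_HANDLE_BAD_EVENT;
	}
	return MODEL_HANDLE_BAD_EVENT;
}

/* Entry from kernel mode: the kernel never executes INT n, so SWINT is bad. */
enum model_handler model_entry_from_kernel(enum model_event_type type)
{
	if (type == MODEL_EVENT_EXTINT)
		return MODEL_HANDLE_EXTERNAL_IRQ;
	return MODEL_HANDLE_BAD_EVENT;
}
```

In this model the INT80 path is reachable only from user mode and only
for the software-interrupt event type, so a check along the lines of
int80_is_external() has nothing left to decide on it.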



Subject: [tip: x86/urgent] x86/fred: Fix INT80 emulation for FRED

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID: 32f5f73b79ffdef215e2e1bcb6ad74387c0f925c
Gitweb: https://git.kernel.org/tip/32f5f73b79ffdef215e2e1bcb6ad74387c0f925c
Author: Xin Li (Intel) <[email protected]>
AuthorDate: Wed, 17 Apr 2024 10:47:31 -07:00
Committer: Borislav Petkov (AMD) <[email protected]>
CommitterDate: Thu, 18 Apr 2024 10:37:11 +02:00

x86/fred: Fix INT80 emulation for FRED

Add a FRED-specific INT80 handler and document why it differs from the
current one. Eventually, the common bits will be unified once FRED hw is
available and it turns out that no further changes are needed but for
now, keep the handlers separate for everyone's sanity's sake.

[ bp: Zap duplicated commit message, massage. ]

Fixes: 55617fb991df ("x86/entry: Do not allow external 0x80 interrupts")
Suggested-by: H. Peter Anvin (Intel) <[email protected]>
Signed-off-by: Xin Li (Intel) <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/entry/common.c | 65 ++++++++++++++++++++++++++++++++++++-
arch/x86/entry/entry_fred.c | 2 +-
2 files changed, 66 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 6de50b8..51cc9c7 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -255,6 +255,71 @@ __visible noinstr void do_int80_emulation(struct pt_regs *regs)
 	instrumentation_end();
 	syscall_exit_to_user_mode(regs);
 }
+
+#ifdef CONFIG_X86_FRED
+/*
+ * A FRED-specific INT80 handler is warranted for the following reasons:
+ *
+ * 1) As INT instructions and hardware interrupts are separate event
+ * types, FRED does not preclude the use of vector 0x80 for external
+ * interrupts. As a result, the FRED setup code does not reserve
+ * vector 0x80 and calling int80_is_external() is not merely
+ * suboptimal but actively incorrect: it could cause a system call
+ * to be incorrectly ignored.
+ *
+ * 2) It is called only for handling vector 0x80 of event type
+ * EVENT_TYPE_SWINT and will never be called to handle any external
+ * interrupt (event type EVENT_TYPE_EXTINT).
+ *
+ * 3) FRED has separate entry flows depending on whether the event came
+ * from user space or kernel space, and because the kernel does not use
+ * INT insns, the FRED kernel entry handler fred_entry_from_kernel()
+ * falls through to fred_bad_type() if the event type is
+ * EVENT_TYPE_SWINT, i.e., INT insns. So if the kernel is handling
+ * an INT insn, it can only be from user level.
+ *
+ * 4) int80_emulation() does a CLEAR_BRANCH_HISTORY. FRED will likely
+ * take a different approach if it is ever needed: it probably
+ * belongs in either fred_intx()/fred_other() or
+ * asm_fred_entrypoint_user(), depending on whether this ought to
+ * be done for all entries from userspace or only system
+ * calls.
+ *
+ * 5) INT $0x80 is the fast path for 32-bit system calls under FRED.
+ */
+DEFINE_FREDENTRY_RAW(int80_emulation)
+{
+	int nr;
+
+	enter_from_user_mode(regs);
+
+	instrumentation_begin();
+	add_random_kstack_offset();
+
+	/*
+	 * FRED pushed 0 into regs::orig_ax and regs::ax contains the
+	 * syscall number.
+	 *
+	 * User tracing code (ptrace or signal handlers) might assume
+	 * that the regs::orig_ax contains a 32-bit number on invoking
+	 * a 32-bit syscall.
+	 *
+	 * Establish the syscall convention by saving the 32bit truncated
+	 * syscall number in regs::orig_ax and by invalidating regs::ax.
+	 */
+	regs->orig_ax = regs->ax & GENMASK(31, 0);
+	regs->ax = -ENOSYS;
+
+	nr = syscall_32_enter(regs);
+
+	local_irq_enable();
+	nr = syscall_enter_from_user_mode_work(regs, nr);
+	do_syscall_32_irqs_on(regs, nr);
+
+	instrumentation_end();
+	syscall_exit_to_user_mode(regs);
+}
+#endif
#else /* CONFIG_IA32_EMULATION */

/* Handles int $0x80 on a 32bit kernel */
diff --git a/arch/x86/entry/entry_fred.c b/arch/x86/entry/entry_fred.c
index ac120cb..9fa18b8 100644
--- a/arch/x86/entry/entry_fred.c
+++ b/arch/x86/entry/entry_fred.c
@@ -66,7 +66,7 @@ static noinstr void fred_intx(struct pt_regs *regs)
 	/* INT80 */
 	case IA32_SYSCALL_VECTOR:
 		if (ia32_enabled())
-			return int80_emulation(regs);
+			return fred_int80_emulation(regs);
 		fallthrough;
 #endif
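
The register convention that the new handler establishes before
calling syscall_32_enter() can be tried out in userspace. The sketch
below is a model under stated assumptions, not the kernel
implementation: struct model_regs stands in for the two pt_regs fields
involved, MODEL_LOW32_MASK stands in for GENMASK(31, 0), and
model_int80_setup() is a hypothetical name for the two assignments in
the patch.

```c
#include <errno.h>
#include <stdint.h>

/* Userspace stand-in for the two pt_regs fields used here (hypothetical). */
struct model_regs {
	uint64_t ax;		/* syscall number on entry, return value on exit */
	uint64_t orig_ax;	/* FRED pushed 0 here on entry */
};

/* Bits 31..0 set; plays the role of the kernel's GENMASK(31, 0). */
#define MODEL_LOW32_MASK 0xffffffffULL

/*
 * Save the syscall number, truncated to 32 bits, in orig_ax and preload
 * ax with -ENOSYS so that an unhandled syscall returns an error.
 */
void model_int80_setup(struct model_regs *regs)
{
	regs->orig_ax = regs->ax & MODEL_LOW32_MASK;
	regs->ax = (uint64_t)-ENOSYS;
}
```

With this model, any garbage in the upper 32 bits of ax is dropped
from orig_ax, which is what tracing code expecting a 32-bit syscall
number relies on.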