2019-11-11 14:32:27

by Jan Beulich

[permalink] [raw]
Subject: [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments

The first patch here fixes another regression from 3c88c692c287
("x86/stackframe/32: Provide consistent pt_regs"), besides the
one already addressed by
https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg01988.html.
The second patch is a minimal bit of cleanup on top.

1: make xen_iret_crit_fixup independent of frame layout
2: simplify xen_iret_crit_fixup's ring check

Jan


2019-11-11 14:33:41

by Jan Beulich

[permalink] [raw]
Subject: [PATCH 1/2] x86/Xen/32: make xen_iret_crit_fixup independent of frame layout

Now that SS:ESP always get saved by SAVE_ALL, this also needs to be
accounted for in xen_iret_crit_fixup. Otherwise the old_ax value gets
interpreted as EFLAGS, and hence VM86 mode appears to be active all
the time, leading to random "vm86_32: no user_vm86: BAD" log messages
alongside processes randomly crashing.

Since following the previous model (sitting after SAVE_ALL) would
further complicate the code _and_ retain the dependency of
xen_iret_crit_fixup on frame manipulations done by entry_32.S, switch
things around and do the adjustment ahead of SAVE_ALL.

Fixes: 3c88c692c287 ("x86/stackframe/32: Provide consistent pt_regs")
Signed-off-by: Jan Beulich <[email protected]>

--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -1341,11 +1341,6 @@ END(spurious_interrupt_bug)

#ifdef CONFIG_XEN_PV
ENTRY(xen_hypervisor_callback)
- pushl $-1 /* orig_ax = -1 => not a system call */
- SAVE_ALL
- ENCODE_FRAME_POINTER
- TRACE_IRQS_OFF
-
/*
* Check to see if we got the event in the critical
* region in xen_iret_direct, after we've reenabled
@@ -1353,16 +1348,17 @@ ENTRY(xen_hypervisor_callback)
* iret instruction's behaviour where it delivers a
* pending interrupt when enabling interrupts:
*/
- movl PT_EIP(%esp), %eax
- cmpl $xen_iret_start_crit, %eax
+ cmpl $xen_iret_start_crit, (%esp)
jb 1f
- cmpl $xen_iret_end_crit, %eax
+ cmpl $xen_iret_end_crit, (%esp)
jae 1f
-
- jmp xen_iret_crit_fixup
-
-ENTRY(xen_do_upcall)
-1: mov %esp, %eax
+ call xen_iret_crit_fixup
+1:
+ pushl $-1 /* orig_ax = -1 => not a system call */
+ SAVE_ALL
+ ENCODE_FRAME_POINTER
+ TRACE_IRQS_OFF
+ mov %esp, %eax
call xen_evtchn_do_upcall
#ifndef CONFIG_PREEMPTION
call xen_maybe_preempt_hcall
--- a/arch/x86/xen/xen-asm_32.S
+++ b/arch/x86/xen/xen-asm_32.S
@@ -126,10 +126,9 @@ hyper_iret:
.globl xen_iret_start_crit, xen_iret_end_crit

/*
- * This is called by xen_hypervisor_callback in entry.S when it sees
+ * This is called by xen_hypervisor_callback in entry_32.S when it sees
* that the EIP at the time of interrupt was between
- * xen_iret_start_crit and xen_iret_end_crit. We're passed the EIP in
- * %eax so we can do a more refined determination of what to do.
+ * xen_iret_start_crit and xen_iret_end_crit.
*
* The stack format at this point is:
* ----------------
@@ -138,34 +137,23 @@ hyper_iret:
* eflags } outer exception info
* cs }
* eip }
- * ---------------- <- edi (copy dest)
- * eax : outer eax if it hasn't been restored
* ----------------
- * eflags } nested exception info
- * cs } (no ss/esp because we're nested
- * eip } from the same ring)
- * orig_eax }<- esi (copy src)
- * - - - - - - - -
- * fs }
- * es }
- * ds } SAVE_ALL state
- * eax }
- * : :
- * ebx }<- esp
+ * eax : outer eax if it hasn't been restored
* ----------------
+ * eflags }
+ * cs } nested exception info
+ * eip }
+ * return address : (into xen_hypervisor_callback)
*
- * In order to deliver the nested exception properly, we need to shift
- * everything from the return addr up to the error code so it sits
- * just under the outer exception info. This means that when we
- * handle the exception, we do it in the context of the outer
- * exception rather than starting a new one.
+ * In order to deliver the nested exception properly, we need to discard the
+ * nested exception frame such that when we handle the exception, we do it
+ * in the context of the outer exception rather than starting a new one.
*
- * The only caveat is that if the outer eax hasn't been restored yet
- * (ie, it's still on stack), we need to insert its value into the
- * SAVE_ALL state before going on, since it's usermode state which we
- * eventually need to restore.
+ * The only caveat is that if the outer eax hasn't been restored yet (i.e.
+ * it's still on stack), we need to restore its value here.
*/
ENTRY(xen_iret_crit_fixup)
+ pushl %ecx
/*
* Paranoia: Make sure we're really coming from kernel space.
* One could imagine a case where userspace jumps into the
@@ -176,32 +164,26 @@ ENTRY(xen_iret_crit_fixup)
* jump instruction itself, not the destination, but some
* virtual environments get this wrong.
*/
- movl PT_CS(%esp), %ecx
+ movl 3*4(%esp), %ecx /* nested CS */
andl $SEGMENT_RPL_MASK, %ecx
cmpl $USER_RPL, %ecx
+ popl %ecx
je 2f

- lea PT_ORIG_EAX(%esp), %esi
- lea PT_EFLAGS(%esp), %edi
-
/*
* If eip is before iret_restore_end then stack
* hasn't been restored yet.
*/
- cmp $iret_restore_end, %eax
+ cmpl $iret_restore_end, 1*4(%esp)
jae 1f

- movl 0+4(%edi), %eax /* copy EAX (just above top of frame) */
- movl %eax, PT_EAX(%esp)
-
- lea ESP_OFFSET(%edi), %edi /* move dest up over saved regs */
-
- /* set up the copy */
-1: std
- mov $PT_EIP / 4, %ecx /* saved regs up to orig_eax */
- rep movsl
- cld
-
- lea 4(%edi), %esp /* point esp to new frame */
-2: jmp xen_do_upcall
-
+ movl 4*4(%esp), %eax /* load outer EAX */
+ ret $4*4 /* discard nested EIP, CS, and EFLAGS as
+ * well as the just restored EAX */
+
+1:
+ ret $3*4 /* discard nested EIP, CS, and EFLAGS */
+
+2:
+ ret
+END(xen_iret_crit_fixup)

2019-11-11 14:34:57

by Jan Beulich

[permalink] [raw]
Subject: [PATCH 2/2] x86/Xen/32: simplify xen_iret_crit_fixup's ring check

This can be had with two instead of six insns, by just checking the high
CS.RPL bit.

Also adjust the comment - there would be no #GP in the mentioned cases,
as there's no segment limit violation or alike. Instead there'd be #PF,
but that one reports the target EIP of said branch, not the address of
the branch insn itself.

Signed-off-by: Jan Beulich <[email protected]>
---
An alternative would be to keep using SEGMENT_RPL_MASK, but follow it
with "jpe".

--- a/arch/x86/xen/xen-asm_32.S
+++ b/arch/x86/xen/xen-asm_32.S
@@ -153,22 +153,15 @@ hyper_iret:
* it's still on stack), we need to restore its value here.
*/
ENTRY(xen_iret_crit_fixup)
- pushl %ecx
/*
* Paranoia: Make sure we're really coming from kernel space.
* One could imagine a case where userspace jumps into the
* critical range address, but just before the CPU delivers a
- * GP, it decides to deliver an interrupt instead. Unlikely?
- * Definitely. Easy to avoid? Yes. The Intel documents
- * explicitly say that the reported EIP for a bad jump is the
- * jump instruction itself, not the destination, but some
- * virtual environments get this wrong.
+ * PF, it decides to deliver an interrupt instead. Unlikely?
+ * Definitely. Easy to avoid? Yes.
*/
- movl 3*4(%esp), %ecx /* nested CS */
- andl $SEGMENT_RPL_MASK, %ecx
- cmpl $USER_RPL, %ecx
- popl %ecx
- je 2f
+ testb $2, 2*4(%esp) /* nested CS */
+ jnz 2f

/*
* If eip is before iret_restore_end then stack

2019-11-19 13:02:42

by Jan Beulich

[permalink] [raw]
Subject: Ping: [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments

On 11.11.2019 15:30, Jan Beulich wrote:
> The first patch here fixes another regression from 3c88c692c287
> ("x86/stackframe/32: Provide consistent pt_regs"), besides the
> one already addressed by
> https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg01988.html.
> The second patch is a minimal bit of cleanup on top.
>
> 1: make xen_iret_crit_fixup independent of frame layout
> 2: simplify xen_iret_crit_fixup's ring check

Seeing that the other regression fix has been taken into -tip,
what is the situation here? Should 5.4 really ship with this
still unfixed?

Jan

2019-11-19 13:21:56

by Jürgen Groß

[permalink] [raw]
Subject: Re: [PATCH 2/2] x86/Xen/32: simplify xen_iret_crit_fixup's ring check

On 11.11.19 15:32, Jan Beulich wrote:
> This can be had with two instead of six insns, by just checking the high
> CS.RPL bit.
>
> Also adjust the comment - there would be no #GP in the mentioned cases,
> as there's no segment limit violation or alike. Instead there'd be #PF,
> but that one reports the target EIP of said branch, not the address of
> the branch insn itself.
>
> Signed-off-by: Jan Beulich <[email protected]>

Reviewed-by: Juergen Gross <[email protected]>


Juergen

2019-11-19 13:22:03

by Jürgen Groß

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86/Xen/32: make xen_iret_crit_fixup independent of frame layout

On 11.11.19 15:32, Jan Beulich wrote:
> Now that SS:ESP always get saved by SAVE_ALL, this also needs to be
> accounted for in xen_iret_crit_fixup. Otherwise the old_ax value gets
> interpreted as EFLAGS, and hence VM86 mode appears to be active all
> the time, leading to random "vm86_32: no user_vm86: BAD" log messages
> alongside processes randomly crashing.
>
> Since following the previous model (sitting after SAVE_ALL) would
> further complicate the code _and_ retain the dependency of
> xen_iret_crit_fixup on frame manipulations done by entry_32.S, switch
> things around and do the adjustment ahead of SAVE_ALL.
>
> Fixes: 3c88c692c287 ("x86/stackframe/32: Provide consistent pt_regs")
> Signed-off-by: Jan Beulich <[email protected]>

Reviewed-by: Juergen Gross <[email protected]>


Juergen

2019-11-19 17:56:53

by Boris Ostrovsky

[permalink] [raw]
Subject: Re: Ping: [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments

On 11/19/19 7:58 AM, Jan Beulich wrote:
> On 11.11.2019 15:30, Jan Beulich wrote:
>> The first patch here fixes another regression from 3c88c692c287
>> ("x86/stackframe/32: Provide consistent pt_regs"), besides the
>> one already addressed by
>> https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg01988.html.
>> The second patch is a minimal bit of cleanup on top.
>>
>> 1: make xen_iret_crit_fixup independent of frame layout
>> 2: simplify xen_iret_crit_fixup's ring check
> Seeing that the other regression fix has been taken into -tip,
> what is the situation here? Should 5.4 really ship with this
> still unfixed?


I am still unable to boot a 32-bit guest with those patches, crashing in
int3_exception_notify with regs->sp zero.

When I revert to 3c88c692c287 the guest actually boots so my (?) problem
was introduced somewhere in-between.

-boris

Subject: [tip: x86/urgent] x86/xen/32: Simplify ring check in xen_iret_crit_fixup()

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID: 922eea2ce5c799228d9ff1be9890e6873ce8fff6
Gitweb: https://git.kernel.org/tip/922eea2ce5c799228d9ff1be9890e6873ce8fff6
Author: Jan Beulich <[email protected]>
AuthorDate: Mon, 11 Nov 2019 15:32:59 +01:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Tue, 19 Nov 2019 21:58:28 +01:00

x86/xen/32: Simplify ring check in xen_iret_crit_fixup()

This can be had with two instead of six insns, by just checking the high
CS.RPL bit.

Also adjust the comment - there would be no #GP in the mentioned cases, as
there's no segment limit violation or alike. Instead there'd be #PF, but
that one reports the target EIP of said branch, not the address of the
branch insn itself.

Signed-off-by: Jan Beulich <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

---
arch/x86/xen/xen-asm_32.S | 15 ++++-----------
1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/arch/x86/xen/xen-asm_32.S b/arch/x86/xen/xen-asm_32.S
index 392e033..cd17777 100644
--- a/arch/x86/xen/xen-asm_32.S
+++ b/arch/x86/xen/xen-asm_32.S
@@ -153,22 +153,15 @@ hyper_iret:
* it's still on stack), we need to restore its value here.
*/
ENTRY(xen_iret_crit_fixup)
- pushl %ecx
/*
* Paranoia: Make sure we're really coming from kernel space.
* One could imagine a case where userspace jumps into the
* critical range address, but just before the CPU delivers a
- * GP, it decides to deliver an interrupt instead. Unlikely?
- * Definitely. Easy to avoid? Yes. The Intel documents
- * explicitly say that the reported EIP for a bad jump is the
- * jump instruction itself, not the destination, but some
- * virtual environments get this wrong.
+ * PF, it decides to deliver an interrupt instead. Unlikely?
+ * Definitely. Easy to avoid? Yes.
*/
- movl 3*4(%esp), %ecx /* nested CS */
- andl $SEGMENT_RPL_MASK, %ecx
- cmpl $USER_RPL, %ecx
- popl %ecx
- je 2f
+ testb $2, 2*4(%esp) /* nested CS */
+ jnz 2f

/*
* If eip is before iret_restore_end then stack

Subject: [tip: x86/urgent] x86/xen/32: Make xen_iret_crit_fixup() independent of frame layout

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID: 29b810f5a5ec127d3143770098e05981baa3eb77
Gitweb: https://git.kernel.org/tip/29b810f5a5ec127d3143770098e05981baa3eb77
Author: Jan Beulich <[email protected]>
AuthorDate: Mon, 11 Nov 2019 15:32:12 +01:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Tue, 19 Nov 2019 21:58:28 +01:00

x86/xen/32: Make xen_iret_crit_fixup() independent of frame layout

Now that SS:ESP always get saved by SAVE_ALL, this also needs to be
accounted for in xen_iret_crit_fixup(). Otherwise the old_ax value gets
interpreted as EFLAGS, and hence VM86 mode appears to be active all the
time, leading to random "vm86_32: no user_vm86: BAD" log messages alongside
processes randomly crashing.

Since following the previous model (sitting after SAVE_ALL) would further
complicate the code _and_ retain the dependency of xen_iret_crit_fixup() on
frame manipulations done by entry_32.S, switch things around and do the
adjustment ahead of SAVE_ALL.

Fixes: 3c88c692c287 ("x86/stackframe/32: Provide consistent pt_regs")
Signed-off-by: Jan Beulich <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Cc: Stable Team <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]


---
arch/x86/entry/entry_32.S | 22 +++++--------
arch/x86/xen/xen-asm_32.S | 66 +++++++++++++-------------------------
2 files changed, 33 insertions(+), 55 deletions(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 3f847d8..019dbac 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -1341,11 +1341,6 @@ END(spurious_interrupt_bug)

#ifdef CONFIG_XEN_PV
ENTRY(xen_hypervisor_callback)
- pushl $-1 /* orig_ax = -1 => not a system call */
- SAVE_ALL
- ENCODE_FRAME_POINTER
- TRACE_IRQS_OFF
-
/*
* Check to see if we got the event in the critical
* region in xen_iret_direct, after we've reenabled
@@ -1353,16 +1348,17 @@ ENTRY(xen_hypervisor_callback)
* iret instruction's behaviour where it delivers a
* pending interrupt when enabling interrupts:
*/
- movl PT_EIP(%esp), %eax
- cmpl $xen_iret_start_crit, %eax
+ cmpl $xen_iret_start_crit, (%esp)
jb 1f
- cmpl $xen_iret_end_crit, %eax
+ cmpl $xen_iret_end_crit, (%esp)
jae 1f
-
- jmp xen_iret_crit_fixup
-
-ENTRY(xen_do_upcall)
-1: mov %esp, %eax
+ call xen_iret_crit_fixup
+1:
+ pushl $-1 /* orig_ax = -1 => not a system call */
+ SAVE_ALL
+ ENCODE_FRAME_POINTER
+ TRACE_IRQS_OFF
+ mov %esp, %eax
call xen_evtchn_do_upcall
#ifndef CONFIG_PREEMPTION
call xen_maybe_preempt_hcall
diff --git a/arch/x86/xen/xen-asm_32.S b/arch/x86/xen/xen-asm_32.S
index c15db06..392e033 100644
--- a/arch/x86/xen/xen-asm_32.S
+++ b/arch/x86/xen/xen-asm_32.S
@@ -126,10 +126,9 @@ hyper_iret:
.globl xen_iret_start_crit, xen_iret_end_crit

/*
- * This is called by xen_hypervisor_callback in entry.S when it sees
+ * This is called by xen_hypervisor_callback in entry_32.S when it sees
* that the EIP at the time of interrupt was between
- * xen_iret_start_crit and xen_iret_end_crit. We're passed the EIP in
- * %eax so we can do a more refined determination of what to do.
+ * xen_iret_start_crit and xen_iret_end_crit.
*
* The stack format at this point is:
* ----------------
@@ -138,34 +137,23 @@ hyper_iret:
* eflags } outer exception info
* cs }
* eip }
- * ---------------- <- edi (copy dest)
- * eax : outer eax if it hasn't been restored
* ----------------
- * eflags } nested exception info
- * cs } (no ss/esp because we're nested
- * eip } from the same ring)
- * orig_eax }<- esi (copy src)
- * - - - - - - - -
- * fs }
- * es }
- * ds } SAVE_ALL state
- * eax }
- * : :
- * ebx }<- esp
+ * eax : outer eax if it hasn't been restored
* ----------------
+ * eflags }
+ * cs } nested exception info
+ * eip }
+ * return address : (into xen_hypervisor_callback)
*
- * In order to deliver the nested exception properly, we need to shift
- * everything from the return addr up to the error code so it sits
- * just under the outer exception info. This means that when we
- * handle the exception, we do it in the context of the outer
- * exception rather than starting a new one.
+ * In order to deliver the nested exception properly, we need to discard the
+ * nested exception frame such that when we handle the exception, we do it
+ * in the context of the outer exception rather than starting a new one.
*
- * The only caveat is that if the outer eax hasn't been restored yet
- * (ie, it's still on stack), we need to insert its value into the
- * SAVE_ALL state before going on, since it's usermode state which we
- * eventually need to restore.
+ * The only caveat is that if the outer eax hasn't been restored yet (i.e.
+ * it's still on stack), we need to restore its value here.
*/
ENTRY(xen_iret_crit_fixup)
+ pushl %ecx
/*
* Paranoia: Make sure we're really coming from kernel space.
* One could imagine a case where userspace jumps into the
@@ -176,32 +164,26 @@ ENTRY(xen_iret_crit_fixup)
* jump instruction itself, not the destination, but some
* virtual environments get this wrong.
*/
- movl PT_CS(%esp), %ecx
+ movl 3*4(%esp), %ecx /* nested CS */
andl $SEGMENT_RPL_MASK, %ecx
cmpl $USER_RPL, %ecx
+ popl %ecx
je 2f

- lea PT_ORIG_EAX(%esp), %esi
- lea PT_EFLAGS(%esp), %edi
-
/*
* If eip is before iret_restore_end then stack
* hasn't been restored yet.
*/
- cmp $iret_restore_end, %eax
+ cmpl $iret_restore_end, 1*4(%esp)
jae 1f

- movl 0+4(%edi), %eax /* copy EAX (just above top of frame) */
- movl %eax, PT_EAX(%esp)
+ movl 4*4(%esp), %eax /* load outer EAX */
+ ret $4*4 /* discard nested EIP, CS, and EFLAGS as
+ * well as the just restored EAX */

- lea ESP_OFFSET(%edi), %edi /* move dest up over saved regs */
-
- /* set up the copy */
-1: std
- mov $PT_EIP / 4, %ecx /* saved regs up to orig_eax */
- rep movsl
- cld
-
- lea 4(%edi), %esp /* point esp to new frame */
-2: jmp xen_do_upcall
+1:
+ ret $3*4 /* discard nested EIP, CS, and EFLAGS */

+2:
+ ret
+END(xen_iret_crit_fixup)

2019-11-20 02:21:01

by Boris Ostrovsky

[permalink] [raw]
Subject: Re: Ping: [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments

On 11/19/19 12:50 PM, Boris Ostrovsky wrote:
> On 11/19/19 7:58 AM, Jan Beulich wrote:
>> On 11.11.2019 15:30, Jan Beulich wrote:
>>> The first patch here fixes another regression from 3c88c692c287
>>> ("x86/stackframe/32: Provide consistent pt_regs"), besides the
>>> one already addressed by
>>> https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg01988.html.
>>> The second patch is a minimal bit of cleanup on top.
>>>
>>> 1: make xen_iret_crit_fixup independent of frame layout
>>> 2: simplify xen_iret_crit_fixup's ring check
>> Seeing that the other regression fix has been taken into -tip,
>> what is the situation here? Should 5.4 really ship with this
>> still unfixed?
>
> I am still unable to boot a 32-bit guest with those patches, crashing in
> int3_exception_notify with regs->sp zero.
>
> When I revert to 3c88c692c287 the guest actually boots so my (?) problem
> was introduced somewhere in-between.

Nevermind this. I didn't read your patches correctly.

-boris

2019-11-20 02:44:35

by Boris Ostrovsky

[permalink] [raw]
Subject: Re: Ping: [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments

On 11/19/19 9:17 PM, Boris Ostrovsky wrote:
> On 11/19/19 12:50 PM, Boris Ostrovsky wrote:
>> On 11/19/19 7:58 AM, Jan Beulich wrote:
>>> On 11.11.2019 15:30, Jan Beulich wrote:
>>>> The first patch here fixes another regression from 3c88c692c287
>>>> ("x86/stackframe/32: Provide consistent pt_regs"), besides the
>>>> one already addressed by
>>>> https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg01988.html.
>>>> The second patch is a minimal bit of cleanup on top.
>>>>
>>>> 1: make xen_iret_crit_fixup independent of frame layout
>>>> 2: simplify xen_iret_crit_fixup's ring check
>>> Seeing that the other regression fix has been taken into -tip,
>>> what is the situation here? Should 5.4 really ship with this
>>> still unfixed?
>> I am still unable to boot a 32-bit guest with those patches, crashing in
>> int3_exception_notify with regs->sp zero.
>>
>> When I revert to 3c88c692c287 the guest actually boots so my (?) problem
>> was introduced somewhere in-between.
> Nevermind this. I didn't read your patches correctly.

BTW, I'd rather this not go into 5.4 this late. 3c88c692c287 has been
there since 5.2 and noone complained.

-boris



2019-11-20 07:16:19

by Jan Beulich

[permalink] [raw]
Subject: Re: [Xen-devel] Ping: [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments

On 19.11.2019 18:50, Boris Ostrovsky wrote:
> On 11/19/19 7:58 AM, Jan Beulich wrote:
>> On 11.11.2019 15:30, Jan Beulich wrote:
>>> The first patch here fixes another regression from 3c88c692c287
>>> ("x86/stackframe/32: Provide consistent pt_regs"), besides the
>>> one already addressed by
>>> https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg01988.html.
>>> The second patch is a minimal bit of cleanup on top.
>>>
>>> 1: make xen_iret_crit_fixup independent of frame layout
>>> 2: simplify xen_iret_crit_fixup's ring check
>> Seeing that the other regression fix has been taken into -tip,
>> what is the situation here? Should 5.4 really ship with this
>> still unfixed?
>
>
> I am still unable to boot a 32-bit guest with those patches, crashing in
> int3_exception_notify with regs->sp zero.
>
> When I revert to 3c88c692c287 the guest actually boots so my (?) problem
> was introduced somewhere in-between.

In order to even get as far as the patches here taking effect
you first need "x86/stackframe/32: repair 32-bit Xen PV" (which
is what "the other regression fix" in my ping refers to).

Jan

2019-11-20 07:22:10

by Jan Beulich

[permalink] [raw]
Subject: Re: [Xen-devel] Ping: [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments

On 20.11.2019 03:39, Boris Ostrovsky wrote:
> On 11/19/19 9:17 PM, Boris Ostrovsky wrote:
>> On 11/19/19 12:50 PM, Boris Ostrovsky wrote:
>>> On 11/19/19 7:58 AM, Jan Beulich wrote:
>>>> On 11.11.2019 15:30, Jan Beulich wrote:
>>>>> The first patch here fixes another regression from 3c88c692c287
>>>>> ("x86/stackframe/32: Provide consistent pt_regs"), besides the
>>>>> one already addressed by
>>>>> https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg01988.html.
>>>>> The second patch is a minimal bit of cleanup on top.
>>>>>
>>>>> 1: make xen_iret_crit_fixup independent of frame layout
>>>>> 2: simplify xen_iret_crit_fixup's ring check
>>>> Seeing that the other regression fix has been taken into -tip,
>>>> what is the situation here? Should 5.4 really ship with this
>>>> still unfixed?
>>> I am still unable to boot a 32-bit guest with those patches, crashing in
>>> int3_exception_notify with regs->sp zero.
>>>
>>> When I revert to 3c88c692c287 the guest actually boots so my (?) problem
>>> was introduced somewhere in-between.
>> Nevermind this. I didn't read your patches correctly.
>
> BTW, I'd rather this not go into 5.4 this late. 3c88c692c287 has been
> there since 5.2 and noone complained.

Afaict the issues were introduced in 5.3, and my first patch (including
a note [complaint if you will] of the second issue) was sent around
5.4-rc2. This has been blocking osstest's linux-linus forever since, so
even without my mail everyone could have been aware by paying attention
to the flight reports (the bisection ones, unfortunately, are pretty
useless here, as in cases like this one they seem to tend to point at
huge merge commits).

Jan