LinuxLists.cc - Re: + espfix-code-cleanup.patch added to -mm tree

2006-08-02 19:18:26

Subject: Re: + espfix-code-cleanup.patch added to -mm tree

On Wed, 02 Aug 2006 21:12:21 +0400, Stas Sergeev wrote:
> > iret faults, but doesn't pop the user return frame.
> But does it push the kernel frame after it or not?
> If not - I don't understand how we go to a fixup.
> If yes - I don't understand how the user's frame gets
> accessed later, as it is above the kernel's frame.

Just before trying to return to userspace, the kernel stack looks
like this (top of stack first):

user_regs [ebx ... es]
orig_eax
user_iret_frame [eip ... oldss]

After RESTORE_ALL and discarding orig_eax, we have this just
before doing iret (user's regs are in the CPU regs now):

user_iret_frame [eip ... oldss]

iret faults and we get:

kernel_iret_frame [eip(of iret) ... flags]
user_iret_frame [eip ... oldss]

error_code then saves regs and we have:

user_regs [ebx ... es]
orig_eax [== -1]
kernel_iret_frame [eip(iret) ... flags]
user_iret_frame [eip ... oldss]

error_code then calls e.g. do_segment_not_present, which finds a fixup
and does:

regs->eip = fixup_address;

now we have:

user_regs [ebx ... es]
orig_eax [== -1]
kernel_iret_frame [eip(fixup) ... flags]
user_iret_frame [eip ... oldss]

standard return sequence gives us (again user's regs are back in CPU):

kernel_iret_frame [eip(fixup) ... flags]
user_iret_frame [eip ... oldss]

iret returns to the fixup code which jumps to error_code and then we have:

user_regs [ebx ... es]
orig_eax [== -1]
user_iret_frame [eip ... oldss]

So now there is a stack frame that looks like it came from userspace
and we call the iret fault handler with that.

Only problem I have with this is we lose the original fault info from
the iret. So we have no real way of knowing whether it was #GP, #NP, #SS
or whatever, and no record of the offending iret's address.

--
Chuck

2006-08-02 19:28:36

by Stas Sergeev

Subject: Re: + espfix-code-cleanup.patch added to -mm tree

Hi.

Chuck Ebbert wrote:
> Only problem I have with this is we lose the original fault info from
> the iret. So we have no real way of knowing whether it was #GP, #NP, #SS
> or whatever, and no record of the offending iret's address.
Thanks for the precise explanation.
I also had trouble reading Intel's manual: its pseudo-code
uses Pop(), and it Pop()s the values *before* checking them.
The description of Pop() is very confusing:
---
Pop() removes the value from the top of the stack and returns it.
---
It is unclear what "removes" means here, and in particular
whether it adjusts the stack pointer. Since it is Pop(), I
assumed "removes" meant that the stack pointer is adjusted too,
but now I see that was a wrong guess.