On Wed, 02 Aug 2006 21:12:21 +0400, Stas Sergeev wrote:
>
> > iret faults, but doesn't pop the user return frame.
> But does it push the kernel frame after it or not?
> If not - I don't understand how we go to a fixup.
> If yes - I don't understand how the user's frame gets
> accessed later, as it is above the kernel's frame.
Just before trying to return to userspace, we have a stack:
    user_regs          [ebx ... es]
    orig_eax
    user_iret_frame    [eip ... oldss]
After RESTORE_ALL and discarding orig_eax, we have this just
before doing iret (user's regs are in the CPU regs now):
    user_iret_frame    [eip ... oldss]
iret faults and we get:
    kernel_iret_frame  [eip(of iret) ... flags]
    user_iret_frame    [eip ... oldss]
error_code then saves regs and we have:
    user_regs          [ebx ... es]
    orig_eax           [== -1]
    kernel_iret_frame  [eip(iret) ... flags]
    user_iret_frame    [eip ... oldss]
error_code then calls e.g. do_segment_not_present, which finds a fixup
and does:
    regs->eip = fixup_address;
Now we have:
    user_regs          [ebx ... es]
    orig_eax           [== -1]
    kernel_iret_frame  [eip(fixup) ... flags]
    user_iret_frame    [eip ... oldss]
The standard return sequence gives us (again, user's regs are back in the CPU):
    kernel_iret_frame  [eip(fixup) ... flags]
    user_iret_frame    [eip ... oldss]
iret returns to the fixup code, which jumps to error_code, and then we have:
    user_regs          [ebx ... es]
    orig_eax           [== -1]
    user_iret_frame    [eip ... oldss]
So now there is a stack frame that looks like it came from userspace
and we call the iret fault handler with that.
Only problem I have with this is we lose the original fault info from
the iret. So we have no real way of knowing whether it was #GP, #NP, #SS
or whatever, and no record of the offending iret's address.
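
For anyone who wants to replay the sequence above, here is a rough toy
model in C. It is only a sketch: each stack slot is a label rather than
real register contents, and the names (struct kstack, push, pop,
iret_fault_sequence) are made up for illustration, not kernel code.

```c
#include <assert.h>

/* Toy labels for the stack areas named in the walkthrough (illustrative only). */
enum slot { USER_IRET, KERNEL_IRET, ORIG_EAX, USER_REGS };

struct kstack {
    enum slot slots[8];
    int depth;              /* slots[depth - 1] is the top of the stack */
};

static void push(struct kstack *s, enum slot v)
{
    assert(s->depth < 8);
    s->slots[s->depth++] = v;
}

static enum slot pop(struct kstack *s)
{
    assert(s->depth > 0);
    return s->slots[--s->depth];
}

/* Replays the steps of the walkthrough and returns the final stack. */
static struct kstack iret_fault_sequence(void)
{
    struct kstack s = { .depth = 0 };

    /* Just before iret: only the user iret frame is left. */
    push(&s, USER_IRET);

    /* iret faults: a kernel frame goes on top, the user frame stays put. */
    push(&s, KERNEL_IRET);

    /* error_code saves regs on top of that. */
    push(&s, ORIG_EAX);
    push(&s, USER_REGS);

    /* The handler rewrites regs->eip to the fixup; the layout is unchanged. */

    /* Standard return: restore regs, discard orig_eax, iret to the fixup.
     * The kernel frame (and the faulting iret's eip in it) is consumed here. */
    (void)pop(&s);          /* USER_REGS */
    (void)pop(&s);          /* ORIG_EAX */
    (void)pop(&s);          /* KERNEL_IRET */

    /* The fixup jumps to error_code, which saves regs again. */
    push(&s, ORIG_EAX);
    push(&s, USER_REGS);

    return s;
}
```

The stack returned by iret_fault_sequence() holds USER_IRET at the
bottom, then ORIG_EAX, then USER_REGS, i.e. it looks like a fault that
came straight from userspace, with no kernel frame left to say what
the iret fault was.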
--
Chuck
Hi.
Chuck Ebbert wrote:
> Only problem I have with this is we lose the original fault info from
> the iret. So we have no real way of knowing whether it was #GP, #NP, #SS
> or whatever, and no record of the offending iret's address.
Thanks for the precise explanation.
I also had trouble reading Intel's manual: its pseudo-code uses
Pop(), and it Pop()s the values *before* checking them. The
description of Pop() is very confusing:
---
Pop() removes the value from the top of the stack and returns it.
---
What "removes" means here is unclear, and so is whether it adjusts
the stack pointer. Since it is Pop(), I was assuming that "removes"
means it also adjusts the stack pointer, but now I see that was a
wrong guess.
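
To make the distinction concrete, here is a small toy model in C of the
behaviour described in this thread: the values are read through a
temporary stack pointer, and the real one is only updated once the
checks pass. This is a sketch under my own assumptions, not Intel's
pseudo-code; struct toy_cpu, selector_ok() and toy_iret() are
hypothetical names, and the check is a stand-in for the real segment
checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_cpu {
    uint32_t stack[8];      /* stack[esp] is the top; pops move esp upward */
    unsigned esp;
    uint32_t eip, cs, eflags;
};

/* Hypothetical stand-in for the real selector checks: 0 is "bad". */
static bool selector_ok(uint32_t cs)
{
    return cs != 0;
}

/* Returns true on success, false on a simulated fault. */
static bool toy_iret(struct toy_cpu *cpu)
{
    unsigned tmp_esp = cpu->esp;    /* work on a copy of the stack pointer */

    assert(tmp_esp + 3 <= 8);
    uint32_t eip    = cpu->stack[tmp_esp++];
    uint32_t cs     = cpu->stack[tmp_esp++];
    uint32_t eflags = cpu->stack[tmp_esp++];

    if (!selector_ok(cs))
        return false;               /* fault: cpu->esp was never changed */

    /* Checks passed: only now commit the new state, including esp. */
    cpu->eip = eip;
    cpu->cs = cs;
    cpu->eflags = eflags;
    cpu->esp = tmp_esp;
    return true;
}
```

With a bad selector in the frame, toy_iret() returns false and esp
still points at the frame, which is why the user's iret frame is still
on the stack when the fault handler runs.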