LinuxLists.cc - Re: ESP corruption bug - what CPUs are affected? (patch att

2004-10-06 17:16:44

Subject: Re: ESP corruption bug - what CPUs are affected? (patch att

On 6 Oct 04 at 20:18, Stas Sergeev wrote:
> Yes, if not for that anonymous guy, who kept posting
> to me until he finally convinced me that the Ring-0
> approach is not that difficult at all.
> So I tried... It was much more difficult to code
> up, but at the end it looks a little better
> and localized to entry.S completely. OTOH it
> touches the exception handlers, but not too much -
> it adds only 5 insns on the fast path. And the
> code is very fragile, but after I made all the
> magic numbers a #define consts, it actually looks
> not so bad.
> I don't know which patch is really better, so
> I am attaching both.

The CPL0 solution is certainly more localized, but I have a hard time
convincing myself that it is actually safe.

I would appreciate it if you could add comments on what values are set
by ESPFIX_SWITCH_16 + 8 + 4 and similar moves, and what they actually
do. And convincing myself that ESPFIX_SWITCH_32 has just the right value so

pushl %eax
pushl %es
lss ESPFIX_SWITCH_32,%esp
popl %es
popl %eax

actually works took almost an hour...
Petr
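
One way to read the sequence above: the two pushes land on the old stack, lss swaps %ss and %esp in one instruction, and the two pops then read from the new stack. The pops only return the pushed values if the new %ss:%esp resolves to the same linear address as the old one. The C model below is a rough sketch of that check, not code from the patch. The struct names, the table of segment bases, and the use of a plain index in place of a real selector are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a segment is reduced to its base address, and a
   "selector" is just an index into this small table (real selectors
   also carry TI and RPL bits, which are ignored here). */
enum { NSEG = 4 };

struct far_ptr {            /* the 6-byte memory operand that lss reads */
    uint32_t offset;        /* loaded into %esp */
    uint16_t selector;      /* loaded into %ss  */
};

struct cpu {
    uint32_t seg_base[NSEG];
    uint16_t ss;
    uint32_t esp;
};

/* Linear address of the stack top: segment base + %esp (mod 2^32). */
static uint32_t stack_linear(const struct cpu *c)
{
    assert(c->ss < NSEG);
    return c->seg_base[c->ss] + c->esp;
}

/* Model of "lss mem,%esp": %ss and %esp are replaced together. */
static void lss(struct cpu *c, struct far_ptr p)
{
    assert(p.selector < NSEG);
    c->ss = p.selector;
    c->esp = p.offset;
}

/* Returns 1 if pops issued after the switch would read back what was
   pushed before it, i.e. the linear stack top did not move. */
static int switch_preserves_stack(struct cpu c, struct far_ptr p)
{
    uint32_t before = stack_linear(&c);
    lss(&c, p);
    return stack_linear(&c) == before;
}
```

For example, with a 16-bit stack segment based at 0x10000 and %esp = 0x0ff8 after the two pushes, a flat (base 0) pointer with offset 0x10ff8 passes the check, while 0x11000 does not.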

2004-10-07 13:46:51

by Stas Sergeev

Subject: Re: ESP corruption bug - what CPUs are affected? (patch att

Hi.

Petr Vandrovec wrote:
> CPL0 solution is certainly more localized, but I have hard problems
> to convince myself that it is actually safe.
I spent 2 days convincing myself the same way:)
The most problematic part was to make sure that
the stack is properly unwound even if an NMI comes
before the exception handler has managed to switch
out of 16bit. But I think this is now handled.

> I would appreciate if you could add comments what values are set
OK, I did. But in fact that only makes the patch
even more obfuscated:( It doesn't look possible
to explain the whole magic pattern and its
constraints in comments. But I tried:)
The patch is attached.

> by ESPFIX_SWITCH_16 + 8 + 4 and similar moves, and what they actually
> do.
Fortunately there are no such moves.
In an attempt to make the patch a little more
self-explanatory, I #define'd all the meaningful
values. So if there is a move to
ESPFIX_SWITCH16_OFFS+some_value, it is safe to
assume that the move is intended to load the
ESPFIX_SWITCH16, and "some_value" is just the
correction constant. So besides the 2 magic
pointers, there are really no moves above the
iret frame.

> And convincing myself that ESPFIX_SWITCH_32 has just right value so
> pushl %eax
> pushl %es
> lss ESPFIX_SWITCH_32,%esp
> popl %es
> popl %eax
> actually works took almost an hour...
I realize that, and that's really the big problem
with that patch. It is very obfuscated and difficult
to understand.
I guess if you saw the first version of that
patch, from before I found a way to use
fixed offsets for locating the switches,
you might just get sick:)

Attachments:

linux-2.6.8-stk0-2a.diff (8.09 kB)

2004-10-11 23:16:30

by Stas Sergeev

Subject: Re: ESP corruption bug - what CPUs are affected?

Hi.

It has been pointed out to me that my previous
ring-0 patch was not completely safe. I was
doing "popl %es". I thought it was safe
because RESTORE_REGS also does that
with the fixup in place, but the anonymous
guy thinks that if %es refers to the LDT and
a thread on another CPU changes that LDT
entry in the meantime, my "popl %es" can GPF.
So I have to avoid popping any segregs.
I moved my recovery code to error_code,
right after %es is used for the last time and
before %esp is used directly (I am lucky such
a place exists there!).
New patch is attached. Does it look safe
this time?
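
To make the race concrete: the CPU checks the descriptor when the segment register is loaded, not when the selector was saved, so a selector that was fine at push time can fault at pop time. The C model below is only a sketch of that timing. The one-flag descriptor, the table size, and the function names are assumptions for illustration, and "fault" stands for whatever exception the real CPU would raise.

```c
#include <assert.h>
#include <stdint.h>

enum { LDT_ENTRIES = 8 };             /* hypothetical table size */

/* Hypothetical descriptor: one flag stands for "passes the CPU's checks". */
struct descriptor { int valid; };
struct ldt { struct descriptor entry[LDT_ENTRIES]; };

enum load_result { LOAD_OK, LOAD_FAULT };

/* Model of loading a segment register: the descriptor is looked up and
   checked at load time, whatever its state was when the selector was saved. */
static enum load_result load_segreg(const struct ldt *t, uint16_t index,
                                    uint16_t *segreg)
{
    assert(index < LDT_ENTRIES);
    if (!t->entry[index].valid)
        return LOAD_FAULT;
    *segreg = index;
    return LOAD_OK;
}

/* Model of "pushl %es ... popl %es" with an optional LDT change made by
   another CPU between the push and the pop. */
static enum load_result pop_after_remote_change(struct ldt *t,
                                                uint16_t saved_es,
                                                int remote_invalidates)
{
    uint16_t es = 0;
    assert(saved_es < LDT_ENTRIES);
    if (remote_invalidates)
        t->entry[saved_es].valid = 0; /* what the other thread might do */
    return load_segreg(t, saved_es, &es);
}
```

In this model the pop succeeds when nothing touches the entry and faults when the entry is invalidated in between, which is why a recovery path that never reloads a segment register sidesteps the problem.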

Attachments:

linux-2.6.8-stk0-3.diff (7.85 kB)