2002-06-30 04:37:38

by Willy Tarreau

Subject: [ANNOUNCE] CMOV emulation for 2.4.19-rc1

Hi all,

OK, I know that many people dislike this, but I know others
who occasionally need it anyway. So I don't post it for general
inclusion, but for interested people.

What is it? It's a software emulator for x86 opcodes not handled
by the processor. The available emulations are:
- BSWAP, CMPXCHG, XADD on 386 (486 instructions)
- CMOV on any x86 processor (pentium-pro instructions)

It is not meant to replace a correct compilation, but most of us
have at some point tried to rescue a damaged system from a boot
disk, copied a few programs over on a floppy, and then got an
'Illegal instruction' because those programs had been built with
a badly configured compiler. I once had a gcc which targeted i686
by default, and I had a hard time trying to run an e2fsck it
had compiled on my k6 notebook!

The same happens if you take a disk from one system and try to boot
it on another. Well, I won't spend more time finding examples.

I've been using the 486 emulation on my 386 firewall for a few
years now, and have been quite happy with it. I cleaned the code
a bit and added support for cmov. All this will grow your bzImage
by less than 1 kB.

As I stated above, it is *NOT* meant to replace a recompilation
for the correct target. Emulation is quite slow because of the
time the CPU spends processing the trap. I measured about 450
cycles for a cmov, which is quite an overhead, but still
acceptable for occasional use (1 us on my k6/450).

I was thinking about adding some statistics, such
as the number of traps caught, globally and per process, but
finally realized that this was only bloat for something that
should not be used permanently.

I didn't have the time to try VMWare on top of this. It could
be interesting to be able to provide CMOV or other instructions
to guest systems.

Here is the patch against 2.4.19-rc1.

Comments and criticisms welcome, but flames to /dev/null.

Cheers,
Willy


Attachments:
(No filename) (1.89 kB)
patch-2.4.19-rc1-emux86-0.2 (21.29 kB)

2002-07-01 09:07:15

by Denis Vlasenko

Subject: Re: [ANNOUNCE] CMOV emulation for 2.4.19-rc1

On 30 June 2002 02:39, Willy TARREAU wrote:
> Hi all,
>
> OK, I know that many people dislike this, but I know others
> who occasionally need it anyway. So I don't post it for general
> inclusion, but for interested people.

This code is performance critical.
With this in mind,

+static void *modrm_address(struct pt_regs *regs, unsigned char **from,
+ char w, char bit32, unsigned char modrm)
+{
w seems to be unused

+ offset = **from + (((int)*(*from+1)) << 8) +
+ (((int)*(*from+2)) << 16) + (((int)*(*from+3)) << 24);
+ *from += 4;
Why? i86 can do unaligned accesses:
offset = *(u32*)(*from); *from += 4;
or even
offset = *((u32*)(*from))++; //ugly isn't it?

+ /* base off32 + scaled index */
+ offset += **from + (((int)*(*from+1)) << 8) +
+ (((int)*(*from+2)) << 16) + (((int)*(*from+3)) << 24);
+ *from += 4;
same

+ } else if ((modrm & 0xC0) == 0x80) { /* 32 bits unsigned offset */
+ offset += **from + (((int)*(*from+1)) << 8) +
+ (((int)*(*from+2)) << 16) + (((int)*(*from+3)) << 24);
+ *from += 4;
same

+ if ((modrm & 0xC7) == 0x06) { /* 16 bits offset */
+ offset = **from + (((int)*(*from+1)) << 8);
+ *from += 2;
similar

+ } else if ((modrm & 0xC0) == 0x80) { /* 16 bits unsigned offset */
+ offset += **from + (((int)*(*from+1)) << 8);
+ *from += 2;
similar

+asmlinkage void do_invalid_op(struct pt_regs * regs, long error_code)
+{
...
+ /* we'll first read all known opcode prefixes, and discard obviously
+ invalid combinations.*/
Prefixes are rarer than plain opcodes. Maybe:
1.check opcodes
2.no? check prefixes
3.yes? check opcodes again

+ if (prefixes & PREFIX_LOCK)
+ goto invalid_opcode;
Cycles burned for nothing.
What harm can be done if we managed to emulate
lock lock lock xadd a,b
?
(same for all other prefixes)
(OTOH this way you can be sure while() will end sooner or later)

+ case 0xF2: /* repne */
...
+ case 0xF3: /* rep */
These prefixes are invalid for commands we emulate.
No GCC will ever generate such code, don't check for them.
(Comment them out if you like to retain the code)
This will also simplify
+ else if ((*eip & 0xfc) == 0xf0) {
+ switch (*eip) {
+ case 0xF0: /* lock */
down to single if().

+ reg = (modrm >> 3) & 7;
+ modrm &= 0xC7;
+
+ /* condition is valid */
+ dst = reg_address(regs, 1, reg); <<<1
+ if ((modrm & 0xC0) == 0xC0) { /* register to register */
+ src = reg_address(regs, 1, modrm & 0x07);
+ }
+ else {
+ src = modrm_address(regs, &eip, 1, !(prefixes & PREFIX_A32), modrm); <<<2
eliminate 'reg = (modrm >> 3) & 7', move calculation to <<<1
eliminate 'modrm &= 0xC7', move to <<<2 or drop it
(I think modrm_address() will work fine with unmasked modrm)
--
vda

2002-07-01 13:01:07

by willy tarreau

Subject: Re: [ANNOUNCE] CMOV emulation for 2.4.19-rc1

Hello Denis,

> This code is performance critical.
> With this in mind,

Yes and no. In fact, I first wanted to code some
parts in assembler because GCC is sub-optimal
on bit-fields calculations. But then, I realized that
I could save, say 10 cycles, while the trap costs
about 400 cycles.

> +static void *modrm_address(struct pt_regs *regs,
> unsigned char **from, char w, char bit32,
> w seems to be unused

Well, you're right, it's not used anymore. It was
used to check if the instruction applies to a byte
or a word.

> Why? i86 can do unaligned accesses:
> offset = *(u32*)(*from); *from += 4;

that's simply because I'm not sure if the kernel
runs with AC flag on or off. I quickly checked
that it's OK from userland.

> + /* we'll first read all known opcode prefixes, and discard obviously
> + invalid combinations.*/
> Prefixes are rarer than plain opcodes. Maybe:
> 1.check opcodes
> 2.no? check prefixes
> 3.yes? check opcodes again

perhaps a good idea, I don't know. I think the
current code doesn't cost much in case there
is no prefix (only 3 failed IFs). I also wrote a
prefix bitmap to directly map opcodes to prefixes/
known instructions, but thought it was not really
useful and it cost 32 bytes, so I removed it.

> + if (prefixes & PREFIX_LOCK)
> + goto invalid_opcode;
> Cycles burned for nothing.
> What harm can be done if we managed to emulate
> lock lock lock xadd a,b

Simply to prevent someone who fills 16 MB of code
with prefixes from spending all his time in the kernel.
When I did this, I had checked and noticed that
an instruction with a repeated prefix is invalid.

> + case 0xF3: /* rep */
> These prefixes are invalid for commands we emulate.
> No GCC will ever generate such code, don't check for
> them.

Yes, I agree with you. The only instructions that
support these prefixes are stable and it's not likely
that others will come in the future, so we may
handle them in the general case of invalid
instruction.

> eliminate 'reg = (modrm >> 3) & 7', move calculation
> to <<<1
> eliminate 'modrm &= 0xC7', move to <<<2 or drop it
> (I think modrm_address() will work fine with unmasked
> modrm)

I know this part can be reworked. I just need a bit
of time to check redundant calculations between
the main function and modrm_address(), and I think
I can simplify even more.

Like I said above, I didn't focus on optimizations;
I preferred to get clear code first. If I want to
optimize, I think most of this will be assembler.

Thanks a lot for your feedback,
Willy



2002-07-01 13:19:41

by Denis Vlasenko

Subject: Re: [ANNOUNCE] CMOV emulation for 2.4.19-rc1

On 1 July 2002 11:03, willy tarreau wrote:
> Hello Denis,
>
> > This code is performance critical.
> > With this in mind,
>
> Yes and no. In fact, I first wanted to code some
> parts in assembler because GCC is sub-optimal
> on bit-fields calculations. But then, I realized that
> I could save, say 10 cycles, while the trap costs
> about 400 cycles.

Can you code up a "dummy" emulator (which just ignores
any invalid opcode by doing eip+=3) and compare trap times
of your emulator and dummy one for, say, CMOVC AL,AL?
(with carry flag cleared)
--
vda

2002-07-01 13:23:28

by willy tarreau

Subject: Re: [ANNOUNCE] CMOV emulation for 2.4.19-rc1

> Can you code up a "dummy" emulator (which just
> ignores any invalid opcode by doing eip+=3) and
> compare trap times of your emulator and dummy
> one for, say, CMOVC AL,AL? (with carry flag
> cleared)

I may do this. Don't have the time at the moment,
but perhaps this evening...

cheers,
Willy



2002-07-01 13:27:14

by Denis Vlasenko

Subject: Re: [ANNOUNCE] CMOV emulation for 2.4.19-rc1

On 30 June 2002 02:39, Willy TARREAU wrote:
> Hi all,
>
> OK, I know that many people dislike this, but I know others
> who occasionally need it anyway. So I don't post it for general
> inclusion, but for interested people.

+ if ((*eip == 0x0F) && ((*(eip+1) & 0xF0) == 0x40)) { /* CMOV* */
...
+ if ((*eip == 0x0F) && ((*(eip+1) & 0xF8) == 0xC8)) { /* BSWAP */
...
+ if ((*eip == 0x0F) && ((*(eip+1) & 0xFE) == 0xB0)) { /* CMPXCHG */
...
+ if ((*eip == 0x0F) && ((*(eip+1) & 0xFE) == 0xC0)) { /* XADD */

You may check for 0x0F only once:

if(*eip!=0x0f) goto invalid_opcode;
eip++;
--
vda

2002-07-01 15:58:11

by Bill Davidsen

Subject: Re: [ANNOUNCE] CMOV emulation for 2.4.19-rc1

On Mon, 1 Jul 2002, willy tarreau wrote:

> Like I said above, I didn't focus on optimizations;
> I preferred to get clear code first. If I want to
> optimize, I think most of this will be assembler.

This sounds good. The idea is that it should work at all, and clarity is
good; I can't imagine anyone running this long term instead of rebuilding
for the right machine type.

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

2002-07-01 16:23:20

by Gabriel Paubert

Subject: Re: [ANNOUNCE] CMOV emulation for 2.4.19-rc1

willy tarreau wrote:
> Hello Denis,
>
>
>>This code is performance critical.
>>With this in mind,
>
>
> Yes and no. In fact, I first wanted to code some
> parts in assembler because GCC is sub-optimal
> on bit-fields calculations. But then, I realized that
> I could save, say 10 cycles, while the trap costs
> about 400 cycles.
>
>
>>+static void *modrm_address(struct pt_regs *regs,
>>unsigned char **from, char w, char bit32,
>>w seems to be unused
>
>
> Well, you're right, it's not used anymore. It was
> used to check if the instruction applies to a byte
> or a word.
>
>
>>Why? i86 can do unaligned accesses:
>> offset = *(u32*)(*from); *from += 4;
>
>
> that's simply because I'm not sure if the kernel
> runs with AC flag on or off. I quickly checked
> that it's OK from userland.

AC is only checked when running at CPL==3, i.e.,
you'll never get an alignment trap in the kernel.

Gabriel



2002-07-01 17:06:31

by willy tarreau

Subject: Re: [ANNOUNCE] CMOV emulation for 2.4.19-rc1

> > that's simply because I'm not sure if the kernel
> > runs with AC flag on or off. I quickly checked
> > that it's OK from userland.
>
> AC is only checked when running at CPL==3, i.e.,
> you'll never get an alignment trap in the kernel.

thanks for the tip, I think that with all the feedback
I got, I could rewrite it more cleanly ;-)

Cheers,
Willy



2002-07-02 06:11:24

by Denis Vlasenko

Subject: Re: [ANNOUNCE] CMOV emulation for 2.4.19-rc1

On 1 July 2002 13:55, Bill Davidsen wrote:
> On Mon, 1 Jul 2002, willy tarreau wrote:
> > Like I said above, I didn't focus on optimizations;
> > I preferred to get clear code first. If I want to
> > optimize, I think most of this will be assembler.
>
> This sounds good. The idea is that it should work at all, and clarity is
> good; I can't imagine anyone running this long term instead of rebuilding
> for the right machine type.

I see a potential problem here: if someone runs such a kernel
all the time, he can take a huge performance penalty. 'Dunno why, but on
my box the mailer does not run. It _crawls_'.
An ordinary user may perceive it as 'Linux is slow'.

What can be done to prevent this? A printk can go unnoticed in the log;
as long as nothing actually breaks, the user won't look into the logs...

1.big red letters 'CMOV EMULATION' across the screen? :-)
2.Scroll lock LED inverted each time CMOV is triggered?
3.Printk at kernel init time:
"Emergency rescue kernel with CMOV emulation: can be very slow,
not for production use!" ?

Of course (1) is a joke.
--
vda

2002-07-02 06:29:07

by willy tarreau

Subject: Re: [ANNOUNCE] CMOV emulation for 2.4.19-rc1

> I see a potential problem here: if someone runs
> such a kernel all the time, he can take a huge
> performance penalty. 'Dunno why, but on
> my box the mailer does not run. It _crawls_'.
> An ordinary user may perceive it as 'Linux is slow'.
> What can be done to prevent this? A printk can go
> unnoticed in the log; as long as nothing actually
> breaks, the user won't look into the logs...

As I said in my earlier mail, I think it would be
good to at least implement statistics on the
number of traps for each instruction set, and
also to be able to disable emulation, to check
whether a program runs correctly without it
or not.

> 1.big red letters 'CMOV EMULATION' across the screen? :-)
> 2.Scroll lock LED inverted each time CMOV is triggered?
> 3.Printk at kernel init time:
> "Emergency rescue kernel with CMOV emulation: can
> be very slow, not for production use!" ?

perhaps not, but we could send an alert message on
the system console the first time an instruction is
emulated, with the program's name. But nothing more,
else we'll have to modify the task struct to include
counters, and I really don't want that.

> Of course (1) is a joke.

so (2) isn't ? and you talk about overhead of 3 IFs
:-)

Cheers,
Willy



2002-07-02 07:27:39

by Denis Vlasenko

Subject: Re: [ANNOUNCE] CMOV emulation for 2.4.19-rc1

On 2 July 2002 04:31, willy tarreau wrote:
> > 1.big red letters 'CMOV EMULATION' across the screen? :-)
> > 2.Scroll lock LED inverted each time CMOV is triggered?
> > 3.Printk at kernel init time:
> > "Emergency rescue kernel with CMOV emulation: can
> > be very slow, not for production use!" ?
>
> perhaps not, but we could send an alert message on
> the system console the first time an instruction is
> emulated, with the program's name. But nothing more,
> else we'll have to modify the task struct to include
> counters, and I really don't want that.
>
> > Of course (1) is a joke.
>
> so (2) isn't ? and you talk about overhead of 3 IFs

(2) is a half-joke, so to speak. It would be funny to see
on lkml:

From: [email protected]
Subj: mailer crawls like on 286 and scroll LED blinks!!!

Everyone will immediately realize what's going on.
This will save us chasing non-existent performance
problems. But it will cost _many_ cycles each fault.

Seriously, I think (3) is best. Why?

> an alert message on the system console
> the first time an instruction is emulated

is a printk(KERN_EMERG...); it can go unnoticed
too (all logs go to a file only, or the user is in X)
_and_ it incurs a penalty on each fault.

> > 3.Printk at kernel init time:
> > "Emergency rescue kernel with CMOV emulation: can
> > be very slow, not for production use!" ?

is non-suppressible (unless the user is stupid enough
to suppress ALL kernel boot messages) and has no penalty.
--
vda

2002-07-02 19:57:53

by Willy Tarreau

Subject: Re: [ANNOUNCE] CMOV emulation for 2.4.19-rc1

> Can you code up a "dummy" emulator (which just ignores
> any invalid opcode by doing eip+=3) and compare trap times
> of your emulator and dummy one for, say, CMOVC AL,AL?
> (with carry flag cleared)

The dummy emulator costs exactly 296 cycles (stable) on my
k6-2/450. It only adds 3 to eip then returns.

To check this, I compared 1 million iterations of 10
consecutive cmove %eax,%eax with as many lea 0(%eax),%eax
(1 cycle, RAW dependency, not parallelizable), and the
difference was exactly 660 ns/inst (297 cycles).

That said, I agree with you that it's worth optimizing a
bit, at least to stay closer to 300 cycles than to 450.
But that won't make emulated machines fast anyway.

One interesting note: I tested the prog on a VIA C3/533
MHz. One native cmove %eax,%eax costs 56 cycles here! (At
first, I even thought it was emulated.) It's a shame to see
how these instructions have been implemented. Maybe they
flush the pipelines, write-backs, ... before the instruction.
BTW, cmov isn't reported in cpu_flags, perhaps to discourage
progs from using it ;-)

I will recode the stuff, and add two preventive messages:
- at boot time : "warning: this kernel may emulate unsupported instructions. If you
find it slow, please do dmesg."
- at first emulation : "trap caught for instruction XXX, program XXX."

Cheers,
Willy

2002-07-03 00:34:16

by jw schultz

Subject: Re: [ANNOUNCE] CMOV emulation for 2.4.19-rc1

On Tue, Jul 02, 2002 at 10:00:05PM +0200, Willy TARREAU wrote:
> > Can you code up a "dummy" emulator (which just ignores
> > any invalid opcode by doing eip+=3) and compare trap times
> > of your emulator and dummy one for, say, CMOVC AL,AL?
> > (with carry flag cleared)
>
> The dummy emulator costs exactly 296 cycles (stable) on my
> k6-2/450. It only adds 3 to eip then returns.
>
> To check this, I compared 1 million iterations of 10
> consecutive cmove %eax,%eax with as many lea 0(%eax),%eax
> (1 cycle, RAW dependency, not parallelizable), and the
> difference was exactly 660 ns/inst (297 cycles).
>
> That said, I agree with you that it's worth optimizing a
> bit, at least to stay closer to 300 cycles than to 450.
> But that won't make emulated machines fast anyway.
>
> One interesting note: I tested the prog on a VIA C3/533
> MHz. One native cmove %eax,%eax costs 56 cycles here! (At
> first, I even thought it was emulated.) It's a shame to see
> how these instructions have been implemented. Maybe they
> flush the pipelines, write-backs, ... before the instruction.
> BTW, cmov isn't reported in cpu_flags, perhaps to discourage
> progs from using it ;-)
>
> I will recode the stuff, and add two preventive messages:
> - at boot time : "warning: this kernel may emulate unsupported instructions. If you
> find it slow, please do dmesg."
> - at first emulation : "trap caught for instruction XXX, program XXX."

Too often the "it seems slow" complaint comes after weeks or
even months of uptime. How about printing the message every n
times an emulation is required?

if(!(emulation_count++ & 0xHHHH))
printk(...);

wouldn't add too much more overhead than

if (!emulation_notice)
{
emulation_notice = 1;
printk(...);
}

after all this is only supposed to happen under rescue
situations. That way it will be sure to be in the logs and
maybe even on the console and we won't have to hunt for it.

Also, the message should say you are doing instruction
emulation: "wrong model cpu, emulating instruction XXX". I
doubt indicating the program is helpful unless the tracking
is done per task or the printk is issued every time you emulate.

--
________________________________________________________________
J.W. Schultz Pegasystems Technologies
email address: [email protected]

Remember Cernan and Schmitt

2002-07-18 19:12:15

by Robert de Bath

Subject: Re: [ANNOUNCE] CMOV emulation for 2.4.19-rc1

On Tue, 2 Jul 2002, jw schultz wrote:

> wouldn't add too much more overhead than
>
> if (!emulation_notice)
> {
> emulation_notice = 1;
> printk(...);
> }
>
> after all this is only supposed to happen under rescue
> situations. That way it will be sure to be in the logs and
> maybe even on the console and we won't have to hunt for it.
>
> Also, the message should say you are doing instruction
> emulation. "wrong model cpu, emulating instruction XXX" I
> doubt indicating the program is helpful unless the tracking
> is done per task or the printk every time you emulate.

I'd suggest this message could be so frequent that you want to
link its display to real time. Check the jiffy counter each
time and if it's been less than X seconds since the last message
just up a counter. Plus in the message say how many instructions
have been emulated since the last one ... e.g. if it's only five
I don't care, but five million would be a problem!

One other thing ... should the FPU emulator also display messages
like these if it's used?

--
Rob. (Robert de Bath <robert$ @ debath.co.uk>)
<http://www.cix.co.uk/~mayday>


2002-07-18 20:41:20

by jw schultz

Subject: Re: [ANNOUNCE] CMOV emulation for 2.4.19-rc1


On Thu, Jul 18, 2002 at 08:15:05PM +0100, Robert de Bath wrote:
> On Tue, 2 Jul 2002, jw schultz wrote:
[something causing printk every n emulation hits]
>
> > wouldn't add too much more overhead than
> >
> > if (!emulation_notice)
> > {
> > emulation_notice = 1;
> > printk(...);
> > }
> >
> > after all this is only supposed to happen under rescue
> > situations. That way it will be sure to be in the logs and
> > maybe even on the console and we won't have to hunt for it.
> >
> > Also, the message should say you are doing instruction
> > emulation. "wrong model cpu, emulating instruction XXX" I
> > doubt indicating the program is helpful unless the tracking
> > is done per task or the printk every time you emulate.
>
> I'd suggest this message could be so frequent that you want to
> link its display to real time. Check the jiffy counter each
> time and if it's been less than X seconds since the last message
> just up a counter. Plus in the message say how many instructions
> have been emulated since the last one ... e.g. if it's only five
> I don't care, but five million would be a problem!

If a jiffies check (need only be low order word) isn't too
expensive, fine. My concern here is that while i don't want
the printk overhead of emulation to swamp the system i do want
it to pepper the log so if someone is foolish enough to be
miscompiled with this in they will know it.

Emulating advanced instructions via traps is slow, very slow.
i would be willing to put up with an extra 5% time overhead
to tell the user he shouldn't be doing it. This emulation
should only be done long enough to rescue and/or recompile,
period. It occurs to me now that if it comes from user-mode
(can we tell?) we should always printk with ARGV[0], not
PID, to identify the faulty executable.

As such i'm more concerned with codesize than speed. If it
is too big i wouldn't enable it in *config.

>
> One other thing ... should the FPU emulator also display messages
> like these if it's used?
Absolutely not. The kernel never uses FPU instructions and
there are legitimate situations for running on systems
without an FPU where user-level floating point will be used.

The distinction between these two emulations is clear.
FPU emulation allows user-mode code to do floating point
without coding around the (now corner) case of not having a
FPU. CMOV et al emulation allows you to move a HD from a
dead MB to another with a different CPU type or at least
boot a kernel that was configured for the wrong CPU type
without crashing on an "illegal instruction". One is
long-term normal operation, the other is short-term crash
avoidance.


--
________________________________________________________________
J.W. Schultz Pegasystems Technologies
email address: [email protected]

Remember Cernan and Schmitt