2002-04-19 22:00:15

by Robert Hentosh

[permalink] [raw]
Subject: [PATCH] reboot=bios is invalidating cache incorrectly


Sorry for the trashed email and patch. Apparently Outlook knows better
than I where to do word wrapping on patches.

Here it is again from pine.



When specifying the kernel parameter reboot=bios the assembly code that is
executed to switch to real mode and call the bios vector contains an
error. This causes rebooting via bios to hang in certain conditions.

The hand assembled routine contained in the array "real_mode_switch"
contains INVD which invalidates the CPU caches, unfortunately the routine
was just previously copied via memcpy and is contained in the cache. This
leads to unexpected results. The following patch replaces INVD with
WBINVD which will insure that the routine is written to RAM before
invalidating the cache, providing more reliable reboots.

This patch applies cleanly to 2.4.18 and 2.5.8. It probably also works
with all 2.2.x, 2.4.x and 2.5.x kernels.

This fixes a long standing bug that prevented reliable reboots on some
platforms.


Regards,
Robert Hentosh


--
Robert Hentosh
Sr. Software Engineer
Dell Linux Solutions http://www.dell.com/linux


--- linux-2.4.18.orig/arch/i386/kernel/process.c Fri Apr 19 14:37:21 2002
+++ linux-2.4.18/arch/i386/kernel/process.c Fri Apr 19 14:41:11 2002
@@ -253,7 +253,7 @@
0x66, 0x0f, 0x20, 0xc3, /* movl %cr0,%ebx */
0x66, 0x81, 0xe3, 0x00, 0x00, 0x00, 0x60, /* andl $0x60000000,%ebx */
0x74, 0x02, /* jz f */
- 0x0f, 0x08, /* invd */
+ 0x0f, 0x09, /* wbinvd */
0x24, 0x10, /* f: andb $0x10,al */
0x66, 0x0f, 0x22, 0xc0 /* movl %eax,%cr0 */
};



2002-04-23 13:51:16

by Martin.Knoblauch

[permalink] [raw]
Subject: Re: [PATCH] reboot=bios is invalidating cache incorrectly

> [PATCH] reboot=bios is invalidating cache incorrectly
>
>
> This patch applies cleanly to 2.4.18 and 2.5.8. It probably also works
> with all 2.2.x, 2.4.x and 2.5.x kernels.
>
> This fixes a long standing bug that prevented reliable reboots on some
> platforms.
>
Robert,

care to specify which platforms? :-) I ask because I am experiencing
reboot hangs between BISO and lilo on Tyan 2462 boards. Apparently
others see similar things.

Martin
--
------------------------------------------------------------------
Martin Knoblauch | email: [email protected]
TeraPort GmbH | Phone: +49-89-510857-309
C+ITS | Fax: +49-89-510857-111
http://www.teraport.de | Mobile: +49-170-4904759

2002-04-23 20:03:10

by Robert_Hentosh

[permalink] [raw]
Subject: RE: [PATCH] reboot=bios is invalidating cache incorrectly



> -----Original Message-----
> From: Martin Knoblauch [mailto:[email protected]]
> Sent: Tuesday, April 23, 2002 8:51 AM
> Subject: Re: [PATCH] reboot=bios is invalidating cache incorrectly
>
>
> > [PATCH] reboot=bios is invalidating cache incorrectly
> >
> >
> > This patch applies cleanly to 2.4.18 and 2.5.8. It probably
> also works
> > with all 2.2.x, 2.4.x and 2.5.x kernels.
> >
> > This fixes a long standing bug that prevented reliable
> reboots on some
> > platforms.
> >
> Robert,
>
> care to specify which platforms? :-) I ask because I am experiencing
> reboot hangs between BISO and lilo on Tyan 2462 boards. Apparently
> others see similar things.
>

Martin,

I only have extensive experience with Dell's Server platforms. But because
of the nature of the error, it can happen on any platform using reboot=bios.

On the platforms I was debugging.. invalidating the cache left a bunch of
ADD's in memory followed by a JMP to itself. So you would see the message
"rebooting system" but then you would never get a blank screen and BIOS
posting (the reboot vector was never called).

- Robert

2002-04-24 18:10:22

by Jamie Lokier

[permalink] [raw]
Subject: Re: [PATCH] reboot=bios is invalidating cache incorrectly

Robert Hentosh wrote:
> The hand assembled routine contained in the array "real_mode_switch"
> contains INVD which invalidates the CPU caches, unfortunately the routine
> was just previously copied via memcpy and is contained in the cache. This
> leads to unexpected results. The following patch replaces INVD with
> WBINVD which will insure that the routine is written to RAM before
> invalidating the cache, providing more reliable reboots.

Hi Robert,

I wrote that code originally.

I wasn't sure that `wbinvd' was safe with the cache disabled, but I've
looked more closely at some Intel docs, and it seems that CD and NW
simply change the behaviour of the cache -- the cache is still used, but
it has slightly different behaviour.

There is another bug with both `wbinvd' and `invd': it is possible (on
some 486s at least) that a cache line fill is in progress when the
`wbinvd' executes, and that line won't be flushed. In particular, the
i-cache line containing the `wbinvd' instruction itself is quite likely
to be being filled at the time.

The process will access filled cache lines even with CD ("cache
disable") and NW set in CR0, yet it won't fill write back changes to
RAM. (It's a rarely used capability to allow execution from cache
without external bus traffic). This is quite a nasty state for the BIOS
to inherit.

I think that this will occur in practice, if the bug is still present on
real CPUs: we have just copied the real mode code, so the copy is dirty
in the d-cache which means that it is definitely not valid in the
i-cache. And so the cache line containing the `wbinvd' instruction may
still be filling when the instruction executes.

http://ivs.cs.uni-magdeburg.de/~zbrog/asm/86bugs.html shows how to fix
this. It's a bit fiddly, and involves a PC-relative address so the code
isn't a two-liner. So I'm not going to write that fix.

Summary to all Cc'd: please do apply Robert's patch. It would make
sense on the 2.2 stable branch too, after some user testing on 2.4/2.5
has verified that its ok for everyone.

cheers,
-- Jamie