2003-05-10 02:44:04

by Andi Kleen

Subject: [PATCH] Use correct x86 reboot vector


Extensive discussion by various experts on the [email protected]
mailing list concluded that the correct vector to restart a 286+
CPU is f000:fff0, not ffff:0000. Both seem to work on current systems,
but the first is correct.
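
A minimal sketch of why both vectors "seem to work": in real mode the linear address is the segment shifted left by four bits plus the offset, so f000:fff0 and ffff:0000 name the same byte. The helper name `real_mode_linear` is made up for this illustration and is not kernel code.

```c
#include <stdint.h>

/* Real-mode address translation: (segment << 4) + offset.
 * Hypothetical helper for illustration only. */
uint32_t real_mode_linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}
```

Both vectors resolve to linear address 0xffff0; they differ only in the CS and IP values the BIOS code sees once it starts executing.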

See the "DPMI on AMD64" and "Warm reboot for x86-64 linux" threads
on http://www.x86-64.org/mailing_lists/list?listname=discuss&listnum=0
for more details.

This patch fixes the 2.5.69 i386 reboot code to use this too.

--- linux-2.5.69/arch/i386/kernel/reboot.c-o 2003-03-28 18:32:18.000000000 +0100
+++ linux-2.5.69/arch/i386/kernel/reboot.c 2003-05-10 04:51:35.000000000 +0200
@@ -123,7 +123,7 @@
};
static unsigned char jump_to_bios [] =
{
- 0xea, 0x00, 0x00, 0xff, 0xff /* ljmp $0xffff,$0x0000 */
+ 0xea, 0xf0, 0xff, 0x00, 0xf0 /* ljmp $0xf000:0xfff0 */
};

/*
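
For readers checking the byte arrays in the patch above: opcode 0xEA in 16-bit code is a direct far jump, followed by a little-endian 16-bit offset and then a little-endian 16-bit segment. The decoder below is only an illustrative sketch; `struct far_ptr` and `decode_ljmp16` are names assumed here, not anything from reboot.c.

```c
#include <stdint.h>

/* Segment:offset pair as encoded in a 16-bit far jump. */
struct far_ptr {
    uint16_t seg;
    uint16_t off;
};

/* Decode a 5-byte "jmp far ptr16:16" (opcode 0xEA).
 * Returns 0 on success, -1 if the first byte is not 0xEA. */
int decode_ljmp16(const unsigned char insn[5], struct far_ptr *out)
{
    if (insn[0] != 0xea)
        return -1;
    /* Offset comes first, then segment, both little-endian. */
    out->off = (uint16_t)(insn[1] | (insn[2] << 8));
    out->seg = (uint16_t)(insn[3] | (insn[4] << 8));
    return 0;
}
```

Decoding the old array gives ffff:0000 and the new one gives f000:fff0, matching the comments in the diff.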


2003-05-10 03:22:53

by CaT

Subject: Re: [PATCH] Use correct x86 reboot vector

On Sat, May 10, 2003 at 04:56:34AM +0200, Andi Kleen wrote:
> Extensive discussion by various experts on the [email protected]
> mailing list concluded that the correct vector to restart an 286+
> CPU is f000:fff0, not ffff:0000. Both seem to work on current systems,
> but the first is correct.

Could this bug, by any chance, cause a system to shutdown instead of
rebooting? This is what happens to me at the moment but not each and
every time.

--
Martin's distress was in contrast to the bitter satisfaction of some
of his fellow marines as they surveyed the scene. "The Iraqis are sick
people and we are the chemotherapy," said Corporal Ryan Dupre. "I am
starting to hate this country. Wait till I get hold of a friggin' Iraqi.
No, I won't get hold of one. I'll just kill him."
- http://www.informationclearinghouse.info/article2479.htm

2003-05-10 03:45:50

by H. Peter Anvin

Subject: Re: [PATCH] Use correct x86 reboot vector

Followup to: <[email protected]>
By author: CaT <[email protected]>
In newsgroup: linux.dev.kernel
>
> On Sat, May 10, 2003 at 04:56:34AM +0200, Andi Kleen wrote:
> > Extensive discussion by various experts on the [email protected]
> > mailing list concluded that the correct vector to restart an 286+
> > CPU is f000:fff0, not ffff:0000. Both seem to work on current systems,
> > but the first is correct.
>
> Could this bug, by any chance, cause a system to shutdown instead of
> rebooting? This is what happens to me at the moment but not each and
> every time.
>

No, it wouldn't.

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64

2003-05-10 15:36:24

by Alan

Subject: Re: [PATCH] Use correct x86 reboot vector

On Sad, 2003-05-10 at 04:35, CaT wrote:
> On Sat, May 10, 2003 at 04:56:34AM +0200, Andi Kleen wrote:
> > Extensive discussion by various experts on the [email protected]
> > mailing list concluded that the correct vector to restart an 286+
> > CPU is f000:fff0, not ffff:0000. Both seem to work on current systems,
> > but the first is correct.
>
> Could this bug, by any chance, cause a system to shutdown instead of
> rebooting? This is what happens to me at the moment but not each and
> every time.

Unlikely. But try it and see 8)

At least some SMP boxes freak if you do a poweroff request on CPU != 0

2003-05-10 16:03:02

by Jamie Lokier

Subject: Re: [PATCH] Use correct x86 reboot vector

Andi Kleen wrote:
> Extensive discussion by various experts on the [email protected]
> mailing list concluded that the correct vector to restart an 286+
> CPU is f000:fff0, not ffff:0000. Both seem to work on current systems,
> but the first is correct.

You are right. That's what a 286 does when the RESET signal is asserted.

Which is amazing, because I wrote that ffff:0000 and I was reading
from the Phoenix BIOS book at the time. It was long ago but I'm
fairly sure I got that address from the book.

I just did some Googling and found that there are examples of DOS code
fragments using both vectors. Also, the original IBM BIOS (as they
say) had a long jump at the vector, which is presumably one of the
many de facto ABIs which real mode programmers grew to depend on.

> See the "DPMI on AMD64" and "Warm reboot for x86-64 linux" threads
> on http://www.x86-64.org/mailing_lists/list?listname=discuss&listnum=0
> for more details.

One would hope that AMD64 systems, being a new design and all, offer a
documented and reliable method of rebooting.

It should never be necessary to have to write "reboot=..." on the
kernel command line to choose which legacy method works on different
AMD64 motherboards. Am I too idealistic?

-- Jamie

2003-05-10 16:04:40

by Jamie Lokier

Subject: Re: [PATCH] Use correct x86 reboot vector

Alan Cox wrote:
> At least some SMP boxes freak if you do a poweroff request on CPU != 0

Power-off works on some SMP boxes?

-- Jamie

2003-05-10 16:31:15

by Jos Hulzink

Subject: Re: [PATCH] Use correct x86 reboot vector

On Saturday 10 May 2003 18:17, Jamie Lokier wrote:
> Alan Cox wrote:
> > At least some SMP boxes freak if you do a poweroff request on CPU != 0
>
> Power-off works on some SMP boxes?

With ACPI kernels, my Dual PII 333 / Intel 440 LX powers down without pressing
the button.

Jos

2003-05-10 16:56:52

by Randy.Dunlap

Subject: Re: [PATCH] Use correct x86 reboot vector

> Andi Kleen wrote:
>> Extensive discussion by various experts on the [email protected] mailing
>> list concluded that the correct vector to restart an 286+ CPU is
>> f000:fff0, not ffff:0000. Both seem to work on current systems, but the
>> first is correct.
>
> You are right. That's what a 286 does when the RESET signal is asserted.
>
> Which is amazing, because I wrote that ffff:0000 and I was reading from the
> Phoenix BIOS book at the time. It was long ago but I'm
> fairly sure I got that address from the book.
>
> I just did some Googling and found that there examples of DOS code fragments
> using both vectors. Also, the original IBM BIOS (as they say) had a long
> jump at the vector, which is presumably one of the many de facto ABIs which
> real mode programmers grew to depend on.

This seems to be a difference from 8086/8088 to the 286.
My iAPX 286 Hardware Reference Manual says that the RESET signal initializes
CS to 0FF0000H and IP to 0FFF0H, while my iAPX 86,88 User's Manual says
that RESET sets CS to 0FFFFh and IP to 0.

~Randy



2003-05-10 17:25:34

by Jos Hulzink

Subject: Re: [PATCH] Use correct x86 reboot vector

On Saturday 10 May 2003 18:15, Jamie Lokier wrote:
> I just did some Googling and found that there examples of DOS code
> fragments using both vectors. Also, the original IBM BIOS (as they
> say) had a long jump at the vector, which is presumably one of the
> many de facto ABIs which real mode programmers grew to depend on.

The 16-byte code space is very small, and usually contains only that long jump
to a usable address space.

When the vector f000:fff0 is used, we can survive BIOSes that use relative
jumps with negative offsets or indirect short jumps instead.

When the vector ffff:0000 is used, the code segment effectively contains only
16 bytes (unless someone abuses the 8086 wraparound), so I can't think of
negative-offset short jumps working there. As the code is read-only at this
early stage (BIOS code becomes writable only after the BIOS has copied itself
to RAM), self-modifying code (which uses absolute addressing) can be excluded too.
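
The point about negative-offset short jumps can be sketched numerically. The helper below is a hypothetical model, not BIOS or kernel code: it computes where a relative jump lands when IP wraps modulo 64 KiB inside the code segment, with `addr_mask` standing in for the number of address lines that are live (0xfffff models the 8086 wraparound or a gated A20).

```c
#include <stdint.h>

/* Linear address reached by a relative jump.
 * cs:        code segment at the time of the jump
 * ip_after:  IP of the instruction following the jump
 * rel8:      signed 8-bit displacement
 * addr_mask: models the address bus width (assumption of this sketch) */
uint32_t short_jump_target(uint16_t cs, uint16_t ip_after, int8_t rel8,
                           uint32_t addr_mask)
{
    /* IP arithmetic wraps within the 64 KiB segment. */
    uint16_t ip = (uint16_t)(ip_after + rel8);
    return (((uint32_t)cs << 4) + ip) & addr_mask;
}
```

For a two-byte jump with displacement -0x40 placed at the reset vector, entering via f000:fff0 lands at 0xfffb2, still inside the BIOS segment. Entering via ffff:0000 lands at 0x10ffb2 with A20 enabled, or at 0x0ffb2 with only 20 address lines, and neither is in the BIOS.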

Okay... now, as 386 and newer cpus need a far jump to unlock A20-A31, I think
it is safe to assume all BIOSes will do a far jump as soon as possible, which
means it doesn't matter which vector is used.

For the sake of badly behaved BIOSes, however, I'd vote for the f000:fff0
vector, unless someone can hand me a paper that says it is wrong.

Jos

2003-05-10 17:58:20

by Jamie Lokier

Subject: Re: [PATCH] Use correct x86 reboot vector

Jos Hulzink wrote:
> For the sake of bad behaving BIOSes however, I'd vote for the f000:fff0
> vector, unless someone can hand me a paper that says it is wrong.

I agree, for the simple reason that it is what the chip does on a
hardware reset signal.

-- Jamie

2003-05-10 18:39:31

by Jos Hulzink

Subject: Re: [PATCH] Use correct x86 reboot vector

On Saturday 10 May 2003 20:10, Jamie Lokier wrote:
> Jos Hulzink wrote:
> > For the sake of bad behaving BIOSes however, I'd vote for the f000:fff0
> > vector, unless someone can hand me a paper that says it is wrong.
>
> I agree, for the simple reason that it is what the chip does on a
> hardware reset signal.

Hmzz... this does indeed seem to be true for the 386; that's the only doc I
have at hand here. I'm willing to believe that this is the hardware behaviour
of all 386 and newer 32-bit processors.

If this really fixes some issues, I'm eager to see that BIOS code....

Jos

2003-05-11 03:39:48

by Linus Torvalds

Subject: Re: [PATCH] Use correct x86 reboot vector


On Sat, 10 May 2003, Jamie Lokier wrote:
> Jos Hulzink wrote:
> > For the sake of bad behaving BIOSes however, I'd vote for the f000:fff0
> > vector, unless someone can hand me a paper that says it is wrong.
>
> I agree, for the simple reason that it is what the chip does on a
> hardware reset signal.

Hmm.. Doesn't a _real_ hardware reset actually use a magic segment that
isn't even really true real mode? I have this memory that the reset value
for an i386 has CS=0xf000, but the shadow base register actually contains
0xffff0000. In other words, the CPU actually starts up in "unreal" mode,
and will fetch the first instruction from physical address 0xfffffff0.

At least that was true on an original 386. It's something that could
easily have changed since.

In other words, you're all wrong. Nyaah, nyaah.

Linus

2003-05-11 07:20:55

by Jos Hulzink

Subject: Re: [PATCH] Use correct x86 reboot vector

On Sunday 11 May 2003 05:50, Linus Torvalds wrote:
> On Sat, 10 May 2003, Jamie Lokier wrote:
> > Jos Hulzink wrote:
> > > For the sake of bad behaving BIOSes however, I'd vote for the f000:fff0
> > > vector, unless someone can hand me a paper that says it is wrong.
> >
> > I agree, for the simple reason that it is what the chip does on a
> > hardware reset signal.
>
> Hmm.. Doesnt' a _real_ hardware reset actually use a magic segment that
> isn't even really true real mode? I have this memory that the reset value
> for a i386 has CS=0xf000, but the shadow base register actually contains
> 0xffff0000. In other words, the CPU actually starts up in "unreal" mode,
> and will fetch the first instruction from physical address 0xfffffff0.
>
> At least that was true on an original 386. It's something that could
> easily have changed since.
>
> In other words, you're all wrong. Nyaah, nyaah.
>
> Linus

Source: 80386 Programmers Reference Manual, Intel (1986)

EIP is set 0000FFF0H
CS is set F000H

After RESET, lines A31-A20 are FORCED high till a far JMP is done.
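
A small sketch showing that the manual's description and the "shadow base 0xffff0000" description give the same first fetch address. The function name `i386_reset_fetch` is an assumption for illustration; it models the manual's wording (real-mode address with A31-A20 forced high) and nothing more.

```c
#include <stdint.h>

/* First fetch address after RESET on a 386, modelled as the
 * real-mode address (CS << 4) + EIP with address lines A31-A20
 * forced high. Illustrative model only. */
uint32_t i386_reset_fetch(uint16_t cs, uint32_t eip)
{
    uint32_t real_mode = ((uint32_t)cs << 4) + eip;
    return real_mode | 0xfff00000u;
}
```

With CS = F000H and EIP = 0000FFF0H this yields 0xfffffff0, the same address as a hidden CS base of 0xffff0000 plus the offset 0xfff0.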

So, unfortunately we have to say Linus is right once again. Damn ;-) My
conclusion is that we are unable to use the CPU reset as the reference for
warm boots, for we can't control A31-A20 in real mode. But as far as I can
see, my arguments still hold...

Jos

2003-05-11 13:49:10

by Jamie Lokier

Subject: Re: [PATCH] Use correct x86 reboot vector

Jos Hulzink wrote:
> On Sunday 11 May 2003 05:50, Linus Torvalds wrote:
> > Hmm.. Doesnt' a _real_ hardware reset actually use a magic segment that
> > isn't even really true real mode? I have this memory that the reset value
> > for a i386 has CS=0xf000, but the shadow base register actually contains
> > 0xffff0000. In other words, the CPU actually starts up in "unreal" mode,
> > and will fetch the first instruction from physical address 0xfffffff0.
> >
> > At least that was true on an original 386. It's something that could
> > easily have changed since.

I got my info from an article on the net which says that a 386 does
behave as you say, but it is possible for the system designer to
arrange that it boots into the 286-compatible vector at physical
address 0x000ffff0. It states that the feature is specifically so
that system designers don't have to create a "memory hole" (that's as
much detail as it gives).

I can't be arsed to look in a real 386 manual though :)

> Source: 80386 Programmers Reference Manual, Intel (1986)
>
> EIP is set 0000FFF0H
> CS is set F000H
>
> After RESET, lines A31-A20 are FORCED high till a far JMP is done.
>
> So, unfortunately we have to say Linus is right once again. Damn ;-) My
> conclusion is that we are unable to use the CPU reset as the reference for
> warm boots, for we can't control A312-A20 in real mode. But as far as I can
> see, my arguments still hold...

You can set up unreal mode but it is quite fiddly.

-- Jamie

2003-05-11 17:24:14

by Davide Libenzi

Subject: Re: [PATCH] Use correct x86 reboot vector

On Sun, 11 May 2003, Jamie Lokier wrote:

> Jos Hulzink wrote:
> > On Sunday 11 May 2003 05:50, Linus Torvalds wrote:
> > > Hmm.. Doesnt' a _real_ hardware reset actually use a magic segment that
> > > isn't even really true real mode? I have this memory that the reset value
> > > for a i386 has CS=0xf000, but the shadow base register actually contains
> > > 0xffff0000. In other words, the CPU actually starts up in "unreal" mode,
> > > and will fetch the first instruction from physical address 0xfffffff0.
> > >
> > > At least that was true on an original 386. It's something that could
> > > easily have changed since.
>
> I got my info from an article on the net which says that a 386 does
> behave as you say, but it is possible for the system designer to
> arrange that it boots into the 286-compatible vector at physical
> address 0x000ffff0. It states that the feature is specifically so
> that system designers don't have to create a "memory hole" (that's as
> much detail as it gives).

Guys, mem[0xfffffff0,...] == mem[0x000ffff0,...] since the hw remaps the
bios. Being picky about Intel specs, it should be f000:fff0 though.



- Davide

2003-05-11 17:47:23

by Eric W. Biederman

Subject: Re: [PATCH] Use correct x86 reboot vector

Davide Libenzi <[email protected]> writes:

> On Sun, 11 May 2003, Jamie Lokier wrote:
>
> > Jos Hulzink wrote:
> > > On Sunday 11 May 2003 05:50, Linus Torvalds wrote:
> > > > Hmm.. Doesnt' a _real_ hardware reset actually use a magic segment that
> > > > isn't even really true real mode? I have this memory that the reset value
> > > > for a i386 has CS=0xf000, but the shadow base register actually contains
> > > > 0xffff0000. In other words, the CPU actually starts up in "unreal" mode,
> > > > and will fetch the first instruction from physical address 0xfffffff0.
> > > >
> > > > At least that was true on an original 386. It's something that could
> > > > easily have changed since.
> >
> > I got my info from an article on the net which says that a 386 does
> > behave as you say, but it is possible for the system designer to
> > arrange that it boots into the 286-compatible vector at physical
> > address 0x000ffff0. It states that the feature is specifically so
> > that system designers don't have to create a "memory hole" (that's as
> > much detail as it gives).
>
> Guys, mem[0xfffffff0,...] == mem[0x000ffff0,...] since the hw remaps the
> bios. Being picky about Intel specs, it should be f000:fff0 though.

The remapping is quite common, but usually after bootup 0xf0000-0xfffff is
shadowed RAM, while 0xffff0000-0xffffffff still points to the ROM chip.

Now if someone could tell me how to do a jump to 0xffff0000:0xfff0 in real
mode I would find that very interesting.

Eric

2003-05-11 17:44:57

by Eric W. Biederman

Subject: Re: [PATCH] Use correct x86 reboot vector

Linus Torvalds <[email protected]> writes:

> On Sat, 10 May 2003, Jamie Lokier wrote:
> > Jos Hulzink wrote:
> > > For the sake of bad behaving BIOSes however, I'd vote for the f000:fff0
> > > vector, unless someone can hand me a paper that says it is wrong.
> >
> > I agree, for the simple reason that it is what the chip does on a
> > hardware reset signal.
>
> Hmm.. Doesnt' a _real_ hardware reset actually use a magic segment that
> isn't even really true real mode? I have this memory that the reset value
> for a i386 has CS=0xf000, but the shadow base register actually contains
> 0xffff0000. In other words, the CPU actually starts up in "unreal" mode,
> and will fetch the first instruction from physical address 0xfffffff0.
>
> At least that was true on an original 386. It's something that could
> easily have changed since.

Correct. And no one has changed it since. I use that all of the time.

> In other words, you're all wrong. Nyaah, nyaah.

However 0xf000:fff0 is as close as you can get, and since that is usually
RAM it can do something different on a reset vs a reboot if it wants
to.

Using 0xf000 is just polite as it allows relative jumps.

Eric

2003-05-11 17:52:40

by Eric W. Biederman

Subject: Re: [PATCH] Use correct x86 reboot vector

Alan Cox <[email protected]> writes:

> On Sad, 2003-05-10 at 04:35, CaT wrote:
> > On Sat, May 10, 2003 at 04:56:34AM +0200, Andi Kleen wrote:
> > > Extensive discussion by various experts on the [email protected]
> > > mailing list concluded that the correct vector to restart an 286+
> > > CPU is f000:fff0, not ffff:0000. Both seem to work on current systems,
> > > but the first is correct.
> >
> > Could this bug, by any chance, cause a system to shutdown instead of
> > rebooting? This is what happens to me at the moment but not each and
> > every time.
>
> Unlikely. But try it and see 8)
>
> At least some SMP boxes freak if you do a poweroff request on CPU != 0

As per the MP spec. The system should reboot on the bootstrap cpu.
smp_processor_id() == 0 on x86. apicid??

I have a patch for this as part of my kexec stuff, as the kernel freaks
when it doesn't start up on the bootstrap cpu as well. I am busily
cleaning it up so it works in interrupt context as well.

Alan if you want it holler and I can send it to you as well.

Eric

2003-05-11 18:10:55

by Alan

Subject: Re: [PATCH] Use correct x86 reboot vector

On Sul, 2003-05-11 at 19:01, Eric W. Biederman wrote:
> > At least some SMP boxes freak if you do a poweroff request on CPU != 0
>
> As per the MP spec. The system should reboot on the bootstrap cpu.
> smp_processor_id() == 0 on x86. apicid??

APM now makes its calls on CPU#0 which was the trigger for these
problems

2003-05-11 18:09:13

by Davide Libenzi

Subject: Re: [PATCH] Use correct x86 reboot vector

On Sun, 11 May 2003, Eric W. Biederman wrote:

> The remapping is quite common but it usually happens that after bootup:
> 0xf0000-0xfffff is shadowed RAM. While 0xffff0000-0xffffffff still points
> to the rom chip.
>
> Now if someone could tell me how to do a jump to 0xffff0000:0xfff0 in real
> mode I would find that very interesting.

Have you ever heard about unreal mode? But I do not think that a reset
has to start over there. I do not think there is any hw/sw that expects the
reset address to be 0xfffffff0 instead of 0x000ffff0, since they map the
same content.



- Davide

2003-05-11 18:23:56

by Christer Weinigel

Subject: Re: [PATCH] Use correct x86 reboot vector

[email protected] (Eric W. Biederman) writes:

> Davide Libenzi <[email protected]> writes:
> Now if someone could tell me how to do a jump to 0xffff0000:0xfff0 in real
> mode I would find that very interesting.

Well, it should be possible to use a trick similar to the BIG REAL or
UNREAL mode. Just load CS with a segment that has a base of
0xffff0000 in protected mode and then jump back to real mode.
Something like this, completely untested of course, should do it:

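.align 4
reset_gdt:
/* the unused null descriptor slot doubles as the GDT pointer */
.word reset_gdt_end - reset_gdt - 1
.long reset_gdt
.word 0

/* 16 bit code segment starting at 0xffff0000 */
.word 0xffff, 0x0000
.byte 0xff, 0x9b, 0x00, 0xff

reset_gdt_end:

lgdt %cs:reset_gdt
/* ROM_CODE_SEG would be the selector of the entry above (0x08) */
ljmp $ROM_CODE_SEG, $0xfff0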
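
To check the descriptor bytes in the snippet above, here is a sketch that packs a legacy 8-byte segment descriptor from its fields. The function name `gdt_encode` and its parameter layout are assumptions made for this example; the byte layout itself is the standard x86 descriptor format.

```c
#include <stdint.h>

/* Pack an 8-byte x86 segment descriptor.
 * base:   32-bit segment base
 * limit:  20-bit segment limit
 * access: access byte (0x9b = present, code, readable, accessed)
 * flags:  upper nibble of byte 6 (0 = 16-bit, byte granularity) */
void gdt_encode(uint8_t out[8], uint32_t base, uint32_t limit,
                uint8_t access, uint8_t flags)
{
    out[0] = (uint8_t)(limit & 0xff);          /* limit 7:0   */
    out[1] = (uint8_t)((limit >> 8) & 0xff);   /* limit 15:8  */
    out[2] = (uint8_t)(base & 0xff);           /* base 7:0    */
    out[3] = (uint8_t)((base >> 8) & 0xff);    /* base 15:8   */
    out[4] = (uint8_t)((base >> 16) & 0xff);   /* base 23:16  */
    out[5] = access;
    out[6] = (uint8_t)(((flags & 0x0f) << 4) | ((limit >> 16) & 0x0f));
    out[7] = (uint8_t)((base >> 24) & 0xff);   /* base 31:24  */
}
```

Encoding base 0xffff0000, limit 0xffff, access 0x9b and flags 0 reproduces the bytes ff ff 00 00 ff 9b 00 ff used in the table.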

/Christer

--
"Just how much can I get away with and still go to heaven?"

Freelance consultant specializing in device driver programming for Linux
Christer Weinigel <[email protected]> http://www.weinigel.se

2003-05-11 18:27:04

by Linus Torvalds

Subject: Re: [PATCH] Use correct x86 reboot vector


On 11 May 2003, Eric W. Biederman wrote:
>
> Now if someone could tell me how to do a jump to 0xffff0000:0xfff0 in real
> mode I would find that very interesting.

You should be able to do it the same way as you enter unreal mode, ie:

- in protected mode cpl0, create a segment that has index 0xf000 (ie you
need a large GDT for this to work), and has the right attributes (ie
base 0xffff0000, 16-bit, etc).

Make sure you reload the other segments with something sanish and be
16-bit clean.

- clear the PE bit, but do _not_ do the long jump to reload the segment
that intel says you should do - just do a short jump to 0xfff0.

One problem is that the code segment you create this way will have the
right base and size, but it will be non-writeable (no way to create a
writable code segment in protected mode), so it will be different in other
ways.

And because you'll have to do some of the setup with that new and
inconvenient CS, you'll either have to make the limit be big (and wrap
around EIP in order to first execute code that is in low memory), or
you'll have to play even more tricks and clear both PE and PG at the same
time and just "fall through" to the code at 0xfffffff0.
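
The "large GDT" remark can be made concrete with a rough sketch, assuming the goal is for the visible CS selector value to stay 0xf000. A selector holds the table index in bits 15:3, the table indicator in bit 2 and the RPL in bits 1:0. The names `selector_decode` and `gdt_bytes_needed` are invented for this example.

```c
#include <stdint.h>

/* Fields of a protected-mode segment selector. */
struct selector_fields {
    uint16_t index; /* descriptor table index (bits 15:3) */
    uint8_t ti;     /* 0 = GDT, 1 = LDT (bit 2)           */
    uint8_t rpl;    /* requested privilege level (1:0)    */
};

struct selector_fields selector_decode(uint16_t sel)
{
    struct selector_fields f;
    f.index = (uint16_t)(sel >> 3);
    f.ti = (uint8_t)((sel >> 2) & 1);
    f.rpl = (uint8_t)(sel & 3);
    return f;
}

/* Smallest GDT size, in bytes, that still contains the
 * 8-byte descriptor this selector refers to. */
uint32_t gdt_bytes_needed(uint16_t sel)
{
    return ((uint32_t)(sel >> 3) + 1) * 8;
}
```

Selector 0xf000 decodes to GDT index 0x1e00 (7680) with RPL 0, so the table has to span at least 0xf008 bytes, close to the 64 KiB maximum.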

Sounds like it might work, at least on a few CPU's.

Linus

2003-05-11 18:48:41

by Matt Mackall

Subject: Re: [PATCH] Use correct x86 reboot vector

On Sun, May 11, 2003 at 11:38:34AM -0700, Linus Torvalds wrote:
>
> On 11 May 2003, Eric W. Biederman wrote:
> >
> > Now if someone could tell me how to do a jump to 0xffff0000:0xfff0 in real
> > mode I would find that very interesting.
>
> You should be able to do it the same way as you enter unreal mode, ie:
>
> - in protected mode cpl0, crate a segment that has index 0xf000 (ie you
> need a large GDT for this to work), and has the right attributes (ie
> base 0xffff0000, 16-bit, etc).
>
> Make sure you reload the other segments with something sanish and be
> 16-bit clean.
>
> - clear the PE bit, but do _not_ do the long jump to reload the segment
> that intel says you should do - just do a short jump to 0xfff0.
>
> One problem is that the code segment you create this way will have the
> right base and size, but it will be non-writeable (no way to create a
> writable code segment in protected mode), so it will be different in other
> ways.
>
> And because you'll have to do some of the the setup with that new and
> inconvenient CS, you'll either have to make the limit be big (and wrap
> around EIP in order to first execute code that is in low memory), or
> you'll have to play even more tricks and clear both PE and PG at the same
> time and just "fall through" to the code at 0xfffffff0.
>
> Sounds like it might work, at least on a few CPU's.

There's a missing piece of behavior here that's probably fatal.
Namely, the next time the CS descriptor is loaded, even with the same
value, the high bits are lost. So, for example, if you're running BIOS
out of ROM, decompressing it into the top of 20-bit address space,
then long jumping to your uncompressed code, you don't want to find
yourself back in ROM.

Perhaps there's a trick that can be played with loading the descriptor
into CS and then clearing the descriptor table without flushing, but it
sounds rather dubious..

--
Matt Mackall : http://www.selenic.com : of or relating to the moon

2003-05-11 18:55:36

by Eric W. Biederman

Subject: Re: [PATCH] Use correct x86 reboot vector

Alan Cox <[email protected]> writes:

> On Sul, 2003-05-11 at 19:01, Eric W. Biederman wrote:
> > > At least some SMP boxes freak if you do a poweroff request on CPU != 0
> >
> > As per the MP spec. The system should reboot on the bootstrap cpu.
> > smp_processor_id() == 0 on x86. apicid??
>
> APM now makes its calls on CPU#0 which was the trigger for these
> problems

I have a couple of issues with the current state of affairs.
1) We should always do this to be safe.
2) Reboot has this issue as well.
3) The way APM does it overrides the kernel command line option,
and apm_power_off forces the cpu twice.
4) We have this implemented in 3 different ways in 3 different places.
5) machine_reboot needs to do this in interrupt context for
SysRq-B and certain cases of panic, and that is not currently handled.

On a related note do you know why machine_halt and machine_power_off return?
After shutting everything down that seems very much like the wrong thing
to do.

Eric

2003-05-11 19:00:50

by Eric W. Biederman

Subject: Re: [PATCH] Use correct x86 reboot vector

Linus Torvalds <[email protected]> writes:

> On 11 May 2003, Eric W. Biederman wrote:
> >
> > Now if someone could tell me how to do a jump to 0xffff0000:0xfff0 in real
> > mode I would find that very interesting.
>
> You should be able to do it the same way as you enter unreal mode, ie:
>
> - in protected mode cpl0, crate a segment that has index 0xf000 (ie you
> need a large GDT for this to work), and has the right attributes (ie
> base 0xffff0000, 16-bit, etc).
>
> Make sure you reload the other segments with something sanish and be
> 16-bit clean.
>
> - clear the PE bit, but do _not_ do the long jump to reload the segment
> that intel says you should do - just do a short jump to 0xfff0.
>
> One problem is that the code segment you create this way will have the
> right base and size, but it will be non-writeable (no way to create a
> writable code segment in protected mode), so it will be different in other
> ways.

I suspect the fact it is unwritable won't be a real problem. ROM chips
are essentially unwritable. But there will be the behavioral difference
between discarding writes and causing an exception.

> And because you'll have to do some of the the setup with that new and
> inconvenient CS, you'll either have to make the limit be big (and wrap
> around EIP in order to first execute code that is in low memory), or
> you'll have to play even more tricks and clear both PE and PG at the same
> time and just "fall through" to the code at 0xfffffff0.
>
> Sounds like it might work, at least on a few CPU's.

I will have to try it one of these times. I keep wondering if I can call
a BIOS without triggering a reset line.

At the same time I think this is very much not what we want in the reboot
path, as I suspect the existing BIOS, if it does anything different, will
freak out on us. Jumping to a known location where there is RAM sounds much
safer.

Eric

2003-05-11 19:07:11

by Eric W. Biederman

Subject: Re: [PATCH] Use correct x86 reboot vector

Matt Mackall <[email protected]> writes:

> There's a missing piece of behavior here that's probably fatal.
> Namely, the next time the CS descriptor is loaded, even with the same
> value, the high bits are lost. So, for example, if you're running BIOS
> out of ROM, decompressing it into the top of 20-bit address space,
> then long jumping to your uncompressed code, you don't want to find
> yourself back in ROM.
>
> Perhaps there's a trick that can be played with loading the descriptor
> into CS and then clearing the descriptor table without flushing, but it
> sounds rather dubious..

If PE is really disabled, that bit should come for free. And it
is why it is so hard to fake this behavior.

Eric

2003-05-11 19:03:26

by Eric W. Biederman

Subject: Re: [PATCH] Use correct x86 reboot vector

Davide Libenzi <[email protected]> writes:

> On Sun, 11 May 2003, Eric W. Biederman wrote:
>
> > The remapping is quite common but it usually happens that after bootup:
> > 0xf0000-0xfffff is shadowed RAM. While 0xffff0000-0xffffffff still points
> > to the rom chip.
> >
> > Now if someone could tell me how to do a jump to 0xffff0000:0xfff0 in real
> > mode I would find that very interesting.
>
> Have you ever heard about unreal mode ? But I do not think that a reset
> has to start over there. I do not think that exist hw/sw that expect that
> reset address to be 0xfffffff0 instead of 0x000ffff0, since they map the
> same content.

There is some software at least that knows the difference. I have seen short
jumps in a couple of BIOSes. But a reset is very different from a
reboot, as memory must be reinitialized, etc. So I think going to
0xffff0000:0xfff0 would be a very bad idea if the intent is to get a
reliable reboot.


Eric

2003-05-11 20:03:25

by wingel

Subject: Re: [PATCH] Use correct x86 reboot vector

[following up to myself]

Christer Weinigel <[email protected]> writes:

> Well, it should be possible to use a trick similar to the BIG REAL or
> UNREAL mode. Just load CS with a segment that has a base of
> 0xffff0000 in protected mode and then jump back to real mode.
> Something like this, completely untested of course, should do it:
>
> .align 4
> reset_gdt:
> .word reset_gdt_end - reset_gdt -1
> .long reset_gdt
> .word 0
>
> /* 16 bit code segment starting at 0xffff0000 */
> .word 0xffff, 0x0000
> .byte 0xff, 0x9b, 0x00, 0xff

better add the following too:

movl %cr0, %eax
andl $~1, %eax
movl %eax, %cr0

> reset_gdt_end:
>
> lgdt %cs:reset_gdt
> ljmp $ROM_CODE_SEG, 0xfff0
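
The CR0 manipulation discussed in this thread amounts to plain bit masking. The sketch below only computes the new register value in C and does not touch the real register; the macro and function names are assumptions for illustration. PE is bit 0 and PG is bit 31 of CR0.

```c
#include <stdint.h>

#define CR0_PE 0x00000001u /* protection enable (bit 0) */
#define CR0_PG 0x80000000u /* paging enable (bit 31)    */

/* Value to write back to CR0 to drop to real mode (PE only). */
uint32_t cr0_clear_pe(uint32_t cr0)
{
    return cr0 & ~CR0_PE;
}

/* Variant that clears PE and PG in a single write. */
uint32_t cr0_clear_pe_pg(uint32_t cr0)
{
    return cr0 & ~(CR0_PE | CR0_PG);
}
```

The first function corresponds to the `$~1` mask in the assembly above; the second corresponds to the suggestion elsewhere in the thread to clear PE and PG at the same time.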

BTW, what does Windows do here? Whatever Windows is using should work
with Linux too.

/Christer

--
"Just how much can I get away with and still go to heaven?"

Freelance consultant specializing in device driver programming for Linux
Christer Weinigel <[email protected]> http://www.weinigel.se

2003-05-11 20:11:54

by Davide Libenzi

Subject: Re: [PATCH] Use correct x86 reboot vector

On Sun, 11 May 2003 [email protected] wrote:

> BTW, what does Windows do here?

I just happen to have the sources under my nose, let me look it up ... :)



- Davide

2003-05-12 00:55:26

by H. Peter Anvin

Subject: Re: [PATCH] Use correct x86 reboot vector

Followup to: <[email protected]>
By author: Matt Mackall <[email protected]>
In newsgroup: linux.dev.kernel
>
> There's a missing piece of behavior here that's probably fatal.
> Namely, the next time the CS descriptor is loaded, even with the same
> value, the high bits are lost. So, for example, if you're running BIOS
> out of ROM, decompressing it into the top of 20-bit address space,
> then long jumping to your uncompressed code, you don't want to find
> yourself back in ROM.
>

Nope, that's *exactly* the desired behaviour.

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64

2003-05-12 05:39:42

by Eric W. Biederman

Subject: [PATCH] always shutdown on the bootstrap processor

Alan Cox <[email protected]> writes:

> On Sul, 2003-05-11 at 19:01, Eric W. Biederman wrote:
> > > At least some SMP boxes freak if you do a poweroff request on CPU != 0
> >
> > As per the MP spec. The system should reboot on the bootstrap cpu.
> > smp_processor_id() == 0 on x86. apicid??
>
> APM now makes its calls on CPU#0 which was the trigger for these
> problems

We have the APM case, we have the reboot case, we have the specifications
that say you should use the bootstrap processor, and I have problems
with kexec if we do anything different.

And looking at other architectures this problem also happens on the alpha,
and I don't know how many others.

reboot/shutdown/halt/kexec are all slow paths, so is there
any good reason not to unify all of these and always switch back to the
bootstrap processor on shutdown, just to avoid these kinds of problems?

Things I have taken into consideration:
1) Running multiple processors on x86 is intricately tied to apics so
they both need to be shut down together. And apic shutdown needs to
happen last because devices potentially need interrupts (at least
timer interrupts for timeouts) to shut down properly. So I cannot
use the device model to perform the shutdown.

2) The reboot case needs to be callable from an irq handler.

3) We may have offlined the bootstrap cpu for some reason.

4) The user may have specified an override cpu to shut down on.
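
As a rough illustration of items 3 and 4 (a standalone sketch, not code
from the patch), the choice of shutdown cpu can be modelled as below. The
cpu_state struct, pick_shutdown_cpu(), and the NR_CPUS value of 32 are
hypothetical names made up for this example; the patch itself checks
cpu_possible() and cpu_online() against the kernel's own cpu maps.

```c
#include <stdbool.h>

#define NR_CPUS 32  /* illustrative limit, not taken from a kernel config */

/* Hypothetical stand-in for the kernel's view of the cpus. */
struct cpu_state {
    unsigned long online_mask;  /* bit n set => logical cpu n is online */
    int current_cpu;            /* cpu the caller is running on */
};

static bool cpu_is_online(const struct cpu_state *s, int cpu)
{
    return cpu >= 0 && cpu < NR_CPUS && ((s->online_mask >> cpu) & 1UL);
}

/*
 * Pick the cpu that performs the final shutdown step.
 * override == -1 means no "reboot=s<N>" override was given.
 */
int pick_shutdown_cpu(const struct cpu_state *s, int override)
{
    int cpu = 0;                    /* the boot cpu is logical cpu 0 */

    if (override >= 0 && override < NR_CPUS)
        cpu = override;             /* item 4: user-specified override */

    if (!cpu_is_online(s, cpu))
        cpu = s->current_cpu;       /* item 3: chosen cpu is offline */

    return cpu;
}
```

With cpus 0-3 online and no override this returns 0; if cpu 0 has been
offlined it falls back to whichever cpu the caller is running on.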

With a net of 2 additional lines of code I am able to do what
the current code does today, in one central location, and with much
more consistency.

For the author the code compiles and runs, and looks obviously
correct. So obviously it is perfect and ready to go in the kernel :)

This patch also makes machine_restart, machine_halt, and
machine_power_off all noreturn functions, so the core kernel does
not have to figure out how to deal with a kernel that is half
shut down. Turning off a cpu and then calling flush_tlbs can be
entertaining.

Eric

arch/i386/kernel/apic.c | 60 +++++++++++++++++++++
arch/i386/kernel/apm.c | 9 ---
arch/i386/kernel/dmi_scan.c | 27 ---------
arch/i386/kernel/io_apic.c | 2
arch/i386/kernel/reboot.c | 78 ++++++++++++----------------
arch/i386/kernel/smp.c | 26 ---------
include/asm-i386/apic.h | 13 ++++
include/asm-i386/mach-default/mach_reboot.h | 2
include/linux/reboot.h | 7 +-
kernel/panic.c | 2
kernel/sys.c | 4 -
11 files changed, 116 insertions, 114 deletions


diff -uNr linux-2.5.69/arch/i386/kernel/apic.c linux-2.5.69.reboot_on_bsp/arch/i386/kernel/apic.c
--- linux-2.5.69/arch/i386/kernel/apic.c Sun May 11 09:08:57 2003
+++ linux-2.5.69.reboot_on_bsp/arch/i386/kernel/apic.c Sun May 11 14:38:36 2003
@@ -25,6 +25,7 @@
#include <linux/interrupt.h>
#include <linux/mc146818rtc.h>
#include <linux/kernel_stat.h>
+#include <linux/reboot.h>

#include <asm/atomic.h>
#include <asm/smp.h>
@@ -37,6 +38,7 @@
#include <mach_apic.h>

#include "io_ports.h"
+#include "mach_reboot.h"

void __init apic_intr_init(void)
{
@@ -1114,6 +1116,64 @@
printk (KERN_INFO "APIC error on CPU%d: %02lx(%02lx)\n",
smp_processor_id(), v , v1);
irq_exit();
+}
+
+
+struct stop_apics {
+ NORET_TYPE void (*rest)(void *info) ATTRIB_NORET;
+ void *info;
+ int reboot_cpu_id;
+};
+
+static void cpu_stop_apics(void *ptr)
+{
+ struct stop_apics *arg = ptr;
+ if (smp_processor_id() != arg->reboot_cpu_id) {
+ local_irq_disable();
+ disable_local_APIC();
+ stop_this_cpu();
+ }
+ local_irq_disable();
+ disable_local_APIC();
+ local_irq_enable();
+
+#if defined(CONFIG_X86_IO_APIC)
+ if (smp_found_config) {
+ disable_IO_APIC();
+ }
+#endif
+ disconnect_bsp_APIC();
+ arg->rest(arg->info);
+}
+
+void stop_apics(NORET_TYPE void(*rest)(void *)ATTRIB_NORET, void *info)
+{
+ /* By resetting the APIC's we disable the nmi watchdog */
+ extern int reboot_cpu;
+ struct stop_apics arg;
+
+ /* The boot cpu is always logical cpu 0 */
+ arg.rest = rest;
+ arg.info = info;
+ arg.reboot_cpu_id = 0;
+
+ /* See if a command line override has been given.
+ */
+ if ((reboot_cpu != -1) && cpu_possible(reboot_cpu)) {
+ arg.reboot_cpu_id = reboot_cpu;
+ }
+
+ /* Make certain the cpu I'm rebooting on is online */
+ if (!cpu_online(arg.reboot_cpu_id)) {
+ arg.reboot_cpu_id = smp_processor_id();
+ }
+ /* If we aren't in interrupt context use the scheduler,
+ * so rest will not be called in an interrupt context either.
+ */
+ if (!in_interrupt()) {
+ set_cpus_allowed(current, 1 << arg.reboot_cpu_id);
+ }
+ on_each_cpu(cpu_stop_apics, &arg, 1, 0);
}

/*
diff -uNr linux-2.5.69/arch/i386/kernel/apm.c linux-2.5.69.reboot_on_bsp/arch/i386/kernel/apm.c
--- linux-2.5.69/arch/i386/kernel/apm.c Sun May 11 09:09:25 2003
+++ linux-2.5.69.reboot_on_bsp/arch/i386/kernel/apm.c Sun May 11 14:39:11 2003
@@ -911,17 +911,8 @@
/*
* This may be called on an SMP machine.
*/
-#ifdef CONFIG_SMP
- /* Some bioses don't like being called from CPU != 0 */
- if (smp_processor_id() != 0) {
- set_cpus_allowed(current, 1 << 0);
- if (unlikely(smp_processor_id() != 0))
- BUG();
- }
-#endif
if (apm_info.realmode_power_off)
{
- (void)apm_save_cpus();
machine_real_restart(po_bios_call, sizeof(po_bios_call));
}
else
diff -uNr linux-2.5.69/arch/i386/kernel/dmi_scan.c linux-2.5.69.reboot_on_bsp/arch/i386/kernel/dmi_scan.c
--- linux-2.5.69/arch/i386/kernel/dmi_scan.c Sun May 11 09:08:57 2003
+++ linux-2.5.69.reboot_on_bsp/arch/i386/kernel/dmi_scan.c Sun May 11 14:40:10 2003
@@ -220,31 +220,6 @@
return 0;
}

-/*
- * Some machines require the "reboot=s" commandline option, this quirk makes that automatic.
- */
-static __init int set_smp_reboot(struct dmi_blacklist *d)
-{
-#ifdef CONFIG_SMP
- extern int reboot_smp;
- if (reboot_smp == 0)
- {
- reboot_smp = 1;
- printk(KERN_INFO "%s series board detected. Selecting SMP-method for reboots.\n", d->ident);
- }
-#endif
- return 0;
-}
-
-/*
- * Some machines require the "reboot=b,s" commandline option, this quirk makes that automatic.
- */
-static __init int set_smp_bios_reboot(struct dmi_blacklist *d)
-{
- set_smp_reboot(d);
- set_bios_reboot(d);
- return 0;
-}

/*
* Some bioses have a broken protected mode poweroff and need to use realmode
@@ -554,7 +529,7 @@
MATCH(DMI_BIOS_VERSION, "4.60 PGMA"),
MATCH(DMI_BIOS_DATE, "134526184"), NO_MATCH
} },
- { set_smp_bios_reboot, "Dell PowerEdge 1300", { /* Handle problems with rebooting on Dell 1300's */
+ { set_bios_reboot, "Dell PowerEdge 1300", { /* Handle problems with rebooting on Dell 1300's */
MATCH(DMI_SYS_VENDOR, "Dell Computer Corporation"),
MATCH(DMI_PRODUCT_NAME, "PowerEdge 1300/"),
NO_MATCH, NO_MATCH
diff -uNr linux-2.5.69/arch/i386/kernel/io_apic.c linux-2.5.69.reboot_on_bsp/arch/i386/kernel/io_apic.c
--- linux-2.5.69/arch/i386/kernel/io_apic.c Sun May 11 09:09:25 2003
+++ linux-2.5.69.reboot_on_bsp/arch/i386/kernel/io_apic.c Sun May 11 14:41:26 2003
@@ -1545,8 +1545,6 @@
* Clear the IO-APIC before rebooting:
*/
clear_IO_APIC();
-
- disconnect_bsp_APIC();
}

/*
diff -uNr linux-2.5.69/arch/i386/kernel/reboot.c linux-2.5.69.reboot_on_bsp/arch/i386/kernel/reboot.c
--- linux-2.5.69/arch/i386/kernel/reboot.c Sun May 11 09:08:13 2003
+++ linux-2.5.69.reboot_on_bsp/arch/i386/kernel/reboot.c Sun May 11 17:53:34 2003
@@ -8,6 +8,7 @@
#include <linux/interrupt.h>
#include <linux/mc146818rtc.h>
#include <asm/uaccess.h>
+#include <asm/apic.h>
#include "mach_reboot.h"

/*
@@ -19,9 +20,8 @@
static int reboot_mode;
int reboot_thru_bios;

+int reboot_cpu = -1; /* specifies the internal linux cpu id, not the apicid */
#ifdef CONFIG_SMP
-int reboot_smp = 0;
-static int reboot_cpu = -1;
/* shamelessly grabbed from lib/vsprintf.c for readability */
#define is_digit(c) ((c) >= '0' && (c) <= '9')
#endif
@@ -43,12 +43,14 @@
break;
#ifdef CONFIG_SMP
case 's': /* "smp" reboot by executing reset on BSP or other CPU*/
- reboot_smp = 1;
if (is_digit(*(str+1))) {
reboot_cpu = (int) (*(str+1) - '0');
if (is_digit(*(str+2)))
reboot_cpu = reboot_cpu*10 + (int)(*(str+2) - '0');
}
+ if ((reboot_cpu < -1) || (reboot_cpu >= NR_CPUS)) {
+ reboot_cpu = -1;
+ }
/* we will leave sorting out the final value
when we are ready to reboot, since we might not
have set up boot_cpu_id or smp_num_cpu */
@@ -65,6 +67,20 @@

__setup("reboot=", reboot_setup);

+
+void stop_this_cpu(void)
+{
+ /*
+ * Remove this CPU:
+ */
+#if CONFIG_SMP
+ clear_bit(smp_processor_id(), &cpu_online_map);
+#endif
+ if (cpu_data[smp_processor_id()].hlt_works_ok)
+ for(;;) __asm__("hlt");
+ for (;;);
+}
+
/* The following code and data reboots the machine by switching to real
mode and jumping to the BIOS reset entry point, as if the CPU has
really been reset. The previous version asked the keyboard
@@ -213,45 +229,8 @@
: "i" ((void *) (0x1000 - sizeof (real_mode_switch) - 100)));
}

-void machine_restart(char * __unused)
+static void machine_restart_1(void * __unused)
{
-#if CONFIG_SMP
- int cpuid;
-
- cpuid = GET_APIC_ID(apic_read(APIC_ID));
-
- if (reboot_smp) {
-
- /* check to see if reboot_cpu is valid
- if its not, default to the BSP */
- if ((reboot_cpu == -1) ||
- (reboot_cpu > (NR_CPUS -1)) ||
- !(phys_cpu_present_map & (1<<cpuid)))
- reboot_cpu = boot_cpu_physical_apicid;
-
- reboot_smp = 0; /* use this as a flag to only go through this once*/
- /* re-run this function on the other CPUs
- it will fall though this section since we have
- cleared reboot_smp, and do the reboot if it is the
- correct CPU, otherwise it halts. */
- if (reboot_cpu != cpuid)
- smp_call_function((void *)machine_restart , NULL, 1, 0);
- }
-
- /* if reboot_cpu is still -1, then we want a tradional reboot,
- and if we are not running on the reboot_cpu,, halt */
- if ((reboot_cpu != -1) && (cpuid != reboot_cpu)) {
- for (;;)
- __asm__ __volatile__ ("hlt");
- }
- /*
- * Stop all CPUs and turn off local APICs and the IO-APIC, so
- * other OSs see a clean IRQ state.
- */
- smp_send_stop();
- disable_IO_APIC();
-#endif
-
if(!reboot_thru_bios) {
/* rebooting needs to touch the page at absolute addr 0 */
*((unsigned short *)__va(0x472)) = reboot_mode;
@@ -265,14 +244,27 @@

machine_real_restart(jump_to_bios, sizeof(jump_to_bios));
}
+void machine_restart(char * __unused)
+{
+ stop_apics(machine_restart_1, 0);
+}

+static void machine_halt_1(void * __unused)
+{
+ stop_this_cpu();
+}
void machine_halt(void)
{
+ stop_apics(machine_halt_1, 0);
}

-void machine_power_off(void)
+static void machine_power_off_1(void * __unused)
{
if (pm_power_off)
pm_power_off();
+ stop_this_cpu();
+}
+void machine_power_off(void)
+{
+ stop_apics(machine_power_off_1, 0);
}
-
diff -uNr linux-2.5.69/arch/i386/kernel/smp.c linux-2.5.69.reboot_on_bsp/arch/i386/kernel/smp.c
--- linux-2.5.69/arch/i386/kernel/smp.c Sun May 11 09:08:43 2003
+++ linux-2.5.69.reboot_on_bsp/arch/i386/kernel/smp.c Sun May 11 17:54:13 2003
@@ -539,32 +539,6 @@
return 0;
}

-static void stop_this_cpu (void * dummy)
-{
- /*
- * Remove this CPU:
- */
- clear_bit(smp_processor_id(), &cpu_online_map);
- local_irq_disable();
- disable_local_APIC();
- if (cpu_data[smp_processor_id()].hlt_works_ok)
- for(;;) __asm__("hlt");
- for (;;);
-}
-
-/*
- * this function calls the 'stop' function on all other CPUs in the system.
- */
-
-void smp_send_stop(void)
-{
- smp_call_function(stop_this_cpu, NULL, 1, 0);
-
- local_irq_disable();
- disable_local_APIC();
- local_irq_enable();
-}
-
/*
* Reschedule call back. Nothing to do,
* all the work is done automatically when
diff -uNr linux-2.5.69/include/asm-i386/apic.h linux-2.5.69.reboot_on_bsp/include/asm-i386/apic.h
--- linux-2.5.69/include/asm-i386/apic.h Sun May 11 09:08:53 2003
+++ linux-2.5.69.reboot_on_bsp/include/asm-i386/apic.h Sun May 11 17:11:16 2003
@@ -3,6 +3,7 @@

#include <linux/config.h>
#include <linux/pm.h>
+#include <linux/linkage.h>
#include <asm/fixmap.h>
#include <asm/apicdef.h>
#include <asm/system.h>
@@ -99,6 +100,18 @@
#define NMI_LOCAL_APIC 2
#define NMI_INVALID 3

+extern NORET_TYPE void
+stop_apics(NORET_TYPE void (*rest)(void *info) ATTRIB_NORET, void *info)
+ATTRIB_NORET;
+#else
+static inline NORET_TYPE void
+stop_apics(NORET_TYPE void (*rest)(void *info) ATTRIB_NORET, void *info)
+ATTRIB_NORET;
+static inline void
+stop_apics(NORET_TYPE void (*rest)(void *info) ATTRIB_NORET, void *info)
+{
+ rest(info);
+}
#endif /* CONFIG_X86_LOCAL_APIC */

#endif /* __ASM_APIC_H */
diff -uNr linux-2.5.69/include/asm-i386/mach-default/mach_reboot.h linux-2.5.69.reboot_on_bsp/include/asm-i386/mach-default/mach_reboot.h
--- linux-2.5.69/include/asm-i386/mach-default/mach_reboot.h Sun May 11 09:08:36 2003
+++ linux-2.5.69.reboot_on_bsp/include/asm-i386/mach-default/mach_reboot.h Sun May 11 17:12:02 2003
@@ -27,4 +27,6 @@
}
}

+void stop_this_cpu(void);
+
#endif /* !_MACH_REBOOT_H */
diff -uNr linux-2.5.69/include/linux/reboot.h linux-2.5.69.reboot_on_bsp/include/linux/reboot.h
--- linux-2.5.69/include/linux/reboot.h Thu Dec 12 07:41:37 2002
+++ linux-2.5.69.reboot_on_bsp/include/linux/reboot.h Sun May 11 17:12:29 2003
@@ -35,6 +35,7 @@
#ifdef __KERNEL__

#include <linux/notifier.h>
+#include <linux/linkage.h>

extern int register_reboot_notifier(struct notifier_block *);
extern int unregister_reboot_notifier(struct notifier_block *);
@@ -44,9 +45,9 @@
* Architecture-specific implementations of sys_reboot commands.
*/

-extern void machine_restart(char *cmd);
-extern void machine_halt(void);
-extern void machine_power_off(void);
+NORET_TYPE void machine_restart(char *cmd) ATTRIB_NORET;
+NORET_TYPE void machine_halt(void) ATTRIB_NORET;
+NORET_TYPE void machine_power_off(void) ATTRIB_NORET;

#endif

diff -uNr linux-2.5.69/kernel/panic.c linux-2.5.69.reboot_on_bsp/kernel/panic.c
--- linux-2.5.69/kernel/panic.c Sun May 11 09:09:21 2003
+++ linux-2.5.69.reboot_on_bsp/kernel/panic.c Sun May 11 17:13:01 2003
@@ -63,7 +63,7 @@
sys_sync();
bust_spinlocks(0);

-#ifdef CONFIG_SMP
+#if defined(CONFIG_SMP) && !defined(__i386__)
smp_send_stop();
#endif

diff -uNr linux-2.5.69/kernel/sys.c linux-2.5.69.reboot_on_bsp/kernel/sys.c
--- linux-2.5.69/kernel/sys.c Sun May 11 09:09:21 2003
+++ linux-2.5.69.reboot_on_bsp/kernel/sys.c Sun May 11 17:13:24 2003
@@ -415,8 +415,6 @@
device_shutdown();
printk(KERN_EMERG "System halted.\n");
machine_halt();
- unlock_kernel();
- do_exit(0);
break;

case LINUX_REBOOT_CMD_POWER_OFF:
@@ -425,8 +423,6 @@
device_shutdown();
printk(KERN_EMERG "Power down.\n");
machine_power_off();
- unlock_kernel();
- do_exit(0);
break;

case LINUX_REBOOT_CMD_RESTART2:

2003-05-12 15:23:00

by Maciej W. Rozycki

Subject: Re: [PATCH] Use correct x86 reboot vector

On 11 May 2003, Eric W. Biederman wrote:

> There is some software at least that knows the difference. I have seen short
> jumps in a couple of BIOS's. But a reset is very different from a
> reboot. As memory must be reinitialized etc. So I think going to
> 0xffff0000:0xfff0 would be a very bad idea if the intent is to get a
> reliable reboot.

You may change a bit in the i8042 controller to make a BIOS assume that's
a cold boot. The bit is zeroed (IIRC; apply a complement if my memory is
bad) upon a system RESET that's propagated to the i8042 (i.e. a power-on
or a button reset, but not a triple-fault or i8042 output port or port
0x92, etc. one). The bit is set to one by a BIOS during POST and never
zeroed afterwards, but it's r/w, so there is no problem to clear it if
needed. This should be quite a reliable way to reboot as a BIOS is
assumed to initialize hardware from scratch (regardless of the reset
vector used).

This assumes 100% PC/AT compatibility, of course, which need not be true
these days any longer.
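
A minimal sketch of the bit manipulation described above, assuming the
bit in question is the "system flag" (bit 2, mask 0x04) of the 8042
command byte. That bit position and the helper names are assumptions
made for this example, not something stated in the message. On real
hardware the byte would be read with controller command 0x20 and written
back with command 0x60; here the functions only operate on a value so
the logic can be checked without port I/O.

```c
#include <stdint.h>

/* Assumed position of the "system flag" in the 8042 command byte. */
#define I8042_CTR_SYSFLAG 0x04u

/* Non-zero if the flag says POST has already completed (warm state). */
int i8042_post_completed(uint8_t ctr)
{
    return (ctr & I8042_CTR_SYSFLAG) != 0;
}

/*
 * Return the command byte with the system flag cleared, so that a BIOS
 * which honours the flag would treat the next reset as a cold boot.
 * All other bits are left untouched.
 */
uint8_t i8042_request_cold_boot(uint8_t ctr)
{
    return (uint8_t)(ctr & (uint8_t)~I8042_CTR_SYSFLAG);
}
```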

--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--------------------------------------------------------------+
+ e-mail: [email protected], PGP key available +

2003-05-13 06:24:52

by H. Peter Anvin

Subject: Re: [PATCH] Use correct x86 reboot vector

Followup to: <[email protected]>
By author: [email protected] (Eric W. Biederman)
In newsgroup: linux.dev.kernel
>
> There is some software at least that knows the difference. I have seen short
> jumps in a couple of BIOS's. But a reset is very different from a
> reboot. As memory must be reinitialized etc. So I think going to
> 0xffff0000:0xfff0 would be a very bad idea if the intent is to get a
> reliable reboot.
>

I agree.

Jumping to 0xf000:0xfff0 is widely accepted to be a standard warm
reboot (as *should* an INIT, e.g. triplefault, be, as well -- make
sure A20 is enabled before tripping, though.) For quite a few (most?)
BIOSes, the vector that is stored at 0xf000:0xfff0 in the running
(BIOS decompressed and shadowed) configuration is *not* the same as
the one at the RESET vector.
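
For reference, real-mode addressing computes segment * 16 + offset, so
the two far pointers discussed in this thread (f000:fff0 and ffff:0000)
name the same 20-bit linear address; what differs is the CS:IP pair the
BIOS code ends up running with. A small sketch that decodes the 5-byte
0xEA far jump from the original patch and computes the linear target
(the helper names real_mode_linear and decode_ljmp16 are made up for
illustration):

```c
#include <stdint.h>

struct far_ptr {
    uint16_t seg;
    uint16_t off;
};

/* Real-mode linear address: segment shifted left by 4, plus offset. */
uint32_t real_mode_linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/*
 * Decode a 16-bit far jump: opcode 0xEA, then the offset and the
 * segment, each stored little-endian. Returns 0 on success, -1 if the
 * first byte is not 0xEA.
 */
int decode_ljmp16(const uint8_t insn[5], struct far_ptr *out)
{
    if (insn[0] != 0xEA)
        return -1;
    out->off = (uint16_t)(insn[1] | (insn[2] << 8));
    out->seg = (uint16_t)(insn[3] | (insn[4] << 8));
    return 0;
}
```

Feeding it the new bytes (0xea, 0xf0, 0xff, 0x00, 0xf0) gives f000:fff0,
and the old bytes (0xea, 0x00, 0x00, 0xff, 0xff) give ffff:0000; both map
to linear 0xffff0.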

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64

2003-05-13 12:39:18

by Chuck Ebbert

Subject: Re: [PATCH] Use correct x86 reboot vector

Christer Weinigel wrote:

> BTW, what does Windows do here? Whatever Windows is using should work
> with Linux too.

I've only ever seen NT4/2K do a warm reboot, if that's relevant.

FreeBSD unmaps every page in the machine and then flushes the
TLB as its last-resort reboot attempt. I assume this causes a
triplefault...

2003-05-13 18:32:39

by H. Peter Anvin

Subject: Re: [PATCH] Use correct x86 reboot vector

Followup to: <[email protected]>
By author: Chuck Ebbert <[email protected]>
In newsgroup: linux.dev.kernel
>
> Christer Weinigel wrote:
>
> > BTW, what does Windows do here? Whatever Windows is using should work
> > with Linux too.
>
> I've only ever seen NT4/2K do a warm reboot, if that's relevant.
>
> FreeBSD unmaps every page in the machine and then flushes the
> TLB as its last-resort reboot attempt. I assume this causes a
> triplefault...
>

So it does. It's easier, though, to set the limit on the IDTR to zero
and then trap.
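
A hedged sketch of what that could look like with GCC-style inline
assembly. The names idt_ptr, empty_idt and force_triple_fault are
hypothetical, and this only has the intended effect in ring 0; in user
space the lidt instruction simply faults. With a zero-limit IDT no
exception can be delivered, so the breakpoint escalates to a double
fault and then a triple fault, which shuts the processor down.

```c
#include <stdint.h>

/* Operand format for lidt: 16-bit limit followed by the base address. */
struct idt_ptr {
    uint16_t  limit;
    uintptr_t base;
} __attribute__((packed));

/* An IDT with limit 0 and base 0: no usable descriptors at all. */
const struct idt_ptr empty_idt = { 0, 0 };

#if defined(__i386__) || defined(__x86_64__)
/* Must run in kernel mode; never call this from user space. */
void force_triple_fault(void)
{
    /* Load the empty IDT ... */
    __asm__ __volatile__("lidt %0" : : "m" (empty_idt));
    /* ... then trap; the exception cannot be delivered. */
    __asm__ __volatile__("int3");
    for (;;)
        ;  /* not reached if the processor resets */
}
#endif
```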

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64

2003-05-13 18:50:40

by Richard B. Johnson

Subject: Re: [PATCH] Use correct x86 reboot vector

On Tue, 13 May 2003, H. Peter Anvin wrote:

> Followup to: <[email protected]>
> By author: Chuck Ebbert <[email protected]>
> In newsgroup: linux.dev.kernel
> >
> > Christer Weinigel wrote:
> >
> > > BTW, what does Windows do here? Whatever Windows is using should work
> > > with Linux too.
> >
> > I've only ever seen NT4/2K do a warm reboot, if that's relevant.
> >
> > FreeBSD unmaps every page in the machine and then flushes the
> > TLB as its last-resort reboot attempt. I assume this causes a
> > triplefault...
> >
>
> So it does. It's easier, though, to set the limit on the IDTR to zero
> and then trap.
>
> -hpa

Don't think there's anything much easier than:

movl $1, %eax
movl %eax, %cr0

... execute that in paged RAM (above the 1:1 mapping), and you
will get a hard processor reset without any bus access at all.
This unmaps everything in one fell swoop.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.

2003-05-13 19:14:38

by H. Peter Anvin

Subject: Re: [PATCH] Use correct x86 reboot vector

Richard B. Johnson wrote:
>
> Don't think there's anything much easier than:
>
> movl $1, %eax
> movl %eax, %cr0
>
> ... execute that in paged RAM (above the 1:1 mapping), and you
> will get a hard processor reset without any bus access at all.
> This unmaps everything in one fell swoop.
>

You go back to 1:1 mappings at that point, so you *will* have bus accesses.

-hpa