From: Mike Rapoport <[email protected]>
After the consolidation of early memory reservations introduced by commit
a799c2bd29d1 ("x86/setup: Consolidate early memory reservations"), the
kernel fails to boot if X86_RESERVE_LOW is set to 640K.
The boot fails because the real-mode trampoline must be allocated under 1M
(or essentially under 640K), but with X86_RESERVE_LOW set to 640K that
memory is already reserved by the time reserve_real_mode() is called.
Before the reordering of the early memory reservations it was possible to
allocate from low memory despite the user's request to avoid using that
memory. This inconsistency could potentially lead to the BIOS corrupting
memory in areas allocated by the kernel.
Decrease the maximum of the X86_RESERVE_LOW range to 512K to allow blocking
the use of most of the low memory by the kernel while still leaving space
for allocations that must be compatible with real mode.
Update the Kconfig help text of X86_RESERVE_LOW to make it explicit that
the kernel requires low memory to boot properly.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=213177
Fixes: a799c2bd29d1 ("x86/setup: Consolidate early memory reservations")
Signed-off-by: Mike Rapoport <[email protected]>
---
arch/x86/Kconfig | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 0045e1b44190..7a972b77819e 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1696,7 +1696,7 @@ config X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK
config X86_RESERVE_LOW
int "Amount of low memory, in kilobytes, to reserve for the BIOS"
default 64
- range 4 640
+ range 4 512
help
Specify the amount of low memory to reserve for the BIOS.
@@ -1711,8 +1711,11 @@ config X86_RESERVE_LOW
You can set this to 4 if you are absolutely sure that you
trust the BIOS to get all its memory reservations and usages
right. If you know your BIOS have problems beyond the
- default 64K area, you can set this to 640 to avoid using the
- entire low memory range.
+ default 64K area, you can set this to 512 to avoid using most
+ of the low memory range.
+
+ Note, that a part of the low memory range is still required for
+ kernel to boot properly.
If you have doubts about the BIOS (e.g. suspend/resume does
not work or there's kernel crashes after certain hardware
base-commit: c4681547bcce777daf576925a966ffa824edd09d
--
2.28.0
On Wed, May 26, 2021 at 11:11:00AM +0300, Mike Rapoport wrote:
> From: Mike Rapoport <[email protected]>
>
> After the consolidation of early memory reservations introduced by the
> commit a799c2bd29d1 ("x86/setup: Consolidate early memory reservations")
> the kernel fails to boot if X86_RESERVE_LOW is set to 640K.
>
> The boot fails because real-mode trampoline must be allocated under 1M (or
> essentially under 640K) but with X86_RESERVE_LOW set to 640K the memory is
> already reserved by the time reserve_real_mode() is called.
>
> Before the reordering of the early memory reservations it was possible to
> allocate from low memory even despite user's request to avoid using that
> memory. This lack of consistency could potentially lead to memory
> corruptions by BIOS in the areas allocated by kernel.
Hmm, so this sounds weird to me: the real-mode trampoline clearly has
precedence over X86_RESERVE_LOW because you need the former to boot the
machine, right?
In that case, the real-mode trampoline should be allocated first and *then*
the rest of the low range requested to be reserved should be reserved, no?
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, May 26, 2021 at 10:47:21AM +0200, Borislav Petkov wrote:
> On Wed, May 26, 2021 at 11:11:00AM +0300, Mike Rapoport wrote:
> > From: Mike Rapoport <[email protected]>
> >
> > After the consolidation of early memory reservations introduced by the
> > commit a799c2bd29d1 ("x86/setup: Consolidate early memory reservations")
> > the kernel fails to boot if X86_RESERVE_LOW is set to 640K.
> >
> > The boot fails because real-mode trampoline must be allocated under 1M (or
> > essentially under 640K) but with X86_RESERVE_LOW set to 640K the memory is
> > already reserved by the time reserve_real_mode() is called.
> >
> > Before the reordering of the early memory reservations it was possible to
> > allocate from low memory even despite user's request to avoid using that
> > memory. This lack of consistency could potentially lead to memory
> > corruptions by BIOS in the areas allocated by kernel.
>
> Hmm, so this sounds weird to me: real-mode trampoline clearly has
> precedence over X86_RESERVE_LOW because you need former to boot the
> machine, right?
>
> In that case, real-mode trampoline should allocate first and *then* the
> rest of low range requested to be reserved should be reserved, no?
We can restore that behaviour, but it feels like cheating to me. We let the
user say "Hey, don't touch low memory at all", even though we know we must
use at least some of it. And then we sneak in an allocation under 640K
despite the user's request not to use it.
--
Sincerely yours,
Mike.
On Wed, May 26, 2021 at 07:30:09PM +0300, Mike Rapoport wrote:
> We can restore that behaviour, but it feels like cheating to me. We let
> user say "Hey, don't touch low memory at all", even though we know we must
> use at least some of it. And then we sneak in an allocation under 640K
> despite user's request not to use it.
Sure, but how are we going to tell the user that if we don't sneak in that
allocation, we won't boot at all? I believe the user would kinda like the
box to boot still, no? :-)
Yeah, you have that now:
+ Note, that a part of the low memory range is still required for
+ kernel to boot properly.
but then why is 512 ok? And why was 640K the upper limit?
Looking at:
d0cd7425fab7 ("x86, bios: By default, reserve the low 64K for all BIOSes")
and reading that bugzilla
https://bugzilla.kernel.org/show_bug.cgi?id=16661
it sounds like it is the amount of memory where BIOS could put crap in.
Long story short, we reserve the first 64K by default so if someone
reserves the total range of 640K the early code could probably say
something like
"adjusting upper reserve limit to X for the real-mode trampoline"
when the upper limit is too high so that a trampoline can't fit...
Which is basically what your solution does...
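For illustration only, a minimal C sketch of such a clamp could look like
the following. The helper name clamp_reserve_low(), the KIB() macro and the
32K trampoline budget are assumptions made up for this example (the thread
only estimates the real mode blob at roughly 20-30K); this is not existing
kernel code and it ignores the EBDA reservation entirely.

```c
#include <stdio.h>

#define KIB(x)            ((unsigned long)(x) * 1024UL)
#define LOW_MEM_END       KIB(640)  /* start of the legacy video area */
#define TRAMPOLINE_BYTES  KIB(32)   /* assumed budget, not a kernel constant */

/*
 * Hypothetical helper: clamp the requested low-memory reservation (in
 * bytes) so that TRAMPOLINE_BYTES still fit between the end of the
 * reservation and LOW_MEM_END. Returns the (possibly adjusted) size.
 */
static unsigned long clamp_reserve_low(unsigned long requested)
{
	unsigned long max_reserve = LOW_MEM_END - TRAMPOLINE_BYTES;

	if (requested > max_reserve) {
		printf("adjusting upper reserve limit to %luK for the real-mode trampoline\n",
		       max_reserve / 1024);
		return max_reserve;
	}

	return requested;
}
```

With these assumed numbers a request for the full 640K would be trimmed to
608K, while the default 64K would pass through unchanged.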
But then the previous behavior used to work everywhere so if it is only
cheating, I don't mind doing that as long as boxes keep on booting.
Or am I missing an aspect?
--
Regards/Gruss,
Boris.
On Wed, May 26, 2021 at 08:14:44PM +0200, Borislav Petkov wrote:
> On Wed, May 26, 2021 at 07:30:09PM +0300, Mike Rapoport wrote:
> > We can restore that behaviour, but it feels like cheating to me. We let
> > user say "Hey, don't touch low memory at all", even though we know we must
> > use at least some of it. And then we sneak in an allocation under 640K
> > despite user's request not to use it.
>
> Sure but how are we going to tell the user that if we don't sneak that
> allocation, we won't boot at all. I believe user would kinda like the
> box to boot still, no? :-)
>
> Yeah, you have that now:
>
> + Note, that a part of the low memory range is still required for
> + kernel to boot properly.
>
> but then why is 512 ok? And why was 640K the upper limit?
Well, 640K is a well-known memory limit :)
And 512K is the closest power of 2 that still leaves plenty of space for
the trampoline.
> Looking at:
>
> d0cd7425fab7 ("x86, bios: By default, reserve the low 64K for all BIOSes")
>
> and reading that bugzilla
>
> https://bugzilla.kernel.org/show_bug.cgi?id=16661
>
> it sounds like it is the amount of memory where BIOS could put crap in.
>
> Long story short, we reserve the first 64K by default so if someone
> reserves the total range of 640K the early code could probably say
> something like
>
> "adjusting upper reserve limit to X for the real-mode trampoline"
>
> when the upper limit is too high so that a trampoline can't fit...
>
> Which is basically what your solution does...
>
> But then the previous behavior used to work everywhere so if it is only
> cheating, I don't mind doing that as long as boxes keep on booting.
>
> Or am I missing an aspect?
Another aspect, IMHO, is that making things explicit would reduce the number
of hidden dependencies and in the end make x86::setup_arch() less fragile.
I'm now also looking at:
5bc653b73182 ("x86/efi: Allocate a trampoline if needed in efi_free_boot_services()")
that retries the allocation of the trampoline when we free EFI boot
services, so there could also be a conflict between reserve_real_mode() and
reserve_bios_regions() in case the EBDA is too low.
So what we have is:
- BIOSes that corrupt low memory
- EBDA of unknown size that can be as low as 128k, so we reserve everything
  from EBDA start to 640k because we don't trust BIOSes to report EBDA size
  properly
- Real mode blob of about 20-30k that must live in the first 640k
- Build time setting to reserve Xk (4k <= X <= 640k) with the default set
  to 64k
- Command line option to reserve Yk (4k <= Y <= 640k), this takes precedence
  over the build time option
- A late fallback that uses memory freed from EFI data to place real mode
  trampoline there
It seems to me that we can drop both the build time and run time options
entirely, reserve 64k early to keep the trampoline out of that range, and
then always reserve everything below 640k after reserve_real_mode().
The late fallback for systems that have most of low memory busy with
BIOS/EFI will remain intact, as it does not do a memblock allocation anyway.
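To make the ordering argument concrete, here is a toy userspace model in C.
It is only a sketch under stated assumptions: low memory is modelled as 160
pages of 4K with a first-fit allocator, which is not how memblock works, and
every name in it (struct low_mem, low_mem_reserve(), low_mem_alloc(),
boot_reserve_all_first(), boot_proposed_order()) is hypothetical. The EBDA
and the late EFI fallback are left out.

```c
#include <stdbool.h>
#include <string.h>

#define SIM_PAGE_SIZE  4096UL
#define LOW_PAGES      160UL   /* 640K / 4K */

/* Toy model of the first 640K: one flag per 4K page. */
struct low_mem {
	bool reserved[LOW_PAGES];
};

static void low_mem_init(struct low_mem *m)
{
	memset(m->reserved, 0, sizeof(m->reserved));
}

/* Mark the page-aligned byte range [start, end) as reserved. */
static void low_mem_reserve(struct low_mem *m, unsigned long start,
			    unsigned long end)
{
	for (unsigned long p = start / SIM_PAGE_SIZE;
	     p < end / SIM_PAGE_SIZE && p < LOW_PAGES; p++)
		m->reserved[p] = true;
}

/* First-fit allocation of size bytes; returns the address or -1. */
static long low_mem_alloc(struct low_mem *m, unsigned long size)
{
	unsigned long need = (size + SIM_PAGE_SIZE - 1) / SIM_PAGE_SIZE;
	unsigned long run = 0;

	for (unsigned long p = 0; p < LOW_PAGES; p++) {
		run = m->reserved[p] ? 0 : run + 1;
		if (run == need) {
			unsigned long first = p + 1 - need;

			low_mem_reserve(m, first * SIM_PAGE_SIZE,
					(p + 1) * SIM_PAGE_SIZE);
			return (long)(first * SIM_PAGE_SIZE);
		}
	}

	return -1;
}

/* Failing order: all of the low 640K is reserved before the trampoline. */
static long boot_reserve_all_first(unsigned long trampoline_size)
{
	struct low_mem m;

	low_mem_init(&m);
	low_mem_reserve(&m, 0, LOW_PAGES * SIM_PAGE_SIZE);
	return low_mem_alloc(&m, trampoline_size);
}

/* Proposed order: first 64K, then the trampoline, then everything else. */
static long boot_proposed_order(unsigned long trampoline_size)
{
	struct low_mem m;
	long addr;

	low_mem_init(&m);
	low_mem_reserve(&m, 0, 64 * 1024UL);
	addr = low_mem_alloc(&m, trampoline_size);
	low_mem_reserve(&m, 0, LOW_PAGES * SIM_PAGE_SIZE);
	return addr;
}
```

In this model, with an assumed 32K blob, boot_reserve_all_first() finds no
room for the trampoline, while boot_proposed_order() places it directly
above the first 64K and only then closes off the rest of the range.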
--
Sincerely yours,
Mike.
On 5/26/21 11:14 AM, Borislav Petkov wrote:
> On Wed, May 26, 2021 at 07:30:09PM +0300, Mike Rapoport wrote:
>> We can restore that behaviour, but it feels like cheating to me. We let
>> user say "Hey, don't touch low memory at all", even though we know we must
>> use at least some of it. And then we sneak in an allocation under 640K
>> despite user's request not to use it.
>
> Sure but how are we going to tell the user that if we don't sneak that
> allocation, we won't boot at all. I believe user would kinda like the
> box to boot still, no? :-)
>
> Yeah, you have that now:
>
> + Note, that a part of the low memory range is still required for
> + kernel to boot properly.
>
> but then why is 512 ok? And why was 640K the upper limit?
>
> Looking at:
>
> d0cd7425fab7 ("x86, bios: By default, reserve the low 64K for all BIOSes")
>
> and reading that bugzilla
>
> https://bugzilla.kernel.org/show_bug.cgi?id=16661
>
> it sounds like it is the amount of memory where BIOS could put crap in.
>
> Long story short, we reserve the first 64K by default so if someone
> reserves the total range of 640K the early code could probably say
> something like
>
> "adjusting upper reserve limit to X for the real-mode trampoline"
>
> when the upper limit is too high so that a trampoline can't fit...
>
> Which is basically what your solution does...
>
> But then the previous behavior used to work everywhere so if it is only
> cheating, I don't mind doing that as long as boxes keep on booting.
>
> Or am I missing an aspect?
>
BIOSes have been known to clobber more than 64K. They aren't supposed to
clobber any.
640K is the limit because that is the address of the EGA/VGA frame
buffer. In the words of Bill Gates "640K ought to be enough for anyone."
-hpa
On Thu, May 27, 2021 at 07:12:51PM -0700, H. Peter Anvin wrote:
> BIOSes have been known to clobber more than 64K. They aren't supposed to
> clobber any.
Yah, the BIOSes and what they're not supposed to do. Like they even care.
> 640K is the limit because that is the address of the EGA/VGA frame
> buffer. In the words of Bill Gates "640K ought to be enough for anyone."
Right.
So thoughts on:
https://lkml.kernel.org/r/YK%[email protected]
?
Time to do what windoze 7 does?
--
Regards/Gruss,
Boris.
On Thu, May 27, 2021 at 04:38:07PM +0300, Mike Rapoport wrote:
> Well 640K is well known memory limit :)
Yah, that I remember - was just wondering why but I guess it was out of
caution to cover all that a BIOS *could* touch, see hpa's reply.
> Another aspect IMHO is that making things explicit would reduce the amount
> of hidden dependencies and in the end make x86::setup_arch() less fragile.
Hohumm.
> I'm looking now also at:
>
> 5bc653b73182 ("x86/efi: Allocate a trampoline if needed in efi_free_boot_services()")
>
> that retries the allocation of trampoline when we free EFI services, so
> there is also could be a conflict between reserve_real_mode() and
> reserve_bios_regions() in case EBDA is too low.
>
> So what we have is
> - BIOSes that corrupt low memory
> - EBDA of unknown size that can be as low as 128k, so we reserve everything
> from EBDA start to 640k because we don't trust BIOSes to report EBDA size
> properly
> - Real mode blob of about 20-30k that must live in the first 640k
> - Build time setting to reserve Xk (4K <= X <= 640k) with the default set
> to 64k
> - Command line option to reserve Yk (4K <= Y <= 640k), this takes precedence
> over the build time option.
> - A late fallback that uses memory freed from EFI data to place real mode
> trampoline there
>
> It seems to me that we can drop both build time and run time options
> entirely, reserve 64k early to avoid having trampoline there and then
> always reserve everything below 640k after reserve_real_mode().
>
> The late fallback for systems that have most of low memory busy with
> BIOS/EFI will remain intact as it does not do memblock allocation anyway.
Yah, I certainly like the simplification. The first 640K seems to be a
minefield anyway, and to quote from that bugzilla again:
https://bugzilla.kernel.org/show_bug.cgi?id=16661#c2
"As far as I know, Windows 7 actually reserves all memory below 1 MiB to
avoid BIOS bugs."
so yeah, I think we should do that. But pls put that justification above
in the commit message so that we know why we did it.
Thx.
--
Regards/Gruss,
Boris.
Please just do it. It is insane to spend more effort on reclaiming at the very most a few hundred kilobytes in this day and age.
On May 28, 2021 7:43:04 AM PDT, Borislav Petkov <[email protected]> wrote:
>On Thu, May 27, 2021 at 07:12:51PM -0700, H. Peter Anvin wrote:
>> BIOSes have been known to clobber more than 64K. They aren't supposed
>to
>> clobber any.
>
>Yah, the BIOSes and what they're not supposed to do. Like they even
>care.
>
>> 640K is the limit because that is the address of the EGA/VGA frame
>> buffer. In the words of Bill Gates "640K ought to be enough for
>anyone."
>
>Right.
>
>So thoughts on:
>
>https://lkml.kernel.org/r/YK%[email protected]
>
>?
>
>Time to do what windoze 7 does?
From: H. Peter Anvin
> Sent: 28 May 2021 03:13
....
> BIOSes have been known to clobber more than 64K. They aren't supposed to
> clobber any.
They probably shouldn't need anything above the base of the DOS
transient program area preserved.
Can't remember where that is though :-(
It is hard enough finding a safe memory area for the MBR
code to relocate itself to before loading the PBR.
Both the MBR and PBR load at the same address - 0xc00.
> 640K is the limit because that is the address of the EGA/VGA frame
> buffer. In the words of Bill Gates "640K ought to be enough for anyone."
I thought the original memory map allocated 512K for memory
and 512K for memory-mapped I/O.
No one could afford more than 512K DRAM :-)
The 640K limit appears because nothing was actually mapped
at the bottom of the 'I/O area', so memory could expand up
that far.
David
Sorry... very few correct answers here.
The load address is 0x7c00. The MBR is relocated to 0x600 by DOS.
MS-DOS doesn't *have* a TPA base; you are thinking of CP/M-80.
The IBM PC memory map reserved 640-768K for video, but the first-generation adapters (CGA and MDA) didn't use the bottom 96K/64K (respectively), so it was possible to add a little more RAM. The ROM BIOS wouldn't enumerate it, though, so you had to hack around that. EGA took that away, but it seems that virtually no machines, including clones, had taken advantage of it anyway.
As far as affording it: the first IBM PC 5150 I personally used had 512K RAM. Not cheap, but not unheard of either; this was still before the 64K DRAM market crashed in early 1985.
On May 31, 2021 2:32:22 AM PDT, David Laight <[email protected]> wrote:
>From: H. Peter Anvin
>> Sent: 28 May 2021 03:13
>....
>> BIOSes have been known to clobber more than 64K. They aren't supposed
>to
>> clobber any.
>
>They probably shouldn't need anything above the base of the DOS
>transient program area preserved.
>Can't remember where that is though :-(
>
>It is hard enough finding a safe memory area for the MBR
>code to relocate itself to before loading the PBR.
>Both the MBR and PBR load at the same address - 0xc00.
>
>> 640K is the limit because that is the address of the EGA/VGA frame
>> buffer. In the words of Bill Gates "640K ought to be enough for
>anyone."
>
>I thought the original memory map allocated 512K for memory
>and 512k for memory mapped I/O.
>No one could afford more then 512K DRAM :-)
>
>The 640K limit appears because nothing was actually mapped
>as the bottom of the 'I/O area' so memory could expand up
>that far.
>
> David