2009-03-16 13:48:32

by Ozan Çağlayan

Subject: [BUG 2.6.29_rc8] BIOS Bug: CPU MTRRs don't cover all of memory, losing 0MB of RAM.

Hi,

Just compiled and tried to boot it on an HP ProLiant DL580 G5. Here's the
interesting part:

..
-- last_pfn = 0x82ffff max_arch_pfn = 0x1000000
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 0MB of RAM.
------------[ cut here ]------------
WARNING: at arch/x86/kernel/cpu/mtrr/main.c:1655 mtrr_trim_uncached_memory+0x2a9/0x2cd()
Hardware name: ProLiant DL580 G5
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.29_rc8-115 #1
Call Trace:
[<c013134d>] warn_slowpath+0x71/0xa8
[<c039971c>] ? _spin_unlock_irqrestore+0x19/0x1f
[<c039971c>] ? _spin_unlock_irqrestore+0x19/0x1f
[<c01319c4>] ? release_console_sem+0x185/0x1b2
[<c0131e49>] ? vprintk+0x280/0x2a5
[<c03973da>] ? printk+0xf/0x15
[<c053423f>] mtrr_trim_uncached_memory+0x2a9/0x2cd
[<c052f4bd>] setup_arch+0x439/0x99e
[<c01319c4>] ? release_console_sem+0x185/0x1b2
[<c0131e49>] ? vprintk+0x280/0x2a5
[<c0531b37>] ? __reserve_early+0xe4/0xf8
[<c03973da>] ? printk+0xf/0x15
[<c052b5b6>] start_kernel+0x7b/0x345
[<c052b085>] __init_begin+0x85/0x8d
---[ end trace 4eaa2a86a8e2da22 ]---
update e820 for mtrr
..

This is a quite generic x86 desktop kernel with only the
following differences for the server:

CONFIG_X86_GENERICARCH=y
CONFIG_X86_BIGSMP=y
CONFIG_MCORE2=y
CONFIG_HIGHMEM64G=y
# CONFIG_X86_GENERIC is not set

I didn't attach the full dmesg and config to avoid flooding the e-mail, but
if you need them or any other output/information, I can send them.

Thanks,


Ozan Çağlayan


2009-03-16 18:21:22

by Yinghai Lu

Subject: Re: [BUG 2.6.29_rc8] BIOS Bug: CPU MTRRs don't cover all of memory, losing 0MB of RAM.

On Mon, Mar 16, 2009 at 6:48 AM, Ozan Çağlayan <[email protected]> wrote:
> Hi,
>
> Just compiled and tried to boot it on an HP ProLiant DL580 G5. Here's the
> interesting part:
>
> ..
> -- last_pfn = 0x82ffff max_arch_pfn = 0x1000000
> x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
> WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 0MB of RAM.
> ------------[ cut here ]------------
> WARNING: at arch/x86/kernel/cpu/mtrr/main.c:1655 mtrr_trim_uncached_memory+0x2a9/0x2cd()
> Hardware name: ProLiant DL580 G5
> Modules linked in:
> Pid: 0, comm: swapper Not tainted 2.6.29_rc8-115 #1
> Call Trace:
> [<c013134d>] warn_slowpath+0x71/0xa8
> [<c039971c>] ? _spin_unlock_irqrestore+0x19/0x1f
> [<c039971c>] ? _spin_unlock_irqrestore+0x19/0x1f
> [<c01319c4>] ? release_console_sem+0x185/0x1b2
> [<c0131e49>] ? vprintk+0x280/0x2a5
> [<c03973da>] ? printk+0xf/0x15
> [<c053423f>] mtrr_trim_uncached_memory+0x2a9/0x2cd
> [<c052f4bd>] setup_arch+0x439/0x99e
> [<c01319c4>] ? release_console_sem+0x185/0x1b2
> [<c0131e49>] ? vprintk+0x280/0x2a5
> [<c0531b37>] ? __reserve_early+0xe4/0xf8
> [<c03973da>] ? printk+0xf/0x15
> [<c052b5b6>] start_kernel+0x7b/0x345
> [<c052b085>] __init_begin+0x85/0x8d
> ---[ end trace 4eaa2a86a8e2da22 ]---
> update e820 for mtrr
> ..
>
> This is a quite generic x86 desktop kernel with only the
> following differences for the server:
>
> CONFIG_X86_GENERICARCH=y
> CONFIG_X86_BIGSMP=y
> CONFIG_MCORE2=y
> CONFIG_HIGHMEM64G=y
> # CONFIG_X86_GENERIC is not set
>
> I didn't attach the full dmesg and config to avoid flooding the e-mail, but
> if you need them or any other output/information, I can send them.
>

can you check tip/master?

http://people.redhat.com/mingo/tip.git/readme.txt

and post boot log.

YH

2009-03-16 20:19:30

by Ozan Çağlayan

Subject: Re: [BUG 2.6.29_rc8] BIOS Bug: CPU MTRRs don't cover all of memory, losing 0MB of RAM.

Yinghai Lu wrote:

> can you check tip/master?
>
> http://people.redhat.com/mingo/tip.git/readme.txt
>
> and post boot log.

I compiled tip/master with the same configuration. Here's the head of the dmesg:
(BTW, I'm not getting this backtrace with 2.6.25.20. I know it's rather old but maybe
it will help)

---

Initializing cgroup subsys cpuset
Linux version 2.6.29-rc8-tip-tip (root@stinson) (gcc version 4.3.2 (Pardus Linux) ) #1 SMP Mon Mar 16 21:26:14 EET 2009
KERNEL supported cpus:
Intel GenuineIntel
AMD AuthenticAMD
NSC Geode by NSC
Cyrix CyrixInstead
Centaur CentaurHauls
Transmeta GenuineTMx86
Transmeta TransmetaCPU
UMC UMC UMC UMC
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009f400 (usable)
BIOS-e820: 000000000009f400 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000cfd43000 (usable)
BIOS-e820: 00000000cfd43000 - 00000000cfd4c000 (ACPI data)
BIOS-e820: 00000000cfd4c000 - 00000000cfd4d000 (usable)
BIOS-e820: 00000000cfd4d000 - 00000000d0000000 (reserved)
BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
BIOS-e820: 00000000fec00000 - 00000000fed00000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
BIOS-e820: 00000000ffc00000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 000000082ffff000 (usable)
DMI 2.5 present.
last_pfn = 0x82ffff max_arch_pfn = 0x1000000
MTRR default type: uncachable
MTRR fixed ranges enabled:
00000-9FFFF write-back
A0000-BFFFF uncachable
C0000-FFFFF write-protect
MTRR variable ranges enabled:
0 base 0000000000 mask 0000000000 write-back
1 base 00CFF00000 mask FFFFF00000 uncachable
2 base 00D0000000 mask FFF0000000 uncachable
3 base 00E0000000 mask FFE0000000 uncachable
4 base 0000004000 mask FFFFFFF000 uncachable
5 base 0000005000 mask FFFFFFF000 uncachable
6 base 0000006000 mask FFFFFFF000 uncachable
7 base 0000007000 mask FFFFFFF000 uncachable
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
get_mtrr: cpu0 reg00 base=0000000000 size=0010000000 write-back
get_mtrr: cpu0 reg01 base=00000cff00 size=0000000100 uncachable
get_mtrr: cpu0 reg02 base=00000d0000 size=0000010000 uncachable
get_mtrr: cpu0 reg03 base=00000e0000 size=0000020000 uncachable
get_mtrr: cpu0 reg04 base=0000000004 size=0000000001 uncachable
get_mtrr: cpu0 reg05 base=0000000005 size=0000000001 uncachable
get_mtrr: cpu0 reg06 base=0000000006 size=0000000001 uncachable
get_mtrr: cpu0 reg07 base=0000000007 size=0000000001 uncachable
e820 update range: 0000000000004000 - 0000000000008000 (usable) ==> (reserved)
e820 update range: 00000000cff00000 - 0000000100000000 (usable) ==> (reserved)
WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 0MB of RAM.
------------[ cut here ]------------
WARNING: at arch/x86/kernel/cpu/mtrr/cleanup.c:1079 mtrr_trim_uncached_memory+0x2a9/0x2cd()
Hardware name: ProLiant DL580 G5
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.29-rc8-tip-tip #1
Call Trace:
[<c0133871>] warn_slowpath+0x71/0xa8
[<c011e23a>] ? default_spin_lock_flags+0x9/0xb
[<c03aad3c>] ? _spin_unlock_irqrestore+0x17/0x1b
[<c011e23a>] ? default_spin_lock_flags+0x9/0xb
[<c03aad3c>] ? _spin_unlock_irqrestore+0x17/0x1b
[<c0133edd>] ? release_console_sem+0x176/0x1a3
[<c0134358>] ? vprintk+0x276/0x299
[<c03a87cc>] ? printk+0xf/0x13
[<c054aece>] mtrr_trim_uncached_memory+0x2a9/0x2cd
[<c0545cf5>] setup_arch+0x43c/0x9ab
[<c0133edd>] ? release_console_sem+0x176/0x1a3
[<c0134358>] ? vprintk+0x276/0x299
[<c0134358>] ? vprintk+0x276/0x299
[<c0548759>] ? __reserve_early+0xe4/0xf8
[<c03a87cc>] ? printk+0xf/0x13
[<c0543580>] start_kernel+0x77/0x323
[<c0543085>] __init_begin+0x85/0x8d
---[ end trace 4eaa2a86a8e2da22 ]---
update e820 for mtrr

--

Ozan Çağlayan
<ozan_at_pardus.org.tr>

2009-03-16 22:02:19

by Yinghai Lu

Subject: Re: [BUG 2.6.29_rc8] BIOS Bug: CPU MTRRs don't cover all of memory, losing 0MB of RAM.

Ozan Çağlayan wrote:
> Yinghai Lu wrote:
>
>> can you check tip/master?
>>
>> http://people.redhat.com/mingo/tip.git/readme.txt
>>
>> and post boot log.
>
> I compiled tip/master with the same configuration. Here's the head of the dmesg:
> (BTW, I'm not getting this backtrace with 2.6.25.20. I know it's rather old but maybe
> it will help)
>
> ---
>
> Initializing cgroup subsys cpuset
> Linux version 2.6.29-rc8-tip-tip (root@stinson) (gcc version 4.3.2 (Pardus Linux) ) #1 SMP Mon Mar 16 21:26:14 EET 2009
> KERNEL supported cpus:
> Intel GenuineIntel
> AMD AuthenticAMD
> NSC Geode by NSC
> Cyrix CyrixInstead
> Centaur CentaurHauls
> Transmeta GenuineTMx86
> Transmeta TransmetaCPU
> UMC UMC UMC UMC
> BIOS-provided physical RAM map:
> BIOS-e820: 0000000000000000 - 000000000009f400 (usable)
> BIOS-e820: 000000000009f400 - 00000000000a0000 (reserved)
> BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
> BIOS-e820: 0000000000100000 - 00000000cfd43000 (usable)
> BIOS-e820: 00000000cfd43000 - 00000000cfd4c000 (ACPI data)
> BIOS-e820: 00000000cfd4c000 - 00000000cfd4d000 (usable)
> BIOS-e820: 00000000cfd4d000 - 00000000d0000000 (reserved)
> BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
> BIOS-e820: 00000000fec00000 - 00000000fed00000 (reserved)
> BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
> BIOS-e820: 00000000ffc00000 - 0000000100000000 (reserved)
> BIOS-e820: 0000000100000000 - 000000082ffff000 (usable)
> DMI 2.5 present.
> last_pfn = 0x82ffff max_arch_pfn = 0x1000000
> MTRR default type: uncachable
> MTRR fixed ranges enabled:
> 00000-9FFFF write-back
> A0000-BFFFF uncachable
> C0000-FFFFF write-protect
> MTRR variable ranges enabled:
> 0 base 0000000000 mask 0000000000 write-back
> 1 base 00CFF00000 mask FFFFF00000 uncachable
> 2 base 00D0000000 mask FFF0000000 uncachable
> 3 base 00E0000000 mask FFE0000000 uncachable
> 4 base 0000004000 mask FFFFFFF000 uncachable
> 5 base 0000005000 mask FFFFFFF000 uncachable
> 6 base 0000006000 mask FFFFFFF000 uncachable
> 7 base 0000007000 mask FFFFFFF000 uncachable
> x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
> get_mtrr: cpu0 reg00 base=0000000000 size=0010000000 write-back
> get_mtrr: cpu0 reg01 base=00000cff00 size=0000000100 uncachable
> get_mtrr: cpu0 reg02 base=00000d0000 size=0000010000 uncachable
> get_mtrr: cpu0 reg03 base=00000e0000 size=0000020000 uncachable
> get_mtrr: cpu0 reg04 base=0000000004 size=0000000001 uncachable
> get_mtrr: cpu0 reg05 base=0000000005 size=0000000001 uncachable
> get_mtrr: cpu0 reg06 base=0000000006 size=0000000001 uncachable
> get_mtrr: cpu0 reg07 base=0000000007 size=0000000001 uncachable
> e820 update range: 0000000000004000 - 0000000000008000 (usable) ==> (reserved)
> e820 update range: 00000000cff00000 - 0000000100000000 (usable) ==> (reserved)
> WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 0MB of RAM.
> ------------[ cut here ]------------
> WARNING: at arch/x86/kernel/cpu/mtrr/cleanup.c:1079 mtrr_trim_uncached_memory+0x2a9/0x2cd()
> Hardware name: ProLiant DL580 G5
> Modules linked in:
> Pid: 0, comm: swapper Not tainted 2.6.29-rc8-tip-tip #1
> Call Trace:
> [<c0133871>] warn_slowpath+0x71/0xa8
> [<c011e23a>] ? default_spin_lock_flags+0x9/0xb
> [<c03aad3c>] ? _spin_unlock_irqrestore+0x17/0x1b
> [<c011e23a>] ? default_spin_lock_flags+0x9/0xb
> [<c03aad3c>] ? _spin_unlock_irqrestore+0x17/0x1b
> [<c0133edd>] ? release_console_sem+0x176/0x1a3
> [<c0134358>] ? vprintk+0x276/0x299
> [<c03a87cc>] ? printk+0xf/0x13
> [<c054aece>] mtrr_trim_uncached_memory+0x2a9/0x2cd
> [<c0545cf5>] setup_arch+0x43c/0x9ab
> [<c0133edd>] ? release_console_sem+0x176/0x1a3
> [<c0134358>] ? vprintk+0x276/0x299
> [<c0134358>] ? vprintk+0x276/0x299
> [<c0548759>] ? __reserve_early+0xe4/0xf8
> [<c03a87cc>] ? printk+0xf/0x13
> [<c0543580>] start_kernel+0x77/0x323
> [<c0543085>] __init_begin+0x85/0x8d
> ---[ end trace 4eaa2a86a8e2da22 ]---
> update e820 for mtrr
>


please check

[PATCH] x86: workaround system with strange var MTRR

Impact: don't trim e820 according to wrong mtrr

Ozan reported that his branded server emits a strange warning.
It turns out that some of its variable MTRR entries are wrong.

Ignore those strange ranges and don't trim e820; just emit one warning about
the BIOS.

Reported-by: Ozan Çağlayan <ozan_at_pardus.org.tr>
Signed-off-by: Yinghai Lu <[email protected]>

---
arch/x86/kernel/cpu/mtrr/cleanup.c | 13 ++++++++++++-
arch/x86/kernel/cpu/mtrr/main.c | 16 ++++++++--------
arch/x86/kernel/cpu/mtrr/mtrr.h | 1 +
3 files changed, 21 insertions(+), 9 deletions(-)

Index: linux-2.6/arch/x86/kernel/cpu/mtrr/cleanup.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/mtrr/cleanup.c
+++ linux-2.6/arch/x86/kernel/cpu/mtrr/cleanup.c
@@ -186,9 +186,20 @@ x86_get_mtrr_mem_range(struct res_range
type != MTRR_TYPE_WRPROT)
continue;
size = range_state[i].size_pfn;
+ base = range_state[i].base_pfn;
+ if (base < (1<<(20-PAGE_SHIFT)) && mtrr_state.have_fixed &&
+ (mtrr_state.enabled & 1)) {
+ /* var MTRR contain UC below 1M ? skip it*/
+ printk(KERN_WARNING "WARNING: BIOS bug: VAR MTRR "
+ "contains strange UC entry under 1M, check "
+ "with your system vendor!\n");
+ if (base + size < (1<<(20-PAGE_SHIFT)))
+ continue;
+ size -= (1<<(20-PAGE_SHIFT)) - base;
+ base = 1<<(20-PAGE_SHIFT);
+ }
if (!size)
continue;
- base = range_state[i].base_pfn;
subtract_range(range, base, base + size - 1);
}
if (extra_remove_size)
Index: linux-2.6/arch/x86/kernel/cpu/mtrr/mtrr.h
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/mtrr/mtrr.h
+++ linux-2.6/arch/x86/kernel/cpu/mtrr/mtrr.h
@@ -79,6 +79,7 @@ extern struct mtrr_ops * mtrr_if;

extern unsigned int num_var_ranges;
extern u64 mtrr_tom2;
+extern struct mtrr_state_type mtrr_state;

void mtrr_state_warn(void);
const char *mtrr_attrib_to_str(int x);
Index: linux-2.6/arch/x86/kernel/cpu/mtrr/main.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/mtrr/main.c
+++ linux-2.6/arch/x86/kernel/cpu/mtrr/main.c
@@ -574,7 +574,7 @@ struct mtrr_value {
unsigned long lsize;
};

-static struct mtrr_value mtrr_state[MTRR_MAX_VAR_RANGES];
+static struct mtrr_value mtrr_value[MTRR_MAX_VAR_RANGES];

static int mtrr_save(struct sys_device * sysdev, pm_message_t state)
{
@@ -582,9 +582,9 @@ static int mtrr_save(struct sys_device *

for (i = 0; i < num_var_ranges; i++) {
mtrr_if->get(i,
- &mtrr_state[i].lbase,
- &mtrr_state[i].lsize,
- &mtrr_state[i].ltype);
+ &mtrr_value[i].lbase,
+ &mtrr_value[i].lsize,
+ &mtrr_value[i].ltype);
}
return 0;
}
@@ -594,11 +594,11 @@ static int mtrr_restore(struct sys_devic
int i;

for (i = 0; i < num_var_ranges; i++) {
- if (mtrr_state[i].lsize)
+ if (mtrr_value[i].lsize)
set_mtrr(i,
- mtrr_state[i].lbase,
- mtrr_state[i].lsize,
- mtrr_state[i].ltype);
+ mtrr_value[i].lbase,
+ mtrr_value[i].lsize,
+ mtrr_value[i].ltype);
}
return 0;
}

2009-03-16 22:44:44

by Ozan Çağlayan

Subject: Re: [BUG 2.6.29_rc8] BIOS Bug: CPU MTRRs don't cover all of memory, losing 0MB of RAM.

Yinghai Lu wrote:

> please check
>
> [PATCH] x86: workaround system with strange var MTRR
>

Thanks for your interest.

Oops is now replaced with a warning after applying the patch on top of tip/master.

BTW, does this kind of BIOS bug have a negative impact on the performance of the system?


I'm sending the head of dmesg. I also just noticed some MTRR-related messages
at the tail of the log buffer (with and without the patch), so I'm posting those as well:

--

MTRR default type: uncachable
MTRR fixed ranges enabled:
00000-9FFFF write-back
A0000-BFFFF uncachable
C0000-FFFFF write-protect
MTRR variable ranges enabled:
0 base 0000000000 mask 0000000000 write-back
1 base 00CFF00000 mask FFFFF00000 uncachable
2 base 00D0000000 mask FFF0000000 uncachable
3 base 00E0000000 mask FFE0000000 uncachable
4 base 0000004000 mask FFFFFFF000 uncachable
5 base 0000005000 mask FFFFFFF000 uncachable
6 base 0000006000 mask FFFFFFF000 uncachable
7 base 0000007000 mask FFFFFFF000 uncachable
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
get_mtrr: cpu0 reg00 base=0000000000 size=0010000000 write-back
get_mtrr: cpu0 reg01 base=00000cff00 size=0000000100 uncachable
get_mtrr: cpu0 reg02 base=00000d0000 size=0000010000 uncachable
get_mtrr: cpu0 reg03 base=00000e0000 size=0000020000 uncachable
get_mtrr: cpu0 reg04 base=0000000004 size=0000000001 uncachable
get_mtrr: cpu0 reg05 base=0000000005 size=0000000001 uncachable
get_mtrr: cpu0 reg06 base=0000000006 size=0000000001 uncachable
get_mtrr: cpu0 reg07 base=0000000007 size=0000000001 uncachable
WARNING: BIOS bug: VAR MTRR contains strange UC entry under 1M, check with your system vendor!
WARNING: BIOS bug: VAR MTRR contains strange UC entry under 1M, check with your system vendor!
WARNING: BIOS bug: VAR MTRR contains strange UC entry under 1M, check with your system vendor!
WARNING: BIOS bug: VAR MTRR contains strange UC entry under 1M, check with your system vendor!
e820 update range: 00000000cff00000 - 0000000100000000 (usable) ==> (reserved)
init_memory_mapping: 0000000000000000-00000000379fe000
0000000000 - 0000200000 page 4k
0000200000 - 0037800000 page 2M
0037800000 - 00379fe000 page 4k
...
...
...
...
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
ADDRCONF(NETDEV_UP): eth1: link is not ready
get_mtrr: cpu0 reg00 base=0000000000 size=0010000000 write-back
get_mtrr: cpu0 reg01 base=00000cff00 size=0000000100 uncachable
get_mtrr: cpu0 reg02 base=00000d0000 size=0000010000 uncachable
get_mtrr: cpu0 reg03 base=00000e0000 size=0000020000 uncachable
get_mtrr: cpu0 reg04 base=0000000004 size=0000000001 uncachable
get_mtrr: cpu0 reg05 base=0000000005 size=0000000001 uncachable
get_mtrr: cpu0 reg06 base=0000000006 size=0000000001 uncachable
get_mtrr: cpu0 reg07 base=0000000007 size=0000000001 uncachable
get_mtrr: cpu0 reg00 base=0000000000 size=0010000000 write-back
get_mtrr: cpu0 reg01 base=00000cff00 size=0000000100 uncachable
get_mtrr: cpu0 reg02 base=00000d0000 size=0000010000 uncachable
get_mtrr: cpu0 reg03 base=00000e0000 size=0000020000 uncachable
get_mtrr: cpu0 reg04 base=0000000004 size=0000000001 uncachable
get_mtrr: cpu0 reg05 base=0000000005 size=0000000001 uncachable
get_mtrr: cpu0 reg06 base=0000000006 size=0000000001 uncachable
get_mtrr: cpu0 reg07 base=0000000007 size=0000000001 uncachable
bnx2: eth1 NIC Copper Link is Up, 1000 Mbps full duplex, receive & transmit flow control ON
ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
get_mtrr: cpu0 reg00 base=0000000000 size=0010000000 write-back
get_mtrr: cpu0 reg01 base=00000cff00 size=0000000100 uncachable
get_mtrr: cpu0 reg02 base=00000d0000 size=0000010000 uncachable
get_mtrr: cpu0 reg03 base=00000e0000 size=0000020000 uncachable
get_mtrr: cpu0 reg04 base=0000000004 size=0000000001 uncachable
get_mtrr: cpu0 reg05 base=0000000005 size=0000000001 uncachable
get_mtrr: cpu0 reg06 base=0000000006 size=0000000001 uncachable
get_mtrr: cpu0 reg07 base=0000000007 size=0000000001 uncachable
get_mtrr: cpu0 reg00 base=0000000000 size=0010000000 write-back
get_mtrr: cpu0 reg01 base=00000cff00 size=0000000100 uncachable
get_mtrr: cpu0 reg02 base=00000d0000 size=0000010000 uncachable
get_mtrr: cpu0 reg03 base=00000e0000 size=0000020000 uncachable
get_mtrr: cpu0 reg04 base=0000000004 size=0000000001 uncachable
get_mtrr: cpu0 reg05 base=0000000005 size=0000000001 uncachable
get_mtrr: cpu0 reg06 base=0000000006 size=0000000001 uncachable
get_mtrr: cpu0 reg07 base=0000000007 size=0000000001 uncachable
get_mtrr: cpu0 reg00 base=0000000000 size=0010000000 write-back
get_mtrr: cpu0 reg01 base=00000cff00 size=0000000100 uncachable
get_mtrr: cpu0 reg02 base=00000d0000 size=0000010000 uncachable
get_mtrr: cpu0 reg03 base=00000e0000 size=0000020000 uncachable
get_mtrr: cpu0 reg04 base=0000000004 size=0000000001 uncachable
get_mtrr: cpu0 reg05 base=0000000005 size=0000000001 uncachable
get_mtrr: cpu0 reg06 base=0000000006 size=0000000001 uncachable
get_mtrr: cpu0 reg07 base=0000000007 size=0000000001 uncachable
get_mtrr: cpu0 reg00 base=0000000000 size=0010000000 write-back
mtrr: type mismatch for d8000000,4000000 old: write-back new: write-combining
get_mtrr: cpu0 reg00 base=0000000000 size=0010000000 write-back
get_mtrr: cpu0 reg01 base=00000cff00 size=0000000100 uncachable
get_mtrr: cpu0 reg02 base=00000d0000 size=0000010000 uncachable
get_mtrr: cpu0 reg03 base=00000e0000 size=0000020000 uncachable
get_mtrr: cpu0 reg04 base=0000000004 size=0000000001 uncachable
get_mtrr: cpu0 reg05 base=0000000005 size=0000000001 uncachable
get_mtrr: cpu0 reg06 base=0000000006 size=0000000001 uncachable
get_mtrr: cpu0 reg07 base=0000000007 size=0000000001 uncachable

--

Ozan Çağlayan
<ozan_at_pardus.org.tr>

2009-03-16 22:54:19

by Yinghai Lu

Subject: Re: [BUG 2.6.29_rc8] BIOS Bug: CPU MTRRs don't cover all of memory, losing 0MB of RAM.

Ozan Çağlayan wrote:
> Yinghai Lu wrote:
>
>> please check
>>
>> [PATCH] x86: workaround system with strange var MTRR
>>
>
> Thanks for your interest.
>
> Oops is now replaced with a warning after applying the patch on top of tip/master.
that is intended...
>
> BTW, does this kind of BIOS bug have a negative impact on the performance of the system?

It should not: fixed MTRRs override variable MTRRs below 1M when the fixed MTRRs are enabled.
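
As a rough illustration of that precedence rule (a user-space sketch, not kernel code): below 1M, an enabled fixed-range MTRR decides the memory type, so the stray variable UC entries at 0x4000-0x7fff should not take effect. The fixed-range layout is copied from the boot log in this thread; the enum, macro and function names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MTRR memory type encodings as defined by the x86 architecture. */
enum mem_type { MT_UC = 0, MT_WC = 1, MT_WT = 4, MT_WP = 5, MT_WB = 6 };

#define ONE_MB UINT64_C(0x100000)

/* Fixed-range layout taken from the log: 00000-9FFFF WB, A0000-BFFFF UC,
 * C0000-FFFFF WP. Only valid for addresses below 1M. */
static enum mem_type fixed_type(uint64_t addr)
{
    assert(addr < ONE_MB);
    if (addr < 0xA0000)
        return MT_WB;
    if (addr < 0xC0000)
        return MT_UC;
    return MT_WP;
}

/* var_type is whatever the variable ranges alone would yield for addr
 * (hypothetical input; the variable-range lookup itself is not modelled). */
static enum mem_type effective_type(uint64_t addr, bool fixed_enabled,
                                    enum mem_type var_type)
{
    if (fixed_enabled && addr < ONE_MB)
        return fixed_type(addr);
    return var_type;
}
```

With fixed ranges enabled, effective_type(0x4000, true, MT_UC) yields write-back even though variable register 4 marks that page uncachable.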

>
>
> I'm sending the head of dmesg. I also just noticed some MTRR-related messages
> at the tail of the log buffer (with and without the patch), so I'm posting those as well:
>
> --
>
> MTRR default type: uncachable
> MTRR fixed ranges enabled:
> 00000-9FFFF write-back
> A0000-BFFFF uncachable
> C0000-FFFFF write-protect
> MTRR variable ranges enabled:
> 0 base 0000000000 mask 0000000000 write-back
> 1 base 00CFF00000 mask FFFFF00000 uncachable
> 2 base 00D0000000 mask FFF0000000 uncachable
> 3 base 00E0000000 mask FFE0000000 uncachable
> 4 base 0000004000 mask FFFFFFF000 uncachable
> 5 base 0000005000 mask FFFFFFF000 uncachable
> 6 base 0000006000 mask FFFFFFF000 uncachable
> 7 base 0000007000 mask FFFFFFF000 uncachable
> x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
> get_mtrr: cpu0 reg00 base=0000000000 size=0010000000 write-back
> get_mtrr: cpu0 reg01 base=00000cff00 size=0000000100 uncachable
> get_mtrr: cpu0 reg02 base=00000d0000 size=0000010000 uncachable
> get_mtrr: cpu0 reg03 base=00000e0000 size=0000020000 uncachable
> get_mtrr: cpu0 reg04 base=0000000004 size=0000000001 uncachable
> get_mtrr: cpu0 reg05 base=0000000005 size=0000000001 uncachable
> get_mtrr: cpu0 reg06 base=0000000006 size=0000000001 uncachable
> get_mtrr: cpu0 reg07 base=0000000007 size=0000000001 uncachable
> WARNING: BIOS bug: VAR MTRR contains strange UC entry under 1M, check with your system vendor!
> WARNING: BIOS bug: VAR MTRR contains strange UC entry under 1M, check with your system vendor!
> WARNING: BIOS bug: VAR MTRR contains strange UC entry under 1M, check with your system vendor!
> WARNING: BIOS bug: VAR MTRR contains strange UC entry under 1M, check with your system vendor!

you got four wrong entries
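
To see where the four warnings come from, the variable ranges in the log can be decoded by hand. The sketch below is only an illustration: it assumes a 40-bit physical address width (inferred from the 10-hex-digit masks in the log) and contiguous masks, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PHYS_ADDR_BITS 40   /* assumption: matches the masks printed in the log */

/* Size in bytes covered by a variable MTRR with the given (contiguous) mask. */
static uint64_t var_mtrr_size_bytes(uint64_t mask)
{
    uint64_t addr_mask = (UINT64_C(1) << PHYS_ADDR_BITS) - 1;

    assert((mask & ~addr_mask) == 0);
    return (~mask & addr_mask) + 1;
}

/* True when the range starts below the 1M boundary handled by fixed MTRRs. */
static bool var_mtrr_starts_below_1m(uint64_t base)
{
    return base < (UINT64_C(1) << 20);
}
```

For registers 4 to 7 (bases 0x4000, 0x5000, 0x6000, 0x7000, mask FFFFFFF000) this gives 0x1000 bytes each, i.e. one 4K page, which agrees with size=0000000001 in the get_mtrr lines; all four start below 1M.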

> e820 update range: 00000000cff00000 - 0000000100000000 (usable) ==> (reserved)
> init_memory_mapping: 0000000000000000-00000000379fe000
> 0000000000 - 0000200000 page 4k
> 0000200000 - 0037800000 page 2M
> 0037800000 - 00379fe000 page 4k
> ...
> ...
> ...
> ...
> NET: Registered protocol family 10
> lo: Disabled Privacy Extensions
> ADDRCONF(NETDEV_UP): eth1: link is not ready
> get_mtrr: cpu0 reg00 base=0000000000 size=0010000000 write-back
> get_mtrr: cpu0 reg01 base=00000cff00 size=0000000100 uncachable
> get_mtrr: cpu0 reg02 base=00000d0000 size=0000010000 uncachable
> get_mtrr: cpu0 reg03 base=00000e0000 size=0000020000 uncachable
> get_mtrr: cpu0 reg04 base=0000000004 size=0000000001 uncachable
> get_mtrr: cpu0 reg05 base=0000000005 size=0000000001 uncachable
> get_mtrr: cpu0 reg06 base=0000000006 size=0000000001 uncachable
> get_mtrr: cpu0 reg07 base=0000000007 size=0000000001 uncachable
> get_mtrr: cpu0 reg00 base=0000000000 size=0010000000 write-back
> get_mtrr: cpu0 reg01 base=00000cff00 size=0000000100 uncachable
> get_mtrr: cpu0 reg02 base=00000d0000 size=0000010000 uncachable
> get_mtrr: cpu0 reg03 base=00000e0000 size=0000020000 uncachable
> get_mtrr: cpu0 reg04 base=0000000004 size=0000000001 uncachable
> get_mtrr: cpu0 reg05 base=0000000005 size=0000000001 uncachable
> get_mtrr: cpu0 reg06 base=0000000006 size=0000000001 uncachable
> get_mtrr: cpu0 reg07 base=0000000007 size=0000000001 uncachable
> bnx2: eth1 NIC Copper Link is Up, 1000 Mbps full duplex, receive & transmit flow control ON
> ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
> get_mtrr: cpu0 reg00 base=0000000000 size=0010000000 write-back
> get_mtrr: cpu0 reg01 base=00000cff00 size=0000000100 uncachable
> get_mtrr: cpu0 reg02 base=00000d0000 size=0000010000 uncachable
> get_mtrr: cpu0 reg03 base=00000e0000 size=0000020000 uncachable
> get_mtrr: cpu0 reg04 base=0000000004 size=0000000001 uncachable
> get_mtrr: cpu0 reg05 base=0000000005 size=0000000001 uncachable
> get_mtrr: cpu0 reg06 base=0000000006 size=0000000001 uncachable
> get_mtrr: cpu0 reg07 base=0000000007 size=0000000001 uncachable
> get_mtrr: cpu0 reg00 base=0000000000 size=0010000000 write-back
> get_mtrr: cpu0 reg01 base=00000cff00 size=0000000100 uncachable
> get_mtrr: cpu0 reg02 base=00000d0000 size=0000010000 uncachable
> get_mtrr: cpu0 reg03 base=00000e0000 size=0000020000 uncachable
> get_mtrr: cpu0 reg04 base=0000000004 size=0000000001 uncachable
> get_mtrr: cpu0 reg05 base=0000000005 size=0000000001 uncachable
> get_mtrr: cpu0 reg06 base=0000000006 size=0000000001 uncachable
> get_mtrr: cpu0 reg07 base=0000000007 size=0000000001 uncachable
> get_mtrr: cpu0 reg00 base=0000000000 size=0010000000 write-back
> get_mtrr: cpu0 reg01 base=00000cff00 size=0000000100 uncachable
> get_mtrr: cpu0 reg02 base=00000d0000 size=0000010000 uncachable
> get_mtrr: cpu0 reg03 base=00000e0000 size=0000020000 uncachable
> get_mtrr: cpu0 reg04 base=0000000004 size=0000000001 uncachable
> get_mtrr: cpu0 reg05 base=0000000005 size=0000000001 uncachable
> get_mtrr: cpu0 reg06 base=0000000006 size=0000000001 uncachable
> get_mtrr: cpu0 reg07 base=0000000007 size=0000000001 uncachable
> get_mtrr: cpu0 reg00 base=0000000000 size=0010000000 write-back
> mtrr: type mismatch for d8000000,4000000 old: write-back new: write-combining
> get_mtrr: cpu0 reg00 base=0000000000 size=0010000000 write-back
> get_mtrr: cpu0 reg01 base=00000cff00 size=0000000100 uncachable
> get_mtrr: cpu0 reg02 base=00000d0000 size=0000010000 uncachable
> get_mtrr: cpu0 reg03 base=00000e0000 size=0000020000 uncachable
> get_mtrr: cpu0 reg04 base=0000000004 size=0000000001 uncachable
> get_mtrr: cpu0 reg05 base=0000000005 size=0000000001 uncachable
> get_mtrr: cpu0 reg06 base=0000000006 size=0000000001 uncachable
> get_mtrr: cpu0 reg07 base=0000000007 size=0000000001 uncachable
>

you may try MTRR cleanup
CONFIG_MTRR=y
CONFIG_MTRR_SANITIZER=y
CONFIG_MTRR_SANITIZER_ENABLE_DEFAULT=1
CONFIG_MTRR_SANITIZER_SPARE_REG_NR_DEFAULT=1

It should clear the strange entries and find a spare one for your driver that needs ...
That should give some performance improvement.

or talk to your system vendor to get a new BIOS.

YH

2009-03-16 23:36:26

by Yinghai Lu

Subject: [PATCH] x86: workaround system with strange var MTRR -v2


Impact: don't trim e820 according to wrong mtrr

Ozan reported that his branded server emits a strange warning.
It turns out that some of its variable MTRR entries are wrong.

Ignore those strange ranges and don't trim e820; just emit one warning about
the BIOS.

Reported-by: Ozan Çağlayan <[email protected]>
Signed-off-by: Yinghai Lu <[email protected]>

---
arch/x86/kernel/cpu/mtrr/cleanup.c | 11 +++++++++++
arch/x86/kernel/cpu/mtrr/main.c | 16 ++++++++--------
arch/x86/kernel/cpu/mtrr/mtrr.h | 1 +
3 files changed, 20 insertions(+), 8 deletions(-)

Index: linux-2.6/arch/x86/kernel/cpu/mtrr/cleanup.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/mtrr/cleanup.c
+++ linux-2.6/arch/x86/kernel/cpu/mtrr/cleanup.c
@@ -189,6 +189,17 @@ x86_get_mtrr_mem_range(struct res_range
if (!size)
continue;
base = range_state[i].base_pfn;
+ if (base < (1<<(20-PAGE_SHIFT)) && mtrr_state.have_fixed &&
+ (mtrr_state.enabled & 1)) {
+ /* var MTRR contain UC below 1M ? skip it*/
+ printk(KERN_WARNING "WARNING: BIOS bug: VAR MTRR "
+ "contains strange UC entry under 1M, check "
+ "with your system vendor!\n");
+ if (base + size <= (1<<(20-PAGE_SHIFT)))
+ continue;
+ size -= (1<<(20-PAGE_SHIFT)) - base;
+ base = 1<<(20-PAGE_SHIFT);
+ }
subtract_range(range, base, base + size - 1);
}
if (extra_remove_size)
Index: linux-2.6/arch/x86/kernel/cpu/mtrr/mtrr.h
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/mtrr/mtrr.h
+++ linux-2.6/arch/x86/kernel/cpu/mtrr/mtrr.h
@@ -79,6 +79,7 @@ extern struct mtrr_ops * mtrr_if;

extern unsigned int num_var_ranges;
extern u64 mtrr_tom2;
+extern struct mtrr_state_type mtrr_state;

void mtrr_state_warn(void);
const char *mtrr_attrib_to_str(int x);
Index: linux-2.6/arch/x86/kernel/cpu/mtrr/main.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/mtrr/main.c
+++ linux-2.6/arch/x86/kernel/cpu/mtrr/main.c
@@ -574,7 +574,7 @@ struct mtrr_value {
unsigned long lsize;
};

-static struct mtrr_value mtrr_state[MTRR_MAX_VAR_RANGES];
+static struct mtrr_value mtrr_value[MTRR_MAX_VAR_RANGES];

static int mtrr_save(struct sys_device * sysdev, pm_message_t state)
{
@@ -582,9 +582,9 @@ static int mtrr_save(struct sys_device *

for (i = 0; i < num_var_ranges; i++) {
mtrr_if->get(i,
- &mtrr_state[i].lbase,
- &mtrr_state[i].lsize,
- &mtrr_state[i].ltype);
+ &mtrr_value[i].lbase,
+ &mtrr_value[i].lsize,
+ &mtrr_value[i].ltype);
}
return 0;
}
@@ -594,11 +594,11 @@ static int mtrr_restore(struct sys_devic
int i;

for (i = 0; i < num_var_ranges; i++) {
- if (mtrr_state[i].lsize)
+ if (mtrr_value[i].lsize)
set_mtrr(i,
- mtrr_state[i].lbase,
- mtrr_state[i].lsize,
- mtrr_state[i].ltype);
+ mtrr_value[i].lbase,
+ mtrr_value[i].lsize,
+ mtrr_value[i].ltype);
}
return 0;
}

2009-03-17 09:45:00

by Ingo Molnar

Subject: Re: [BUG 2.6.29_rc8] BIOS Bug: CPU MTRRs don't cover all of memory, losing 0MB of RAM.


* Yinghai Lu <[email protected]> wrote:

> > x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
> > get_mtrr: cpu0 reg00 base=0000000000 size=0010000000 write-back
> > get_mtrr: cpu0 reg01 base=00000cff00 size=0000000100 uncachable
> > get_mtrr: cpu0 reg02 base=00000d0000 size=0000010000 uncachable
> > get_mtrr: cpu0 reg03 base=00000e0000 size=0000020000 uncachable
> > get_mtrr: cpu0 reg04 base=0000000004 size=0000000001 uncachable
> > get_mtrr: cpu0 reg05 base=0000000005 size=0000000001 uncachable
> > get_mtrr: cpu0 reg06 base=0000000006 size=0000000001 uncachable
> > get_mtrr: cpu0 reg07 base=0000000007 size=0000000001 uncachable
> > WARNING: BIOS bug: VAR MTRR contains strange UC entry under 1M, check with your system vendor!
> > WARNING: BIOS bug: VAR MTRR contains strange UC entry under 1M, check with your system vendor!
> > WARNING: BIOS bug: VAR MTRR contains strange UC entry under 1M, check with your system vendor!
> > WARNING: BIOS bug: VAR MTRR contains strange UC entry under 1M, check with your system vendor!
>
> you got four wrong entries

but it's unintuitive. That warning should specify which entry is
wrong, so that we don't get these pointless duplicated lines.
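
One way to do that, as a hypothetical sketch only (this is not taken from any posted patch; the function name and the exact message wording are assumptions), is to include the register index and the range in the message:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a warning that names the offending variable MTRR.
 * Returns the snprintf() result: the length the full message needs,
 * or a negative value on error.
 */
static int format_mtrr_warning(char *buf, size_t len, unsigned int reg,
                               uint64_t base_pfn, uint64_t size_pfn)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
                    "WARNING: BIOS bug: VAR MTRR %u contains strange UC entry "
                    "under 1M (base_pfn=0x%llx, size_pfn=0x%llx), check with "
                    "your system vendor!\n",
                    reg,
                    (unsigned long long)base_pfn,
                    (unsigned long long)size_pfn);
}
```

With the register number in the text, the four lines from this boot would differ from each other (registers 4 to 7) instead of being exact duplicates.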

Ingo