2001-03-22 10:18:09

by Cabaniols, Sebastien

Subject: AlphaServer with 4 GB RAM, kernel 2.2.19pre17aa1 patched with big mem... locks for 4 Gbytes, works for 2,6,8 Gbytes

Hello lkml,

I have two AlphaServer ES40 machines:

machine 1 with 4 Gbytes of RAM
machine 2 with 8 Gbytes of RAM

Both machines work perfectly under Tru64, so the RAM is OK on both of
them.

1) I recompiled kernel 2.2.19pre17aa1

==> The two machines boot well, but are limited to 2 Gbytes of RAM.

2) So I applied the patch 2.2.19pre17aa1.b, configured with BIGMEM enabled,
recompiled and rebooted.

The two machines boot and see 4 Gbytes and 8 Gbytes of RAM respectively. A
little C program that allocates the memory, writes to it and reads it back
shows that it is really working.
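
The test program itself was not posted. A minimal sketch of that kind of
check, assuming a simple address-dependent pattern (the function name
mem_check and the pattern constant are made up for illustration), could look
like this:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not the program used in the post.
 * Allocates `words` 64-bit words, writes a pattern derived from each
 * index, then reads everything back.
 * Returns 0 if all words match, 1 on the first mismatch,
 * and -1 if the buffer cannot be allocated. */
int mem_check(size_t words)
{
    if (words == 0)
        return 0;                       /* nothing to test */
    if (words > SIZE_MAX / sizeof(uint64_t))
        return -1;                      /* byte count would overflow */

    uint64_t *buf = malloc(words * sizeof *buf);
    if (buf == NULL)
        return -1;

    /* volatile keeps the compiler from optimising the accesses away */
    volatile uint64_t *p = buf;
    const uint64_t pattern = 0xA5A5A5A5A5A5A5A5ULL;

    for (size_t i = 0; i < words; i++)  /* write pass */
        p[i] = (uint64_t)i ^ pattern;

    int rc = 0;
    for (size_t i = 0; i < words; i++) {    /* read-back pass */
        if (p[i] != ((uint64_t)i ^ pattern)) {
            rc = 1;
            break;
        }
    }

    free(buf);
    return rc;
}
```

On a 64-bit machine, mem_check((size_t)3 << 27) touches 3 Gbytes, which is
enough to go past the 2 Gbytes limit seen without the patch. Note that a
successful malloc() alone proves little; it is the write pass that forces
the pages to be backed by real memory.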

THE PROBLEM:
-------------------------

On the machine with 4 Gbytes the system freezes when running /bin/lspci
or more /proc/pci (the two are equivalent: both do an fopen() of
/proc/pci), but not on the machine with 8 Gbytes!
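
To take lspci and more out of the picture, the same access can be reproduced
with a few lines of C. This is only a sketch, assuming (as stated above) that
an fopen() followed by a plain read of /proc/pci is what triggers the freeze;
the helper name dump_file is made up for illustration:

```c
#include <stdio.h>

/* Hypothetical helper: open `path` with fopen(), read it to the end and
 * copy the data to `out` (pass NULL to discard it).
 * Returns the number of bytes read, or -1 if the file cannot be opened
 * or a read error occurs. */
long dump_file(const char *path, FILE *out)
{
    FILE *in = fopen(path, "r");
    if (in == NULL)
        return -1;

    char buf[4096];
    long total = 0;
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (out != NULL)
            fwrite(buf, 1, n, out);
        total += (long)n;
    }

    int failed = ferror(in);    /* distinguish read error from EOF */
    fclose(in);
    return failed ? -1 : total;
}
```

Calling dump_file("/proc/pci", stdout) from a small main() would then be the
minimal reproducer; if the machine hangs inside that call, the problem is in
the kernel code behind /proc/pci and not in the user-space tools.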

Remark: when I say it freezes, I don't see any kernel panic message on the
console, and when I was running an SMP kernel in multiuser mode I kept
receiving spinlock messages every 2 minutes:

sched.c:30 spinlock stuck in httpd at fffffc000032d524(3) owner
lspci at fffffc000035155c(1) read_write.c:43
sched.c:30 spinlock stuck in identd at fffffc000032d524(2) owner
lspci at fffffc000035155c(1) read_write.c:43
sched.c:30 spinlock stuck in identd at fffffc000032d524(0) owner
lspci at fffffc000035155c(1) read_write.c:43

For what follows I preferred to go back to a uniprocessor kernel in
single-user mode, because the system is simpler and the problems are the
same:

3) If I reboot the 4 Gbytes machine with mem=2048M at boot time,
everything works fine again (except that 2 Gbytes of RAM are unavailable).

4) If I reboot the 8 Gbytes machine with mem=4096M at boot time,
I get the same behaviour as on the 4 Gbytes machine: it freezes when doing
more /proc/pci.

5) If I reboot the 8 Gbytes machine with mem=2048M or mem=6144M,
everything works perfectly.


SMP/UP:
--------------

I HAVE EXACTLY THE SAME RESULTS with the SMP and UP kernels, so I guess
investigating on the uniprocessor kernel is simpler.


I have tried to go through the source code of the patch, but I am new to
kernel programming. It looks like there is a switching mechanism around the
4 Gbytes value for the RAM size... that could be it. I am still
investigating.


Any ideas?


------------------------------------------------------------------------------
Sebastien CABANIOLS
COMPAQ France
HPTC Engineer
CustomSystems & Solutions Annecy
High Performance Technical Computing
Office No. +33 (0)4 50 09 44 10
Fax No. +33 (0)4 50 64 01 39
Email. [email protected]
------------------------------------------------------------------------------



2001-03-22 15:28:00

by Cabaniols, Sebastien

Subject: RE: AlphaServer with 4 GB RAM, kernel 2.2.19pre17aa1 patched with big mem... locks for 4 Gbytes, works for 2,6,8 Gbytes

It seems that having a Myrinet 2k board plugged into any slot of the second
PCI bus of the ES40 makes the system freeze with 4 Gbytes of RAM when doing
cat /proc/pci.

I just plugged the board back into one slot of PCI 0 and it works again.

Why does it work with Tru64 and not with Linux? I don't know.

Thanks to Andrea Arcangeli for helping me understand that the problem had
something to do with the second PCI bus.

