2004-10-26 17:42:55

by Lincoln D. Durey

Subject: Sony S170 + 1GB ram => Yenta: ISA IRQ mask 0x0000

Some of you will recall the woeful tale of the IBM T40 with 2GB RAM, and a
not happy pcmcia_cs. http://lkml.org/lkml/2003/8/14/89
This was resolved with a new BIOS from IBM.

However, along the way Linus replied with: http://lkml.org/lkml/2003/8/14/94
and then David Hinds said this:

> I'd bet that your BIOS is mis-configuring the CardBus bridge because
> it can't handle >1GB of RAM. Check 'lspci -v' and see what memory
> addresses the CardBus bridges are using. I bet they are < 0x80000000.

> In theory the kernel could recognize this situation and remap PCI
> devices to sane addresses. That's a problem with the PCI subsystem
> and you'd need to raise that on the linux-kernel mailing list.

So, now we have a new Sony S170 (spiffy ultra-portable laptop) with a
failure to recognize cards when it has 1GB ram installed. And I'm
wondering if anyone wants to tackle having the kernel PCI system remap this
pcmcia socket's memory so it can see cards ?

booting with 1GB ram:

kernel: Linux Kernel Card Services 3.1.22
kernel: options: [pci] [cardbus] [pm]
kernel: Yenta IRQ list 0000, PCI irq9
kernel: Socket status: 00000000

revert to 512MB or 768MB ram, and you get a happy PCMCIA slot:

kernel: Linux Kernel Card Services 3.1.22
kernel: options: [pci] [cardbus] [pm]
kernel: Yenta IRQ list 0cf8, PCI irq9
kernel: Socket status: 30000410
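
For readers comparing the two "Yenta IRQ list" values: a minimal sketch that decodes them, assuming the usual convention that bit n of the mask stands for ISA IRQ n. The function name is made up for illustration.

```python
def irqs_from_mask(mask):
    """Return the ISA IRQ numbers whose bits are set in a 16-bit mask."""
    return [irq for irq in range(16) if mask & (1 << irq)]


# Masks copied from the two boot logs above.
print("1GB RAM:        ", irqs_from_mask(0x0000))
print("512MB/768MB RAM:", irqs_from_mask(0x0cf8))
```

Under that assumption, the 1GB boot finds no usable ISA interrupts at all, while the smaller configurations find several.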

This is completely independent of the kernel version (2.4.24 and 2.6.8 have
the same problem), as it's really a BIOS problem, I think. CardBus and PCMCIA
cards are affected equally.

lspci: Sony S170
02:04.0 CardBus bridge: Texas Instruments: Unknown device ac8e

Full logs are available: (dmesg, lspci -xxx, iomem, ioports)
http://www.emperorlinux.com/research/lkml/S170-1GB-pcmcia

-- Lincoln @ EmperorLinux http://www.EmperorLinux.com


2004-10-26 18:28:39

by Linus Torvalds

Subject: Re: Sony S170 + 1GB ram => Yenta: ISA IRQ mask 0x0000



On Tue, 26 Oct 2004, Lincoln D. Durey wrote:
>
> So, now we have a new Sony S170 (spiffy ultra-portable laptop) with a
> failure to recognize cards when it has 1GB ram installed. And I'm
> wondering if anyone wants to tackle having the kernel PCI system remap this
> pcmcia socket's memory so it can see cards ?
>
> booting with 1GB ram:
>
> kernel: Linux Kernel Card Services 3.1.22
> kernel: options: [pci] [cardbus] [pm]
> kernel: Yenta IRQ list 0000, PCI irq9
> kernel: Socket status: 00000000
>
> revert to 512MB or 768MB ram, and you get a happy PCMCIA slot:

Judging by your /proc/iomem, which looks like this:

00100000-3ff6ffff : System RAM
00100000-002ac223 : Kernel code
002ac224-003ab5ff : Kernel data
3ff70000-3ff703ff : 0000:00:1f.1 [ ed: IDE controller ]
3ff71000-3ff71fff : 0000:02:04.0 [ ed: cardbus ]
3ff71000-3ff71fff : yenta_socket

the 1GB case should have problems with the IDE controller too, if it has
problems with yenta_socket.

But maybe you only use PIO mode, in which case I guess it doesn't matter
if the MMIO regions are scrogged.

Anyway, the problem seems to be that you are doing something bad with the
user-defined RAM map, for some reason that is not obvious at all. Your
bootup clearly shows:

BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000d8000 - 00000000000e0000 (reserved)
BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000003ff70000 (usable)
BIOS-e820: 000000003ff70000 - 000000003ff7c000 (ACPI data)
BIOS-e820: 000000003ff7c000 - 000000003ff80000 (ACPI NVS)
BIOS-e820: 000000003ff80000 - 0000000040000000 (reserved)
BIOS-e820: 00000000ff800000 - 00000000ffc00000 (reserved)
BIOS-e820: 00000000fffff000 - 0000000100000000 (reserved)

which means that the BIOS marks the 0x000000003ff70000 - 0000000040000000
region properly reserved, but you have overridden this (incorrectly) with:

user-defined physical RAM map:
user: 0000000000000000 - 000000000009f800 (usable)
user: 000000000009f800 - 00000000000a0000 (reserved)
user: 00000000000d8000 - 00000000000e0000 (reserved)
user: 00000000000e4000 - 0000000000100000 (reserved)
user: 0000000000100000 - 000000003ff70000 (usable)

where your manual user-defined RAM map is crap, because it doesn't mark
the ACPI region as being reserved.

As a result, the kernel doesn't know that there is something there, and
will allocate the Cardbus controller (and IDE MMIO) range in that unusable
range.
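
The overlap described above can be checked mechanically. What follows is a rough sketch, not kernel code: it takes the e820 entries and the two /proc/iomem windows quoted in this mail and reports every device window that lands in a region the BIOS marked non-usable but the user-defined map never mentions. All function and variable names are invented for the example.

```python
# (start, end_exclusive, type), copied from the BIOS-e820 lines above.
BIOS_E820 = [
    (0x00000000, 0x0009f800, "usable"),
    (0x0009f800, 0x000a0000, "reserved"),
    (0x000d8000, 0x000e0000, "reserved"),
    (0x000e4000, 0x00100000, "reserved"),
    (0x00100000, 0x3ff70000, "usable"),
    (0x3ff70000, 0x3ff7c000, "ACPI data"),
    (0x3ff7c000, 0x3ff80000, "ACPI NVS"),
    (0x3ff80000, 0x40000000, "reserved"),
    (0xff800000, 0xffc00000, "reserved"),
    (0xfffff000, 0x100000000, "reserved"),
]

# The user-defined map in the same log keeps only the first five entries.
USER_MAP = BIOS_E820[:5]

# Device windows from /proc/iomem (end address inclusive).
DEVICE_WINDOWS = {
    "0000:00:1f.1 (IDE)": (0x3ff70000, 0x3ff703ff),
    "0000:02:04.0 (CardBus)": (0x3ff71000, 0x3ff71fff),
}


def overlapping(entries, start, end_inclusive):
    """Return the map entries that overlap [start, end_inclusive]."""
    return [e for e in entries if e[0] <= end_inclusive and start < e[1]]


def hidden_conflicts(bios, user, start, end_inclusive):
    """Non-usable BIOS regions under the window that the user map omits."""
    conflicts = []
    for region in overlapping(bios, start, end_inclusive):
        if region[2] == "usable":
            continue
        if not overlapping(user, region[0], region[1] - 1):
            conflicts.append(region)
    return conflicts


for name, (start, end) in DEVICE_WINDOWS.items():
    for lo, hi, kind in hidden_conflicts(BIOS_E820, USER_MAP, start, end):
        print(f"{name}: {start:#x}-{end:#x} sits in {kind} {lo:#x}-{hi:#x}")
```

Passing the BIOS map as the user map as well yields no conflicts, which is the situation without any override.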

So the question is:
- why have you done any user override at all
- and having done so, why aren't the ACPI regions there, marked reserved?

It looks like the BIOS is doing everything right, and the problem is
entirely with the user-defined values..

Linus

2004-10-26 21:59:12

by David Hinds

Subject: Re: Sony S170 + 1GB ram => Yenta: ISA IRQ mask 0x0000

On Tue, Oct 26, 2004 at 11:28:25AM -0700, Linus Torvalds wrote:
>
> So the question is:
> - why have you done any user override at all
> - and having done so, why aren't the ACPI regions there, marked reserved?
>
> It looks like the BIOS is doing everything right, and the problem is
> entirely with the user-defined values..

Maybe "grub" is mucking things up. Though, the grub manual claims
that it should default to not passing a --mem= option to kernels later
than 2.4.18.

http://www.gnu.org/software/grub/manual/html_node/kernel.html

-- Dave

2004-10-26 23:18:46

by Lincoln D. Durey

Subject: Re: Sony S170 + 1GB ram => Yenta: ISA IRQ mask 0x0000

Linus,

> Anyway, the problem seems to be that you are doing something bad with the
> user-defined RAM map, for some reason that is not obvious at all. Your
> bootup clearly shows:

> which means that the BIOS marks the 0x000000003ff70000 - 0000000040000000
> region properly reserved, but you have overridden this (incorrectly)
> with:

OK, we don't do anything explicit to set the RAM map. So we looked at
setup.c to see where that might get triggered, and it gets turned on by
"mem=". But we don't use mem=... (meanwhile someone runs cat
/proc/cmdline...)

Where did that mem=1048000K come from ? (not me)
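
For anyone checking their own /proc/cmdline, here is a small sketch that pulls a plain mem=<size> option out of a command line and converts it to bytes. It is a simplification: it handles only decimal sizes with an optional K, M or G suffix, not every form the kernel accepts, and the root= argument in the example is hypothetical.

```python
import re

# Multipliers for the size suffixes handled by this sketch.
_SUFFIX = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def mem_limit_bytes(cmdline):
    """Return the byte value of the last plain mem=<size> option, or None."""
    limit = None
    for arg in cmdline.split():
        m = re.fullmatch(r"mem=(\d+)([KMG]?)", arg, flags=re.IGNORECASE)
        if m:
            limit = int(m.group(1)) * _SUFFIX[m.group(2).upper()]
    return limit


# Only mem=1048000K comes from this thread; the rest is made up.
limit = mem_limit_bytes("ro root=/dev/hda1 mem=1048000K")
print(hex(limit))
```

1048000K works out to 0x3ff70000 bytes, which is exactly where the last usable region of the user-defined map ends, so everything the BIOS reported above that address was dropped.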

well, it must be the boot loader, as the kernel didn't add that, and we
didn't ... looking at the GRUB source ... ARGH: we see in stage2/boot.c in
that big comment about boot proto 2.03 that grub is indeed adding kernel
command line options, (even to 2.4.24 and 2.6.8). How can this be? Their
code says it shouldn't, but it does.

This now works fine with GRUB's --no-mem-option added. Never in all this
time have I seen GRUB trigger this piece of code and write mem= in on its
own. Oh well.

-- Lincoln @ EmperorLinux http://www.EmperorLinux.com

2004-10-26 23:44:45

by Linus Torvalds

Subject: Re: Sony S170 + 1GB ram => Yenta: ISA IRQ mask 0x0000



On Tue, 26 Oct 2004, Lincoln D. Durey wrote:
>
> well, it must be the boot loader, as the kernel didn't add that, and we
> didn't ... looking at the GRUB source ... ARGH: we see in stage2/boot.c in
> that big comment about boot proto 2.03 that grub is indeed adding kernel
> command line options, (even to 2.4.24 and 2.6.8). How can this be? Their
> code says it shouldn't, but it does.

Ok, good to know that the kernel was correct, but it might be worthwhile
trying to debug why grub thinks it should do its (incorrect) memory map.
Also, I'd suggest somebody send the grub team a patch to remove the whole
damn mess, I doubt anybody who installs a new bootloader is interested in
installing a buggy one.

Pretty much every kernel has done a better job of memory sizing than grub
seems to do, and I suspect even the "pre-2.4.14" case was just a total bug
in grub, and nothing else.

Linus

2004-10-27 00:11:11

by William Lee Irwin III

Subject: Re: Sony S170 + 1GB ram => Yenta: ISA IRQ mask 0x0000

On Tue, 26 Oct 2004, Lincoln D. Durey wrote:
>> well, it must be the boot loader, as the kernel didn't add that, and we
>> didn't ... looking at the GRUB source ... ARGH: we see in stage2/boot.c in
>> that big comment about boot proto 2.03 that grub is indeed adding kernel
>> command line options, (even to 2.4.24 and 2.6.8). How can this be? Their
>> code says it shouldn't, but it does.

On Tue, Oct 26, 2004 at 04:44:36PM -0700, Linus Torvalds wrote:
> Ok, good to know that the kernel was correct, but it might be worthwhile
> trying to debug why grub thinks it should do its (incorrect) memory map.
> Also, I'd suggest somebody send the grub team a patch to remove the whole
> damn mess, I doubt anybody who installs a new bootloader is interested in
> installing a buggy one.
> Pretty much every kernel has done a better job of memory sizing than grub
> seems to do, and I suspect even the "pre-2.4.14" case was just a total bug
> in grub, and nothing else.

The grub mem= is a major screwup. It's worse than that, though. grub
also hardcodes MAXMEM, which it should obtain from the bzImage.


-- wli