2002-12-24 16:16:57

by Andi Kleen

Subject: [PATCH] Workaround for AMD762MPX "mouse" bug


AMD has published a new errata sheet for the AMD762, which describes
the root cause of the infamous "AMD 762 unstable when no PS/2 mouse
connected" bug. The reason is that without a PS/2 mouse the BIOS doesn't
put a data page in front of the VGA buffer at 640K. When the kernel
puts a page cache page there and does busmaster IO with it, the chipset's
automatic PCI prefetch can hit the VGA buffer, and that may cause a hang.

The workaround is to reserve the page directly before 640K if it wasn't
already reserved by the BIOS.

The bug only occurs in the newer revisions (B0, B1).

The workaround here is somewhat hackish. We can only reserve the page
in early boot, but at that time there is no easy way to check for the
AMD762's PCI-ID because the PCI subsystem hasn't been initialized yet.

This patch checks later during the pci quirks pass instead
and then tells the user to pass a kernel option - "vgaguard" - in
case of instability. This is not ideal, but probably preferable to
connecting PS/2 mice to all boxes in a colocated rack. Another
way would be to always reserve that page, but I didn't feel like
punishing everybody just for a hardware bug in a single chipset.

Patch for 2.5.53. Please consider applying.

-Andi


diff -burp linux/arch/i386/kernel/setup.c linux-tmp/arch/i386/kernel/setup.c
--- linux/arch/i386/kernel/setup.c 2002-12-24 16:45:11.000000000 +0100
+++ linux-tmp/arch/i386/kernel/setup.c 2002-12-24 17:19:22.000000000 +0100
@@ -59,6 +59,8 @@ unsigned long mmu_cr4_features;

 int acpi_disabled __initdata = 0;

+static int vgaguard __initdata = 0;
+
 int MCA_bus;
 /* for MCA, but anyone else can use it if they want */
 unsigned int machine_id;
@@ -557,6 +559,9 @@ static void __init parse_cmdline_early (
 		if (c == ' ' && !memcmp(from, "acpi=off", 8))
 			acpi_disabled = 1;

+		if (c == ' ' && !memcmp(from, "vgaguard", 8))
+			vgaguard = 1;
+
 		/*
 		 * highmem=size forces highmem to be exactly 'size' bytes.
 		 * This works even on boxes that have no highmem otherwise.
@@ -748,6 +753,10 @@ static unsigned long __init setup_memory
 	 */
 	reserve_bootmem(0, PAGE_SIZE);

+	/* work around AMD-762 errata 56 - prefetch into VGA */
+	if (vgaguard)
+		reserve_bootmem(640*1024 - PAGE_SIZE, PAGE_SIZE);
+
 #ifdef CONFIG_SMP
 	/*
 	 * But first pinch a few for the stack/trampoline stuff
diff -burp linux/drivers/pci/quirks.c linux-tmp/drivers/pci/quirks.c
--- linux/drivers/pci/quirks.c 2002-12-24 16:45:14.000000000 +0100
+++ linux-tmp/drivers/pci/quirks.c 2002-12-24 17:19:22.000000000 +0100
@@ -349,6 +349,24 @@ static void __devinit quirk_amd_ioapic(s
 	}
 }

+/* Some AMD 762 chips hang when a PCI busmaster prefetches into the VGA text buffer
+   at 640K. The workaround is to put a suitable guard page before it.
+   Connecting a PS/2 mouse has the same effect; in that case the BIOS reserves
+   a data area there. We cannot detect this in early boot, so tell the user to pass
+   an option to work around it. In theory it would be possible to check the E820
+   map for a suitable guard page and not print the warning then. */
+static void __devinit quirk_amd_ioapic2(struct pci_dev *dev)
+{
+	u8 rev;
+
+	pci_read_config_byte(dev, PCI_REVISION_ID, &rev);
+	if (rev >= 0x10) /* B0+ */
+	{
+		printk(KERN_WARNING "I/O APIC: AMD762 Errata #56 may be present.\n"
+		       KERN_WARNING "In case of instability boot with \"vgaguard\" or connect a PS/2 mouse.\n");
+	}
+}
+
 static void __init quirk_ioapic_rmw(struct pci_dev *dev)
 {
 	if (dev->devfn == 0 && dev->bus->number == 0)
@@ -604,6 +622,7 @@ static struct pci_fixup pci_fixups[] __d

 	{ PCI_FIXUP_FINAL, PCI_VENDOR_ID_CYRIX, PCI_DEVICE_ID_CYRIX_PCI_MASTER, quirk_mediagx_master },

+	{ PCI_FIXUP_FINAL, PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_FE_GATE_700C, quirk_amd_ioapic2 },

 	{ 0 }
 };
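
The comment in the quirk mentions that one could check the E820 map for an
existing guard page. A minimal userspace sketch of that check follows. It is
not part of the patch: struct e820_entry and guard_page_needed() are
hypothetical names, and only the type values 1 (usable RAM) and 2 (reserved)
are taken from the BIOS E820 convention. The idea is that if the BIOS already
owns the page below 640K, that page will not be reported as usable RAM.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE   4096u
#define VGA_START   (640u * 1024u)            /* 0xA0000 */
#define GUARD_START (VGA_START - PAGE_SIZE)   /* 0x9F000 */

/* E820 region types as reported by the BIOS. */
enum { E820_RAM = 1, E820_RESERVED = 2 };

/* Hypothetical stand-in for one E820 map entry. */
struct e820_entry {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

/*
 * Return true if any part of the page directly below 640K is reported as
 * usable RAM, i.e. the kernel could hand it out and a guard is needed.
 */
static bool guard_page_needed(const struct e820_entry *map, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t start = map[i].addr;
        uint64_t end = start + map[i].size;

        if (map[i].type != E820_RAM)
            continue;
        /* Does [start, end) overlap [GUARD_START, VGA_START)? */
        if (start < VGA_START && end > GUARD_START)
            return true;
    }
    return false;
}
```

With a map that reports RAM all the way up to 0xA0000 the function returns
true; with a map whose RAM ends at 0x9F000 it returns false.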


2002-12-29 14:38:19

by David Balazic

Subject: Re: [PATCH] Workaround for AMD762MPX "mouse" bug

Andi Kleen wrote:

> AMD has published a new errata sheet for the AMD762, which describes
> the root cause of the infamous "AMD 762 unstable when no PS/2 mouse
> connected" bug. The reason is that without a PS/2 mouse the BIOS doesn't
> put a data page in front of the VGA buffer at 640K. When the kernel
> puts a page cache page there and does busmaster IO with it, the chipset's
> automatic PCI prefetch can hit the VGA buffer, and that may cause a hang.
>
>
> The workaround is to reserve the page directly before 640K if it wasn't
> already reserved by the BIOS.
>
>
> The bug only occurs in the newer revisions (B0, B1).
>
>
> The workaround here is somewhat hackish. We can only reserve the page
> in early boot, but at that time there is no easy way to check for the
> AMD762's PCI-ID because the PCI subsystem hasn't been initialized yet.
>
>
> This patch checks later during the pci quirks pass instead
> and then tells the user to pass a kernel option - "vgaguard" - in
> case of instability. This is not ideal, but probably preferable to
> connecting PS/2 mice to all boxes in a colocated rack. Another
> way would be to always reserve that page, but I didn't feel like
> punishing everybody just for a hardware bug in a single chipset.
>
>
> Patch for 2.5.53. Please consider applying.

Some suggestions:

- do not tell the user to use the "vgaguard" option if he is already
  using it
- change to a more informative text:
  old:
    I/O APIC: AMD762 Errata #56 may be present.
    In case of instability boot with "vgaguard" or connect a PS/2 mouse.
  new:
    I/O APIC: AMD762 Errata #56 may be present.
    In case of instability boot with the "vgaguard" kernel boot option or
    connect a PS/2 mouse and reboot.

  Just connecting a PS/2 mouse on a running system does not help, right?
  :-)

- maybe rename the option to "amd762vgaguard"?
- also write some docs and put a link to them in the kernel message?
  For now this would be enough:
    I/O APIC: AMD762 Errata #56 may be present.
    In case of instability boot with the "vgaguard" kernel boot option or
    connect a PS/2 mouse and reboot.
    See http://www.uwsg.indiana.edu/hypermail/linux/kernel/0212.3/0043.html
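
The first two suggestions could be sketched roughly as follows. This is a
userspace illustration only, not kernel code: amd762_warning() and its
vgaguard_active parameter are hypothetical. In the patch as posted, vgaguard
is a static __initdata variable in setup.c, so the quirk in quirks.c cannot
see it without further changes.

```c
#include <stdbool.h>
#include <stdio.h>

#define AMD762_REV_B0 0x10  /* from the patch: revisions >= 0x10 are B0+ */

/*
 * Write the warning for the given chip revision into buf.
 * Returns the number of characters snprintf() produced, or 0 if nothing
 * needs to be said (older revision, or the workaround is already active).
 */
static int amd762_warning(unsigned char rev, bool vgaguard_active,
                          char *buf, size_t len)
{
    if (rev < AMD762_REV_B0 || vgaguard_active) {
        if (len > 0)
            buf[0] = '\0';
        return 0;
    }
    return snprintf(buf, len,
        "I/O APIC: AMD762 Errata #56 may be present.\n"
        "In case of instability boot with the \"vgaguard\" kernel boot "
        "option or\nconnect a PS/2 mouse and reboot.\n");
}
```

A caller would print buf only when the return value is non-zero, so a user
who already boots with the option is not told to add it again.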


Regards,
David Balazic

2002-12-29 16:43:10

by Eric Lammerts

Subject: Re: [PATCH] Workaround for AMD762MPX "mouse" bug


On Tue, 24 Dec 2002, Andi Kleen wrote:
> This patch checks later during the pci quirks pass instead
> and then tells the user to pass a kernel option - "vgaguard" - in
> case of instability.

Why not create a more general option instead, like
"reserve_mem=4k@636k"?
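
For illustration, parsing the value of such an option might look like the
sketch below. No "reserve_mem=" option exists in the patch; the option and
the helpers parse_size() and parse_reserve_mem() are hypothetical, and this
is plain userspace C rather than kernel code (the kernel has its own
memparse() helper for size suffixes). Note that 4k@636k describes exactly
the page the patch reserves: 636 * 1024 = 0x9F000 = 640K - 4K.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

/* Parse a number with an optional k/m/g suffix, e.g. "636k"; advance *s. */
static bool parse_size(const char **s, unsigned long long *out)
{
    char *end;
    unsigned long long v;

    if (!isdigit((unsigned char)**s))
        return false;
    v = strtoull(*s, &end, 0);
    switch (tolower((unsigned char)*end)) {
    case 'g': v <<= 10; /* fall through */
    case 'm': v <<= 10; /* fall through */
    case 'k': v <<= 10; end++; break;
    default: break;
    }
    *s = end;
    *out = v;
    return true;
}

/* Parse "<size>@<start>", e.g. "4k@636k". Returns false on malformed input. */
static bool parse_reserve_mem(const char *arg,
                              unsigned long long *size,
                              unsigned long long *start)
{
    if (!parse_size(&arg, size) || *arg != '@')
        return false;
    arg++;
    return parse_size(&arg, start) && *arg == '\0';
}
```

On success the caller would pass start and size to the early-boot
reservation in place of the hard-coded 640*1024 - PAGE_SIZE.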

Eric

2002-12-29 20:02:29

by Benjamin LaHaise

Subject: Re: [PATCH] Workaround for AMD762MPX "mouse" bug

On Tue, Dec 24, 2002 at 05:25:01PM +0100, Andi Kleen wrote:
> way would be to always reserve that page, but I didn't feel like
> punishing everybody just for a hardware bug in a single chipset.
>
> Patch for 2.5.53. Please consider applying.

That's the wrong way to do it. Workarounds like this need to be automatic,
and with init code sections, there is no excuse not to. Instead of making
the user pass a quirk option, why not reserve the page and then free it if
the errata is not present?

-ben

2002-12-30 00:01:13

by Alan

Subject: Re: [PATCH] Workaround for AMD762MPX "mouse" bug

On Sun, 2002-12-29 at 14:46, David Balazic wrote:
> Just connecting a PS/2 mouse on a running system does not help, right ?
> :-)

It has to occur at boot. The fix proposed is crap though. It's perfectly
possible to reserve the page at boot up time and give it back later if
the errata is not found and the page isn't in the EBDA.
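
The reserve-then-release flow described here can be modelled as a small
state machine. The sketch below is userspace C with hypothetical names
(guard_state, late_info, guard_early_boot(), guard_late_init()); the actual
kernel calls needed to hand a bootmem-reserved page back to the allocator
are deliberately left out.

```c
#include <stdbool.h>

enum guard_state {
    GUARD_RESERVED,   /* page held back since early boot */
    GUARD_RELEASED    /* page returned to the allocator */
};

/* Facts that only become known once PCI has been probed. */
struct late_info {
    bool amd762_b0_or_later;  /* host bridge revision >= 0x10 */
    bool page_in_ebda;        /* BIOS already owns the page below 640K */
};

/* Early boot: nothing is known yet, so always hold the page back. */
static enum guard_state guard_early_boot(void)
{
    return GUARD_RESERVED;
}

/*
 * Late init: give the page back only if the erratum does not apply and
 * the page is not BIOS data. Otherwise it stays reserved for good.
 */
static enum guard_state guard_late_init(enum guard_state s,
                                        const struct late_info *info)
{
    if (s == GUARD_RESERVED &&
        !info->amd762_b0_or_later && !info->page_in_ebda)
        return GUARD_RELEASED;
    return s;
}
```

With this ordering no boot option is needed: unaffected machines lose the
page only until late init, and affected ones keep the guard automatically.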


Subject: Re: [PATCH] Workaround for AMD762MPX "mouse" bug

Benjamin LaHaise writes:

>On Tue, Dec 24, 2002 at 05:25:01PM +0100, Andi Kleen wrote:
>> way would be to always reserve that page, but I didn't feel like
>> punishing everybody just for a hardware bug in a single chipset.
>>
>> Patch for 2.5.53. Please consider applying.

>That's the wrong way to do it. Workarounds like this need to be automatic,
>and with init code sections, there is no excuse not to. Instead of making
>the user pass a quirk option, why not reserve the page and then free it if
>the errata is not present?

That's exactly what I suggested to Andi in private mail and he said
"yes, this would work". So I expect a patch doing exactly this from
him. :-)

(Yes, I could have done it myself, but I neither have the chipset nor am
I a deep-innards kernel hacker.)

Regards
Henning

--
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen -- Geschaeftsfuehrer
INTERMETA - Gesellschaft fuer Mehrwertdienste mbH [email protected]

Am Schwabachgrund 22 Fon.: 09131 / 50654-0 [email protected]
D-91054 Buckenhof Fax.: 09131 / 50654-20