2004-01-22 19:33:07

by Mikael Pettersson

[permalink] [raw]
Subject: [PATCH][2.6] local APIC LVTT init bug

__setup_APIC_LVTT() incorrectly sets i82489DX-only bits
which are reserved in integrated local APICs, causing
problems in some machines. Fixed in this patch by making
this setting conditional.

It's possible these bits don't need to be set on i82489DXs,
but not having this HW for testing I elected to maintain
our current behaviour on these old machines.

/Mikael

diff -ruN linux-2.6.2-rc1/arch/i386/kernel/apic.c linux-2.6.2-rc1.apic-lvtt-init-fix/arch/i386/kernel/apic.c
--- linux-2.6.2-rc1/arch/i386/kernel/apic.c 2003-10-18 11:59:45.000000000 +0200
+++ linux-2.6.2-rc1.apic-lvtt-init-fix/arch/i386/kernel/apic.c 2004-01-22 19:57:37.691617134 +0100
@@ -834,11 +834,13 @@

void __setup_APIC_LVTT(unsigned int clocks)
{
- unsigned int lvtt1_value, tmp_value;
+ unsigned int lvtt_value, tmp_value, ver;

- lvtt1_value = SET_APIC_TIMER_BASE(APIC_TIMER_BASE_DIV) |
- APIC_LVT_TIMER_PERIODIC | LOCAL_TIMER_VECTOR;
- apic_write_around(APIC_LVTT, lvtt1_value);
+ ver = GET_APIC_VERSION(apic_read(APIC_LVR));
+ lvtt_value = APIC_LVT_TIMER_PERIODIC | LOCAL_TIMER_VECTOR;
+ if (!APIC_INTEGRATED(ver))
+ lvtt_value |= SET_APIC_TIMER_BASE(APIC_TIMER_BASE_DIV);
+ apic_write_around(APIC_LVTT, lvtt_value);

/*
* Divide PICLK by 16


2004-01-23 12:06:51

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [PATCH][2.6] local APIC LVTT init bug

On Thu, 22 Jan 2004, Mikael Pettersson wrote:

> __setup_APIC_LVTT() incorrectly sets i82489DX-only bits
> which are reserved in integrated local APICs, causing
> problems in some machines. Fixed in this patch by making
> this setting conditional.

Sigh -- why can't designers keep such a trivial backwards
compatibility??? The integrated APIC was said to be backwards compatible
when introduced and so far all implementations used to. What you write
means that has been broken -- could please say which vendor to blame?

> It's possible these bits don't need to be set on i82489DXs,
> but not having this HW for testing I elected to maintain
> our current behaviour on these old machines.

They were originally unset, leading to broken behavior. I introduced the
current settings based on i82489DX specification once the problem was
discovered. The code was checked at the run time on a few systems with
i82489DX APICs.

The timer source has to be correctly selected -- we cannot rely on the
firmware to set it the way we want it as the firmware is free to use the
timer whatever way it wants. Even the power-on/reset setting is
unsuitable as it's the CLK input, straight, and we want it predivided like
with the integrated APIC. The third alternative, the TMBASE input, cannot
be used reliably at all -- it may be simply NC on a given system.

Your proposal is therefore the only correct one.

Maciej

--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--------------------------------------------------------------+
+ e-mail: [email protected], PGP key available +

2004-01-23 12:57:43

by Mikael Pettersson

[permalink] [raw]
Subject: Re: [PATCH][2.6] local APIC LVTT init bug

Maciej W. Rozycki writes:
> On Thu, 22 Jan 2004, Mikael Pettersson wrote:
>
> > __setup_APIC_LVTT() incorrectly sets i82489DX-only bits
> > which are reserved in integrated local APICs, causing
> > problems in some machines. Fixed in this patch by making
> > this setting conditional.
>
> Sigh -- why can't designers keep such a trivial backwards
> compatibility??? The integrated APIC was said to be backwards compatible
> when introduced and so far all implementations used to. What you write
> means that has been broken -- could please say which vendor to blame?

The ASUS L3800C was mentioned. I don't know of any others.

> Your proposal is therefore the only correct one.

Thanks for confirming that.

/Mikael

2004-01-23 13:31:37

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [PATCH][2.6] local APIC LVTT init bug

On Fri, 23 Jan 2004, Mikael Pettersson wrote:

> > Sigh -- why can't designers keep such a trivial backwards
> > compatibility??? The integrated APIC was said to be backwards compatible
> > when introduced and so far all implementations used to. What you write
> > means that has been broken -- could please say which vendor to blame?
>
> The ASUS L3800C was mentioned. I don't know of any others.

It seems to be P4-based -- I'm pretty sure the integrated APIC behaves
the same way regardless of where its plugged in, so why wouldn't this
problem appear earlier? I've browsed my mailbox and found a patch that
was stated to fix problems on the system involved. But the patch disables
the timer around certain actions -- that may indeed matter for some broken
firmware (especially some SMM crap), but I can't see how these bits could.

That doesn't of course mean your patch shouldn't be applied -- it won't
hurt to be overly careful.

Maciej

--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--------------------------------------------------------------+
+ e-mail: [email protected], PGP key available +

2004-01-23 15:52:28

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH][2.6] local APIC LVTT init bug



On Fri, 23 Jan 2004, Maciej W. Rozycki wrote:
> On Fri, 23 Jan 2004, Mikael Pettersson wrote:
> >
> > The ASUS L3800C was mentioned. I don't know of any others.
>
> It seems to be P4-based -- I'm pretty sure the integrated APIC behaves
> the same way regardless of where its plugged in, so why wouldn't this
> problem appear earlier?

It's entirely possible that the bug isn't in the integrated APIC per se,
but migth be in ACPI/SMM getting confused when it reads the LVTT value and
tries to do something with it. And since the system vendors don't tend
to test with Linux (or test only with a few standard kernels that may not
even have APIC enabled) the code might never have been tested with that
behaviour.

Now quite honestly, I don't know _why_ it would read the value, so that
theory is a pretty weak one, but the point being that it's not absolutely
necessary that the hardware itself be broken. This is the reason we see
most SMM/BIOS bugs - the code just assumes certain states.

Linus

2004-01-23 17:23:32

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [PATCH][2.6] local APIC LVTT init bug

On Fri, 23 Jan 2004, Linus Torvalds wrote:

> It's entirely possible that the bug isn't in the integrated APIC per se,
> but migth be in ACPI/SMM getting confused when it reads the LVTT value and
> tries to do something with it. And since the system vendors don't tend
> to test with Linux (or test only with a few standard kernels that may not
> even have APIC enabled) the code might never have been tested with that
> behaviour.

Hmm, but are the timer base selection bits actually flippable in any
integrated APICs? I've never seen them set to anything but "00" in my P5
APICs despite our initialization code.

> Now quite honestly, I don't know _why_ it would read the value, so that
> theory is a pretty weak one, but the point being that it's not absolutely
> necessary that the hardware itself be broken. This is the reason we see
> most SMM/BIOS bugs - the code just assumes certain states.

It constantly amazes me what imaginative ways to trigger failures the
designers of PC firmware find -- given no coupling of OSes to hardware
vendors one cannot assume any particular state of the hardware. This is
especially true with the SMM, which may often get entered at any moment,
beyond control of the OS.

--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--------------------------------------------------------------+
+ e-mail: [email protected], PGP key available +