2002-01-27 16:23:01

by W. Michael Petullo

Subject: SMP Pentium III, GA-6VXDC7 MoBo. -- 2.4.18-pre7 SMP not working

I have a home-built dual Pentium III computer which does not seem to
want to run recent SMP kernels. The computer is built on a Gigabyte
GA-6VXDC7 motherboard, which is in turn based on a VIA Apollo Pro chip-set.
It is an exclusively SCSI system -- I do not compile any IDE drivers
into my kernel.

Kernel 2.4.12 works fine when compiled with SMP on. However, anything
newer fails to load when compiled with SMP support. In the failing cases,
lilo prints its uncompressing kernel and booting kernel messages followed
by a system hang -- the kernel never prints anything.

Kernel.org
Vanilla        CONFIG_SMP=y          # CONFIG_SMP is not set
Version        SMP Status            UP Status
============================================================
2.4.10         SMP works             Fine
2.4.11         Wouldn't touch        Wouldn't touch
2.4.12         SMP works             Fine
2.4.13         SMP does not boot     Fine
2.4.14         Did not try           Did not try
2.4.15         Did not try           Did not try
2.4.16         SMP does not boot     Fine
2.4.17         SMP does not boot     Fine
2.4.18-pre7    SMP does not boot     Fine

Since the kernel does not even peep an oops message, I'm not sure where
to start debugging. Is anyone else having similar problems?
--
Mike

:wq


2002-01-28 01:46:07

by Petr Vandrovec

Subject: Re: SMP Pentium III, GA-6VXDC7 MoBo. -- 2.4.18-pre7 SMP not working

On Sun, Jan 27, 2002 at 05:21:50PM +0100, W. Michael Petullo wrote:
> I have a home-built dual Pentium III computer which does not seem to
> want to run recent SMP kernels. The computer is built on a Gigabyte
> GA-6VXDC7 motherboard, which is in turn based on a VIA Apollo Pro chip-set.
> It is an exclusively SCSI system -- I do not compile any IDE drivers
> into my kernel.

Can you open arch/i386/kernel/smpboot.c in your favorite text
editor, locate wakeup_secondary_via_INIT (it has this name
in 2.5.3-pre5), and in this function locate

	apic_write_around(APIC_ICR, APIC_DM_STARTUP | (start_eip >> 12));
	/* Give the other CPU some time to accept the IPI */
	udelay(300);

and try increasing 300 to some bigger value (and make sure that
you are using pristine sources, there must be no printk() between
apic_write_around and udelay()). When I was getting Linux SMP
to work on GA-6VXD7 (it still boots, even with 2.5.3-pre5), I had to
ensure that no bus accesses (and especially PCI write) happen
until secondary CPU is alive. By trial and error I found
that 150us is needed on my motherboard, so I put 300us here.
Maybe 300us is not enough for you, so try increasing this value.
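
A minimal sketch of the ordering constraint described above, written so it
builds outside the kernel. Everything here is an assumption made for
illustration: the stub functions stand in for apic_write_around() and
udelay(), the register and mode constants are placeholders rather than the
kernel's definitions, and STARTUP_IPI_SETTLE_US is a hypothetical name for
the value that the source hard-codes as 300.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical tunable; the original code passes a literal 300 to udelay(). */
#ifndef STARTUP_IPI_SETTLE_US
#define STARTUP_IPI_SETTLE_US 1000
#endif

/* Placeholder values so the sketch compiles; not taken from kernel headers. */
#define APIC_ICR_STUB        0x300UL
#define APIC_DM_STARTUP_STUB 0x600UL

enum event { EV_ICR_WRITE, EV_DELAY };

#define TRACE_MAX 8
static enum event trace[TRACE_MAX];
static unsigned long trace_arg[TRACE_MAX];
static int trace_len;

/* Record each call so the order of operations can be inspected afterwards. */
static void record(enum event e, unsigned long arg)
{
	if (trace_len < TRACE_MAX) {
		trace[trace_len] = e;
		trace_arg[trace_len] = arg;
		trace_len++;
	}
}

/* Stand-in for apic_write_around(): only logs the value written. */
static void apic_write_around_stub(unsigned long reg, unsigned long val)
{
	(void)reg;
	record(EV_ICR_WRITE, val);
}

/* Stand-in for udelay(): only logs the requested delay. */
static void udelay_stub(unsigned long usecs)
{
	record(EV_DELAY, usecs);
}

/* Send the startup IPI, then stay silent for the whole settle time. */
static void send_startup_ipi(unsigned long start_eip)
{
	apic_write_around_stub(APIC_ICR_STUB,
			       APIC_DM_STARTUP_STUB | (start_eip >> 12));
	/* Nothing may be added here: no printk(), no PCI writes. */
	udelay_stub(STARTUP_IPI_SETTLE_US);
}
```

In the real function the only change would be the number passed to udelay();
the point of the sketch is that the delay must directly follow the ICR write,
with no other work in between.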

On 6VXD7 if you are not silent after you send startup IPI,
secondary CPU will not execute even first two instructions,
and first CPU will die about 50ms after it sends this
startup IPI.

Best regards,
Petr Vandrovec
[email protected]

2002-02-02 14:13:27

by W. Michael Petullo

Subject: Re: SMP Pentium III, GA-6VXDC7 MoBo. -- 2.4.18-pre7 SMP not working

> I have a home-built dual Pentium III computer which does not seem to
> want to run recent SMP kernels. The computer is built on a Gigabyte
> GA-6VXDC7 motherboard, which is in turn based on a VIA Apollo Pro chip-set.
> It is an exclusively SCSI system -- I do not compile any IDE drivers
> into my kernel.
>
> Kernel 2.4.12 works fine when compiled with SMP on. However, anything
> newer fails to load when compiled with SMP support. In the failing cases,
> lilo prints its uncompressing kernel and booting kernel messages followed
> by a system hang -- the kernel never prints anything.
>
> Kernel.org
> Vanilla        CONFIG_SMP=y          # CONFIG_SMP is not set
> Version        SMP Status            UP Status
> ============================================================
> 2.4.10         SMP works             Fine
> 2.4.11         Wouldn't touch        Wouldn't touch
> 2.4.12         SMP works             Fine
> 2.4.13         SMP does not boot     Fine
> 2.4.14         Did not try           Did not try
> 2.4.15         Did not try           Did not try
> 2.4.16         SMP does not boot     Fine
> 2.4.17         SMP does not boot     Fine
> 2.4.18-pre7    SMP does not boot     Fine
>
> Since the kernel does not even peep an oops message, I'm not sure where
> to start debugging. Is anyone else having similar problems?

I'm having a lot of trouble debugging this one. Printks are not being
displayed on the screen, though I know they are being executed. I have
even tried William Lee Irwin's early_printk patch to try to get printks
to display. Apparently the printk buffer is not being flushed this early
in the kernel code?

The only debugging technique I have found is to insert return statements
in order to avoid different sections of code. I find, for example,
that if I make the first statement of smpboot.c:do_boot_cpu() a return,
the kernel will boot (without SMP on, of course).

Obviously, this is not the best technique. I have been able to narrow
the location of the bug down a little with it, but I feel I will be able
to discover little more with this technique.

I'm not sure what else to do without printks.

I would appreciate any tips on debugging smpboot.c.

Thank you.

--
Mike

:wq

2002-02-02 21:45:27

by William Lee Irwin III

Subject: Re: SMP Pentium III, GA-6VXDC7 MoBo. -- 2.4.18-pre7 SMP not working

On Sat, Feb 02, 2002 at 03:12:41PM +0100, W. Michael Petullo wrote:
> I'm having a lot of trouble debugging this one. Printks are not being
> displayed on the screen, though I know they are being executed. I have
> even tried William Lee Irwin's early_printk patch to try to get printks
> to display. Apparently the printk buffer is not being flushed this early
> in the kernel code?

printk() flushes the buffers on the exit path, or otherwise the console
code in printk.c is refusing to do anything that early to get around
the bootstrap ordering issues encountered on IA64 (which are truly
arch-generic, though some things coincidentally just work).

I should probably update this so it actually uses the CON_EARLY design
and applies against more recent kernels, and also does not register
things you don't code in explicitly (because of course it's just not
possible to get the bootstrap ordering vs. console output without some
more drivers and registration and unregistration and this logic is not
ever going to be portable).

On Sat, Feb 02, 2002 at 03:12:41PM +0100, W. Michael Petullo wrote:
> The only debugging technique I have found is to insert return statements
> in order to avoid different sections of code. I find, for example,
> that if I make the first statement of smpboot.c:do_boot_cpu() a return,
> the kernel will boot (without SMP on, of course).

I've got a few other tricks up my sleeve -- the output drivers in the
patch are supposed to help some, but there is still more that can be done
to get debugging output out of a machine. I believe that for one reason
or another the console code is refusing to print that early. The driver
defines some callbacks for use by the console code which can be made
non-static and then called directly -- they should simply dump things to
the VGA text buffer with very little interference. They won't provide
convenient string formatting, but sprintf() can be used to help it along.
This is of course not the end of the line either but it may help some.
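
A rough sketch of the "format with sprintf(), then dump to the VGA text
buffer" idea, kept in user space so it can be compiled and tried on its own.
It assumes the usual 80x25 colour text mode, where each cell is 16 bits
(character in the low byte, attribute in the high byte) and the buffer sits
at physical address 0xB8000. The names vga_puts_at() and vga_mark() are
hypothetical and are not the callbacks from the patch; the caller supplies
the buffer pointer, since how the kernel maps that address is outside this
sketch. The bounded snprintf() is used here in place of sprintf().

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define VGA_COLS 80
#define VGA_ROWS 25
#define VGA_ATTR_GREY_ON_BLACK 0x07

/*
 * Write a NUL-terminated string starting at (row, col), clipping at the
 * right edge of the row. Returns the number of cells written.
 */
static int vga_puts_at(volatile uint16_t *vga, int row, int col, const char *s)
{
	int n = 0;

	if (row < 0 || row >= VGA_ROWS || col < 0)
		return 0;

	while (*s != '\0' && col < VGA_COLS) {
		/* High byte: attribute. Low byte: character code. */
		vga[row * VGA_COLS + col] =
			(uint16_t)((VGA_ATTR_GREY_ON_BLACK << 8) |
				   (unsigned char)*s);
		s++;
		col++;
		n++;
	}
	return n;
}

/*
 * Format a "where: value" marker into a local buffer first, then dump it
 * at the start of the given row. Returns the number of cells written.
 */
static int vga_mark(volatile uint16_t *vga, int row, const char *where,
		    unsigned long value)
{
	char line[VGA_COLS + 1];

	snprintf(line, sizeof line, "%s: 0x%lx", where, value);
	return vga_puts_at(vga, row, 0, line);
}
```

Placing one marker per row at successive points in the boot path would show
how far execution got before the hang, without relying on the console code.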

On Sat, Feb 02, 2002 at 03:12:41PM +0100, W. Michael Petullo wrote:
> Obviously, this is not the best technique. I have been able to narrow
> the location of the bug down a little with it, but I feel I will be able
> to discover little more with this technique.
> I'm not sure what else to do without printks.
> I would appreciate any tips on debugging smpboot.c.
> Thank you.

Let me know if the VGA text buffer stuff I just mentioned helps at all.
I'm also on #kernelnewbies fairly often so flag me down there any time.
And, of course, I'll read the follow-ups. =)


Cheers,
Bill

2002-02-03 14:24:44

by W. Michael Petullo

Subject: Re: SMP Pentium III, GA-6VXDC7 MoBo. -- 2.4.18-pre7 SMP not working

>> I have a home-built dual Pentium III computer which does not seem to
>> want to run recent SMP kernels. The computer is built on a Gigabyte
>> GA-6VXDC7 motherboard, which is in turn based on a VIA Apollo Pro chip-set.
>> It is an exclusively SCSI system -- I do not compile any IDE drivers
>> into my kernel.
>>
>> Kernel 2.4.12 works fine when compiled with SMP on. However, anything
>> newer fails to load when compiled with SMP support. In the failing cases,
>> lilo prints its uncompressing kernel and booting kernel messages followed
>> by a system hang -- the kernel never prints anything.
>>
>> Kernel.org
>> Vanilla        CONFIG_SMP=y          # CONFIG_SMP is not set
>> Version        SMP Status            UP Status
>> ============================================================
>> 2.4.10         SMP works             Fine
>> 2.4.11         Wouldn't touch        Wouldn't touch
>> 2.4.12         SMP works             Fine
>> 2.4.13         SMP does not boot     Fine
>> 2.4.14         Did not try           Did not try
>> 2.4.15         Did not try           Did not try
>> 2.4.16         SMP does not boot     Fine
>> 2.4.17         SMP does not boot     Fine
>> 2.4.18-pre7    SMP does not boot     Fine
>>
>> Since the kernel does not even peep an oops message, I'm not sure where
>> to start debugging. Is anyone else having similar problems?

> I'm having a lot of trouble debugging this one.
> [...]

Apparently some type of conflict was introduced in 2.4.13's APM code.
I finally got 2.4.17 to boot with SMP enabled after I disabled APM.
Though APM is not SMP safe, I have been using it successfully for its
power-off feature.

I will look closer at the 2.4.13 APM changes to try and determine what
broke my SMP.

--
Mike

:wq