2005-10-31 17:05:10

by Protasevich, Natalie

Subject: RE: [Fastboot] [PATCH] i386: move apic init in init_IRQs

> Vivek Goyal <[email protected]> writes:
> > I have attached a patch with the mail which is now using
> > boot_cpu_physical_apicid to hard set presence of boot cpu instead of
> > hard_smp_processor_id(). But the interesting question remains why BIOS
> > is not reporting the boot cpu.
>
>
> Ok. I don't know if we care but I do know why we were not
> seeing the report from the bios about your boot processor.
> We record information about cpus for up to NR_CPUS, and since
> you had a UP kernel NR_CPUS was one.
>
> From your earlier boot log.
>
> > ACPI: LAPIC (acpi_id[0x00] lapic_id[0x03] enabled)
> > Processor #3 6:10 APIC version 17
> > ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
> > Processor #0 6:10 APIC version 17
> > WARNING: NR_CPUS limit of 1 reached. Processor ignored.
> > ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
> > Processor #1 6:10 APIC version 17
> > WARNING: NR_CPUS limit of 1 reached. Processor ignored.
> > ACPI: LAPIC (acpi_id[0x03] lapic_id[0x02] enabled)
> > Processor #2 6:10 APIC version 17
> > WARNING: NR_CPUS limit of 1 reached. Processor ignored.
>
> So it looks like we have this problem completely fixed.
>
> I don't see a good way to ensure that we always record our
> boot apicid when we boot a multiple processor system and only
> use one processor.

Hi Eric,

There is another problem with that patch - it broke ES7000, I kept
getting timer panics. It turned out that check_timer() runs before the
actual APIC destination is set up. The IO-APIC uses
cpu_to_logical_apicid to find the destination - which needs
cpu_2_logical_apicid[] to be filled - which only happens after
processors are booted. At the time when check_timer() runs, it will
always be BAD_APICID (0xFF - broadcast) as the IO-APIC rte destination
for the timer, but ES7000 hardware happened not to support 0xFF so it
panics. I used bios_cpu_apicid[] to bring it up, but
cpu_to_logical_apicid is the only one that is kept up-to-date in the
hotplug case, so I cannot replace it in the cpu_mask_to_apicid().

There are probably some ways to fix this such as one below that I tried
(in mpparse.c):

     if (m->mpc_cpuflag & CPU_BOOTPROCESSOR) {
         Dprintk(" Bootup CPU\n");
         boot_cpu_physical_apicid = m->mpc_apicid;
+        cpu_2_logical_apicid[num_processors] = m->mpc_apicid;
     }

It worked, but of course it looks more like a kludge. I think IO-APIC
setup has to happen after the processors are brought online, and so does
check_timer() if the timer is connected through the IO-APIC.
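
To make the ordering problem concrete, here is a minimal user-space sketch in C. It is not kernel code: the table size, the helper names timer_rte_destination() and record_boot_cpu(), and the APIC id 0x03 are assumptions chosen for illustration. It only models the idea that a lookup made before the table is filled yields BAD_APICID, and that recording the boot CPU first yields a usable destination.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS    4
#define BAD_APICID 0xFFu  /* also reads as "broadcast" in an RTE destination */

/* Stand-in for the kernel table; every slot starts out unset. */
static uint8_t cpu_2_logical_apicid[NR_CPUS] = {
    BAD_APICID, BAD_APICID, BAD_APICID, BAD_APICID
};

/* Simplified lookup: returns whatever the table currently holds. */
static uint8_t cpu_to_logical_apicid(int cpu)
{
    if (cpu < 0 || cpu >= NR_CPUS)
        return BAD_APICID;
    return cpu_2_logical_apicid[cpu];
}

/* Hypothetical: the destination the timer RTE would be programmed with. */
static uint8_t timer_rte_destination(int cpu)
{
    return cpu_to_logical_apicid(cpu);
}

/* Hypothetical helper modelling the one-line workaround: fill the
 * boot CPU's slot before anything asks for a destination. */
static void record_boot_cpu(int cpu, uint8_t apicid)
{
    cpu_2_logical_apicid[cpu] = apicid;
}
```

Calling timer_rte_destination(0) before record_boot_cpu(0, 0x03) gives BAD_APICID, and calling it afterwards gives 0x03, while the slots of CPUs that have not been booted stay at BAD_APICID.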

--Natalie



2005-10-31 17:12:32

by Zwane Mwaikambo

Subject: RE: [Fastboot] [PATCH] i386: move apic init in init_IRQs

On Mon, 31 Oct 2005, Protasevich, Natalie wrote:

> > Vivek Goyal <[email protected]> writes:
> > > I have attached a patch with the mail which is now using
> > > boot_cpu_physical_apicid to hard set presence of boot cpu instead of
> > > hard_smp_processor_id(). But the interesting question remains why BIOS
> > > is not reporting the boot cpu.
> >
> >
> > Ok. I don't know if we care but I do know why we were not
> > seeing the report from the bios about your boot processor.
> > We record information about cpus for up to NR_CPUS, and since
> > you had a UP kernel NR_CPUS was one.
> >
> > From your earlier boot log.
> >
> > > ACPI: LAPIC (acpi_id[0x00] lapic_id[0x03] enabled)
> > > Processor #3 6:10 APIC version 17
> > > ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
> > > Processor #0 6:10 APIC version 17
> > > WARNING: NR_CPUS limit of 1 reached. Processor ignored.
> > > ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
> > > Processor #1 6:10 APIC version 17
> > > WARNING: NR_CPUS limit of 1 reached. Processor ignored.
> > > ACPI: LAPIC (acpi_id[0x03] lapic_id[0x02] enabled)
> > > Processor #2 6:10 APIC version 17
> > > WARNING: NR_CPUS limit of 1 reached. Processor ignored.
> >
> > So it looks like we have this problem completely fixed.
> >
> > I don't see a good way to ensure that we always record our
> > boot apicid when we boot a multiple processor system and only
> > use one processor.
>
> Hi Eric,
>
> There is another problem with that patch - it broke ES7000, I kept
> getting timer panics. It turned out that check_timer() runs before the
> actual APIC destination is set up. The IO-APIC uses
> cpu_to_logical_apicid to find the destination - which needs
> cpu_2_logical_apicid[] to be filled - which only happens after
> processors are booted. At the time when check_timer() runs, it will
> always be BAD_APICID (0xFF - broadcast) as the IO-APIC rte destination
> for the timer, but ES7000 hardware happened not to support 0xFF so it
> panics. I used bios_cpu_apicid[] to bring it up, but
> cpu_to_logical_apicid is the only one that is kept up-to-date in the
> hotplug case, so I cannot replace it in the cpu_mask_to_apicid().
>
> There are probably some ways to fix this such as one below that I tried
> (in mpparse.c):
>
>      if (m->mpc_cpuflag & CPU_BOOTPROCESSOR) {
>          Dprintk(" Bootup CPU\n");
>          boot_cpu_physical_apicid = m->mpc_apicid;
> +        cpu_2_logical_apicid[num_processors] = m->mpc_apicid;
>      }
>
> It worked, but of course it looks more like a kludge. I think IO-APIC
> setup has to happen after the processors are brought online, and so does
> check_timer() if the timer is connected through the IO-APIC.

Regarding IOAPIC setup I agree, Eric's patch is causing a few problems;

Total of 2 processors activated (14407.06 BogoMIPS).
checking TSC synchronization across 2 CPUs: passed.
softlockup thread 0 started up.
APIC error on CPU1: 00(40) <====
Brought up 2 CPUs

2005-10-31 17:31:35

by Eric W. Biederman

Subject: Re: [Fastboot] [PATCH] i386: move apic init in init_IRQs

Zwane Mwaikambo <[email protected]> writes:

>
> Regarding IOAPIC setup I agree, Eric's patch is causing a few problems;
>
> Total of 2 processors activated (14407.06 BogoMIPS).
> checking TSC synchronization across 2 CPUs: passed.
> softlockup thread 0 started up.
> APIC error on CPU1: 00(40) <====
> Brought up 2 CPUs

Cool! Bug reports!

Zwane, can I get a little more detail, or is this just a warning?
I don't have enough information to understand what is happening
on your machine.

Eric

2005-10-31 18:19:51

by Eric W. Biederman

Subject: Re: [Fastboot] [PATCH] i386: move apic init in init_IRQs

"Protasevich, Natalie" <[email protected]> writes:

> Hi Eric,
>
> There is another problem with that patch - it broke ES7000, I kept
> getting timer panics. It turned out that check_timer() runs before the
> actual APIC destination is set up. The IO-APIC uses
> cpu_to_logical_apicid to find the destination - which needs
> cpu_2_logical_apicid[] to be filled - which only happens after
> processors are booted. At the time when check_timer() runs, it will
> always be BAD_APICID (0xFF - broadcast) as the IO-APIC rte destination
> for the timer, but ES7000 hardware happened not to support 0xFF so it
> panics. I used bios_cpu_apicid[] to bring it up, but
> cpu_to_logical_apicid is the only one that is kept up-to-date in the
> hotplug case, so I cannot replace it in the cpu_mask_to_apicid().
>
> There are probably some ways to fix this such as one below that I tried
> (in mpparse.c):
>
>      if (m->mpc_cpuflag & CPU_BOOTPROCESSOR) {
>          Dprintk(" Bootup CPU\n");
>          boot_cpu_physical_apicid = m->mpc_apicid;
> +        cpu_2_logical_apicid[num_processors] = m->mpc_apicid;
>      }
>
> It worked, but of course it looks more like a kludge. I think IO-APIC
> setup has to happen after the processors are brought online, and so does
> check_timer() if the timer is connected through the IO-APIC.

The first cpu is brought online much earlier than the rest. So
we just need to set up a table for the boot cpu earlier. From the looks
of it, mach-es7000 won't work if you compile a uniprocessor kernel
for it right now.

We need to do this a little later than in mptable but this should be a fairly
simple one or two line change.
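
As a rough illustration of that ordering, here is a small user-space C model, not kernel code. The struct, the function names (parse_mptable, setup_boot_cpu_early, check_timer_ok, boot) and the choice to reuse the physical id as the logical id are all assumptions made for the sketch. It shows that once the boot cpu's entry is set up between MP table parsing and the timer check, the check passes without any secondary cpu being booted, which is the uniprocessor case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BAD_APICID 0xFFu

/* Toy boot state; all names here are illustrative. */
struct boot_state {
    uint8_t boot_cpu_physical_apicid; /* learned while parsing the MP table */
    uint8_t boot_cpu_logical_apicid;  /* what the IO-APIC code would look up */
};

/* MP table parsing only learns the physical id of the boot cpu. */
static void parse_mptable(struct boot_state *s, uint8_t bsp_apicid)
{
    s->boot_cpu_physical_apicid = bsp_apicid;
    s->boot_cpu_logical_apicid = BAD_APICID; /* not known yet */
}

/* Hypothetical early step, run after parsing but before the timer check.
 * Assumption: the physical id is reused as the logical id. */
static void setup_boot_cpu_early(struct boot_state *s)
{
    s->boot_cpu_logical_apicid = s->boot_cpu_physical_apicid;
}

/* True if the timer RTE would get a usable destination. */
static bool check_timer_ok(const struct boot_state *s)
{
    return s->boot_cpu_logical_apicid != BAD_APICID;
}

/* Runs the modelled sequence; no secondary cpus are ever booted. */
static bool boot(uint8_t bsp_apicid, bool early_setup)
{
    struct boot_state s;

    parse_mptable(&s, bsp_apicid);
    if (early_setup)
        setup_boot_cpu_early(&s);
    return check_timer_ok(&s);
}
```

In this model boot(0x03, false) fails the timer check and boot(0x03, true) passes it; the id 0x03 is an arbitrary example value.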

If people keep breaking the subarchitectures by accident we might even
inspire someone to make a comprehensible subarchitecture implementation
on x86 one of these days.


Eric

2005-10-31 20:20:42

by Zwane Mwaikambo

Subject: Re: [Fastboot] [PATCH] i386: move apic init in init_IRQs

On Mon, 31 Oct 2005, Eric W. Biederman wrote:

> Zwane Mwaikambo <[email protected]> writes:
>
> >
> > Regarding IOAPIC setup I agree, Eric's patch is causing a few problems;
> >
> > Total of 2 processors activated (14407.06 BogoMIPS).
> > checking TSC synchronization across 2 CPUs: passed.
> > softlockup thread 0 started up.
> > APIC error on CPU1: 00(40) <====
> > Brought up 2 CPUs
>
> Cool! Bug reports!
>
> Zwane, can I get a little more detail, or is this just a warning?
> I don't have enough information to understand what is happening
> on your machine.

I just isolated which patch it was last night, so I'm still not sure which
part of it causes problems. The patch in question is:

i386-nmi_watchdog-merge-check_nmi_watchdog-fixes-from-x86_64.patch

This happens on both a dual P2-400 and a 3.6GHz P4 with HT enabled. What
kind of information were you after?