[Background]
MP specification defines three different interrupt delivery modes as follows:
1. PIC Mode
2. Virtual Wire Mode
3. Symmetric I/O Mode
They will be setup in the different periods of booting time:
1. *PIC Mode*, the default interrupt delivery modes, will be set first.
2. *Virtual Wire Mode* will be setup during ISA IRQ initialization( step 1
in the figure.1).
3. *Symmetric I/O Mode*'s setup is related to the system
3.1 In SMP-capable system, setup during prepares CPUs(step 2)
3.2 In UP system, setup during initializes itself(step 3).
start_kernel
+---------------+
|
+--> .......
|
| setup_arch
+--> +-------+
|
| init_IRQ
+-> +--+-----+
| | init_ISA_irqs
| +------> +-+--------+
| | +----------------+
+---> +------> | 1.init_bsp_APIC|
| ....... +----------------+
+--->
| rest_init
+--->---+-----+
| | kernel_init
| +> ----+-----+
| | kernel_init_freeable
| +-> ----+-------------+
| | smp_prepare_cpus
| +---> +----+---------+
| | | +-------------------+
| | +-> |2. apic_bsp_setup |
| | +-------------------+
| |
v | smp_init
+---> +---+----+
| +-------------------+
+--> |3. apic_bsp_setup |
+-------------------+
figure.1 The flow chart of the kernel startup process
[Problem]
1. Cause kernel in an unmatched mode at the beginning of booting time.
2. Cause the dump-capture kernel hangs with 'notsc' option inherited
from 1st kernel option.
3. Cause the code hard to read and maintain.
As Ingo's and Eric's discusses[1,2], it need to be refactor.
[Solution]
1. Construct a selector to unify these switches
+------------+
|disable_apic+--------------------+
+------------+ true |
|false |
| |
+------------v------------------+ |
|!boot_cpu_has(X86_FEATURE_APIC)+-------+
+-------------------------------+ true |
|false |
| |
+-------v---------+ v
|!smp_found_config| PIC MODE
+---------------+-+
|false |true
| |
v +---v---------+
SYMMETRIC IO MODE | !acpi_lapic |
+------+------+
|
v
VIRTUAL WIRE MODE
2. Unifying these setup steps of SMP-capable and UP system
start_kernel
---------------+
|
|
|
| x86_late_time_init
+---->---+------------+
| |
| | +------------------------+
| +----> | 4. init_interrupt_mode |
| +------------------------+
v
3. Execute the function as soon as possible.
[Test]
1. In a theoretical code analysis, the patchset can wrap the original
logic.
1) The original logic of the interrupt delivery mode setup:
-Step O_1) Keep in PIC mode or virtual wire mode:
Check (smp_found_config || !boot_cpu_has(X86_FEATURE_APIC))
true: PIC mode
false: virtual wire mode
-Step O_2) Try to switch to symmetric IO mode:
O_2_1) In up system:
-Check disable_apic
ture: *O_S_1* (original situation 1)
-Check whether there is a separate or integrated chip
don't has: *O_S_2*
-Check !smp_found_config
ture: *O_S_3*
-Others:
*O_S_4*
O_2_2) In smp-capable system:
-Check !smp_found_config && !acpi_lapic
true: goto *O_2_1)*
-Check if it is LAPIC
don't has: *O_S_5*
-Check !max_cpus
true: *O_S_6*
-read_apic_id() != boot_cpu_physical_apicid
true: *O_S_7*
-Others:
*O_S_8*
2) After that patchset, the new logic:
-Step N_1) Skip step O_1 and try to switch to the final interrupt mode
-Check disable_apic
ture: *N_S_1* (New situation 1)
-Check whether there is a separate or integrated chip
ture: *N_S_2*
-Check if (!smp_found_config)
ture: *N_S_3*
-Check !setup_max_cpus
ture: *N_S_4*
-Check read_apic_id() != boot_cpu_physical_apicid
ture: *N_S_5*
-Others:
*N_S_6*
O_S_1 is covered in N_S_1
O_S_2 is covered in N_S_2
O_S_3 is covered in N_S_3
O_S_4 is covered in N_S_6
O_S_5 is covered in N_S_2
O_S_6 is covered in N_S_4
O_S_7 is covered in N_S_5
O_S_8 is covered in N_S_6
2. In the actual test, It also can work well in the situations of
my test matrix
The factors of test matrix:
X86 | SMP |LOCAL APIC|I/O APIC|UP_LATE_INIT|
----- |-----|----------|--------|------------|
32-bit| Y | Y | Y | Y |
64-bit| N | N | N | N |
xen PV |
xen HVM|
disable_apic|X86_FEATURE_APIC|smp_found_config|
------------|----------------|----------------|
0 | 0 | 0 |
1 | 1 | 1 |
acpi_lapic|acpi_ioapic|setup_max_cpus|
----------|-----------|--------------|
0 | 0 | =0 |
1 | 1 | >0 |
[Link]
[1]. https://lkml.org/lkml/2016/8/2/929
[2]. https://lkml.org/lkml/2016/8/1/506
For previous discussion, please refer to:
https://lkml.org/lkml/2017/5/10/323
https://www.spinics.net/lists/kernel/msg2491620.html
https://lkml.org/lkml/2017/3/29/481
https://lkml.org/lkml/2017/6/30/17
Changes V5 --> V6:
- change the check order for X86_32 in apic_intr_mode_select()
- replace the apic_printk with pr_info in apic_intr_mode_init()
- add a seperate helper function for get the logical apicid
- remove the extra argument upmode in apic_intr_mode_select()
- cleanup the logic of apic_intr_mode_init()
- replcae the 'ticks = jiffies' with 'end = jiffies + 4'
- rewrite the 9th and 10th patches's changelog
Changes V4 --> V5:
- remove the RFC presix
- remove the 1/12 patch in V4
- merge 2 patches together for SMP-capable system
- replace the *_interrupt_* with *_intr_*
- replace the pr_info with apic_printk in apic_intr_mode_init()
- add a patch for PV xen to bypass intr_mode_init()
Changes V3 --> V4:
- Move interrupt_mode_init to x86_init_ops instead of the use of
header files
- Replace "return" with "break" in case of APIC_SYMMETRIC_IO_NO_ROUTING
- Setup upmode earlier for UP system.
- Check interrupt mode before per cpu clock event setup.
Changes V2 --> V3:
- Rebase the patches.
- Change two function name:
apic_bsp_mode_check --> apic_interrupt_mode_select
init_interrupt_mode --> apic_interrupt_mode_init
- Find a new waiting way to check whether timer IRQs work or not
- Refine the switch logic in apic_interrupt_mode_init()
- Consistently start sentences with upper case letters
- Fix some typos and comments
- Try my best to rewrite some changelog again
Changes since V1:
- Move the initialization from init_IRQ() to x86_late_time_init()
- Use a threshold to refactor the check logic in timer_irq_works()
- Rename the framework to a selector
- Split two patches
- Consistently start sentences with upper case letters
- Fix some typos
- Rewrite the changelog
Dou Liyang (12):
x86/apic: Construct a selector for the interrupt delivery mode
x86/apic: Prepare for unifying the interrupt delivery modes setup
x86/apic: Split local APIC timer setup from the APIC setup
x86/apic: Move logical APIC ID away from apic_bsp_setup()
x86/apic: Unify interrupt mode setup for SMP-capable system
x86/apic: Mark the apic_intr_mode extern for sanity check cleanup
x86/apic: Unify interrupt mode setup for UP system
x86/ioapic: Refactor the delay logic in timer_irq_works()
x86/init: add intr_mode_init to x86_init_ops
x86/xen: Bypass intr mode setup in enlighten_pv system
x86/time: Initialize interrupt mode behind timer init
x86/apic: Remove the init_bsp_APIC()
arch/x86/include/asm/apic.h | 15 +++-
arch/x86/include/asm/x86_init.h | 2 +
arch/x86/kernel/apic/apic.c | 190 +++++++++++++++++++++-------------------
arch/x86/kernel/apic/io_apic.c | 45 +++++++++-
arch/x86/kernel/irqinit.c | 3 -
arch/x86/kernel/smpboot.c | 80 +++++------------
arch/x86/kernel/time.c | 5 ++
arch/x86/kernel/x86_init.c | 1 +
arch/x86/xen/enlighten_pv.c | 1 +
9 files changed, 187 insertions(+), 155 deletions(-)
--
2.5.5
There are three positions for initializing the interrupt delivery
modes:
1) In IRQ initial function, may setup the through-local-APIC
virtual wire mode.
2) In an SMP-capable system, will try to switch to symmetric I/O
model when preparing the cpus in native_smp_prepare_cpus().
3) In UP system with UP_LATE_INIT=y, will set up local APIC and
I/O APIC in smp_init().
Switching to symmetric I/O mode is so late, which causes kernel
in an unmatched mode at the beginning of booting time. And it
causes the dump-capture kernel hangs with 'notsc' option inherited
from 1st kernel option.
Provide a new function to unify that three positions. Preparatory
patch to initialize an interrupt mode directly.
Signed-off-by: Dou Liyang <[email protected]>
---
arch/x86/include/asm/apic.h | 2 ++
arch/x86/kernel/apic/apic.c | 16 ++++++++++++++++
2 files changed, 18 insertions(+)
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index bdffcd9..ddc16ff 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -128,6 +128,7 @@ extern void disable_local_APIC(void);
extern void lapic_shutdown(void);
extern void sync_Arb_IDs(void);
extern void init_bsp_APIC(void);
+extern void apic_intr_mode_init(void);
extern void setup_local_APIC(void);
extern void init_apic_mappings(void);
void register_lapic_address(unsigned long address);
@@ -170,6 +171,7 @@ static inline void disable_local_APIC(void) { }
# define setup_boot_APIC_clock x86_init_noop
# define setup_secondary_APIC_clock x86_init_noop
static inline void lapic_update_tsc_freq(void) { }
+static inline void apic_intr_mode_init(void) { }
#endif /* !CONFIG_X86_LOCAL_APIC */
#ifdef CONFIG_X86_X2APIC
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index ad5a853..3f22717 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1265,6 +1265,22 @@ void __init init_bsp_APIC(void)
apic_write(APIC_LVT1, value);
}
+/* Init the interrupt delivery mode for the BSP */
+void __init apic_intr_mode_init(void)
+{
+ switch (apic_intr_mode_select()) {
+ case APIC_PIC:
+ pr_info("APIC: keep in PIC mode(8259)\n");
+ return;
+ case APIC_VIRTUAL_WIRE:
+ pr_info("APIC: switch to virtual wire mode setup\n");
+ return;
+ case APIC_SYMMETRIC_IO:
+ pr_info("APIC: switch to symmectic I/O mode setup\n");
+ return;
+ }
+}
+
static void lapic_setup_esr(void)
{
unsigned int oldvalue, value, maxlvt;
--
2.5.5
apic_bsp_setup() sets and returns logical APIC ID for initializing
cpu0_logical_apicid in SMP-capable system.
The id has nothing to do with the initialization of local APIC and
I/O APIC. And apic_bsp_setup() should be called for interrupt mode
setup intently.
Move the id setup into a separate helper function for cleanup and
mark apic_bsp_setup() void.
Signed-off-by: Dou Liyang <[email protected]>
---
arch/x86/include/asm/apic.h | 2 +-
arch/x86/kernel/apic/apic.c | 10 +---------
arch/x86/kernel/smpboot.c | 12 +++++++++++-
3 files changed, 13 insertions(+), 11 deletions(-)
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index ddc16ff..c3bedbd 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -146,7 +146,7 @@ static inline int apic_force_enable(unsigned long addr)
extern int apic_force_enable(unsigned long addr);
#endif
-extern int apic_bsp_setup(bool upmode);
+extern void apic_bsp_setup(bool upmode);
extern void apic_ap_setup(void);
/*
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 83b7c2e..51536b9 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2337,25 +2337,17 @@ static void __init apic_bsp_up_setup(void)
* Returns:
* apic_id of BSP APIC
*/
-int __init apic_bsp_setup(bool upmode)
+void __init apic_bsp_setup(bool upmode)
{
- int id;
-
connect_bsp_APIC();
if (upmode)
apic_bsp_up_setup();
setup_local_APIC();
- if (x2apic_mode)
- id = apic_read(APIC_LDR);
- else
- id = GET_APIC_LOGICAL_ID(apic_read(APIC_LDR));
-
enable_IO_APIC();
end_local_APIC_setup();
irq_remap_enable_fault_handling();
setup_IO_APIC();
- return id;
}
/*
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 93f0cda..3f74288 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1287,6 +1287,14 @@ static void __init smp_cpu_index_default(void)
}
}
+static void __init smp_get_logical_apicid(void)
+{
+ if (x2apic_mode)
+ cpu0_logical_apicid = apic_read(APIC_LDR);
+ else
+ cpu0_logical_apicid = GET_APIC_LOGICAL_ID(apic_read(APIC_LDR));
+}
+
/*
* Prepare for SMP bootup. The MP table or ACPI has been read
* earlier. Just do some sanity checking here and enable APIC mode.
@@ -1347,11 +1355,13 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
}
default_setup_apic_routing();
- cpu0_logical_apicid = apic_bsp_setup(false);
+ apic_bsp_setup(false);
/* Setup local timer */
x86_init.timers.setup_percpu_clockev();
+ smp_get_logical_apicid();
+
pr_info("CPU0: ");
print_cpu_info(&cpu_data(0));
--
2.5.5
Kernel use timer_irq_works() to detects the timer IRQs. It calls
mdelay(10) to delay ten ticks and check whether the timer IRQ work
or not. The mdelay() depends on the loops_per_jiffy which is set up
in calibrate_delay(). Current kernel defaults the IRQ 0 is available
when it calibrates delay.
But it is wrong in the dump-capture kernel with 'notsc' option inherited
from 1st kernel option. dump-capture kernel can't make sure the timer IRQ
works well.
The correct design is making the interrupt mode setup and checking timer
IRQ works in advance of calibrate_delay(). That results in the mdelay()
being unusable in timer_irq_works().
Preparatory patch to make the setup in advance. Refactor the delay logic
by waiting for some cycles. In the system with X86_FEATURE_TSC feature,
Use rdtsc(), others will call __delay() directly.
Note: regard 4G as the max CPU frequence of current single CPU.
Signed-off-by: Dou Liyang <[email protected]>
---
arch/x86/kernel/apic/io_apic.c | 45 ++++++++++++++++++++++++++++++++++++++++--
1 file changed, 43 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 347bb9f..3087f0a 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1607,6 +1607,43 @@ static int __init notimercheck(char *s)
}
__setup("no_timer_check", notimercheck);
+static void __init delay_with_tsc(void)
+{
+ unsigned long long start, now;
+ unsigned long end = jiffies + 4;
+
+ start = rdtsc();
+
+ /*
+ * We don't know the TSC frequency yet, but waiting for
+ * 40000000000/HZ TSC cycles is safe:
+ * 4 GHz == 10 jiffies
+ * 1 GHz == 40 jiffies
+ */
+ do {
+ rep_nop();
+ now = rdtsc();
+ } while ((now - start) < 40000000000UL / HZ &&
+ time_before_eq(jiffies, end));
+}
+
+static void __init delay_without_tsc(void)
+{
+ unsigned long end = jiffies + 4;
+ int band = 1;
+
+ /*
+ * We don't know any frequency yet, but waiting for
+ * 40940000000/HZ cycles is safe:
+ * 4 GHz == 10 jiffies
+ * 1 GHz == 40 jiffies
+ * 1 << 1 + 1 << 2 +...+ 1 << 11 = 4094
+ */
+ do {
+ __delay(((1U << band++) * 10000000UL) / HZ);
+ } while (band < 12 && time_before_eq(jiffies, end));
+}
+
/*
* There is a nasty bug in some older SMP boards, their mptable lies
* about the timer IRQ. We do the following to work around the situation:
@@ -1625,8 +1662,12 @@ static int __init timer_irq_works(void)
local_save_flags(flags);
local_irq_enable();
- /* Let ten ticks pass... */
- mdelay((10 * 1000) / HZ);
+
+ if (boot_cpu_has(X86_FEATURE_TSC))
+ delay_with_tsc();
+ else
+ delay_without_tsc();
+
local_irq_restore(flags);
/*
--
2.5.5
XEN PV overrides smp_prepare_cpus(). xen_pv_smp_prepare_cpus()
initializes interrupts in the XEN PV specific way and does not invoke
native_smp_prepare_cpus(). As a consequence, x86_init.intr_mode_init() is
not invoked either.
The invocation of x86_init.intr_mode_init() will be moved from
native_smp_prepare_cpus() in a follow up patch to solve <INSERT
REASON/PROBLEM>.
That move would cause the invocation of x86_init.intr_mode_init() for XEN
PV platforms. To prevent that, override the default x86_init.intr_mode_init()
callback with a noop().
Signed-off-by: Dou Liyang <[email protected]>
---
arch/x86/xen/enlighten_pv.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index f33eef4..d3362a3 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1268,6 +1268,7 @@ asmlinkage __visible void __init xen_start_kernel(void)
x86_platform.get_nmi_reason = xen_get_nmi_reason;
x86_init.resources.memory_setup = xen_memory_setup;
+ x86_init.irqs.intr_mode_init = x86_init_noop;
x86_init.oem.arch_setup = xen_arch_setup;
x86_init.oem.banner = xen_banner;
--
2.5.5
The init_bsp_APIC() which works for the virtual wire mode is used
in ISA irq initialization at the booting time.
Currently, enable and setup the interrupt mode has been unified
and advanced just behind the timer IRQ setup. Kernel switches to
the final interrupt delivery mode directly. So init_bsp_APIC()
is redundant.
Remove the init_bsp_APIC() function.
Signed-off-by: Dou Liyang <[email protected]>
---
arch/x86/include/asm/apic.h | 1 -
arch/x86/kernel/apic/apic.c | 49 ---------------------------------------------
arch/x86/kernel/irqinit.c | 3 ---
3 files changed, 53 deletions(-)
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index b35bbbf..3c9fc9e 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -136,7 +136,6 @@ extern void disconnect_bsp_APIC(int virt_wire_setup);
extern void disable_local_APIC(void);
extern void lapic_shutdown(void);
extern void sync_Arb_IDs(void);
-extern void init_bsp_APIC(void);
extern void apic_intr_mode_init(void);
extern void setup_local_APIC(void);
extern void init_apic_mappings(void);
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 2707aa2..41cd020 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1229,55 +1229,6 @@ static int __init apic_intr_mode_select(void)
return APIC_SYMMETRIC_IO;
}
-/*
- * An initial setup of the virtual wire mode.
- */
-void __init init_bsp_APIC(void)
-{
- unsigned int value;
-
- /*
- * Don't do the setup now if we have a SMP BIOS as the
- * through-I/O-APIC virtual wire mode might be active.
- */
- if (smp_found_config || !boot_cpu_has(X86_FEATURE_APIC))
- return;
-
- /*
- * Do not trust the local APIC being empty at bootup.
- */
- clear_local_APIC();
-
- /*
- * Enable APIC.
- */
- value = apic_read(APIC_SPIV);
- value &= ~APIC_VECTOR_MASK;
- value |= APIC_SPIV_APIC_ENABLED;
-
-#ifdef CONFIG_X86_32
- /* This bit is reserved on P4/Xeon and should be cleared */
- if ((boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) &&
- (boot_cpu_data.x86 == 15))
- value &= ~APIC_SPIV_FOCUS_DISABLED;
- else
-#endif
- value |= APIC_SPIV_FOCUS_DISABLED;
- value |= SPURIOUS_APIC_VECTOR;
- apic_write(APIC_SPIV, value);
-
- /*
- * Set up the virtual wire mode.
- */
- apic_write(APIC_LVT0, APIC_DM_EXTINT);
- value = APIC_DM_NMI;
- if (!lapic_is_integrated()) /* 82489DX */
- value |= APIC_LVT_LEVEL_TRIGGER;
- if (apic_extnmi == APIC_EXTNMI_NONE)
- value |= APIC_LVT_MASKED;
- apic_write(APIC_LVT1, value);
-}
-
/* Init the interrupt delivery mode for the BSP */
void __init apic_intr_mode_init(void)
{
diff --git a/arch/x86/kernel/irqinit.c b/arch/x86/kernel/irqinit.c
index 7468c69..488c9e2 100644
--- a/arch/x86/kernel/irqinit.c
+++ b/arch/x86/kernel/irqinit.c
@@ -72,9 +72,6 @@ void __init init_ISA_irqs(void)
struct irq_chip *chip = legacy_pic->chip;
int i;
-#if defined(CONFIG_X86_64) || defined(CONFIG_X86_LOCAL_APIC)
- init_bsp_APIC();
-#endif
legacy_pic->init(0);
for (i = 0; i < nr_legacy_irqs(); i++)
--
2.5.5
In UniProcessor kernel with UP_LATE_INIT=y, it enables and setups
interrupt delivery mode in up_late_init().
Unify it to apic_intr_mode_init(), remove APIC_init_uniprocessor().
Signed-off-by: Dou Liyang <[email protected]>
---
arch/x86/include/asm/apic.h | 1 -
arch/x86/kernel/apic/apic.c | 47 ++++++---------------------------------------
2 files changed, 6 insertions(+), 42 deletions(-)
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index bfbf715..b35bbbf 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -144,7 +144,6 @@ void register_lapic_address(unsigned long address);
extern void setup_boot_APIC_clock(void);
extern void setup_secondary_APIC_clock(void);
extern void lapic_update_tsc_freq(void);
-extern int APIC_init_uniprocessor(void);
#ifdef CONFIG_X86_64
static inline int apic_force_enable(unsigned long addr)
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 8cae32c..6dc399e 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1281,7 +1281,7 @@ void __init init_bsp_APIC(void)
/* Init the interrupt delivery mode for the BSP */
void __init apic_intr_mode_init(void)
{
- bool upmode = false;
+ bool upmode = IS_ENABLED(CONFIG_UP_LATE_INIT);
apic_intr_mode = apic_intr_mode_select();
@@ -2381,51 +2381,16 @@ void __init apic_bsp_setup(bool upmode)
setup_IO_APIC();
}
-/*
- * This initializes the IO-APIC and APIC hardware if this is
- * a UP kernel.
- */
-int __init APIC_init_uniprocessor(void)
+#ifdef CONFIG_UP_LATE_INIT
+void __init up_late_init(void)
{
- if (disable_apic) {
- pr_info("Apic disabled\n");
- return -1;
- }
-#ifdef CONFIG_X86_64
- if (!boot_cpu_has(X86_FEATURE_APIC)) {
- disable_apic = 1;
- pr_info("Apic disabled by BIOS\n");
- return -1;
- }
-#else
- if (!smp_found_config && !boot_cpu_has(X86_FEATURE_APIC))
- return -1;
+ apic_intr_mode_init();
- /*
- * Complain if the BIOS pretends there is one.
- */
- if (!boot_cpu_has(X86_FEATURE_APIC) &&
- APIC_INTEGRATED(boot_cpu_apic_version)) {
- pr_err("BIOS bug, local APIC 0x%x not detected!...\n",
- boot_cpu_physical_apicid);
- return -1;
- }
-#endif
-
- if (!smp_found_config)
- disable_ioapic_support();
+ if (apic_intr_mode == APIC_PIC)
+ return;
- default_setup_apic_routing();
- apic_bsp_setup(true);
/* Setup local timer */
x86_init.timers.setup_percpu_clockev();
- return 0;
-}
-
-#ifdef CONFIG_UP_LATE_INIT
-void __init up_late_init(void)
-{
- APIC_init_uniprocessor();
}
#endif
--
2.5.5
In start_kernel(), firstly, it works on the default interrupy mode, then
switch to the final mode. Normally, Booting with BIOS reset is OK.
But, At dump-capture kernel, it boot up without BIOS reset, default mode
may not be compatible with the actual registers, that causes the delivery
interrupt to fail.
Try to set up the final mode as soon as possible. according to the parts
which split from that initialization:
1) Set up the APIC/IOAPIC (including testing whether the timer
interrupt works)
2) Calibrate TSC
3) Set up the local APIC timer
-- From Thomas Gleixner
Initializing the mode should be earlier than calibrating TSC as soon as
possible and needs testing whether the timer interrupt works at the same
time.
call it behind timers init, which meets the above conditions.
Signed-off-by: Dou Liyang <[email protected]>
---
arch/x86/kernel/apic/apic.c | 2 --
arch/x86/kernel/smpboot.c | 2 --
arch/x86/kernel/time.c | 5 +++++
3 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 9573c14..2707aa2 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2384,8 +2384,6 @@ void __init apic_bsp_setup(bool upmode)
#ifdef CONFIG_UP_LATE_INIT
void __init up_late_init(void)
{
- x86_init.irqs.intr_mode_init();
-
if (apic_intr_mode == APIC_PIC)
return;
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 208f4f3..a26d12c 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1289,8 +1289,6 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
set_cpu_sibling_map(0);
- x86_init.irqs.intr_mode_init();
-
smp_sanity_check();
switch (apic_intr_mode) {
diff --git a/arch/x86/kernel/time.c b/arch/x86/kernel/time.c
index d39c091..0f04d4b 100644
--- a/arch/x86/kernel/time.c
+++ b/arch/x86/kernel/time.c
@@ -84,6 +84,11 @@ void __init hpet_time_init(void)
static __init void x86_late_time_init(void)
{
x86_init.timers.timer_init();
+ /*
+ * After PIT/HPET timers init, select and setup
+ * the final interrupt mode for delivering IRQs.
+ */
+ x86_init.irqs.intr_mode_init();
tsc_init();
}
--
2.5.5
In the SMP-capable system, enable and setup the interrupt delivery
mode in native_smp_prepare_cpus().
This design mixs the APIC and SMP together, it has highly coupling.
Make the initialization of interrupt mode independent, Unify and
refine it to apic_intr_mode_init() for SMP-capable system.
Signed-off-by: Dou Liyang <[email protected]>
---
arch/x86/kernel/apic/apic.c | 41 ++++++++++++++++++++++++++++++++++++++---
arch/x86/kernel/smpboot.c | 14 ++------------
2 files changed, 40 insertions(+), 15 deletions(-)
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 51536b9..4e24aaf 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1160,7 +1160,9 @@ void __init sync_Arb_IDs(void)
enum apic_intr_mode {
APIC_PIC,
APIC_VIRTUAL_WIRE,
+ APIC_VIRTUAL_WIRE_NO_CONFIG,
APIC_SYMMETRIC_IO,
+ APIC_SYMMETRIC_IO_NO_ROUTING,
};
static int __init apic_intr_mode_select(void)
@@ -1207,12 +1209,29 @@ static int __init apic_intr_mode_select(void)
if (!smp_found_config) {
disable_ioapic_support();
- if (!acpi_lapic)
+ if (!acpi_lapic) {
pr_info("APIC: ACPI MADT or MP tables are not detected\n");
+ return APIC_VIRTUAL_WIRE_NO_CONFIG;
+ }
+
return APIC_VIRTUAL_WIRE;
}
+#ifdef CONFIG_SMP
+ /* If SMP should be disabled, then really disable it! */
+ if (!setup_max_cpus) {
+ pr_info("APIC: SMP mode deactivated\n");
+ return APIC_SYMMETRIC_IO_NO_ROUTING;
+ }
+
+ if (read_apic_id() != boot_cpu_physical_apicid) {
+ panic("Boot APIC ID in local APIC unexpected (%d vs %d)",
+ read_apic_id(), boot_cpu_physical_apicid);
+ /* Or can we switch back to PIC here? */
+ }
+#endif
+
return APIC_SYMMETRIC_IO;
}
@@ -1268,17 +1287,33 @@ void __init init_bsp_APIC(void)
/* Init the interrupt delivery mode for the BSP */
void __init apic_intr_mode_init(void)
{
+ bool upmode = false;
+
switch (apic_intr_mode_select()) {
case APIC_PIC:
pr_info("APIC: keep in PIC mode(8259)\n");
return;
case APIC_VIRTUAL_WIRE:
pr_info("APIC: switch to virtual wire mode setup\n");
- return;
+ default_setup_apic_routing();
+ break;
+ case APIC_VIRTUAL_WIRE_NO_CONFIG:
+ pr_info("APIC: switch to virtual wire mode setup "
+ "with no configuration\n");
+ upmode = true;
+ default_setup_apic_routing();
+ break;
case APIC_SYMMETRIC_IO:
pr_info("APIC: switch to symmectic I/O mode setup\n");
- return;
+ default_setup_apic_routing();
+ break;
+ case APIC_SYMMETRIC_IO_NO_ROUTING:
+ pr_info("APIC: switch to symmectic I/O mode setup "
+ "in no SMP routine\n");
+ break;
}
+
+ apic_bsp_setup(upmode);
}
static void lapic_setup_esr(void)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 3f74288..a4b072d 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1329,18 +1329,17 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
set_cpu_sibling_map(0);
+ apic_intr_mode_init();
+
switch (smp_sanity_check(max_cpus)) {
case SMP_NO_CONFIG:
disable_smp();
- if (APIC_init_uniprocessor())
- pr_notice("Local APIC not detected. Using dummy APIC emulation.\n");
return;
case SMP_NO_APIC:
disable_smp();
return;
case SMP_FORCE_UP:
disable_smp();
- apic_bsp_setup(false);
/* Setup local timer */
x86_init.timers.setup_percpu_clockev();
return;
@@ -1348,15 +1347,6 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
break;
}
- if (read_apic_id() != boot_cpu_physical_apicid) {
- panic("Boot APIC ID in local APIC unexpected (%d vs %d)",
- read_apic_id(), boot_cpu_physical_apicid);
- /* Or can we switch back to PIC here? */
- }
-
- default_setup_apic_routing();
- apic_bsp_setup(false);
-
/* Setup local timer */
x86_init.timers.setup_percpu_clockev();
--
2.5.5
Calling native_smp_prepare_cpus() to prepare for SMP bootup, does
some sanity checking, enables APIC mode and disables SMP feature.
Now, APIC mode setup has been unified to apic_intr_mode_init(),
some sanity checks are redundant and need to be cleanup.
Mark the apic_intr_mode extern to refine the switch and remove
the redundant sanity check.
Signed-off-by: Dou Liyang <[email protected]>
---
arch/x86/include/asm/apic.h | 9 +++++++
arch/x86/kernel/apic/apic.c | 12 ++++------
arch/x86/kernel/smpboot.c | 57 +++++++--------------------------------------
3 files changed, 22 insertions(+), 56 deletions(-)
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index c3bedbd..bfbf715 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -53,6 +53,15 @@ extern int local_apic_timer_c2_ok;
extern int disable_apic;
extern unsigned int lapic_timer_frequency;
+extern enum apic_intr_mode_id apic_intr_mode;
+enum apic_intr_mode_id {
+ APIC_PIC,
+ APIC_VIRTUAL_WIRE,
+ APIC_VIRTUAL_WIRE_NO_CONFIG,
+ APIC_SYMMETRIC_IO,
+ APIC_SYMMETRIC_IO_NO_ROUTING
+};
+
#ifdef CONFIG_SMP
extern void __inquire_remote_apic(int apicid);
#else /* CONFIG_SMP */
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 4e24aaf..8cae32c 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1157,13 +1157,7 @@ void __init sync_Arb_IDs(void)
APIC_INT_LEVELTRIG | APIC_DM_INIT);
}
-enum apic_intr_mode {
- APIC_PIC,
- APIC_VIRTUAL_WIRE,
- APIC_VIRTUAL_WIRE_NO_CONFIG,
- APIC_SYMMETRIC_IO,
- APIC_SYMMETRIC_IO_NO_ROUTING,
-};
+enum apic_intr_mode_id apic_intr_mode;
static int __init apic_intr_mode_select(void)
{
@@ -1289,7 +1283,9 @@ void __init apic_intr_mode_init(void)
{
bool upmode = false;
- switch (apic_intr_mode_select()) {
+ apic_intr_mode = apic_intr_mode_select();
+
+ switch (apic_intr_mode) {
case APIC_PIC:
pr_info("APIC: keep in PIC mode(8259)\n");
return;
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index a4b072d..b40cf59 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1183,17 +1183,10 @@ static __init void disable_smp(void)
cpumask_set_cpu(0, topology_core_cpumask(0));
}
-enum {
- SMP_OK,
- SMP_NO_CONFIG,
- SMP_NO_APIC,
- SMP_FORCE_UP,
-};
-
/*
* Various sanity checks.
*/
-static int __init smp_sanity_check(unsigned max_cpus)
+static void __init smp_sanity_check(void)
{
preempt_disable();
@@ -1231,16 +1224,6 @@ static int __init smp_sanity_check(unsigned max_cpus)
}
/*
- * If we couldn't find an SMP configuration at boot time,
- * get out of here now!
- */
- if (!smp_found_config && !acpi_lapic) {
- preempt_enable();
- pr_notice("SMP motherboard not detected\n");
- return SMP_NO_CONFIG;
- }
-
- /*
* Should not be necessary because the MP table should list the boot
* CPU too, but we do it for the sake of robustness anyway.
*/
@@ -1250,29 +1233,6 @@ static int __init smp_sanity_check(unsigned max_cpus)
physid_set(hard_smp_processor_id(), phys_cpu_present_map);
}
preempt_enable();
-
- /*
- * If we couldn't find a local APIC, then get out of here now!
- */
- if (APIC_INTEGRATED(boot_cpu_apic_version) &&
- !boot_cpu_has(X86_FEATURE_APIC)) {
- if (!disable_apic) {
- pr_err("BIOS bug, local APIC #%d not detected!...\n",
- boot_cpu_physical_apicid);
- pr_err("... forcing use of dummy APIC emulation (tell your hw vendor)\n");
- }
- return SMP_NO_APIC;
- }
-
- /*
- * If SMP should be disabled, then really disable it!
- */
- if (!max_cpus) {
- pr_info("SMP mode deactivated\n");
- return SMP_FORCE_UP;
- }
-
- return SMP_OK;
}
static void __init smp_cpu_index_default(void)
@@ -1331,19 +1291,20 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
apic_intr_mode_init();
- switch (smp_sanity_check(max_cpus)) {
- case SMP_NO_CONFIG:
- disable_smp();
- return;
- case SMP_NO_APIC:
+ smp_sanity_check();
+
+ switch (apic_intr_mode) {
+ case APIC_PIC:
+ case APIC_VIRTUAL_WIRE_NO_CONFIG:
disable_smp();
return;
- case SMP_FORCE_UP:
+ case APIC_SYMMETRIC_IO_NO_ROUTING:
disable_smp();
/* Setup local timer */
x86_init.timers.setup_percpu_clockev();
return;
- case SMP_OK:
+ case APIC_VIRTUAL_WIRE:
+ case APIC_SYMMETRIC_IO:
break;
}
--
2.5.5
X86 and XEN initialize interrupt delivery mode in different way.
Ordinary conditional function calls will make the code mess.
Add an unconditional x86_init_ops function which defaults to the
standard function and can be overridden by the early platform code.
Signed-off-by: Dou Liyang <[email protected]>
---
arch/x86/include/asm/x86_init.h | 2 ++
arch/x86/kernel/apic/apic.c | 2 +-
arch/x86/kernel/smpboot.c | 2 +-
arch/x86/kernel/x86_init.c | 1 +
4 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index 7ba7e90..f45acdf 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -50,11 +50,13 @@ struct x86_init_resources {
* are set up.
* @intr_init: interrupt init code
* @trap_init: platform specific trap setup
+ * @intr_mode_init: interrupt delivery mode setup
*/
struct x86_init_irqs {
void (*pre_vector_init)(void);
void (*intr_init)(void);
void (*trap_init)(void);
+ void (*intr_mode_init)(void);
};
/**
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 6dc399e..9573c14 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2384,7 +2384,7 @@ void __init apic_bsp_setup(bool upmode)
#ifdef CONFIG_UP_LATE_INIT
void __init up_late_init(void)
{
- apic_intr_mode_init();
+ x86_init.irqs.intr_mode_init();
if (apic_intr_mode == APIC_PIC)
return;
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index b40cf59..208f4f3 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1289,7 +1289,7 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
set_cpu_sibling_map(0);
- apic_intr_mode_init();
+ x86_init.irqs.intr_mode_init();
smp_sanity_check();
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index a088b2c..a7889b9 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -55,6 +55,7 @@ struct x86_init_ops x86_init __initdata = {
.pre_vector_init = init_ISA_irqs,
.intr_init = native_init_IRQ,
.trap_init = x86_init_noop,
+ .intr_mode_init = apic_intr_mode_init
},
.oem = {
--
2.5.5
apic_bsp_setup() sets up the local APIC, I/O APIC and APIC timer.
The local APIC and I/O APIC setup belongs to interrupt delivery mode
setup. Setting up the local APIC timer for booting CPU is another job
and has nothing to do with interrupt delivery mode setup.
Split local APIC timer setup from the APIC setup, keep it in the
original position for SMP and UP kernel for preparation.
Signed-off-by: Dou Liyang <[email protected]>
---
arch/x86/kernel/apic/apic.c | 4 ++--
arch/x86/kernel/smpboot.c | 5 +++++
2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 3f22717..83b7c2e 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2355,8 +2355,6 @@ int __init apic_bsp_setup(bool upmode)
end_local_APIC_setup();
irq_remap_enable_fault_handling();
setup_IO_APIC();
- /* Setup local timer */
- x86_init.timers.setup_percpu_clockev();
return id;
}
@@ -2396,6 +2394,8 @@ int __init APIC_init_uniprocessor(void)
default_setup_apic_routing();
apic_bsp_setup(true);
+ /* Setup local timer */
+ x86_init.timers.setup_percpu_clockev();
return 0;
}
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index f04479a..93f0cda 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1333,6 +1333,8 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
case SMP_FORCE_UP:
disable_smp();
apic_bsp_setup(false);
+ /* Setup local timer */
+ x86_init.timers.setup_percpu_clockev();
return;
case SMP_OK:
break;
@@ -1347,6 +1349,9 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
default_setup_apic_routing();
cpu0_logical_apicid = apic_bsp_setup(false);
+ /* Setup local timer */
+ x86_init.timers.setup_percpu_clockev();
+
pr_info("CPU0: ");
print_cpu_info(&cpu_data(0));
--
2.5.5
Now, there are many switches in kernel which are used to determine
the final interrupt delivery mode, as shown below:
1) kconfig:
CONFIG_X86_64; CONFIG_X86_LOCAL_APIC; CONFIG_x86_IO_APIC
2) kernel option: disable_apic; skip_ioapic_setup
3) CPU Capability: boot_cpu_has(X86_FEATURE_APIC)
4) MP table: smp_found_config
5) ACPI: acpi_lapic; acpi_ioapic; nr_ioapic
These switches are disordered and scattered and there are also some
dependencies with each other. These make the code difficult to
maintain and read.
Construct a selector to unify them into a single function, then,
Use this selector to get an interrupt delivery mode directly.
Signed-off-by: Dou Liyang <[email protected]>
---
arch/x86/kernel/apic/apic.c | 59 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 59 insertions(+)
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 2d75faf..ad5a853 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1157,6 +1157,65 @@ void __init sync_Arb_IDs(void)
APIC_INT_LEVELTRIG | APIC_DM_INIT);
}
+enum apic_intr_mode {
+ APIC_PIC,
+ APIC_VIRTUAL_WIRE,
+ APIC_SYMMETRIC_IO,
+};
+
+static int __init apic_intr_mode_select(void)
+{
+ /* Check kernel option */
+ if (disable_apic) {
+ pr_info("APIC disabled via kernel command line\n");
+ return APIC_PIC;
+ }
+
+ /* Check BIOS */
+#ifdef CONFIG_X86_64
+ /* On 64-bit, the APIC must be integrated, Check local APIC only */
+ if (!boot_cpu_has(X86_FEATURE_APIC)) {
+ disable_apic = 1;
+ pr_info("APIC disabled by BIOS\n");
+ return APIC_PIC;
+ }
+#else
+ /*
+ * On 32-bit, check whether there is a separate chip or integrated
+ * APIC
+ */
+
+ /* Has a separate chip ? */
+ if (!boot_cpu_has(X86_FEATURE_APIC) && !smp_found_config) {
+ disable_apic = 1;
+
+ return APIC_PIC;
+ }
+
+ /* Has a local APIC ? */
+ if (!boot_cpu_has(X86_FEATURE_APIC) &&
+ APIC_INTEGRATED(boot_cpu_apic_version)) {
+ disable_apic = 1;
+ pr_err(FW_BUG "Local APIC %d not detected, force emulation\n",
+ boot_cpu_physical_apicid);
+
+ return APIC_PIC;
+ }
+#endif
+
+ /* Check MP table or ACPI MADT configuration */
+ if (!smp_found_config) {
+ disable_ioapic_support();
+
+ if (!acpi_lapic)
+ pr_info("APIC: ACPI MADT or MP tables are not detected\n");
+
+ return APIC_VIRTUAL_WIRE;
+ }
+
+ return APIC_SYMMETRIC_IO;
+}
+
/*
* An initial setup of the virtual wire mode.
*/
--
2.5.5
FYI, we noticed the following commit:
commit: f61a8e12b5972879f8decfe059e54c813dc4416b ("x86/time: Initialize interrupt mode behind timer init")
url: https://github.com/0day-ci/linux/commits/Dou-Liyang/Unify-the-interrupt-delivery-mode-and-do-its-setup-in-advance/20170705-124610
in testcase: will-it-scale
with following parameters:
nr_task: 50%
mode: process
test: writeseek3
cpufreq_governor: performance
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
+-------------------------------------------------------------------------------+------------+------------+
| | d021c73124 | f61a8e12b5 |
+-------------------------------------------------------------------------------+------------+------------+
| boot_successes | 0 | 6 |
| boot_failures | 2 | 4 |
| invoked_oom-killer:gfp_mask=0x | 2 | |
| Mem-Info | 2 | |
| Kernel_panic-not_syncing:Out_of_memory_and_no_killable_processes | 2 | |
| ACPI_Error:Table[DMAR]is_not_invalidated_during_early_boot_stage(#/tbxface-#) | 0 | 4 |
| WARNING:at_mm/early_ioremap.c:#check_early_ioremap_leak | 0 | 4 |
+-------------------------------------------------------------------------------+------------+------------+
kern :info : [ 0.005000] tsc: Fast TSC calibration using PIT
kern :info : [ 0.006000] tsc: Detected 2194.957 MHz processor
kern :info : [ 0.007000] Calibrating delay loop (skipped), value calculated using timer frequency.. 4389.91 BogoMIPS (lpj=2194957)
kern :info : [ 0.008002] pid_max: default: 90112 minimum: 704
kern :info : [ 0.009034] ACPI: Core revision 20170303
kern :err : [ 0.010002] ACPI Error: Table [DMAR] is not invalidated during early boot stage (20170303/tbxface-193)
kern :info : [ 0.125364] ACPI: 4 ACPI AML tables successfully acquired and loaded
kern :info : [ 0.126116] Security Framework initialized
kern :info : [ 0.127003] SELinux: Initializing.
kern :debug : [ 0.128012] SELinux: Starting in permissive mode
kern :info : [ 0.131850] Dentry cache hash table entries: 8388608 (order: 14, 67108864 bytes)
To reproduce:
git clone https://github.com/01org/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
Thanks,
Kernel Test Robot
Hi, Xiaolong
Thank you very much for testing. I have got the reason why the ACPI
error happened and will give a fix patch below.
Also cc ACPI maintainers..
At 07/08/2017 11:48 AM, kernel test robot wrote:
> FYI, we noticed the following commit:
>
> commit: f61a8e12b5972879f8decfe059e54c813dc4416b ("x86/time: Initialize interrupt mode behind timer init")
> url: https://github.com/0day-ci/linux/commits/Dou-Liyang/Unify-the-interrupt-delivery-mode-and-do-its-setup-in-advance/20170705-124610
>
>
> in testcase: will-it-scale
> with following parameters:
>
> nr_task: 50%
> mode: process
> test: writeseek3
> cpufreq_governor: performance
>
> test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
> test-url: https://github.com/antonblanchard/will-it-scale
>
>
> on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
>
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>
>
> +-------------------------------------------------------------------------------+------------+------------+
> | | d021c73124 | f61a8e12b5 |
> +-------------------------------------------------------------------------------+------------+------------+
> | boot_successes | 0 | 6 |
> | boot_failures | 2 | 4 |
> | invoked_oom-killer:gfp_mask=0x | 2 | |
> | Mem-Info | 2 | |
> | Kernel_panic-not_syncing:Out_of_memory_and_no_killable_processes | 2 | |
> | ACPI_Error:Table[DMAR]is_not_invalidated_during_early_boot_stage(#/tbxface-#) | 0 | 4 |
> | WARNING:at_mm/early_ioremap.c:#check_early_ioremap_leak | 0 | 4 |
> +-------------------------------------------------------------------------------+------------+------------+
>
>
>
> kern :info : [ 0.005000] tsc: Fast TSC calibration using PIT
> kern :info : [ 0.006000] tsc: Detected 2194.957 MHz processor
> kern :info : [ 0.007000] Calibrating delay loop (skipped), value calculated using timer frequency.. 4389.91 BogoMIPS (lpj=2194957)
> kern :info : [ 0.008002] pid_max: default: 90112 minimum: 704
> kern :info : [ 0.009034] ACPI: Core revision 20170303
> kern :err : [ 0.010002] ACPI Error: Table [DMAR] is not invalidated during early boot stage (20170303/tbxface-193)
Here is the ACPI error.
The reason:
-----------
As we know, Linux divides the ACPI table management into two stages:
1) the early stage:
the mapped ACPI tables is temporary and should be unmapped before the
early stage ends.
2) the late stage.
the mapped ACPI tables is permanent.
And Linux uses *acpi_permanent_mmap* to distinguish the early stage and
the late stage. the default of *acpi_permanent_mmap* is false and will
be set to true in *acpi_early_init()*, which means that Linux regards
*acpi_early_init()* as the dividing line.
Linux maps the ACPI DMAR table in the late stage, But this patch make
the mapping earlier in the early stage, so the ACPI error happened.
The solution:
-------------
There are two solution we can use:
1) After use the DMAR table, we unmap it, which likes following
acpi_get_table();
parse the DMAR table and use it...
acpi_put_table();
2) Invoke the *acpi_early_init()* earlier.
The 1) has influence on the initialization of IOMMU, not work well.
The 2) is suitable, and it also can make the change of interrupt trigger
type earlier than Linux enter into the final interrupt delivery
mode.
The patch:
----------
it will be added in my next version.
diff --git a/init/main.c b/init/main.c
index df58a41..7a09467 100644
--- a/init/main.c
+++ b/init/main.c
@@ -654,12 +654,12 @@ asmlinkage __visible void __init start_kernel(void)
kmemleak_init();
setup_per_cpu_pageset();
numa_policy_init();
+ acpi_early_init();
if (late_time_init)
late_time_init();
calibrate_delay();
pidmap_init();
anon_vma_init();
- acpi_early_init();
#ifdef CONFIG_X86
if (efi_enabled(EFI_RUNTIME_SERVICES))
efi_enter_virtual_mode();
BTY, cc Lee
I try to invoke acpi_early_init() earlier before
timekeeping_init() for accessing ACPI TAD device to set system clock.
Now the testing is OK, but There are a lot of operations between them,
I think we need more investigation.
Thanks,
dou.
> kern :info : [ 0.125364] ACPI: 4 ACPI AML tables successfully acquired and loaded
> kern :info : [ 0.126116] Security Framework initialized
> kern :info : [ 0.127003] SELinux: Initializing.
> kern :debug : [ 0.128012] SELinux: Starting in permissive mode
> kern :info : [ 0.131850] Dentry cache hash table entries: 8388608 (order: 14, 67108864 bytes)
>
>
> To reproduce:
>
> git clone https://github.com/01org/lkp-tests.git
> cd lkp-tests
> bin/lkp install job.yaml # job file is attached in this email
> bin/lkp run job.yaml
>
>
>
> Thanks,
> Kernel Test Robot
>
>