2015-07-02 04:17:16

by Xiao, Jin

[permalink] [raw]
Subject: [PATCH] smpboot.c: move setup_vector_irq after set_cpu_online

While running cpuhotplug + reboot test, I can easily hit a IPANIC on kernel 3.14.

[ 106.107851] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
[ 106.116702] IP:
[ 106.118490] [<ffffffff810044f6>] check_irq_vectors_for_cpu_disable+0x76/0x180
[ 106.126809] PGD 0
[ 106.129110] Oops: 0000 [#1] PREEMPT SMP
[ 106.133613] Modules linked in: atomisp_css2401a0_v21 lm3554 ov2722 hid_sensor_hub sens_col_core hid_heci_ish heci_ish heci vidt_driver rfkill_gpi o bcmdhd_pcie(O) cfg80211 ov5693 videobuf_vmalloc pn544_nfc(C) videobuf_core bt_lpm 6lowpan_iphc ip6table_raw iptable_raw atmel_mxt_ts
[ 106.161897] CPU: 2 PID: 18 Comm: migration/2 Tainted: G WC O 3.14.37-x86_64-L1-R467-g68db82c #1
[ 106.172323] Hardware name: Intel Corporation CHERRYVIEW C0 PLATFORM/Cherry Trail FFD, BIOS CH2TFFD.X64.0004.R83.1506171149 06/17/2015
[ 106.185758] task: ffff880077e98510 ti: ffff880077e9a000 task.ti: ffff880077e9a000
[ 106.194143] RIP: 0010:[<ffffffff810044f6>]
[ 106.198646] [<ffffffff810044f6>] check_irq_vectors_for_cpu_disable+0x76/0x180
[ 106.206969] RSP: 0000:ffff880077e9bcf8 EFLAGS: 00010046
[ 106.212926] RAX: 0000000000000000 RBX: 00000000000000d3 RCX: 0000000000000000
[ 106.220921] RDX: 0000000000000000 RSI: 0000000000000088 RDI: 0000000000000001
[ 106.228918] RBP: ffff880077e9bd28 R08: 0000000000000000 R09: ffff8800784008e0
[ 106.236915] R10: 000000000000000a R11: 0000000000000000 R12: 0000000000000015
[ 106.244911] R13: 0000000000000000 R14: 0000000000000088 R15: 0000000000000002
[ 106.252898] FS: 0000000000000000(0000) GS:ffff88007a300000(0000) knlGS:0000000000000000
[ 106.261961] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 106.268405] CR2: 0000000000000040 CR3: 000000006e03c000 CR4: 00000000001007e0
[ 106.276400] Last Branch Records:
[ 106.280052] to: [<ffffffff819f3840>] page_fault+0x0/0x80
[ 106.286335] from: [<ffffffff810044f6>] check_irq_vectors_for_cpu_disable+0x76/0x180
[ 106.295044] to: [<ffffffff810044f3>] check_irq_vectors_for_cpu_disable+0x73/0x180
[ 106.303749] from: [<ffffffff810da4c8>] irq_to_desc+0x18/0x20
[ 106.310227] to: [<ffffffff810da4c7>] irq_to_desc+0x17/0x20
[ 106.316701] from: [<ffffffff8137347c>] radix_tree_lookup+0xc/0x10
[ 106.323655] to: [<ffffffff8137347b>] radix_tree_lookup+0xb/0x10
[ 106.330616] from: [<ffffffff81373425>] radix_tree_lookup_element+0x55/0x90
[ 106.338450] to: [<ffffffff81373400>] radix_tree_lookup_element+0x30/0x90
[ 106.346274] from: [<ffffffff81373420>] radix_tree_lookup_element+0x50/0x90
[ 106.354108] to: [<ffffffff8137340b>] radix_tree_lookup_element+0x3b/0x90
[ 106.361934] from: [<ffffffff813733fd>] radix_tree_lookup_element+0x2d/0x90
[ 106.369750] to: [<ffffffff813733d0>] radix_tree_lookup_element+0x0/0x90
[ 106.377487] from: [<ffffffff81373476>] radix_tree_lookup+0x6/0x10
[ 106.384447] to: [<ffffffff81373470>] radix_tree_lookup+0x0/0x10
[ 106.391408] from: [<ffffffff810da4c2>] irq_to_desc+0x12/0x20
[ 106.397882] Stack:
[ 106.400140] 00000002810bb4f2 ffff88006e07bde8 ffff88006e07bd88 0000000000000000
[ 106.408551] ffffffff8110a801 0000000000000202 ffff880077e9bd40 ffffffff81030f62
[ 106.416967] 0000000000000282 ffff880077e9bd58 ffffffff819dd043 0000000000000003
[ 106.425375] Call Trace:
[ 106.428136] [<ffffffff8110a801>] ? multi_cpu_stop+0x1/0x110
[ 106.434475] [<ffffffff81030f62>] native_cpu_disable+0x12/0x40
[ 106.441018] [<ffffffff819dd043>] take_cpu_down+0x13/0x40
[ 106.447074] [<ffffffff8110a8c1>] multi_cpu_stop+0xc1/0x110
[ 106.453324] [<ffffffff8110a800>] ? cpu_stop_should_run+0x50/0x50
[ 106.460156] [<ffffffff8110aad8>] cpu_stopper_thread+0x78/0x150
[ 106.466795] [<ffffffff819f2bde>] ? _raw_spin_unlock_irq+0x1e/0x40
[ 106.473726] [<ffffffff810b33d7>] ? finish_task_switch+0x57/0xd0
[ 106.480464] [<ffffffff819edffe>] ? __schedule+0x37e/0x7b0
[ 106.486619] [<ffffffff810b20fd>] smpboot_thread_fn+0x17d/0x2b0
[ 106.493259] [<ffffffff810b1f80>] ? SyS_setgroups+0x160/0x160
[ 106.499704] [<ffffffff810ab1a4>] kthread+0xe4/0x100

We find latest upstream has commit d97eb8966c91f2c9d05f0a22eb89ed5b76d966d1 to
solve this IPANIC. But from the link http://lkml.kernel.org/r/[email protected],
it looks the root cause is not clear.

As it's easily to hit with the specific test case, we have more check and find
the IPANIC scenario as below.

cpu N (N = 1, or 2, or 3) cpu 0
native_cpu_up device_shutdown
=> do_boot_cpu
=> start_secondary
=> smp_callin
=> setup_vector_irq
=> __setup_vector_irq
=> free_msi_irqs
=> arch_teardown_msi_irqs
=> default_teardown_msi_irqs
=> arch_teardown_msi_irq
=> native_teardown_msi_irq
=> destroy_irq
=> __clear_irq_vector
=> set_cpu_online

The cpu still is not online when clear irq vector, it makes the irq number remain
in irq vector after free_msi_irqs. Next native_cpu_disable() will hit NULL pointer
when check irq vector.

The patch move setup_vector_irq after set_cpu_online.

Signed-off-by: xiao jin <[email protected]>
---
arch/x86/kernel/smpboot.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 50e547e..f7d5d79 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -172,11 +172,6 @@ static void smp_callin(void)
apic_ap_setup();

/*
- * Need to setup vector mappings before we enable interrupts.
- */
- setup_vector_irq(smp_processor_id());
-
- /*
* Save our processor parameters. Note: this information
* is needed for clock calibration.
*/
@@ -257,6 +252,11 @@ static void notrace start_secondary(void *unused)
cpu_set_state_online(smp_processor_id());
x86_platform.nmi_init();

+ /*
+ * Need to setup vector mappings before we enable interrupts.
+ */
+ setup_vector_irq(smp_processor_id());
+
/* enable local interrupts */
local_irq_enable();

--
1.7.9.5


2015-07-02 06:52:28

by Jörg Rödel

[permalink] [raw]
Subject: Re: [PATCH] smpboot.c: move setup_vector_irq after set_cpu_online

Hi Jin,

On Thu, Jul 02, 2015 at 12:24:34PM +0800, xiao jin wrote:
> [ 106.107851] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
> [ 106.116702] IP:
> [ 106.118490] [<ffffffff810044f6>] check_irq_vectors_for_cpu_disable+0x76/0x180

This might be the same issue I fixed with:

d97eb8966c91 x86/irq: Check for valid irq descriptor in check_irq_vectors_for_cpu_disable()

Can you try whether applying this patch on your kernel fixes your
issue?


Thanks,

Joerg

2015-07-02 07:02:32

by Xiao, Jin

[permalink] [raw]
Subject: Re: [PATCH] smpboot.c: move setup_vector_irq after set_cpu_online

Hi Joerg,

On 7/2/2015 2:52 PM, Joerg Roedel wrote:
> Hi Jin,
>
> On Thu, Jul 02, 2015 at 12:24:34PM +0800, xiao jin wrote:
>> [ 106.107851] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
>> [ 106.116702] IP:
>> [ 106.118490] [<ffffffff810044f6>] check_irq_vectors_for_cpu_disable+0x76/0x180
> This might be the same issue I fixed with:
>
> d97eb8966c91 x86/irq: Check for valid irq descriptor in check_irq_vectors_for_cpu_disable()
>
> Can you try whether applying this patch on your kernel fixes your
> issue?
>
Yes, I agree d97eb8966c91 fix the IPANIC. I just notice a description
from http://lkml.kernel.org/r/[email protected].

I wish to share the debug result and root cause from my side.

> Thanks,
>
> Joerg
>

Thanks.

2015-07-02 09:20:40

by Yanmin Zhang

[permalink] [raw]
Subject: Re: [PATCH] smpboot.c: move setup_vector_irq after set_cpu_online

On 2015/7/2 15:02, Xiao, Jin wrote:
> Hi Joerg,
>
> On 7/2/2015 2:52 PM, Joerg Roedel wrote:
>> Hi Jin,
>>
>> On Thu, Jul 02, 2015 at 12:24:34PM +0800, xiao jin wrote:
>>> [ 106.107851] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
>>> [ 106.116702] IP:
>>> [ 106.118490] [<ffffffff810044f6>] check_irq_vectors_for_cpu_disable+0x76/0x180
>> This might be the same issue I fixed with:
>>
>> d97eb8966c91 x86/irq: Check for valid irq descriptor in check_irq_vectors_for_cpu_disable()
>>
>> Can you try whether applying this patch on your kernel fixes your
>> issue?
>>
> Yes, I agree d97eb8966c91 fix the IPANIC. I just notice a description from http://lkml.kernel.org/r/[email protected].
>
> I wish to share the debug result and root cause from my side.

commit d97eb8966c91f2c9d05f0a22eb89ed5b76d966d1 looks like a workaround.

Could Jin's patch be merged also?
free_msi_irqs does have a race with smp_callin=>..=>__setup_vector_irq.

2015-07-02 12:50:49

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH] smpboot.c: move setup_vector_irq after set_cpu_online

On Thu, 2 Jul 2015, Zhang, Yanmin wrote:
> On 2015/7/2 15:02, Xiao, Jin wrote:
> > Hi Joerg,
> >
> > On 7/2/2015 2:52 PM, Joerg Roedel wrote:
> >> Hi Jin,
> >>
> >> On Thu, Jul 02, 2015 at 12:24:34PM +0800, xiao jin wrote:
> >>> [ 106.107851] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
> >>> [ 106.116702] IP:
> >>> [ 106.118490] [<ffffffff810044f6>] check_irq_vectors_for_cpu_disable+0x76/0x180
> >> This might be the same issue I fixed with:
> >>
> >> d97eb8966c91 x86/irq: Check for valid irq descriptor in check_irq_vectors_for_cpu_disable()
> >>
> >> Can you try whether applying this patch on your kernel fixes your
> >> issue?
> >>
> > Yes, I agree d97eb8966c91 fix the IPANIC. I just notice a description from http://lkml.kernel.org/r/[email protected].
> >
> > I wish to share the debug result and root cause from my side.
>
> commit d97eb8966c91f2c9d05f0a22eb89ed5b76d966d1 looks like a workaround.
>
> Could Jin's patch be merged also?
> free_msi_irqs does have a race with smp_callin=>..=>__setup_vector_irq.

The whole vector stuff is racy versus cpu hotplug and Jins patch
merily addresses a small part of it and by doing that it breaks stuff
as well.

With that patch we move the vector setup after marking the cpu online,
which is wrong because the vector array on that cpu is not up to date
until we call __setup_vector_irq(). Proper patch below.

We still have an issue in the cpu_disable() patch, but I haven't yet
wrapped my head around it completely.

Thanks,

tglx
---
diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index 28eba2d38b15..f813261d9740 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -409,12 +409,6 @@ static void __setup_vector_irq(int cpu)
int irq, vector;
struct apic_chip_data *data;

- /*
- * vector_lock will make sure that we don't run into irq vector
- * assignments that might be happening on another cpu in parallel,
- * while we setup our initial vector to irq mappings.
- */
- raw_spin_lock(&vector_lock);
/* Mark the inuse vectors */
for_each_active_irq(irq) {
data = apic_chip_data(irq_get_irq_data(irq));
@@ -436,16 +430,16 @@ static void __setup_vector_irq(int cpu)
if (!cpumask_test_cpu(cpu, data->domain))
per_cpu(vector_irq, cpu)[vector] = VECTOR_UNDEFINED;
}
- raw_spin_unlock(&vector_lock);
}

/*
- * Setup the vector to irq mappings.
+ * Setup the vector to irq mappings. Must be called with vector_lock held.
*/
void setup_vector_irq(int cpu)
{
int irq;

+ lockdep_assert_held(&vector_lock);
/*
* On most of the platforms, legacy PIC delivers the interrupts on the
* boot cpu. But there are certain platforms where PIC interrupts are
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 8add66b22f33..67dc638035d3 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -171,11 +171,6 @@ static void smp_callin(void)
apic_ap_setup();

/*
- * Need to setup vector mappings before we enable interrupts.
- */
- setup_vector_irq(smp_processor_id());
-
- /*
* Save our processor parameters. Note: this information
* is needed for clock calibration.
*/
@@ -246,11 +241,13 @@ static void notrace start_secondary(void *unused)
#endif

/*
- * We need to hold vector_lock so there the set of online cpus
- * does not change while we are assigning vectors to cpus. Holding
- * this lock ensures we don't half assign or remove an irq from a cpu.
+ * Lock vector_lock and initialize the vectors on this cpu
+ * before setting the cpu online. We must set it online with
+ * vector_lock held to prevent a concurrent setup/teardown
+ * from seeing a half valid vector space.
*/
lock_vector_lock();
+ setup_vector_irq(smp_processor_id());
set_cpu_online(smp_processor_id(), true);
unlock_vector_lock();
cpu_set_state_online(smp_processor_id());

2015-07-03 02:02:48

by Xiao, Jin

[permalink] [raw]
Subject: Re: [PATCH] smpboot.c: move setup_vector_irq after set_cpu_online

Hi tglx,

On 7/2/2015 8:50 PM, Thomas Gleixner wrote:
>
> The whole vector stuff is racy versus cpu hotplug and Jins patch
> merily addresses a small part of it and by doing that it breaks stuff
> as well.
>
> With that patch we move the vector setup after marking the cpu online,
> which is wrong because the vector array on that cpu is not up to date
> until we call __setup_vector_irq(). Proper patch below.
>
> We still have an issue in the cpu_disable() patch, but I haven't yet
> wrapped my head around it completely.
>
> Thanks,
>
> tglx
>

Your patch is better. Thanks.

Jin