Hi Thomas,
We started seeing new issues in our net-device daily regression tests.
They are related to patch [1] introduced in kernel 4.15-rc1.
We frequently see a warning in dmesg [2]. Repro is not consistent, we
tried to narrow it down to a smaller run but couldn't.
In addition, sometimes (less frequent) the warning is followed by a
panic [3].
I can share all needed details to help analyze this bug.
If you suspect specific flows, we can do an educated narrow down.
Regards,
Tariq
[1] 2f75d9e1c905 genirq: Implement bitmap matrix allocator
[2]
[ 8664.868564] WARNING: CPU: 5 PID: 0 at kernel/irq/matrix.c:370
irq_matrix_free+0x30/0xd0
[ 8664.891905] Modules linked in: bonding rdma_ucm ib_ucm rdma_cm iw_cm
ib_ipoib ib_cm ib_uverbs ib_umad mlx5_ib mlx5_core mlxfw mlx4_ib ib_core
mlx4_en mlx4_core devlink macvlan vxlan ip6_udp_tunnel udp_tunnel 8021q
garp mrp stp llc mst_pciconf(OE) nfsv3 nfs fscache netconsole dm_mirror
dm_region_hash dm_log dm_mod dax kvm_intel kvm irqbypass pcspkr
i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables
ata_generic cirrus drm_kms_helper syscopyarea sysfillrect pata_acpi
sysimgblt fb_sys_fops ttm drm e1000 serio_raw virtio_console i2c_core
floppy ata_piix [last unloaded: mst_pci]
[ 8664.905117] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G OE
4.15.0-for-upstream-perf-2018-02-08_07-00-42-18 #1
[ 8664.907613] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS Ubuntu-1.8.2-1ubuntu2 04/01/2014
[ 8664.910144] RIP: 0010:irq_matrix_free+0x30/0xd0
[ 8664.912624] RSP: 0018:ffff88023fd43f70 EFLAGS: 00010002
[ 8664.915149] RAX: 0000000000026318 RBX: ffff880157a77ec0 RCX:
0000000000000000
[ 8664.917679] RDX: 0000000000000001 RSI: 0000000000000001 RDI:
ffff880237038400
[ 8664.920244] RBP: ffff880237038400 R08: 00000000e8ba3c69 R09:
0000000000000000
[ 8664.922813] R10: 00000000000003ff R11: 0000000000000ad9 R12:
ffff88023fc40000
[ 8664.925345] R13: 0000000000000000 R14: 0000000000000001 R15:
000000000000002b
[ 8664.927872] FS: 0000000000000000(0000) GS:ffff88023fd40000(0000)
knlGS:0000000000000000
[ 8664.930455] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8664.932996] CR2: 0000000000f2c030 CR3: 000000000220a000 CR4:
00000000000006e0
[ 8664.935557] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 8664.938051] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[ 8664.940541] Call Trace:
[ 8664.942980] <IRQ>
[ 8664.945399] free_moved_vector+0x4e/0x100
[ 8664.947787] smp_irq_move_cleanup_interrupt+0x89/0x9e
[ 8664.950134] irq_move_cleanup_interrupt+0x95/0xa0
[ 8664.952480] </IRQ>
[ 8664.954800] RIP: 0010:native_safe_halt+0x2/0x10
[ 8664.957052] RSP: 0018:ffffc90000ccfee0 EFLAGS: 00000246 ORIG_RAX:
ffffffffffffffdf
[ 8664.959186] RAX: ffffffff818ab6e0 RBX: ffff880236233f00 RCX:
0000000000000000
[ 8664.960499] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
0000000000000000
[ 8664.961774] RBP: 0000000000000005 R08: 0000000000000000 R09:
0000000000000000
[ 8664.963048] R10: 00000000000003ff R11: 0000000000000ad9 R12:
ffff880236233f00
[ 8664.964345] R13: ffff880236233f00 R14: 0000000000000000 R15:
0000000000000000
[ 8664.965579] ? __cpuidle_text_start+0x8/0x8
[ 8664.966808] default_idle+0x18/0xf0
[ 8664.968040] do_idle+0x150/0x1d0
[ 8664.969249] cpu_startup_entry+0x19/0x20
[ 8664.970477] start_secondary+0x133/0x170
[ 8664.971700] secondary_startup_64+0xa5/0xb0
[ 8664.972909] Code: 41 56 41 89 f6 41 55 41 89 d5 89 f2 41 54 4c 8b 24
d5 60 24 18 82 55 48 89 fd 53 48 8b 47 28 44 39 6f 04 77 06 44 3b 6f 08
72 0b <0f> ff 5b 5d 41 5c 41 5d 41 5e c3 49 01 c4 41 80 7c 24 0c 00 74
[ 8664.975420] ---[ end trace 8be4ba51cd83f4bd ]---
[3]
[ 8943.038767] BUG: unable to handle kernel paging request at
000000037a6b561b
[ 8943.040114] IP: free_moved_vector+0x61/0x100
[ 8943.041531] PGD 0 P4D 0
[ 8943.042855] Oops: 0002 [#1] SMP PTI
[ 8943.044128] Modules linked in: bonding rdma_ucm ib_ucm rdma_cm iw_cm
ib_ipoib ib_cm ib_uverbs ib_umad mlx5_ib mlx5_core mlxfw mlx4_ib ib_core
mlx4_en mlx4_core devlink iptable_filter fuse btrfs xor zstd_decompress
zstd_compress xxhash raid6_pq vfat msdos fat binfmt_misc bridge macvlan
vxlan ip6_udp_tunnel udp_tunnel 8021q garp mrp stp llc mst_pciconf(OE)
nfsv3 nfs fscache netconsole dm_mirror dm_region_hash dm_log dm_mod dax
kvm_intel kvm irqbypass pcspkr i2c_piix4 nfsd auth_rpcgss nfs_acl lockd
grace sunrpc ip_tables ata_generic cirrus drm_kms_helper syscopyarea
sysfillrect pata_acpi sysimgblt fb_sys_fops ttm drm e1000 serio_raw
virtio_console i2c_core floppy ata_piix [last unloaded: mst_pci]
[ 8943.052038] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G W OE
4.15.0-for-upstream-perf-2018-02-08_07-00-42-18 #1
[ 8943.053350] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS Ubuntu-1.8.2-1ubuntu2 04/01/2014
[ 8943.054654] RIP: 0010:free_moved_vector+0x61/0x100
[ 8943.055940] RSP: 0018:ffff88023fd43fa0 EFLAGS: 00010007
[ 8943.057233] RAX: 000000037a6b561b RBX: ffff880157a77ec0 RCX:
0000000000000001
[ 8943.058506] RDX: 00000000000155a8 RSI: 00000000000155a8 RDI:
ffff880237038400
[ 8943.059784] RBP: ffff880157a77ec0 R08: 00000000e8ba3c69 R09:
0000000000000000
[ 8943.061051] R10: 0000000000000000 R11: 0000000000000000 R12:
000000007f0c0001
[ 8943.062462] R13: 00000000000155a8 R14: 0000000000000001 R15:
0000000000cc620d
[ 8943.063726] FS: 0000000000000000(0000) GS:ffff88023fd40000(0000)
knlGS:0000000000000000
[ 8943.064993] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8943.066253] CR2: 000000037a6b561b CR3: 000000010badc000 CR4:
00000000000006e0
[ 8943.067522] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 8943.068771] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[ 8943.070029] Call Trace:
[ 8943.071273] <IRQ>
[ 8943.072503] smp_irq_move_cleanup_interrupt+0x89/0x9e
[ 8943.073794] irq_move_cleanup_interrupt+0x95/0xa0
[ 8943.075048] </IRQ>
[ 8943.076288] RIP: 0010:native_safe_halt+0x2/0x10
[ 8943.077530] RSP: 0018:ffffc90000ccfee0 EFLAGS: 00000246 ORIG_RAX:
ffffffffffffffdf
[ 8943.078795] RAX: ffffffff818ab6e0 RBX: ffff880236233f00 RCX:
0000000000000000
[ 8943.080077] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
0000000000000000
[ 8943.081435] RBP: 0000000000000005 R08: 00000000e8ba3c69 R09:
0000000000000000
[ 8943.082683] R10: 0000000000000000 R11: 0000000000000000 R12:
ffff880236233f00
[ 8943.083932] R13: ffff880236233f00 R14: 0000000000000000 R15:
0000000000000000
[ 8943.085185] ? __cpuidle_text_start+0x8/0x8
[ 8943.086438] default_idle+0x18/0xf0
[ 8943.087694] do_idle+0x150/0x1d0
[ 8943.088921] cpu_startup_entry+0x19/0x20
[ 8943.090163] start_secondary+0x133/0x170
[ 8943.091402] secondary_startup_64+0xa5/0xb0
[ 8943.092659] Code: 44 00 00 48 8b 3d c8 f7 9f 01 44 89 f1 44 89 e2 44
89 ee e8 e2 05 0b 00 48 c7 c0 20 50 01 00 4a 8d 04 e0 4a 03 04 ed 60 24
18 82 <48> c7 00 00 00 00 00 48 8b 45 28 48 85 c0 74 20 48 8b 55 20 48
[ 8943.095371] RIP: free_moved_vector+0x61/0x100 RSP: ffff88023fd43fa0
[ 8943.096685] CR2: 000000037a6b561b
[ 8943.098120] ---[ end trace 8be4ba51cd83f4c0 ]---
[ 8943.099387] Kernel panic - not syncing: Fatal exception in interrupt
[ 8943.101170] Kernel Offset: disabled
[ 8943.102410] ---[ end Kernel panic - not syncing: Fatal exception in
interrupt
On Tue, 20 Feb 2018, Tariq Toukan wrote:
> Hi Thomas,
>
> We started seeing new issues in our net-device daily regression tests.
> They are related to patch [1] introduced in kernel 4.15-rc1.
>
> We frequently see a warning in dmesg [2]. Repro is not consistent, we tried to
> narrow it down to a smaller run but couldn't.
>
> In addition, sometimes (less frequent) the warning is followed by a panic [3].
Which might be just a consequence of the issue which triggers the warning.
> I can share all needed details to help analyze this bug.
> If you suspect specific flows, we can do an educated narrow down.
> [2]
> [ 8664.868564] WARNING: CPU: 5 PID: 0 at kernel/irq/matrix.c:370 irq_matrix_free+0x30/0xd0
> [ 8664.910144] RIP: 0010:irq_matrix_free+0x30/0xd0
> [ 8664.912624] RSP: 0018:ffff88023fd43f70 EFLAGS: 00010002
> [ 8664.915149] RAX: 0000000000026318 RBX: ffff880157a77ec0 RCX: 0000000000000000
> [ 8664.917679] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff880237038400
> [ 8664.920244] RBP: ffff880237038400 R08: 00000000e8ba3c69 R09: 0000000000000000
> [ 8664.922813] R10: 00000000000003ff R11: 0000000000000ad9 R12: ffff88023fc40000
> [ 8664.925345] R13: 0000000000000000 R14: 0000000000000001 R15: 000000000000002b
> [ 8664.927872] FS: 0000000000000000(0000) GS:ffff88023fd40000(0000) knlGS:0000000000000000
> [ 8664.930455] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 8664.932996] CR2: 0000000000f2c030 CR3: 000000000220a000 CR4: 00000000000006e0
> [ 8664.935557] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 8664.938051] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 8664.940541] Call Trace:
> [ 8664.942980] <IRQ>
> [ 8664.945399] free_moved_vector+0x4e/0x100
> [ 8664.947787] smp_irq_move_cleanup_interrupt+0x89/0x9e
> [ 8664.950134] irq_move_cleanup_interrupt+0x95/0xa0
> [ 8664.952480] </IRQ>
So that's:
if (WARN_ON_ONCE(bit < m->alloc_start || bit >= m->alloc_end))
return;
and the disassembly of
> [ 8664.972909] Code: 41 56 41 89 f6 41 55 41 89 d5 89 f2 41 54 4c 8b 24 d5 60
> 24 18 82 55 48 89 fd 53 48 8b 47 28 44 39 6f 04 77 06 44 3b 6f 08 72 0b <0f>
> ff 5b 5d 41 5c 41 5d 41 5e c3 49 01 c4 41 80 7c 24 0c 00 74
gives:
1f: 44 39 6f 04 cmp %r13d,0x4(%rdi)
23: 77 06 ja 0x2b
25: 44 3b 6f 08 cmp 0x8(%rdi),%r13d
29: 72 0b jb 0x36
2b:* 0f ff (bad) <-- trapping instruction
So the vector to release should be in r13d, which is 0. /me scratches head.
> [ 8943.038767] BUG: unable to handle kernel paging request at 000000037a6b561b
> [ 8943.040114] IP: free_moved_vector+0x61/0x100
> [ 8943.054654] RIP: 0010:free_moved_vector+0x61/0x100
> [ 8943.055940] RSP: 0018:ffff88023fd43fa0 EFLAGS: 00010007
> [ 8943.057233] RAX: 000000037a6b561b RBX: ffff880157a77ec0 RCX: 0000000000000001
> [ 8943.058506] RDX: 00000000000155a8 RSI: 00000000000155a8 RDI: ffff880237038400
> [ 8943.059784] RBP: ffff880157a77ec0 R08: 00000000e8ba3c69 R09: 0000000000000000
> [ 8943.061051] R10: 0000000000000000 R11: 0000000000000000 R12: 000000007f0c0001
> [ 8943.062462] R13: 00000000000155a8 R14: 0000000000000001 R15: 0000000000cc620d
> [ 8943.063726] FS: 0000000000000000(0000) GS:ffff88023fd40000(0000) knlGS:0000000000000000
> [ 8943.064993] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 8943.066253] CR2: 000000037a6b561b CR3: 000000010badc000 CR4: 00000000000006e0
> [ 8943.067522] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 8943.068771] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 8943.070029] Call Trace:
> [ 8943.071273] <IRQ>
> [ 8943.072503] smp_irq_move_cleanup_interrupt+0x89/0x9e
> [ 8943.073794] irq_move_cleanup_interrupt+0x95/0xa0
> [ 8943.075048] </IRQ>
...
> [ 8943.092659] Code: 44 00 00 48 8b 3d c8 f7 9f 01 44 89 f1 44 89 e2 44 89 ee
> e8 e2 05 0b 00 48 c7 c0 20 50 01 00 4a 8d 04 e0 4a 03 04 ed 60 24 18 82 <48>
> c7 00 00 00 00 00 48 8b 45 28 48 85 c0 74 20 48 8b 55 20 48
17:* 48 c7 00 00 00 00 00 movq $0x0,(%rax) <-- trapping instruction
I can't tell exactly which part of free_moved_vector() that is and which
vector it tries to free, but it's more or less the same crap, just with a
fatal vector number.
Is there CPU hotplugging in play?
I'll come back to you tomorrow with a plan how to debug that after staring
into the code some more.
Thanks,
tglx
On Tue, 20 Feb 2018, Thomas Gleixner wrote:
> On Tue, 20 Feb 2018, Tariq Toukan wrote:
>
> Is there CPU hotplugging in play?
>
> I'll come back to you tomorrow with a plan how to debug that after staring
> into the code some more.
Do you have a rough idea what the test case is doing?
Thanks,
tglx
On 20/02/2018 8:18 PM, Thomas Gleixner wrote:
> On Tue, 20 Feb 2018, Thomas Gleixner wrote:
>> On Tue, 20 Feb 2018, Tariq Toukan wrote:
>>
>> Is there CPU hotplugging in play?
No.
>>
>> I'll come back to you tomorrow with a plan how to debug that after staring
>> into the code some more.
>
> Do you have a rough idea what the test case is doing?
>
It appears arbitrarily in different flows, like sending traffic or
interface configuration changes.
Thanks,
Tariq
On Wed, 21 Feb 2018, Tariq Toukan wrote:
> On 20/02/2018 8:18 PM, Thomas Gleixner wrote:
> > On Tue, 20 Feb 2018, Thomas Gleixner wrote:
> > > On Tue, 20 Feb 2018, Tariq Toukan wrote:
> > >
> > > Is there CPU hotplugging in play?
>
> No.
Ok.
> > >
> > > I'll come back to you tomorrow with a plan how to debug that after staring
> > > into the code some more.
> >
> > Do you have a rough idea what the test case is doing?
> >
>
> It appears arbitrarily in different flows, like sending traffic or interface
> configuration changes.
Hmm. Looks like memory corruption, but I can't pin point it.
Find below a debug patch which should prevent the crash and might give us
some insight into the type of corruption.
Please enable the irq_matrix and vector allocation trace points.
echo 1 >/sys/kernel/debug/tracing/events/irq_matrix/enable
echo 1 >/sys/kernel/debug/tracing/events/irq_vectors/vector*/enable
When the problem triggers the bogus vector is printed and the trace is
frozen. Please provide dmesg and the tracebuffer output.
Thanks,
tglx
8<--------------
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -822,6 +822,12 @@ static void free_moved_vector(struct api
unsigned int cpu = apicd->prev_cpu;
bool managed = apicd->is_managed;
+ if (vector < FIRST_EXTERNAL_VECTOR || vector >= FIRST_SYSTEM_VECTOR) {
+ tracing_off();
+ pr_err("Trying to clear prev_vector: %u\n", vector);
+ goto out;
+ }
+
/*
* This should never happen. Managed interrupts are not
* migrated except on CPU down, which does not involve the
@@ -833,6 +839,7 @@ static void free_moved_vector(struct api
trace_vector_free_moved(apicd->irq, cpu, vector, managed);
irq_matrix_free(vector_matrix, cpu, vector, managed);
per_cpu(vector_irq, cpu)[vector] = VECTOR_UNUSED;
+out:
hlist_del_init(&apicd->clist);
apicd->prev_vector = 0;
apicd->move_in_progress = 0;
On 22/02/2018 11:38 PM, Thomas Gleixner wrote:
> On Wed, 21 Feb 2018, Tariq Toukan wrote:
>> On 20/02/2018 8:18 PM, Thomas Gleixner wrote:
>>> On Tue, 20 Feb 2018, Thomas Gleixner wrote:
>>>> On Tue, 20 Feb 2018, Tariq Toukan wrote:
>>>>
>>>> Is there CPU hotplugging in play?
>>
>> No.
>
> Ok.
>
>>>>
>>>> I'll come back to you tomorrow with a plan how to debug that after staring
>>>> into the code some more.
>>>
>>> Do you have a rough idea what the test case is doing?
>>>
>>
>> It appears arbitrarily in different flows, like sending traffic or interface
>> configuration changes.
>
> Hmm. Looks like memory corruption, but I can't pin point it.
>
> Find below a debug patch which should prevent the crash and might give us
> some insight into the type of corruption.
>
> Please enable the irq_matrix and vector allocation trace points.
>
> echo 1 >/sys/kernel/debug/tracing/events/irq_matrix/enable
> echo 1 >/sys/kernel/debug/tracing/events/irq_vectors/vector*/enable
>
> When the problem triggers the bogus vector is printed and the trace is
> frozen. Please provide dmesg and the tracebuffer output.
>
OK, I'm temporarily adding this to the regression internal branch. I'll
let you know once we have a repro.
Thanks,
Tariq
> Thanks,
>
> tglx
>
> 8<--------------
> --- a/arch/x86/kernel/apic/vector.c
> +++ b/arch/x86/kernel/apic/vector.c
> @@ -822,6 +822,12 @@ static void free_moved_vector(struct api
> unsigned int cpu = apicd->prev_cpu;
> bool managed = apicd->is_managed;
>
> + if (vector < FIRST_EXTERNAL_VECTOR || vector >= FIRST_SYSTEM_VECTOR) {
> + tracing_off();
> + pr_err("Trying to clear prev_vector: %u\n", vector);
> + goto out;
> + }
> +
> /*
> * This should never happen. Managed interrupts are not
> * migrated except on CPU down, which does not involve the
> @@ -833,6 +839,7 @@ static void free_moved_vector(struct api
> trace_vector_free_moved(apicd->irq, cpu, vector, managed);
> irq_matrix_free(vector_matrix, cpu, vector, managed);
> per_cpu(vector_irq, cpu)[vector] = VECTOR_UNUSED;
> +out:
> hlist_del_init(&apicd->clist);
> apicd->prev_vector = 0;
> apicd->move_in_progress = 0;
>
Hi Thomas, Tariq,
2018-02-20 18:11 GMT+00:00 Thomas Gleixner <[email protected]>:
> On Tue, 20 Feb 2018, Tariq Toukan wrote:
>
>> Hi Thomas,
>>
>> We started seeing new issues in our net-device daily regression tests.
>> They are related to patch [1] introduced in kernel 4.15-rc1.
>>
>> We frequently see a warning in dmesg [2]. Repro is not consistent, we tried to
>> narrow it down to a smaller run but couldn't.
>>
>> In addition, sometimes (less frequent) the warning is followed by a panic [3].
>
> Which might be just a consequence of the issue which triggers the warning.
>
>> I can share all needed details to help analyze this bug.
>> If you suspect specific flows, we can do an educated narrow down.
>
>> [2]
>> [ 8664.868564] WARNING: CPU: 5 PID: 0 at kernel/irq/matrix.c:370 irq_matrix_free+0x30/0xd0
>> [ 8664.910144] RIP: 0010:irq_matrix_free+0x30/0xd0
>> [ 8664.912624] RSP: 0018:ffff88023fd43f70 EFLAGS: 00010002
>> [ 8664.915149] RAX: 0000000000026318 RBX: ffff880157a77ec0 RCX: 0000000000000000
>> [ 8664.917679] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff880237038400
>> [ 8664.920244] RBP: ffff880237038400 R08: 00000000e8ba3c69 R09: 0000000000000000
>> [ 8664.922813] R10: 00000000000003ff R11: 0000000000000ad9 R12: ffff88023fc40000
>> [ 8664.925345] R13: 0000000000000000 R14: 0000000000000001 R15: 000000000000002b
>> [ 8664.927872] FS: 0000000000000000(0000) GS:ffff88023fd40000(0000) knlGS:0000000000000000
>> [ 8664.930455] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 8664.932996] CR2: 0000000000f2c030 CR3: 000000000220a000 CR4: 00000000000006e0
>> [ 8664.935557] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [ 8664.938051] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [ 8664.940541] Call Trace:
>> [ 8664.942980] <IRQ>
>> [ 8664.945399] free_moved_vector+0x4e/0x100
>> [ 8664.947787] smp_irq_move_cleanup_interrupt+0x89/0x9e
>> [ 8664.950134] irq_move_cleanup_interrupt+0x95/0xa0
>> [ 8664.952480] </IRQ>
>
> So that's:
>
> if (WARN_ON_ONCE(bit < m->alloc_start || bit >= m->alloc_end))
> return;
>
> and the disassembly of
>
>> [ 8664.972909] Code: 41 56 41 89 f6 41 55 41 89 d5 89 f2 41 54 4c 8b 24 d5 60
>> 24 18 82 55 48 89 fd 53 48 8b 47 28 44 39 6f 04 77 06 44 3b 6f 08 72 0b <0f>
>> ff 5b 5d 41 5c 41 5d 41 5e c3 49 01 c4 41 80 7c 24 0c 00 74
>
> gives:
>
> 1f: 44 39 6f 04 cmp %r13d,0x4(%rdi)
> 23: 77 06 ja 0x2b
> 25: 44 3b 6f 08 cmp 0x8(%rdi),%r13d
> 29: 72 0b jb 0x36
> 2b:* 0f ff (bad) <-- trapping instruction
>
> So the vector to release should be in r13d, which is 0. /me scratches head.
>
>> [ 8943.038767] BUG: unable to handle kernel paging request at 000000037a6b561b
>> [ 8943.040114] IP: free_moved_vector+0x61/0x100
>> [ 8943.054654] RIP: 0010:free_moved_vector+0x61/0x100
>> [ 8943.055940] RSP: 0018:ffff88023fd43fa0 EFLAGS: 00010007
>> [ 8943.057233] RAX: 000000037a6b561b RBX: ffff880157a77ec0 RCX: 0000000000000001
>> [ 8943.058506] RDX: 00000000000155a8 RSI: 00000000000155a8 RDI: ffff880237038400
>> [ 8943.059784] RBP: ffff880157a77ec0 R08: 00000000e8ba3c69 R09: 0000000000000000
>> [ 8943.061051] R10: 0000000000000000 R11: 0000000000000000 R12: 000000007f0c0001
>> [ 8943.062462] R13: 00000000000155a8 R14: 0000000000000001 R15: 0000000000cc620d
>> [ 8943.063726] FS: 0000000000000000(0000) GS:ffff88023fd40000(0000) knlGS:0000000000000000
>> [ 8943.064993] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 8943.066253] CR2: 000000037a6b561b CR3: 000000010badc000 CR4: 00000000000006e0
>> [ 8943.067522] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [ 8943.068771] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [ 8943.070029] Call Trace:
>> [ 8943.071273] <IRQ>
>> [ 8943.072503] smp_irq_move_cleanup_interrupt+0x89/0x9e
>> [ 8943.073794] irq_move_cleanup_interrupt+0x95/0xa0
>> [ 8943.075048] </IRQ>
> ...
>> [ 8943.092659] Code: 44 00 00 48 8b 3d c8 f7 9f 01 44 89 f1 44 89 e2 44 89 ee
>> e8 e2 05 0b 00 48 c7 c0 20 50 01 00 4a 8d 04 e0 4a 03 04 ed 60 24 18 82 <48>
>> c7 00 00 00 00 00 48 8b 45 28 48 85 c0 74 20 48 8b 55 20 48
>
> 17:* 48 c7 00 00 00 00 00 movq $0x0,(%rax) <-- trapping instruction
>
> I can't tell exactly which part of free_moved_vector() that is and which
> vector it tries to free, but it's more or less the same crap, just with a
> fatal vector number.
>
> Is there CPU hotplugging in play?
>
> I'll come back to you tomorrow with a plan how to debug that after staring
> into the code some more.
I'm not entirely sure that it's the same fault, but at least the backtrace
looks similar.
I've got it in VM on the last master: v4.17-rc5-110-g2c71d338bef2
Let me know if you need .config or CPU/qemu details or anything.
I'm not sure if it'll reproduce, but it might still be worth pasting here:
[root@localhost self]# [ 125.240084] WARNING: CPU: 1 PID: 2198 at
kernel/irq/matrix.c:371 irq_matrix_free+0x20/0xc0
[ 125.242306] Modules linked in:
[ 125.242709] CPU: 1 PID: 2198 Comm: sshd Not tainted 4.17.0-rc5+ #11
[ 125.242949] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.10.2-1.fc26 04/01/2014
[ 125.243342] RIP: 0010:irq_matrix_free+0x20/0xc0
[ 125.243517] RSP: 0000:ffff926e7fd03e30 EFLAGS: 00000002
[ 125.243737] RAX: 0000000000000001 RBX: ffff926e7fd24700 RCX: 0000000000000000
[ 125.243993] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff926e7d108400
[ 125.244249] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 125.244503] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 125.244761] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
[ 125.245049] FS: 00007ffb4f9e74c0(0000) GS:ffff926e7fd00000(0000)
knlGS:0000000000000000
[ 125.245331] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 125.245540] CR2: 0000564a977e9658 CR3: 00000000717fe000 CR4: 00000000000006e0
[ 125.245839] Call Trace:
[ 125.246308] <IRQ>
[ 125.246571] free_moved_vector+0x43/0x100
[ 125.246765] smp_irq_move_cleanup_interrupt+0xa4/0xb9
[ 125.246959] irq_move_cleanup_interrupt+0xc/0x20
[ 125.247218] RIP: 0010:__do_softirq+0xa3/0x324
[ 125.247382] RSP: 0000:ffff926e7fd03f78 EFLAGS: 00000202 ORIG_RAX:
ffffffffffffffdf
[ 125.247655] RAX: 00000000ffffffff RBX: ffff926e6ca6b300 RCX: 0000000000000020
[ 125.247911] RDX: 0000000000000000 RSI: 0000000080000000 RDI: ffffffff9566211b
[ 125.248156] RBP: 0000000000000000 R08: 00000000000f4200 R09: 0000001d27c0d340
[ 125.248403] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[ 125.248650] R13: 0000000000000000 R14: 0000000000000200 R15: 0000000000000000
[ 125.248929] ? irq_move_cleanup_interrupt+0x7/0x20
[ 125.249129] ? irq_exit+0xbb/0xc0
[ 125.249285] ? __do_softirq+0x9d/0x324
[ 125.249432] ? hrtimer_interrupt+0x118/0x240
[ 125.249606] irq_exit+0xbb/0xc0
[ 125.249751] smp_apic_timer_interrupt+0x6c/0x120
[ 125.249932] apic_timer_interrupt+0xf/0x20
[ 125.250122] </IRQ>
[ 125.250318] RIP: 0033:0x7ffb4cdb2cdd
[ 125.250456] RSP: 002b:00007fff031177e0 EFLAGS: 00000206 ORIG_RAX:
ffffffffffffff13
[ 125.250722] RAX: 0000564a975a13ee RBX: 0000564a975a13e7 RCX: 0000000000000004
[ 125.250966] RDX: 0000000000000074 RSI: 0000000000000025 RDI: 0000564a975a13ee
[ 125.251211] RBP: 00007fff03117d50 R08: 0000564a975a13ee R09: 00000000fffffffc
[ 125.251457] R10: 0000564a98c7a4f0 R11: 0000000000000004 R12: 00007fff03117ea0
[ 125.251700] R13: 0000000000000009 R14: 00007fff03117d60 R15: 00007ffb4d12f4c0
[ 125.251977] Code: 98 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
89 f0 41 56 41 55 41 54 55 53 48 8b 5f 28 48 03 1c c5 20 14 6b 96 39
57 04 76 0b <0f> 0b 5b 5d 41 5c 41 5d 41 5e c3 39 57 08 76 f0 89 d0 f0
48 0f
[ 125.253132] ---[ end trace 47657dc74f8ba53a ]---
What I did was run the x86 selftests while ssh'ing in at the same moment.
I'll try it again if it fires by any chance.
Thanks,
Dmitry
2018-05-18 23:41 GMT+01:00 Dmitry Safonov <[email protected]>:
> What I did was run the x86 selftests while ssh'ing in at the same moment.
> I'll try it again if it fires by any chance.
No, I've tried a couple of times (after a reboot, as the warning is ONCE),
but it looks hard to reproduce by hand without some automation.
Thanks,
Dmitry
On Fri, 18 May 2018, Dmitry Safonov wrote:
> I'm not entirely sure that it's the same fault, but at least the backtrace
> looks similar.
Yes, it's similar, but not the same issue. I'll stare at the code ...
Thanks,
tglx
On 19/05/2018 2:20 PM, Thomas Gleixner wrote:
> On Fri, 18 May 2018, Dmitry Safonov wrote:
>> I'm not entirely sure that it's the same fault, but at least the backtrace
>> looks similar.
>
> Yes, it's similar, but not the same issue. I'll stare at the code ...
>
> Thanks,
>
> tglx
>
We still see the issue in our daily regression runs.
I have your patch merged into my internal branch, it prints the following:
[ 4898.226258] Trying to clear prev_vector: 0
[ 4898.226439] Trying to clear prev_vector: 0
i.e. vector(0) is lower than FIRST_EXTERNAL_VECTOR.
On Wed, 23 May 2018, Tariq Toukan wrote:
> On 19/05/2018 2:20 PM, Thomas Gleixner wrote:
> > On Fri, 18 May 2018, Dmitry Safonov wrote:
> > > I'm not entirely sure that it's the same fault, but at least the backtrace
> > > looks similar.
> >
> > Yes, it's similar, but not the same issue. I'll stare at the code ...
> >
> > Thanks,
> >
> > tglx
> >
>
> We still see the issue in our daily regression runs.
> I have your patch merged into my internal branch, it prints the following:
>
> [ 4898.226258] Trying to clear prev_vector: 0
> [ 4898.226439] Trying to clear prev_vector: 0
>
> i.e. vector(0) is lower than FIRST_EXTERNAL_VECTOR.
Could you please enable the vector and irq matrix trace points and capture
the trace when this happens?
Thanks,
tglx
Hi,
We are seeing something probably related.
We run ethtool on a system with Broadcom NIC to increase number of
combined queues.
[root@ ~]# ethtool -l eth0
Channel parameters for eth0:
Pre-set maximums:
RX: 9
TX: 8
Other: 0
Combined: 17
Current hardware settings:
RX: 0
TX: 0
Other: 0
Combined: 8
[root@ ~]# ethtool -L eth0 combined 16
The last command panics the kernel easily (5 out of 5 in my tests).
I haven't had luck catching much console output; the only line I got is:
[ 504.727865] WARNING: CPU: 10 PID: 0 at kernel/irq/matrix.c:371
irq_matrix_free+0x32/0xd0
The NIC we have is
Broadcom Limited BCM57302 NetXtreme-C 10Gb/25Gb Ethernet Controller
Thanks,
Song
On Wed, May 23, 2018 at 1:49 AM, Thomas Gleixner <[email protected]> wrote:
> On Wed, 23 May 2018, Tariq Toukan wrote:
>> On 19/05/2018 2:20 PM, Thomas Gleixner wrote:
>> > On Fri, 18 May 2018, Dmitry Safonov wrote:
>> > > I'm not entirely sure that it's the same fault, but at least the backtrace
>> > > looks similar.
>> >
>> > Yes, it's similar, but not the same issue. I'll stare at the code ...
>> >
>> > Thanks,
>> >
>> > tglx
>> >
>>
>> We still see the issue in our daily regression runs.
>> I have your patch merged into my internal branch, it prints the following:
>>
>> [ 4898.226258] Trying to clear prev_vector: 0
>> [ 4898.226439] Trying to clear prev_vector: 0
>>
>> i.e. vector(0) is lower than FIRST_EXTERNAL_VECTOR.
>
> Could you please enable the vector and irq matrix trace points and capture
> the trace when this happens?
>
> Thanks,
>
> tglx
>
Adding Broadcom developers to the thread.
Hi Michael, Vasundhara, and Andy,
Could you please help look into the case I found? ethtool crashes the system
for both net/master and net-next/master.
Thanks,
Song
On Fri, May 25, 2018 at 1:10 PM, Song Liu <[email protected]> wrote:
> Hi,
>
> We are seeing something probably related.
>
> We run ethtool on a system with Broadcom NIC to increase number of
> combined queues.
>
>
> [root@ ~]# ethtool -l eth0
> Channel parameters for eth0:
> Pre-set maximums:
> RX: 9
> TX: 8
> Other: 0
> Combined: 17
> Current hardware settings:
> RX: 0
> TX: 0
> Other: 0
> Combined: 8
>
> [root@ ~]# ethtool -L eth0 combined 16
>
> The last command panics the kernel easily (5 out of 5 in my tests).
>
> I haven't had luck catching much console output; the only line I got is:
>
> [ 504.727865] WARNING: CPU: 10 PID: 0 at kernel/irq/matrix.c:371
> irq_matrix_free+0x32/0xd0
>
> The NIC we have is
>
> Broadcom Limited BCM57302 NetXtreme-C 10Gb/25Gb Ethernet Controller
>
> Thanks,
> Song
>
> On Wed, May 23, 2018 at 1:49 AM, Thomas Gleixner <[email protected]> wrote:
>> On Wed, 23 May 2018, Tariq Toukan wrote:
>>> On 19/05/2018 2:20 PM, Thomas Gleixner wrote:
>>> > On Fri, 18 May 2018, Dmitry Safonov wrote:
>>> > > I'm not entirely sure that it's the same fault, but at least the backtrace
>>> > > looks similar.
>>> >
>>> > Yes, it's similar, but not the same issue. I'll stare at the code ...
>>> >
>>> > Thanks,
>>> >
>>> > tglx
>>> >
>>>
>>> We still see the issue in our daily regression runs.
>>> I have your patch merged into my internal branch, it prints the following:
>>>
>>> [ 4898.226258] Trying to clear prev_vector: 0
>>> [ 4898.226439] Trying to clear prev_vector: 0
>>>
>>> i.e. vector(0) is lower than FIRST_EXTERNAL_VECTOR.
>>
>> Could you please enable the vector and irq matrix trace points and capture
>> the trace when this happens?
>>
>> Thanks,
>>
>> tglx
>>
On 28/05/2018 1:53 PM, Thomas Gleixner wrote:
> On Fri, 25 May 2018, Song Liu wrote:
>> On Wed, May 23, 2018 at 1:49 AM, Thomas Gleixner <[email protected]> wrote:
>>> On Wed, 23 May 2018, Tariq Toukan wrote:
>>>> I have your patch merged into my internal branch, it prints the following:
>>>>
>>>> [ 4898.226258] Trying to clear prev_vector: 0
>>>> [ 4898.226439] Trying to clear prev_vector: 0
>>>>
>>>> i.e. vector(0) is lower than FIRST_EXTERNAL_VECTOR.
>>>
>>> Could you please enable the vector and irq matrix trace points and capture
>>> the trace when this happens?
>
> Does the patch below fix it?
>
> Thanks,
>
> tglx
>
> 8<-------------------
> diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
> index bb6f7a2148d7..54af3d4884b1 100644
> --- a/arch/x86/kernel/apic/vector.c
> +++ b/arch/x86/kernel/apic/vector.c
> @@ -148,6 +148,7 @@ static void apic_update_vector(struct irq_data *irqd, unsigned int newvec,
> * prev_vector for this and the offlined target case.
> */
> apicd->prev_vector = 0;
> + apicd->move_in_progress = false;
> if (!apicd->vector || apicd->vector == MANAGED_IRQ_SHUTDOWN_VECTOR)
> goto setnew;
> /*
>
I took it into my internal branch. Will let you know.
Tariq
On Fri, 25 May 2018, Song Liu wrote:
> On Wed, May 23, 2018 at 1:49 AM, Thomas Gleixner <[email protected]> wrote:
> > On Wed, 23 May 2018, Tariq Toukan wrote:
> >> I have your patch merged into my internal branch, it prints the following:
> >>
> >> [ 4898.226258] Trying to clear prev_vector: 0
> >> [ 4898.226439] Trying to clear prev_vector: 0
> >>
> >> i.e. vector(0) is lower than FIRST_EXTERNAL_VECTOR.
> >
> > Could you please enable the vector and irq matrix trace points and capture
> > the trace when this happens?
Does the patch below fix it?
Thanks,
tglx
8<-------------------
diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index bb6f7a2148d7..54af3d4884b1 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -148,6 +148,7 @@ static void apic_update_vector(struct irq_data *irqd, unsigned int newvec,
* prev_vector for this and the offlined target case.
*/
apicd->prev_vector = 0;
+ apicd->move_in_progress = false;
if (!apicd->vector || apicd->vector == MANAGED_IRQ_SHUTDOWN_VECTOR)
goto setnew;
/*
On Mon, 28 May 2018, Tariq Toukan wrote:
> On 28/05/2018 1:53 PM, Thomas Gleixner wrote:
> > On Fri, 25 May 2018, Song Liu wrote:
> > > On Wed, May 23, 2018 at 1:49 AM, Thomas Gleixner <[email protected]>
> > > wrote:
> > > > On Wed, 23 May 2018, Tariq Toukan wrote:
> > > > > I have your patch merged into my internal branch, it prints the
> > > > > following:
> > > > >
> > > > > [ 4898.226258] Trying to clear prev_vector: 0
> > > > > [ 4898.226439] Trying to clear prev_vector: 0
> > > > >
> > > > > i.e. vector(0) is lower than FIRST_EXTERNAL_VECTOR.
> > > >
> > > > Could you please enable the vector and irq matrix trace points and
> > > > capture
> > > > the trace when this happens?
> >
> > Does the patch below fix it?
> >
> > Thanks,
> >
> > tglx
> >
> > 8<-------------------
> > diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
> > index bb6f7a2148d7..54af3d4884b1 100644
> > --- a/arch/x86/kernel/apic/vector.c
> > +++ b/arch/x86/kernel/apic/vector.c
> > @@ -148,6 +148,7 @@ static void apic_update_vector(struct irq_data *irqd, unsigned int newvec,
> > * prev_vector for this and the offlined target case.
> > */
> > apicd->prev_vector = 0;
> > + apicd->move_in_progress = false;
> > if (!apicd->vector || apicd->vector == MANAGED_IRQ_SHUTDOWN_VECTOR)
> > goto setnew;
> > /*
> >
>
> I took it into my internal branch. Will let you know.
Note, I'd still like to see a trace w/o the patch which shows which order
of events leads to this. Even if the patch cures it, it might just paper
over it and not fix the root cause.
Thanks,
tglx
> On May 28, 2018, at 3:53 AM, Thomas Gleixner <[email protected]> wrote:
>
> On Fri, 25 May 2018, Song Liu wrote:
>> On Wed, May 23, 2018 at 1:49 AM, Thomas Gleixner <[email protected]> wrote:
>>> On Wed, 23 May 2018, Tariq Toukan wrote:
>>>> I have your patch merged into my internal branch, it prints the following:
>>>>
>>>> [ 4898.226258] Trying to clear prev_vector: 0
>>>> [ 4898.226439] Trying to clear prev_vector: 0
>>>>
>>>> i.e. vector(0) is lower than FIRST_EXTERNAL_VECTOR.
>>>
>>> Could you please enable the vector and irq matrix trace points and capture
>>> the trace when this happens?
>
> Does the patch below fix it?
>
> Thanks,
>
> tglx
>
> 8<-------------------
> diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
> index bb6f7a2148d7..54af3d4884b1 100644
> --- a/arch/x86/kernel/apic/vector.c
> +++ b/arch/x86/kernel/apic/vector.c
> @@ -148,6 +148,7 @@ static void apic_update_vector(struct irq_data *irqd, unsigned int newvec,
> * prev_vector for this and the offlined target case.
> */
> apicd->prev_vector = 0;
> + apicd->move_in_progress = false;
> if (!apicd->vector || apicd->vector == MANAGED_IRQ_SHUTDOWN_VECTOR)
> goto setnew;
> /*
This doesn't fix the issue with bnxt. Here is a trace with this patch:
[ 569.222495] WARNING: CPU: 20 PID: 0 at kernel/irq/matrix.c:373 irq_matrix_free+0x32/0xd0
[ 569.238811] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[ 569.238812] IP: bnxt_poll+0x163/0x830
[ 569.238812] PGD 0 P4D 0
[ 569.238813] Oops: 0002 [#1] SMP PTI
[ 569.238813] Modules linked in: nfsv3 nfs fscache ip6table_raw ip6table_mangle iptable_raw iptable_mangle ip6table_filter xt_NFLOG xt_comment xt_statistic iptable_filter nfnetlink_log tcp_diag inet_diag sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass iTCO_wdt iTCO_vendor_support lpc_ich efivars mfd_core i2c_i801 ipmi_si ipmi_devintf ipmi_msghandler button acpi_cpufreq sch_fq_codel nfsd nfs_acl lockd auth_rpcgss oid_registry grace sunrpc fuse loop efivarfs autofs4
[ 569.238824] CPU: 20 PID: 0 Comm: swapper/20 Not tainted 4.16.0-00391-g3742c6a #813
[ 569.238824] Hardware name: Quanta Leopard ORv2-DDR4/Leopard ORv2-DDR4, BIOS F06_3B12 08/17/2017
[ 569.238824] RIP: 0010:bnxt_poll+0x163/0x830
[ 569.238825] RSP: 0018:ffff883ffef83b18 EFLAGS: 00010006
[ 569.238825] RAX: 000000002c000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 569.238825] RDX: 0000000000000000 RSI: ffff881fdc53c000 RDI: ffff881f55a1da80
[ 569.238826] RBP: ffff883ffef83b60 R08: 0000000000000011 R09: 000000004fbf4c8d
[ 569.238826] R10: ffff883ffef83ab0 R11: ffff883fec6eee02 R12: ffff881ff6be2780
[ 569.238826] R13: 0000000000000000 R14: ffff881f55a1da80 R15: 0000000000000000
[ 569.238827] FS: 0000000000000000(0000) GS:ffff883ffef80000(0000) knlGS:0000000000000000
[ 569.238827] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 569.238827] CR2: 0000000000000000 CR3: 000000000220a003 CR4: 00000000003606e0
[ 569.238827] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 569.238828] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 569.238828] Call Trace:
[ 569.238828] <IRQ>
[ 569.238828] ? __skb_tx_hash+0x94/0xb0
[ 569.238829] netpoll_poll_dev+0xc5/0x1a0
[ 569.238829] netpoll_send_skb_on_dev+0x12c/0x200
[ 569.238829] netpoll_send_udp+0x2d5/0x410
[ 569.238829] write_ext_msg+0x1e7/0x200
[ 569.238830] ? scnprintf+0x3a/0x70
[ 569.238830] console_unlock+0x35c/0x530
[ 569.238830] vprintk_emit+0x225/0x340
[ 569.238830] ? irq_matrix_free+0x32/0xd0
[ 569.238831] vprintk_default+0x1f/0x30
[ 569.238831] vprintk_func+0x35/0x70
[ 569.238831] printk+0x43/0x4b
[ 569.238832] ? irq_matrix_free+0x32/0xd0
[ 569.238832] __warn+0x6f/0x150
[ 569.238832] ? irq_matrix_free+0x32/0xd0
[ 569.238832] report_bug+0x83/0xe0
[ 569.238833] do_invalid_op+0x2c/0x70
[ 569.238833] invalid_op+0x1b/0x40
[ 569.238833] RIP: 0010:irq_matrix_free+0x32/0xd0
[ 569.238834] RSP: 0018:ffff883ffef83f58 EFLAGS: 00010002
[ 569.238834] RAX: 0000000000000014 RBX: 0000000000014140 RCX: 0000000000000000
[ 569.238835] RDX: 0000000000000000 RSI: 0000000000000014 RDI: ffff881fff028c00
[ 569.238835] RBP: ffff883ffef83fc0 R08: ffff883ffefa1c40 R09: 0000000000000000
[ 569.238836] R10: 0000000000000000 R11: 0000000000000000 R12: ffff883ffefa4e00
[ 569.238836] R13: 0000000000000000 R14: 0000000000000014 R15: ffff881fff028c00
[ 569.238836] ? free_moved_vector+0x60/0x110
[ 569.238836] smp_irq_move_cleanup_interrupt+0x91/0xa9
[ 569.238837] irq_move_cleanup_interrupt+0xc/0x20
[ 569.238837] </IRQ>
[ 569.238837] RIP: 0010:cpuidle_enter_state+0xb6/0x2c0
[ 569.238837] RSP: 0018:ffffc9000c5d7e80 EFLAGS: 00000246
[ 569.238838] ORIG_RAX: ffffffffffffffdf
[ 569.238838] RAX: ffff883ffefa1080 RBX: ffffe8ffff780d00 RCX: 000000000000001f
[ 569.238838] RDX: 20c49ba5e353f7cf RSI: 0000000000000004 RDI: 0000000000000000
[ 569.238838] RBP: ffffc9000c5d7eb8 R08: fffa7b7805587a32 R09: ffffffff822ec140
[ 569.238839] R10: ffffc9000c5d7e60 R11: 0000000000004817 R12: 0000000000000004
[ 569.238839] R13: 0000000000000004 R14: 0000000000000014 R15: 0000008487066592
[ 569.238839] cpuidle_enter+0x17/0x20
[ 569.238839] call_cpuidle+0x2d/0x40
[ 569.238840] do_idle+0x109/0x1a0
[ 569.238840] cpu_startup_entry+0x1d/0x20
[ 569.238840] start_secondary+0x10e/0x120
[ 569.238840] secondary_startup_64+0xa5/0xb0
[ 569.238841] Code: 89 02 41 f6 44 24 36 40 74 02 89 02 8b 44 24 1c 49 8b 96 c8 00 00 00 41 89 86 c0 00 00 00 41 23 84 24 d4 00 00 00 0d 00 00 00 2c <89> 02 85 db 0f 85 e7 00 00 00 f6 44 24 1b 01 74 40 49 8b 96 a8
[ 569.238852] RIP: bnxt_poll+0x163/0x830 RSP: ffff883ffef83b18
[ 569.238852] CR2: 0000000000000000
> On May 28, 2018, at 7:27 AM, Thomas Gleixner <[email protected]> wrote:
>
> On Mon, 28 May 2018, Tariq Toukan wrote:
>> On 28/05/2018 1:53 PM, Thomas Gleixner wrote:
>>> On Fri, 25 May 2018, Song Liu wrote:
>>>> On Wed, May 23, 2018 at 1:49 AM, Thomas Gleixner <[email protected]>
>>>> wrote:
>>>>> On Wed, 23 May 2018, Tariq Toukan wrote:
>>>>>> I have your patch merged into my internal branch, it prints the
>>>>>> following:
>>>>>>
>>>>>> [ 4898.226258] Trying to clear prev_vector: 0
>>>>>> [ 4898.226439] Trying to clear prev_vector: 0
>>>>>>
>>>>>> i.e. vector(0) is lower than FIRST_EXTERNAL_VECTOR.
>>>>>
>>>>> Could you please enable the vector and irq matrix trace points and
>>>>> capture
>>>>> the trace when this happens?
>>>
>>> Does the patch below fix it?
>>>
>>> Thanks,
>>>
>>> tglx
>>>
>>> 8<-------------------
>>> diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
>>> index bb6f7a2148d7..54af3d4884b1 100644
>>> --- a/arch/x86/kernel/apic/vector.c
>>> +++ b/arch/x86/kernel/apic/vector.c
>>> @@ -148,6 +148,7 @@ static void apic_update_vector(struct irq_data *irqd, unsigned int newvec,
>>> * prev_vector for this and the offlined target case.
>>> */
>>> apicd->prev_vector = 0;
>>> + apicd->move_in_progress = false;
>>> if (!apicd->vector || apicd->vector == MANAGED_IRQ_SHUTDOWN_VECTOR)
>>> goto setnew;
>>> /*
>>>
>>
>> I took it into my internal branch. Will let you know.
>
> Note, I'd still like to see a trace w/o the patch which shows which order
> of events leads to this. Even if the patch cures it, it might just paper
> over it and not fix the root cause.
>
> Thanks,
>
> tglx
I just checked trace without the patch. It looks the same as the one with
the patch.
Thanks,
Song
On Mon, 28 May 2018, Song Liu wrote:
> This doesn't fix the issue with bnxt. Here is a trace with this patch:
That's just the backtrace, which is not really helpful. The real question is
what leads to this scenario.
What I was asking for is to enable the irq_vector and irq_matrix trace
points along with ftrace_dump_on_oops. The latter dumps the trace buffer
when the problem happens and that should give us a hint what actually
causes that.
Thanks,
tglx
On Mon, 28 May 2018, Song Liu wrote:
> > On May 28, 2018, at 1:09 PM, Thomas Gleixner <[email protected]> wrote:
> >
> > On Mon, 28 May 2018, Song Liu wrote:
> >> This doesn't fix the issue with bnxt. Here is a trace with this patch:
> >
> > That's just the backtrace, which is not really helpful. The real question is
> > what leads to this scenario.
> >
> > What I was asking for is to enable the irq_vector and irq_matrix trace
> > points along with ftrace_dump_on_oops. The latter dumps the trace buffer
> > when the problem happens and that should give us a hint what actually
> > causes that.
>
> Attached is the dump I got. It seems pretty noisy, with many message lost.
>
> Maybe we cannot enable all trace points under irq_vectors/ and irq_matrix.
Right. Sorry, I forgot to say that we only need the irq_vectors ones which
are related to vector allocation, i.e.: irq_vectors/vector_*
Thanks,
tglx
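For anyone reproducing this, a sketch of how one might enable just those tracepoints through the standard tracefs interface. The paths assume a debugfs mount at /sys/kernel/debug (on newer kernels /sys/kernel/tracing works too); adjust as needed for your setup.

```shell
# Dump the ftrace buffer to the console when an oops happens
sysctl -w kernel.ftrace_dump_on_oops=1

TRACEFS=/sys/kernel/debug/tracing

# Only the vector-allocation related events: irq_vectors/vector_*
for ev in "$TRACEFS"/events/irq_vectors/vector_*; do
    echo 1 > "$ev/enable"
done

# ...plus all irq_matrix events
echo 1 > "$TRACEFS/events/irq_matrix/enable"
```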
> On May 29, 2018, at 1:35 AM, Thomas Gleixner <[email protected]> wrote:
>
> On Mon, 28 May 2018, Song Liu wrote:
>>> On May 28, 2018, at 1:09 PM, Thomas Gleixner <[email protected]> wrote:
>>>
>>> On Mon, 28 May 2018, Song Liu wrote:
>>>> This doesn't fix the issue with bnxt. Here is a trace with this patch:
>>>
>>> That's just the backtrace, which is not really helpful. The real question is
>>> what leads to this scenario.
>>>
>>> What I was asking for is to enable the irq_vector and irq_matrix trace
>>> points along with ftrace_dump_on_oops. The latter dumps the trace buffer
>>> when the problem happens and that should give us a hint what actually
>>> causes that.
>>
>> Attached is the dump I got. It seems pretty noisy, with many message lost.
>>
>> Maybe we cannot enable all trace points under irq_vectors/ and irq_matrix.
>
> Right. Sorry, I forgot to say that we only need the irq_vectors ones which
> are related to vector allocation, i.e.: irq_vectors/vector_*
>
> Thanks,
>
> tglx
Here is the ftrace dump:
[ 1609.472697] <...>-53363 19d... 1610359225us : vector_deactivate: irq=29 is_managed=0 can_reserve=1 reserve=0
[ 1609.472698] <...>-53363 19d... 1610359229us : vector_clear: irq=29 vector=33 cpu=21 prev_vector=0 prev_cpu=0
[ 1609.472699] <...>-53363 19d... 1610359230us : irq_matrix_free: bit=33 cpu=21 online=1 avl=201 alloc=0 managed=0 online_maps=56 global_avl=11239, global_rsvd=23, total_alloc=17
[ 1609.472699] <...>-53363 19d... 1610359231us : irq_matrix_reserve: online_maps=56 global_avl=11239, global_rsvd=24, total_alloc=17
[ 1609.472700] <...>-53363 19d... 1610359232us : vector_reserve: irq=29 ret=0
[ 1609.472700] <...>-53363 19d... 1610359233us : vector_config: irq=29 vector=239 cpu=0 apicdest=0x00000000
[ 1609.472701] <...>-53363 19d... 1610359242us : vector_deactivate: irq=30 is_managed=0 can_reserve=1 reserve=0
[ 1609.472701] <...>-53363 19d... 1610359242us : vector_clear: irq=30 vector=33 cpu=11 prev_vector=0 prev_cpu=1
[ 1609.472701] <...>-53363 19d... 1610359242us : irq_matrix_free: bit=33 cpu=11 online=1 avl=201 alloc=0 managed=0 online_maps=56 global_avl=11240, global_rsvd=24, total_alloc=16
[ 1609.472702] <...>-53363 19d... 1610359243us : irq_matrix_reserve: online_maps=56 global_avl=11240, global_rsvd=25, total_alloc=16
[ 1609.472702] <...>-53363 19d... 1610359243us : vector_reserve: irq=30 ret=0
[ 1609.472703] <...>-53363 19d... 1610359243us : vector_config: irq=30 vector=239 cpu=0 apicdest=0x00000000
[ 1609.472703] <...>-53363 19d... 1610359248us : vector_deactivate: irq=31 is_managed=0 can_reserve=1 reserve=0
[ 1609.472703] <...>-53363 19d... 1610359248us : vector_clear: irq=31 vector=33 cpu=20 prev_vector=0 prev_cpu=2
[ 1609.472704] <...>-53363 19d... 1610359249us : irq_matrix_free: bit=33 cpu=20 online=1 avl=201 alloc=0 managed=0 online_maps=56 global_avl=11241, global_rsvd=25, total_alloc=15
[ 1609.472704] <...>-53363 19d... 1610359249us : irq_matrix_reserve: online_maps=56 global_avl=11241, global_rsvd=26, total_alloc=15
[ 1609.472705] <...>-53363 19d... 1610359249us : vector_reserve: irq=31 ret=0
[ 1609.472705] <...>-53363 19d... 1610359249us : vector_config: irq=31 vector=239 cpu=0 apicdest=0x00000000
[ 1609.472705] <...>-53363 19d... 1610359255us : vector_deactivate: irq=32 is_managed=0 can_reserve=1 reserve=0
[ 1609.472706] <...>-53363 19d... 1610359255us : vector_clear: irq=32 vector=33 cpu=9 prev_vector=0 prev_cpu=3
[ 1609.472707] <...>-53363 19d... 1610359255us : irq_matrix_free: bit=33 cpu=9 online=1 avl=201 alloc=0 managed=0 online_maps=56 global_avl=11242, global_rsvd=26, total_alloc=14
[ 1609.472707] <...>-53363 19d... 1610359256us : irq_matrix_reserve: online_maps=56 global_avl=11242, global_rsvd=27, total_alloc=14
[ 1609.472708] <...>-53363 19d... 1610359256us : vector_reserve: irq=32 ret=0
[ 1609.472708] <...>-53363 19d... 1610359256us : vector_config: irq=32 vector=239 cpu=0 apicdest=0x00000000
[ 1609.472709] <...>-53363 19d... 1610359261us : vector_deactivate: irq=33 is_managed=0 can_reserve=1 reserve=0
[ 1609.472709] <...>-53363 19d... 1610359261us : vector_clear: irq=33 vector=33 cpu=18 prev_vector=0 prev_cpu=4
[ 1609.472710] <...>-53363 19d... 1610359262us : irq_matrix_free: bit=33 cpu=18 online=1 avl=201 alloc=0 managed=0 online_maps=56 global_avl=11243, global_rsvd=27, total_alloc=13
[ 1609.472710] <...>-53363 19d... 1610359262us : irq_matrix_reserve: online_maps=56 global_avl=11243, global_rsvd=28, total_alloc=13
[ 1609.472711] <...>-53363 19d... 1610359262us : vector_reserve: irq=33 ret=0
[ 1609.472711] <...>-53363 19d... 1610359262us : vector_config: irq=33 vector=239 cpu=0 apicdest=0x00000000
[ 1609.472711] <...>-53363 19d... 1610359268us : vector_deactivate: irq=34 is_managed=0 can_reserve=1 reserve=0
[ 1609.472712] <...>-53363 19d... 1610359268us : vector_clear: irq=34 vector=34 cpu=7 prev_vector=0 prev_cpu=5
[ 1609.472712] <...>-53363 19d... 1610359268us : irq_matrix_free: bit=34 cpu=7 online=1 avl=201 alloc=0 managed=0 online_maps=56 global_avl=11244, global_rsvd=28, total_alloc=12
[ 1609.472713] <...>-53363 19d... 1610359268us : irq_matrix_reserve: online_maps=56 global_avl=11244, global_rsvd=29, total_alloc=12
[ 1609.472713] <...>-53363 19d... 1610359269us : vector_reserve: irq=34 ret=0
[ 1609.472713] <...>-53363 19d... 1610359269us : vector_config: irq=34 vector=239 cpu=0 apicdest=0x00000000
[ 1609.472714] <...>-53363 19d... 1610359279us : vector_deactivate: irq=35 is_managed=0 can_reserve=1 reserve=0
[ 1609.472714] <...>-53363 19d... 1610359279us : vector_clear: irq=35 vector=33 cpu=26 prev_vector=0 prev_cpu=6
[ 1609.472715] <...>-53363 19d... 1610359279us : irq_matrix_free: bit=33 cpu=26 online=1 avl=201 alloc=0 managed=0 online_maps=56 global_avl=11245, global_rsvd=29, total_alloc=11
[ 1609.472715] <...>-53363 19d... 1610359279us : irq_matrix_reserve: online_maps=56 global_avl=11245, global_rsvd=30, total_alloc=11
[ 1609.472715] <...>-53363 19d... 1610359280us : vector_reserve: irq=35 ret=0
[ 1609.472716] <...>-53363 19d... 1610359280us : vector_config: irq=35 vector=239 cpu=0 apicdest=0x00000000
[ 1609.472716] <...>-53363 19d... 1610359285us : vector_deactivate: irq=36 is_managed=0 can_reserve=1 reserve=0
[ 1609.472717] <...>-53363 19d... 1610359285us : vector_clear: irq=36 vector=33 cpu=10 prev_vector=0 prev_cpu=7
[ 1609.472717] <...>-53363 19d... 1610359286us : irq_matrix_free: bit=33 cpu=10 online=1 avl=201 alloc=0 managed=0 online_maps=56 global_avl=11246, global_rsvd=30, total_alloc=10
[ 1609.472718] <...>-53363 19d... 1610359286us : irq_matrix_reserve: online_maps=56 global_avl=11246, global_rsvd=31, total_alloc=10
[ 1609.472718] <...>-53363 19d... 1610359286us : vector_reserve: irq=36 ret=0
[ 1609.472718] <...>-53363 19d... 1610359286us : vector_config: irq=36 vector=239 cpu=0 apicdest=0x00000000
[ 1609.472719] <...>-53363 20d... 1610366617us : vector_activate: irq=29 is_managed=0 can_reserve=1 reserve=0
[ 1609.472719] <...>-53363 20d... 1610366619us : vector_alloc: irq=29 vector=4294967268 reserved=1 ret=0
[ 1609.472719] <...>-53363 20d... 1610366621us : irq_matrix_alloc: bit=33 cpu=7 online=1 avl=200 alloc=1 managed=0 online_maps=56 global_avl=11245, global_rsvd=30, total_alloc=11
[ 1609.472720] <...>-53363 20d... 1610366623us : vector_update: irq=29 vector=33 cpu=7 prev_vector=0 prev_cpu=21
[ 1609.472720] <...>-53363 20d... 1610366623us : vector_alloc: irq=29 vector=33 reserved=1 ret=0
[ 1609.472721] <...>-53363 20d... 1610366623us : vector_config: irq=29 vector=33 cpu=7 apicdest=0x00000010
[ 1609.472721] <...>-53363 20d... 1610366629us : irq_matrix_alloc: bit=33 cpu=21 online=1 avl=200 alloc=1 managed=0 online_maps=56 global_avl=11244, global_rsvd=30, total_alloc=12
[ 1609.472722] <...>-53363 20d... 1610366629us : vector_update: irq=29 vector=33 cpu=21 prev_vector=33 prev_cpu=7
[ 1609.472723] <...>-53363 20d... 1610366629us : vector_alloc: irq=29 vector=33 reserved=1 ret=0
[ 1609.472723] <...>-53363 20d... 1610366630us : vector_config: irq=29 vector=33 cpu=21 apicdest=0x00000030
[ 1609.472724] <...>-53363 20d... 1610366637us : irq_matrix_alloc: bit=35 cpu=0 online=1 avl=197 alloc=4 managed=0 online_maps=56 global_avl=11243, global_rsvd=30, total_alloc=13
[ 1609.472724] <...>-53363 20d... 1610366637us : vector_update: irq=29 vector=35 cpu=0 prev_vector=33 prev_cpu=21
[ 1609.472725] <...>-53363 20d... 1610366637us : vector_alloc: irq=29 vector=35 reserved=1 ret=0
[ 1609.472725] <...>-53363 20d... 1610366638us : vector_config: irq=29 vector=35 cpu=0 apicdest=0x00000000
[ 1609.472725] <...>-53363 20dN.. 1610366643us : vector_activate: irq=30 is_managed=0 can_reserve=1 reserve=0
[ 1609.472726] <...>-53363 20dN.. 1610366644us : irq_matrix_alloc: bit=33 cpu=11 online=1 avl=200 alloc=1 managed=0 online_maps=56 global_avl=11242, global_rsvd=29, total_alloc=14
[ 1609.472726] <...>-53363 20dN.. 1610366644us : vector_update: irq=30 vector=33 cpu=11 prev_vector=0 prev_cpu=11
[ 1609.472727] <...>-53363 20dN.. 1610366644us : vector_alloc: irq=30 vector=33 reserved=1 ret=0
[ 1609.472727] <...>-53363 20dN.. 1610366644us : vector_config: irq=30 vector=33 cpu=11 apicdest=0x00000018
[ 1609.472727] <...>-53363 20dN.. 1610366647us : vector_config: irq=30 vector=33 cpu=11 apicdest=0x00000018
[ 1609.472728] <...>-53363 20dN.. 1610366650us : irq_matrix_alloc: bit=34 cpu=1 online=1 avl=199 alloc=2 managed=0 online_maps=56 global_avl=11241, global_rsvd=29, total_alloc=15
[ 1609.472728] <...>-53363 20dN.. 1610366651us : vector_update: irq=30 vector=34 cpu=1 prev_vector=33 prev_cpu=11
[ 1609.472729] <...>-53363 20dN.. 1610366651us : vector_alloc: irq=30 vector=34 reserved=1 ret=0
[ 1609.472729] <...>-53363 20dN.. 1610366651us : vector_config: irq=30 vector=34 cpu=1 apicdest=0x00000002
[ 1609.472729] <...>-53363 20dN.. 1610366654us : vector_activate: irq=31 is_managed=0 can_reserve=1 reserve=0
[ 1609.472730] <...>-53363 20dN.. 1610366654us : vector_alloc: irq=31 vector=4294967268 reserved=1 ret=0
[ 1609.472730] <...>-53363 20dN.. 1610366655us : irq_matrix_alloc: bit=33 cpu=9 online=1 avl=200 alloc=1 managed=0 online_maps=56 global_avl=11240, global_rsvd=28, total_alloc=16
[ 1609.472731] <...>-53363 20dN.. 1610366655us : vector_update: irq=31 vector=33 cpu=9 prev_vector=0 prev_cpu=20
[ 1609.472731] <...>-53363 20dN.. 1610366656us : vector_alloc: irq=31 vector=33 reserved=1 ret=0
[ 1609.472731] <...>-53363 20dN.. 1610366656us : vector_config: irq=31 vector=33 cpu=9 apicdest=0x00000014
[ 1609.472732] <idle>-0 7d.h. 1610366658us : vector_free_moved: irq=29 cpu=21 vector=33 is_managed=0
[ 1609.472732] <idle>-0 7d.h. 1610366660us : irq_matrix_free: bit=33 cpu=21 online=1 avl=201 alloc=0 managed=0 online_maps=56 global_avl=11241, global_rsvd=28, total_alloc=15
[ 1609.472733] <...>-53363 20dN.. 1610366662us : irq_matrix_alloc: bit=33 cpu=20 online=1 avl=200 alloc=1 managed=0 online_maps=56 global_avl=11240, global_rsvd=28, total_alloc=16
[ 1609.472733] <...>-53363 20dN.. 1610366662us : vector_update: irq=31 vector=33 cpu=20 prev_vector=33 prev_cpu=9
[ 1609.472734] <...>-53363 20dN.. 1610366662us : vector_alloc: irq=31 vector=33 reserved=1 ret=0
[ 1609.472734] <...>-53363 20dN.. 1610366662us : vector_config: irq=31 vector=33 cpu=20 apicdest=0x0000002c
[ 1609.472735] <...>-53363 20dN.. 1610366666us : irq_matrix_alloc: bit=34 cpu=2 online=1 avl=199 alloc=2 managed=0 online_maps=56 global_avl=11239, global_rsvd=28, total_alloc=17
[ 1609.472735] <...>-53363 20dN.. 1610366666us : vector_update: irq=31 vector=34 cpu=2 prev_vector=33 prev_cpu=20
[ 1609.472736] <...>-53363 20dN.. 1610366666us : vector_alloc: irq=31 vector=34 reserved=1 ret=0
[ 1609.472736] <...>-53363 20dN.. 1610366666us : vector_config: irq=31 vector=34 cpu=2 apicdest=0x00000004
[ 1609.472737] <...>-53363 20dNh. 1610366669us : vector_free_moved: irq=31 cpu=20 vector=33 is_managed=0
[ 1609.472737] <...>-53363 20dNh. 1610366670us : irq_matrix_free: bit=33 cpu=20 online=1 avl=201 alloc=0 managed=0 online_maps=56 global_avl=11240, global_rsvd=28, total_alloc=16
[ 1609.472738] <...>-53363 20dN.. 1610366671us : vector_activate: irq=32 is_managed=0 can_reserve=1 reserve=0
[ 1609.472739] <...>-53363 20dN.. 1610366671us : irq_matrix_alloc: bit=34 cpu=9 online=1 avl=199 alloc=2 managed=0 online_maps=56 global_avl=11239, global_rsvd=27, total_alloc=17
[ 1609.472739] <...>-53363 20dN.. 1610366671us : vector_update: irq=32 vector=34 cpu=9 prev_vector=0 prev_cpu=9
[ 1609.472739] <...>-53363 20dN.. 1610366671us : vector_alloc: irq=32 vector=34 reserved=1 ret=0
[ 1609.472740] <...>-53363 20dN.. 1610366672us : vector_config: irq=32 vector=34 cpu=9 apicdest=0x00000014
[ 1609.472740] <...>-53363 20dN.. 1610366674us : vector_config: irq=32 vector=34 cpu=9 apicdest=0x00000014
[ 1609.472741] <...>-53363 20dN.. 1610366678us : irq_matrix_alloc: bit=34 cpu=3 online=1 avl=199 alloc=2 managed=0 online_maps=56 global_avl=11238, global_rsvd=27, total_alloc=18
[ 1609.472741] <...>-53363 20dN.. 1610366678us : vector_update: irq=32 vector=34 cpu=3 prev_vector=34 prev_cpu=9
[ 1609.472741] <...>-53363 20dN.. 1610366679us : vector_alloc: irq=32 vector=34 reserved=1 ret=0
[ 1609.472742] <...>-53363 20dN.. 1610366679us : vector_config: irq=32 vector=34 cpu=3 apicdest=0x00000006
[ 1609.472742] <idle>-0 11d.h. 1610366680us : vector_free_moved: irq=30 cpu=11 vector=33 is_managed=0
[ 1609.472743] <idle>-0 11d.h. 1610366681us : irq_matrix_free: bit=33 cpu=11 online=1 avl=201 alloc=0 managed=0 online_maps=56 global_avl=11239, global_rsvd=27, total_alloc=17
[ 1609.472743] <...>-53363 20dN.. 1610366683us : vector_activate: irq=33 is_managed=0 can_reserve=1 reserve=0
[ 1609.472743] <...>-53363 20dN.. 1610366683us : vector_alloc: irq=33 vector=4294967268 reserved=1 ret=0
[ 1609.472744] <...>-53363 20dN.. 1610366684us : irq_matrix_alloc: bit=33 cpu=10 online=1 avl=200 alloc=1 managed=0 online_maps=56 global_avl=11238, global_rsvd=26, total_alloc=18
[ 1609.472744] <...>-53363 20dN.. 1610366684us : vector_update: irq=33 vector=33 cpu=10 prev_vector=0 prev_cpu=18
[ 1609.472745] <...>-53363 20dN.. 1610366684us : vector_alloc: irq=33 vector=33 reserved=1 ret=0
[ 1609.472745] <...>-53363 20dN.. 1610366685us : vector_config: irq=33 vector=33 cpu=10 apicdest=0x00000016
[ 1609.472745] <...>-53363 20dN.. 1610366688us : irq_matrix_alloc: bit=33 cpu=18 online=1 avl=200 alloc=1 managed=0 online_maps=56 global_avl=11237, global_rsvd=26, total_alloc=19
[ 1609.472746] <...>-53363 20dN.. 1610366688us : vector_update: irq=33 vector=33 cpu=18 prev_vector=33 prev_cpu=10
[ 1609.472746] <...>-53363 20dN.. 1610366688us : vector_alloc: irq=33 vector=33 reserved=1 ret=0
[ 1609.472747] <...>-53363 20dN.. 1610366688us : vector_config: irq=33 vector=33 cpu=18 apicdest=0x00000028
[ 1609.472747] <...>-53363 20dN.. 1610366692us : irq_matrix_alloc: bit=34 cpu=4 online=1 avl=199 alloc=2 managed=0 online_maps=56 global_avl=11236, global_rsvd=26, total_alloc=20
[ 1609.472747] <...>-53363 20dN.. 1610366692us : vector_update: irq=33 vector=34 cpu=4 prev_vector=33 prev_cpu=18
[ 1609.472748] <...>-53363 20dN.. 1610366692us : vector_alloc: irq=33 vector=34 reserved=1 ret=0
[ 1609.472748] <...>-53363 20dN.. 1610366693us : vector_config: irq=33 vector=34 cpu=4 apicdest=0x00000008
[ 1609.472749] <idle>-0 9d.h. 1610366694us : vector_free_moved: irq=32 cpu=9 vector=34 is_managed=0
[ 1609.472749] <idle>-0 9d.h. 1610366695us : irq_matrix_free: bit=34 cpu=9 online=1 avl=200 alloc=1 managed=0 online_maps=56 global_avl=11237, global_rsvd=26, total_alloc=19
[ 1609.472750] <idle>-0 9d.h. 1610366696us : vector_free_moved: irq=31 cpu=20 vector=0 is_managed=0
Thanks,
Song
Song,
On Tue, 29 May 2018, Song Liu wrote:
> > On May 29, 2018, at 1:35 AM, Thomas Gleixner <[email protected]> wrote:
> >> Maybe we cannot enable all trace points under irq_vectors/ and irq_matrix.
> >
> > Right. Sorry, I forgot to say that we only need the irq_vectors ones which
> > are related to vector allocation, i.e.: irq_vectors/vector_*
>
> Here is the ftrace dump:
Thanks for providing the data!
> 19d... 1610359248us : vector_deactivate: irq=31 is_managed=0 can_reserve=1 reserve=0
> 19d... 1610359248us : vector_clear: irq=31 vector=33 cpu=20 prev_vector=0 prev_cpu=2
> 19d... 1610359249us : irq_matrix_free: bit=33 cpu=20 online=1 avl=201 alloc=0 managed=0 online_maps=56 global_avl=11241, global_rsvd=25, total_alloc=15
Here IRQ 31 is shut down and the vector freed.
> 19d... 1610359249us : irq_matrix_reserve: online_maps=56 global_avl=11241, global_rsvd=26, total_alloc=15
> 19d... 1610359249us : vector_reserve: irq=31 ret=0
> 19d... 1610359249us : vector_config: irq=31 vector=239 cpu=0 apicdest=0x00000000
And set to the magic reservation vector 239 to catch spurious interrupts.
> 20dN.. 1610366654us : vector_activate: irq=31 is_managed=0 can_reserve=1 reserve=0
> 20dN.. 1610366654us : vector_alloc: irq=31 vector=4294967268 reserved=1 ret=0
> 20dN.. 1610366655us : irq_matrix_alloc: bit=33 cpu=9 online=1 avl=200 alloc=1 managed=0 online_maps=56 global_avl=11240, global_rsvd=28, total_alloc=16
> 20dN.. 1610366655us : vector_update: irq=31 vector=33 cpu=9 prev_vector=0 prev_cpu=20
> 20dN.. 1610366656us : vector_alloc: irq=31 vector=33 reserved=1 ret=0
> 20dN.. 1610366656us : vector_config: irq=31 vector=33 cpu=9 apicdest=0x00000014
So here it gets initialized again and targets CPU9 now.
> 20dN.. 1610366662us : irq_matrix_alloc: bit=33 cpu=20 online=1 avl=200 alloc=1 managed=0 online_maps=56 global_avl=11240, global_rsvd=28, total_alloc=16
> 20dN.. 1610366662us : vector_update: irq=31 vector=33 cpu=20 prev_vector=33 prev_cpu=9
> 20dN.. 1610366662us : vector_alloc: irq=31 vector=33 reserved=1 ret=0
> 20dN.. 1610366662us : vector_config: irq=31 vector=33 cpu=20 apicdest=0x0000002c
Here it is reconfigured to CPU 20. Now that update schedules vector 33 on
CPU9 for cleanup.
> 20dN.. 1610366666us : irq_matrix_alloc: bit=34 cpu=2 online=1 avl=199 alloc=2 managed=0 online_maps=56 global_avl=11239, global_rsvd=28, total_alloc=17
> 20dN.. 1610366666us : vector_update: irq=31 vector=34 cpu=2 prev_vector=33 prev_cpu=20
> 20dN.. 1610366666us : vector_alloc: irq=31 vector=34 reserved=1 ret=0
> 20dN.. 1610366666us : vector_config: irq=31 vector=34 cpu=2 apicdest=0x00000004
So here the shit hits the fan because that update schedules vector 33 on
CPU20 for cleanup while the previous cleanup for CPU9 has not been done
yet. Cute. Or not so cute...
> 20dNh. 1610366669us : vector_free_moved: irq=31 cpu=20 vector=33 is_managed=0
> 20dNh. 1610366670us : irq_matrix_free: bit=33 cpu=20 online=1 avl=201 alloc=0 managed=0 online_maps=56 global_avl=11240, global_rsvd=28, total_alloc=16
And frees the CPU 20 vector
> 9d.h. 1610366696us : vector_free_moved: irq=31 cpu=20 vector=0 is_managed=0
And then CPU9 claims that it's queued for cleanup. Bah.
I'm still working on a fix as the elegant solution refused to work.
Thanks,
tglx
Hi Thomas,
Sorry for asking questions at this point in the series; my mailbox was
dropped from the mailing list a few days ago, so I didn't receive the
earlier e-mails.
Please see below.
At 05/29/2018 04:09 AM, Thomas Gleixner wrote:
> On Mon, 28 May 2018, Song Liu wrote:
>> This doesn't fix the issue with bnxt. Here is a trace with this patch:
[...]
>
> Thanks for providing the data!
>
> > 19d... 1610359248us : vector_deactivate: irq=31 is_managed=0 can_reserve=1 reserve=0
> > 19d... 1610359248us : vector_clear: irq=31 vector=33 cpu=20 prev_vector=0 prev_cpu=2
> > 19d... 1610359249us : irq_matrix_free: bit=33 cpu=20 online=1 avl=201 alloc=0 managed=0 online_maps=56 global_avl=11241, global_rsvd=25, total_alloc=15
>
> Here IRQ 31 is shut down and the vector freed.
>
> > 19d... 1610359249us : irq_matrix_reserve: online_maps=56 global_avl=11241, global_rsvd=26, total_alloc=15
> > 19d... 1610359249us : vector_reserve: irq=31 ret=0
> > 19d... 1610359249us : vector_config: irq=31 vector=239 cpu=0 apicdest=0x00000000
>
> And set to the magic reservation vector 239 to catch spurious interrupts.
>
> > 20dN.. 1610366654us : vector_activate: irq=31 is_managed=0 can_reserve=1 reserve=0
> > 20dN.. 1610366654us : vector_alloc: irq=31 vector=4294967268 reserved=1 ret=0
> > 20dN.. 1610366655us : irq_matrix_alloc: bit=33 cpu=9 online=1 avl=200 alloc=1 managed=0 online_maps=56 global_avl=11240, global_rsvd=28, total_alloc=16
> > 20dN.. 1610366655us : vector_update: irq=31 vector=33 cpu=9 prev_vector=0 prev_cpu=20

Here "prev_vector=0" means there is no associated previous vector.

> > 20dN.. 1610366656us : vector_alloc: irq=31 vector=33 reserved=1 ret=0
> > 20dN.. 1610366656us : vector_config: irq=31 vector=33 cpu=9 apicdest=0x00000014
>
> So here it gets initialized again and targets CPU9 now.
>
> > 20dN.. 1610366662us : irq_matrix_alloc: bit=33 cpu=20 online=1 avl=200 alloc=1 managed=0 online_maps=56 global_avl=11240, global_rsvd=28, total_alloc=16
> > 20dN.. 1610366662us : vector_update: irq=31 vector=33 cpu=20 prev_vector=33 prev_cpu=9
> > 20dN.. 1610366662us : vector_alloc: irq=31 vector=33 reserved=1 ret=0
> > 20dN.. 1610366662us : vector_config: irq=31 vector=33 cpu=20 apicdest=0x0000002c
>
> Here it is reconfigured to CPU 20. Now that update schedules vector 33 on
> CPU9 for cleanup.
>
> > 20dN.. 1610366666us : irq_matrix_alloc: bit=34 cpu=2 online=1 avl=199 alloc=2 managed=0 online_maps=56 global_avl=11239, global_rsvd=28, total_alloc=17
> > 20dN.. 1610366666us : vector_update: irq=31 vector=34 cpu=2 prev_vector=33 prev_cpu=20
> > 20dN.. 1610366666us : vector_alloc: irq=31 vector=34 reserved=1 ret=0
> > 20dN.. 1610366666us : vector_config: irq=31 vector=34 cpu=2 apicdest=0x00000004
>
> So here the shit hits the fan because that update schedules vector 33 on
> CPU20 for cleanup while the previous cleanup for CPU9 has not been done
> yet. Cute. Or not so cute...
>
> > 20dNh. 1610366669us : vector_free_moved: irq=31 cpu=20 vector=33 is_managed=0
> > 20dNh. 1610366670us : irq_matrix_free: bit=33 cpu=20 online=1 avl=201 alloc=0 managed=0 online_maps=56 global_avl=11240, global_rsvd=28, total_alloc=16
>
> And frees the CPU 20 vector
>
> > 9d.h. 1610366696us : vector_free_moved: irq=31 cpu=20 vector=0 is_managed=0
>
Here, couldn't we avoid this cleanup with the following?
diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index a75de0792942..0cc59646755f 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -821,6 +821,9 @@ static void free_moved_vector(struct apic_chip_data *apicd)
*/
WARN_ON_ONCE(managed);
+ if (!vector)
+ return;
+
trace_vector_free_moved(apicd->irq, cpu, vector, managed);
irq_matrix_free(vector_matrix, cpu, vector, managed);
per_cpu(vector_irq, cpu)[vector] = VECTOR_UNUSED;
Is there something I didn't consider? ;-)
Thanks,
dou.
> And then CPU9 claims that it's queued for cleanup. Bah.
>
> I'm still working on a fix as the elegant solution refused to work.
>
> Thanks,
>
> tglx
On Mon, 4 Jun 2018, Dou Liyang wrote:
> Here, couldn't we avoid this cleanup with the following?
>
> diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
> index a75de0792942..0cc59646755f 100644
> --- a/arch/x86/kernel/apic/vector.c
> +++ b/arch/x86/kernel/apic/vector.c
> @@ -821,6 +821,9 @@ static void free_moved_vector(struct apic_chip_data *apicd)
> */
> WARN_ON_ONCE(managed);
>
> + if (!vector)
> + return;
> +
> trace_vector_free_moved(apicd->irq, cpu, vector, managed);
> irq_matrix_free(vector_matrix, cpu, vector, managed);
> per_cpu(vector_irq, cpu)[vector] = VECTOR_UNUSED;
>
> Is there something I didn't consider? ;-)
Well, that just prevents the warning, but the hlist is already
corrupted. So you'd just cure the symptom ...
I'm about to send a patch series which addresses that. Just need to finish
writing changelogs.
Thanks,
tglx
Hi Thomas,
At 06/04/2018 07:17 PM, Thomas Gleixner wrote:
> On Mon, 4 Jun 2018, Dou Liyang wrote:
>> Here, couldn't we avoid this cleanup with the following?
>>
>> diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
>> index a75de0792942..0cc59646755f 100644
>> --- a/arch/x86/kernel/apic/vector.c
>> +++ b/arch/x86/kernel/apic/vector.c
>> @@ -821,6 +821,9 @@ static void free_moved_vector(struct apic_chip_data *apicd)
>> */
>> WARN_ON_ONCE(managed);
>>
>> + if (!vector)
>> + return;
>> +
>> trace_vector_free_moved(apicd->irq, cpu, vector, managed);
>> irq_matrix_free(vector_matrix, cpu, vector, managed);
>> per_cpu(vector_irq, cpu)[vector] = VECTOR_UNUSED;
>>
>> Is there something I didn't consider? ;-)
>
> Well, that just prevents the warning, but the hlist is already
> corrupted. So you'd just cure the symptom ...
>
I see.
> I'm about to send a patch series which addresses that. Just need to finish
> writing changelogs.
>
Thank you for letting me know.
Thanks,
dou