2017-03-02 07:05:39

by Xishi Qiu

Subject: WARNING: at arch/x86/kernel/cpu/perf_event_intel_cqm.c:186 __put_rmid+0x28/0x80()

Hi, I was running Trinity and got the following log.
My OS is RHEL 7.2, and I'm not sure whether this has already been fixed in mainline.
Any comments are welcome.

[57676.532593] ------------[ cut here ]------------
[57676.537415] WARNING: at arch/x86/kernel/cpu/perf_event_intel_cqm.c:186 __put_rmid+0x28/0x80()
[57676.546299] Modules linked in: 8021q garp stp mrp llc fuse cmtp kernelcapi scsi_transport_iscsi rfcomm dccp_ipv6 dccp_ipv4 dccp ipt_ULOG bluetooth rfkill af_key nfnetlink af_802154 vmw_vsock_vmci_transport vmw_vmci vsock atm pppoe pppox ppp_generic slhc openvswitch ipmi_devintf ipmi_si ipmi_msghandler coretemp intel_rapl crc32_pclmul crc32c_intel ghash_clmulni_intel iTCO_wdt iTCO_vendor_support tg3 aesni_intel lrw gf128mul glue_helper ptp pps_core ses enclosure ablk_helper cryptd sg sb_edac pcspkr edac_core acpi_power_meter i2c_i801 i2c_core mei_me mei shpchp lpc_ich mfd_core ip_tables ext3 mbcache jbd sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common ahci libahci libata megaraid_sas dm_mod nf_conntrack_ipv4 nf_defrag_ipv4 vhost_net tun vhost macvtap macvlan vfio_pci irqbypass
[57676.620009] vfio_iommu_type1 vfio xt_sctp nf_conntrack_proto_sctp nf_nat_proto_sctp nf_nat nf_conntrack sctp libcrc32c
[57676.629998] CPU: 11 PID: 114 Comm: kworker/11:0 Not tainted 3.10.0-327.44.58.22.x86_64 #1
[57676.638525] Hardware name: Huawei RH2288H V3/BC11HGSA0, BIOS 3.31 08/22/2016
[57676.645746] Workqueue: events intel_cqm_rmid_rotate
[57676.650821] 0000000000000000 00000000701d762a ffff882027873d60 ffffffff8163b400
[57676.658639] ffff882027873d98 ffffffff8107b1f0 0000000000000001 00000000ffffffff
[57676.666458] ffff88201fba7000 ffff88100ceeb400 ffff8820265730c0 ffff882027873da8
[57676.674285] Call Trace:
[57676.676915] [<ffffffff8163b400>] dump_stack+0x19/0x1b
[57676.682227] [<ffffffff8107b1f0>] warn_slowpath_common+0x70/0xb0
[57676.688404] [<ffffffff8107b33a>] warn_slowpath_null+0x1a/0x20
[57676.694409] [<ffffffff8103a578>] __put_rmid+0x28/0x80
[57676.699694] [<ffffffff8103a74a>] intel_cqm_rmid_rotate+0xba/0x440
[57676.706051] [<ffffffff8109d8cb>] process_one_work+0x17b/0x470
[57676.712070] [<ffffffff8109e69b>] worker_thread+0x11b/0x400
[57676.717819] [<ffffffff8109e580>] ? rescuer_thread+0x400/0x400
[57676.723830] [<ffffffff810a5ddf>] kthread+0xcf/0xe0
[57676.728883] [<ffffffff810a5d10>] ? kthread_create_on_node+0x140/0x140
[57676.735586] [<ffffffff8164b6d8>] ret_from_fork+0x58/0x90
[57676.741156] [<ffffffff810a5d10>] ? kthread_create_on_node+0x140/0x140
[57676.747860] ---[ end trace dee4db6217e5fc5d ]---
[57676.752669] BUG: unable to handle kernel NULL pointer dereference at (null)
[57676.760870] IP: [<ffffffff8103a586>] __put_rmid+0x36/0x80
[57676.766458] PGD 1026c33067 PUD 10282c4067 PMD 0
[57676.771279] Oops: 0000 [#1] SMP
[57676.774722] Modules linked in: 8021q garp stp mrp llc fuse cmtp kernelcapi scsi_transport_iscsi rfcomm dccp_ipv6 dccp_ipv4 dccp ipt_ULOG bluetooth rfkill af_key nfnetlink af_802154 vmw_vsock_vmci_transport vmw_vmci vsock atm pppoe pppox ppp_generic slhc openvswitch ipmi_devintf ipmi_si ipmi_msghandler coretemp intel_rapl crc32_pclmul crc32c_intel ghash_clmulni_intel iTCO_wdt iTCO_vendor_support tg3 aesni_intel lrw gf128mul glue_helper ptp pps_core ses enclosure ablk_helper cryptd sg sb_edac pcspkr edac_core acpi_power_meter i2c_i801 i2c_core mei_me mei shpchp lpc_ich mfd_core ip_tables ext3 mbcache jbd sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common ahci libahci libata megaraid_sas dm_mod nf_conntrack_ipv4 nf_defrag_ipv4 vhost_net tun vhost macvtap macvlan vfio_pci irqbypass
[57676.848405] vfio_iommu_type1 vfio xt_sctp nf_conntrack_proto_sctp nf_nat_proto_sctp nf_nat nf_conntrack sctp libcrc32c
[57676.858377] CPU: 11 PID: 114 Comm: kworker/11:0 Tainted: G W ---- ------- 3.10.0-327.44.58.22.x86_64 #1
[57676.869330] Hardware name: Huawei RH2288H V3/BC11HGSA0, BIOS 3.31 08/22/2016
[57676.876548] Workqueue: events intel_cqm_rmid_rotate
[57676.881618] task: ffff882027868b80 ti: ffff882027870000 task.ti: ffff882027870000
[57676.889448] RIP: 0010:[<ffffffff8103a586>] [<ffffffff8103a586>] __put_rmid+0x36/0x80
[57676.897638] RSP: 0018:ffff882027873db8 EFLAGS: 00010296
[57676.903119] RAX: ffff8820259e4300 RBX: 0000000000000000 RCX: 0000000000000000
[57676.910419] RDX: ffffffffffffffff RSI: 0000000000000000 RDI: 0000000000000009
[57676.917725] RBP: ffff882027873dc8 R08: 0000000000000092 R09: ffff8800000bcec0
[57676.925031] R10: 00000000000000a0 R11: 0000000000000050 R12: 00000000ffffffff
[57676.932330] R13: ffff88201fba7000 R14: ffff88100ceeb400 R15: ffff8820265730c0
[57676.939631] FS: 0000000000000000(0000) GS:ffff88203e340000(0000) knlGS:0000000000000000
[57676.948064] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[57676.953982] CR2: 0000000000000000 CR3: 0000001026c32000 CR4: 00000000001407e0
[57676.961286] DR0: 00007f2390b3d000 DR1: 00007fc8e188f000 DR2: 0000000000000000
[57676.968586] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[57676.975889] Stack:
[57676.978078] 0000000000000001 ffff8820265730c8 ffff882027873e18 ffffffff8103a74a
[57676.985892] 0000001a8109b3d9 ffff88202676dc00 0000000600000006 ffffffff8196b600
[57676.993703] ffff8810290dc500 ffff88203e356080 ffff88203e35a400 ffff88100b5ecca0
[57677.001496] Call Trace:
[57677.004120] [<ffffffff8103a74a>] intel_cqm_rmid_rotate+0xba/0x440
[57677.010471] [<ffffffff8109d8cb>] process_one_work+0x17b/0x470
[57677.016449] [<ffffffff8109e69b>] worker_thread+0x11b/0x400
[57677.022192] [<ffffffff8109e580>] ? rescuer_thread+0x400/0x400
[57677.028197] [<ffffffff810a5ddf>] kthread+0xcf/0xe0
[57677.033244] [<ffffffff810a5d10>] ? kthread_create_on_node+0x140/0x140
[57677.039939] [<ffffffff8164b6d8>] ret_from_fork+0x58/0x90
[57677.045483] [<ffffffff810a5d10>] ? kthread_create_on_node+0x140/0x140
[57677.052177] Code: e5 41 54 83 f8 fd 41 89 fc 53 76 11 be ba 00 00 00 48 c7 c7 58 f4 85 81 e8 a8 0d 04 00 48 8b 05 69 2c c1 00 49 63 d4 48 8b 1c d0 <44> 3b 23 75 2e 48 8b 05 6e 4a a4 00 c7 43 04 00 00 00 00 48 8d
[57677.072838] RIP [<ffffffff8103a586>] __put_rmid+0x36/0x80
[57677.078510] RSP <ffff882027873db8>
[57677.082172] CR2: 0000000000000000
[57677.086158] ---[ end trace dee4db6217e5fc5e ]---
[57677.451280] Kernel panic - not syncing: Fatal exception
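
One possible reading of the register dump above, offered as a sketch rather than a confirmed diagnosis: the bytes just before the faulting instruction in the Code: line appear to decode as a sign-extending move of R12's low 32 bits into RDX, followed by an 8-byte indexed load from RAX into RBX, and the fault happens when RBX is then dereferenced. The Python below only replays that arithmetic using register values copied from the log. The hand decoding of the instruction bytes, the idea that RAX is the base of a per-RMID pointer array, and the validity rule (0 and 0xffffffff treated as invalid, modelled on mainline's `__rmid_valid()`) are all assumptions; this RHEL kernel's source was not checked.

```python
# Register values copied verbatim from the oops above.
R12 = 0x00000000FFFFFFFF  # low 32 bits: assumed to hold the rmid argument
RAX = 0xFFFF8820259E4300  # assumed base of the pointer array being indexed
RDX = 0xFFFFFFFFFFFFFFFF  # index register at the time of the fault
RBX = 0x0000000000000000  # value loaded from the array slot
CR2 = 0x0000000000000000  # faulting address reported by the CPU

MASK64 = (1 << 64) - 1


def sign_extend_32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def rmid_looks_valid(rmid: int) -> bool:
    """Assumed rule: 0 and 0xffffffff are both treated as invalid RMIDs."""
    return rmid not in (0, 0xFFFFFFFF)


rmid = R12 & 0xFFFFFFFF
index = sign_extend_32(rmid)               # signed array index
slot_address = (RAX + index * 8) & MASK64  # address of the slot that was read

print(f"rmid          = {rmid:#x} (looks valid: {rmid_looks_valid(rmid)})")
print(f"index         = {index}")
print(f"index as u64  = {index & MASK64:#018x}")
print(f"slot address  = {slot_address:#018x}")
print(f"loaded value  = {RBX:#018x}")
```

If this reading is right, the WARNING and the NULL dereference share one cause: an rmid of 0xffffffff reaches __put_rmid(), fails the validity check, and is then used as index -1, so the slot just below the array base is read and happens to contain NULL. That matches RDX and CR2 in the dump, but it does not say how the invalid rmid got there.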


2017-04-10 03:46:04

by Xishi Qiu

Subject: Re: WARNING: at arch/x86/kernel/cpu/perf_event_intel_cqm.c:186 __put_rmid+0x28/0x80()

On 2017/3/2 14:55, Xishi Qiu wrote:
ping

> Hi, I was running Trinity and got the following log.
> My OS is RHEL 7.2, and I'm not sure whether this has already been fixed in mainline.
> Any comments are welcome.



2017-04-10 07:38:40

by Jiri Olsa

Subject: Re: WARNING: at arch/x86/kernel/cpu/perf_event_intel_cqm.c:186 __put_rmid+0x28/0x80()

On Mon, Apr 10, 2017 at 11:44:59AM +0800, Xishi Qiu wrote:
> On 2017/3/2 14:55, Xishi Qiu wrote:
> ping
>
> > Hi, I was running Trinity and got the following log.
> > My OS is RHEL 7.2, and I'm not sure whether this has already been fixed in mainline.
> > Any comments are welcome.

RHEL 7.3 and later releases should have all the cqm fixes that are in mainline.
Any chance you could upgrade?

jirka