2019-07-16 22:55:58

by Clark Williams

[permalink] [raw]
Subject: [PREEMPT_RT] splat in v5.2-rt1: r t_mutex_owner(lock) != current

Saw this after applying my thermal lock to raw patch and the change in i915 for lockdep. The
splat occurred on boot when creating the kdump initramfs. System is an Intel NUC i7 with 32GB ram
and 256GB SSD for rootfs.

The booting kernel has rt_mutex debugging turned on as well as lockdep and lockup configs.

Jul 16 14:41:48 theseus dracut[3082]: *** Creating initramfs image file '/boot/initramfs-5.2.0-rt1.fixes+kdump.img' done ***
Jul 16 14:41:48 theseus kernel: ------------[ cut here ]------------
Jul 16 14:41:48 theseus kernel: DEBUG_LOCKS_WARN_ON(rt_mutex_owner(lock) != current)
Jul 16 14:41:48 theseus kernel: WARNING: CPU: 1 PID: 8349 at kernel/locking/rtmutex-debug.c:145 debug_rt_mutex_unlock+0x47/0x50
Jul 16 14:41:48 theseus kernel: Modules linked in: rfcomm xt_CHECKSUM xt_MASQUERADE tun bridge stp llc fuse nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ip6table_nat ip6table_mangle ip6table_raw>
Jul 16 14:41:48 theseus kernel: snd_rawmidi snd_hda_core media snd_hwdep snd_seq btusb wmi_bmof snd_seq_device iwlwifi btrtl intel_wmi_thunderbolt btbcm snd_pcm iTCO_wdt btintel iTCO_vendor_support pcspkr bluetooth snd_timer rtsx_pci_ms cfg80211 snd memstick ecdh_generic i2c_i801 soundcore ec>
Jul 16 14:41:48 theseus kernel: CPU: 1 PID: 8349 Comm: fsfreeze Not tainted 5.2.0-rt1.fixes+ #16
Jul 16 14:41:48 theseus kernel: Hardware name: Intel Corporation NUC7i7BNH/NUC7i7BNB, BIOS BNKBL357.86A.0054.2017.1025.1822 10/25/2017
Jul 16 14:41:48 theseus kernel: RIP: 0010:debug_rt_mutex_unlock+0x47/0x50
Jul 16 14:41:48 theseus kernel: Code: c2 75 01 c3 e8 6a c1 3e 00 85 c0 74 f6 8b 05 30 3c 66 01 85 c0 75 ec 48 c7 c6 a0 b3 2e b1 48 c7 c7 48 bf 2c b1 e8 42 7d f8 ff <0f> 0b c3 66 0f 1f 44 00 00 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f
Jul 16 14:41:48 theseus kernel: RSP: 0018:ffffc03c5b607dd0 EFLAGS: 00010086
Jul 16 14:41:48 theseus kernel: RAX: 0000000000000000 RBX: ffff9a7d6deb0d98 RCX: 0000000000000000
Jul 16 14:41:48 theseus kernel: RDX: ffffffffb167ce50 RSI: 00000000ffffffff RDI: 00000000ffffffff
Jul 16 14:41:48 theseus kernel: RBP: ffff9a7d6deb0ab0 R08: 0000000000000000 R09: ffffffffb167cd20
Jul 16 14:41:48 theseus kernel: R10: ffffc03c5b607d10 R11: ffffffffb2aa38eb R12: 0000000000000246
Jul 16 14:41:48 theseus kernel: R13: ffffc03c5b607e00 R14: ffffc03c5b607e10 R15: ffffffffb034c53f
Jul 16 14:41:48 theseus kernel: FS: 00007fd6e2f0e540(0000) GS:ffff9a7d9e600000(0000) knlGS:0000000000000000
Jul 16 14:41:48 theseus kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 16 14:41:48 theseus kernel: CR2: 0000563557bc0178 CR3: 0000000792188006 CR4: 00000000003606e0
Jul 16 14:41:48 theseus kernel: Call Trace:
Jul 16 14:41:48 theseus kernel: rt_mutex_slowunlock+0x25/0x80
Jul 16 14:41:48 theseus kernel: __rt_mutex_unlock+0x45/0x80
Jul 16 14:41:48 theseus kernel: percpu_up_write+0x1f/0x30
Jul 16 14:41:48 theseus kernel: thaw_super_locked+0xde/0x110
Jul 16 14:41:48 theseus kernel: do_vfs_ioctl+0x5de/0x720
Jul 16 14:41:48 theseus kernel: ksys_ioctl+0x5e/0x90
Jul 16 14:41:48 theseus kernel: __x64_sys_ioctl+0x16/0x20
Jul 16 14:41:48 theseus kernel: do_syscall_64+0x66/0xb0
Jul 16 14:41:48 theseus kernel: entry_SYSCALL_64_after_hwframe+0x49/0xbe
Jul 16 14:41:48 theseus kernel: RIP: 0033:0x7fd6e2e391fb
Jul 16 14:41:48 theseus kernel: Code: 0f 1e fa 48 8b 05 8d dc 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 5d dc 0c 00 f7 d8 64 89 01 48
Jul 16 14:41:48 theseus kernel: RSP: 002b:00007ffe61e2f498 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Jul 16 14:41:48 theseus kernel: RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fd6e2e391fb
Jul 16 14:41:48 theseus kernel: RDX: 0000000000000000 RSI: 00000000c0045878 RDI: 0000000000000003
Jul 16 14:41:48 theseus kernel: RBP: 0000000000000003 R08: 0000000000000001 R09: 0000000000000000
Jul 16 14:41:48 theseus kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002
Jul 16 14:41:48 theseus kernel: R13: 00007ffe61e309fa R14: 0000000000000000 R15: 0000000000000000
Jul 16 14:41:48 theseus kernel: irq event stamp: 6254
Jul 16 14:41:48 theseus kernel: hardirqs last enabled at (6253): [<ffffffffb0ac8590>] _raw_spin_unlock_irqrestore+0x60/0x90
Jul 16 14:41:48 theseus kernel: hardirqs last disabled at (6254): [<ffffffffb0ac8713>] _raw_spin_lock_irqsave+0x23/0x90
Jul 16 14:41:48 theseus kernel: softirqs last enabled at (3330): [<ffffffffb003e4a8>] fpu__clear+0x88/0x200
Jul 16 14:41:48 theseus kernel: softirqs last disabled at (3327): [<ffffffffb003e46b>] fpu__clear+0x4b/0x200
Jul 16 14:41:48 theseus kernel: ---[ end trace 0000000000000002 ]---
Jul 16 14:41:49 theseus kdumpctl[1500]: kexec: loaded kdump kernel
Jul 16 14:41:49 theseus kdumpctl[1500]: Starting kdump: [OK]

--
The United States Coast Guard
Ruining Natural Selection since 1790


2019-07-17 06:36:11

by Juri Lelli

[permalink] [raw]
Subject: Re: [PREEMPT_RT] splat in v5.2-rt1: r t_mutex_owner(lock) != current

Hi Clark,

On 16/07/19 17:55, Clark Williams wrote:
> Saw this after applying my thermal lock to raw patch and the change in i915 for lockdep. The
> splat occurred on boot when creating the kdump initramfs. System is an Intel NUC i7 with 32GB ram
> and 256GB SSD for rootfs.
>
> The booting kernel has rt_mutex debugging turned on as well as lockdep and lockup configs.
>
> Jul 16 14:41:48 theseus dracut[3082]: *** Creating initramfs image file '/boot/initramfs-5.2.0-rt1.fixes+kdump.img' done ***
> Jul 16 14:41:48 theseus kernel: ------------[ cut here ]------------
> Jul 16 14:41:48 theseus kernel: DEBUG_LOCKS_WARN_ON(rt_mutex_owner(lock) != current)
> Jul 16 14:41:48 theseus kernel: WARNING: CPU: 1 PID: 8349 at kernel/locking/rtmutex-debug.c:145 debug_rt_mutex_unlock+0x47/0x50
> Jul 16 14:41:48 theseus kernel: Modules linked in: rfcomm xt_CHECKSUM xt_MASQUERADE tun bridge stp llc fuse nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ip6table_nat ip6table_mangle ip6table_raw>
> Jul 16 14:41:48 theseus kernel: snd_rawmidi snd_hda_core media snd_hwdep snd_seq btusb wmi_bmof snd_seq_device iwlwifi btrtl intel_wmi_thunderbolt btbcm snd_pcm iTCO_wdt btintel iTCO_vendor_support pcspkr bluetooth snd_timer rtsx_pci_ms cfg80211 snd memstick ecdh_generic i2c_i801 soundcore ec>
> Jul 16 14:41:48 theseus kernel: CPU: 1 PID: 8349 Comm: fsfreeze Not tainted 5.2.0-rt1.fixes+ #16
> Jul 16 14:41:48 theseus kernel: Hardware name: Intel Corporation NUC7i7BNH/NUC7i7BNB, BIOS BNKBL357.86A.0054.2017.1025.1822 10/25/2017
> Jul 16 14:41:48 theseus kernel: RIP: 0010:debug_rt_mutex_unlock+0x47/0x50
> Jul 16 14:41:48 theseus kernel: Code: c2 75 01 c3 e8 6a c1 3e 00 85 c0 74 f6 8b 05 30 3c 66 01 85 c0 75 ec 48 c7 c6 a0 b3 2e b1 48 c7 c7 48 bf 2c b1 e8 42 7d f8 ff <0f> 0b c3 66 0f 1f 44 00 00 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f
> Jul 16 14:41:48 theseus kernel: RSP: 0018:ffffc03c5b607dd0 EFLAGS: 00010086
> Jul 16 14:41:48 theseus kernel: RAX: 0000000000000000 RBX: ffff9a7d6deb0d98 RCX: 0000000000000000
> Jul 16 14:41:48 theseus kernel: RDX: ffffffffb167ce50 RSI: 00000000ffffffff RDI: 00000000ffffffff
> Jul 16 14:41:48 theseus kernel: RBP: ffff9a7d6deb0ab0 R08: 0000000000000000 R09: ffffffffb167cd20
> Jul 16 14:41:48 theseus kernel: R10: ffffc03c5b607d10 R11: ffffffffb2aa38eb R12: 0000000000000246
> Jul 16 14:41:48 theseus kernel: R13: ffffc03c5b607e00 R14: ffffc03c5b607e10 R15: ffffffffb034c53f
> Jul 16 14:41:48 theseus kernel: FS: 00007fd6e2f0e540(0000) GS:ffff9a7d9e600000(0000) knlGS:0000000000000000
> Jul 16 14:41:48 theseus kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Jul 16 14:41:48 theseus kernel: CR2: 0000563557bc0178 CR3: 0000000792188006 CR4: 00000000003606e0
> Jul 16 14:41:48 theseus kernel: Call Trace:
> Jul 16 14:41:48 theseus kernel: rt_mutex_slowunlock+0x25/0x80
> Jul 16 14:41:48 theseus kernel: __rt_mutex_unlock+0x45/0x80
> Jul 16 14:41:48 theseus kernel: percpu_up_write+0x1f/0x30
> Jul 16 14:41:48 theseus kernel: thaw_super_locked+0xde/0x110
> Jul 16 14:41:48 theseus kernel: do_vfs_ioctl+0x5de/0x720
> Jul 16 14:41:48 theseus kernel: ksys_ioctl+0x5e/0x90
> Jul 16 14:41:48 theseus kernel: __x64_sys_ioctl+0x16/0x20
> Jul 16 14:41:48 theseus kernel: do_syscall_64+0x66/0xb0
> Jul 16 14:41:48 theseus kernel: entry_SYSCALL_64_after_hwframe+0x49/0xbe
> Jul 16 14:41:48 theseus kernel: RIP: 0033:0x7fd6e2e391fb
> Jul 16 14:41:48 theseus kernel: Code: 0f 1e fa 48 8b 05 8d dc 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 5d dc 0c 00 f7 d8 64 89 01 48
> Jul 16 14:41:48 theseus kernel: RSP: 002b:00007ffe61e2f498 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> Jul 16 14:41:48 theseus kernel: RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fd6e2e391fb
> Jul 16 14:41:48 theseus kernel: RDX: 0000000000000000 RSI: 00000000c0045878 RDI: 0000000000000003
> Jul 16 14:41:48 theseus kernel: RBP: 0000000000000003 R08: 0000000000000001 R09: 0000000000000000
> Jul 16 14:41:48 theseus kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002
> Jul 16 14:41:48 theseus kernel: R13: 00007ffe61e309fa R14: 0000000000000000 R15: 0000000000000000
> Jul 16 14:41:48 theseus kernel: irq event stamp: 6254
> Jul 16 14:41:48 theseus kernel: hardirqs last enabled at (6253): [<ffffffffb0ac8590>] _raw_spin_unlock_irqrestore+0x60/0x90
> Jul 16 14:41:48 theseus kernel: hardirqs last disabled at (6254): [<ffffffffb0ac8713>] _raw_spin_lock_irqsave+0x23/0x90
> Jul 16 14:41:48 theseus kernel: softirqs last enabled at (3330): [<ffffffffb003e4a8>] fpu__clear+0x88/0x200
> Jul 16 14:41:48 theseus kernel: softirqs last disabled at (3327): [<ffffffffb003e46b>] fpu__clear+0x4b/0x200
> Jul 16 14:41:48 theseus kernel: ---[ end trace 0000000000000002 ]---
> Jul 16 14:41:49 theseus kdumpctl[1500]: kexec: loaded kdump kernel
> Jul 16 14:41:49 theseus kdumpctl[1500]: Starting kdump: [OK]

Looks like this is the same percpu_rwsem problem Peter is chasing down:

https://lore.kernel.org/linux-rt-users/[email protected]/

Thanks,

Juri

2019-07-17 13:52:08

by Clark Williams

[permalink] [raw]
Subject: Re: [PREEMPT_RT] splat in v5.2-rt1: r t_mutex_owner(lock) != current

On Wed, 17 Jul 2019 08:34:59 +0200
Juri Lelli <[email protected]> wrote:

> Hi Clark,
>
> On 16/07/19 17:55, Clark Williams wrote:
> > Saw this after applying my thermal lock to raw patch and the change in i915 for lockdep. The
> > splat occurred on boot when creating the kdump initramfs. System is an Intel NUC i7 with 32GB ram
> > and 256GB SSD for rootfs.
> >
> > The booting kernel has rt_mutex debugging turned on as well as lockdep and lockup configs.
> >
> > Jul 16 14:41:48 theseus dracut[3082]: *** Creating initramfs image file '/boot/initramfs-5.2.0-rt1.fixes+kdump.img' done ***
> > Jul 16 14:41:48 theseus kernel: ------------[ cut here ]------------
> > Jul 16 14:41:48 theseus kernel: DEBUG_LOCKS_WARN_ON(rt_mutex_owner(lock) != current)
> > Jul 16 14:41:48 theseus kernel: WARNING: CPU: 1 PID: 8349 at kernel/locking/rtmutex-debug.c:145 debug_rt_mutex_unlock+0x47/0x50
> > Jul 16 14:41:48 theseus kernel: Modules linked in: rfcomm xt_CHECKSUM xt_MASQUERADE tun bridge stp llc fuse nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ip6table_nat ip6table_mangle ip6table_raw>
> > Jul 16 14:41:48 theseus kernel: snd_rawmidi snd_hda_core media snd_hwdep snd_seq btusb wmi_bmof snd_seq_device iwlwifi btrtl intel_wmi_thunderbolt btbcm snd_pcm iTCO_wdt btintel iTCO_vendor_support pcspkr bluetooth snd_timer rtsx_pci_ms cfg80211 snd memstick ecdh_generic i2c_i801 soundcore ec>
> > Jul 16 14:41:48 theseus kernel: CPU: 1 PID: 8349 Comm: fsfreeze Not tainted 5.2.0-rt1.fixes+ #16
> > Jul 16 14:41:48 theseus kernel: Hardware name: Intel Corporation NUC7i7BNH/NUC7i7BNB, BIOS BNKBL357.86A.0054.2017.1025.1822 10/25/2017
> > Jul 16 14:41:48 theseus kernel: RIP: 0010:debug_rt_mutex_unlock+0x47/0x50
> > Jul 16 14:41:48 theseus kernel: Code: c2 75 01 c3 e8 6a c1 3e 00 85 c0 74 f6 8b 05 30 3c 66 01 85 c0 75 ec 48 c7 c6 a0 b3 2e b1 48 c7 c7 48 bf 2c b1 e8 42 7d f8 ff <0f> 0b c3 66 0f 1f 44 00 00 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f
> > Jul 16 14:41:48 theseus kernel: RSP: 0018:ffffc03c5b607dd0 EFLAGS: 00010086
> > Jul 16 14:41:48 theseus kernel: RAX: 0000000000000000 RBX: ffff9a7d6deb0d98 RCX: 0000000000000000
> > Jul 16 14:41:48 theseus kernel: RDX: ffffffffb167ce50 RSI: 00000000ffffffff RDI: 00000000ffffffff
> > Jul 16 14:41:48 theseus kernel: RBP: ffff9a7d6deb0ab0 R08: 0000000000000000 R09: ffffffffb167cd20
> > Jul 16 14:41:48 theseus kernel: R10: ffffc03c5b607d10 R11: ffffffffb2aa38eb R12: 0000000000000246
> > Jul 16 14:41:48 theseus kernel: R13: ffffc03c5b607e00 R14: ffffc03c5b607e10 R15: ffffffffb034c53f
> > Jul 16 14:41:48 theseus kernel: FS: 00007fd6e2f0e540(0000) GS:ffff9a7d9e600000(0000) knlGS:0000000000000000
> > Jul 16 14:41:48 theseus kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > Jul 16 14:41:48 theseus kernel: CR2: 0000563557bc0178 CR3: 0000000792188006 CR4: 00000000003606e0
> > Jul 16 14:41:48 theseus kernel: Call Trace:
> > Jul 16 14:41:48 theseus kernel: rt_mutex_slowunlock+0x25/0x80
> > Jul 16 14:41:48 theseus kernel: __rt_mutex_unlock+0x45/0x80
> > Jul 16 14:41:48 theseus kernel: percpu_up_write+0x1f/0x30
> > Jul 16 14:41:48 theseus kernel: thaw_super_locked+0xde/0x110
> > Jul 16 14:41:48 theseus kernel: do_vfs_ioctl+0x5de/0x720
> > Jul 16 14:41:48 theseus kernel: ksys_ioctl+0x5e/0x90
> > Jul 16 14:41:48 theseus kernel: __x64_sys_ioctl+0x16/0x20
> > Jul 16 14:41:48 theseus kernel: do_syscall_64+0x66/0xb0
> > Jul 16 14:41:48 theseus kernel: entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > Jul 16 14:41:48 theseus kernel: RIP: 0033:0x7fd6e2e391fb
> > Jul 16 14:41:48 theseus kernel: Code: 0f 1e fa 48 8b 05 8d dc 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 5d dc 0c 00 f7 d8 64 89 01 48
> > Jul 16 14:41:48 theseus kernel: RSP: 002b:00007ffe61e2f498 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > Jul 16 14:41:48 theseus kernel: RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fd6e2e391fb
> > Jul 16 14:41:48 theseus kernel: RDX: 0000000000000000 RSI: 00000000c0045878 RDI: 0000000000000003
> > Jul 16 14:41:48 theseus kernel: RBP: 0000000000000003 R08: 0000000000000001 R09: 0000000000000000
> > Jul 16 14:41:48 theseus kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002
> > Jul 16 14:41:48 theseus kernel: R13: 00007ffe61e309fa R14: 0000000000000000 R15: 0000000000000000
> > Jul 16 14:41:48 theseus kernel: irq event stamp: 6254
> > Jul 16 14:41:48 theseus kernel: hardirqs last enabled at (6253): [<ffffffffb0ac8590>] _raw_spin_unlock_irqrestore+0x60/0x90
> > Jul 16 14:41:48 theseus kernel: hardirqs last disabled at (6254): [<ffffffffb0ac8713>] _raw_spin_lock_irqsave+0x23/0x90
> > Jul 16 14:41:48 theseus kernel: softirqs last enabled at (3330): [<ffffffffb003e4a8>] fpu__clear+0x88/0x200
> > Jul 16 14:41:48 theseus kernel: softirqs last disabled at (3327): [<ffffffffb003e46b>] fpu__clear+0x4b/0x200
> > Jul 16 14:41:48 theseus kernel: ---[ end trace 0000000000000002 ]---
> > Jul 16 14:41:49 theseus kdumpctl[1500]: kexec: loaded kdump kernel
> > Jul 16 14:41:49 theseus kdumpctl[1500]: Starting kdump: [OK]
>
> Looks like this is the same percpu_rwsem problem Peter is chasing down:
>
> https://lore.kernel.org/linux-rt-users/[email protected]/
>
> Thanks,
>
> Juri

Yep, that's it. Interesting that he had to run a reproducer and mine just happens on boot.

--
The United States Coast Guard
Ruining Natural Selection since 1790