2019-02-17 23:41:02

by Sergei Trofimovich

[permalink] [raw]
Subject: 5.0.0-rc6+: Oops at boot: RIP: 0010:__memmove+0x81/0x1a0 / vt_do_kdgkb_ioctl+0x34d/0x440 (race at reenter?)

[ Copying as is from https://bugzilla.kernel.org/show_bug.cgi?id=202605
and sending to LKML. Greg, Jiri, can you clarify mailing
list im MAINTAINERS as well?
https://github.com/torvalds/linux/blob/master/MAINTAINERS#L15527
mentions no list for tty/vt/. ]

Kernel Oops
[ 38.739241] Oops: 0003 [#1] PREEMPT SMP
[ 38.739243] CPU: 6 PID: 388 Comm: loadkeys Tainted: G C 5.0.0-rc6-00153-g5ded5871030e #91
[ 38.739244] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M-D3H, BIOS F12 11/14/2013
[ 38.739249] RIP: 0010:__memmove+0x81/0x1a0
happes on a fresh vanilla master kernel roughly at boot
(before tty login prompt):
$ uname -r
5.0.0-rc6-00153-g5ded5871030e

The kernel page fault happens at 'loadkeys start'.
I suspect some kind of race at reenter of vt_do_kdgkb_ioctl(KDSKBSENT):
https://github.com/torvalds/linux/blob/master/drivers/tty/vt/keyboard.c#L1986

The oops trace looks similar to the following reports (no details besides Oops)
https://bugzilla.kernel.org/show_bug.cgi?id=194589
https://bugzilla.kernel.org/show_bug.cgi?id=202111

[ 38.044921] IPv6: ADDRCONF(NETDEV_CHANGE): br0: link becomes ready
[ 38.533196] usb 1-1.2: r8712u: CustomerID = 0x0000
[ 38.533200] usb 1-1.2: r8712u: MAC Address from efuse = 00:0d:81:a9:09:90
[ 38.533203] usb 1-1.2: r8712u: Loading firmware from "rtlwifi/rtl8712u.bin"
[ 38.533331] usbcore: registered new interface driver r8712u
[ 38.736178] BUG: unable to handle kernel paging request at ffff9c8735448000
[ 38.737215] #PF error: [PROT] [WRITE]
[ 38.737216] PGD 288a05067 P4D 288a05067 PUD 288a07067 PMD 7f60c2063 PTE 80000007f5448161
[ 38.739241] Oops: 0003 [#1] PREEMPT SMP
[ 38.739243] CPU: 6 PID: 388 Comm: loadkeys Tainted: G C 5.0.0-rc6-00153-g5ded5871030e #91
[ 38.739244] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M-D3H, BIOS F12 11/14/2013
[ 38.739249] RIP: 0010:__memmove+0x81/0x1a0
[ 38.739251] Code: 4c 89 4f 10 4c 89 47 18 48 8d 7f 20 73 d4 48 83 c2 20 e9 a2 00 00 00 66 90 48 89 d1 4c 8b 5c 16 f8 4c 8d 54 17 f8 48 c1 e9 03 <f3> 48 a5 4d 89 1a e9 0c 01 00 00 0f 1f 40 00 48 89 d1 4c 8b 1e 49
[ 38.739252] RSP: 0018:ffffa1b9002d7d08 EFLAGS: 00010203
[ 38.745857] RAX: ffff9c873541af43 RBX: ffff9c873541af43 RCX: 00000c6f105cd6bf
[ 38.745858] RDX: 0000637882e986b6 RSI: ffff9c8735447ffb RDI: ffff9c8735447ffb
[ 38.745859] RBP: ffff9c8739cd3800 R08: ffff9c873b802f00 R09: 00000000fffff73b
[ 38.745860] R10: ffffffffb82b35f1 R11: 00505b1b004d5b1b R12: 0000000000000000
[ 38.745861] R13: ffff9c873541af3d R14: 000000000000000b R15: 000000000000000c
[ 38.745862] FS: 00007f450c390580(0000) GS:ffff9c873f180000(0000) knlGS:0000000000000000
[ 38.745863] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 38.745864] CR2: ffff9c8735448000 CR3: 00000007e213c002 CR4: 00000000000606e0
[ 38.745865] Call Trace:
[ 38.745871] vt_do_kdgkb_ioctl+0x34d/0x440
[ 38.745875] vt_ioctl+0xba3/0x1190
[ 38.745879] ? __bpf_prog_run32+0x39/0x60
[ 38.745882] ? mem_cgroup_commit_charge+0x7b/0x4e0
[ 38.762583] tty_ioctl+0x23f/0x920
[ 38.762586] ? preempt_count_sub+0x98/0xe0
[ 38.762590] ? __seccomp_filter+0x67/0x600
[ 38.762594] do_vfs_ioctl+0xa2/0x6a0
[ 38.762597] ? syscall_trace_enter+0x192/0x2d0
[ 38.762599] ksys_ioctl+0x3a/0x70
[ 38.762601] __x64_sys_ioctl+0x16/0x20
[ 38.762604] do_syscall_64+0x54/0xe0
[ 38.772513] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 38.772515] RIP: 0033:0x7f450c2bb427
[ 38.772517] Code: 00 00 00 75 0c 48 c7 c0 ff ff ff ff 48 83 c4 18 c3 e8 8d d2 01 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 39 da 0c 00 f7 d8 64 89 01 48
[ 38.772518] RSP: 002b:00007fffbcedd348 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 38.772519] RAX: ffffffffffffffda RBX: 000000000000000b RCX: 00007f450c2bb427
[ 38.772520] RDX: 00007fffbcedd360 RSI: 0000000000004b49 RDI: 0000000000000003
[ 38.772521] RBP: 00007fffbcedd361 R08: 00007f450c389c40 R09: 000055cbef2494a0
[ 38.772522] R10: 0000000000000000 R11: 0000000000000246 R12: 000055cbef2412b0
[ 38.772522] R13: 00007fffbcedd360 R14: 000000000000000b R15: 0000000000000003
[ 38.772525] Modules linked in: snd_hda_codec_hdmi bridge r8712u(C) stp llc snd_hda_codec_via snd_hda_codec_generic snd_hda_intel snd_hda_codec x86_pkg_temp_thermal dummy kvm_intel snd_hwdep snd_hda_core snd_pcm snd_timer kvm snd atl1c soundcore irqbypass xfs tun nf_conntrack_ftp nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 loop fuse binfmt_misc ipv6
[ 38.779196] r8712u 1-1.2:1.0 wl0: renamed from wlan0
[ 38.779240] CR2: ffff9c8735448000
[ 38.790894] ---[ end trace 8116e48ba19076a0 ]---
[ 38.790897] RIP: 0010:__memmove+0x81/0x1a0
[ 38.790898] Code: 4c 89 4f 10 4c 89 47 18 48 8d 7f 20 73 d4 48 83 c2 20 e9 a2 00 00 00 66 90 48 89 d1 4c 8b 5c 16 f8 4c 8d 54 17 f8 48 c1 e9 03 <f3> 48 a5 4d 89 1a e9 0c 01 00 00 0f 1f 40 00 48 89 d1 4c 8b 1e 49
[ 38.790899] RSP: 0018:ffffa1b9002d7d08 EFLAGS: 00010203
[ 38.790900] RAX: ffff9c873541af43 RBX: ffff9c873541af43 RCX: 00000c6f105cd6bf
[ 38.790901] RDX: 0000637882e986b6 RSI: ffff9c8735447ffb RDI: ffff9c8735447ffb
[ 38.790902] RBP: ffff9c8739cd3800 R08: ffff9c873b802f00 R09: 00000000fffff73b
[ 38.790903] R10: ffffffffb82b35f1 R11: 00505b1b004d5b1b R12: 0000000000000000
[ 38.790904] R13: ffff9c873541af3d R14: 000000000000000b R15: 000000000000000c
[ 38.790905] FS: 00007f450c390580(0000) GS:ffff9c873f180000(0000) knlGS:0000000000000000
[ 38.790906] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 38.790907] CR2: ffff9c8735448000 CR3: 00000007e213c002 CR4: 00000000000606e0
[ 38.790908] BUG: sleeping function called from invalid context at include/linux/percpu-rwsem.h:34
[ 38.790909] in_atomic(): 0, irqs_disabled(): 1, pid: 388, name: loadkeys
[ 38.790911] CPU: 6 PID: 388 Comm: loadkeys Tainted: G D C 5.0.0-rc6-00153-g5ded5871030e #91
[ 38.790911] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M-D3H, BIOS F12 11/14/2013
[ 38.790912] Call Trace:
[ 38.790917] dump_stack+0x67/0x90
[ 38.822550] ? wake_up_klogd+0x10/0x70
[ 38.822553] ___might_sleep.cold.17+0xd4/0xe4
[ 38.822556] exit_signals+0x1c/0x200
[ 38.822558] do_exit+0xa8/0xb90
[ 38.822560] ? ksys_ioctl+0x3a/0x70
[ 38.822562] rewind_stack_do_exit+0x17/0x20

I suspect my system runs multiple loadkeys instances in parallel
and that triggers an Oops.

I noticed that vt_do_kdgkb_ioctl() uses global variable to optimise
func_table re-allocation. Even comment hints at lack of locking:

/* FIXME: This one needs untangling and locking */
int vt_do_kdgkb_ioctl(int cmd, struct kbsentry __user *user_kdgkb, int perm)

https://github.com/torvalds/linux/blob/master/drivers/tty/vt/keyboard.c#L2084

Could that be it?

func_table[] is a global array pointers to to initially statically and later
dynamically allocated buffer.
https://github.com/torvalds/linux/blob/master/drivers/tty/vt/defkeymap.c_shipped#L191

Thanks!

--

Sergei


Attachments:
(No filename) (0.98 kB)
Цифровая подпись OpenPGP

2019-02-18 08:50:20

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: 5.0.0-rc6+: Oops at boot: RIP: 0010:__memmove+0x81/0x1a0 / vt_do_kdgkb_ioctl+0x34d/0x440 (race at reenter?)

On Sun, Feb 17, 2019 at 11:39:57PM +0000, Sergei Trofimovich wrote:
> [ Copying as is from https://bugzilla.kernel.org/show_bug.cgi?id=202605
> and sending to LKML. Greg, Jiri, can you clarify mailing
> list im MAINTAINERS as well?
> https://github.com/torvalds/linux/blob/master/MAINTAINERS#L15527
> mentions no list for tty/vt/. ]
>
> Kernel Oops
> [ 38.739241] Oops: 0003 [#1] PREEMPT SMP
> [ 38.739243] CPU: 6 PID: 388 Comm: loadkeys Tainted: G C 5.0.0-rc6-00153-g5ded5871030e #91
> [ 38.739244] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M-D3H, BIOS F12 11/14/2013
> [ 38.739249] RIP: 0010:__memmove+0x81/0x1a0
> happes on a fresh vanilla master kernel roughly at boot
> (before tty login prompt):
> $ uname -r
> 5.0.0-rc6-00153-g5ded5871030e
>
> The kernel page fault happens at 'loadkeys start'.
> I suspect some kind of race at reenter of vt_do_kdgkb_ioctl(KDSKBSENT):
> https://github.com/torvalds/linux/blob/master/drivers/tty/vt/keyboard.c#L1986
>
> The oops trace looks similar to the following reports (no details besides Oops)
> https://bugzilla.kernel.org/show_bug.cgi?id=194589
> https://bugzilla.kernel.org/show_bug.cgi?id=202111
>
> [ 38.044921] IPv6: ADDRCONF(NETDEV_CHANGE): br0: link becomes ready
> [ 38.533196] usb 1-1.2: r8712u: CustomerID = 0x0000
> [ 38.533200] usb 1-1.2: r8712u: MAC Address from efuse = 00:0d:81:a9:09:90
> [ 38.533203] usb 1-1.2: r8712u: Loading firmware from "rtlwifi/rtl8712u.bin"
> [ 38.533331] usbcore: registered new interface driver r8712u
> [ 38.736178] BUG: unable to handle kernel paging request at ffff9c8735448000
> [ 38.737215] #PF error: [PROT] [WRITE]
> [ 38.737216] PGD 288a05067 P4D 288a05067 PUD 288a07067 PMD 7f60c2063 PTE 80000007f5448161
> [ 38.739241] Oops: 0003 [#1] PREEMPT SMP
> [ 38.739243] CPU: 6 PID: 388 Comm: loadkeys Tainted: G C 5.0.0-rc6-00153-g5ded5871030e #91
> [ 38.739244] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M-D3H, BIOS F12 11/14/2013
> [ 38.739249] RIP: 0010:__memmove+0x81/0x1a0
> [ 38.739251] Code: 4c 89 4f 10 4c 89 47 18 48 8d 7f 20 73 d4 48 83 c2 20 e9 a2 00 00 00 66 90 48 89 d1 4c 8b 5c 16 f8 4c 8d 54 17 f8 48 c1 e9 03 <f3> 48 a5 4d 89 1a e9 0c 01 00 00 0f 1f 40 00 48 89 d1 4c 8b 1e 49
> [ 38.739252] RSP: 0018:ffffa1b9002d7d08 EFLAGS: 00010203
> [ 38.745857] RAX: ffff9c873541af43 RBX: ffff9c873541af43 RCX: 00000c6f105cd6bf
> [ 38.745858] RDX: 0000637882e986b6 RSI: ffff9c8735447ffb RDI: ffff9c8735447ffb
> [ 38.745859] RBP: ffff9c8739cd3800 R08: ffff9c873b802f00 R09: 00000000fffff73b
> [ 38.745860] R10: ffffffffb82b35f1 R11: 00505b1b004d5b1b R12: 0000000000000000
> [ 38.745861] R13: ffff9c873541af3d R14: 000000000000000b R15: 000000000000000c
> [ 38.745862] FS: 00007f450c390580(0000) GS:ffff9c873f180000(0000) knlGS:0000000000000000
> [ 38.745863] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 38.745864] CR2: ffff9c8735448000 CR3: 00000007e213c002 CR4: 00000000000606e0
> [ 38.745865] Call Trace:
> [ 38.745871] vt_do_kdgkb_ioctl+0x34d/0x440
> [ 38.745875] vt_ioctl+0xba3/0x1190
> [ 38.745879] ? __bpf_prog_run32+0x39/0x60
> [ 38.745882] ? mem_cgroup_commit_charge+0x7b/0x4e0
> [ 38.762583] tty_ioctl+0x23f/0x920
> [ 38.762586] ? preempt_count_sub+0x98/0xe0
> [ 38.762590] ? __seccomp_filter+0x67/0x600
> [ 38.762594] do_vfs_ioctl+0xa2/0x6a0
> [ 38.762597] ? syscall_trace_enter+0x192/0x2d0
> [ 38.762599] ksys_ioctl+0x3a/0x70
> [ 38.762601] __x64_sys_ioctl+0x16/0x20
> [ 38.762604] do_syscall_64+0x54/0xe0
> [ 38.772513] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 38.772515] RIP: 0033:0x7f450c2bb427
> [ 38.772517] Code: 00 00 00 75 0c 48 c7 c0 ff ff ff ff 48 83 c4 18 c3 e8 8d d2 01 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 39 da 0c 00 f7 d8 64 89 01 48
> [ 38.772518] RSP: 002b:00007fffbcedd348 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> [ 38.772519] RAX: ffffffffffffffda RBX: 000000000000000b RCX: 00007f450c2bb427
> [ 38.772520] RDX: 00007fffbcedd360 RSI: 0000000000004b49 RDI: 0000000000000003
> [ 38.772521] RBP: 00007fffbcedd361 R08: 00007f450c389c40 R09: 000055cbef2494a0
> [ 38.772522] R10: 0000000000000000 R11: 0000000000000246 R12: 000055cbef2412b0
> [ 38.772522] R13: 00007fffbcedd360 R14: 000000000000000b R15: 0000000000000003
> [ 38.772525] Modules linked in: snd_hda_codec_hdmi bridge r8712u(C) stp llc snd_hda_codec_via snd_hda_codec_generic snd_hda_intel snd_hda_codec x86_pkg_temp_thermal dummy kvm_intel snd_hwdep snd_hda_core snd_pcm snd_timer kvm snd atl1c soundcore irqbypass xfs tun nf_conntrack_ftp nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 loop fuse binfmt_misc ipv6
> [ 38.779196] r8712u 1-1.2:1.0 wl0: renamed from wlan0
> [ 38.779240] CR2: ffff9c8735448000
> [ 38.790894] ---[ end trace 8116e48ba19076a0 ]---
> [ 38.790897] RIP: 0010:__memmove+0x81/0x1a0
> [ 38.790898] Code: 4c 89 4f 10 4c 89 47 18 48 8d 7f 20 73 d4 48 83 c2 20 e9 a2 00 00 00 66 90 48 89 d1 4c 8b 5c 16 f8 4c 8d 54 17 f8 48 c1 e9 03 <f3> 48 a5 4d 89 1a e9 0c 01 00 00 0f 1f 40 00 48 89 d1 4c 8b 1e 49
> [ 38.790899] RSP: 0018:ffffa1b9002d7d08 EFLAGS: 00010203
> [ 38.790900] RAX: ffff9c873541af43 RBX: ffff9c873541af43 RCX: 00000c6f105cd6bf
> [ 38.790901] RDX: 0000637882e986b6 RSI: ffff9c8735447ffb RDI: ffff9c8735447ffb
> [ 38.790902] RBP: ffff9c8739cd3800 R08: ffff9c873b802f00 R09: 00000000fffff73b
> [ 38.790903] R10: ffffffffb82b35f1 R11: 00505b1b004d5b1b R12: 0000000000000000
> [ 38.790904] R13: ffff9c873541af3d R14: 000000000000000b R15: 000000000000000c
> [ 38.790905] FS: 00007f450c390580(0000) GS:ffff9c873f180000(0000) knlGS:0000000000000000
> [ 38.790906] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 38.790907] CR2: ffff9c8735448000 CR3: 00000007e213c002 CR4: 00000000000606e0
> [ 38.790908] BUG: sleeping function called from invalid context at include/linux/percpu-rwsem.h:34
> [ 38.790909] in_atomic(): 0, irqs_disabled(): 1, pid: 388, name: loadkeys
> [ 38.790911] CPU: 6 PID: 388 Comm: loadkeys Tainted: G D C 5.0.0-rc6-00153-g5ded5871030e #91
> [ 38.790911] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M-D3H, BIOS F12 11/14/2013
> [ 38.790912] Call Trace:
> [ 38.790917] dump_stack+0x67/0x90
> [ 38.822550] ? wake_up_klogd+0x10/0x70
> [ 38.822553] ___might_sleep.cold.17+0xd4/0xe4
> [ 38.822556] exit_signals+0x1c/0x200
> [ 38.822558] do_exit+0xa8/0xb90
> [ 38.822560] ? ksys_ioctl+0x3a/0x70
> [ 38.822562] rewind_stack_do_exit+0x17/0x20
>
> I suspect my system runs multiple loadkeys instances in parallel
> and that triggers an Oops.
>
> I noticed that vt_do_kdgkb_ioctl() uses global variable to optimise
> func_table re-allocation. Even comment hints at lack of locking:
>
> /* FIXME: This one needs untangling and locking */
> int vt_do_kdgkb_ioctl(int cmd, struct kbsentry __user *user_kdgkb, int perm)
>
> https://github.com/torvalds/linux/blob/master/drivers/tty/vt/keyboard.c#L2084
>
> Could that be it?

That could be it, care to add a lock to handle this?

And why does your system run loadkeys in parallel?

thanks,

greg k-h

2019-02-24 13:23:11

by Sergei Trofimovich

[permalink] [raw]
Subject: Re: 5.0.0-rc6+: Oops at boot: RIP: 0010:__memmove+0x81/0x1a0 / vt_do_kdgkb_ioctl+0x34d/0x440 (race at reenter?)

On Mon, 18 Feb 2019 09:38:10 +0100
Greg Kroah-Hartman <[email protected]> wrote:

> On Sun, Feb 17, 2019 at 11:39:57PM +0000, Sergei Trofimovich wrote:
> > [ Copying as is from https://bugzilla.kernel.org/show_bug.cgi?id=202605
> > and sending to LKML. Greg, Jiri, can you clarify mailing
> > list im MAINTAINERS as well?
> > https://github.com/torvalds/linux/blob/master/MAINTAINERS#L15527
> > mentions no list for tty/vt/. ]
> >
> > Kernel Oops
> > [ 38.739241] Oops: 0003 [#1] PREEMPT SMP
> > [ 38.739243] CPU: 6 PID: 388 Comm: loadkeys Tainted: G C 5.0.0-rc6-00153-g5ded5871030e #91
> > [ 38.739244] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M-D3H, BIOS F12 11/14/2013
> > [ 38.739249] RIP: 0010:__memmove+0x81/0x1a0
> > happes on a fresh vanilla master kernel roughly at boot
> > (before tty login prompt):
> > $ uname -r
> > 5.0.0-rc6-00153-g5ded5871030e
> >
> > The kernel page fault happens at 'loadkeys start'.
> > I suspect some kind of race at reenter of vt_do_kdgkb_ioctl(KDSKBSENT):
> > https://github.com/torvalds/linux/blob/master/drivers/tty/vt/keyboard.c#L1986
> >
> > The oops trace looks similar to the following reports (no details besides Oops)
> > https://bugzilla.kernel.org/show_bug.cgi?id=194589
> > https://bugzilla.kernel.org/show_bug.cgi?id=202111
> >
> > [ 38.044921] IPv6: ADDRCONF(NETDEV_CHANGE): br0: link becomes ready
> > [ 38.533196] usb 1-1.2: r8712u: CustomerID = 0x0000
> > [ 38.533200] usb 1-1.2: r8712u: MAC Address from efuse = 00:0d:81:a9:09:90
> > [ 38.533203] usb 1-1.2: r8712u: Loading firmware from "rtlwifi/rtl8712u.bin"
> > [ 38.533331] usbcore: registered new interface driver r8712u
> > [ 38.736178] BUG: unable to handle kernel paging request at ffff9c8735448000
> > [ 38.737215] #PF error: [PROT] [WRITE]
> > [ 38.737216] PGD 288a05067 P4D 288a05067 PUD 288a07067 PMD 7f60c2063 PTE 80000007f5448161
> > [ 38.739241] Oops: 0003 [#1] PREEMPT SMP
> > [ 38.739243] CPU: 6 PID: 388 Comm: loadkeys Tainted: G C 5.0.0-rc6-00153-g5ded5871030e #91
> > [ 38.739244] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M-D3H, BIOS F12 11/14/2013
> > [ 38.739249] RIP: 0010:__memmove+0x81/0x1a0
> > [ 38.739251] Code: 4c 89 4f 10 4c 89 47 18 48 8d 7f 20 73 d4 48 83 c2 20 e9 a2 00 00 00 66 90 48 89 d1 4c 8b 5c 16 f8 4c 8d 54 17 f8 48 c1 e9 03 <f3> 48 a5 4d 89 1a e9 0c 01 00 00 0f 1f 40 00 48 89 d1 4c 8b 1e 49
> > [ 38.739252] RSP: 0018:ffffa1b9002d7d08 EFLAGS: 00010203
> > [ 38.745857] RAX: ffff9c873541af43 RBX: ffff9c873541af43 RCX: 00000c6f105cd6bf
> > [ 38.745858] RDX: 0000637882e986b6 RSI: ffff9c8735447ffb RDI: ffff9c8735447ffb
> > [ 38.745859] RBP: ffff9c8739cd3800 R08: ffff9c873b802f00 R09: 00000000fffff73b
> > [ 38.745860] R10: ffffffffb82b35f1 R11: 00505b1b004d5b1b R12: 0000000000000000
> > [ 38.745861] R13: ffff9c873541af3d R14: 000000000000000b R15: 000000000000000c
> > [ 38.745862] FS: 00007f450c390580(0000) GS:ffff9c873f180000(0000) knlGS:0000000000000000
> > [ 38.745863] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 38.745864] CR2: ffff9c8735448000 CR3: 00000007e213c002 CR4: 00000000000606e0
> > [ 38.745865] Call Trace:
> > [ 38.745871] vt_do_kdgkb_ioctl+0x34d/0x440
> > [ 38.745875] vt_ioctl+0xba3/0x1190
> > [ 38.745879] ? __bpf_prog_run32+0x39/0x60
> > [ 38.745882] ? mem_cgroup_commit_charge+0x7b/0x4e0
> > [ 38.762583] tty_ioctl+0x23f/0x920
> > [ 38.762586] ? preempt_count_sub+0x98/0xe0
> > [ 38.762590] ? __seccomp_filter+0x67/0x600
> > [ 38.762594] do_vfs_ioctl+0xa2/0x6a0
> > [ 38.762597] ? syscall_trace_enter+0x192/0x2d0
> > [ 38.762599] ksys_ioctl+0x3a/0x70
> > [ 38.762601] __x64_sys_ioctl+0x16/0x20
> > [ 38.762604] do_syscall_64+0x54/0xe0
> > [ 38.772513] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > [ 38.772515] RIP: 0033:0x7f450c2bb427
> > [ 38.772517] Code: 00 00 00 75 0c 48 c7 c0 ff ff ff ff 48 83 c4 18 c3 e8 8d d2 01 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 39 da 0c 00 f7 d8 64 89 01 48
> > [ 38.772518] RSP: 002b:00007fffbcedd348 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > [ 38.772519] RAX: ffffffffffffffda RBX: 000000000000000b RCX: 00007f450c2bb427
> > [ 38.772520] RDX: 00007fffbcedd360 RSI: 0000000000004b49 RDI: 0000000000000003
> > [ 38.772521] RBP: 00007fffbcedd361 R08: 00007f450c389c40 R09: 000055cbef2494a0
> > [ 38.772522] R10: 0000000000000000 R11: 0000000000000246 R12: 000055cbef2412b0
> > [ 38.772522] R13: 00007fffbcedd360 R14: 000000000000000b R15: 0000000000000003
> > [ 38.772525] Modules linked in: snd_hda_codec_hdmi bridge r8712u(C) stp llc snd_hda_codec_via snd_hda_codec_generic snd_hda_intel snd_hda_codec x86_pkg_temp_thermal dummy kvm_intel snd_hwdep snd_hda_core snd_pcm snd_timer kvm snd atl1c soundcore irqbypass xfs tun nf_conntrack_ftp nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 loop fuse binfmt_misc ipv6
> > [ 38.779196] r8712u 1-1.2:1.0 wl0: renamed from wlan0
> > [ 38.779240] CR2: ffff9c8735448000
> > [ 38.790894] ---[ end trace 8116e48ba19076a0 ]---
> > [ 38.790897] RIP: 0010:__memmove+0x81/0x1a0
> > [ 38.790898] Code: 4c 89 4f 10 4c 89 47 18 48 8d 7f 20 73 d4 48 83 c2 20 e9 a2 00 00 00 66 90 48 89 d1 4c 8b 5c 16 f8 4c 8d 54 17 f8 48 c1 e9 03 <f3> 48 a5 4d 89 1a e9 0c 01 00 00 0f 1f 40 00 48 89 d1 4c 8b 1e 49
> > [ 38.790899] RSP: 0018:ffffa1b9002d7d08 EFLAGS: 00010203
> > [ 38.790900] RAX: ffff9c873541af43 RBX: ffff9c873541af43 RCX: 00000c6f105cd6bf
> > [ 38.790901] RDX: 0000637882e986b6 RSI: ffff9c8735447ffb RDI: ffff9c8735447ffb
> > [ 38.790902] RBP: ffff9c8739cd3800 R08: ffff9c873b802f00 R09: 00000000fffff73b
> > [ 38.790903] R10: ffffffffb82b35f1 R11: 00505b1b004d5b1b R12: 0000000000000000
> > [ 38.790904] R13: ffff9c873541af3d R14: 000000000000000b R15: 000000000000000c
> > [ 38.790905] FS: 00007f450c390580(0000) GS:ffff9c873f180000(0000) knlGS:0000000000000000
> > [ 38.790906] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 38.790907] CR2: ffff9c8735448000 CR3: 00000007e213c002 CR4: 00000000000606e0
> > [ 38.790908] BUG: sleeping function called from invalid context at include/linux/percpu-rwsem.h:34
> > [ 38.790909] in_atomic(): 0, irqs_disabled(): 1, pid: 388, name: loadkeys
> > [ 38.790911] CPU: 6 PID: 388 Comm: loadkeys Tainted: G D C 5.0.0-rc6-00153-g5ded5871030e #91
> > [ 38.790911] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M-D3H, BIOS F12 11/14/2013
> > [ 38.790912] Call Trace:
> > [ 38.790917] dump_stack+0x67/0x90
> > [ 38.822550] ? wake_up_klogd+0x10/0x70
> > [ 38.822553] ___might_sleep.cold.17+0xd4/0xe4
> > [ 38.822556] exit_signals+0x1c/0x200
> > [ 38.822558] do_exit+0xa8/0xb90
> > [ 38.822560] ? ksys_ioctl+0x3a/0x70
> > [ 38.822562] rewind_stack_do_exit+0x17/0x20
> >
> > I suspect my system runs multiple loadkeys instances in parallel
> > and that triggers an Oops.
> >
> > I noticed that vt_do_kdgkb_ioctl() uses global variable to optimise
> > func_table re-allocation. Even comment hints at lack of locking:
> >
> > /* FIXME: This one needs untangling and locking */
> > int vt_do_kdgkb_ioctl(int cmd, struct kbsentry __user *user_kdgkb, int perm)
> >
> > https://github.com/torvalds/linux/blob/master/drivers/tty/vt/keyboard.c#L2084
> >
> > Could that be it?
>
> That could be it, care to add a lock to handle this?

Will try.

> And why does your system run loadkeys in parallel?

It comes from systemd which spawns 'systemd-vconsole-setup'
once per each vtcon at start:
https://github.com/systemd/systemd/blob/883eb9be985fd86d9cabe967eeeab91cdd396a81/src/vconsole/90-vconsole.rules.in#L12

My system has two vtcon "devices":
# cat /sys/devices/virtual/vtconsole/vtcon0/name
(S) dummy device
# cat /sys/devices/virtual/vtconsole/vtcon1/name
(M) frame buffer device

Both trigger 'systemd-vconsole-setup' at exactly same time.

I guess multiple vtcons is i915-specific framebuffer setup:
drivers/gpu/drm/i915/i915_drv.c: ret = do_take_over_console(&dummy_con, 0, MAX_NR_CONSOLES - 1, 1);

This is a dmesg output with a tiny debug path applied to make
sure loadkeys is the only user of vt_do_kdgkb_ioctl() calls:

--- a/drivers/tty/vt/keyboard.c
+++ b/drivers/tty/vt/keyboard.c
@@ -1996,6 +1996,14 @@ int vt_do_kdgkb_ioctl(int cmd, struct kbsentry __user *user_kdgkb, int perm)
int i, j, k;
int ret;

+ printk("In vt_do_kdgkb_ioctl(%d=%s)/cpu=%d/comm=%s(%d)\n",
+ cmd, (cmd == KDGKBSENT)
+ ? "KDGKBSENT"
+ : ((cmd == KDSKBSENT)
+ ? "KDSKBSENT"
+ : "UNKNOWN"),
+ hard_smp_processor_id(), current->comm, task_pid_nr(current));
+
if (!capable(CAP_SYS_TTY_CONFIG))
perm = 0;

Feb 24 12:06:35 sf systemd-vconsole-setup[343]: Executing "/usr/bin/loadkeys -q -C /dev/tty1 -u ru4"...
Feb 24 12:06:35 sf systemd-vconsole-setup[344]: /usr/bin/setfont succeeded.
Feb 24 12:06:35 sf systemd-vconsole-setup[344]: Executing "/usr/bin/loadkeys -q -C /dev/tty1 -u ru4"...
Feb 24 12:06:35 sf systemd-vconsole-setup[343]: Successfully forked off '(loadkeys)' as PID 423.
Feb 24 12:06:35 sf systemd-vconsole-setup[344]: Successfully forked off '(loadkeys)' as PID 424.
...
Feb 24 12:06:35 sf kernel: ^[[0;1;39m^[[0;1;39mIn vt_do_kdgkb_ioctl(19273=KDSKBSENT)/cpu=5/comm=loadkeys(424)
Feb 24 12:06:35 sf kernel: ^[[0;1;39m^[[0;1;39mIn vt_do_kdgkb_ioctl(19273=KDSKBSENT)/cpu=2/comm=loadkeys(423)
<many more of these. interleave sometimes>
...
Feb 24 12:06:35 sf systemd-vconsole-setup[343]: /usr/bin/loadkeys succeeded.
Feb 24 12:06:35 sf systemd-vconsole-setup[344]: UTF-8 kbdmode enabled on /dev/tty9
Feb 24 12:06:35 sf systemd-vconsole-setup[344]: Font and unimap successfully copied to /dev/tty9
Feb 24 12:06:35 sf systemd-vconsole-setup[343]: UTF-8 kbdmode enabled on /dev/tty9
Feb 24 12:06:35 sf systemd-vconsole-setup[343]: Font and unimap successfully copied to /dev/tty9
Feb 24 12:06:35 sf systemd-vconsole-setup[344]: UTF-8 kbdmode enabled on /dev/tty12
Feb 24 12:06:35 sf systemd-vconsole-setup[344]: Font and unimap successfully copied to /dev/tty12
Feb 24 12:06:35 sf systemd-vconsole-setup[343]: UTF-8 kbdmode enabled on /dev/tty12
Feb 24 12:06:35 sf systemd-vconsole-setup[343]: Font and unimap successfully copied to /dev/tty12

--

Sergei


Attachments:
(No filename) (0.98 kB)
Цифровая подпись OpenPGP

2019-02-25 22:34:23

by Sergei Trofimovich

[permalink] [raw]
Subject: [PATCH] tty/vt: fix write/write race in ioctl(KDSKBSENT) handler

The bug manifests as an attempt to access deallocated memory:

BUG: unable to handle kernel paging request at ffff9c8735448000
#PF error: [PROT] [WRITE]
PGD 288a05067 P4D 288a05067 PUD 288a07067 PMD 7f60c2063 PTE 80000007f5448161
Oops: 0003 [#1] PREEMPT SMP
CPU: 6 PID: 388 Comm: loadkeys Tainted: G C 5.0.0-rc6-00153-g5ded5871030e #91
Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M-D3H, BIOS F12 11/14/2013
RIP: 0010:__memmove+0x81/0x1a0
Code: 4c 89 4f 10 4c 89 47 18 48 8d 7f 20 73 d4 48 83 c2 20 e9 a2 00 00 00 66 90 48 89 d1 4c 8b 5c 16 f8 4c 8d 54 17 f8 48 c1 e9 03 <f3> 48 a5 4d 89 1a e9 0c 01 00 00 0f 1f 40 00 48 89 d1 4c 8b 1e 49
RSP: 0018:ffffa1b9002d7d08 EFLAGS: 00010203
RAX: ffff9c873541af43 RBX: ffff9c873541af43 RCX: 00000c6f105cd6bf
RDX: 0000637882e986b6 RSI: ffff9c8735447ffb RDI: ffff9c8735447ffb
RBP: ffff9c8739cd3800 R08: ffff9c873b802f00 R09: 00000000fffff73b
R10: ffffffffb82b35f1 R11: 00505b1b004d5b1b R12: 0000000000000000
R13: ffff9c873541af3d R14: 000000000000000b R15: 000000000000000c
FS: 00007f450c390580(0000) GS:ffff9c873f180000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff9c8735448000 CR3: 00000007e213c002 CR4: 00000000000606e0
Call Trace:
vt_do_kdgkb_ioctl+0x34d/0x440
vt_ioctl+0xba3/0x1190
? __bpf_prog_run32+0x39/0x60
? mem_cgroup_commit_charge+0x7b/0x4e0
tty_ioctl+0x23f/0x920
? preempt_count_sub+0x98/0xe0
? __seccomp_filter+0x67/0x600
do_vfs_ioctl+0xa2/0x6a0
? syscall_trace_enter+0x192/0x2d0
ksys_ioctl+0x3a/0x70
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x54/0xe0
entry_SYSCALL_64_after_hwframe+0x49/0xbe

The bug manifests on systemd systems with multiple vtcon devices:
# cat /sys/devices/virtual/vtconsole/vtcon0/name
(S) dummy device
# cat /sys/devices/virtual/vtconsole/vtcon1/name
(M) frame buffer device

There systemd runs 'loadkeys' tool in tapallel for each vtcon
instance. This causes two parallel ioctl(KDSKBSENT) calls to
race into adding the same entry into 'func_table' array at:

drivers/tty/vt/keyboard.c:vt_do_kdgkb_ioctl()

The function has no locking around writes to 'func_table'.

The simplest reproducer is to have initrams with the following
init on a 8-CPU machine x86_64:

#!/bin/sh

loadkeys -q windowkeys ru4 &
loadkeys -q windowkeys ru4 &
loadkeys -q windowkeys ru4 &
loadkeys -q windowkeys ru4 &

loadkeys -q windowkeys ru4 &
loadkeys -q windowkeys ru4 &
loadkeys -q windowkeys ru4 &
loadkeys -q windowkeys ru4 &
wait

The change adds lock on write path only. Reads are still racy.

CC: Greg Kroah-Hartman <[email protected]>
CC: Jiri Slaby <[email protected]>
Link: https://lkml.org/lkml/2019/2/17/256
Signed-off-by: Sergei Trofimovich <[email protected]>
---
drivers/tty/vt/keyboard.c | 33 +++++++++++++++++++++++++++------
1 file changed, 27 insertions(+), 6 deletions(-)

diff --git a/drivers/tty/vt/keyboard.c b/drivers/tty/vt/keyboard.c
index 88312c6c92cc..0617e87ab343 100644
--- a/drivers/tty/vt/keyboard.c
+++ b/drivers/tty/vt/keyboard.c
@@ -123,6 +123,7 @@ static const int NR_TYPES = ARRAY_SIZE(max_vals);
static struct input_handler kbd_handler;
static DEFINE_SPINLOCK(kbd_event_lock);
static DEFINE_SPINLOCK(led_lock);
+static DEFINE_SPINLOCK(func_buf_lock); /* guard 'func_buf' and friends */
static unsigned long key_down[BITS_TO_LONGS(KEY_CNT)]; /* keyboard key bitmap */
static unsigned char shift_down[NR_SHIFT]; /* shift state counters.. */
static bool dead_key_next;
@@ -1990,11 +1991,12 @@ int vt_do_kdgkb_ioctl(int cmd, struct kbsentry __user *user_kdgkb, int perm)
char *p;
u_char *q;
u_char __user *up;
- int sz;
+ int sz, fnw_sz;
int delta;
char *first_free, *fj, *fnw;
int i, j, k;
int ret;
+ unsigned long flags;

if (!capable(CAP_SYS_TTY_CONFIG))
perm = 0;
@@ -2037,7 +2039,14 @@ int vt_do_kdgkb_ioctl(int cmd, struct kbsentry __user *user_kdgkb, int perm)
goto reterr;
}

+ fnw = NULL;
+ fnw_sz = 0;
+ /* race aginst other writers */
+ again:
+ spin_lock_irqsave(&func_buf_lock, flags);
q = func_table[i];
+
+ /* fj pointer to next entry after 'q' */
first_free = funcbufptr + (funcbufsize - funcbufleft);
for (j = i+1; j < MAX_NR_FUNC && !func_table[j]; j++)
;
@@ -2045,10 +2054,12 @@ int vt_do_kdgkb_ioctl(int cmd, struct kbsentry __user *user_kdgkb, int perm)
fj = func_table[j];
else
fj = first_free;
-
+ /* buffer usage increase by new entry */
delta = (q ? -strlen(q) : 1) + strlen(kbs->kb_string);
+
if (delta <= funcbufleft) { /* it fits in current buf */
if (j < MAX_NR_FUNC) {
+ /* make enough space for new entry at 'fj' */
memmove(fj + delta, fj, first_free - fj);
for (k = j; k < MAX_NR_FUNC; k++)
if (func_table[k])
@@ -2061,20 +2072,28 @@ int vt_do_kdgkb_ioctl(int cmd, struct kbsentry __user *user_kdgkb, int perm)
sz = 256;
while (sz < funcbufsize - funcbufleft + delta)
sz <<= 1;
- fnw = kmalloc(sz, GFP_KERNEL);
- if(!fnw) {
- ret = -ENOMEM;
- goto reterr;
+ if (fnw_sz != sz) {
+ spin_unlock_irqrestore(&func_buf_lock, flags);
+ kfree(fnw);
+ fnw = kmalloc(sz, GFP_KERNEL);
+ fnw_sz = sz;
+ if (!fnw) {
+ ret = -ENOMEM;
+ goto reterr;
+ }
+ goto again;
}

if (!q)
func_table[i] = fj;
+ /* copy data before insertion point to new location */
if (fj > funcbufptr)
memmove(fnw, funcbufptr, fj - funcbufptr);
for (k = 0; k < j; k++)
if (func_table[k])
func_table[k] = fnw + (func_table[k] - funcbufptr);

+ /* copy data after insertion point to new location */
if (first_free > fj) {
memmove(fnw + (fj - funcbufptr) + delta, fj, first_free - fj);
for (k = j; k < MAX_NR_FUNC; k++)
@@ -2087,7 +2106,9 @@ int vt_do_kdgkb_ioctl(int cmd, struct kbsentry __user *user_kdgkb, int perm)
funcbufleft = funcbufleft - delta + sz - funcbufsize;
funcbufsize = sz;
}
+ /* finally insert item itself */
strcpy(func_table[i], kbs->kb_string);
+ spin_unlock_irqrestore(&func_buf_lock, flags);
break;
}
ret = 0;
--
2.20.1